  • From: Geert Bormans <geert@g...>
  • To: xml-dev@l...
  • Date: Wed, 21 Oct 2015 19:48:09 +0200 (CEST)

Strange, my earlier post seems messed up one way or another.

Sorry if this is a double post.

But here is how I would do this
(technically, ignoring the 'why' of the use case): declare the ASCII constraint once as a named simple type and derive the other types from it.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Name" type="NameType" />
                <xs:element name="Description" type="DescriptionType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    
    <xs:simpleType name="NameType">
        <xs:restriction base="ascii">
            <xs:maxLength value="10" />
        </xs:restriction>
    </xs:simpleType>
    
    <xs:simpleType name="DescriptionType">
        <xs:restriction base="ascii">
            <xs:maxLength value="20" />
        </xs:restriction>
    </xs:simpleType>
    
    <xs:simpleType name="ascii">
        <xs:restriction base="xs:string">
            <xs:pattern value="[\p{IsBasicLatin}]*" />
        </xs:restriction>
    </xs:simpleType>
    
</xs:schema>
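
As a quick check of the schema above, here is a minimal sketch of an instance document that should validate against it. The element values are made up for illustration and are not from the original post: Name stays within 10 characters, Description within 20, and both use only Basic Latin characters.

```xml
<!-- Hypothetical instance document; the values are illustrative assumptions. -->
<Test>
    <!-- 6 characters, all in U+0000-U+007F: satisfies maxLength 10 and the ascii pattern -->
    <Name>Widget</Name>
    <!-- 16 characters, all Basic Latin: satisfies maxLength 20 and the ascii pattern -->
    <Description>Plain ASCII text</Description>
</Test>
```

Because the pattern lives only in the shared ascii type, changing the allowed character set later means editing one place.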


From: "Roger L. Costello" <costello@m...>
To: xml-dev@l...
Sent: Wednesday, October 21, 2015 7:07:02 PM
Subject: A dandy little technique for constraining your strings to ASCII

Hi Folks,

So, you’ve created an XML schema. And it contains a lot of elements and attributes of type string.

You want each string constrained to just ASCII characters. Use the pattern facet for that.

Here’s a dandy little technique you can use:

At the top of your schema, place this named entity declaration:

<!DOCTYPE xs:schema [
<!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>

The entity (ASCII) can then be referenced in each pattern facet:

<xs:simpleType name="NameType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="10" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>

<xs:simpleType name="DescriptionType">
    <xs:restriction base="xs:string">
        <xs:maxLength value="20" />
        <xs:pattern value="&ASCII;" />
    </xs:restriction>
</xs:simpleType>

At parse time, the XML parser replaces each entity reference (&ASCII;) with its replacement text ([\p{IsBasicLatin}]*), so the schema processor only ever sees the expanded pattern.

The entity provides useful documentation; i.e., I assert that this:

<xs:pattern value="&ASCII;" />

is more readable than this:

<xs:pattern value="[\p{IsBasicLatin}]*" />

Here’s a complete schema to illustrate the technique:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xs:schema [
<!ENTITY ASCII "[\p{IsBasicLatin}]*">
]>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="Test">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Name" type="NameType" />
                <xs:element name="Description" type="DescriptionType" />
            </xs:sequence>
        </xs:complexType>
    </xs:element>

    <xs:simpleType name="NameType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="10" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="DescriptionType">
        <xs:restriction base="xs:string">
            <xs:maxLength value="20" />
            <xs:pattern value="&ASCII;" />
        </xs:restriction>
    </xs:simpleType>

</xs:schema>
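
To see both facets at work, here is a sketch of an instance that a validating processor should reject under this schema. The values are invented for illustration and are not part of the original post: Name breaks the length limit, and Description contains a character outside Basic Latin.

```xml
<!-- Hypothetical instance document that should FAIL validation; values are illustrative. -->
<Test>
    <!-- 11 characters: exceeds maxLength value="10" -->
    <Name>ElevenChars</Name>
    <!-- &#xE9; is U+00E9 (e with acute), outside Basic Latin (U+0000-U+007F): fails the pattern -->
    <Description>Caf&#xE9; au lait</Description>
</Test>
```

The character reference is resolved by the XML parser before validation, so the pattern facet is checked against the actual character U+00E9. Since XSD patterns must match the whole value, a single non-ASCII character is enough to make the element invalid.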

/Roger