Proposal: create a repository of reusable regular expressions using XML E
Hi Folks,

Suppose that you wish to record the date and time of the launch of a new product. You could format the data using the xs:dateTime data type, but you want a slightly different format:

    <launch>
        <date-time>22 Oct 2012 07:52:00 -0004</date-time>
    </launch>

The value of <date-time> can be constrained to the desired format by using an XML Schema pattern facet. Recall that the value of a pattern facet is a regular expression (regex).

Recently I learned from Michael Sperberg-McQueen a fantastic way of creating reusable regular expressions: create XML ENTITY declarations that express the regexes.

Recall that an XML ENTITY declaration has a name followed by its replacement text:

    <!ENTITY name "replacement text">

So, we could create a date-time ENTITY with a regex as its replacement text:

    <!ENTITY date-time "... regex ...">

Okay, I did just that. I created a file containing the ENTITY declarations for the date-time format, building the regex up systematically through a series of smaller ENTITY declarations:

---------------------------------------------------
         regex-repository.ent
---------------------------------------------------
<!-- *********************************
     Regex for date-time
     Here is an example string that conforms to the date-time regex:
         22 Oct 2012 07:52:00 -0004
     ********************************* -->
<!ENTITY date-time    "(&day-of-week;)?&date;&time;">
<!ENTITY day-of-week  "(&WSP;)?&day-name;">
<!ENTITY day-name     "(Mon|Tue|Wed|Thu|Fri|Sat|Sun)">
<!ENTITY date         "&day;&month;&year;">
<!ENTITY day          "(&WSP;)?[&DIGIT;]{1,2}&WSP;">
<!ENTITY month        "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)">
<!ENTITY year         "&WSP;[&DIGIT;]{4,}&WSP;">
<!ENTITY time         "&time-of-day;&zone;">
<!ENTITY time-of-day  "&hour;&COLON;&minute;(&COLON;&second;)?">
<!ENTITY hour         "[&DIGIT;]{2,2}">
<!ENTITY minute       "[&DIGIT;]{2,2}">
<!ENTITY second       "[&DIGIT;]{2,2}">
<!ENTITY zone         "&WSP;[+-][&DIGIT;]{4,4}">

<!ENTITY SP     " ">               <!-- Space -->
<!ENTITY HTAB   "&#x9;">           <!-- Horizontal tab -->
<!ENTITY WSP    "(&HTAB;|&SP;)">   <!-- Whitespace -->
<!ENTITY COLON  ":">
<!ENTITY DIGIT  "0-9">

The regex for date-time can be reused by any XML Schema: the schema simply references the date-time ENTITY in a pattern facet.

Here I create an XML Schema for the <launch> element and use DOCTYPE to provide access to the regexes in regex-repository.ent. I use a pattern facet to constrain the value of the <date-time> element. The value of the pattern facet is the regex obtained by referencing the date-time ENTITY.

---------------------------------------------------
         launch.xsd
---------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xs:schema SYSTEM "regex-repository.ent">  <!-- Get access to the regexes here -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="launch">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="date-time" maxOccurs="unbounded">
                    <xs:simpleType>
                        <xs:restriction base="xs:string">
                            <xs:pattern value="&date-time;"/>  <!-- I use the regex here -->
                        </xs:restriction>
                    </xs:simpleType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Reusing a regex is simply a matter of referencing the ENTITY that holds it.

I think it would be useful to create a repository of regular expressions using this XML ENTITY technique.

Thoughts?

/Roger
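One way to sanity-check the entity definitions outside a schema processor is to expand them by hand and try the result against sample strings. The following is a minimal Python sketch, assuming the entity table is transcribed from regex-repository.ent as shown above. The names ENTITIES, expand and matches_date_time are invented for this illustration. Python's re dialect is not the XML Schema regex dialect, but the constructs used here (groups, alternation, ?, {n,m} and simple character classes) should behave the same in both. XML Schema patterns are implicitly anchored, so re.fullmatch is used as the closest equivalent.

```python
import re

# Entity table transcribed from regex-repository.ent (an assumption of this
# sketch: names and replacement texts are copied by hand, not parsed from the file).
ENTITIES = {
    "date-time": "(&day-of-week;)?&date;&time;",
    "day-of-week": "(&WSP;)?&day-name;",
    "day-name": "(Mon|Tue|Wed|Thu|Fri|Sat|Sun)",
    "date": "&day;&month;&year;",
    "day": "(&WSP;)?[&DIGIT;]{1,2}&WSP;",
    "month": "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)",
    "year": "&WSP;[&DIGIT;]{4,}&WSP;",
    "time": "&time-of-day;&zone;",
    "time-of-day": "&hour;&COLON;&minute;(&COLON;&second;)?",
    "hour": "[&DIGIT;]{2,2}",
    "minute": "[&DIGIT;]{2,2}",
    "second": "[&DIGIT;]{2,2}",
    "zone": "&WSP;[+-][&DIGIT;]{4,4}",
    "SP": " ",
    "HTAB": "\t",
    "WSP": "(&HTAB;|&SP;)",
    "COLON": ":",
    "DIGIT": "0-9",
}

# Matches a general entity reference such as &day-of-week; or &WSP;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z-]*);")


def expand(text):
    """Recursively replace every &name; reference with its replacement text."""
    return ENTITY_REF.sub(lambda m: expand(ENTITIES[m.group(1)]), text)


def matches_date_time(value):
    """Return True if the whole value matches the expanded date-time regex."""
    return re.fullmatch(expand("&date-time;"), value) is not None


if __name__ == "__main__":
    print(matches_date_time("22 Oct 2012 07:52:00 -0004"))  # True
    print(matches_date_time("2012-10-22T07:52:00"))         # False
```

A check like this is only a convenience for trying out new entries in the repository; the schema processor remains the authority on what a pattern facet accepts.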