REX: XML Shallow Parsing with Regular Expressions
Recently I've been having a great deal of fun building XML shallow parsers using regular expressions. The result is REX 1.0, as documented in the paper described below. The fun comes from several cute techniques (shallow parsing with a single regular expression, literate regular expression programming, UTF-8 processing using 8-bit extended ASCII regular expression packages) that combine in a very nice way. In particular, REX parsers for XML are generated from an XML representation of regular expressions, which is processed by tools written using REX! Needless to say, initial hand-written parsers were needed to bootstrap the process.

Fun aside, I think there is serious room for REX in the area of lightweight XML tool implementation. I'd be interested in feedback from the XML development community about possible applications of REX.

Robert D. Cameron, "REX: XML Shallow Parsing with Regular Expressions", CMPT TR 1998-17, School of Computing Science, Simon Fraser University, November 1998.
http://www.cs.sfu.ca/~cameron/REX.html

Abstract

The syntax of XML is simple enough that it is possible to parse an XML document into a list of its markup and text items using a single regular expression. Such a shallow parse of an XML document can be very useful for the construction of a variety of lightweight XML processing tools. However, complex regular expressions can be difficult to construct and even more difficult to read. Using a form of literate programming for regular expressions, this paper documents a set of XML shallow parsing expressions that can be used as a basis for simple, correct, efficient, robust and language-independent XML shallow parsing. Complete shallow parser implementations of less than 50 lines each in Perl, JavaScript and Lex/Flex are also given.

Robert D. Cameron, Associate Professor cameron@c...
School of Computing Science FAX: (604) 291-3045
Simon Fraser University
Burnaby, B.C., Canada V5A 1S6
Internet Electronic Library Project at SFU http://elib.cs.sfu.ca/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
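
For readers who want a feel for the shallow parsing idea described in the abstract, here is a rough sketch in Python. It is not the REX 1.0 expression from the paper and is far less careful: the pattern, the name `SHALLOW_ITEM` and the function `shallow_parse` are all made up for this illustration. It only shows how a single regular expression, written in a commented (loosely "literate") style, can split a document into a list of markup and text items.

```python
import re

# Simplified illustration only: NOT the REX 1.0 expression from the paper.
# Alternatives are tried in order; each match is one markup item or one text run.
SHALLOW_ITEM = re.compile(
    r"""
      <!--.*?-->              # comment
    | <!\[CDATA\[.*?\]\]>     # CDATA section
    | <\?.*?\?>               # processing instruction or XML declaration
    | <[^>]*>                 # any other tag (naive: stops at the first '>')
    | [^<]+                   # run of text up to the next '<'
    """,
    re.VERBOSE | re.DOTALL,
)


def shallow_parse(xml_text):
    """Return the markup and text items of xml_text as a flat list of strings."""
    return SHALLOW_ITEM.findall(xml_text)


if __name__ == "__main__":
    doc = '<?xml version="1.0"?><a x="1">hi<!-- c --><b/><![CDATA[<z>]]></a>'
    for item in shallow_parse(doc):
        print(repr(item))
```

For well-behaved input like the sample above, joining the returned items reproduces the original text. The sketch takes shortcuts that a robust parser cannot: a `>` inside an attribute value or a DOCTYPE internal subset will split an item in the wrong place, and a stray `<` with no closing `>` is silently skipped.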