Re: Request for Comments: XML binary encoding
Big discussion on Binary XML. I don't know how some people on this list get work done! I listen but rarely contribute to this list because it takes me so long to formulate postings...

I've been designing a "next-gen" XML-like language for a while now, though all my implementation time has been soaked up by a parser compiler I've been writing (a different project), so I still haven't released a reference implementation of the language (it's called "reticular structure language" (RSL): http://www.inxar.org/rsl).

Anyhow, RSL may be expressed using a binary "compiled" representation under certain circumstances. I initially thought this was a cool idea because of the incredible performance gains that would be gleaned from not having to parse the text. As has been discussed previously, punctuated by Tim Bray's comments, the gains in this area are pretty limited. It's not really worth it except under very specific conditions.

However, RSL has the additional feature that validation is considered the norm -- most RSL documents should be validated. What I discovered is that by compiling the "source" text form into a binary representation, you can organize the information so that structural patterns in the document are grouped together. This pattern grouping allows later validation (of the binary representation) to be significantly faster, which is important for RSL (which determines validity at run-time, not compile-time).

For an extreme example, consider an XML representation of a log file. The log file has 10,000 entries, each of which is an element with no attributes and a content model defined in a DTD. Typical processing would involve parsing the text and then 10,000 regexp challenges, one per entry, to confirm the validity of each entry against the DTD. A compiled representation allows one to recognize that all 10,000 entries have the same pattern. Validating this document would then require only a single regexp challenge to cover all structures in the document.
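To make the log-file example concrete, here is a rough Python sketch of the grouping idea. It is not RSL or its binary format (which is unpublished); the element names, the content model and the helper functions are all made up for illustration. It also does the grouping at validation time on a parsed tree, whereas the point above is that a compiled representation would have the grouping done once, up front, and stored.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical content model for <entry>, as a DTD might declare it:
#   <!ELEMENT entry (time, level, msg+)>
# written here as a regex over a space-terminated list of child tag names.
ENTRY_MODEL = re.compile(r"time level (msg )+")


def child_signature(element):
    """Return the element's child tag names as one string, e.g. 'time level msg '."""
    return "".join(child.tag + " " for child in element)


def validate_grouped(root):
    """Validate every <entry> under root, matching each distinct pattern once.

    Returns (all_valid, number_of_regex_matches_performed).
    """
    # Group entries by structural pattern; identical patterns collapse to one key.
    patterns = Counter(child_signature(entry) for entry in root.iter("entry"))
    checks = 0
    all_valid = True
    for signature in patterns:
        checks += 1
        if ENTRY_MODEL.fullmatch(signature) is None:
            all_valid = False
    return all_valid, checks


def build_log(n):
    """Build an illustrative log document with n structurally identical entries."""
    entries = "".join(
        f"<entry><time>{i}</time><level>INFO</level><msg>ok</msg></entry>"
        for i in range(n)
    )
    return ET.fromstring(f"<log>{entries}</log>")


if __name__ == "__main__":
    print(validate_grouped(build_log(10_000)))  # (True, 1)
```

With 10,000 identical entries the regex is consulted once instead of 10,000 times. A document with k distinct entry shapes would need k matches, so the saving depends on how repetitive the structure is.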
One potential drawback to the current design of this representation (unpublished) is that it is not stream-based. This would prohibit SAX-like processing of the binary representation. The point is that there are trade-offs in whatever you do. The simplest things are almost always best.

Paul