Re: Xqueeze: Compact XML Alternative
Hi all!

Let me first state on the record that I am not a professional developer using XML applications day in and day out, so you might find the breadth of my knowledge a little limited. Xqueeze is a project I am pursuing purely out of interest; I'm not even paid for it. That said, I'm convinced that there is demand for a uniform and compact XML-like notation purely for machines to use. I doubt the prophecy that any attempt to solve this "problem" is doomed. Here's what I note:

* Need to support arbitrary BE (Binary Encoded) documents without reference to a schema: With a little modification to the grammar, xqML can support construction of trees for arbitrary documents. It would be able to supply all the associated data, but it wouldn't tell you which elements, attributes, etc. are present. Even as of now, xqML documents can carry their data dictionary inline, thus freeing consumers of the burden of managing a schema for everything.

* Debugging Tools: It is quite trivial to write an "inspector" tool once arbitrary documents are supported, so this is closely related to the point above. However, I would like to point out one thing. The way I intend Xqueeze to be, it would be a very transparent piece of logic sitting at a low level in your application. You just substitute XML generators with xqML generators and XML parsers with xqML parsers. The middleware may just provide an option to switch between XML and xqML modes of operation. You can develop your application entirely in XML mode and, when all the debugging has been done, switch to xqML mode for deployment. For debugging the xqML tools themselves, Emacs and a chart of ASCII control characters offered all the visual cues I needed. hexdump -C and diff came in handy too.

* Support for Data Types: There are plenty of unused symbols in xqML that can be used to signify data types. Not only that, currently no semantic information is captured from the specifications. Doing this can enable greater optimization of the BE and tighter validation of the data.

* Random Access: This is not supported by XML, so it can't be supported by xqML. However, the serving applications can create indexes on complete documents, and xqML can provide for indexes to be included at the beginning of the document (eww! this is getting messy).

* Multiple encoding formats: While designing the specifications for xqML, I had several choices. After initially settling on binary symbols, I eliminated choices on the criterion of ease of generation and parsing. That choices exist implies that there can be several encoding formats, a la ASN.1, to suit various needs.

* Competition with compression: xqML in its current format is as structured as XML, so it too compresses well. In an experiment[1] a 12 kB HTML document zipped to 2 kB. The (handwritten) BE for the same document took 3 kB and, when zipped, took less than 1000 bytes.

* Backward Compatibility: As ABS points out, this is a limitation of XML, hence it is not directly addressable. However, if the schema is semantically backward compatible (e.g. it just adds another child element somewhere, so that older docs comply with the new schema), this backward compatibility can be captured in XML and hence in xqML. Regarding changes in the XML syntax itself: well, ask the experts.

Comments are welcome, as always :-)

[1] This was done very early in the project's lifetime, and the handwritten encoding was still more verbose than present-day xqML. I haven't carried out proper tests yet, for two reasons. First, the API is still too elementary to allow writing large test cases without a lot of effort :p Second, since data is not touched in xqML, how much compaction is achieved depends a lot on the percentage of markup, which invalidates any claim about real-world expectations. Even for 100% markup, compaction depends on the length of the identifiers in the specs. I may specify extremely_large_element_names_like_this in the spec and jump around with claims of 99% size reduction.

-- 
Tahir Hashmi (VSE, NCST)
http://staff.ncst.ernet.in/tahir
tahir AT ncst DOT ernet DOT in

We, the rest of humanity, wish GNU luck and Godspeed
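
The footnote's point, that the size reduction depends on the share of markup and on the length of the identifiers, can be illustrated with a rough sketch. The encoding below is NOT xqML: it is a made-up stand-in that replaces each tag pair with a hypothetical one-byte symbol and a one-byte terminator and leaves the data untouched. The helper names (build_xml, build_compact, reduction) are likewise invented for this illustration.

```python
import zlib


def build_xml(tag: str, text: str, n: int) -> bytes:
    """Return n repetitions of <tag>text</tag> as ASCII bytes."""
    return "".join(f"<{tag}>{text}</{tag}>" for _ in range(n)).encode("ascii")


def build_compact(symbol: int, text: str, n: int) -> bytes:
    """Hypothetical compact form (an assumption, not the xqML grammar):
    a one-byte symbol, the untouched data, then a 0x00 terminator."""
    record = bytes([symbol]) + text.encode("ascii") + b"\x00"
    return record * n


def reduction(original: bytes, compact: bytes) -> float:
    """Fraction of the original size saved by the compact form."""
    return 1 - len(compact) / len(original)


if __name__ == "__main__":
    text, n = "data", 100
    compact = build_compact(0x01, text, n)
    for tag in ("p", "extremely_large_element_names_like_this"):
        xml = build_xml(tag, text, n)
        print(f"tag length {len(tag):2d}: "
              f"XML {len(xml)} B, compact {len(compact)} B, "
              f"saved {reduction(xml, compact):.0%}, "
              f"zipped XML {len(zlib.compress(xml))} B, "
              f"zipped compact {len(zlib.compress(compact))} B")
```

The same payload encoded with the same compact form shows a much larger saving when the XML uses the long element name, which is why a headline percentage says little without knowing the markup it was measured against.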