Re: Parsing efficiency? - why not 'compile'????
Tahir Hashmi wrote:
> Robin Berjon wrote:
> In the first group, there could be a subgroup that doesn't need binary
> markup but may use it simply because it can, without affecting the way
> its applications work. That's the group that doesn't need human
> read/write-ability for its XML docs - the group of WYSIWYG Office
> suites, XML-based instant messaging protocols and so on.

I would quite seriously oppose using binary infosets when you don't need them. It adds to the complexity of the system and removes a variety of features of XML. Office suites can (and in fact do) use zip (if only because it doubles as a packaging format, which is very convenient for attached files such as images). XML IM either needs binary infosets for performance reasons, or doesn't and shouldn't use them.

> Consider this: the application is only interested in strings for date
> but the schema designer specified a date type because it is the Right
> Thing(TM) for a date (so that the schema need not be changed if at some
> point of time the same application or another application does get
> interested in the value).
>
> In a binary representation, the processor will decode the variable
> length binary value to arrive at the number of seconds since epoch,
> then re-construct a string for the application. Note that the
> processor will be *synthesizing* a string that could be read straight
> off the document.
>
> This approach would be better only if the benefits of saved bandwidth
> are greater than the cost of synthesizing the date string. And we
> can't assume that limited bandwidth is *always* going to be the
> motivating factor for using binary markup.

That's why in BinXML you can specify how you encode your data. In the case you cite one would simply ask that the xs:fooDate type use the UTF-8 codec.

> The particular example I gave is illustrative only and as stated
> earlier, I'm not against type-awareness. I'm simply being wary of how
> much flexibility might possibly be lost, and in some cases
> computation be wasted, in the quest of a super-optimized binary
> encoding.

Again, if you don't want something encoded just ask the application to not touch it :)

>>As for your remark on the speed of decompaction, note that you may be right for
>>a naive implementation of the same thing but there's compsci literature out
>>there on making such tasks fast.
>
> Well yes, naivete may lead to bad design. The point is that the more
> logic that goes into decoding a format, the higher the bar for small
> devices is raised. While one can have small non-validating SAX parsers
> for XML, the size of a binary format parser may go up since it would
> have to know about synthesizing dates from integers, deducing document
> structure from the schema etc, besides the indispensable passing of
> strings around. The encoding scheme should require least possible
> context information and minimal parsing logic to be accessible
> there. Hope I'm able to explain myself better this time!

It all depends on what you need. I totally agree that there is no one-size-fits-all, but I do believe that it is very much possible to produce a flexible format that can be configured in a variety of ways without it losing internal coherence. If you want a tiny and ultra fast decoder, you can drop support for encoding of the more complex types; if you want a slightly larger decoder but the smallest possible payload, you add codecs to encode the content optimally.

-- 
Robin Berjon <robin.berjon@e...>
Research Engineer, Expway http://expway.fr/
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488
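
The date trade-off discussed in this message can be made concrete with a small sketch. It is not BinXML's API or wire format. The codec names, the `CODECS` table, the `roundtrip` helper, and the LEB128-style variable-length integer are all assumptions chosen for illustration. It compares a date carried as seconds since epoch with the same date passed through untouched as UTF-8.

```python
from datetime import datetime, timezone

# Assumed lexical form for the date; real schema date types allow more variants.
FMT = "%Y-%m-%dT%H:%M:%SZ"


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Inverse of encode_varint."""
    result = 0
    shift = 0
    for b in data:
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7
    raise ValueError("truncated variable-length integer")


def epoch_encode(text: str) -> bytes:
    """Parse the date string and store it as seconds since epoch."""
    dt = datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)
    return encode_varint(int(dt.timestamp()))


def epoch_decode(data: bytes) -> str:
    """Synthesize the date string again from the stored integer."""
    seconds = decode_varint(data)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(FMT)


# Hypothetical per-type codec table: the "utf-8" entry leaves the text untouched.
CODECS = {
    "epoch-varint": (epoch_encode, epoch_decode),
    "utf-8": (lambda s: s.encode("utf-8"), lambda b: b.decode("utf-8")),
}


def roundtrip(codec_name: str, text: str):
    """Return (payload size in bytes, decoded string) for the chosen codec."""
    encode, decode = CODECS[codec_name]
    payload = encode(text)
    return len(payload), decode(payload)


if __name__ == "__main__":
    sample = "2003-03-01T12:00:00Z"  # arbitrary example date
    for name in CODECS:
        print(name, roundtrip(name, sample))
```

With this sample the integer codec yields a smaller payload but has to parse and re-synthesize the string, while the UTF-8 codec hands the application the bytes it wanted in the first place. Which one wins depends on whether bandwidth or decoder work is the constraint, which is the point both posters are circling.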