Re: SAX for Binary Encodings (SAD-SAX)
At 12:10 AM +0000 11/9/03, Alaric B Snell wrote:

>In some contexts they are; in some they are not. It all depends on
>who is viewing it. If you stop and think, you will realise that an
>API like SAX in no way *forces* 7 to be treated identically to 07.
>Clearly you don't know a thing about how computers work if you think
>that *some* pieces of software treating 7 the same as 07 *if they
>wish to* will somehow emit data-destroying rays that spread across
>the Internet and lop the significant leading zeroes off of telephone
>numbers, eh?

Like many people who want typed data you're confusing the local with the global. Of course my software will treat the strings in a way I find useful. However, that in no way means you have to treat them the same way. You may want floats. I may want ints. Simon may want strings. There's no one right answer. The underlying premise that suggests we should exchange typed, binary data is that there is one right answer; one type that's better for the data than all the others; and that simply isn't true. The more complex the data becomes the less true it is.

>You might be surprised to find that a lot of software *does* take
>character strings out of an XML document and immediately parse them
>as decimal integers. This doesn't appear to have broken XML, does
>it? Shock horror!

Locally, of course not. The problem is when applications start exchanging their typed binary representations as the one truth rather than recognizing it as simply one way of seeing the world.
>Yes, the bit where he said that "This SAX option would be
>compulsory, and all XML parsers would be international agreement be
>required to have this option turned on; software that does not turn
>this option on would not be allowed to be sold, or written, because
>I know full well that a character string of '07' at a point in the
>document where the schema says 'this is an integer, so leading
>zeroes in the decimal representation are irrelevant' means that the
>schema author was wrong" really supports your argument, doesn't it?

Did you read what Wyman wrote? He was suggesting that we actually exchange four bytes containing a big endian two's complement representation of the number 7 (or some equivalent form), rather than exchanging the text string 7.

>> All data in an XML document is text, never anything else.
>
>I can prove you wrong!
>
><numFingers>10</numFingers>
>
>That "10" is text. It's also an integer. It's also the number of
>fingers I have. See? A counter-example.

Not a counter example at all, and when you understand that you will have achieved the XML nature. 10 is text, not a number. You choose to interpret that text string as the number ten, which is fine. It's your choice. Just don't believe for a minute that it's the only legitimate interpretation of that text string, or that the string and its interpretation are the same thing.

>> It is certainly not something you'd want to exchange on the
>>Internet, and it is is absolutely not something that should be
>>baked into the core APIs.
>
>Oh, good! You understand! So what was all that rubbish before about, eh?
>
>Or are you mistaking an optional thing for something 'baked in'?

Simplicity is a virtue. We're trying to produce a Corvette here, not an Edsel. Use the right tools for the right tasks. Don't try to make one API fit all needs.
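To make the distinction between the text in the document and its local interpretation concrete, here is a minimal sketch using the SAX binding in Python's standard library (xml.sax). The TextCollector class, the read_text helper, and the <phone> element are names made up for this illustration; they come from neither SAX itself nor the proposal under discussion. The parser hands over characters and nothing else. Whether they become an int, a float, or stay a string is decided afterwards, by the receiving program. The last line shows, for contrast, the four-byte big endian two's complement form of 7 mentioned above.

```python
import struct
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: gathers the character data of one named element."""

    def __init__(self, element_name):
        super().__init__()
        self.element_name = element_name
        self.inside = False
        self.chunks = []

    def startElement(self, name, attrs):
        if name == self.element_name:
            self.inside = True

    def endElement(self, name):
        if name == self.element_name:
            self.inside = False

    def characters(self, content):
        # SAX may deliver the text in several pieces; each piece is a string.
        if self.inside:
            self.chunks.append(content)


def read_text(document, element_name):
    """Parse the document and return the element's content as plain text."""
    handler = TextCollector(element_name)
    xml.sax.parseString(document, handler)
    return "".join(handler.chunks)


# What the parser reports is text.
raw = read_text(b"<numFingers>10</numFingers>", "numFingers")
padded = read_text(b"<phone>07</phone>", "phone")

# Three local interpretations of the same text; none is privileged.
as_string = raw
as_int = int(raw)
as_float = float(raw)

# The leading zero survives as text and is lost only if this program
# chooses to read the string as an integer.
same_as_numbers = int(padded) == int("7")
same_as_text = padded == "7"

# The binary alternative: four bytes, big endian, two's complement.
wire_form = struct.pack(">i", 7)
```

Nothing in the handler knows or cares that "10" might be a number. The conversion calls sit in the application, after parsing, and a different application is free to make a different choice from the same document.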
>Now, you are harping on about those who communicate information
>rather than just opaque text as "polluting" XML, but don't you think
>that demanding that the APIs *they* use be the same as the APIs
>*you* use is... polluting *their* use of XML with *your* model, hmm?

Oh, come on. Now you're being ridiculous. They can invent and use any APIs (and any formats) they want. The problem is they don't want to do that. They want to hijack the nice clean SAX API and XML format, and stuff it full of mismatched garbage I'm going to have to spend my time explaining.

> I mean, it's not like they're forcing SAX processers to report
>abstract values instead of character strings, is it? If you read
>carefully, you will see that it's an option. And you may have
>noticed that there are lots of SAX processers that aren't doing this
>in use *as we speak*. They won't be influenced by the magical rays
>to change, will they? They will only change if the programmer that's
>using them thinks "Hmmm, I'd like the leaf nodes of this XML parsed
>into abstract values for me to save some coding, I'll use a SAX
>parser that does that for me". If the programmer doesn't want that,
>perhaps because the formatting of dates and integers is important to
>them, then they won't do it.

If you want to translate data into XML to present it through SAX, fine. But don't start complexifying SAX because you discover your data isn't a good fit for XML.

>So shock horror! Even when USING this API that produces typed values
>WHEN IT CAN, you could still get at the raw character stream to
>handle it as you always did!
>
>So exactly how is this going to destroy XML, eh?

Two ways:

1. It will mean people start passing around binary data instead of text.
2. It will make XML so complex that it becomes incredibly difficult to learn and implement. Soon we'll be back in the SGML hell where no parser implements everything, and you're never quite sure which features you can and cannot use.
And thus you can no longer safely interchange XML with other parties.

As Simon keeps pointing out, schemas, XPath 2, and XSLT 2 have already marched a long way down this road. I don't think it's a coincidence that those of us who spend the largest part of our time trying to explain and teach these technologies are most adamant that this is the wrong road to follow.

--
Elliotte Rusty Harold
elharo@m...
Effective XML (Addison-Wesley, 2003)
http://www.cafeconleche.org/books/effectivexml
http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA