Re: The triples datamodel -- was Re: SemanticWeb per
Michael Champion wrote:

> Before XML (and related technologies) people had
> little choice but to stick with rigid formats, because all hell would
> break loose when they were changed. People are jumping on XML and the
> design philosophies it enables because there has been a pent up demand
> for more flexibility.

Well, CSV has been around for ages, and the way I've always written CSV parsers is to take the first line as a line of column headings, and use those to select what is done with each column's fields. My software ignores fields it doesn't know a use for, and assumes that missing fields it expects have a NULL value, which may or may not cause higher-level code to reject the row.

Also, ASN.1 has an extension mechanism, where people using different variants of a 'schema' can still communicate; the decoder may inform the application that it had to discard some data it didn't understand, but it still provides the fields that it does know.

This 'flexibility' isn't something new to XML; it's inherent in any format that uses some kind of tagged values, including things like TIFF and PNG image files. And MIME headers, and SMTP email messages. Nothing special about XML in this respect! XML fans seem to have similar marketing ideas to Microsoft, picking up a good idea from elsewhere and claiming to have invented it ;-)

Ah, TIFF files! Whereas PNG files, which are well specified, are pretty damned interoperable now that most browsers support them, TIFF files are a bit of a gamble. They have so many unconstrained options, due to lots of folks adding their own extensions here and there, that hardly any decoders seem to understand all the options - so although app A may export TIFF and app B may import it, that's no guarantee that you can actually transport an image that way. Even if B ignores elements it doesn't understand in the TIFF, it usually falls apart because the element it ignored was critical to decoding the image.
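To make the header-driven CSV approach described earlier concrete, here is a minimal Python sketch. It is only an illustration: the `KNOWN_FIELDS` names and the `read_rows` function are hypothetical, and `None` stands in for NULL.

```python
import csv
import io

# Hypothetical set of columns this application knows how to use.
KNOWN_FIELDS = ("name", "email", "phone")


def read_rows(text, known_fields=KNOWN_FIELDS):
    """Yield one dict per data row, keyed only by the fields we know."""
    reader = csv.reader(io.StringIO(text))
    try:
        header = next(reader)  # first line is the column headings
    except StopIteration:
        return  # empty input: no rows
    # Map each heading to its column position.
    index = {name.strip(): pos for pos, name in enumerate(header)}
    for row in reader:
        if not row:
            continue  # skip blank lines
        record = {}
        for field in known_fields:
            pos = index.get(field)
            # A missing column (or a short row) becomes None, i.e. NULL.
            if pos is not None and pos < len(row):
                record[field] = row[pos]
            else:
                record[field] = None
        # Columns not listed in known_fields are simply never looked at.
        yield record


data = "email,nickname,name\na@example.org,Al,Alice\n"
rows = list(read_rows(data))
```

Here the unknown `nickname` column is ignored and the absent `phone` column comes back as `None`; whether a `None` makes the row unacceptable is left to the higher-level code.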
PNG files learnt from this; PNG chunks have a flag in them which indicates whether they are necessary to decode the image (e.g., a flag saying the image data is in Yxy instead of RGB) or not (e.g., extra annotations: thumbnail images, textual descriptions, information stating that the image has a scale whereby each pixel corresponds to a 1mm x 1mm square of some surface, etc). The latter types of chunk also carry a flag saying whether they should be discarded if the image data is changed; a thumbnail should, since it would be incorrect, but a textual note needn't.

This mechanism can only be made to work because there is one 'core' piece of information - the image data. In general, one would have to have each optional part of a file format list all the parts it depended upon for its meaning, plus an algorithm to deduce the consequences of changing a part or of not understanding it.

I think agreed schemas will *increase* reliability of systems. The objection to this seems to be "Oh, so your system dies the second it sees somebody's private extension?" - which is in no way implied by schemas being agreed between the communicating parties. And for very large scale systems, that 'agreement' can be as simple as saying "This site publishes data in the format documented _here_".

ABS
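For reference, the PNG chunk flags mentioned above are encoded in the case of the letters of the four-letter chunk type (bit 5 of each byte). The Python sketch below reads them; the helper names `chunk_flags`, `decoder_action` and `editor_action` are hypothetical, and the policy functions are a simplified illustration rather than a complete decoder.

```python
def chunk_flags(chunk_type: bytes) -> dict:
    """Read the property bits from a 4-letter PNG chunk type."""
    if len(chunk_type) != 4 or not chunk_type.isalpha():
        raise ValueError("chunk type must be four ASCII letters")
    return {
        # Lowercase first letter: ancillary (not needed to decode the image).
        "ancillary": bool(chunk_type[0] & 0x20),
        # Lowercase second letter: private (not a publicly registered chunk).
        "private": bool(chunk_type[1] & 0x20),
        # Lowercase fourth letter: safe to copy even if image data changed.
        "safe_to_copy": bool(chunk_type[3] & 0x20),
    }


def decoder_action(chunk_type: bytes, understood: bool) -> str:
    """What a decoder does with a chunk: process it, skip it, or give up."""
    if understood:
        return "process"
    # An unknown critical chunk means the image cannot be decoded reliably.
    return "skip" if chunk_flags(chunk_type)["ancillary"] else "fail"


def editor_action(chunk_type: bytes) -> str:
    """What an editor that changed the image data does with an unknown
    ancillary chunk: keep it only if it is marked safe to copy."""
    return "keep" if chunk_flags(chunk_type)["safe_to_copy"] else "drop"


ihdr = chunk_flags(b"IHDR")  # critical header chunk
text = chunk_flags(b"tEXt")  # ancillary textual note
```

An unknown chunk named like `IHDR` (all uppercase) would make the decoder fail, while an unknown `tEXt`-style chunk is skipped on decode and kept on edit.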