Re: XML 2 so far
Cool thread by all, and I think it's beginning to ask the hard questions:
Most of the problems with this come from the malcoding of RSS documents. I'd argue that a regex filter preprocess of such files would make not just this but a number of issues with XML go away.
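As a rough illustration of what such a regex preprocess might look like, here is a minimal Python sketch. It assumes the malformation in question is the common one of unescaped ampersands in feed text; the function name and the sample feed are made up for the example, and a real filter would likely need more rules than this.

```python
import re
import xml.etree.ElementTree as ET

# Matches an ampersand that does NOT begin a named, decimal, or hex reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Replace stray '&' characters with '&amp;' so the text can be parsed."""
    return BARE_AMP.sub("&amp;", text)


# Hypothetical malformed feed: the first '&' is not escaped.
feed = "<rss><channel><title>Tom & Jerry &amp; friends</title></channel></rss>"

cleaned = escape_bare_ampersands(feed)
title = ET.fromstring(cleaned).find("channel/title").text
print(title)
```

The filter leaves existing references such as `&amp;` or `&#160;` alone, so running it twice gives the same result as running it once.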
Much of the difficulty here comes with legacy code; there's a lot of XML that was encoded as ISO-8859-1 early on that's still in the system. Agreed, I would like to see UTF-8 become standard.
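For legacy documents of this kind, one possible migration path is simply to parse them with their declared encoding and re-serialize as UTF-8. The sketch below uses Python's standard ElementTree; the sample document is invented, and the approach assumes the legacy file declares its encoding correctly.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy document, stored as ISO-8859-1 bytes with a matching declaration.
legacy_bytes = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'.encode(
    "iso-8859-1"
)

# The parser honours the encoding named in the XML declaration.
root = ET.fromstring(legacy_bytes)

# Re-serialize the same tree as UTF-8 bytes.
utf8_bytes = ET.tostring(root, encoding="utf-8")
print(utf8_bytes)
```

If the declaration is missing or wrong, the parse step will either fail or produce mangled text, so files like that would need the encoding supplied by hand.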
Agreed - entirely too much of my career has been spent recoding HTML encodings to their numeric equivalents. The encoding tables are well defined and would not take up a significant amount of memory or processing time on today's systems. There is some interesting work that was done in XSLT2 on character encodings and mappings that should also be pushed into the parser.
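To give a sense of how small the mapping work is, here is a sketch of the recoding done by hand today, using the entity table that ships with Python's standard library (`html.entities.name2codepoint`). The function name is made up for the example, and unknown names are deliberately left untouched rather than guessed at.

```python
import re
from html.entities import name2codepoint

NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# These five are predefined in XML itself and need no recoding.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_refs(text: str) -> str:
    """Rewrite HTML named entities (e.g. &eacute;) as numeric references."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML built-ins and unknown names alone
        return f"&#{name2codepoint[name]};"

    return NAMED_REF.sub(replace, text)


print(to_numeric_refs("caf&eacute;&nbsp;&amp; cr&egrave;me"))
```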
Agreed. Internal subset processing introduces semantics and complexity that would be better handled via a transformation process or some other formal processing tool post facto.
This is one area where I'd be inclined to disagree. I think that there is a technical need, though not necessarily one that shows up in HTML. The primary use case I see here comes in query operations; most queries return multiple nodes of content (thinking XML databases here), with the enclosing node added primarily because XML currently does not support it (this is akin to retrieving a JSON array).
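The enclosing-node workaround looks roughly like the following Python sketch, which assumes the point under discussion is content with more than one top-level element. The wrapper name `results` and the sample fragment are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment: str, wrapper: str = "results") -> list:
    """Parse a multi-node fragment by adding a throwaway enclosing element."""
    root = ET.fromstring(f"<{wrapper}>{fragment}</{wrapper}>")
    return list(root)  # the caller only wants the children, not the wrapper


# Hypothetical query result: two sibling nodes with no common parent.
fragment = "<item id='1'>a</item><item id='2'>b</item>"

# Parsed directly, the fragment is rejected as not well-formed.
try:
    ET.fromstring(fragment)
    parsed_directly = True
except ET.ParseError:
    parsed_directly = False

nodes = parse_fragment(fragment)
ids = [node.get("id") for node in nodes]
print(parsed_directly, ids)
```

The wrapper carries no information of its own, which is the sense in which it resembles the brackets of a JSON array.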
I've long referred to the lax syntax argument as the "Grandmother Argument": that my grandmother should be able to write invalid (fill in the blank language) and the system should be able to handle this laxness. It's a weak argument in HTML (if only because I believe that the amount of HTML being written by hand is a small and, more importantly, shrinking percentage of the overall production of HTML, as more and more of it gets produced by automated mechanisms), but it's a terrible argument in XML, in great part because the only way you can derive even marginal semantics is by incorporating an XSD or similar type definition language, and introducing mechanisms to compensate for such laxness assumes a greater degree of competency in schema design than I've seen evinced in most XSD developers.
What this does imply is that if lax XML is to be permitted, the schemas need some way of defining how such laxness is handled. This would be analogous to saying that a P tag would be lax (would resolve with no terminating tag) if one of a given set of opening tags were encountered (<P>,<DIV>,<Hn>, etc.). I don't necessarily see this as being a bad solution, but it would put more onus on the schema developer and would require rethinking XSDs in particular. Other areas, such as <b><i> inversions (</b></i>) might also require such a set of rules.
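To make the idea of a declared laxness rule concrete, here is a small Python sketch. The rule table stands in for whatever a schema might declare; it is purely hypothetical and is not an XSD feature. Tokenizing is delegated to the standard library's `html.parser`, which lowercases tag names, and attributes are dropped to keep the example short.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rule table: the open element (key) is closed implicitly
# when any of the listed start tags is encountered.
IMPLIED_CLOSE = {"p": {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6"}}


class LaxCloser(HTMLParser):
    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.stack = []  # names of currently open elements
        self.out = []    # pieces of the repaired output

    def handle_starttag(self, tag, attrs):
        # Apply the rule: close the innermost open element if this tag ends it.
        if self.stack and tag in self.rules.get(self.stack[-1], ()):
            self.out.append(f"</{self.stack.pop()}>")
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes are ignored in this sketch

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: drop it
        while True:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def close(self):
        super().close()
        while self.stack:  # close anything still open at end of input
            self.out.append(f"</{self.stack.pop()}>")


def close_lax(markup: str, rules=IMPLIED_CLOSE) -> str:
    parser = LaxCloser(rules)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(close_lax("<div><p>one<p>two<div>three</div></div>"))
```

Inversions such as the <b><i> case would need a second kind of rule; this sketch only closes whatever is still open when an outer end tag arrives.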
The question here is whether this benefits any language other than HTML?
This is again an area where an E4X-like language would prove beneficial. If XML were a native format in JavaScript, then you could readily have JSON of the form {"a":"foo","b":10,"c":<bar><bat>text</bat></bar>} which would readily resolve all of these issues, which have been proposed largely because of the challenge of mixing JSON and XML. Minimization is not an issue on the XML side - outside of HTML, most people who work with XML have become quite accustomed to its form, and minimization would likely add a considerable learning curve and overhead to the process.
The business case here is making XML more workable on the browser in the enterprise context, making JSON a reasonable mechanism for the transmission of multiple pieces of XML content as well as JavaScript encodings, and better supporting the rich data cases where document content plays a part. Again, I think that E4X may very well be the model that should be looked at, because it does a decent job of mixing the two message metaphors and has had the benefit of solid real world implementations. I'd call that a huge win.
Kurt Cagle