Re: Strict-Mode and Lax-Mode MicroXML
On Sun, 2012-06-03 at 10:00 +0100, Pete Cordell wrote:
> Note that this wouldn't be Postel lax processing because the laxness
> wouldn't be left up to the parser implementers to define. It would be part
> of the spec. (Or is that too HTML5-ish!).

Jon Postel's "Law" - be liberal in what you accept and strict in what you put out - when applied to HTML 5 (for example) would say that Web servers are in violation of the relevant specs (HTTP, MIME, HTML) when they emit content labelled as HTML that is not valid, but that Web browsers should not be overly strict in processing what they receive.

The tradeoff is in whether the receiver can determine appropriate behaviour for out-of-band input. For HTML it's "can I display something, even if it's not what the author intended." For XML it's "can I use this data correctly even though the input contains syntax errors" - this is more akin to TCP/IP (and closer to the original domain of Postel's law) where a network packet with a bad checksum is rejected and retransmission is requested.

So,

> If a < character is not followed by a nameStartChar or a ? character
> or a ! character then it should be treated as a < character that has no
> special meaning.

or / of course :-)

> If a & character is not followed by one of the character sequences
> gt; or lt; or amp; or quot; or apos; then it should be treated as a &
> character that has no special meaning.

or # presumably.

If I were designing XML (or µ-XML) from scratch I'd probably just want an escape character so that I could write \< or \& or <a b="\""> or whatever.

Optimising for hand-authored documents is a mistake - make hand authoring easy, but not at the expense of harder machine processing. An example was the CDATA section, included in XML because the spec authors wanted it for examples despite the fact that it made parsing irregular. Better might have been to include a CDATA element, e.g. by saying element names "starting with %" (in SGML terms) were literal, <%foo> ...</%foo>.
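For concreteness, here is a rough sketch of what the two quoted lax-mode rules (with the / and # additions) might look like in practice. It rewrites lax input so that every < or & the rules treat as "no special meaning" becomes an explicit escape. The function name `normalize_lax` is made up for illustration, and nameStartChar is assumed to be just ASCII letters and underscore, which is much narrower than the real production.

```python
import re

# Assumption: a simplified nameStartChar (ASCII letters and underscore).
# The ?, ! and / characters are the other markup starters named in the thread.
_MARKUP_START = re.compile(r"[A-Za-z_?!/]")

# The five named references from the quoted rule; "#" starts a numeric reference.
_REFERENCE_STARTS = ("gt;", "lt;", "amp;", "quot;", "apos;", "#")


def normalize_lax(text: str) -> str:
    """Escape each < and & that the quoted lax rules would treat as literal."""
    out = []
    for i, ch in enumerate(text):
        if ch == "<" and not _MARKUP_START.match(text, i + 1):
            # Not followed by a name start, ?, ! or /: a plain less-than sign.
            out.append("&lt;")
        elif ch == "&" and not text.startswith(_REFERENCE_STARTS, i + 1):
            # Not followed by a known reference: a plain ampersand.
            out.append("&amp;")
        else:
            out.append(ch)
    return "".join(out)


print(normalize_lax("if a < b && c"))
print(normalize_lax("<a>x &amp; y</a>"))
```

Even this toy version needs one character of lookahead for < and up to five for &, which is the kind of extra machine-processing cost an escape character would avoid.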
> I see this as a migration strategy to get away from some of the SGML baggage
> that is no longer relevant, and maybe in 10 years time we can safely adopt
> lax-mode for 99% of what developers want to do and have -- in comments etc.

There's no acceptable value of "10 years" for breaking changes. XML today is used in consumer devices, in computer boot sequences, in aircraft and car engines; it's not something that can change. µXML, if successful and in use a decade from now, would be in a similar situation.

Best,

Liam

--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/