Re: UTF-8 BOM
|| == David Brownell
| == Michael Brennan
|
|| it fits with the other backwards-problematic stuff
|| in Blueberry ... though it's a "new feature" that's encoding-specific.
|
| Except that a UTF-8 BOM isn't really a new feature; it's just one that all
| too many implementors overlook.

It's a new feature added by E105, in the last batch of "errata" before
the 2nd edition spec was published. That was called a "clarification",
but it seems to me like a substantive change ... previously, BOM was
discussed exclusively (!!) in the UTF-16 context. Though as Rob Lugt
pointed out, the relevant normative text (4.3.3) is unchanged.

That part of E105 sure seems like it matches the "Blueberry" goals of
aligning with Unicode, not "2nd edition" goals of removing (not adding :)
ambiguity.

> == Tim Bray
>
> Actually, I think that the UTF-8 BOM is a deeply stupid idea that
> serves no useful purpose in any imaginable universe.

That's where I'm coming from. UTF-8 is the default encoding, and the
only way un-MIME-typed data would NOT be in UTF-8 is if it has a UTF-16
BOM, or an XML (or text) declaration. This change wasn't necessary;
thrashing infrastructure is bad (unless maybe you're a company needing
a stick to force customers to buy new software :).

> We wouldn't
> be thinking about were it not for the fact that MS Notepad happens
> to write one for UTF-8 documents.

So what's the next desired erratum ... somewhere in 4.3.3, it should
get updated to say that "for interoperability" any (real) UTF may have
a BOM? Whereas right now it only says that UTF-16 "must" have one, and
requires otherwise that xml (or text) decls must appear "at the
beginning" (that is, where such a BOM could now be)?

- Dave
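
For anyone following along, the detection order under discussion can be sketched roughly as below. This is only an illustration under stated assumptions, not text from the spec or from any parser: the function name sniff_xml_encoding is made up, the UTF-8 BOM branch is the E105 behaviour being debated, and the sketch deliberately ignores UTF-32, EBCDIC and other encodings that are not ASCII-compatible. It also does not check whether a declaration after a BOM disagrees with the BOM.

```python
import codecs
import re
from typing import Tuple

# Hypothetical helper: matches the encoding pseudo-attribute of an XML or
# text declaration that sits at the very start of the entity.
_DECL = re.compile(
    rb'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def sniff_xml_encoding(data: bytes) -> Tuple[str, int]:
    """Return (encoding name, number of leading BOM bytes to skip)."""
    # The contested case: a UTF-8 BOM, allowed by E105.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", len(codecs.BOM_UTF8)
    # UTF-16 is the one encoding that must carry a BOM.
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    # No BOM: an XML (or text) declaration at the beginning may name
    # some other encoding.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii"), 0
    # Otherwise fall back to the default.
    return "utf-8", 0
```

With the first branch removed, a UTF-8 document written by Notepad reaches the declaration check with three stray bytes in front of "<?xml", so the declaration is no longer "at the beginning", which is the conflict with 4.3.3 raised above.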