Re: UTF-8 BOM
> > This issue should perhaps become part of "XML Blueberry"
>
> The main problem with this is that it would mean that the legality of
> the first bytes of an entity would depend on whether there was a text
> declaration following them and what version number it contained, which
> seems the wrong way round.

Today's XML spec requires text/xml decls to be the first thing in the document, with no leading characters. The UTF-16 BOM is explicitly (in the XML spec) not part of document data, which is why it doesn't affect that logic.

> It should be possible to handle a BOM at a
> level below that at which the text declaration is processed.

Works OK for UTF-16 today ... where the BOM is explicitly not part of the document's data, so it's never before the text/xml decl.

> (Of
> course, this can't really be done. If you get encoding="iso-8859-1"
> after a UTF-8 BOM there's something wrong which ought to be reported.)

I guess I'm thinking that a UTF-8 BOM would be a "new feature" that's an error today. Hence it fits with the other backwards-problematic stuff in Blueberry ... though it's a "new feature" that's encoding-specific.

It's already declared to be a fatal error if the declared encoding doesn't match the actual one. (Not that one can always detect that case, since those actual encodings have so many synonyms to recognize.)

- Dave
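
To make the "level below the text declaration" idea concrete, here is a rough Python sketch, not anything from the spec or from a real parser. It strips a leading BOM before looking for the declaration, then reports a mismatch between the BOM and the declared encoding. The names `strip_bom` and `check_entity` are made up for illustration, and accepting a UTF-8 BOM at all is the hypothetical "new feature" discussed above. Encoding synonyms are resolved with Python's own codec registry, which is only an approximation of what an XML processor would recognize.

```python
import codecs
import re

# (BOM bytes, codec used to decode the rest, encoding family it implies)
BOMS = [
    (codecs.BOM_UTF8, "utf-8", "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be", "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16-le", "utf-16"),
]

# Simplified pattern for the encoding pseudo-attribute of an XML/text declaration.
DECL_RE = re.compile(
    r'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def strip_bom(raw):
    """Return (decode_codec, family, remaining_bytes); codec and family are None without a BOM."""
    for bom, codec, family in BOMS:
        if raw.startswith(bom):
            return codec, family, raw[len(bom):]
    return None, None, raw


def check_entity(raw):
    """Return the normalized encoding name, or raise ValueError on a BOM/declaration mismatch."""
    codec, family, body = strip_bom(raw)
    # Without a BOM, assume an ASCII-compatible encoding just to read the declaration.
    text = body.decode(codec or "latin-1", errors="replace")
    match = DECL_RE.match(text)
    declared = match.group(1) if match else None

    if declared is None:
        return family or "utf-8"  # no declaration: trust the BOM, else the UTF-8 default

    try:
        declared_name = codecs.lookup(declared).name  # folds synonyms such as UTF8 / utf-8
    except LookupError:
        raise ValueError("unknown declared encoding: " + declared)

    if family is not None and declared_name != family:
        raise ValueError("BOM implies %s but declaration says %s" % (family, declared))
    return declared_name
```

With this, `check_entity(codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="iso-8859-1"?><a/>')` raises `ValueError`, which is the "something wrong which ought to be reported" case from the quoted text. The BOM never reaches the code that matches the declaration, so the declaration is still the first thing in the document data.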