Re: UTF-8+names
Tim Bray wrote:

> Well, there's no doubt that +names is optimized for the needs of XML
> users, in that it defines lots of things like &eacute; but *doesn't*
> define the XML magic 5; this means that < and & and so on go
> through untouched, which is what you need for the purposes of XML users.

That's the biggest problem I have with the proposed encoding, actually. A human reader staring at a chunk of UTF-8+names-encoded text can't readily tell whether "&xxx;" is:

(1) part of the encoding, to be replaced by a real character before being fed to the parser, or
(2) not part of the encoding, to be passed through to the parser unchanged, where it will then either (2a) be interpreted as an XML entity reference or (2b) signal an error.

I get the feeling this is just asking for trouble. Just think of what will happen when people start publishing RSS feeds with UTF-8+names-encoded, double-escaped XHTML in the <description> element ('cause you know they will).

Now pretend you're a DPH and you see "&&&semicolon;" in one of these feeds. Answer fast: what does that mean? Can you write a regexp that will process it correctly?

--Joe English

  jenglish@f...
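To make the three cases concrete, here is a minimal sketch in Python of what a naive regexp-based decoding layer might look like. It is not the proposed encoding's actual algorithm: the `NAMES` table is a tiny hypothetical stand-in for the real name list (including the assumption that `semicolon` is one of the defined names), and `classify` and `decode_names` are illustrative helpers only.

```python
import re

# Hypothetical stand-in for the +names table; the real list is far larger.
# Treating "semicolon" as a defined name is an assumption for illustration.
NAMES = {"eacute": "\u00e9", "semicolon": ";"}

# The five predefined XML entities, which the encoding is said to leave alone.
XML_FIVE = {"lt", "gt", "amp", "apos", "quot"}

# Anything shaped like "&name;".
REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def classify(name):
    """Sort a reference name into cases (1), (2a), or (2b) from the post."""
    if name in NAMES:
        return "1: part of the encoding"
    if name in XML_FIVE:
        return "2a: XML entity reference"
    return "2b: passed through, may be an error"


def decode_names(text):
    """Single pass: replace known names, pass everything else through."""
    return REF.sub(lambda m: NAMES.get(m.group(1), m.group(0)), text)


print(classify("eacute"))
print(classify("amp"))
print(classify("xxx"))
print(decode_names("caf&eacute; &lt;b&gt;"))
print(decode_names("&&&semicolon;"))
```

Under these assumptions the last call leaves two bare ampersands followed by a semicolon, which is not well-formed XML, so the regexp runs without complaint but the question of what the author meant is still unanswered. A reader also cannot classify "&xxx;" without the full name table in hand.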