Re: XML Blueberry
Vincent-Olivier Arsenault wrote:

> But couldn't that be deduced from the binary representation (on the
> platform level) so that the parser just has to deal with a "current" (at
> the time of the parser implementation) UNICODE spec string? Why does the
> (XML) parser need to know the charset used?

It does not in one sense. But it needs to know what characters are
and are not legal in names, which is quite independent of encoding:
DOUBLE DAGGER is illegal, whether you encode it 0x2021 (UTF-16) or
0x87 (CP-1252). Since the list of encoded characters is still
growing, although slowly, new name characters come into existence
from time to time.

>> That's dangerous: it leads to interop failures. What if the version of
>> Java at the receiving end has slightly different tables from the one
>> at the sending end?
>
> That's not XML interop but UNICODE interop.

The one depends on the other.

> Aren't such "recovery"
> mechanism specified in UNICODE?

No.

> And anyways the problem exists with
> implementations based on the current spec, you said it yourself : some
> parsers have tables and some don't.

In which case one is RIGHT and the other is WRONG, because there
is a normative list of characters in the XML Rec. We can check.

> Think abstraction!

Think hairiness.

-- 
There is / one art             || John Cowan <jcowan@r...>
no more / no less              || http://www.reutershealth.com
to do / all things             || http://www.ccil.org/~cowan
with art- / lessness           \\ -- Piet Hein
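
The point that name legality is independent of encoding can be sketched in a few lines of Python. This is only an illustration, not a parser: `NAME_LETTER_RANGES_EXCERPT` and `is_name_letter_excerpt` are hypothetical names, and the table is a tiny Latin-only excerpt of the XML 1.0 BaseChar ranges, assumed here to stand in for the full normative list. DOUBLE DAGGER is U+2021 whichever byte sequence carried it, and a fixed table gives the same answer on every platform.

```python
# Hypothetical, deliberately incomplete excerpt of the XML 1.0 BaseChar
# ranges (Latin letters only). The normative list in the XML Rec is far
# longer; this subset is an assumption made for illustration.
NAME_LETTER_RANGES_EXCERPT = [
    (0x0041, 0x005A),  # A-Z
    (0x0061, 0x007A),  # a-z
    (0x00C0, 0x00D6),
    (0x00D8, 0x00F6),
    (0x00F8, 0x00FF),
]


def is_name_letter_excerpt(ch: str) -> bool:
    """Check one character against the fixed excerpt table above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_LETTER_RANGES_EXCERPT)


# The same character arriving in two different encodings.
from_cp1252 = b"\x87".decode("cp1252")         # one byte in CP-1252
from_utf16 = b"\x20\x21".decode("utf-16-be")   # two bytes in big-endian UTF-16

# Both decode to the same code point, so the legality check cannot
# depend on which encoding was used on the wire.
same_character = from_cp1252 == from_utf16 == "\u2021"
dagger_is_legal = is_name_letter_excerpt(from_cp1252)
```

Because the table is hard-coded rather than taken from the platform's Unicode library, two implementations that use it cannot disagree about whether a given character may appear in a name, which is the interoperability concern raised above.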