RE: XML and mainframes, yet again (was RE: So me com
>> It is unnatural to allow #85 as white space in XML as (currently at least) it
>> isn't as far as I know an end of line character in any ascii/unicode based
>> system.
>> So it is completely unlike the situation with #10 and #13.
>
> Ummm, the Unicode consortium has supplied an entire technical report
> http://www.unicode.org/unicode/reports/tr13/ on this fascinating subject.
> The first sentence says "Newlines are represented on different platforms by
> carriage return (CR), line feed (LF), CRLF, or next line (NEL)." That
> implies to me that #85 is completely IDENTICAL to the situation with #10 and
> #13 in Unicode.

Pardon my naive question, but how come Unicode, which can handle different character representations depending on the encoding used, does not have a SINGLE newline code point that would map onto 0x0D0A (CRLF) on some platforms, 0x0D (CR) or 0x0A (LF) on others, 0x85 (NEL) on mainframes, etc.?

If such a character existed, the XML spec could just mention it as a possible whitespace character, letting the parser handle the various end-of-line markers based on the 'encoding' parameter in the <?xml header... A bit like the <?xml encoding="ascii-with-nel" proposed by David. The character encoding would therefore give the parser a hint about the end-of-line encoding of the Unicode newline code point used in the document.

But it's a fantasy, since the concept of character encoding does not include "end-of-line" encoding, does it?

Regards,

Nicolas Lehuen
Responsable R&D / Head of R&D
UBICCO, the Multi-Access Software Vendor
http://www.ubicco.com/
mailto:nicolas.lehuen@u...
Phone : +33 155 040 321
Fax : +33 155 040 304
Mobile: +33 661 907 640
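
To illustrate the idea in the message above, here is a minimal sketch of what "letting the parser handle the various end-of-line markers" could look like once the bytes have been decoded to Unicode. The function name normalize_newlines is hypothetical, and the choice to fold everything to LF (#10) is an assumption made for the example, not something the XML spec discussed in this thread prescribes.

```python
import re

# The four newline representations named in the quoted Unicode report:
# CRLF, CR, LF and NEL (U+0085). CRLF is listed first so that the pair
# collapses to a single newline instead of two.
_NEWLINE = re.compile("\r\n|\r|\n|\u0085")


def normalize_newlines(text: str) -> str:
    """Fold every recognized end-of-line marker to a single LF.

    Hypothetical helper: operates on already-decoded text, so the
    byte-level encoding (ASCII, EBCDIC, ...) is handled before this step.
    """
    return _NEWLINE.sub("\n", text)


if __name__ == "__main__":
    sample = "unix\nmac\rdos\r\nmainframe\u0085end"
    print(normalize_newlines(sample).split("\n"))
```

With this approach the encoding declaration only governs how bytes become code points; the end-of-line convention is then resolved in a separate pass over the decoded text.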