RE: Unix/Java design issues (Was: Re: Is CDATA "structure"?)
Sorry for barging into this conversation about CR/LF, and I hope I have not misunderstood what the actual discussion is about. The LF vs. CRLF vs. CR question is surely being debated correctly in theory (the spec) and for viewing in Notepad, but pragmatically, does this really pose a problem?

The default encoding for XML is UTF-8. So in almost any text viewer/editor on Windows, under normal(?) circumstances, an XML file will look strange anyway, since these applications do not understand UTF-8. The API on XML, for example DOM, is also UTF-8, which most applications may treat as 7-bit ASCII, but when writing generic applications it should be treated as UTF-8. Windows is not UTF-8 aware, so the data has to be converted to Unicode anyway.

In the small, almost DOM-like API I wrote, the API on the XML structure handles all data as UTF-8, and all 'newlines' are LF. In addition to DOM, my API defines conversion functions for the cases where data is passed to or from the outside world, mostly the GUI or some kind of database. In the Windows implementation of my API, the conversion functions convert from the Windows code page or Unicode to UTF-8, converting each CR/LF into LF automatically, and from UTF-8 to the Windows code page or Unicode, converting each LF into CR/LF. On Unix platforms, the conversion preserves the single LF. This way, on every platform, the API delivers and expects the kind of 'newlines' that platform expects. With a good parser (I use SP and expat), the data in the internal XML structure always uses a single LF.

The above is true for writing XML applications in C or C++. With Java, isn't the Java engine supposed to handle it likewise? I think it does.

So any platform such as Windows or Macintosh may use its favorite 'newline' sequence, but that does not, or should not, affect XML applications. It is true, though, that it's a pity they treat 'newlines' differently, and it will slightly hurt the performance of these applications.
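The boundary conversion described above might be sketched roughly as follows. This is only an illustration in Python, not the actual C/C++ API from this post: the function names `platform_to_internal` and `internal_to_platform`, the `codepage` and `newline` parameters, and the choice of `cp1252` as the Windows code page are all assumptions made for the example. Treating a lone CR as a newline is also an addition here; the post itself only mentions CR/LF.

```python
def platform_to_internal(data: bytes, codepage: str = "cp1252") -> bytes:
    """Convert platform bytes to the internal form: UTF-8 with LF newlines.

    `codepage` is a hypothetical parameter naming the platform encoding.
    """
    text = data.decode(codepage)
    # Collapse CR/LF first, then any remaining lone CR (assumption, see above).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


def internal_to_platform(data: bytes, codepage: str = "cp1252",
                         newline: str = "\r\n") -> bytes:
    """Convert internal UTF-8/LF bytes to the platform encoding and newline.

    Pass newline="\\n" for a Unix-style platform, which preserves the single LF.
    """
    text = data.decode("utf-8")
    if newline != "\n":
        text = text.replace("\n", newline)
    return text.encode(codepage)


if __name__ == "__main__":
    windows_bytes = "first line\r\nsecond line".encode("cp1252")
    internal = platform_to_internal(windows_bytes)
    # Going back out to the same platform restores the original bytes.
    print(internal_to_platform(internal) == windows_bytes)
```

The point of keeping both conversions at the edge is that everything between them (parser output, the in-memory tree, application logic) only ever sees a single LF, whichever platform the data came from.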
For simply viewing text in a text viewer/editor, there are many Windows text editors around that display Unix and Windows text files correctly (though it is true that most display them incorrectly if the file mixes CRLF and LF). And when copying text files from Windows to Macintosh, or the other way around, automatic conversion takes place (I think, or at least it does with the little "read Mac disks" program on my Windows machine).

Best regards,
Arthur Rother

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)