Subject:Unicode char set validation Author:Jeff Horan Date:14 Aug 2003 11:02 AM
How do I validate that the character encoding named in the XML header line is actually correct? I need some way to guarantee that a file really is in the declared encoding; otherwise it should kick out an error. Is that possible?
Subject:Re: Unicode char set validation Author:Minollo I. Date:14 Aug 2003 11:17 AM
I'm not sure I understand the question; do you want to validate that the
declared encoding is supported by a parser? Or do you want to validate that
the declared encoding is in sync with what the actual encoding is?
In the first case, if Stylus opens the document, and you are able to view
it in tree mode, then the encoding has been successfully parsed and understood.
In the second case, there isn't really a way; the encoding named in the XML
declaration is what tells the parser how to decode the document, so if the
two are out of sync, you will be in trouble. When you save a document from
inside Stylus, the document is encoded according to the information found in
the XML declaration (or as UTF-8 if that information is missing). When you
load a document inside Stylus, the XML declaration is used to retrieve the
specified encoding, and the document is parsed accordingly; the only
exception to the rule is if the XML document starts with a BOM (byte order
mark), in which case we also try to interpret the document according to what
the BOM indicates.
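
Outside of Stylus, a partial check can still be scripted. The following is a
minimal Python sketch, not a description of what Stylus does internally; the
function name check_encoding and the BOMS table are hypothetical names chosen
for illustration. It compares the BOM (if any) with the encoding named in the
XML header and then attempts a strict decode. It only catches some mismatches:
single-byte encodings such as ISO-8859-1 accept any byte sequence, so a wrong
label of that kind goes undetected, and UTF-16 files without a BOM are not
handled.

```python
import codecs
import re

# (BOM bytes, encoding family, codec used to read the header).
# UTF-32 comes first because its little-endian BOM begins with the UTF-16 one.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32", "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32", "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8", "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16", "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16", "utf-16-be"),
]

# Matches the encoding name inside the XML header at the start of the file.
DECL_RE = re.compile(
    r'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def check_encoding(data):
    """Return a list of problems found; an empty list means none detected."""
    problems = []

    # 1. Look for a BOM at the very start of the file.
    bom, family, head_codec = b"", None, "latin-1"
    for candidate, fam, codec in BOMS:
        if data.startswith(candidate):
            bom, family, head_codec = candidate, fam, codec
            break

    # 2. Read the declared encoding from the first few hundred bytes.
    head = data[len(bom):len(bom) + 400].decode(head_codec, errors="replace")
    match = DECL_RE.search(head)
    declared = match.group(1) if match else "utf-8"  # UTF-8 when missing

    # 3. Is the declared name an encoding this runtime knows at all?
    try:
        canonical = codecs.lookup(declared).name
    except LookupError:
        return ["unknown encoding name: %s" % declared]

    # 4. Does the BOM agree with the declaration?
    if family is not None and not canonical.startswith(family):
        problems.append("BOM says %s but header says %s" % (family, declared))

    # 5. Do the bytes actually decode under the declared encoding?
    try:
        data.decode(declared)
    except UnicodeDecodeError as exc:
        problems.append("bytes are not valid %s: %s" % (declared, exc))

    return problems
```

For example, calling check_encoding on a file whose header says UTF-8 but
whose body contains a lone 0xE9 byte (a Latin-1 "e acute") returns one
problem, because that byte sequence is not valid UTF-8.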