RE: expat whitespace weirdness?
Tim Crook wrote:
> I was looking around to see if there might have been a
> particular reason why expat was implemented such that no leading
> white space is allowed before the standard <?xml version="1.0" ?>
> line. You get the error XML_ERROR_MISPLACED_XML_PI if there are any
> leading carriage returns, line feeds, spaces or tabs.

From the XML Rec [1]:

[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

I.e., if the XML declaration is in the stream, it must occupy the first characters of the stream passed to the parser.

> From my understanding of things, the Byte Order Mark is
> what allows an XML parser to determine which character set in use.
> (see Appendix F, Autodetection of Character Encodings in
> http://www.w3.org/TR/REC-xml) If the Byte Order Mark is not found,
> shouldn't the starting content of the data stream be discarded
> until the Byte Order Mark is located?

Yes. But by the application (or other parser user), not the parser. Note also that Appendix F is NON-normative -- compliant parsers are not required to produce results consistent with it.

Steve Rowe
MNIS-TextWise Labs

[1] http://www.w3.org/TR/REC-xml
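
To illustrate the point that any cleanup belongs to the application rather than the parser, here is a minimal sketch using Python's `xml.parsers.expat` binding to expat. The helper names `strip_leading_whitespace` and `parses_cleanly` are hypothetical, made up for this example, and the sketch assumes an ASCII-compatible encoding such as UTF-8 with no byte order mark; a UTF-16 stream would need different handling.

```python
import xml.parsers.expat

# The four characters matched by the XML "S" production:
# space, tab, carriage return, line feed.
XML_WHITESPACE = b" \t\r\n"


def strip_leading_whitespace(data: bytes) -> bytes:
    """Hypothetical helper: drop leading XML whitespace so that an XML
    declaration, if present, starts at the first byte of the stream.
    Assumes an ASCII-compatible encoding (e.g. UTF-8) and no BOM."""
    return data.lstrip(XML_WHITESPACE)


def parses_cleanly(data: bytes) -> bool:
    """Hypothetical helper: True if expat accepts the whole document."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


# A document with a newline and spaces ahead of the XML declaration.
doc = b'\n  <?xml version="1.0" ?><root/>'

rejected_as_is = not parses_cleanly(doc)
accepted_after_strip = parses_cleanly(strip_leading_whitespace(doc))
```

The application decides what to discard before handing bytes to the parser; the parser itself keeps enforcing production [23] as written.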