Re: expat whitespace weirdness?

  • From: Lars Marius Garshol <larsga@g...>
  • To: "'xml-dev@x...'" <xml-dev@x...>
  • Date: Mon, 17 Jul 2000 14:16:52 +0200

* Tim Crook
|
| I was looking around to see if there might have been a particular
| reason why expat was implemented such that no leading white space is
| allowed before the standard <?xml version="1.0" ?> line. 

The reason is that the XML recommendation requires it. :-)
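The behaviour is easy to reproduce. The following is a minimal sketch using Python's standard `xml.parsers.expat` binding; the helper name `parses_ok` is made up for illustration and is not part of expat.

```python
from xml.parsers import expat


def parses_ok(data: bytes) -> bool:
    """Return True if expat accepts `data` as a complete document."""
    parser = expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(data, True)
        return True
    except expat.ExpatError:
        return False


# Declaration at the very start of the stream: accepted.
print(parses_ok(b'<?xml version="1.0"?><a/>'))
# A single leading space before the declaration: rejected.
print(parses_ok(b' <?xml version="1.0"?><a/>'))
# Leading whitespace is fine when there is no declaration at all.
print(parses_ok(b' <a/>'))
```

The third case shows that the restriction applies to the XML declaration specifically, not to whitespace before the root element in general.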

| From my understanding of things, the Byte Order Mark is what allows
| an XML parser to determine which character set is in use. 

Not really. It allows a parser to determine whether UTF-16 was used,
and if so which variety of UTF-16 (BE or LE). However, if UTF-16 is
not used then the encoding can basically be anything.

| (see Appendix F, Autodetection of Character Encodings in
| http://www.w3.org/TR/REC-xml) If the Byte Order Mark is not found,
| shouldn't the starting content of the data stream be discarded until
| the Byte Order Mark is located?

If the BOM is not at the beginning of the data stream then there most
likely isn't one, for example because iso-8859-1 was used. This is
what makes it so handy that the XML declaration must appear first in
the document if it appears at all.

The rules then become something like:

 a) does the stream begin with a BOM? if yes, assume UTF-16
 b) does the stream begin with an XML declaration (in some encoding
    that the parser is able to figure out)? if yes, see what the
    encoding pseudo-attribute says.
 c) assume UTF-8
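
As a rough illustration, the three rules above could be sketched in Python as follows. This is a simplification, not the full Appendix F algorithm: the function name `guess_xml_encoding` is hypothetical, and rule (b) is only handled for ASCII-compatible encodings, where the declaration can be read byte for byte.

```python
import re

# Matches the encoding pseudo-attribute inside an XML declaration.
_ENCODING_RE = re.compile(
    r"""encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def guess_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML byte stream using rules a) to c)."""
    # a) A BOM at the very start means UTF-16; its byte order tells us
    #    which variety.
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"

    # b) An XML declaration at the very start (assumed readable as
    #    ASCII here): use the encoding pseudo-attribute if present.
    if data.startswith(b"<?xml"):
        end = data.find(b"?>")
        if end != -1:
            decl = data[:end].decode("ascii", errors="replace")
            match = _ENCODING_RE.search(decl)
            if match:
                return match.group(1).lower()

    # c) Otherwise assume UTF-8.
    return "utf-8"


print(guess_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
print(guess_xml_encoding(b'<?xml version="1.0"?><a/>'))
```

Because rule (b) only looks at the first bytes of the stream, anything placed before the declaration (including whitespace) makes the sketch fall through to rule (c), which is the reason the declaration has to come first.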

--Lars M.

