  • From: John Cowan <cowan@l...>
  • To: XML Dev <xml-dev@i...>
  • Date: Tue, 02 Mar 1999 10:39:24 -0500

MURATA Makoto wrote:

> It is my understanding that Unicode 3.0 will have many ideographic
> characters which are outside of the BMP.

The Unicode Consortium has indicated on its mailing list
that no non-BMP characters will appear in Unicode 3.0.
(Unless Vertical Extension A is being put in Plane 2 after all?)

> >An application receiving data may either use these signatures to
> >identify the coded representation form, or may ignore them and treat
> >FEFF as the ZERO WIDTH NO-BREAK SPACE character.
> How do you interpret this "or"?

I interpret it as "inclusive or", "and/or", "vel".

> One could argue that when EF BB BF
> is recognized as a signature, it is not treated as the ZWNBSP.

I think that it may or may not be treated as the ZWNBSP. In any event,
the whole annex is informative, and describes "a convention [...]
applied by a certain class of applications".  It is reasonable to
suppose that XML is not in that class of applications, at least
so far as UTF-8 recognition is concerned.
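
To make the two readings concrete, here is a minimal sketch in Python
of a receiver that either consumes a leading EF BB BF as a signature
or passes it through as U+FEFF. The function name and the
strip_signature flag are hypothetical, chosen only for illustration;
nothing here is taken from the annex or from the XML specification.

```python
# U+FEFF (ZERO WIDTH NO-BREAK SPACE) encoded as UTF-8.
UTF8_SIGNATURE = b"\xef\xbb\xbf"


def decode_utf8(data: bytes, strip_signature: bool) -> str:
    """Decode UTF-8 bytes under one of the two readings of the annex.

    strip_signature=True:  a leading EF BB BF is treated as a signature
                           and dropped from the decoded text.
    strip_signature=False: the signature is ignored as such, and U+FEFF
                           stays in the text as an ordinary character.
    """
    # The plain "utf-8" codec keeps a leading U+FEFF in the result.
    text = data.decode("utf-8")
    if strip_signature and data.startswith(UTF8_SIGNATURE):
        return text[1:]  # drop the single U+FEFF code point
    return text


sample = UTF8_SIGNATURE + b"<doc/>"
as_signature = decode_utf8(sample, strip_signature=True)
as_character = decode_utf8(sample, strip_signature=False)
```

Under the first reading as_signature begins with "<", and under the
second as_character begins with U+FEFF. Both are consistent with the
quoted wording, which is why I read the "or" as inclusive.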

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@c...
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
