RE: UTF-8+names


To: "'Elliotte Rusty Harold'" <elharo@m...>,"'Tim Bray'" <tbray@t...>,"'XML Dev'" <xml-dev@l...>
Subject: RE: UTF-8+names
From: "Michael Kay" <michael.h.kay@n...>
Date: Sun, 19 Oct 2003 08:20:47 +0100
Importance: Normal
In-reply-to: <p06002001bbb7431b0313@[192.168.254.4]>
Reply-to: <michael.h.kay@n...>

> Interesting idea and a neat hack. If I'm reading this right, though, 
> it would require writing &lt; in XML as &&lt; and so forth for other 
> genuine entity and character references. 

Actually it says:

In UTF-8+names, the sequence consisting of an "&", a character string,
and a ";" is called a "replacement". The characters contained between
the "&" and the ";" are called the "replacement name" and the Unicode
character sequence which is represented is called the "replacement
value."

and then says:

For replacements whose names are not given a replacement value by this
specification, the replacement value is identical to the replacement
name. For example, the replacement "&U2;" represents the Unicode
character sequence of length 4 containing the characters U+0026
AMPERSAND, U+0055 LATIN CAPITAL LETTER U, U+0032 DIGIT TWO, and U+003B
SEMICOLON.

The two sentences here are in conflict. The rule tells you that the
replacement value for &LT; is "LT", while the example suggests it is
"&LT;".

(Another observation on this rule: it means that the set of recognized
names is frozen for all time and can never be extended.)

I think you would have to write &lt; as &&;lt; If you believe the
example rather than the rule above is correct, you could also write it
as &&#x3c;; or as &#x3C;
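
To make the difference between the two readings concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not the specification's algorithm: the KNOWN table and its single entry are hypothetical, and the pattern assumes a replacement name is any run of characters other than ";".

```python
import re

# Hypothetical table of defined names; the real specification has its own list.
KNOWN = {"alpha": "\u03b1"}

# Assumption: a replacement is "&", any run of characters other than ";", then ";".
REPLACEMENT = re.compile(r"&([^;]*);")


def decode(text, follow_example):
    """Expand replacements in text under one of the two readings.

    follow_example=False: the rule as written (undefined name -> the name).
    follow_example=True:  the "&U2;" example (undefined name -> whole replacement).
    """
    def expand(match):
        name = match.group(1)
        if name in KNOWN:
            return KNOWN[name]
        return match.group(0) if follow_example else name

    return REPLACEMENT.sub(expand, text)


print(decode("&U2;", follow_example=False))   # U2
print(decode("&U2;", follow_example=True))    # &U2;
print(decode("&&;lt;", follow_example=False)) # &lt;
```

Under the rule as written, "&&;" has the undefined name "&", so it decodes to a bare ampersand and "&&;lt;" comes out as "&lt;". Under the reading suggested by the example, an undefined replacement simply passes through unchanged.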

Either way, the thousands of poor users who are already badly confused
about entity references are going to become even more confused.

Michael Kay

Follow-Ups:
- Re: UTF-8+names
  - From: Tim Bray <tbray@t...>

References:
- Re: UTF-8+names
  - From: Elliotte Rusty Harold <elharo@m...>
