RE: UTF-8+names
> -----Original Message-----
> From: Seairth Jacobs [mailto:seairth@s...]
> Sent: Saturday, October 18, 2003 15:09
> To: xml-dev
> Subject: Re: UTF-8+names
>
> From: "Tim Bray" <tbray@t...>
> >
> > Check out http://tbray.org/tag/utf-8+names.html
>
> Instead of throwing an error for a missing semicolon, why not
> let that case fall under section 4? Since the set of
> pre-defined reference names is known, you also know the
> maximum length possible. If you reach length+1 characters,
> you know it's not a reference and leave it as-is. It also
> means that a lone ampersand won't be an error. I suspect the
> motivation to require the semicolon is for consistency with
> XML's same requirements. However, this disallows some valid
> UTF-8 from being valid UTF-8+names as well. While this may
> be fine if used solely within the context of XML (which would
> throw an error as soon as it hit the invalid reference
> anyhow), I suspect people will try using this outside of XML as well.
>
> Also, I suspect that using the same format as XML/SGML's
> references is going to confuse people. Maybe use a similar
> format such as #name; or @name;. This way, at least the two
> references (UTF-8+name and XML) would not be confused.

I think there are other problems. As I understand, in UTF-8+name, an ampersand is represented as

    &&;

which means that, if UTF-8+name is used for XML, "normal" entity references will look like:

    &&;myentity;

and numeric character references will look like:

    &&;#12345;

which is ugly.

In addition, "UTF-8+name entities" would have the usual syntax:

    &lt;

but this can be confusing because it would denote a *literal* < character, not one obtained by including the entity. As a consequence, I would *not* be allowed to use &lt; in the value of an attribute, for example, but would have to use &&;lt; for the same purpose.

I think that these problems can be overcome by using something different from the ampersand, as you suggest.
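To make the two points above concrete, here is a rough Python sketch. It is not taken from the UTF-8+names draft. It assumes only what this thread states: a literal ampersand is written as &&;, a named reference is &name;, and (in lenient mode) an ampersand with no known name and semicolon within the maximum name length is left as-is. The NAMES table and the function names are hypothetical.

```python
# Hypothetical, deliberately tiny name table; the draft's real table is not reproduced here.
NAMES = {"lt": "<", "gt": ">", "eacute": "\u00e9"}
MAX_NAME_LEN = max(len(name) for name in NAMES)


def encode_ampersands(text):
    """Escape every literal ampersand as '&&;' (the rule as described in this thread)."""
    return text.replace("&", "&&;")


def decode(text, lenient=False):
    """Expand '&&;' and '&name;' references.

    strict (default): an unrecognised or unterminated reference is an error.
    lenient: such an ampersand is copied through unchanged, as suggested
    in the quoted message.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            out.append(text[i])
            i += 1
            continue
        if text.startswith("&&;", i):
            out.append("&")
            i += 3
            continue
        # Look for ';' no further than MAX_NAME_LEN characters past the '&'.
        end = text.find(";", i + 1, i + 2 + MAX_NAME_LEN)
        name = text[i + 1:end] if end != -1 else None
        if name in NAMES:
            out.append(NAMES[name])
            i = end + 1
        elif lenient:
            out.append("&")
            i += 1
        else:
            raise ValueError(f"invalid reference at offset {i}")
    return "".join(out)
```

Under these assumptions, encode_ampersands("&myentity; &#12345;") produces the doubled-up form shown above, decode("&lt;") yields a literal < while decode("&&;lt;") yields the four characters of the XML reference, and decode("AT&T") fails unless lenient=True is passed.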
As for the final ; being mandatory, I believe it should be, for robustness.

It is not very clear to me where UTF-8+name would be useful, as I don't think it is useful in XML. Is it being proposed for use in areas where, for some reason, XML cannot be used?

Alessandro Triglia
OSS Nokalva

> ---
> Seairth Jacobs (seairth@s...)
> Looking: http://www.seairth.com/blog/65