Private-Use Characters (RE: Character Range: surrogate blocks)
> From: owner-xml-dev@i... [mailto:owner-xml-dev@i...] On Behalf Of
> Richard Emberson

> To extend the available characters in Unicode one
> can use two 16 bit characters with surrogate blocks.

If you want to extend the available characters with your own, use the "Private-Use" or "user-defined" character block. The surrogates are codepoints reserved for messing up software later as more registered national character sets are added; they are not for private use. Implementors of current systems can ignore them at least for the next year, as far as I know.

FIRST, check that your character could not be represented by using an existing ISO 10646 character with some appropriate attribute on the element. In particular, if it is a regional variant of a character, try to use the xml:lang attribute. Note that a "language" includes far more than just simple regional language: I could have xml:lang='en-US-legal' to indicate US legalese, or xml:lang='x-physics' to indicate the language of physics, a language that has not been recognised by IANA. In that case, your stylesheet can say "Oh, this is an X, but an X to be rendered as physicists will want it rendered."

NEXT, if you need mathematical characters, check out MML http://www.w3.org/TR/REC-MathML/chapter6.html first.

FINALLY, there are two contradictory needs for a user-defined character: searching (collation) and display. Which fits you?

If your primary need is DISPLAY, then it is better to use an entity reference for the character. The corresponding entity contains an element with a hypertext reference to the glyph of the character, e.g.

  <!ENTITY my-alpha "<http:img src='url'/>">

If your system is smart, you could use content-negotiation to get the best form: GIF or whatever. (And it lets you tie into some Web fonts system, as that becomes available.) If you also need a little bit of collatability, you could add an attribute to indicate collation sequence position.
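Going back to the xml:lang suggestion earlier in this message, here is a rough Python sketch of how a processing step might pick a rendering for the same character based on xml:lang. The element name `sym`, the style names, and the `STYLE_BY_LANG` table are all made up for illustration; only the xml:lang values come from the message. It looks only at the attribute on the element itself and does not handle inheritance of xml:lang from ancestors.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical table: language tag -> rendering style name.
STYLE_BY_LANG = {
    "x-physics": "physics-glyph",
    "en-US-legal": "legal-glyph",
}


def style_for(element, default="plain-glyph"):
    """Pick a rendering style from the element's own xml:lang attribute."""
    return STYLE_BY_LANG.get(element.get(XML_LANG), default)


# The same character "X" appears three times with different language tags.
doc = ET.fromstring(
    "<doc>"
    "<sym xml:lang='x-physics'>X</sym>"
    "<sym xml:lang='en-US-legal'>X</sym>"
    "<sym>X</sym>"
    "</doc>"
)

styles = [style_for(sym) for sym in doc.findall("sym")]
print(styles)
```

The character data stays an ordinary ISO 10646 character throughout; only the choice of rendering varies with the language tag.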
If your primary need is for simple SEARCHING (collation) rather than presentation, then use the Private-Use area. (In the Private-Use characters, avoid using E200-E600; MML uses them.) You should always enter any of the Private-Use area characters using a numeric character reference (or, if you use these characters more than once, or want to provide a modicum of documentation, define an entity for them and use an entity reference). This will prevent possible transcoding errors later, and it also makes the text more readable in editors which do not allow private-use characters to be added.

(Western readers may be surprised that allowing user-defined characters is not uncommon in CJK publishing software, since the standard sets only go so far, even though it is almost unheard of in the West.)

Rick Jelliffe
<kisses xml:lang='x-love'>XXX</kisses>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
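As a rough illustration of the numeric character reference advice in the message above, the following Python sketch rewrites any Private-Use character in a string as a hexadecimal character reference and builds a matching entity declaration. The function names and the entity name `my-glyph` are invented for this example. The E200-E600 range is taken from the message as written and is not independently checked here; the E000-F8FF bounds are the Private Use Area of the Basic Multilingual Plane.

```python
# Private Use Area of the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF

# Range the message says to avoid because MML uses it (assumption from the message).
MML_START, MML_END = 0xE200, 0xE600


def is_private_use(codepoint):
    """True if the code point lies in the BMP Private Use Area."""
    return PUA_START <= codepoint <= PUA_END


def clashes_with_mml(codepoint):
    """True if the code point falls in the range the message says MML uses."""
    return MML_START <= codepoint <= MML_END


def escape_private_use(text):
    """Replace each Private-Use character with a numeric character reference."""
    return "".join(
        f"&#x{ord(ch):04X};" if is_private_use(ord(ch)) else ch
        for ch in text
    )


def entity_declaration(name, codepoint):
    """Build a DTD entity declaration that stands for one Private-Use character."""
    if not is_private_use(codepoint):
        raise ValueError("code point is outside the Private Use Area")
    return f'<!ENTITY {name} "&#x{codepoint:04X};">'


escaped = escape_private_use("a\ue000b")
declaration = entity_declaration("my-glyph", 0xE000)
print(escaped)
print(declaration)
```

Writing the reference out in ASCII like this means the document survives a change of encoding and stays legible in editors that cannot display the character.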