Re: A heavier-weight proposal for character entity definition
On Wed, 2002-02-06 at 20:36, James Clark wrote:
> Interesting. Those are compelling use cases but this significantly
> complicates things. In particular, automatically using entities on output
> becomes much more complicated. Instead of a simple hash table that maps
> character codes to entities, you have to have a trie. I also see a
> slippery slope opening up here:
>
> 1. single character
> 2. base character + combining character(s)/other Unicode modifier (MathML)
> 3. arbitrary sequence of characters (why limit 2? don't want to check
> character types)
> 4. arbitrary well-formed content (3 allows arbitrary text, and for I18N
> arbitrary text needs elements for eg BIDI and ruby)
>
> Not clear what the right place to draw the line is here.

Drawing the line at (3) seems okay to me - that permits lexical
substitution at any point in the processing.

The tree does become a problem at some point, but I suspect combining
characters and surrogates will force us there anyway. Ents doesn't
presently support trees, though it can (hackishly) support multiple
characters. Something to work on...

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com
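
To make the hash-table-versus-trie point above concrete, here is a minimal Python sketch of output-side entity substitution once an entity may stand for a sequence of characters (option 3). It is not how Ents or any other tool discussed in this thread works. The class `EntityTrie`, the function `escape`, and the sample entity table are all hypothetical names chosen for illustration. The trie is needed because one entity's character sequence can be a prefix of another's, so the writer has to look for the longest match instead of doing a single per-character lookup.

```python
class EntityTrie:
    """Hypothetical trie mapping character sequences to entity names."""

    def __init__(self):
        self.children = {}   # next character -> child node
        self.entity = None   # entity name if a sequence ends at this node

    def add(self, chars, name):
        """Register the character sequence `chars` under entity `name`."""
        node = self
        for ch in chars:
            node = node.children.setdefault(ch, EntityTrie())
        node.entity = name

    def longest_match(self, text, start):
        """Return (entity name, length) for the longest registered
        sequence beginning at text[start], or None if nothing matches."""
        node = self
        best = None
        i = start
        while i < len(text) and text[i] in node.children:
            node = node.children[text[i]]
            i += 1
            if node.entity is not None:
                best = (node.entity, i - start)
        return best


def escape(text, trie):
    """Replace every registered sequence in `text` with &name;,
    preferring the longest match at each position."""
    out = []
    i = 0
    while i < len(text):
        match = trie.longest_match(text, i)
        if match:
            name, length = match
            out.append("&" + name + ";")
            i += length
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def build_sample_trie():
    """Illustrative table only: a single character, and the same
    character followed by a combining overlay (U+0338)."""
    trie = EntityTrie()
    trie.add("\u00e9", "eacute")        # case 1: single character
    trie.add("\u2242", "esim")          # case 1: single character
    trie.add("\u2242\u0338", "nesim")   # case 2: base + combining character
    return trie
```

With only single-character entities, a plain dictionary keyed by code point would be enough. The trie earns its keep once `esim` and `nesim` share a first character, because the writer cannot tell which one to emit until it has looked at the following character.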