Re: XML 1.1 and Unicode normalization
Rick Jelliffe writes:
>One good argument against text normalization is that the APIs just
>don't exist. (Putting ICU aside, and waiting like a bride at the altar
>for Java 1.5) However, the normalization APIs don't exist because the
>libraries are based on Unicode circa version 3.0. So saying we cannot
>have text normalization because the libraries don't exist is really
>tieing us to obsolescent Unicode versions.

Please pardon my ignorance on this matter, as a brief look at the Unicode site hasn't helped.

Is there a machine-readable list of character sequences for normalization that is updated from version to version? I can find normalization corrections, and the enormous but not very comprehensible Derived Normalization Properties, but I don't see a single list of pathways.

Normalization doesn't seem all that different from some of the work I'm doing in character entities, and it seems like a declarative list of normalization sequences would make it a lot easier for us to forget about specific APIs and write normalizers which keep up with Unicode.

Any thoughts on this? I'd be happy to do some of this work, if it isn't already there.

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com -- http://monasticxml.org
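To make the "declarative list" idea concrete, here is a rough Python sketch of what a table-driven canonical decomposition might look like. The `DECOMPOSITIONS` table is a tiny hand-written excerpt for illustration only, and the names `decompose`, `reorder`, and `table_nfd` are made up for this example. A real table would have to be generated from the Unicode data files for whichever version is being tracked. The sketch leans on Python's `unicodedata.combining` for combining classes, which in a fully declarative setup would also come from the table, and it ignores Hangul syllables, which are decomposed arithmetically rather than by lookup.

```python
import unicodedata

# Hypothetical, hand-written excerpt of a declarative decomposition table.
# Assumption: a real table would be generated from the Unicode data files.
DECOMPOSITIONS = {
    "\u00e9": "e\u0301",       # e with acute -> e + COMBINING ACUTE ACCENT
    "\u00c5": "A\u030a",       # A with ring -> A + COMBINING RING ABOVE
    "\u212b": "\u00c5",        # ANGSTROM SIGN -> A with ring (decomposes further)
    "\u1e69": "\u1e63\u0307",  # s with dot below and above -> s with dot below + dot above
    "\u1e63": "s\u0323",       # s with dot below -> s + COMBINING DOT BELOW
}


def decompose(text, table=DECOMPOSITIONS):
    """Recursively expand every character that has an entry in the table."""
    out = []
    for ch in text:
        if ch in table:
            out.extend(decompose(table[ch], table))
        else:
            out.append(ch)
    return out


def reorder(chars):
    """Stable-sort each run of combining marks by canonical combining class."""
    result, run = [], []
    for ch in chars:
        if unicodedata.combining(ch):
            run.append(ch)
        else:
            # A starter (class 0) ends the current run of marks.
            result.extend(sorted(run, key=unicodedata.combining))
            run = []
            result.append(ch)
    result.extend(sorted(run, key=unicodedata.combining))
    return result


def table_nfd(text, table=DECOMPOSITIONS):
    """Table-driven approximation of NFD for characters covered by the table."""
    return "".join(reorder(decompose(text, table)))
```

For the handful of characters in the sample table, the result can be compared against a library normalizer, for example `table_nfd("\u212b") == unicodedata.normalize("NFD", "\u212b")`. Keeping up with a new Unicode version would then mostly mean regenerating the table, not waiting for a new API.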