Re: XML schema xs:string and non BMP character like 𐌀,
Martin Honnen wrote:

> I am seeing inconsistencies between different schema validating parsers
> when it comes to Unicode characters outside of the BMP, like 𐌀
> for instance, and length restrictions on xs:string.

> However Xerces Java 2.11 reports "[Error] oneCharInstance1.xml:3:25:
> cvc-length-valid: Value '?' with length = '2'
> is not facet-valid with respect to length '1' for type 'one-char'." so
> it seems to consider the contents of the "test" element as a string with
> two characters.

Based on David's and Pete's agreement I went ahead and filed a bug
https://issues.apache.org/jira/browse/XERCESJ-1592 on Xerces Java but it
turned out the issue had been resolved weeks ago independent of this
thread by

"Introducing a system property for controlling how string length is
computed by the schema validator. When
org.apache.xerces.impl.dv.xs.useCodePointCountForStringLength=true, the
length of an xs:string or xs:anyURI value is calculated by counting the
number of Unicode code points in the string. The value of the system
property is false by default, preserving the long standing behaviour of
computing length in Java chars (i.e. String.length())."

--

	Martin Honnen --- MVP Data Platform Development
	http://msmvps.com/blogs/martin_honnen/
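For anyone wanting to see the two ways of counting side by side, here is a minimal sketch in plain Java, independent of Xerces. The class and method names (CodePointLengthDemo, charLength, codePointLength) are made up for illustration; only String.length(), String.codePointCount() and the system property name quoted above come from the JDK and the bug resolution. Setting the property from code, and doing so before the validator is created, is an assumption on my part and would only matter with a Xerces build that contains the change.

```java
class CodePointLengthDemo {

    // Length in Java chars (UTF-16 code units): the default behaviour
    // described in the quoted resolution, i.e. String.length().
    static int charLength(String s) {
        return s.length();
    }

    // Length in Unicode code points: what the validator is said to use
    // when the system property is set to true.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+10300 OLD ITALIC LETTER A lies outside the BMP, so Java
        // stores it as a surrogate pair (two chars).
        String oldItalicA = new String(Character.toChars(0x10300));

        System.out.println("chars:       " + charLength(oldItalicA));      // prints 2
        System.out.println("code points: " + codePointLength(oldItalicA)); // prints 1

        // Assumption: the property is read by a Xerces build that includes
        // the change; setting it here has no effect on this demo itself.
        System.setProperty(
            "org.apache.xerces.impl.dv.xs.useCodePointCountForStringLength",
            "true");
    }
}
```

The same property could presumably also be passed on the command line as -Dorg.apache.xerces.impl.dv.xs.useCodePointCountForStringLength=true, which avoids any question of when it is read.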