Re: Internationalising Regular Expressions
AndrewWatt2000@a... scripsit:

> So, <xsd:pattern value="\w" /> would match many (unwanted) characters that
> <xsd:pattern value="[A-Za-z0-9_]" /> would reject as non-matching. Correct?

Definitely.

> In W3C XML Schema, and therefore in XForms, is it correct that the only way
> to express the notion of an English language / ASCII "word character" in a
> regular expression is using [A-Za-z0-9_]?

Correct.

> Is there any facility to express the notion of, for example, a French word
> character? Or German?

You'd have to concoct a similar character class, and there is always
a measure of controversy about these things. The standard English
spellings of "naïve" and "façade" require letters outside [A-Za-z],
and so does one spelling of "coöperate".

> Or is the \p{Basic_Latin} the smallest / most precise
> "chunk" of characters that can be used in such a setting?

That certainly doesn't do what you want: it matches any ASCII character,
rejecting the non-ASCII ones.

--
John Cowan  jcowan@r...  http://www.reutershealth.com
We call nothing profound that is not wittily expressed.
        --Northrop Frye (improved)
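The distinctions above can be tried out with a rough sketch in Python. This is not XML Schema's regex engine. Python's Unicode-aware \w only approximates the XSD \w, the Basic Latin block is approximated here by the code point range U+0000 to U+007F, and FRENCH_WORD is a hypothetical hand-built class of the "concocted" kind, whose exact membership is open to the usual controversy.

```python
import re

# ASCII "word characters" only, as in the explicit class from the thread.
ASCII_WORD = re.compile(r"[A-Za-z0-9_]+")

# Python's Unicode-aware \w: an approximation of the XSD \w, not identical to it.
UNICODE_WORD = re.compile(r"\w+")

# Hypothetical hand-built class for French; the choice of letters is an assumption.
FRENCH_WORD = re.compile(r"[A-Za-z0-9_àâæçéèêëîïôœùûüÿÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ]+")


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


def is_basic_latin(text):
    """Stand-in for a Basic Latin block test: every code point is <= U+007F."""
    return all(ord(ch) <= 0x7F for ch in text)


if __name__ == "__main__":
    for word in ["naive", "naïve", "façade", "coöperate", "слово", "a b!"]:
        print(
            word,
            "ascii:", accepts(ASCII_WORD, word),
            "unicode \\w:", accepts(UNICODE_WORD, word),
            "french:", accepts(FRENCH_WORD, word),
            "basic latin:", is_basic_latin(word),
        )
```

Note that "a b!" passes the Basic Latin test but is not a word under any of the three classes, while "coöperate" fails the hypothetical French class because ö was left out of it.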







