Re: Filtering, xslt 2.0
On Wed, Nov 02, 2022 at 02:10:09PM -0000, Eliot Kimber eliot.kimber@xxxxxxxxxxxxxx scripsit:
> The second argument to tokenize() is a regular expression, so ", *" means
> "comma followed by zero or more spaces".
>
> I would write it as ",\s*", which is clearer and handles all white space
> (space, tab, etc.).
This is true, though I would note that in general, the Unicode character
category,
tokenize($param,',\p{Zs}*')
can be safer. \s usually matches a space, a tab, a carriage return, a
line feed, or a form feed, but what the exact match is depends on the
regular expression implementation. Whereas you know what Zs,
"Separator, spaces", is and unlike \s it includes U+00A0 NO-BREAK SPACE.
Less of a concern with a param but potentially helpful with document
content when you mean "spaces between words" more than you mean "a
pre-Unicode general definition of white space".
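
A minimal sketch of the difference, assuming an XSLT 2.0 processor that can start at a named template (Saxon's -it:main, for example). The template name "main" and the sample value of $param are made up for illustration. In XPath 2.0 regular expressions \s covers only space, tab, line feed, and carriage return, so the no-break space after the first comma stays attached to the second token in the first tokenize() call and is consumed as part of the separator in the second.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical sample value: the first comma is followed by
       U+00A0 NO-BREAK SPACE, the second by an ordinary space. -->
  <xsl:param name="param" as="xs:string"
             select="'alpha,&#xA0;beta, gamma'"/>

  <xsl:template name="main">
    <!-- ,\s* : the NO-BREAK SPACE is not matched by \s here,
         so the second token still begins with U+00A0. -->
    <xsl:value-of separator=""
        select="for $t in tokenize($param, ',\s*')
                return concat('[', $t, ']')"/>
    <xsl:text>&#xA;</xsl:text>

    <!-- ,\p{Zs}* : any run of "Separator, space" characters after the
         comma, U+00A0 included, is treated as part of the separator. -->
    <xsl:value-of separator=""
        select="for $t in tokenize($param, ',\p{Zs}*')
                return concat('[', $t, ']')"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Note that Zs does not include tab or the line-ending characters, so for document content where both kinds of white space can turn up, a combined class such as ',[\s\p{Zs}]*' may be the better fit.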
--
Graydon Saunders | graydonish@xxxxxxxxx
Þæs oferéode, ðisses swá mæg.
-- Deor ("That passed, so may this.")