Re: Filtering, xslt 2.0
[a brief somewhat pedantic side-track]
On Wed, 2022-11-02 at 14:19 +0000, Graydon graydon@xxxxxxxxx wrote:
> On Wed, Nov 02, 2022 at 02:10:09PM -0000, Eliot Kimber
> eliot.kimber@xxxxxxxxxxxxxxx scripsit:
> > The second argument to tokenize() is a regular expression, so ", *"
> > means "comma followed by zero or more spaces".
> >
> > I would write it as ",\s*", which is clearer and handles all white
> > space (space, tab, etc.).
>
> This is true, though I would note that in general, the Unicode
> character category,
>
> tokenize($param,',\p{Zs}*')
>
> can be safer. \s usually matches a space, a tab, a carriage return, a
> line feed, or a form feed, but what the exact match is depends on the
> regular expression implementation.
For XSLT 2 and later it's defined as equivalent to the character class
[ \t\n\r] by XML Schema so there should not be any variation.
Unicode properties, however, are defined by the Unicode Consortium and
can vary over time - usually by additions.
(actually XSD omits the "&" but i think we can safely say that's a typo
and i seem to remember there may be an erratum about it.)
> Whereas you know what Zs,
> "Separator, spaces", is and unlike \s it includes U+00A0 NO-BREAK
> SPACE.
That's true. Well, the second part is true; the first part strictly
speaking requires checking the Unicode version in use by the
implementation and then looking up the corresponding information.
But i doubt many people find \p{Zs} clearer than \s, and \s is likely
fine for this usage :)
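
To make the difference concrete, here is a minimal XSLT 2.0 sketch. It is an illustration only: the parameter value, the template name "main", and the choice to print token lengths are all assumptions made for the example, not anything from the original question. The test string has a NO-BREAK SPACE after the first comma and an ordinary space after the second.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Hypothetical input: "a", comma, U+00A0, "b", comma, space, "c". -->
  <xsl:param name="param" as="xs:string" select="'a,&#xA0;b, c'"/>

  <!-- Hypothetical entry point; run it as the initial template. -->
  <xsl:template name="main">
    <!-- \s is [ \t\n\r], so U+00A0 is not consumed and stays on "b":
         the token lengths should come out as 1 2 1. -->
    <xsl:value-of
        select="for $t in tokenize($param, ',\s*') return string-length($t)"
        separator=" "/>
    <xsl:text>&#10;</xsl:text>

    <!-- \p{Zs} includes U+00A0 (but not tab or newline):
         the token lengths should come out as 1 1 1. -->
    <xsl:value-of
        select="for $t in tokenize($param, ',\p{Zs}*') return string-length($t)"
        separator=" "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

If the input can contain both no-break spaces and tabs or line breaks after the commas, a combined class such as ',[\s\p{Zs}]*' would cover both cases; whether that is worth it depends on where the parameter value comes from.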
Perl allows \p{Space} - see perldoc perluniprops - and for Zs,
\p{Space_Separator}, as well as \p{Zs}. I wish XML Schema had included
the longer names, although hard-wiring that much English makes
internationalization people nervous.
liam
--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org