
Re: tokenize a string with escaped spaces

Subject: Re: tokenize a string with escaped spaces
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 4 Apr 2020 08:24:02 -0000
Double quotes inside a double-quoted XML attribute value should be written
as `&quot;`. Also remember that the regex attribute is an attribute value
template, so literal curly braces in the pattern must be doubled ({{ and }}).

I often write such things as

<xsl:variable name="regex">\S*('[^']*')?("[^"]*")?</xsl:variable>
<xsl:analyze-string regex="{$regex}">....

which reduces these problems (fortunately & and < aren't metacharacters in
regular expressions).

Michael Kay
Saxonica
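
A minimal sketch of the variable approach as a complete stylesheet, assuming
XSLT 2.0 and a hypothetical input element such as
<item myattr="ng-model=mymodel ng-show='Radio button 1'"/>. It swaps in a
variant pattern rather than the regex quoted in this thread, for two reasons:
a leading greedy \S* also consumes an opening quote, so a quoted value that
contains spaces would still be split at the first space, and XSLT 2.0 reports
an error when the regex of xsl:analyze-string can match a zero-length string.
The element names tokens and token are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- The pattern sits in element content, so ' and " need no escaping.
       Variant pattern (an assumption, not the one from the thread):
       one or more of: a non-space, non-quote character,
       a '...' string, or a "..." string. It never matches "". -->
  <xsl:variable name="regex">([^\s'"]|'[^']*'|"[^"]*")+</xsl:variable>

  <!-- Hypothetical input: any element carrying a myattr attribute. -->
  <xsl:template match="*[@myattr]">
    <tokens>
      <!-- regex is an attribute value template, so {$regex} inserts
           the variable's string value as the pattern. -->
      <xsl:analyze-string select="@myattr" regex="{$regex}">
        <xsl:matching-substring>
          <token><xsl:value-of select="."/></token>
        </xsl:matching-substring>
        <!-- No xsl:non-matching-substring: the separating
             white space is simply dropped. -->
      </xsl:analyze-string>
    </tokens>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input this should produce two token elements, one holding
ng-model=mymodel and one holding ng-show='Radio button 1'. Written inline in
a double-quoted regex attribute instead, each double quote in the pattern
would have to become `&quot;`.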

> On 4 Apr 2020, at 02:17, Mark Giffin m1879@xxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Thanks Michael. The double quotes " in the regex give errors in this
> context:
>
> <xsl:analyze-string select="$attr" regex="\S*('[^']*')?("[^"]*")?">
>
> Should those be single quotes instead? Or should I put the regex in a
> variable?
>
> On 4/3/2020 4:38 PM, Michael Kay mike@xxxxxxxxxxxx wrote:
>> Try using xsl:analyze-string with a regex of
>>
>> \S*('[^']*')?("[^"]*")?
>>
>> I've had to guess at your specification from your single example, but you
>> should be able to adapt it if the spec is different.
>>
>> You could also extend the regex to pick up the keyword (before '=') and
>> value (after '=') as captured substrings:
>>
>> (\S+)=(\S+|('[^']*')|("[^"]*"))
>>
>> and then regex-group(1) gives you the keyword, and regex-group(2) the
>> value.
>>
>> Michael Kay
>> Saxonica
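
The keyword/value variant above can be sketched as a complete XSLT 2.0
stylesheet. This is an adaptation, not the regex as posted: the quoted
alternatives are tried before \S+ and the keyword excludes '=', because when
two alternatives match at the same position the first one is chosen, so a
leading \S+ would stop at the first space inside a quoted value. The input
shape (an element with a myattr attribute) and the output names pairs, pair,
key and value are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Group 1: keyword (no white space, no '=').
       Group 2: value, either '...', "..." or a run of non-space
       characters, tried in that order. -->
  <xsl:variable name="pair-regex">([^\s=]+)=('[^']*'|"[^"]*"|\S+)</xsl:variable>

  <!-- Hypothetical input, e.g.
       <item myattr="ng-model=mymodel ng-show='Radio button 1'"/> -->
  <xsl:template match="*[@myattr]">
    <pairs>
      <xsl:analyze-string select="@myattr" regex="{$pair-regex}">
        <xsl:matching-substring>
          <!-- regex-group() is only meaningful inside
               xsl:matching-substring. -->
          <pair key="{regex-group(1)}" value="{regex-group(2)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </pairs>
  </xsl:template>

</xsl:stylesheet>
```

The captured value keeps its surrounding quotes; stripping them would be a
separate step on regex-group(2).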
>>
>>> On 4 Apr 2020, at 00:17, Mark Giffin m1879@xxxxxxxxxxxxx
>>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> I am tokenizing an XML attribute that has info I need in it. Example:
>>>
>>> myattr="ng-model=mymodel ng-show-mymodel=='Radio button 1'"
>>>
>>> So I want to tokenize into these two values:
>>>
>>> ng-model=mymodel
>>> ng-show='Radio button 1'
>>>
>>> Using white space like tokenize($attr, '\s') gives me this, not what I
>>> want:
>>>
>>> ng-model=mymodel
>>> ng-show='Radio
>>> button
>>> 1'
>>>
>>> Do you have a suggestion on how to do this? Doesn't have to use
>>> tokenize().
>>>
>>> Thanks,
>>> Mark
>>>
>>
>> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
>> EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/805141>
