Re: Regular expression functions (Was: Re: comments on
Hi Steven,

Very interesting :) Could you explain a little more about how the matchers work? You call them by name - does each of them search over the entire string, or do later matchers only match on what's left after matching the earlier ones? Did you try any other designs? What made you choose this one?

> One of the things which doesn't work well currently is the
> specification of the regex as an attribute to the <matcher> element.
> We will avoid this by putting the regex inside a CDATA section of a
> <regex> subelement (will be optional, we are testing this right
> now). Not sure whether this is good practice, advice welcome. It is
> only partially related to this discussion of course.

I can see why you'd want to do that, given that you're matching HTML tags. Note that you're doing more escaping than you have to in the attribute value, though. Consider:

  <matcher regex="CLASS=&quot;story3&quot;&gt;([^&lt;]+)&lt;BR&gt;
  &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT\sCOLOR=&quot;#333333&quot;
  \sFACE=&quot;sans-serif,\sarial&quot;&gt;&lt;SPAN\sCLASS=&quot;story&quot;&gt;
  ([^&lt;]+)&amp;nbsp;(.+)&lt;A\sHREF=&quot;([^&quot;]+)&quot;&gt;More"
           name="items">

The greater-than signs don't have to be escaped in attribute values (they only have to be escaped if they occur in the sequence ]]> in element content). And you could avoid escaping double-quotes if you delimited the attribute with single-quotes. Less-than signs and ampersands still have to be escaped in attribute values, so you could have:

  <matcher regex='CLASS="story3">([^&lt;]+)&lt;BR>&lt;/SPAN>&lt;/FONT>
  &lt;/STRONG>&lt;FONT\sCOLOR="#333333"\sFACE="sans-serif,\sarial">
  &lt;SPAN\sCLASS="story">([^&lt;]+)&amp;nbsp;(.+)&lt;A\sHREF="([^"]+)">More'
           name="items">

But I agree - if you've got regular expressions like this, it's best to put them in an element where you can use CDATA sections to at least make it look like the stuff you're matching.

For XSLT, I think that attributes are more natural because attributes are used for this kind of thing elsewhere (matching nodes, for instance).
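A minimal sketch of the escaping point above, assuming Python's standard xml.etree.ElementTree parser and a shortened, hypothetical pattern in place of the full one quoted. It is only meant to show that the over-escaped attribute, the single-quoted attribute and a CDATA section inside a <regex> subelement all hand the same regular expression to the application; it says nothing about how Steven's tool actually reads its configuration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the long pattern quoted above.
# 1. Double-quoted attribute with every special character escaped.
over_escaped = (
    '<matcher regex="CLASS=&quot;story&quot;&gt;([^&lt;]+)&lt;BR&gt;" '
    'name="items"/>'
)

# 2. Single-quoted attribute: only the less-than signs stay escaped.
minimal = """<matcher regex='CLASS="story">([^&lt;]+)&lt;BR>' name="items"/>"""

# 3. Pattern in element content, wrapped in a CDATA section: no escaping.
in_cdata = (
    '<matcher name="items">'
    '<regex><![CDATA[CLASS="story">([^<]+)<BR>]]></regex>'
    '</matcher>'
)

a = ET.fromstring(over_escaped).get("regex")
b = ET.fromstring(minimal).get("regex")
c = ET.fromstring(in_cdata).findtext("regex")

print(a)            # CLASS="story">([^<]+)<BR>
print(a == b == c)  # True

# The parsed value is an ordinary regex that can be applied to HTML source.
m = re.search(a, '<SPAN CLASS="story">Headline<BR></SPAN>')
print(m.group(1))   # Headline
```

The three spellings differ only in how the XML parser sees them; by the time the regex engine gets the string, the entity references have already been resolved.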
It would be handy if the regular expressions could be held in (global) variables because then they could be defined in content (with CDATA sections) rather than in an attribute. However, that would run up against the dynamic regular expression problem that David and I talked about yesterday. I don't think it'll be too big a problem, though - the regular expressions in XSLT are likely to be a lot smaller than these, and not include tags (hopefully!).

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list