RE: Regular expression functions (Was: Re: comments on
> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> Jeni Tennison
> Sent: Wednesday, 9 January 2002 23:32
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Regular expression functions (Was: Re: comments on
> December F&O draft)
>
>
> Hi Steven,
>
> Very interesting :)

Thanks Jeni, we hope so :-)

> Could you explain a little more about how the matchers work? You call
> them by name - does each of them search over the entire string, or do
> later matchers only match on what's left after matching the earlier
> ones? Did you try any other designs? What made you choose this one?

The principal matcher, included with the root <element>, is matched
against the entire input document. Depending on its outcome, nodes
(attributes and elements) are generated, or additional matchers are
called: implicitly on the entire matched region (for-each like), or
explicitly using regex groups (comparable to the tokenization of
requests in Cocoon): each "parenthesized" pattern region can be
addressed individually using an integer. This way, you can define which
matcher has to be applied to which region.

> > One of the things which doesn't work well currently is the
> > specification of the regex as an attribute to the <matcher> element.
> > We will avoid this by putting the regex inside a CDATA section of a
> > <regex> subelement (will be optional, we are testing this right
> > now). Not sure whether this is good practice, advice welcome. It is
> > only partially related to this discussion of course.
>
> I can see why you'd want to do that, given that you're matching HTML
> tags. Note that you're doing more escaping than you have to in the
> attribute value, though. Consider:

Yes, I got lazy after a while and started to escape everything ;-)

> delimited the attribute with single-quotes. So you could have:
>
> <matcher
> regex='CLASS="story3">([^<]+)<BR></SPAN></FONT></STRONG>
> <FONT\sCOLOR="#333333"\sFACE="sans-serif,\sarial"><SPAN\sCLASS="story">([^<]+)&nbsp;(.+)<A\sHREF="([^"]+)">More'
> name="items">

I find this mixture even less readable somehow :-) But on the ' and ",
you are absolutely correct - it was just my XML IDE that uses double
quotes by default.

> But I agree - if you've got regular expressions like this, it's best
> to put them in an element where you can use CDATA sections to at least
> make it look like the stuff you're matching.

And that is what we will do - a pity one cannot declare an attribute as
being of CDATA type, in the sense of CDATA sections at the document
content level.

> For XSLT, I think that attributes are more natural because attributes
> are used for this kind of thing elsewhere (matching nodes, for
> instance).

Indeed, and that is exactly the reason why we started off with
attributes for our regexes.

> It would be handy if the regular expressions could be held
> in (global) variables because then they could be defined in content
> (with CDATA sections) rather than in an attribute. However, that would
> run up against the dynamic regular expression problem that David and I
> talked about yesterday. I don't think it'll be too big a problem,
> though - the regular expressions in XSLT are likely to be a lot
> smaller than these, and not include tags (hopefully!).

I will try to read and understand your discussion, because we had
already thought of storing the regexes in such a way but threw that idea
away: it was affecting the readability of the regexslt transformation
sheet. I like all parameters to a certain action to be contained in the
same area, and storing the regexes inside 'global variables' would
conflict with that.

Thanks for your reaction,

Steven Noels
http://outerthought.org/
(+32)478 292900

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
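
As an illustration of two points from the message above, here is a minimal Python sketch. It is not regexslt itself. The `<matcher>` and `<regex>` element names and the `regex`/`name` attributes come from the thread, but the exact markup, the shortened pattern, the sample input, and the helpers `regex_of`, `host_of`, `GROUP_MATCHERS`, and `run` are all assumptions made for the example. It first checks that an attribute-held regex (single-quoted, with only `<` escaped) and a CDATA-held regex yield the same pattern, then applies that pattern to a whole input and hands each numbered group to its own sub-matcher.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names follow the thread,
# but the real regexslt syntax may differ.
ATTR_FORM = r"""<matcher name="items" regex='CLASS="story">([^&lt;]+)&lt;A\sHREF="([^"]+)">More'/>"""
CDATA_FORM = r"""<matcher name="items"><regex><![CDATA[CLASS="story">([^<]+)<A\sHREF="([^"]+)">More]]></regex></matcher>"""


def regex_of(matcher_xml):
    """Return the pattern from the regex attribute, else from a <regex> child."""
    elem = ET.fromstring(matcher_xml)
    attr = elem.get("regex")
    return attr if attr is not None else elem.findtext("regex")


def host_of(url):
    """A second, smaller matcher applied only to the region of one group."""
    m = re.match(r"https?://([^/]+)", url)
    return m.group(1) if m else None


# Group number -> (output name, sub-matcher); groups are addressed by integer.
GROUP_MATCHERS = {1: ("title", str.strip), 2: ("host", host_of)}


def run(pattern, document):
    """Match the principal pattern over the whole input, then delegate groups."""
    rows = []
    for match in re.finditer(pattern, document):
        rows.append({name: sub(match.group(i))
                     for i, (name, sub) in GROUP_MATCHERS.items()})
    return rows


SAMPLE = '<SPAN CLASS="story">Regex in XSLT <A HREF="http://example.org/1">More</A></SPAN>'

if __name__ == "__main__":
    print(regex_of(ATTR_FORM) == regex_of(CDATA_FORM))
    print(run(regex_of(CDATA_FORM), SAMPLE))
```

In this sketch the CDATA form reads like the HTML being matched, while the attribute form needs `&lt;` for every `<`. That is the readability trade-off the thread is weighing.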