
RE: Another tokenize() question

Subject: RE: Another tokenize() question
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Wed, 11 Aug 2004 10:21:38 +0100
Interesting. My solution put everything into element structure and then used
for-each-group to add a level of hierarchy; David's solution put everything
into textual form and then used analyze-string to create the tree. Both seem
to work. What you can't do, it seems, is to handle the mix of textual
structure and element structure in one go.

Michael Kay 

> -----Original Message-----
> From: David Carlisle [mailto:davidc@xxxxxxxxx] 
> Sent: 11 August 2004 10:01
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re:  Another tokenize() question
> 
> 
> > analyze and regex
> apart from my usual grotty level of typing, I have to cope with
> American spelling as well...
> 
> > When really:
> > 
> > <l><w>Why</w> <w>ha<supplied>l</supplied>dest</w> <w>&thorn;u</w>
> > <w>were</w> <w>agaynes</w> <w>me</w></l>
> 
> > is what is wanted. 
> 
> 
> some people want everything:-)
> 
> handling mixed content _across_ element boundaries is a bit more
> complicated. Probably I'd do something like this.
> First have a mode "a" that does
> 
> <xsl:template mode="a" match="*">
> <xsl:text>{</xsl:text>
> <xsl:value-of select="name()"/><!-- perhaps attributes too if you need them -->
> <xsl:text> </xsl:text>
> <xsl:value-of select="."/>
> <xsl:text>}</xsl:text>
> </xsl:template>
>  
> Then run that inside a variable as a first pass so you get
> 
>  <l>Why ha{supplied l}dest
>  &thorn;u were agaynes me</l>
> 
> Then do your main ana-whatist-thingy, making sure anything inside
> braces, i.e. "\{[^{}]*\}", is part of the "word" regexp
> 
> 
> Then you would have
> 
> <l><w>Why</w> <w>ha{supplied l}dest</w>
> <w>&thorn;u</w> <w>were</w> <w>agaynes</w> <w>me</w></l>
> 
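> except that instead of using value-of in the matching-substring part
> where you are adding <w>, do another nested analyze-string
> and match on
> "\{([a-z]*) ([^{}]*)\}"
> (with the braces doubled if you write it directly in the regex
> attribute, since that is an attribute value template)
> and put the <supplied> element back by using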
> <xsl:element name="{regex-group(1)}">
>   <xsl:value-of select="regex-group(2)"/>
> </xsl:element>
> 
> 
> David
> 
> 
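
For reference, the two-pass approach David describes could be put together roughly as follows. This is only a sketch, assuming an XSLT 2.0 processor, lines held in <l> elements, inline elements with lowercase names and no attributes, and source text that never contains a literal "{" or "}". The "word" regex used here (runs of non-space characters and brace groups) is an illustrative choice, not something taken from the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pass 1 (mode "a"): flatten each inline element to {name text}.
       Text nodes are copied by the built-in template rule. -->
  <xsl:template mode="a" match="*">
    <xsl:text>{</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>}</xsl:text>
  </xsl:template>

  <xsl:template match="l">
    <!-- Hold the flattened line in a variable, e.g.
         "Why ha{supplied l}dest ..." -->
    <xsl:variable name="flat">
      <xsl:apply-templates mode="a"/>
    </xsl:variable>
    <l>
      <!-- Pass 2: a "word" is a run of non-space characters and {...}
           groups. The regex attribute is an attribute value template,
           so every literal brace is doubled. -->
      <xsl:analyze-string select="string($flat)"
          regex="([^\s{{}}]|\{{[^{{}}]*\}})+">
        <xsl:matching-substring>
          <w>
            <!-- Nested pass: turn {name text} back into an element -->
            <xsl:analyze-string select="."
                regex="\{{([a-z]*) ([^{{}}]*)\}}">
              <xsl:matching-substring>
                <xsl:element name="{regex-group(1)}">
                  <xsl:value-of select="regex-group(2)"/>
                </xsl:element>
              </xsl:matching-substring>
              <xsl:non-matching-substring>
                <xsl:value-of select="."/>
              </xsl:non-matching-substring>
            </xsl:analyze-string>
          </w>
        </xsl:matching-substring>
        <!-- Whitespace between words is passed through unchanged -->
        <xsl:non-matching-substring>
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </l>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a line such as <l>Why ha<supplied>l</supplied>dest ...</l>, this is meant to give the <w>ha<supplied>l</supplied>dest</w> form asked for earlier in the thread. Nested inline elements, attributes, and punctuation handling are all left out and would need more work.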
