
Re: stylesheet vs egrep

Subject: Re: stylesheet vs egrep
From: Ahmad J Reeves <ahmad@xxxxxxxxxxxxxx>
Date: Fri, 25 Jan 2002 14:07:05 +0000
Hi Trevor,

First, many thanks for your reply. The files I am processing
are 20 MB each, by the way.

I tried the stylesheet and it gave me 28,792 IDs unsorted and
163 once sorted with duplicates removed, which is the same as my
last stylesheet and still not the 254 given to me by egrep. My
egrep command

egrep "<CHARACTER_ID> [0-9]{3,6} </CHARACTER_ID>" 1.xml |sort -u | wc -l

is maybe doing something strange? Here are its first 20 lines:

          <CHARACTER_ID> 10946 </CHARACTER_ID>
          <CHARACTER_ID> 11084 </CHARACTER_ID>
          <CHARACTER_ID> 11116 </CHARACTER_ID>
          <CHARACTER_ID> 11311 </CHARACTER_ID>
          <CHARACTER_ID> 11457 </CHARACTER_ID>
          <CHARACTER_ID> 12284 </CHARACTER_ID>
          <CHARACTER_ID> 12426 </CHARACTER_ID>
          <CHARACTER_ID> 12597 </CHARACTER_ID>
          <CHARACTER_ID> 12969 </CHARACTER_ID>
          <CHARACTER_ID> 13172 </CHARACTER_ID>
          <CHARACTER_ID> 13680 </CHARACTER_ID>
          <CHARACTER_ID> 13685 </CHARACTER_ID>
          <CHARACTER_ID> 14371 </CHARACTER_ID>
          <CHARACTER_ID> 16142 </CHARACTER_ID>
          <CHARACTER_ID> 16783 </CHARACTER_ID>
          <CHARACTER_ID> 16851 </CHARACTER_ID>
          <CHARACTER_ID> 17443 </CHARACTER_ID>
          <CHARACTER_ID> 17583 </CHARACTER_ID>
          <CHARACTER_ID> 17933 </CHARACTER_ID>
          <CHARACTER_ID> 17958 </CHARACTER_ID>
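One thing that may be worth checking here: sort -u compares whole
lines, so the same ID at two different indentation depths would be
counted twice. A minimal sketch of that effect, assuming a grep that
supports -o (GNU and BSD grep do) and using a made-up three-line
sample in place of 1.xml:

```shell
# Hypothetical sample standing in for 1.xml: the ID 10946 appears
# twice, at two different indentation depths.
sample=$(mktemp)
printf '%s\n' \
  '  <CHARACTER_ID> 10946 </CHARACTER_ID>' \
  '      <CHARACTER_ID> 10946 </CHARACTER_ID>' \
  '  <CHARACTER_ID> 11084 </CHARACTER_ID>' > "$sample"

# Whole-line de-duplication, as in the original pipeline: the two
# 10946 lines differ in leading whitespace, so both survive sort -u.
whole_lines=$(grep -E '<CHARACTER_ID> [0-9]{3,6} </CHARACTER_ID>' "$sample" \
  | sort -u | wc -l | tr -d ' ')

# De-duplication on the number only: -o keeps just the matched element,
# and the second grep keeps just its digits.
ids_only=$(grep -Eo '<CHARACTER_ID> [0-9]{3,6} </CHARACTER_ID>' "$sample" \
  | grep -Eo '[0-9]+' | sort -u | wc -l | tr -d ' ')

echo "whole lines: $whole_lines, ids only: $ids_only"
rm -f "$sample"
```

On the sample this reports 3 distinct lines but only 2 distinct IDs.
Whether indentation really varies in the 20 MB files is an assumption
that would need checking against the real data.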

And the first 20 lines from your stylesheet:

 10010 
 10347 
 10904 
 10946 
 11084 
 11116 
 11237 
 11311 
 11457 
 12284 
 12426 
 12597 
 12599 
 12969 
 13172 
 13680 
 13685 
 14211 
 14371 
 14791 

So the stylesheet output contains numbers that egrep is missing,
e.g. the top three, and yet the stylesheet still produces fewer
unique IDs overall...!?
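To see exactly which IDs each method finds that the other does not,
both outputs can be reduced to bare numbers and compared with comm.
A sketch only: the file names below are hypothetical, and the sample
data is just a few values copied from the two listings above.

```shell
tmp=$(mktemp -d)

# Hypothetical captures of the two outputs (a few lines from each listing).
printf '%s\n' \
  '          <CHARACTER_ID> 10946 </CHARACTER_ID>' \
  '          <CHARACTER_ID> 11084 </CHARACTER_ID>' > "$tmp/egrep_out.txt"
printf '%s\n' ' 10010 ' ' 10347 ' ' 10946 ' ' 11084 ' > "$tmp/xslt_out.txt"

# Reduce both to bare numbers, sorted and de-duplicated (comm needs sorted input).
grep -Eo '[0-9]+' "$tmp/egrep_out.txt" | sort -u > "$tmp/egrep_ids.txt"
grep -Eo '[0-9]+' "$tmp/xslt_out.txt"  | sort -u > "$tmp/xslt_ids.txt"

# comm -13 keeps lines unique to the second file, comm -23 those unique to the first.
only_xslt=$(comm -13 "$tmp/egrep_ids.txt" "$tmp/xslt_ids.txt")
only_egrep=$(comm -23 "$tmp/egrep_ids.txt" "$tmp/xslt_ids.txt")

echo "only in stylesheet output:"; echo "$only_xslt"
echo "only in egrep output:";      echo "$only_egrep"
rm -rf "$tmp"
```

Looking up one of the "only in stylesheet output" IDs in the source
file should then show how that element is laid out, and so why the
egrep pattern (exactly one space either side, 3 to 6 digits, all on
one line) does not match it.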

Mystery..

Anyone?

Ahmad





Trevor Nash wrote:
> 
> On Fri, 25 Jan 2002 11:35:49 +0000, Ahmad J Reeves wrote:
> 
> >Hi there,
> >
> >I have XML files that contain 4 types of tags,
> >direct, local, global and admin, in varying numbers.
> 
> >I need to get a list of all the CHARACTER_IDs, and then
> >remove the duplicates and count them. With the following
> >stylesheet,
> >[snip]
> >Is it my stylesheet that's lying, or my egrep?
> >
> The stylesheet, because you are forgetting the built-in templates.
> This means two things:
> 1. the default is to copy text nodes to the output: some of these are
> numbers, hence the strange results.
> 2. you are doing much more work than is necessary, since most of your
> templates are just visiting children, which is what the default does
> anyway.
> 
> Try this:
> <xsl:stylesheet
>         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>         version="1.0">
>   <xsl:output method="text"/>
> 
>   <xsl:variable name="NL" select="'&#xA;'"/>
> 
>   <xsl:template match="CHARACTER_ID">
>     <xsl:value-of select="."/>
>     <xsl:value-of select="$NL"/>
>   </xsl:template>
> 
>   <!-- throw away all text nodes -->
>   <xsl:template match="text()"/>
> 
> </xsl:stylesheet>
> 
> The only reason for putting other templates in would be to avoid
> traversing bits of the document where you know there are no
> CHARACTER_ID nodes, which might make the transform a bit faster.
> Unless the input document is huge this isn't likely to make much
> difference, and of course it makes it more prone to bugs.
> 
> Regards
> Trevor Nash
> --
> Traditional training & distance learning,
> Consultancy by email
> 
> Melvaig Software Engineering Limited
> voice:     +44 (0) 1445 771 271
> email:     tcn@xxxxxxxxxxxxx
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


