
Find count of a string in an xml file

Michael Kay mike at saxonica.com
Fri Jun 6 09:57:57 PDT 2008


Your XML isn't well-formed - displaySubject uses "Day" as an attribute with
no attribute name. Assuming you meant <display subject="Day">, the answer
would be
 
count((//display/@subject | //alt)[. = $word])
 
where the variable $word is initialized to the word you are looking for.
 
That looks for $word as the whole of the element or attribute value. If
you're interested in matching substrings of the value, it would be
 
count((//display/@subject | //alt)[contains(., $word)])
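
To make the two expressions above concrete, here is a minimal self-contained sketch. The sample data is hypothetical (it assumes the corrected <display subject="..."> form), and the $doc variable and its contents are illustrative only, not taken from the original document.

```xquery
(: Hypothetical sample data; element and attribute names follow the
   corrected <display subject="..."> form assumed in this reply. :)
declare variable $word as xs:string := "Day";

let $doc :=
  <subjects>
    <subject>
      <display subject="Day">
        <alt>Day</alt>
        <alt>Daytime</alt>
        <alt>Night</alt>
      </display>
    </subject>
    <subject>
      <display subject="Night">
        <alt>Day</alt>
      </display>
    </subject>
  </subjects>
return (
  (: whole-value match: the value must equal $word exactly :)
  count(($doc//display/@subject | $doc//alt)[. = $word]),
  (: substring match: the value only needs to contain $word :)
  count(($doc//display/@subject | $doc//alt)[contains(., $word)])
)
```

On this sample the query should return 3 followed by 4: "Daytime" is counted only by the substring version.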
 
> I want to find all the strings that occur more than 50 times in the
> document.

Again, this depends on whether you are looking for strings that make up
the whole of an element or attribute value, or for substrings. If the
latter, you need to define how they are delimited (e.g. on word boundaries).
The naive solution is something like this:
 
(: //text() rather than //*, so nested text is not counted once per ancestor :)
let $allWords := for $i in (//text(), //@*) return tokenize($i, '\W+')[. ne '']
for $word in distinct-values($allWords)
where count($allWords[. = $word]) gt 50
return $word
 
But this could be horrendously inefficient unless your XQuery engine has a
rather clever optimizer, because it rescans the full word list once for
every distinct word. There's an XSLT 2.0 solution on page 19 of my XSLT
2.0 Programmer's Reference (4th edition) that makes use of the built-in
grouping facilities in XSLT, and is likely to run much faster.
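
For engines that support XQuery 3.0 or later, a "group by" clause offers a comparable grouping facility within XQuery itself. The following is only a sketch: it assumes XQuery 3.0 support (which SQL Server's XQuery subset may well not provide), and the sample data, the $threshold variable and its low value of 1 are hypothetical, chosen so that the tiny sample produces a result.

```xquery
xquery version "3.0";

(: Hypothetical threshold; the question in this thread uses 50. :)
declare variable $threshold as xs:integer := 1;

(: Hypothetical sample data standing in for the real document. :)
let $doc :=
  <subjects>
    <subject>
      <display subject="Day">
        <alt>Day</alt>
        <alt>Daytime</alt>
        <alt>Night</alt>
      </display>
    </subject>
    <subject>
      <display subject="Night">
        <alt>Day</alt>
      </display>
    </subject>
  </subjects>
(: Split every text node and attribute value into words, dropping empty tokens. :)
for $w in ($doc//text(), $doc//@*) ! tokenize(., '\W+')[. ne '']
(: After grouping, $w holds all occurrences of the word bound to $key. :)
group by $key := $w
where count($w) gt $threshold
return $key
```

On this sample, "Day" occurs three times and "Night" twice, so both should be returned, while "Daytime" (one occurrence) is dropped. The order in which the groups come back is implementation-dependent unless an "order by" clause is added.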
 
Michael Kay
http://www.saxonica.com/


  _____  

From: http://x-query.com/mailman/listinfo/talk [mailto:http://x-query.com/mailman/listinfo/talk] On Behalf
Of Mudita Nain
Sent: 04 June 2008 17:21
To: http://x-query.com/mailman/listinfo/talk
Subject:  Find count of a string in an xml file


Hi all,
 
I am using SQL Server 2005. I have a table with an xml column into which I
have loaded an XML file. I want to write an XQuery that finds the number of
occurrences of a string in the document within certain tags.
 
The structure of the document is as follows:
 
<subject>
<displaySubject = "Day">
<alt> </alt>
<alt> </alt>
<alt> </alt>
<alt> </alt>
</displaySubject>
</subject>
<subject>
<displaySubject>
<alt> </alt>
<alt> </alt>
<alt> </alt>
<alt> </alt>
</displaySubject>
</subject>
 
 
So, I want to find how many times the string "Day" occurs, whether in
displaySubject or alt, anywhere in the document.
Also, the "Day" string is not known in advance. I want to find all the
strings that occur more than 50 times in the document.
 
I hope I am clear.
I would appreciate any help from you.
 
Thanks
Mudita

 
 

