Processing elements with a default or fixed value can be tricky ...here'

  • From: Roger L Costello <costello@mitre.org>
  • To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
  • Date: Fri, 27 May 2022 11:40:08 +0000

Hi Folks,

Scenario: There is an XML document containing a list of book data. Each book element has, among other things, the name of the author of the book. For some books there is no information about the name of the author. In those cases the author element is omitted and its value defaults to "UNKNOWN". Here is a sample XML document:

<Books>
    <Book>
        <Title>The Science of Programming</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>Compiler Construction for Digital Computers</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>The Emperor's New Mind</Title>
    </Book>
    <Book>
        <Title>Algorithms</Title>
    </Book>
    <Book>
        <Title>The Path to Power</Title>
        <Author>Robert A. Caro</Author>
    </Book>
</Books>

What we would like to know is: 

	How many distinct authors are there in the book list,
	where all the books with an UNKNOWN author are
	together counted as 1?

For the above XML document, the answer is: 3 

Problem: Write an XPath expression that produces the correct answer.

You might be tempted to use this XPath expression (XPath 2.0 or later, since distinct-values() is not available in XPath 1.0):

	count(distinct-values(/Books/Book/Author))

However, that is not correct. For the above XML document it returns: 2
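
To see where the 2 comes from, here is a minimal Python sketch that mimics the expression with the standard library's xml.etree.ElementTree. ElementTree does not implement XPath 2.0, so distinct-values() is emulated with a set; the variable names are assumptions made for illustration, not part of the original post.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<Books>
    <Book>
        <Title>The Science of Programming</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>Compiler Construction for Digital Computers</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>The Emperor's New Mind</Title>
    </Book>
    <Book>
        <Title>Algorithms</Title>
    </Book>
    <Book>
        <Title>The Path to Power</Title>
        <Author>Robert A. Caro</Author>
    </Book>
</Books>
"""

root = ET.fromstring(BOOKS_XML)

# Rough analogue of /Books/Book/Author: only Author elements that
# actually exist are selected, so the two author-less books vanish.
authors = [a.text for a in root.findall("./Book/Author")]

# Rough analogue of count(distinct-values(...)).
naive_count = len(set(authors))
print(naive_count)  # 2
```

The books without an Author element contribute nothing to the selection, so the UNKNOWN author is never counted.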

Position your cursor on this Book element:

    <Book>
        <Title>The Emperor's New Mind</Title>
    </Book>

And ask the question: How many Author elements are in the book?

This XPath expression: 

	count(Author)

returns 0

However, if we apply the string() function to the (empty) result of selecting Author:

	count(string(Author))

we get this result: 1

Michael Kay explains the behavior difference this way:

> string() applied to an empty sequence returns the zero-length string, 
> whereas atomization applied to an empty sequence returns an empty sequence.

In other words, string() applied to an empty sequence returns exactly one value (a string whose length happens to be 0), whereas without the string() function there is no value at all.
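
The same distinction can be sketched in Python with xml.etree.ElementTree. This is only an analogy under stated assumptions: findall() stands in for selecting Author, and findtext() with default="" stands in for string(Author). It is not an XPath 2.0 evaluation.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring("<Book><Title>The Emperor's New Mind</Title></Book>")

# Analogue of count(Author): selecting a missing element gives an
# empty list, so there is nothing to count.
author_elements = book.findall("Author")
print(len(author_elements))  # 0

# Analogue of count(string(Author)): asking for the text with a
# fallback of "" always gives exactly one value, possibly zero-length.
author_strings = [book.findtext("Author", default="")]
print(len(author_strings))  # 1
```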

Returning to our question:

	How many distinct authors are there in the book list,
	where all the books with an UNKNOWN author are
	together counted as 1?

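The appropriate XPath expression is now clear. Each Book contributes exactly one string: the author's name, or the zero-length string when the Author element is missing. distinct-values() then collapses all the zero-length strings into a single value:

	count(distinct-values(/Books/Book/string(Author)))

For the above XML document it returns: 3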
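
As a cross-check, here is a minimal Python sketch of the same idea using xml.etree.ElementTree. It emulates the expression rather than evaluating it (ElementTree has no XPath 2.0 support): findtext() with default="" plays the role of string(Author) and a set plays the role of distinct-values(). The variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<Books>
    <Book>
        <Title>The Science of Programming</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>Compiler Construction for Digital Computers</Title>
        <Author>David Gries</Author>
    </Book>
    <Book>
        <Title>The Emperor's New Mind</Title>
    </Book>
    <Book>
        <Title>Algorithms</Title>
    </Book>
    <Book>
        <Title>The Path to Power</Title>
        <Author>Robert A. Caro</Author>
    </Book>
</Books>
"""

root = ET.fromstring(BOOKS_XML)

# One string per Book: the author's name, or "" when Author is missing.
author_strings = [
    book.findtext("Author", default="") for book in root.findall("Book")
]

# All the "" entries collapse into one distinct value.
distinct_authors = set(author_strings)
print(len(distinct_authors))  # 3
```

Note that the value standing in for the unknown author here is the zero-length string, not the literal text "UNKNOWN"; the count is the same either way.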

/Roger

