# Re: Tricky inclusion match

Subject: Re: Tricky inclusion match
From: Wendell Piez
Date: Wed, 30 Mar 2005 12:27:27 -0500

Hi Karl (and Aron),

At 12:11 PM 3/30/2005, you wrote:
color[not(.=preceding-sibling::color)][.=$colors]

```
Is it possible to explain how using "preceding-sibling" in this
context correctly iterates all color nodes?  Wouldn't you need
following-sibling too?
```

No, you don't....

Expanded into long syntax the expression looks like this:

child::color[not(self::node() = preceding-sibling::color)][self::node() = $colors]

That is, it selects all the child 'color' elements, eliminates those whose values are the same as a preceding sibling's value, and from those, keeps those whose values are equal to the value of some node in $colors.

The first predicate (bracketed expression) is a standard idiom for removing duplicates, and as such is simple enough. For large sets of siblings it's an expensive test (though it's the analogous test on the preceding:: axis that really gets expensive), which is why we often prefer key-retrieval techniques for de-duplication. You've seen those: they're central to Muenchian grouping. (In this case the key-retrieval technique is cumbersome and doesn't gain us much.)
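For comparison, here is a rough sketch of what the key-based alternative could look like when duplicates have to be judged among siblings only. Everything named here is an assumption for illustration and not taken from the thread: the key name `color-in-parent`, the `/data/item` wrapper around the color elements, and the input shown in the comment. The key has to fold the parent's identity into its value, which is the cumbersome part.

```xml
<!-- Assumed input (hypothetical wrapper elements):
     <data>
       <item>
         <color>red</color>
         <color>blue</color>
         <color>red</color>
         <color>green</color>
         <color>red</color>
       </item>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index each color by its parent's id plus its own value, so that
       equal values under different parents do not collide. -->
  <xsl:key name="color-in-parent" match="color"
           use="concat(generate-id(..), '|', .)"/>

  <xsl:template match="/">
    <!-- Keep a color only if it is the first node the key returns
         for its (parent, value) pair. -->
    <xsl:for-each select="/data/item/color[generate-id() =
        generate-id(key('color-in-parent',
                        concat(generate-id(..), '|', .))[1])]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the input in the comment this should print red, blue and green, one per line: the same nodes that `color[not(. = preceding-sibling::color)]` keeps.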

Because of the way the equality operator works with node-sets (it returns true if the value of any node in the first set is equal to the value of any node in the second set), the second predicate has the result of keeping any color that is listed among the $colors.
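To see that "any node" rule in isolation, a small stylesheet along these lines prints the result of the comparison for each color. The `/data/wanted/c` list standing in for $colors and the `/data/item` wrapper are made up for the example; the thread does not say where $colors comes from.

```xml
<!-- Assumed input (hypothetical element names):
     <data>
       <wanted><c>red</c><c>purple</c></wanted>
       <item>
         <color>red</color>
         <color>blue</color>
       </item>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Stand-in for the $colors node-set of the thread (an assumption). -->
  <xsl:variable name="colors" select="/data/wanted/c"/>

  <xsl:template match="/">
    <xsl:for-each select="/data/item/color">
      <xsl:value-of select="."/>
      <xsl:text>: </xsl:text>
      <!-- True if this color's string value equals the string value
           of ANY node in $colors. -->
      <xsl:value-of select=". = $colors"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input this should report `red: true` and `blue: false`. The same rule is why the de-duplicating predicate is written as `not(. = preceding-sibling::color)` rather than with `!=`: on node-sets, `!=` is true as soon as any one preceding sibling differs, which is not the test we want.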

We look only at the preceding-sibling axis in the first predicate, not at all the siblings, because we want to skip only the second and subsequent appearances of a given color. That is, if we have

```
<color>red</color>
<color>blue</color>
<color>red</color>
<color>green</color>
<color>red</color>
```

we want to skip the second and third "red" colors (the ones preceded by a red) -- if we checked against all siblings, all three would be skipped. (We could test against following-sibling and skip the first two, if we liked; but we only want to skip two of them.)
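The three choices can be put side by side with a sketch like the following. The `/data/item` wrapper and the `#n` position suffix in the output are assumptions added for the example, so that you can tell which "red" survives.

```xml
<!-- Assumed input (hypothetical wrapper elements):
     <data>
       <item>
         <color>red</color>
         <color>blue</color>
         <color>red</color>
         <color>green</color>
         <color>red</color>
       </item>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="/data/item">
      <!-- Skips every color that repeats an earlier sibling. -->
      <xsl:text>preceding-sibling: </xsl:text>
      <xsl:apply-templates
          select="color[not(. = preceding-sibling::color)]"/>
      <!-- Skips every color that is repeated by a later sibling. -->
      <xsl:text>&#10;following-sibling: </xsl:text>
      <xsl:apply-templates
          select="color[not(. = following-sibling::color)]"/>
      <!-- Skips every color that has a duplicate anywhere among its siblings. -->
      <xsl:text>&#10;all siblings: </xsl:text>
      <xsl:apply-templates
          select="color[not(. = preceding-sibling::color
                            or . = following-sibling::color)]"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Print the value followed by its position among the color siblings. -->
  <xsl:template match="color">
    <xsl:value-of select="."/>
    <xsl:text>#</xsl:text>
    <xsl:value-of select="count(preceding-sibling::color) + 1"/>
    <xsl:text> </xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the five colors above, the first line should list red#1 blue#2 green#4, the second blue#2 green#4 red#5, and the third only blue#2 green#4, since every "red" has a duplicate somewhere among its siblings.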

This de-duplication is necessary because if all the reds were listed, we'd count three color elements that appear in $colors (all red) instead of one.
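A minimal sketch of that difference, assuming the goal is a count and that $colors holds "red" plus one value that never occurs (the `/data/wanted/c` list and the `/data/item` wrapper are again invented for the example):

```xml
<!-- Assumed input (hypothetical element names):
     <data>
       <wanted><c>red</c><c>purple</c></wanted>
       <item>
         <color>red</color>
         <color>blue</color>
         <color>red</color>
         <color>green</color>
         <color>red</color>
       </item>
     </data>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Stand-in for the $colors node-set of the thread (an assumption). -->
  <xsl:variable name="colors" select="/data/wanted/c"/>

  <xsl:template match="/">
    <xsl:for-each select="/data/item">
      <!-- Every matching color element is counted, repeats included. -->
      <xsl:text>without de-duplication: </xsl:text>
      <xsl:value-of select="count(color[. = $colors])"/>
      <!-- Only the first appearance of each value can be counted. -->
      <xsl:text>&#10;with de-duplication: </xsl:text>
      <xsl:value-of
          select="count(color[not(. = preceding-sibling::color)][. = $colors])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With this input the first count should come out as 3 (the three reds) and the second as 1.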

```
Cheers,
Wendell
```

```
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
```
