Re: mixed content grouping by whitespace
Hi,
On Gerrit's excellent explanation of group-adjacent.... At 06:52 PM 4/12/2010, he wrote:

This groups the nodes in the variable you've created by the boolean (so the truth or falsehood of whether the pattern matches? I didn't know you could do that in a group-* pattern) of the existence of the segs you've created on tei:seg/text() which mark the whitespace.

It's really helpful to keep this distinction in mind. One sort of grouping works with a key; @group-by or @group-adjacent calculates that key. The other sort simply applies a match criterion to each node in the group to determine whether it's the particular sort of node (group-starting or group-ending) of interest for that sort of grouping.

For all but the nodes marked-up as WS in our example, evaluating self::tei:seg[@type='sep'] yields the empty sequence. Since the empty sequence cannot be used as a grouping key for group-adjacent [1], its boolean value is calculated, which is false for empty sequences [2]. I could have used empty() instead of boolean() which would just flip each node's true()/false() key. In this case, I would have to swap the "when current-grouping-key" and the "otherwise" actions accordingly, or test="not(current-grouping-key())".

Indeed; and "not(self::tei:seg[@type='sep'])" would work like empty(). Similarly, "exists(self::tei:seg[@type='sep'])" would work like boolean().

The main thing is that splitting logic is really "group-adjacent" logic in which the key is used to assign nodes to the categories for splitting. Another illustration of this principle would be group-adjacent="ceiling(position() div 5)", which splits into groups of five members (with the last group given the remainder). Here (the most common case for splitting) those categories are two, hence the expressions returning Boolean values. Booleans are nice since we can then examine current-grouping-key() straightforwardly with a test to tell which sort of group (of the two sorts) one is in.
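To make the two keys mentioned above concrete, here is a minimal sketch rather than the stylesheet from the thread. It assumes the tei prefix is bound to the usual TEI namespace, that the separators are tei:seg[@type='sep'] children of a tei:p, and it invents the wrapper tei:w and the list/item/chunk element names purely for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Boolean key: every child node of the paragraph is keyed true() if it is
       a separator seg and false() otherwise, so adjacent nodes with the same
       key fall into one group. (Assumed input shape: tei:p with mixed content
       and tei:seg[@type='sep'] marking whitespace.) -->
  <xsl:template match="tei:p">
    <xsl:copy>
      <xsl:for-each-group select="node()"
          group-adjacent="boolean(self::tei:seg[@type='sep'])">
        <xsl:choose>
          <!-- Key is true(): a run of separators, copied through unchanged -->
          <xsl:when test="current-grouping-key()">
            <xsl:copy-of select="current-group()"/>
          </xsl:when>
          <!-- Key is false(): a run of non-separator nodes, wrapped in a
               hypothetical tei:w element -->
          <xsl:otherwise>
            <tei:w>
              <xsl:copy-of select="current-group()"/>
            </tei:w>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- Numeric key: position() is the item's position in the selected
       population, so items 1-5 get key 1, items 6-10 get key 2, and so on.
       The list, item and chunk names are hypothetical. -->
  <xsl:template match="list">
    <xsl:for-each-group select="item"
        group-adjacent="ceiling(position() div 5)">
      <chunk n="{current-grouping-key()}">
        <xsl:copy-of select="current-group()"/>
      </chunk>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Swapping boolean() for empty() in the first template would flip the keys, so the xsl:when and xsl:otherwise bodies would have to trade places, as described above.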
In the word wrap example, it's a matter of taste whether to use group-starting-with or group-adjacent. But try to tackle the group-adjacent example given in the spec [3] using group-starting-with (or group-ending-with), and you'll find yourself writing all kinds of complicated lookaheads and lookbehinds that for-each-group promised to liberate you from. The same holds for trying to solve group-starting-with problems using group-adjacent. There's a reason THey created all 4 forms of for-each-group. And THey saw it was good.

Sometimes it's a matter of taste, and sometimes it's a tough call; but group-adjacent is frequently more elegant.

Cheers,
Wendell

======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================