
Re: Scatter/gather pattern [was: New XPipe Presentation Availa


auto scatter
[Roger Costello]
>[...]
>On slide 52 you say, "For any given task t to be performed on documents
>conforming to schema s, there is a fragment expression that can be used
>to chop any document into n pieces on which t can be performed
>independently."
>
>1. I am not sure what you mean by "fragment expression"?  I am guessing
>that it refers to "how we slice up the XML document".  Correct?

Yes. In many cases, an XPath.

>For the
>above instance document I would guess that the "fragment expression"
>would correspond to an XPath expression such as: BookCatalogue/Book,
>i.e., break up the document into 3 Book fragments.  Right?

Right.
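
To illustrate the exchange above, here is a minimal Python sketch of scatter/gather driven by a fragment expression. The instance document, the `task` function, and the names `scatter` and `gather` are all hypothetical; the thread does not show the original document or any implementation. Python's ElementTree evaluates paths relative to the root element, so the path "Book" stands in for BookCatalogue/Book.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical instance document with three Book records.
DOC = """<BookCatalogue>
  <Book><Title>A</Title><Price>10</Price></Book>
  <Book><Title>B</Title><Price>20</Price></Book>
  <Book><Title>C</Title><Price>30</Price></Book>
</BookCatalogue>"""


def scatter(xml_text, fragment_path):
    """Chop the document into independent, serialized fragments."""
    root = ET.fromstring(xml_text)
    # The path is evaluated relative to the root element (BookCatalogue).
    return [ET.tostring(el, encoding="unicode")
            for el in root.findall(fragment_path)]


def task(fragment):
    """Hypothetical task t: pull (title, price) out of one Book fragment."""
    book = ET.fromstring(fragment)
    return book.findtext("Title"), float(book.findtext("Price"))


def gather(results):
    """Combine the per-fragment results into one mapping."""
    return dict(results)


fragments = scatter(DOC, "Book")
with ThreadPoolExecutor() as pool:
    # Each fragment is processed independently of the others.
    catalogue = gather(pool.map(task, fragments))
```

Each fragment is serialized to a string before it is handed to the task, so no worker depends on the parsed tree of the whole document. That independence is the property the fragment expression is meant to guarantee.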


>You follow with this statement: "These points are called fulcra and are
>a function of (t,s)."
>
>2. Why is the fulcra a function of the schema, s?  I don't see how the
>"slicing-up strategy" depends on the schema.  In the above XML document
>I don't even have a schema.  Any fragments that I might create aren't
>depending on a schema.

Perhaps more accurate to say that it depends on the vocabulary - in this
case, for this task, BookCatalogue/Book.

Having a formal schema - as opposed to just a vocabulary - is
not necessary to identify fulcra but having a formal schema gives you
a fighting chance of auto-detecting them.  This provides some interesting
scope for auto scatter/gather where the user does not need to even
know it is going on!
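
The message does not say how auto-detection would work. One possible heuristic, offered purely as an assumption, is to treat every element that a W3C XML Schema declares as repeatable (maxOccurs greater than 1) as a candidate fulcrum. The schema below and the function name `candidate_fulcra` are invented for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for the BookCatalogue vocabulary.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="BookCatalogue">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Title" type="xs:string"/>
              <xs:element name="Author" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def candidate_fulcra(xsd_text):
    """Return the names of elements declared as repeatable.

    Assumed heuristic: a repeatable element is a likely record boundary.
    """
    schema = ET.fromstring(xsd_text)
    names = []
    for decl in schema.iter(XS + "element"):
        max_occurs = decl.get("maxOccurs", "1")  # the XSD default is 1
        if max_occurs == "unbounded" or int(max_occurs) > 1:
            names.append(decl.get("name"))
    return names


fulcra = candidate_fulcra(XSD)
```

A repeatable element is only a candidate. Whether the task can really be performed on each occurrence independently still depends on t.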


>On slide 55 you say: "For data-oriented XML, the fulcra ... may be
>independent of t."
>
>3. I read this as saying that "the task to be performed is independent of
>how we slice up an XML document."  I am struggling to see how this could
>be true.  It seems to me that if we want to perform parallel processing
>on an XML document, the task to be performed will heavily influence how
>we slice up an instance document.  No?

The point I am aiming at is that for data-oriented XML the fulcrum coincides
with the concept of "Record". DBMS people are at home with the
notion of record independence and load records into memory as atomic
units, perform seeking based on records and so on.


>4. I am not real clear on the difference between document-oriented
>versus data-oriented (perhaps someone could explain the differences?),
>but I believe that the above XML document would be considered
>data-oriented. Yes?

Yes, this is data-oriented. The difference between data-oriented and
doc-oriented is primarily (a) homogeneous structure, (b) no recursion
and (c) no mixed content.

Data-oriented XML is typically just like your example. Homogeneous - all
<Book> elements have identical structure. No elements nest within
themselves. No mixed content.
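
The three criteria can be checked mechanically. The following is a rough Python sketch, not a definitive classifier. The function names and the two sample documents are hypothetical, and "identical structure" is approximated by comparing only the sequence of child element names of each record.

```python
import xml.etree.ElementTree as ET


def child_shape(el):
    """Approximate an element's structure by its sequence of child names."""
    return tuple(child.tag for child in el)


def has_mixed_content(el):
    """True if an element with children also carries non-whitespace text."""
    if len(el) == 0:
        return False
    if el.text and el.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in el)


def has_recursion(el, ancestors=()):
    """True if any element is nested inside an element of the same name."""
    if el.tag in ancestors:
        return True
    return any(has_recursion(child, ancestors + (el.tag,)) for child in el)


def looks_data_oriented(xml_text, record_tag):
    root = ET.fromstring(xml_text)
    records = root.findall(record_tag)
    homogeneous = len({child_shape(r) for r in records}) <= 1
    mixed = any(has_mixed_content(e) for e in root.iter())
    return homogeneous and not has_recursion(root) and not mixed


# Hypothetical samples: a record-like catalogue and an XHTML-like body.
CATALOGUE = ("<BookCatalogue>"
             "<Book><Title>A</Title><Author>X</Author></Book>"
             "<Book><Title>B</Title><Author>Y</Author></Book>"
             "</BookCatalogue>")
PAGE = ("<body><p>Some <em>mixed</em> text</p>"
        "<table><tr><td><table/></td></tr></table></body>")

catalogue_is_data = looks_data_oriented(CATALOGUE, "Book")
page_is_data = looks_data_oriented(PAGE, "p")
```

The catalogue passes all three checks. The XHTML-like sample fails on both recursion (a table inside a table) and mixed content (text alongside `<em>` inside `<p>`).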

XHTML is an example of a document-oriented XML structure. Structures
are heterogeneous, elements such as tables can nest within each other
and <p> elements can mix tags and PCDATA, creating mixed content.

Fulcra in document-oriented XML are more varied and more likely
to be dependent on t. For example, if I am downtranslating DocBook to
PDF my fulcra might be "chapter" elements. When downtranslating
to HTML my fulcra might be "section" elements and so on.
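
The idea that fulcra are a function of (t, s) can be sketched as a lookup keyed on the task and the vocabulary. The table, the task names, and the simplified DocBook-like document below are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of fragment expressions keyed by (task t, vocabulary s).
FULCRA = {
    ("to-pdf", "docbook"): ".//chapter",
    ("to-html", "docbook"): ".//section",
}

# Simplified DocBook-like document; sections are deliberately not nested.
DOC = """<book>
  <chapter>
    <section><para>a</para></section>
    <section><para>b</para></section>
  </chapter>
  <chapter>
    <section><para>c</para></section>
  </chapter>
</book>"""


def fragments_for(task, vocabulary, xml_text):
    """Select the fragments for a task using the fulcrum for (t, s)."""
    root = ET.fromstring(xml_text)
    return root.findall(FULCRA[(task, vocabulary)])


pdf_pieces = fragments_for("to-pdf", "docbook", DOC)
html_pieces = fragments_for("to-html", "docbook", DOC)
```

The same document yields two chapter fragments for the PDF task and three section fragments for the HTML task. Real DocBook allows sections to nest, so a production fragment expression would need to select only the outermost sections to keep the pieces independent.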

regards,
Sean

