Re: Separate the data model from presentation and make the data

From: Thomas Passin <list1@tompassin.net>
To: "xml-dev@lists.xml.org" <xml-dev@lists.xml.org>
Date: Sun, 22 Nov 2015 13:52:19 -0700


On 11/22/2015 10:47 AM, Costello, Roger L. wrote:

Hi Folks,

Assertion: Accessing/manipulating element nodes is simpler, easier,
and faster than accessing/manipulating text. An element-based data
model is better suited to XML's strengths than is a text-based data
model.

Proof by example:

Case 1: Retrieve "west"

Text-based data model:

<edge>garden west door home</edge>

XPath to retrieve "west":

substring-before(substring-after(., ' '), ' ')

Element-based data model:

<edge> <garden/> <west/> <door/> <home/> </edge>

XPath to retrieve "west":

*[2]/name()

Clearly, retrieving "west" in the element-based data model is
simpler, easier, and faster.

Au contraire: take for example the following

<edge>garden west door home</edge>
<edge>garage west door office</edge>

Now how will you distinguish

<edge> <garden/> <west/> <door/> <home/> </edge>
from
<edge> <garage/> <west/> <door/> <office/> </edge>
?

Clearly you have to have something stored somewhere that will tell you what "garage" and "office" and "garden" and "home" are, and you will have to somehow know that the two "wests" have the same meaning ... if they do, but they might not.
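
A minimal Python sketch of this point, using the standard library's xml.etree.ElementTree. The <edges> wrapper element and the rule that the second item is the one to compare are assumptions made only for illustration; neither model states them. Both models end up relying on the same out-of-band positional convention:

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element <edges>, added only so both edges parse as one document.
TEXT_DOC = """<edges>
  <edge>garden west door home</edge>
  <edge>garage west door office</edge>
</edges>"""

ELEMENT_DOC = """<edges>
  <edge> <garden/> <west/> <door/> <home/> </edge>
  <edge> <garage/> <west/> <door/> <office/> </edge>
</edges>"""


def items_from_text(edge):
    """Text-based model: the data items are whitespace-separated tokens."""
    return (edge.text or "").split()


def items_from_elements(edge):
    """Element-based model: the data items are the child element names."""
    return [child.tag for child in edge]


text_edges = [items_from_text(e) for e in ET.fromstring(TEXT_DOC)]
element_edges = [items_from_elements(e) for e in ET.fromstring(ELEMENT_DOC)]

# Both models carry exactly the same items; neither says what they mean.
print(text_edges == element_edges)

# Assumed convention: the second item is the one being compared.
# That rule lives outside the data in both models.
print([items[1] for items in element_edges])
```

Either way the program has to be told what position 2 stands for; moving the tokens into element names does not supply that knowledge.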

Now each element name has to be stored against the element's node structure - as text (OK yes, it could be hashed or otherwise encoded) - but in the text node, the content also has to be stored as text (or some encoded version). So now you have many more element structures but the same amount of text. The text of each element name - e.g., the "west" names - must either be duplicated or interned, but the same can be said for the text content itself.
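
The bookkeeping can be sketched in Python with the standard library's xml.etree.ElementTree. This only counts element nodes and data characters as that one parser exposes them; it is an illustration under that assumption and says nothing about how any particular XPath or XSLT processor stores nodes internally. The helper names are hypothetical:

```python
import xml.etree.ElementTree as ET

text_model = ET.fromstring("<edge>garden west door home</edge>")
element_model = ET.fromstring("<edge> <garden/> <west/> <door/> <home/> </edge>")


def element_count(root):
    """Number of element nodes in the tree, including the root."""
    return sum(1 for _ in root.iter())


def data_chars_in_text(root):
    """Characters of data held as text content (separators excluded)."""
    return sum(len(token) for token in (root.text or "").split())


def data_chars_in_names(root):
    """Characters of data held as child element names."""
    return sum(len(child.tag) for child in root)


print(element_count(text_model), data_chars_in_text(text_model))
print(element_count(element_model), data_chars_in_names(element_model))
```

For this one edge the text-based model has one element and the element-based model has five, with 18 characters of data in both.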

How is that supposed to be more "efficient"? And why should it be any easier to process? It all depends on what use you plan to make of the data, and what the meaning is supposed to be. "West garden door" might be an atomic label, in which case you probably want it to be stored verbatim, not disassembled.

Before holding forth about processing efficiencies, and which XPath expression might be "faster", it would be wise to ask the people who actually know how the processing is done, such as XSLT designers. Otherwise you are merely engaging in the equivalent of premature optimization.

Anyway, XML's strengths are for data interchange, not necessarily "processing". If you want to emphasize small, fast data transfer, you could always use, e.g., one of the ASN.1 binary encodings. If you want to emphasize querying, you could always import into a database.

But I should stop here, since the original message sounds too much like a troll.

TomP

Case 2: Retrieve a range of data items (2nd to 3rd data items)

Text-based data model:

<edge>garden west door home</edge>

XPath to retrieve the desired range of data items:

tokenize(., ' ')[position() = (2,3)]

Element-based data model:

<edge> <garden/> <west/> <door/> <home/> </edge>

XPath to retrieve the desired range of data items:

*[position() = (2,3)]/name()

Clearly, retrieving a range of data items in the element-based data
model is simpler, easier, and faster.
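
For comparison outside XPath, here is a rough Python counterpart of the two range expressions, using the standard library's xml.etree.ElementTree. The helper name pick is hypothetical, and the sketch shows only what each expression selects, not how fast any processor evaluates it:

```python
import xml.etree.ElementTree as ET


def pick(items, positions):
    """Return the items at the given 1-based positions, in document order."""
    wanted = set(positions)
    return [item for index, item in enumerate(items, start=1) if index in wanted]


text_edge = ET.fromstring("<edge>garden west door home</edge>")
element_edge = ET.fromstring("<edge> <garden/> <west/> <door/> <home/> </edge>")

# Counterpart of tokenize(., ' ')[position() = (2,3)]
from_text = pick(text_edge.text.split(" "), (2, 3))

# Counterpart of *[position() = (2,3)]/name()
from_elements = pick([child.tag for child in element_edge], (2, 3))

print(from_text)
print(from_elements)
```

Both calls select the same two items; the only difference is whether the list being indexed comes from splitting text or from reading child element names.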

Of course, at the point of presentation we will want to convert the
element-based data model to a text-based data model.

Recommendation: Separate the data model from presentation details and
make data models element-based.

Let's take a dramatic example. Rather than this:

<living-room> you are in the living room. a wizard is snoring loudly
on the couch. </living-room>

use this as your data model:

<living-room> <you/><are/><in/><the/><living-room./>
<a/><wizard/><is/><snoring/><loudly/><on/><the/><couch./>
</living-room>

and use the former only for presentation.

Do you agree?

/Roger

_______________________________________________________________________

 XML-DEV is a publicly archived, unmoderated list hosted by OASIS to
support XML implementation and development. To minimize spam in the
archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php

References:
- Separate the data model from presentation and make the data model element-based
  - From: "Costello, Roger L." <costello@mitre.org>
