Re: Penance for misspent attributes


elements of penance
In <1021562444.984.6178.camel@l...>, "Simon St.Laurent"
<simonstl@s...> wrote:

| It's striking me more and more that developers, myself included, have
| done a poor job of examining and explaining how markup works and what
| the parts do best.  That extends to a key discussion which is generally
| considered dull but radioactive: the elements/attributes distinction.

The litmus test is whether one thinks these are or should be equivalent:

  1)  <foo bar="baz"/>
  2)  <foo><bar>baz</bar></foo>

I would say that the job so far, poor or not, has almost entirely been one
of propounding the equivalence view. 
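
To make the litmus test concrete, here is a minimal sketch using Python's
standard xml.etree.ElementTree.  The helper name summarize is made up for
illustration.  It only shows that a parser reports the two forms
differently; whether they *should* be treated as equivalent is the question
at issue.

```python
import xml.etree.ElementTree as ET

# The two forms from the litmus test above.
attr_form = ET.fromstring('<foo bar="baz"/>')
elem_form = ET.fromstring('<foo><bar>baz</bar></foo>')

def summarize(node):
    """Return (tag, attributes, child tags, text content) for one element."""
    return (
        node.tag,
        dict(node.attrib),
        [child.tag for child in node],
        "".join(node.itertext()),
    )

print(summarize(attr_form))  # ('foo', {'bar': 'baz'}, [], '')
print(summarize(elem_form))  # ('foo', {}, ['bar'], 'baz')
```

In form 1 the string "baz" is not part of the text content at all; in
form 2 it is the whole of it.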

| A lot of people have been storing data in attributes rather than in
| element content.  There are a lot of reasons for this, ranging from a more
| compact form to simpler processing in SAX.  

And, of course, Keeping Things Safe For Netploder.  Is there some taboo on
mentioning this?

| To some extent, the misuse arose because attributes had features
| (defaulting, free order, some types, enumeration) that elements didn't
| have.  W3C XML Schema condones those practices for attributes and
| extends the same features to elements.  Maybe this is an improvement,
| maybe it isn't.

Taking the minority view, I would say that it isn't.  That is, rather than
trying to unify attributes and (sub)elements - especially those that wind
up with the moral equivalent of (#PCDATA) content models - it may be more
fruitful to keep them distinct. 

| Separating markup from content - and putting attributes squarely in the
| markup side - seems like one means of at least alleviating the headache.

Well, that's how it all started (see, e.g., [1]).  My personal rule of thumb
has always been "elements for analysis, attributes for annotation".  The
key is the sense in which attributes are not directly "analytic".  In my
own attempts to explain this to (computer-savvy?) people, I've often drawn
a parallel with parsing theory, based on the similarities between content
models and BNFs (extended regular grammars).

Given a set of production rules, a successful parse yields a parse tree
with nonterminals as nodes and terminals as leaves.  With one twist, the
SGML/XML serialization of such a parse tree is obvious.  (The twist is in
the treatment of what are *taken* to be terminals, in that programs such
as Bison allow terminals of two kinds: variables instantiated by a lexer,
and string constants.  The former actually correspond to #PCDATA elements
with obvious expansions, the latter to text directly.)  
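
As a rough illustration of that serialization, the sketch below assumes a
toy grammar (sum : NUM '+' NUM) and a hand-built parse tree for the input
"1+2".  The tuple layout, the grammar, and the function name serialize are
all assumptions made for the example, not anything Bison itself produces.

```python
from xml.sax.saxutils import escape

# Hypothetical parse tree for the input "1+2" under the toy rule
#   sum : NUM '+' NUM
# Nonterminals are (name, [children]); lexer-supplied terminals are
# (name, text); string constants from the grammar are plain str.
tree = ("sum", [("NUM", "1"), "+", ("NUM", "2")])

def serialize(node):
    """Serialize a parse tree as XML, following the twist described above."""
    if isinstance(node, str):
        # String constant: emitted directly as text.
        return escape(node)
    name, body = node
    if isinstance(body, str):
        # Lexer terminal: becomes a #PCDATA-style element.
        return f"<{name}>{escape(body)}</{name}>"
    # Nonterminal: becomes an element wrapping its serialized children.
    return f"<{name}>" + "".join(serialize(c) for c in body) + f"</{name}>"

print(serialize(tree))  # <sum><NUM>1</NUM>+<NUM>2</NUM></sum>
```

Stripping the tags from the result gives back the original input "1+2".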

The basic outcome is a complete partitioning of the data into a hierarchy
of semantically meaningful categories.  Turning this around, an SGML/XML
instance basically represents a *complete parsing* of its text content.
That is, while the problem in parsing theory is to recognize input, the
primary intent of generalized markup is to express the result of a prior
process of recognition in the same formalism of parse trees.  Pushing the
analogy further, where attributes make their appearance in the semantic
processing of parse trees, markup-attributes are very similar to inherited
(as opposed to synthesized) parse-attributes.  
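
One way to picture the "inherited" reading is xml:lang, whose value applies
to an element's descendants unless overridden.  The sketch below is only an
illustration of that top-down flow under my own choice of example; the
document and the function name effective_lang are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

doc = ET.fromstring(
    '<doc xml:lang="en"><p>Hello</p><p xml:lang="fr">Bonjour</p></doc>'
)

def effective_lang(node, inherited=None):
    """Yield (tag, language) pairs, passing the value down from parent to child."""
    lang = node.get(XML_LANG, inherited)
    yield node.tag, lang
    for child in node:
        yield from effective_lang(child, lang)

print(list(effective_lang(doc)))
# [('doc', 'en'), ('p', 'en'), ('p', 'fr')]
```

The value travels from the root towards the leaves, annotating the tree
without adding anything to its text content.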

The basic lesson: Do not use attributes to *analyse* wholes into parts. 
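
A small sketch of what the lesson rules out, using an invented date
example (the element names, the calendar attribute, and the helper
text_content are all hypothetical).  In the first instance the parts are
elements and an attribute merely annotates; in the second, attributes are
made to carry the analysis and the text itself disappears.

```python
import xml.etree.ElementTree as ET

# Elements analyse the text; the attribute only annotates it.
analysed = ET.fromstring(
    '<date calendar="gregorian">'
    '<year>2002</year>-<month>05</month>-<day>16</day>'
    '</date>'
)

# Attributes used to break the whole into parts.
misused = ET.fromstring('<date year="2002" month="05" day="16"/>')

def text_content(node):
    """Return the text left over once all markup is stripped."""
    return "".join(node.itertext())

print(text_content(analysed))       # 2002-05-16
print(repr(text_content(misused)))  # ''
```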

[1] http://www.sgmlsource.com/history/AnnexA.htm
