Re: Penance for misspent attributes
On Thu, 2002-05-16 at 14:34, Arjun Ray wrote:
> The litmus test is whether one thinks these are or should be equivalent:
>
>   1) <foo bar="baz"/>
>   2) <foo><bar>baz</bar></foo>
>
> I would say that the job so far, poor or not, has almost entirely been one
> of propounding the equivalence view.

Yep - it's a good simple test.

> And, of course, Keeping Things Safe For Netploder. Is there some taboo on
> mentioning this?

No - I just don't use it much. Is there something specific in its
handling of attributes I've missed?

> Taking the minority view, I would say that it isn't. That is, rather than
> trying to unify attributes and (sub)elements - especially those that wind
> up with the moral equivalent of (#PCDATA) content models - it may be more
> fruitful to keep them distinct.

That's the conclusion I'm reaching, and strongly. Suddenly I can
abolish a whole group of annoying problems - if I just stick to elements
for content.

> | Separating markup from content - and putting attributes squarely in the
> | markup side - seems like one means of at least alleviating the headache.
>
> Well, that's how it all started (see eg, [1]). My personal rule of thumb
> has always been "elements for analysis, attributes for annotation". The
> key is the sense in which attributes are not directly "analytic". In my
> own attempts to explain this to (computer-savvy?) people, I've often drawn
> a parallel with parsing theory, based on the similarities between content
> models and BNFs (extended regular grammars).

Quite reasonable. (And the history is excellent to see.)

> Given a set of production rules, a successful parse yields a parse tree
> with nonterminals as nodes and terminals as leaves. With one twist, the
> SGML/XML serialization of such a parse tree is obvious. (The twist is in
> the treatment of what are *taken* to be terminals, in that programs such
> as Bison allow terminals of two kinds: variables instantiated by a lexer,
> and string constants. The former actually correspond to #PCDATA elements
> with obvious expansions, the latter to text directly.)

I think I get this, though I've not done much with Bison.

> The basic outcome is a complete partitioning of the data into a hierarchy
> of semantically meaningful categories. Turning this around, a SGML/XML
> instance basically represents a *complete parsing* of its text content.
> That is, while the problem in parsing theory is to recognize input, the
> primary intent of generalized markup is to express the result of a prior
> process of recognition in the same formalism of parse trees. Pushing the
> analogy further, where attributes make their appearance in the semantic
> processing of parse trees, markup-attributes are very similar to inherited
> (as opposed to synthesized) parse-attributes.

That makes good sense.

> The basic lesson: Do not use attributes to *analyse* wholes into parts.

Yes! A very nice way to describe the distinction.

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com
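
As a rough illustration of the litmus test quoted at the top of this message, the sketch below parses both forms and compares the resulting trees. It assumes Python's standard xml.etree.ElementTree module, which the thread itself does not mention, and the helper name shape is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two forms from the litmus test.
attr_form = ET.fromstring('<foo bar="baz"/>')
elem_form = ET.fromstring('<foo><bar>baz</bar></foo>')


def shape(node):
    """Summarise a node as (tag, attributes, text, children).

    Hypothetical helper for this sketch only; it ignores tail text.
    """
    return (
        node.tag,
        dict(node.attrib),
        (node.text or "").strip(),
        [shape(child) for child in node],
    )


print(shape(attr_form))
print(shape(elem_form))
print(shape(attr_form) == shape(elem_form))
```

In this data model the first form carries bar as an attribute on foo, while the second carries it as a child element with text content. A tree-based parser of this kind therefore keeps the two distinct, and any equivalence has to be imposed by the application.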