Handling unknown elements?
One dilemma I have been trying to figure out with XML is how to handle unknown element types and what to do with their children. For simple tree-based data modeling this is easy: if you come across an element that the application does not understand, you just ignore it and all of its children. But what if, as in the case of HTML, an application has mixed content where it understands the <B> tag for boldface text but does not understand the <I> tag for italicized text? The actual character data may be a child of the <I> element in this case.

In case anyone would like to know, I have designed an XML application framework that for now works fine for tree-based data modeling, where elements are simply nodes and the leaf nodes hold the actual character content. It will really have problems with documents in which all sorts of elements (and their properties) are applied to the character content. The only alternative for such documents is to use something like a DOM tree or else an event-based parser.

The framework is what you could call object based, in the sense that when the parser encounters a start tag or an empty-element tag, it retrieves the tag's name and asks the current parent element for an element object to handle that tag's attributes and content. Does anyone have any ideas for a solution that could be object based and document based as well?

I have thought of having an opaque "UNKNOWN" element handler object that would forward all queries for finding child elements to its parent element. The problem with that is how to know, and how to tell the application, whether a particular tag should be treated as an object-based tag, in which case all of its children should certainly be ignored, or whether its children should simply be joined (symbolically) to the parent of the "UNKNOWN" tag.
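To make the two possible behaviours concrete, here is a minimal sketch in Python on top of the standard xml.sax module. It is not the framework described above: the names ElementHandler, Unknown, Dispatcher, child_for and join_unknown are all hypothetical, and the per-handler policy attribute is just one assumed way of telling the dispatcher whether an unknown child should be skipped or treated as transparent.

```python
import xml.sax


class ElementHandler:
    """Hypothetical per-element handler object (illustration only)."""

    children = {}         # known child tag name -> handler class
    join_unknown = False  # True: splice unknown children's content into this
                          # element; False: skip the whole unknown subtree

    def __init__(self):
        self.text = []

    def child_for(self, name):
        """Return a handler for a known child tag, or None if unknown."""
        cls = self.children.get(name)
        return cls() if cls else None

    def characters(self, data):
        self.text.append(data)


class Unknown(ElementHandler):
    """Opaque stand-in that forwards lookups and text to its parent."""

    def __init__(self, parent):
        super().__init__()
        self.parent = parent
        self.join_unknown = parent.join_unknown

    def child_for(self, name):
        return self.parent.child_for(name)

    def characters(self, data):
        self.parent.characters(data)


class Dispatcher(xml.sax.ContentHandler):
    """Routes SAX events to a stack of element handler objects."""

    def __init__(self, root):
        super().__init__()
        self.root = root
        self.stack = []
        self.skip_depth = 0  # > 0 while inside a skipped unknown subtree

    def startElement(self, name, attrs):
        if self.skip_depth:
            self.skip_depth += 1
            return
        if not self.stack:
            self.stack.append(self.root)  # document element
            return
        parent = self.stack[-1]
        handler = parent.child_for(name)
        if handler is None:
            if not parent.join_unknown:
                self.skip_depth = 1       # ignore this element and its children
                return
            handler = Unknown(parent)     # transparent: content joins the parent
        self.stack.append(handler)

    def endElement(self, name):
        if self.skip_depth:
            self.skip_depth -= 1
            return
        self.stack.pop()

    def characters(self, content):
        if not self.skip_depth and self.stack:
            self.stack[-1].characters(content)


class Bold(ElementHandler):
    join_unknown = True   # document-style: keep text of unknown children


class StrictBold(ElementHandler):
    join_unknown = False  # object-style: drop unknown children entirely


def parse(xml_text, root):
    xml.sax.parseString(xml_text.encode("utf-8"), Dispatcher(root))
    return "".join(root.text)
```

With these assumptions, parse("<B><I>Foo</I><I>Bar</I></B>", Bold()) collects the text of both unknown <I> elements into the <B> handler, while the same input with StrictBold() collects nothing, because each <I> subtree is skipped.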
I know this might seem a little convoluted, but here is what I am trying to say, in XML:

    <B> <I> Foo </I> <I> Bar </I> </B>

Using the opaque "UNKNOWN" element, it would look like this in tree form if the <I> tag were unknown:

            <B>
        |         |
    <UNKNOWN> <UNKNOWN>
        |         |
      "Foo"     "Bar"

Symbolically this could be represented as simply:

            <B>
        |         |
      "Foo"     "Bar"

Which in document format would evaluate to:

       <B>
        |
    "FooBar"

However, if I were to do all of this in object format, any unknown child elements of <B> (in this case the <I> elements) would be skipped, along with all of the other sub-elements contained in <I>, regardless of their type.

The only solution I can think of to this dilemma is to give each element object a boolean flag that tells the XML application framework (which happens to be a parser now, but could easily be built on top of SAX in half an hour) whether to ignore unknown child elements or to join the children of unknown child elements as its own children.

Anyone here got any better ideas on this?

Tyler

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)