Re: Handling unknown elements?
At 18:45 08/04/98 -0400, Tyler Baker wrote:
>One dilemma I have been trying to figure out with XML is the problem of
>handling unknown element types and what to do with their children.
[...]
>
>Anyone here got any better ideas on this?

Well I have some ideas ... :-) The problem I address (in JUMBO2) is "what do I do when someone sends me an XML document without any/enough accompanying material telling me what to do with it?" If this is similar to your problem, read on :-)

(1) If the DTD is present it can tell you whether the document is valid. There is no agreed mechanism whereby a DTD can carry additional semantics. So your DTD could tell you whether a B element can contain mixed content including an I element; it can't tell you what they mean.

(2) There is no universal generic mechanism for adding semantics to an XML document.

(3) If the main purpose of the document is to be rendered for humans, then stylesheets should be used. If the author creates their own tagset and doesn't provide a stylesheet, many XML-aficionados will give up at this stage. E.g. a document:

    This is a <FOO>bold <BAR>italic</BAR> phrase</FOO>

is as valid as one using B and I, but the reader has to do some detective work. They'd probably give up on most such documents.

(4) If the main purpose of the document is for a machine to act upon it (and not everyone realises the enormous potential of XML here), then another way of communicating semantics has to be provided. The method I use is to map Java classes onto elements. This can use a wide degree of context-dependence and can be very powerful. Example:

    <MOL><ATOMS> <ARRAY BUILTIN="X2">... </ARRAY></ATOMS></MOL>

will draw a chemical line drawing.

    <MOL><ATOMS> <ARRAY BUILTIN="X3">... </ARRAY></ATOMS></MOL>

will draw a rotatable 3-D molecule. The JUMBO-MOL software is (obviously) application-specific and uses XPointers extensively to decide on context.
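
The idea in (4), choosing a class for an element according to its context, can be sketched roughly as follows. This is an illustration only, written in Python rather than Java for brevity. The class names, the REGISTRY table and the simple path-string matching are all assumptions made for the example; they are not JUMBO's code, and JUMBO-MOL is described above as using XPointers, not this lookup.

```python
import xml.etree.ElementTree as ET


class Handler:
    """Base class: one handler object is attached to one element."""
    def __init__(self, element):
        self.element = element


class LineDrawing(Handler):
    """Hypothetical stand-in for a 2-D chemical line drawing."""
    def describe(self):
        return "2-D line drawing"


class RotatableMolecule(Handler):
    """Hypothetical stand-in for a rotatable 3-D molecule."""
    def describe(self):
        return "rotatable 3-D molecule"


class Unknown(Handler):
    """Fallback for elements with no registered handler."""
    def describe(self):
        return "unknown element <%s>" % self.element.tag


# (path from the root, attribute name, attribute value) -> handler class.
# The path is the "context": ARRAY only gets a handler inside MOL/ATOMS.
REGISTRY = {
    ("MOL/ATOMS/ARRAY", "BUILTIN", "X2"): LineDrawing,
    ("MOL/ATOMS/ARRAY", "BUILTIN", "X3"): RotatableMolecule,
}


def handlers(root):
    """Walk the tree and yield one handler per element, chosen by context."""
    def walk(element, path):
        here = path + [element.tag]
        cls = Unknown
        for (ctx, attr, value), candidate in REGISTRY.items():
            if "/".join(here) == ctx and element.get(attr) == value:
                cls = candidate
        yield cls(element)
        for child in element:
            yield from walk(child, here)
    return walk(root, [])


def describe(xml_text):
    """Parse a document and describe how each element would be handled."""
    root = ET.fromstring(xml_text)
    return [h.describe() for h in handlers(root)]
```

With this table, the same ARRAY element is handled differently depending on its BUILTIN value, and an ARRAY that appears outside MOL/ATOMS falls through to Unknown, which is where the question of what to do with unknown elements starts.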

(5) To help with the first three problems JUMBO2 now has the following *generic* facilities which help with 'unstyled' random XML documents:

- search the document for all elements, attributes, attribute values, and PCDATA content and uniquify them
- display this as a tree showing unique markup components. This is linked to the original document (tree). Thus, I may find that <bibref> occurs in rec.xml. What does it mean? I can use JUMBO2 to find all the occurrences of <bibref> in the doc and highlight them all (almost instantaneous, now :-)
- find all 'whitespace' elements and delete them. This aids tree navigation in some cases
- display the content of any node (whether mixed or element) in several different styles. These include:
    raw XML
    untagged event stream (e.g. similar to removal of unknown tags)
    prettyprinted XML (indented)
    whitespace specifically highlighted
    'default' styling

The default styling applies simple heuristics to display elements. Thus <SPEAKER>MACBETH</SPEAKER> is displayed as:

    SPEAKER: MACBETH

where the markup term is in a different font. This is useful for many generic XML documents. In addition JUMBO will allow you to add your own style to individual elements. Thus <olist> in rec.xml would appear to be a list, so the user can interactively add list-formatting to it. In your case you could arrange that <B> was made bold and <I> was made italic.

[I am not prepared to 'guess' the meaning of common tags - e.g. <A> - and the reader has to take the responsibility for this. I would hope that the world might converge towards common semantics for common terms, and XML-DEV is here if anyone wishes. But if you want to use <PARA> for a chemical term rather than a paragraph, you're perfectly welcome to - XML doesn't care :-)].

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)