Re: XML and HTML Intermixed
In message <9706261658.AB03565@h...> "Seibel, Robert R" writes:
> XML Dev. Team:

There is no 'team' other than the public-spirited members of this list and others :-). Everyone is invited to join in - no entrance qualifications - just a willingness to help the development process.

>
> In my application, I see the need to be able to mix XML (my own tags)
> and HTML tags in a core content database.  I plan on using a DTD
> at various authoring points to validate structure and tags.

This is an absolutely key question - which some of us raise at regular intervals. My analysis - which I hope others will challenge or amplify - is something like this:

HTML2.0 and HTML3.2 *at present* are SGML-compatible (if properly authored, with balanced tags, quoted attributes, etc.). They are not XML-compatible for reasons which have been discussed here (inclusions/exclusions, '&' content models, etc. in the DTD, and some EMPTY tags which require the <FOO/> syntax in XML). We all expect that 'someone' will convert common DTDs to XML and HTML is a leading candidate but so far no-one has actually done it. (IMO it needs to have the (in)formal blessing of the W3C, since HTML is a W3C protegee).

So the question might break down to:

(a) can I mix HTML(non-XML) with XML in the same document?

This would not be a valid XML document overall, but it might be valid input to an HTML browser which recognised XML markup. It's up to the browser (or other software) creator as to whether that's meaningful.

(b) can I refer to an XML document from an HTML document?

This is simple if there is a MIME type for XML, since standard helper technology can be used. [This is what I do for CML (Chemical Markup Language) and I use the browser to call a viewer for text/xml or chemical/x-cml]. It is generally believed that 'someone' is submitting an application to IETF/IANA for registration of the text/xml MIME type (??Progress??).

(c) can I XML-ise HTML and mix it with my own DTD?

Yes. It depends on how this is done.
I have edited HTML2.0 to be XML-compliant for my own purposes. CML 'contains' HTML2.0 as part of the CML DTD. This guarantees there are no namespace problems (i.e. CML cannot have identical ELEMENTs to those in HTML). So this allows CML documents to contain chunks of XML-ised HTML. Rendering these is non-trivial, because it is not easy to pass HTML to the browser without using Javascript and I do not like doing this (non-portable, flaky, etc.). Moreover I have tweaked my HTML to use the full XML-LINK syntax for tags such as <A>.

(d) Can I use HTML with my document if I have an ElementType which clashes with one in HTML?

Not easily. The question of combining DTDs and document fragments has exercised the ERB/WG and generated megabytes of opinion. A solution will appear at some time in the future.

(e) Can I use XML-ised HTML and include XML-LINKs to other XML documents?

Yes, if the HTML has been extended to use XML-LINK. This is what I do to avoid namespace clashes. It may have its detractors.

Be warned that there is not much software which can display XML documents using two different DTDs at the same time; I'm working out how JUMBO will do this - if I get some answers to my LINK queries it should be fairly straightforward.

>
> Do you see mixing tags as reasonable?  The XML tags could be converted
> to the appropriate HTML tags if sent to a browser.  Then again

There are normally no default 'appropriate HTML tags'. How would you convert

<FOO>
<BAR>276+354/872=6354?</BAR>
</FOO>

to HTML? One way to tackle this is through stylesheets (CSS1 or DSSSL) where appropriate formatting/rendering is applied to each tag, including context. Alternatively (as in JUMBO) Java classes can be supplied for each ElementType which might convert to HTML. (For example, MOLecule in CML has 1500 lines of Java which among many other things will render it as HTML).

> all of the tags or information could be formatted for the appropriate
> output device on the fly.
>
> For instance, I may have a tag called PROBLEM and another called SOLUTION.
> As I'm explaining the solution, it would be nice to use HTML tags to
> explain the solution.
>
> Example:
>
> <PROBLEM>Problem description</PROBLEM>
> <SOLUTION>
> <OL>
> <LI>Do this first</LI>
> <LI>This is second</LI>
> </OL>
> <P>Call me on questions.</P>
> <SOLUTION>
>
> Let's say I used a style sheet to display the contents.  It seems to me that
> using HTML tags intermixed with XML tags is a good thing.  I don't have to
> reinvent my own tags when HTML already defines them.
> Comments?

I am strongly in favour of re-using DTDs and document fragments. So many chemical documents will draw from 3 DTDs:

- HTML for the main text
- MathML for the mathematics
- CML for the chemistry

The ERB/WG has debated this at great length and accepts it as very desirable and high-priority. No actual mechanism is given at present. An additional character has been reserved for NAMEs in case we need to use it for namespaces in the future, but we're not allowed to use it yet [I think that is the correct position??].

To summarise, I believe that mix-and-match from different DTDs is a valid and useful approach to XML. It means that there can be 'islands of validity' [an idea from the WG] within XML documents, so that XML-WF docs will not be semantically void tag soup. The difficulty at present is how those islands are identified - there is no consensus yet.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
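
The per-ElementType conversion idea discussed in the message can be illustrated with a small sketch. This is not the JUMBO code (which is Java) and not a recommended design, only one possible reading of "convert each application element to HTML" applied to the quoted PROBLEM/SOLUTION fragment. The REPORT wrapper element, the CUSTOM_TO_HTML mapping, the CSS class names and the helper names to_html and render are all assumptions made up for the illustration, and the final tag of the quoted example is written here as a proper closing tag so that the input is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the quoted fragment, wrapped in an assumed REPORT
# root element and with the SOLUTION element properly closed.
SOURCE = """<REPORT>
<PROBLEM>Problem description</PROBLEM>
<SOLUTION>
<OL>
<LI>Do this first</LI>
<LI>This is second</LI>
</OL>
<P>Call me on questions.</P>
</SOLUTION>
</REPORT>"""

# Assumed mapping from application element types to (HTML tag, class).
# Element types not listed here are treated as XML-ised HTML already.
CUSTOM_TO_HTML = {
    "REPORT": ("div", "report"),
    "PROBLEM": ("h2", "problem"),
    "SOLUTION": ("div", "solution"),
}


def to_html(elem):
    """Build a copy of elem in which custom element types become HTML ones."""
    if elem.tag in CUSTOM_TO_HTML:
        tag, css_class = CUSTOM_TO_HTML[elem.tag]
        out = ET.Element(tag, {"class": css_class})
    else:
        # Pass HTML elements through, lower-casing the tag name.
        out = ET.Element(elem.tag.lower(), dict(elem.attrib))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_html(child))
    return out


def render(xml_text):
    """Parse well-formed XML text and serialise the converted tree as HTML."""
    root = to_html(ET.fromstring(xml_text))
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    print(render(SOURCE))
```

The mapping only works here because none of the application element names clash with HTML ones, which is the situation described under (c); it does nothing to solve the clash case in (d).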