
Re: Never mind the browser, let's do MicroXML

  • From: Kurt Cagle <kurt.cagle@gmail.com>
  • To: David Carlisle <davidc@nag.co.uk>
  • Date: Sat, 18 Dec 2010 09:22:36 -0500

So you end up with problematic elements such as <br/>, which XML interprets as an empty element named "br" with no children, but which HTML5 interprets as <br></br>, that is, as an element with an associated text node, albeit one with a string length of 0?

Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project



On Fri, Dec 17, 2010 at 9:41 PM, David Carlisle <davidc@nag.co.uk> wrote:
On 18/12/2010 00:45, Kurt Cagle wrote:
Interesting (and thanks for the civil reply - I've rather been making a
stink of myself on this lately).

What I sense that you're saying is that while the parser will attempt to
parse anything thrown at it, there is still a core set of parse rules
that are independent of the underlying semantics of the language. Put
another way, there is a set of well-formedness rules, but the role of
the parser is to provide a guess, based upon its internal heuristics, as
to which particular rules apply when it encounters non-well-formed
content in order to turn it into well-formed content prior to rendering
it. Or, to state it yet another way, if a creator knows the heuristics
they could encode any content ... just that there are specific use cases
in XML that would create a different parse tree in HTML5. Would you say
this is correct?

Kurt Cagle
XML Architect
Lockheed / US National Archives ERA Project

I don't think you can (or at least you shouldn't) apply words like "guess" and "heuristic" to a process that is entirely mechanical and deterministic.

HTML5 isn't an extensible meta-language like SGML or XML: it has a fixed set of element names, and any use of any other name is non-conforming (which is the closest analogue to the XML or SGML concept of validity). The difference from XML or SGML, however, is that in the non-conforming case it doesn't just declare the input out of scope as "not well-formed". It defines a parse tree for _every_ input. Essentially, the conformance rules are defined as applying only to authors (and authoring systems); an HTML5 processor has defined behaviour on any old rubbish.

<aaa<bbb</zzz>

has a defined parse tree. I don't actually know what it is, but FF4 will tell me...

If I read that right, it parses as an element with name aaa<bbb< and a single attribute with name zzz and value "".
I may have read that wrong (it's late), but it doesn't really matter; the point is that it has some fixed parse tree.
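
To illustrate why the result is mechanical rather than a guess, here is a minimal Python sketch of a small subset of the HTML5 tokenizer states (tag name, self-closing start tag, and the attribute name states). It is an assumption-laden simplification, not a conforming tokenizer: the function name tokenize_start_tag is hypothetical, it handles a single start tag only, and it does not implement attribute values, comments, end tags, or tree construction.

```python
WHITESPACE = " \t\n\f"


def _lower_ascii(ch):
    # Only ASCII upper-case letters are lower-cased in tag and attribute names.
    return ch.lower() if "A" <= ch <= "Z" else ch


def tokenize_start_tag(text):
    """Tokenize one start tag; returns (tag_name, attributes, self_closing).

    Simplified sketch: attribute values ('=') are not supported.
    """
    if len(text) < 2 or text[0] != "<" or not (text[1].isascii() and text[1].isalpha()):
        raise ValueError("sketch only handles input starting with '<' and an ASCII letter")
    name, attrs, current = [], {}, []
    state = "tag name"
    i = 1
    while i < len(text):
        ch = text[i]
        if state == "tag name":
            if ch in WHITESPACE:
                state = "before attribute name"
            elif ch == "/":
                state = "self-closing start tag"
            elif ch == ">":
                return "".join(name), attrs, False
            else:
                # Any other character, including '<', becomes part of the name.
                name.append(_lower_ascii(ch))
        elif state == "self-closing start tag":
            if ch == ">":
                return "".join(name), attrs, True
            # Parse error: the '/' is dropped and the character is reconsumed.
            state = "before attribute name"
            continue
        elif state == "before attribute name":
            if ch in WHITESPACE:
                pass
            elif ch in "/>":
                state = "after attribute name"
                continue
            else:
                current = []
                state = "attribute name"
                continue
        elif state == "attribute name":
            if ch in WHITESPACE or ch in "/>":
                # A duplicate attribute name keeps the first occurrence.
                attrs.setdefault("".join(current), "")
                state = "after attribute name"
                continue
            if ch == "=":
                raise NotImplementedError("attribute values are outside this sketch")
            current.append(_lower_ascii(ch))
        elif state == "after attribute name":
            if ch in WHITESPACE:
                pass
            elif ch == "/":
                state = "self-closing start tag"
            elif ch == ">":
                return "".join(name), attrs, False
            else:
                current = []
                state = "attribute name"
                continue
        i += 1
    raise ValueError("end of input inside a tag; no tag token is emitted")


print(tokenize_start_tag("<aaa<bbb</zzz>"))
```

Under these simplified rules the input above yields the tag name aaa<bbb< with a single attribute zzz whose value is the empty string, which matches the reading given above. The same input always takes the same path through the states, so nothing is being guessed.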

You cannot express an arbitrary valid XML tree as conforming HTML5, since the document stops conforming as soon as you use a non-HTML/MathML/SVG element name. However, so long as you avoid those names, you can deterministically produce input that will parse to give essentially the same tree structure as XML without namespaces (basically, just avoid using the /> syntax).
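
As a sketch of the "avoid the /> syntax" advice, the following Python uses the standard library's xml.etree.ElementTree to serialize a tree with explicit start and end tags for empty elements. The element names (div, span, p) and the helper name serialize_without_empty_tag_syntax are illustrative assumptions, and the sketch does not address HTML void elements such as br, which need separate treatment.

```python
import xml.etree.ElementTree as ET


def serialize_without_empty_tag_syntax(root):
    # short_empty_elements=False writes <span></span> instead of <span />.
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


# Build a small tree containing one empty element and one element with text.
root = ET.Element("div")
ET.SubElement(root, "span")
para = ET.SubElement(root, "p")
para.text = "text"

print(serialize_without_empty_tag_syntax(root))
# prints: <div><span></span><p>text</p></div>
```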

David










