
Parser Behaviour (serious)

  • From: Peter Murray-Rust <peter@u...>
  • To: xml-dev@x...
  • Date: Sun, 02 Apr 2000 10:22:51 +0100

<Note>This message is serious<smiley/></Note>

I have been preparing a large amount of XHTML (for our VirtualXML activity)
and using Dave Raggett's excellent tidy program (with option -asxml) to
produce XHTML files of the sort:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Test page</title>
</head>
<body>
<p>A test</p>
</body>
</html>

These files work fine as HTML, and are conforming XML 1.0, but when I try
to parse them on my laptop using either AElfred or Xerces I get:

java.net.UnknownHostException: www.w3.org

What's wrong? Ah! The parser is trying to resolve the URL for the DTD and
since I'm offline (connections cost money over here) it can't. So the file
I have created can only be processed as XML if:
	(a) I am connected online
	(b) the W3C maintain *** for all time *** a means of dereferencing either
the FPI or the URL

I can't believe this is what the community wants. It fooled me, and I've
been working with XML for some time.

I still believe that undefined parser behaviour is going to be a major
deterrent to many people who want to take up XML. I have posted on this
before. I am going to keep on about it. The most common reaction I seem to
have so far is "Well that's how XML behaves - it's *your* problem to decide
how to process XML". This isn't good enough. In the current case I simply
want to switch off the parser's attempt to resolve the DTD. I would
appreciate something like:

	"Parser failed to resolve external SYSTEM identifier in DOCTYPE:
http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
	  To disable DTD look-up use -nosysid option"
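
One possible way to get the "switch it off" behaviour from application code is a SAX EntityResolver that hands the parser an empty entity instead of letting it open the SYSTEM identifier. The following is only a sketch under assumptions: it uses the JAXP SAXParserFactory bundled with current JDKs rather than the AElfred or Xerces releases discussed here, and the class name OfflineParse and method elementNames are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any parser distribution.
final class OfflineParse {

    // Parses the document without dereferencing its external DTD and
    // returns the local names of the elements in document order.
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Supply an empty entity instead of opening systemId,
                // so no network connection is attempted.
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(localName);
            }
        };
        reader.setContentHandler(handler);
        reader.setEntityResolver(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n"
            + "    \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\n"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n"
            + "<head><title>Test page</title></head>\n"
            + "<body><p>A test</p></body>\n"
            + "</html>\n";
        System.out.println(elementNames(xml));
    }
}
```

The trade-off is that the resolver blanks every external entity, not only the DTD, so a document that relies on entities declared there (for example &nbsp; in XHTML) will no longer parse the same way. Later Xerces releases also document a feature, http://apache.org/xml/features/nonvalidating/load-external-dtd, that can be set to false for a similar effect; whether a given parser supports it is something to check against its own documentation.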

So, for about the third time (and it took 3 times to get SAX1.0 off the
ground), what are we going to do about specifying parser behaviour? I have
shown in public how the failure to process external entities breaks systems.

Until we resolve this question (and probably several others), XML 1.0 is
broken as an interoperable "standard".

	P.

[No criticism is aimed at Dave Raggett, who has written a splendid tool, or
the W3C who actually have a real DTD mounted at the URL mentioned. Nor to
the authors of the parsers who have done their best to provide a default
behaviour, and in the absence of any guidance have required their parsers
to access an external DTD (very reasonably).]




***************************************************************************
This is xml-dev, the mailing list for XML developers.
To unsubscribe, mailto:majordomo@x...&BODY=unsubscribe%20xml-dev
List archives are available at http://xml.org/archives/xml-dev/
***************************************************************************
