
Re: Error and Fatal Error

  • From: Peter Flynn <peter@silmaril.ie>
  • To: xml-dev@lists.xml.org
  • Date: Sun, 17 Jul 2011 23:32:13 +0100

On 16/07/11 18:17, Stephen D Green wrote:
> Absolutely.
>
> I hunted around the .Net framework hoping to find such
> a parser which allowed me to repair the XML but I couldn't
> find one.

I suspect it's more the case that such a program is really a "tool", not 
a "parser" per se. As such it would of course *contain* a parser, and be 
capable of parsing an XML document, and could therefore quite correctly 
be described as "containing a conformant XML parser".
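To make the distinction concrete, here is a minimal sketch in Python (not .NET), using only the standard library's xml.etree.ElementTree. The parser simply stops at the first well-formedness error; the surrounding "tool" is what turns that into something a person can act on. The function name `check` is my own invention for illustration.

```python
import xml.etree.ElementTree as ET


def check(text):
    """Return None if text is well-formed XML, else (line, column, message).

    The parser only reports the first fatal error it meets; deciding what
    to do with that report is the job of the tool wrapped around it.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return line, column, str(err)
    return None
```

Calling `check("<a><b></a>")` returns a tuple locating the mismatched tag, while `check("<a><b/></a>")` returns None.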

It's what it would do (or let you do)  with the (possibly mangled) 
*results* of the parse that would differentiate it from the traditional 

IMHE only relatively trivial (usually single-character) errors can be 
corrected on-the-fly, such as

  * mistyped element type names, attribute names, or token-list

  * missing or extra attribute quotes, ampersands, or pointy brackets;

  * bogus or garbled characters resulting in or from a character-
    encoding error.

These tend to happen because the document has been hand-corrected 
without using a conformant editor -- still a frighteningly common 
occurrence. In these cases there is often more than one such error 
present, except in very short documents, and it is thus often better to 
handle the document in a suitable editor with good error-reporting and 
the robustness to cope with partially-marked or invalid documents.
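As a sketch of the kind of single-character repair I mean, here is a Python example that fixes one class of error only: a bare ampersand that does not begin an entity or character reference. It uses the standard library's xml.etree.ElementTree and re modules; the names `BARE_AMP` and `parse_with_amp_repair` are hypothetical, and this is an illustration under those assumptions, not a recommended general-purpose fixer.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that is NOT the start of an entity reference (&name;)
# or a character reference (&#123; or &#x7B;).
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_with_amp_repair(text):
    """Parse text; on failure, escape bare ampersands and try once more.

    Returns (root_element, number_of_ampersands_repaired).
    Any other kind of error is re-raised untouched.
    """
    try:
        return ET.fromstring(text), 0
    except ET.ParseError:
        repaired, count = BARE_AMP.subn("&amp;", text)
        if count == 0:
            raise  # nothing we know how to fix
        return ET.fromstring(repaired), count
```

Given `<p>Fish & chips &amp; peas</p>`, this repairs the one bare ampersand and leaves the existing `&amp;` alone. It is deliberately naive: if the document is broken for some other reason, the regular expression will also rewrite ampersands inside CDATA sections, which is exactly the sort of case better handled in an editor.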

Any error more complex than these, such as those where entire subtrees 
of the structure are misplaced in the markup framework, or where a 
persistent disruption of the syntax causes a cascade of errors, can 
really only be dealt with by opening the broken document in an editor 
and fixing it (or by regenerating it, as appropriate).

 > I think we need a parser which understands the
 > slightly erroneous XML and can find any errors in it:
 > In short we need a parser which has an API which
 > can allow the web developer (in this case with .NET)
 > to repair XML.

I'm not entirely convinced that a parser-with-editorial-cleanup would be 
significantly more use for this purpose than the standard 
editor-with-builtin-parser model. But I can well understand the 
attraction of wanting to cope inline with the kind of garbage most users 
blithely paste into text fields in web-based applications, fondly 
imagining that the Elves will automagically fix their crud into XML.

(In a few circumstances I am in the very fortunate position of being able 
to send it back to them and tell them to fix it, because we have very 
strict rules about this, and the penalty for disobedience is that their 
web page or document simply won't be published until they send us good 
data. But that is a luxury that I can justify by having cut the cost of 
cleanup and error to virtually zero by dint of a lot of user training in 
how to avoid creating crud in the first place. Unfortunately that's a 
long-term strategy that most companies won't even consider; not because 
of the up-front cost, which is high but not unaffordable; but because it 
shows up their internal quality controls to be useless to the point of 
non-existence, and that embarrasses the senior people.)

