XML Editor
Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Yves Genest
Date: 14 Jul 2006 11:17 AM
Hello,

When I try to extract data from an XML file, I get the following error:

XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).


I'm using the javax.xml.transform.Transformer class to extract data with an XSL stylesheet.

I guess I'm missing some language tag in my input XML file, but I don't know exactly what it is.

Please help!


Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Yves Genest
Date: 14 Jul 2006 11:37 AM
I found that the problem comes from a URL embedded in my HTML that has a CGI parameter called lang. I need to find a way for JTidy to ignore or bypass URL parameters. Any ideas?

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Ivan Pedruzzi
Date: 14 Jul 2006 11:46 AM
Hi Yves

Could you post the HTML fragment and the XML fragment after processing with Tidy?

Ivan Pedruzzi
Stylus Studio Team

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Yves Genest
Date: 14 Jul 2006 01:10 PM
Hi Ivan,

This is the fragment before:
<A HREF="/nexres/search/power_search.cgi?&src=10014963&ses=61f08d084812d3568f490653fb9863a926798&src_aid=&path=&unps=&lang=" target=_parent>Hotels</A>

The fragment after:
<a
href="/nexres/search/power_search.cgi?

&amp;src=10014963&amp;ses=892d7442df877c3aa4b886503227ae1590827&amp;src_aid=&amp;path=&amp;unps=&lang;="
target="_parent">Hotels</a>

As you can see, Tidy treats lang as an entity and adds the ; after it. Shouldn't it be translated into &amp;lang= ?
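
To illustrate why the Tidy output above fails downstream, here is a minimal sketch that feeds both forms of the attribute to the JDK's built-in XML parser. The class name EntityCheck and the shortened URLs are hypothetical and only for illustration. The exact error text depends on the parser in use, so the XML-22004 message will not necessarily appear, but any conforming XML parser should reject a reference to an undeclared entity such as &lang;.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class EntityCheck {

    /** Returns true if the given string parses as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler
            // print them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Includes the "entity referenced but not declared" case.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened, hypothetical versions of the fragment in this post.
        String fromTidy = "<a href=\"power_search.cgi?unps=&lang;=\">Hotels</a>";
        String escaped  = "<a href=\"power_search.cgi?unps=&amp;lang=\">Hotels</a>";

        System.out.println("Tidy output well-formed:  " + isWellFormed(fromTidy));
        System.out.println("Escaped form well-formed: " + isWellFormed(escaped));
    }
}
```

The first string references an entity named lang with no DTD to declare it, which is a well-formedness error in XML; the second contains only the predefined &amp; reference.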

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Tony Lavinio
Date: 17 Jul 2006 09:55 AM
The problem is that the HTML is ambiguous.
Ampersands in HTML should be escaped, although browsers generally
will let them through.
But, &lang; is a valid HTML entity. That's why it behaves differently
from the others. It corresponds to Unicode 9001 decimal - the
left-pointing angle bracket.
Tidy thinks it is a special character, which actually it is. A fully
standards-conforming browser, not operating in 'quirks' mode, would also
see it that way and not as a parameter named lang.
See http://www.w3.org/TR/html401/loose.dtd, strict.dtd and
HTMLsymbol.ent and also symbol hex 2329 (decimal 9001) at
http://www.unicode.org/charts/PDF/U2300.pdf

The solution is either to change the source HTML if you can, or write
code that replaces the value &#x2329; with the string &amp;lang.
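
A minimal sketch of the first option (fixing the HTML before Tidy sees it), assuming the page is available as a String and can be pre-processed. The class AmpersandEscaper and its method are hypothetical names, not part of JTidy or Stylus Studio. It escapes every & that is not already the start of a complete reference ending in a semicolon, so &lang= becomes &amp;lang= while &amp; and &#x2329; are left alone.

```java
import java.util.regex.Pattern;

public class AmpersandEscaper {

    // Matches '&' unless it starts a complete reference terminated by ';':
    // a named entity (&name;), a decimal reference (&#9001;)
    // or a hex reference (&#x2329;).
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
        "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    /** Escapes bare ampersands so that "&lang=" is no longer read as an entity. */
    public static String escapeBareAmpersands(String html) {
        return BARE_AMPERSAND.matcher(html).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // Shortened version of the anchor from this thread.
        String before = "<A HREF=\"/nexres/search/power_search.cgi?"
            + "&src=10014963&src_aid=&path=&unps=&lang=\" target=_parent>Hotels</A>";
        System.out.println(escapeBareAmpersands(before));
    }
}
```

This is a blanket text substitution, so it would also rewrite ampersands inside inline scripts (for example && in JavaScript); restricting it to attribute values such as href would need an additional step.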

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Yves Genest
Date: 17 Jul 2006 01:16 PM
Thanks Tony,

In fact, I discovered that the internal Stylus Studio HTML-to-XML converter does not have this problem. I also ran into problems with Tidy and JavaScript. Which Java class does Stylus Studio use to convert HTML into XML?

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Tony Lavinio
Date: 14 Jul 2006 12:33 PM
Try replacing the ?lang= in your URL with %3Flang=, and if that doesn't
work try %25%3Flang=. Let us know if this helps.

Subject: XML-22004: (Fatal Error) Error while parsing input XML document (Missing entity 'lang'.).
Author: Tony Lavinio
Date: 14 Jul 2006 01:06 PM
Ivan asked if you could post a fragment; without seeing
more detail, it's not obvious how to help you further.

 