RE: JDOM XSLT TransformerConfigurationException
- From: "Michael Kay" <mike@s...>
- To: "'Jack Bush'" <netbeansfan@y...>,"'Robert Koberg'" <rob@k...>
- Date: Mon, 5 Jan 2009 11:19:14 -0000
Well, for some reason it looks as if you are trying to
parse using TagSoup but the stack trace shows you are actually parsing using
Xerces.
Michael Kay
http://www.saxonica.com/
Hi Michael,
The following statements generated state.xml file:
URL stateUrl = new URL("http://www.abc.com");
URLConnection stateconnection = stateUrl.openConnection();
stateisInHtml = stateconnection.getInputStream();
statedisInHtml = new DataInputStream(new BufferedInputStream(stateisInHtml));
System.out.flush();
statefosOutHtml = new FileOutputStream("state.html");
while ((oneChar=statedisInHtml.read()) != -1)
statefosOutHtml.write(oneChar);
.....
statefrInHtml = new FileReader("state.html");
statebrInHtml = new BufferedReader(statefrInHtml);
SAXBuilder statesaxBuilder = new SAXBuilder("org.ccil.cowan.tagsoup.Parser", false);
org.jdom.Document statejdomDocument = statesaxBuilder.build(statebrInHtml);
XMLOutputter stateoutputter = new XMLOutputter();
statefwOutXml = new FileWriter("state.xml");
statebwOutXml = new BufferedWriter(statefwOutXml);
stateoutputter.output(statejdomDocument, statebwOutXml);
XPath had no problem looking up state.xml.
Thanks,
Jack
From: Michael Kay <mike@s...>
To: Jack Bush <netbeansfan@y...>; Robert Koberg <rob@k...>
Cc: xml-dev@l...
Sent: Monday, 5 January, 2009 2:13:33 AM
Subject: RE: JDOM XSLT TransformerConfigurationException
Nevertheless, I have now encountered another issue:
java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.
There's only one
explanation of that: the parser is expecting the document to be encoded in
UTF-8 but it isn't. To understand why it isn't, you need to examine how the
document was created and any transcodings that might have taken place before
it reached the parser.
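To illustrate the mismatch described above, here is a minimal Java sketch that is not taken from the original code. It assumes the JDK's built-in JAXP parser, and the class name EncodingMismatchDemo and its methods are hypothetical. It serializes a small document whose declaration claims UTF-8, once as genuine UTF-8 bytes and once as ISO-8859-1 bytes, and checks whether each parses. The non-breaking space (U+00A0) stands in for any non-ASCII character an HTML page might contain.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class EncodingMismatchDemo {

    // The declaration always claims UTF-8; U+00A0 is a non-ASCII character.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><p>a\u00a0b</p>";

    // Encode the text with the given charset, regardless of what the
    // XML declaration says. This mimics writing through a Writer whose
    // charset differs from the declared encoding.
    static byte[] write(Charset actual) {
        return XML.getBytes(actual);
    }

    // Return true if the JDK's default parser accepts the bytes.
    static boolean parses(byte[] bytes) {
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (Exception e) {
            // A malformed byte sequence surfaces here as a parse failure.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes parse:      "
            + parses(write(StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1 bytes parse: "
            + parses(write(StandardCharsets.ISO_8859_1)));
    }
}
```

If something like this happened with state.xml, one candidate is FileWriter, which on Java releases before 18 encodes with the platform default charset rather than the encoding named in the XML declaration. Writing through an OutputStreamWriter constructed with an explicit UTF-8 charset, or handing XMLOutputter an OutputStream instead of a Writer, would avoid that. This is an assumption about the setup, not something the stack trace proves.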
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:489)
at org.jdom.input.SAXBuilder.build(SAXBuilder.java:928)
at JDOMTrAXPojoInvestmentBean.main(JDOMTrAXPojoInvestmentBean.java:45)
The header of state.xml is as follows:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html (View Source for full doctype...)>
- <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
Any ideas on what is causing this issue and how to overcome it?
Likewise, how do I define the correct namespace prefix? Is it possible
that this document has two namespaces, a default one and one with the
prefix 'html'? If so, which one should I use?
It's certainly inelegant to bind the same
namespace to two prefixes like this, though it's not incorrect. Again to
prevent it happening we need to understand how you created the
document.
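As a small illustration of why the double binding is harmless, the following Java sketch (not from the original thread) uses the JDK's JAXP DOM and XPath APIs rather than JDOM. The class name NamespacePrefixDemo, the sample markup, and the query prefix "h" are all hypothetical. It shows that an unprefixed element and an html:-prefixed element land in the same namespace, and that the prefix used in an XPath query is chosen by the caller and need not match either prefix in the document.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespacePrefixDemo {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Same namespace bound twice: as the default and as prefix "html".
    static final String SAMPLE =
        "<html xmlns=\"" + XHTML + "\" xmlns:html=\"" + XHTML + "\">"
        + "<body><p>one</p><html:p>two</html:p></body></html>";

    // Parse with namespace awareness switched on (it is off by default).
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Count descendants by namespace URI and local name; prefixes play no part.
    static int countByNamespace(Element root, String localName) {
        return root.getElementsByTagNameNS(XHTML, localName).getLength();
    }

    // Evaluate a count(...) expression in which "h" is bound to XHTML.
    static int countByXPath(Element root, String countExpr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "h".equals(prefix) ? XHTML : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        try {
            Double n = (Double) xpath.evaluate(countExpr, root, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE);
        System.out.println(countByNamespace(root, "p"));
        System.out.println(countByXPath(root, "count(//h:p)"));
    }
}
```

Both counts cover the prefixed and the unprefixed p element, since what identifies an element is its namespace URI plus local name. In XPath 1.0 an unprefixed name in the expression means "no namespace", so some prefix has to be bound to the XHTML URI for the query, but which prefix is immaterial.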
Michael Kay