
Re: External subset processing by browsers

  • From: George Cristian Bina <george@o...>
  • To: Andrew Welch <andrew.j.welch@g...>
  • Date: Mon, 08 Dec 2008 13:21:00 +0200

Hi Andrew,

Try setting http://xml.org/sax/features/external-general-entities to 
false. See also:
http://xerces.apache.org/xerces2-j/features.html#external-general-entities
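
A minimal sketch of how that feature could be set, assuming the SAX parser bundled with the JDK rather than the Xerces 2.9.0 distribution used in the quoted message. The class and method names (FeatureSketch, newReader, tryParse) are hypothetical. The thread does not confirm whether this setting changes the outcome for input with no DOCTYPE, so the sketch only reports what the parser does.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class FeatureSketch {
    static final String EXTERNAL_GENERAL_ENTITIES =
            "http://xml.org/sax/features/external-general-entities";

    /** Creates a SAX reader with external general entities switched off. */
    static XMLReader newReader() {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the string and returns "ok" or the parser's error message. */
    static String tryParse(String xml) {
        XMLReader reader = newReader();
        DefaultHandler handler = new DefaultHandler(); // rethrows fatal errors
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return "ok";
        } catch (SAXParseException e) {
            return "error: " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The sample from the quoted message; the result is printed, not assumed.
        System.out.println(tryParse("<foo>foo &euro; bar</foo>"));
    }
}
```

Running main against the sample string shows directly whether the reference to euro is still reported as an error with the feature disabled.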

Best Regards,
George
-- 
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

Andrew Welch wrote:
> Hi Elliotte,
> 
> 2008/12/5 Elliotte Rusty Harold <elharo@m...>:
>> Firefox. There are two separate issues here:
>>
>> 1. Whether Firefox should read the external DTD subset.
>> 2. How it should treat unrecognized entities when it doesn't read the
>> external subset.
>>
>> Let me check the spec, but my recollection is that if the external DTD
>> subset is not read, unrecognized entities are not a fatal error.
> 
> I have a similar issue, for example there are some RSS feeds which
> contain entity references but no doctype:
> 
> <foo>foo &euro; bar</foo>
> 
> I was trying to handle them by supplying a LexicalHandler (to trap
> and convert them to numeric refs), and setting a few Xerces features,
> but it always throws an exception for it before the startEntity event.
> 
> Sample code (using Xerces 2.9.0):
> 
> import java.io.StringReader;
> 
> import org.xml.sax.InputSource;
> import org.xml.sax.SAXException;
> import org.xml.sax.XMLReader;
> import org.xml.sax.ext.LexicalHandler;
> import org.xml.sax.helpers.XMLFilterImpl;
> import org.xml.sax.helpers.XMLReaderFactory;
> 
> public class Test extends XMLFilterImpl implements LexicalHandler {
> 
>     public static void main(String... args) throws Exception {
>         new Test();
>     }
> 
>     public Test() throws Exception {
> 
>         // Entity reference with no DOCTYPE, as found in some RSS feeds
>         String xml = "<foo>foo &euro; bar</foo>";
> 
>         XMLReader xmlReader = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
>         // Register this object to receive startEntity/endEntity events
>         xmlReader.setProperty("http://xml.org/sax/properties/lexical-handler", this);
>         xmlReader.setFeature("http://apache.org/xml/features/scanner/notify-char-refs", true);
>         xmlReader.setFeature("http://apache.org/xml/features/validation/unparsed-entity-checking", false);
>         xmlReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
>         xmlReader.setEntityResolver(this);
>         xmlReader.parse(new InputSource(new StringReader(xml)));
>     }
> 
>     @Override
>     public void startDocument() throws SAXException {
>         super.startDocument();
>     }
> 
>     // LexicalHandler callbacks; only startEntity does any work
>     public void startEntity(String name) throws SAXException {
>         System.out.println("Start ent: " + name);
>     }
> 
>     public void endEntity(String name) throws SAXException { }
>     public void startCDATA() throws SAXException { }
>     public void endCDATA() throws SAXException { }
>     public void startDTD(String name, String publicId, String systemId) throws SAXException { }
>     public void endDTD() throws SAXException { }
>     public void comment(char[] ch, int start, int length) throws SAXException { }
> }
> 
> The output when running this is:
> 
> [Fatal Error] :1:16: The entity "euro" was referenced, but not declared.
> Exception in thread "main" org.xml.sax.SAXParseException: The entity
> "euro" was referenced, but not declared.
>         at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>         at Test.<init>(Test.java:37)
> 
> 
> It would be really nice to handle this non-well-formed input using XML
> tools without resorting to a regex replace across every feed... I'm
> not sure it's possible but the features make it seem like it should be
> - any ideas?
> 
> 
> thanks
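
For comparison, here is a sketch of a different approach from the one tried above, not something proposed in the thread. If the input is given a DOCTYPE whose internal subset declares the entity, the reference is no longer undeclared, and a LexicalHandler can observe it. The sketch assumes the JDK's bundled SAX parser and input with no XML declaration or DOCTYPE of its own. The class name DeclaredEntitySketch, the method entityNames, and the single hard-coded euro declaration are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class DeclaredEntitySketch {

    /** Prepends a DOCTYPE declaring "euro" and collects startEntity names. */
    static List<String> entityNames(String body) {
        // Assumption: body starts at the root element (no XML declaration).
        String xml = "<!DOCTYPE foo [<!ENTITY euro \"&#8364;\">]>" + body;
        List<String> names = new ArrayList<>();

        // DefaultHandler2 implements both ContentHandler and LexicalHandler.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void startEntity(String name) {
                names.add(name);
            }
        };

        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty(
                    "http://xml.org/sax/properties/lexical-handler", handler);
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(entityNames("<foo>foo &euro; bar</foo>"));
    }
}
```

This still rewrites the input before parsing, so it does not avoid preprocessing. It only replaces a regex substitution with a fixed prefix.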

