Re: practical question re: Java/XML handling
How do you handle entities in the XML?

2009/9/3 David A. Lee <dlee@calldei.com>:
> I solved this problem in a different way that is less destructive. This also
> works to replace a DTD with a different one or to force validation on a
> schema even if a nonexistent DTD is specified.
>
> This particular implementation requires using the SAXParser, but I believe
> the idea would work with other parsers that provide similar functionality,
> namely an override of "resolveEntity". The key trick is to resolve all
> DTDs with a "NullInputStream" (these are trivial to write, so I won't supply
> the code here).
> An empty DTD file validates any XML (at least it does in my tests).
>
> Here's the snippet:
>
>   private class ValidatorHandler extends DefaultHandler {
>       ..... // other methods as needed
>       @Override
>       public InputSource resolveEntity(String publicId, String systemId)
>               throws IOException, SAXException {
>
>           if (systemId.toLowerCase().endsWith(".dtd"))
>               return new InputSource(new NullInputStream());
>           else
>               return super.resolveEntity(publicId, systemId);
>       }
>   }
>
>   SAXParserFactory f = SAXParserFactory.newInstance();
>   .... setup the factory
>
>   SAXParser parser = f.newSAXParser();
>   ... setup the parser
>
>   parser.parse(xml, new ValidatorHandler());
>
> David A. Lee
> dlee@calldei.com
> http://www.calldei.com
> http://www.xmlsh.org
> 812-482-5224
>
> Mike Sokolov wrote:
>
> After all the discussion about "What is data?" I don't know if this list is
> the place to discuss actual details of implementation, but please feel free
> to send me elsewhere if you can think of a better venue.
>
> I have a need to handle XML that references a non-existent DTD. The DTD is
> irrelevant to the actual processing of the XML, and isn't available
> anywhere, but it is declared in the DOCTYPE. I'm sure many of you have
> encountered this situation: it's practically the norm, in my experience.
>
> After years of dealing with this inherently unsatisfactory situation in a
> variety of ways, I came up with a new one that I am liking at the moment,
> which is to insert a Stream into a Java XML processing stack that strips out
> the prolog of the XML document before handing it off to a parser. This has
> the nice property that it doesn't require modifications to the stored XML
> files. It loses PIs and comments and the XML decl, but I can live with
> that.
>
> My question is twofold:
>
> 1) Does the following code snippet actually do what it is claiming to? Does
> anybody see any obvious mistakes? My knowledge of the format of DOCTYPE
> decls and so on is somewhat limited. I read the spec and this seems to work
> on the examples I have, but I suspect there are some cases I'm not handling.
>
> 2) Is there a better approach? Existing code to do the same thing? Some
> way to tell parsers to ignore the DOCTYPE (even though that seems to run
> counter to the spec)?
>
> Thanks for your attention...
>
> -Mike Sokolov
>
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.PushbackInputStream;
>
> /**
>  * An InputStream for XML that strips off the prolog of an XML
>  * document. The idea is to avoid having to prevent parsers from attempting
>  * to process an external DTD.
>  *
>  * @author sokolov
>  */
> class XmlNoPrologInputStream extends PushbackInputStream {
>     XmlNoPrologInputStream (InputStream base) throws IOException {
>         super (base, 2);
>         int c;
>         // Note: a '<' inside a comment, PI or internal DTD subset that is
>         // followed by anything other than '?' or '!' is taken for the
>         // start of the document.
>         while ((c = read()) >= 0) {
>             if (c == '<') {
>                 int c1 = read();
>                 if (c1 < 0) {
>                     // ill-formed: input ended right after '<'. Just stop;
>                     // PushbackInputStream.reset() always throws IOException.
>                     return;
>                 }
>                 // XML declaration, PI, comment or DOCTYPE
>                 if (c1 == '?' || c1 == '!')
>                     continue;
>                 // must be the start of the document: arrange to begin
>                 // reading here
>                 unread(c1);
>                 unread(c);
>                 return;
>             }
>         }
>     }
> }
>
> _______________________________________________________________________
>
> XML-DEV is a publicly archived, unmoderated list hosted by OASIS
> to support XML implementation and development.
> To minimize spam in the archives, you must subscribe before posting.
>
> [Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
> Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
> subscribe: xml-dev-subscribe@lists.xml.org
> List archive: http://lists.xml.org/archives/xml-dev/
> List Guidelines: http://www.oasis-open.org/maillists/guidelines.php
>

--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
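
For readers who want to try the resolveEntity idea from the quoted message, here is a minimal, self-contained sketch. It is not David's code: the class names EmptyDtdDemo and EmptyDtdHandler and the helper parseText are hypothetical, and an empty StringReader is assumed to be an acceptable stand-in for the NullInputStream that the message mentions but does not show.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class; the names are illustrative, not from the thread. */
class EmptyDtdDemo {

    /** Resolves any system identifier ending in ".dtd" to an empty entity. */
    static class EmptyDtdHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws IOException, SAXException {
            if (systemId != null && systemId.toLowerCase().endsWith(".dtd")) {
                // Assumption: an empty character stream plays the role of
                // the NullInputStream described in the quoted message.
                return new InputSource(new StringReader(""));
            }
            return super.resolveEntity(publicId, systemId);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Collect character data so the caller can inspect the result.
            text.append(ch, start, length);
        }
    }

    /** Parses the given XML and returns its concatenated character data. */
    static String parseText(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        EmptyDtdHandler handler = new EmptyDtdHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // The DOCTYPE names a DTD that does not exist anywhere.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE doc SYSTEM \"missing.dtd\">\n"
                + "<doc>hello</doc>";
        System.out.println(parseText(xml));
    }
}
```

The question about entities at the top of this message is the caveat to keep in mind: any entity declarations that lived in the real DTD are gone once it is replaced by an empty one, so a reference such as &foo; has no declaration, and it is worth testing how the parser in use reports that. Some parsers also offer a switch for skipping the external DTD; Xerces documents the feature http://apache.org/xml/features/nonvalidating/load-external-dtd, which may be an alternative where it is supported.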