
Re: Make XML::XPath bypass DTD?


Tim Bray wrote:
> I'm using XML::XPath in the obvious way, along the lines of
> 
> my $xp = XML::XPath->new(filename => $ARGV[0]);
> 
> and the silly file has <!DOCTYPE foo SYSTEM "./foo.dtd">
> 
> I don't want it to validate and in fact I don't want it to read 
> foo.dtd.  I can achieve this by doing "cp /dev/null foo.dtd" but is 
> there a way to tell whatever machinery underlies XML::XPath to just 
> ignore the <!DOCTYPE>? -Tim

I don't recall the details as I haven't used XML::Parser in ages, but 
XML::Parser can be told not to read DTDs. You can then pass your 
properly configured XML::Parser instance to XML::XPath's constructor 
using the 'parser' option. There is also a SAX handler in XML::XPath, 
so you could instead use any SAX parser and its options.
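
A minimal sketch of the first suggestion, under a few assumptions: that XML::Parser's ParseParamEnt option is the one that controls whether the external DTD subset is read, and that XML::XPath uses the parser it is given as-is. The sample document, its element names, and the temporary file are made up for illustration; only the 'parser' option comes from the description above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Parser;
use XML::XPath;

# Hypothetical sample: the DOCTYPE points at a DTD that does not exist.
my ($fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} <<'EOT';
<?xml version="1.0"?>
<!DOCTYPE foo SYSTEM "./foo.dtd">
<foo><bar>hello</bar></foo>
EOT
close $fh;

# Assumption: with ParseParamEnt false, expat never asks for the external DTD.
my $parser = XML::Parser->new(ErrorContext => 2, ParseParamEnt => 0);

# Hand the preconfigured parser to XML::XPath via the 'parser' option.
my $xp = XML::XPath->new(filename => $file, parser => $parser);

# Evaluate an XPath expression against the document.
my $text = $xp->findvalue('/foo/bar');
print "$text\n";
```

If this behaves as expected, the query should succeed even though foo.dtd is absent, which removes the need for the "cp /dev/null foo.dtd" workaround.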

The same question has been asked of XML::LibXML, which tends to 
stubbornly insist on reading the DTD no matter how strongly you tell it 
not to. The following makes it skip the resolution of all external 
entities whose URIs end in ".dtd" (which of course can be improved upon):

use strict;
use XML::LibXML;
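my $parser = XML::LibXML->new;
# Turn off validation and loading of the external DTD.
$parser->validation(0);
$parser->load_ext_dtd(0);

# Intercept DTD lookups with input callbacks (match, open, read, close).
$parser->callbacks(\&matchIRI, \&openIRI, \&readIRI, \&closeIRI);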

my $doc = $parser->parse_string(<<'EOT');
<?xml version='1.0' encoding='iso-8859-1' standalone='yes'?>
<!DOCTYPE library SYSTEM 'acme.dtd'>
<acme>
   <foo id='baz'/>
   <foo id='baz'/>
</acme>
EOT

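# Claim every URI that ends in ".dtd" so the other callbacks handle it.
sub matchIRI { my ($uri) = @_; return $uri =~ /\.dtd$/; }
# Hand back a dummy handle; nothing is ever read from it.
sub openIRI { return \*GLOB; }
# An empty string signals end of input, so the DTD appears empty.
sub readIRI { return ""; }
# Nothing to clean up.
sub closeIRI { return; }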

-- 
Robin Berjon
   Research Scientist
   Expway, http://expway.com/
