RE: SAX - not well formed data
Incidentally, you could also achieve the same effect with a one-line query using the Saxon-SA streaming capabilities.

java com.saxonica.Query -qs:"saxon:stream(doc('in.xml')/xml/page)[1]"

should do the job. It will automatically stop reading the input when it has found the data it needs.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Johannes Lichtenberger [mailto:Johannes.Lichtenberger@u...]
> Sent: 03 February 2009 15:49
> To: Michael Kay
> Cc: 'xml-dev'
> Subject: RE: SAX - not well formed data
>
> On Tuesday, 03.02.2009, at 14:39 +0000, Michael Kay wrote:
> > > I have a document like this:
> > >
> > > <xml>
> > >   <page>
> > >     <rev>...</rev>
> > >     <rev>...</rev>
> > >   </page>
> > >   ... (some hundreds of pages)
> > >   <page>
> > >     <rev>...
> > >
> > > so it's not well formed.
> >
> > It's not clear from that description why it isn't well-formed.
>
> Well, I'm downloading and extracting a file with
> `curl http://... | bzcat > test.xml`, but because it's very big,
> and I may not have the time to analyse the whole data, I'm
> extracting pages from the beginning, so I press CTRL+C sometime
> afterwards. Maybe I could extract pages on-the-fly, with something
> like `curl http://... | bzcat | java -jar ExtractArticles`, but I'm
> not really familiar with pipes and so on :( Probably I would need
> XMLStreamReader instead of the reader and buffer input or something
> like that, but I tried it and failed...
>
> > > I only want to be able to write out the first pages, but the SAX
> > > Parser throws errors:
> >
> > You should be able to abort the parse when you have read what you
> > want, by throwing an exception from any of the callback methods
> > (e.g. endElement()). The parser will then exit back to your
> > application with an exception, which you can catch. You should
> > check that this exception is the one you were expecting, not some
> > other unrelated error in your input.
>
> Ok, that's possibly the best thing.
>
> Thank you!
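
For readers of the archive, the advice quoted above (abort the parse by throwing an exception from a callback such as endElement(), then check that the exception you catch is the one you threw) might look roughly like the sketch below. The names FirstPages, StopParsing, PageHandler and firstPages are hypothetical and not from the thread, and collecting each page's text into a list is only an assumption about what "writing out the first pages" means.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FirstPages {

    /** Hypothetical marker exception: a deliberate stop, not a parse error. */
    static final class StopParsing extends SAXException {
        StopParsing() { super("enough pages read"); }
    }

    /** Collects the text of the first `limit` page elements, then aborts. */
    static final class PageHandler extends DefaultHandler {
        final List<String> pages = new ArrayList<>();
        private final int limit;
        private StringBuilder text; // non-null only while inside a <page>

        PageHandler(int limit) { this.limit = limit; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            // The default factory is not namespace-aware, so match on qName.
            if ("page".equals(qName)) text = new StringBuilder();
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if ("page".equals(qName)) {
                pages.add(text.toString().trim());
                text = null;
                // Abort as soon as enough complete pages have been seen.
                if (pages.size() >= limit) throw new StopParsing();
            }
        }
    }

    static List<String> firstPages(InputSource in, int limit) throws Exception {
        PageHandler handler = new PageHandler(limit);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (StopParsing expected) {
            // Our own abort. Any other SAXException (for example a
            // well-formedness error) is not caught here and reaches the caller.
        }
        return handler.pages;
    }

    public static void main(String[] args) throws Exception {
        // Truncated input, similar in shape to the document described in the thread.
        String truncated = "<xml><page><rev>a</rev><rev>b</rev></page><page><rev>c";
        System.out.println(firstPages(new InputSource(new StringReader(truncated)), 1));
    }
}
```

Because only StopParsing is caught, a truncated or otherwise broken document that ends before the requested number of pages still surfaces as an ordinary parse error. For the piped variant mentioned in the thread (curl ... | bzcat | java ...), passing new InputSource(System.in) to firstPages should work the same way, though that was not tried here.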