Re: SAX and Locator
>>I am new to XML so if this is a dumb question, please don't be too hard on me.<<

Of course not!

>>I need to know exactly where in the original document certain SAX events originate, so I am using the Locator mechanism.<<

SAX has never been very concerned with lexical information. The locator is intended to give accurate information, but the implementations I have tried are fairly non-conformant (if there is "conformance" at that level). This is something some people would like to see improved (myself included), but it is not a high priority. That said, the spec itself should provide enough information.

>>Would it be possible to have Locator return the starting character offset and the ending character offset of the sequence of characters that generated the SAX event? For example, after a tag is processed, within the corresponding startElement(...) callback a call to Locator.getStartCharacter() would return the location of the "<" at the start of the tag and a call to Locator.getEndCharacter() would return the location of the ">" at the end. The location would simply be the index of the characters within the original stream.<<

You could get the start character for the whole section of character data (presumably), as that is where startElement is called, and that is more reliable:

    <foo>Some data</foo>
         ^

Again, it is because SAX is trying to omit ignorable/unimportant lexical information like the amount of space between attributes:

    <foo att1="1" att2="2">Some data</foo>

The space between att1 and att2 is unimportant (as is the start element itself). With respect to locator info, the SAX spec takes the view (from my experience) that document structures are mostly syntactic and ignorable; the text is the renderable portion (in, say, HTML).

I don't know if any of this helps. What also won't help (at least not right away) is to add an RFE on the SAX Project page at SourceForge.
I am not sure how soon this type of thing will be addressed, though; historically SAX has avoided any type of lexical regulation. Your best bet would be to (bite my tongue) modify a parser if you need the info. Of course, interop goes out the window then.

All the best,
Jeff Rafter
Defined Systems
http://www.defined.net
XML Development and Developer Web Hosting
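
For readers who want to try the Locator mechanism discussed in this reply, here is a minimal sketch using Python's standard xml.sax binding (chosen only for illustration; the thread names no particular parser). The handler class `LocatingHandler`, the helpers `locate_start_tags` and `to_offset`, and the sample document are all hypothetical names. SAX exposes only a line and a column, so a character offset has to be derived by hand. The sketch assumes 1-based lines and 0-based columns, and whether the reported position is the start or the end of the tag depends on the parser, so it is worth checking against your own parser before relying on it.

```python
import xml.sax


class LocatingHandler(xml.sax.ContentHandler):
    """Records the position the parser reports for each start tag."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.events = []

    def setDocumentLocator(self, locator):
        # The parser hands over its locator before any other event.
        self.locator = locator

    def startElement(self, name, attrs):
        # Only line and column are available; SAX has no character-offset call.
        self.events.append(
            (name, self.locator.getLineNumber(), self.locator.getColumnNumber())
        )


def locate_start_tags(xml_text):
    """Parse xml_text and return a list of (name, line, column) tuples."""
    handler = LocatingHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


def to_offset(text, line, column):
    """Convert a (line, column) pair to a character index into text.

    Assumes 1-based lines, 0-based columns, and "\n" line endings.
    """
    lines = text.split("\n")
    # Each earlier line contributes its length plus one newline character.
    return sum(len(earlier) + 1 for earlier in lines[: line - 1]) + column


SAMPLE = '<root>\n  <foo att1="1" att2="2">Some data</foo>\n</root>'

if __name__ == "__main__":
    for name, line, column in locate_start_tags(SAMPLE):
        offset = to_offset(SAMPLE, line, column)
        print(f"{name}: line {line}, column {column}, offset {offset}")
```

Running the script prints one line per start tag. Comparing the derived offset with the position of the "<" or ">" in the source text shows which convention a given parser follows, which is the kind of implementation difference described above.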