
RE: Accurate line and column numbers for elements and attribut

  • From: "Klotz, Leigh" <Leigh.Klotz@x...>
  • To: "John Snelson" <john.snelson@O...>
  • Date: Fri, 6 Jun 2008 11:14:18 -0700

Thank you for your quick and detailed answer, John.  I think you're
saying that, because of the attribute value normalization rules, it's
inconvenient for parser authors to report attribute line numbers, and
that attribute column numbers are ill-defined, because whitespace
normalization makes it difficult to map a position in the value back to
a column in the source.

For the record, with Saxon 9 I'm able to use the AugmentedSource and
obtain line numbers for attributes retrieved from XPath expressions,
just not column numbers, as they're not in the NodeInfo API.  (NodeInfo
gives warnings about how vague line numbers can be, especially for
SAX-constructed documents.)

            // Wrap the input so Saxon records line numbers while building the tree.
            AugmentedSource source = AugmentedSource.makeAugmentedSource(
                    new StreamSource(new File(filename)));
            source.setLineNumbering(true);
            Configuration config = new Configuration();
            config.setLineNumbering(true);
            StaticQueryContext sqc = new StaticQueryContext(config);
            DocumentInfo doc = sqc.buildDocument(source);
            XPathEvaluator xpath = new XPathEvaluator(doc);
            // Select the "line" attribute of every child element of /data.
            List nodeList = (List) xpath.evaluate("/data/*/@line", doc,
                    XPathConstants.NODESET);
            for (Object item : nodeList) {
                if (item instanceof NodeInfo) {
                    // Print the attribute name and the line it was read from.
                    NodeInfo nodeInfo = (NodeInfo) item;
                    System.out.println(nodeInfo.getDisplayName() + " "
                            + nodeInfo.getLineNumber());
                } else {
                    System.out.println(item);
                }
            }

Another reason it may not be implemented is that handling validation
errors through SAX is 80% of the solution: the parser's own error
reports (SAXParseException) already carry the correct line and column
numbers.
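
To illustrate that last point, here is a minimal sketch (not part of the
original exchange) that uses only the SAX parser bundled with the JDK.
The class name SaxErrorPositions and the helper collectErrors are made
up for this example; the position comes from the standard
SAXParseException.getLineNumber() and getColumnNumber() accessors.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the positions the parser attaches to its own errors.
final class SaxErrorPositions {

    /** Parses xml and returns one "line:column: message" string per reported error. */
    static List<String> collectErrors(String xml) throws Exception {
        final List<String> errors = new ArrayList<String>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(describe(e));
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                errors.add(describe(e));
                throw e; // a fatal error ends the parse
            }
        };

        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Already recorded by the handler above.
        }
        return errors;
    }

    static String describe(SAXParseException e) {
        return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // The mismatched end tag is on the third line of the input.
        for (String error : collectErrors("<a>\n  <b>\n</a>")) {
            System.out.println(error);
        }
    }
}
```

This only covers errors the parser itself detects; it does not help with
errors an application finds later inside an attribute value.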

Thank you,
Leigh.
 

-----Original Message-----
From: John Snelson [mailto:john.snelson@O...] 
Sent: Friday, June 06, 2008 10:44 AM
To: Klotz, Leigh
Cc: xml-dev@l...
Subject: Re:  Accurate line and column numbers for elements and
attributes in Java

Hi Leigh,

There's a reason that most XML parsers don't report accurate line and
column numbers - the XML specification makes it difficult in a number of
ways. For instance, the namespaces spec basically requires that all
attributes of an element be parsed before the element can be reported,
which often means that line and column information for the attributes is
lost. Similarly, the XML spec applies attribute value normalization to
each attribute value, meaning that the original position of characters
in the source document is lost.
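
A minimal sketch of the normalization effect, assuming the SAX parser
bundled with the JDK (the class name AttrNormalizationDemo and the
helper firstAttributeValue are made up for this example). A literal
line break inside an attribute value reaches the application as a
single space, so an offset into the reported value can no longer be
mapped back to a line and column in the source.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: shows what an application sees after normalization.
final class AttrNormalizationDemo {

    /** Returns the value of attrName on the first element that carries it, or null. */
    static String firstAttributeValue(String xml, final String attrName) throws Exception {
        final String[] found = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if (found[0] == null) {
                    // The factory default is not namespace-aware, so look up by qName.
                    found[0] = attributes.getValue(attrName);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        // The source value spans two lines: "a", a line break, three spaces, "b".
        String xml = "<t expr=\"a\n   b\"/>";
        String value = firstAttributeValue(xml, "expr");
        // The line break has been replaced by a space before the application sees it.
        System.out.println("[" + value + "] contains newline: " + value.contains("\n"));
    }
}
```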

What a lot of XML tools end up doing is reporting a line number and then
doing something like quoting the attribute value marked with the
position of the error. This is quite unsatisfactory when you're used to
accurate line and column numbers from other tools - especially for
programming languages.
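
A rough sketch of that workaround, with a made-up class name
(AttributeErrorMarker) and a made-up message layout; it does not
reproduce the output of any particular tool. It takes the line number
the parser did report, quotes the normalized attribute value, and puts
a caret under the offending offset within that value.

```java
// Hypothetical formatter: line number plus quoted attribute value with a caret marker.
final class AttributeErrorMarker {

    private static final String INDENT = "    ";

    /**
     * Builds a three-line message: a header with the line number, the quoted
     * attribute value, and a caret under the character at the given offset.
     */
    static String mark(int line, String attrName, String value, int offset, String message) {
        // Clamp the offset so a bad caller cannot push the caret out of range.
        int clamped = Math.max(0, Math.min(offset, value.length()));
        StringBuilder sb = new StringBuilder();
        sb.append("line ").append(line)
          .append(": attribute '").append(attrName).append("': ")
          .append(message).append('\n');
        sb.append(INDENT).append(value).append('\n');
        sb.append(INDENT);
        for (int i = 0; i < clamped; i++) {
            sb.append(' ');
        }
        sb.append('^');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mark(12, "select", "$foo + * 1", 7, "unexpected token"));
    }
}
```

The offset here is relative to the normalized value, not to the source
file, which is exactly the limitation described above.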

In writing my own parser FAXPP [1] (in C, so not great for you), one of
my aims was to be able to use it to report accurate line and column
numbers when parsing XSLT. In order to do that it has an option to turn
off attribute value normalization - but of course the application then
has to perform the same operation later if the result is to be
described as a conformant XML parser.

John

[1] http://faxpp.sourceforge.net/

Klotz, Leigh wrote:
> I'm developing an XML representation for a templating system which can
> report errors at both parse and at execution time.
> 
> The existing system does a good job of reporting input file line
> numbers on its element- and attribute-like constructs, and I'd like to
> provide that same functionality with the XML representation.
> 
> Most SAX2 implementations (including the default one with recent Sun
> Java systems) offer start and end line numbers of tags, but provide
> only the ending column number of element content.
> 
> While I'm able to make use of this information, the most critical need
> for column numbers comes in attribute values, so that errors in
> attribute-value template expressions can be reported with column
> numbers accurate in the original input stream.  SAX2 is no help here,
> as the attributes spring forth fully formed as if from Zeus's forehead.
> 
> I've taken a look at Piccolo, which has a commented-out set of column
> number info (on elements only), and at Michael Kay's suggestion of
> using the Saxon AugmentedSource (
> http://www.nabble.com/-xml-dev--Line-number-of-a-node-to7990605.html#a7990605
> ) but again this provides only line numbers, not column numbers, and
> not for attributes.
> 
> Does anyone have experience or recommendations for a non-validating,
> namespace-aware XML parser in Java which supports or can easily be
> made to support accurate beginning line and column numbers of both
> elements and attributes (and maybe text)?  The solution needs to be
> released under a BSD license.
> 
> Thank you,
> Leigh.

-- 
John Snelson, Oracle Corporation            http://snelson.org.uk/john
Berkeley DB XML:            http://oracle.com/database/berkeley-db/xml
XQilla:                                  http://xqilla.sourceforge.net

