
SAX-ext proposal #3: entity encoding, version

  • From: David Brownell <david-b@p...>
  • To: xml-dev@l..., sax-devel@l...
  • Date: Wed, 01 Aug 2001 18:00:32 -0700

entity encoding
Locator infoset extensions

- Two of the infoset properties for documents are not supported by
  the current SAX2 API (including extensions):  the character encoding
  used, and the XML version used.

- These are actually characteristics of all parsed entities, not just
  the document entity, just like the [base URI] currently exposed through
  the Locator interface.

- There may be up to three kinds of encoding name to be concerned with:

    * What's declared inline, using an XML or text declaration, or
      defaulted (UTF-8, UTF-16)
    * Sometimes an external declaration, through the MIME type, which
      is authoritative but which may not agree with the inline declaration
    * For Java, the name of the encoding actually used by a Reader
      will often not match the "winning" declaration name.  (For
      example, Java's "UTF8" really means "UTF-8".)

  The actual encoding used affects the kind of Unicode normalizations
  that need to be done.  The actual encoding is what the infoset needs
  (yes?), and it'd be reported under the declared name (the external
  one if present, else the internal one), not under a Java name.
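To make the naming mismatch concrete, here is a minimal sketch of how the two kinds of names can be compared in Java. The class and method names (EncodingNames, canonicalName, readerName) are made up for illustration, and java.nio.charset postdates this message, so treat this as one possible way to do the mapping rather than what a parser would necessarily do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class EncodingNames {
    /** Maps any label the JVM recognizes (alias or not) to the charset's canonical name. */
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    /** Returns the name a Reader reports for itself, which may be a historical Java name. */
    static String readerName(String label) {
        try (InputStreamReader reader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0]), Charset.forName(label))) {
            return reader.getEncoding();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The Java alias "UTF8" resolves to the canonical name "UTF-8".
        System.out.println(canonicalName("UTF8"));
        // The Reader may report a different, Java-specific spelling for the same encoding.
        System.out.println(readerName("UTF-8"));
    }
}
```

The point is only that a Reader's self-reported name is not a safe thing to hand back as the declared encoding label; some mapping step (or simply remembering the declared label) is needed.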

PROPOSAL

    - Define a new org.xml.sax.ext interface:

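        // Would live in org.xml.sax.ext
        public interface Locator2 extends Locator
        {
            // XML version of the current entity, or null if not known
            public String getXMLVersion ();
            // Active encoding declaration of the current entity, or null if not known
            public String getEncoding ();
        }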
    
      Strings returned would be the relevant values, or null if
      the values are not known.  The encoding string would
      reflect the active declaration.

      That would be implemented by Locator objects provided in
      setDocumentLocator() callbacks, to expose this information.

    - Define a new org.xml.sax.ext class implementing that
      interface, inheriting from org.xml.sax.helpers.LocatorImpl

    - Define a new standard feature ID:

        http://xml.org/sax/features/use-locator2
            Read-only

            If true, the Locator object passed in setDocumentLocator
            events will also implement the Locator2 interface,
            and can be cast to it.
    
      Note that because of the way Java typing works, testing that
      feature would be optional:  applications could always try to
      cast (if they were willing to take the performance hit).
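As a rough sketch of how an application might consume this, the handler below checks the feature flag and, independently, tries the cast. It assumes an interface with exactly the two proposed methods; the org.xml.sax.ext.Locator2 that ships with current JDKs has that shape and is used here as a stand-in. LocatorProbe and its fields are hypothetical names. The values are read at the first startElement rather than in setDocumentLocator, on the assumption that the declaration has been processed by then.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, for illustration only.
class LocatorProbe extends DefaultHandler {
    static final String USE_LOCATOR2 = "http://xml.org/sax/features/use-locator2";

    boolean featureAdvertised;  // what the parser reported for use-locator2
    boolean castWorked;         // whether the locator really implemented Locator2
    String xmlVersion;          // null if not known
    String encoding;            // null if not known

    private Locator locator;
    private boolean captured;

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;  // keep it; query it during later callbacks
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (captured) return;    // only look once, at the root element
        captured = true;
        if (locator instanceof Locator2) {
            Locator2 l2 = (Locator2) locator;
            castWorked = true;
            xmlVersion = l2.getXMLVersion();
            encoding = l2.getEncoding();
        }
    }

    /** Reads the feature flag, treating "unrecognized" or "unsupported" as false. */
    static boolean advertisesLocator2(XMLReader reader) {
        try {
            return reader.getFeature(USE_LOCATOR2);
        } catch (SAXException e) {
            return false;
        }
    }

    /** Parses the given XML text and returns the handler holding what was observed. */
    static LocatorProbe probe(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            LocatorProbe handler = new LocatorProbe();
            handler.featureAdvertised = advertisesLocator2(reader);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LocatorProbe p = probe("<?xml version=\"1.0\"?><doc/>");
        System.out.println("feature=" + p.featureAdvertised + " cast=" + p.castWorked
                + " version=" + p.xmlVersion + " encoding=" + p.encoding);
    }
}
```

What the parser reports for a character stream (as opposed to a byte stream) is up to the implementation, so the encoding may well come back null here.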
    
QUESTIONS:

    - Is it necessary to expose both types of declared encodings?

      If so, proposal:  a new String getEncodingDecl () returns the
      internal label; getEncoding () would return the (authoritative)
      external label.  The internal label might be null if it was
      defaulted.  (Tracking this info costs, and it's not clear any
      apps should actually care, which is why it's omitted.)
    
    - Is there a better convention to use for extending interfaces
      than the numeric suffix?  (Meta-1)
    
    - Is the new implementation class really needed?  Alternative:
      update LocatorImpl.  (Meta-2)
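For Meta-2, the "separate class" option might look roughly like the sketch below: a subclass of org.xml.sax.helpers.LocatorImpl that adds the two properties. SketchLocator2 and its setters are hypothetical names, and the JDK's org.xml.sax.ext.Locator2 is used as a stand-in for the proposed interface since it has the same two methods.

```java
import org.xml.sax.Locator;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.LocatorImpl;

// Hypothetical implementation class, for illustration only.
class SketchLocator2 extends LocatorImpl implements Locator2 {
    private String xmlVersion;  // null if not known
    private String encoding;    // null if not known

    SketchLocator2() {
    }

    /** Snapshots another locator, copying the extra properties when it has them. */
    SketchLocator2(Locator source) {
        super(source);
        if (source instanceof Locator2) {
            Locator2 l2 = (Locator2) source;
            xmlVersion = l2.getXMLVersion();
            encoding = l2.getEncoding();
        }
    }

    @Override
    public String getXMLVersion() {
        return xmlVersion;
    }

    @Override
    public String getEncoding() {
        return encoding;
    }

    void setXMLVersion(String xmlVersion) {
        this.xmlVersion = xmlVersion;
    }

    void setEncoding(String encoding) {
        this.encoding = encoding;
    }
}
```

The alternative (updating LocatorImpl itself) would move the two fields and accessors into the existing class, so that every LocatorImpl instance carries them whether or not the parser fills them in.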




