Re: SAX: New Idea for Entity Resolution
James Clark writes:

 > You could just have a class that encapsulates a structure with three
 > members:
 >
 > - a CharacterStream
 > - a ByteStream
 > - a String
 >
 > At least one of the CharacterStream and ByteStream must be non-null.  If
 > the ByteStream is non-null the String can specify the encoding.

[Read on to the bottom for a large-ish design change.]

This implies, then, the following three interfaces:

  public interface ByteStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (byte b[], int start, int count)
      throws SAXException;
  }

  public interface CharacterStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (char ch[], int start, int count)
      throws SAXException;
  }

  public class InputSource {
    // For each variable, imagine a get/set pair instead...
    public ByteStream byteStream;
    public CharacterStream characterStream;
    public String encoding;
  }

The nice thing here is that all of these can live on separate systems
in a distributed environment: the InputSource can be a C-program on a
VAX, the CharacterStream can come from a Python program running under
alpha Linux, and the parser can be running in Java on a Windows box.
There is no dependency on language- or system-specific features
(except for java.lang.String, which should be able to map predictably
to other languages).

Now, why not take this a step further?

  public class InputSource {
    // For each variable, imagine a get/set pair instead...
    public String publicId;
    public String systemId;
    public ByteStream byteStream;
    public CharacterStream characterStream;
    public String encoding;
  }

We'd have to define rules of precedence:

1) if there is a character stream, use it;

2) if there is no character stream but there is a byte stream, use the
   byte stream;

3) if there is neither a character stream nor a byte stream but there
   is a system identifier, open a connection to the system identifier;

4) if there is no character stream, byte stream, or system identifier,
   throw an exception (or invoke the ErrorHandler).

Now, we can get away with only one parse() method in
org.xml.sax.Parser:

  public abstract void parse (InputSource source)
    throws Exception;

It might still be useful to keep two separate methods in
EntityResolver, though:

  public interface EntityResolver {
    public String resolveSystemId (String publicId, String systemId)
      throws SAXException;
    public InputSource openEntity (String systemId)
      throws Exception;
  }

Comments?


All the best,


David

-- 
David Megginson                 ak117@f...
Microstar Software Ltd.         dmeggins@m...
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)
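
The four precedence rules proposed in this message could be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the SAX implementation: java.io.Reader and java.io.InputStream stand in for the proposed CharacterStream and ByteStream interfaces, SourceOpener and its open() method are hypothetical names, the fallback to UTF-8 when no encoding is given is an assumption of the sketch, and rule 4 is handled by throwing an IOException rather than invoking an ErrorHandler.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;

// Stand-in for the InputSource proposed in the message. Reader and
// InputStream replace CharacterStream and ByteStream (an assumption).
class InputSource {
    String publicId;
    String systemId;
    InputStream byteStream;
    Reader characterStream;
    String encoding;
}

// Hypothetical helper showing how a parser might apply the precedence rules.
class SourceOpener {

    static Reader open(InputSource source) throws IOException {
        // Rule 1: a character stream always wins.
        if (source.characterStream != null) {
            return source.characterStream;
        }
        // Rule 2: otherwise decode the byte stream.
        if (source.byteStream != null) {
            return decode(source.byteStream, source.encoding);
        }
        // Rule 3: otherwise open a connection to the system identifier.
        if (source.systemId != null) {
            InputStream in = URI.create(source.systemId).toURL().openStream();
            return decode(in, source.encoding);
        }
        // Rule 4: nothing to read from, so report an error.
        throw new IOException(
            "InputSource has no character stream, byte stream, or system identifier");
    }

    // Wraps bytes in a Reader. Falling back to UTF-8 is an assumption of
    // this sketch; a real XML parser would detect the encoding itself.
    private static Reader decode(InputStream in, String encoding) throws IOException {
        String name = (encoding != null) ? encoding : "UTF-8";
        return new InputStreamReader(in, name);
    }
}
```

With this arrangement a caller that sets both a character stream and a byte stream gets the character stream back unchanged, which matches rule 1. The encoding field is only consulted when the bytes have to be decoded.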