Re: There can be only one! (was are we losing our grammar?)
From: Charles Reitzel <creitzel@m...>

>In your other posting, you infer that the schema designer can control the
>use of DOM vs. streaming. Am I right in thinking Schematron implements its own
>internal model of the input and by judicious use of XPath you can control
>how big or small that model becomes?

Not in the XSLT-based implementations of Schematron (nor the Perl and Python ones, AFAIK)! I think people on the Schematron mailing list are so far quite happy exploring what you can do when you no longer worship streams.

But in a Java or C++ implementation, for example, there could certainly be rules for deciding when nodes in a DOM are no longer required for schema validation, and whether there are nodes that never need to be constructed. For example, if the schema were simply

  <rule context="/">
    <assert test="absynthe">The top-level element must be absynthe</assert>
  </rule>

I am sure rules could be made to handle this kind of case efficiently. I want to download Sun's XSLT compiler to see what they do for optimised implementations (except their dumb pages kept sending me the same page again and again). It is an interesting area.

IBM's lazy DOM approach, where branches are only fully parsed and constructed when they are accessed, would be a useful option there too. But, again, if one does not start from the assumption that the data has not already been loaded, it is not always important.

XML Schemas has been designed to allow streaming implementations, the key/uniqueness constraints being the sticking point. The implementation of these would probably use a big hash table of all candidates rather than maintaining the DOM, though. That would help access, but would not be much help for pruning.

There may well be some stripped-down profile of XPath defined in the next few months which would make inference-based pruning easier to do: for example, by only including axes with down-branch and up-ancestor scope.
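To make the streaming ideas above concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree.iterparse`. It is not how any Schematron or XML Schemas implementation actually works. The function names `root_element_is` and `duplicate_keys`, and the `item`/`id` names in the demo, are hypothetical and chosen only for illustration. The first function checks the top-level element from the first parse event alone; the second keeps a hash table (a Python `set`) of key candidates and discards each element's content once it has been seen.

```python
import io
import xml.etree.ElementTree as ET


def root_element_is(source, expected):
    """Check the document element's name using only the first 'start' event."""
    for _event, elem in ET.iterparse(source, events=("start",)):
        # The first start event is always the document element, so we can
        # stop here without constructing the rest of the tree.
        return elem.tag == expected
    return False


def duplicate_keys(source, tag, attr):
    """Stream the document and list key values that occur more than once.

    Only the set of key values is retained; finished elements are cleared.
    """
    seen = set()        # hash table of all key candidates seen so far
    duplicates = []     # repeated keys, in order of first repetition
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            key = elem.get(attr)
            if key is not None:
                if key in seen:
                    if key not in duplicates:
                        duplicates.append(key)
                else:
                    seen.add(key)
        # Drop the finished element's text, attributes and children.
        # An empty shell stays attached to its parent until the parent ends.
        elem.clear()
    return duplicates


if __name__ == "__main__":
    doc = b'<absynthe><item id="a"/><item id="b"/><item id="a"/></absynthe>'
    print(root_element_is(io.BytesIO(doc), "absynthe"))   # True
    print(duplicate_keys(io.BytesIO(doc), "item", "id"))  # ['a']
```

Note that the hash table answers the uniqueness question but says nothing about which nodes can be thrown away. Clearing every finished element is only safe here because this sketch assumes no other rule will look at those elements again.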
I think OmniMark may still be the most sophisticated in this regard: it is defined for streaming use and apparently prunes the partially-built tree once that data will no longer be accessed by any of its location paths (which are not XPath-based). The OmniMark developers seem to play things close to the chest about this, perhaps because 10+ years of tuning their implementation to their language is one of their competitive advantages.

Cheers
Rick Jelliffe