ANN: Regular Fragmentations
Back in April I suggested that regular expressions might be a useful tool for fragmenting XML 'molecule' content into smaller pieces which could then be processed as 'atoms':

http://www.xml.com/pub/a/2001/04/25/deviant.html

I've finally found the time to put together an implementation of this approach, building a SAX2 filter which uses an XML configuration file and the regular expression support built into the Xerces parser. As content passes through the filter, elements identified by the configuration file are processed and broken down into smaller elements using rules built on regular expressions.

This filter is written in Java (1.3) and requires the Xerces parser. I've released it under the Mozilla Public License (MPL) and plan to continue developing it in the directions noted in the documentation. This release is version 0.02 and I don't make extensive claims for its stability, though it works quite well on the tests I've fed it.

The regular expression package in the Xerces parser is largely compliant with the regular expression language defined in Appendix F of XML Schema Part 2: Datatypes. (I'm still trying to determine how much this implementation differs from other regular expression approaches, but my experiments are only really getting started.) You can use the recursive feature built into the processor to perform multiple-level fragmentation if necessary.

The "Regular Fragmentation" package is available from:

http://simonstl.com/projects/fragment

Documentation is still primarily javadoc, though an overview provides examples and some explanation. A list of planned improvements is at the end of the overview, and probably the most notable improvement planned is support for attribute content and content identification. Currently only element content is processed, and the rules only support identification through element names. (It is namespace-aware.)

Comments, suggestions, and contributions are welcome, either privately or to the xml-dev mailing list.
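The filtering approach described in this announcement can be sketched roughly as follows. This is only an illustration under stated assumptions, not the released package: the class name FragmentFilterSketch, its constructor arguments, and the run helper are hypothetical; the rule is passed in directly instead of being read from an XML configuration file; and java.util.regex stands in for the Xerces regular expression package, whose syntax may differ in places. The sketch also assumes the target element contains only text.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: a SAX2 filter that splits the text of one named
// element into child elements, one per regular expression capture group.
class FragmentFilterSketch extends XMLFilterImpl {
    private final String targetName;   // local name of the element to fragment
    private final Pattern rule;        // java.util.regex, not the Xerces package
    private final String[] childNames; // one child element name per capture group
    private StringBuilder buffer;      // non-null while inside the target element

    FragmentFilterSketch(String targetName, String regex, String... childNames) {
        this.targetName = targetName;
        this.rule = Pattern.compile(regex);
        this.childNames = childNames;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, local, qName, atts);
        if (local.equals(targetName)) {
            buffer = new StringBuilder(); // start collecting text (text-only content assumed)
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (buffer != null) {
            buffer.append(ch, start, length); // hold text back until the element ends
        } else {
            super.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (buffer != null && local.equals(targetName)) {
            String text = buffer.toString();
            buffer = null;
            Matcher m = rule.matcher(text);
            if (m.matches() && m.groupCount() == childNames.length) {
                for (int i = 0; i < childNames.length; i++) {
                    String part = m.group(i + 1) == null ? "" : m.group(i + 1);
                    super.startElement("", childNames[i], childNames[i], new AttributesImpl());
                    super.characters(part.toCharArray(), 0, part.length());
                    super.endElement("", childNames[i], childNames[i]);
                }
            } else {
                // No match: pass the original text through unchanged.
                super.characters(text.toCharArray(), 0, text.length());
            }
        }
        super.endElement(uri, local, qName);
    }

    // Helper for demonstration: runs the filter and returns a naive serialization
    // (no escaping, no attributes), which is enough to inspect the new structure.
    static String run(String xml, String target, String regex, String... names) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so local names are reported
            XMLReader reader = factory.newSAXParser().getXMLReader();
            FragmentFilterSketch filter = new FragmentFilterSketch(target, regex, names);
            filter.setParent(reader);
            final StringBuilder out = new StringBuilder();
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    out.append('<').append(q).append('>');
                }
                @Override
                public void endElement(String u, String l, String q) {
                    out.append("</").append(q).append('>');
                }
                @Override
                public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<doc><date>2001-04-25</date></doc>",
                "date", "(\\d{4})-(\\d{2})-(\\d{2})", "year", "month", "day"));
        // prints <doc><date><year>2001</year><month>04</month><day>25</day></date></doc>
    }
}
```

Because each stage is an ordinary SAX2 filter, multiple-level fragmentation could in principle be approximated by chaining two such filters, with the second one targeting a child element produced by the first; the released package's recursive feature may work differently.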
Simon St.Laurent
Associate Editor
O'Reilly & Associates
http://simonstl.com