RE: sets of parsing rules
Perhaps a creative use of Cocoon pipelines and sitemaps in the background, with its out-of-the-box components (generators, transformers, serializers, matchers, selectors, and readers)?

http://cocoon.apache.org/2.1/features.html

Or add a custom generator to extend Cocoon.

-----Original Message-----
From: Nathan Young -X (natyoung - Artizen at Cisco) [mailto:natyoung@c...]
Sent: Wednesday, February 07, 2007 5:36 PM
To: XML Developers List
Subject: sets of parsing rules

Hi.

I have seen parts of this question addressed, but I think it's worth asking the whole question anyway. I'm sure others have run into this problem, yet I haven't been able to dig up any best practices in my searching so far. I may just need to search with the right terminology, in which case this should be an easy one for someone who already knows...

I have an application that parses a large number of HTML pages. A few of them are well-formed XHTML, but that's the exception rather than the rule. By grabbing pages, manipulating them a bit (regexps have been sufficient here so far), then tidying them, I can get them to a state where they are parsable XML. From there I can use XSL to get them the rest of the way (although I have a process that allows me to run regexps here too, supplementing XSLT 1.0).

The wrinkle is that I have several kinds of pages, each one requiring a distinct set of steps in order to parse it. I'm starting down the road of modularizing the transforms so that I can handle more page types over time in a way that's transparent to the rest of my application.

I've only been exposed to XML-only pipelines. Are there pipeline tools that allow for non-XML steps?

------------>Nathan

Nathan Young
Cisco.com->Interface Development
A: ncy1717
E: natyoung@c...
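The per-page-type arrangement described in the question (regexp cleanup, a tidy step, then an XML-aware step) can be sketched as a registry of text-in, text-out steps keyed by page type. This is only a minimal illustration, not a recommendation of a particular tool: every name here (`regex_step`, `close_br_tags`, `extract_title`, `PIPELINES`, the page-type keys) is hypothetical, `close_br_tags` is a stand-in for a real tidy pass, and `extract_title` is a stand-in for an XSLT transform.

```python
import re
from typing import Callable, Dict, List
from xml.dom import minidom

# Every step takes text and returns text, so XML and non-XML steps can be mixed.
Step = Callable[[str], str]


def regex_step(pattern: str, replacement: str) -> Step:
    """Build a non-XML step that applies one regular-expression substitution."""
    compiled = re.compile(pattern)

    def step(text: str) -> str:
        return compiled.sub(replacement, text)

    return step


def close_br_tags(text: str) -> str:
    """Stand-in for a real tidy pass (assumption): only closes bare <br> tags."""
    return re.sub(r"<br\s*>", "<br/>", text)


def require_well_formed(text: str) -> str:
    """Gate between the non-XML and XML stages: raises if the text is not XML."""
    minidom.parseString(text)
    return text


def extract_title(text: str) -> str:
    """Stand-in for an XSLT step (assumption): returns the first <title> text."""
    doc = minidom.parseString(text)
    titles = doc.getElementsByTagName("title")
    if not titles or titles[0].firstChild is None:
        return ""
    return titles[0].firstChild.data


# Hypothetical page types, each with its own ordered list of steps.
PIPELINES: Dict[str, List[Step]] = {
    "legacy_html": [
        regex_step(r"&nbsp;", " "),  # entity that a plain XML parser rejects
        close_br_tags,
        require_well_formed,
        extract_title,
    ],
    "xhtml": [require_well_formed, extract_title],
}


def run_pipeline(page_type: str, text: str) -> str:
    """Run the steps registered for page_type, feeding each output to the next."""
    for step in PIPELINES[page_type]:
        text = step(text)
    return text


if __name__ == "__main__":
    page = "<html><head><title>Hi&nbsp;there</title></head><body>a<br>b</body></html>"
    print(run_pipeline("legacy_html", page))
```

Supporting a new page type then means adding one entry to the registry, which keeps the distinction between page types out of the rest of the application.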
XML-DEV is a publicly archived, unmoderated list hosted by OASIS to support XML implementation and development. To minimize spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@l...
subscribe: xml-dev-subscribe@l...
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php