Re: Identifying patterns within texts
Jim,
At 05:04 PM 11/29/2007, you wrote:

> The two responses I got (thank you...) reiterated the problem that I identified at the beginning of this project. How do you identify "math"? Since I am working on an educational tool where I am taking an old format (strings within XML tags) and converting to a new format (strings within new XML tags), it is tough to identify what is considered math. 1/2/99: is that math or a date? Basically, I am to decide what represents a mathematical expression and place it within its own element/tag, so that the software that processes it will be able to display it in a 2-dimensional format. So the algorithm I come up with must be flexible and expandable. And it may not be perfect... Spaces between text will be a killer... I guess what is acceptable will be up to the Systems guys...

This sort of squishiness isn't unusual for this kind of problem, if that's any comfort.

> However, I have gone down the path of choosing the XSLT implementation that was used on the GNOME project (xmlsoft.org), which implements XSLT 1.0. The engine I chose must be easily added to an existing DLL and later ported to a Mac library (now that Mac is very much Unix :) ). I needed something that was free, something whose licensing the lawyers would approve, and something that would be portable between those two platforms. I have seen some Java and C++ (Xalan with Xerces) implementations, but I did not want the added tasks of integration (JNI and C++ bindings). Please comment on my logic if you see flaws.

If Java isn't a realistic option for you, the reasoning seems sound enough. If Java is conceivable, you should at least consider Saxon8, which will give you all the 2.0 features and more. But I'm certainly not qualified to say whether it should be.

> Therefore, the idea of using the xsl:analyze-string element or regular expressions in XSLT 2.0 is not an option right now.

If it isn't, then you basically have two choices:

1. Embrace the "fun" of setting out to be a killer XSLT 1.0 programmer, devising various sorts of amazing trickiness in a language not designed for the task at hand. In this case I'd recommend getting a copy of Jeni Tennison's "XSLT and XPath On the Edge", which covers this sort of thing along with much else. You will become skilled in recursive templates, arcane tricks with the translate() function, and other sorts of madness.

If you want to stay sane, however ...

> I guess I could use a package like Boost/regex to post-process my converted XML. I assume I can generate the XML from the result tree in memory and then parse that looking for math using C.

This sounds like a worthwhile option.

Just to make sure there isn't an intermediate course, you might investigate what sort of extensibility your processor of choice offers. Maybe you could manage the requirement by writing your own function library, and if you're lucky, maybe some of it has already been done for you.

Cheers,
Wendell
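
For the post-processing route discussed above, a regex pass over the text content might look roughly like the sketch below. It is only an illustration under several assumptions: it uses the standard library's std::regex as a stand-in for Boost.Regex, the function name tag_math and the <math> wrapper element are hypothetical, the input is assumed to be the text of a single node rather than raw markup, and the rule "digits/digits/digits is a date" is just one possible heuristic for the 1/2/99 case.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: wraps arithmetic-looking runs in <math>...</math>.
// Assumption: `text` is plain text content, not serialized XML.
std::string tag_math(const std::string& text) {
    // A number, followed by one or more (operator, number) pairs,
    // with optional whitespace around each operator.
    static const std::regex expr(
        R"(\d+(?:\.\d+)?(?:\s*[-+*/=^]\s*\d+(?:\.\d+)?)+)");
    // Heuristic (an assumption): d/d/d with short fields is a date, not math.
    static const std::regex date(R"(\d{1,2}/\d{1,2}/\d{2,4})");

    std::string out;
    std::size_t last = 0;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), expr); it != end; ++it) {
        const std::smatch& m = *it;
        const auto pos = static_cast<std::size_t>(m.position());
        const auto len = static_cast<std::size_t>(m.length());
        out.append(text, last, pos - last);   // copy text before the match
        const std::string s = m.str();
        if (std::regex_match(s, date)) {
            out += s;                          // looks like a date: leave as-is
        } else {
            out += "<math>" + s + "</math>";   // looks like math: wrap it
        }
        last = pos + len;
    }
    out.append(text, last, std::string::npos); // copy the remaining tail
    return out;
}
```

With this sketch, tag_math("Add 1/2 + 3/4 on 1/2/99") wraps "1/2 + 3/4" and leaves "1/2/99" untouched. It will still misfire on things like page ranges ("3-5") and ignores expressions with variables, so the rule set would need to grow with whatever the Systems side decides is acceptable.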