up conversion using regex (was something else)

Subject: up conversion using regex (was something else)
From: David Carlisle <davidc@xxxxxxxxx>
Date: Mon, 16 Aug 2004 11:03:46 +0100
Wendell wrote:

> Well, up-conversion has been going on as long as e-text has. But XML is only 
> now beginning to catch up to where Omnimark and even Perl were years ago. 

Yes, I think the XSLT2 functionality in this area is going to prove very
useful.

I wrote up some stream-of-consciousness thoughts on requirements
in this area a couple of years back:
http://lists.w3.org/Archives/Public/xsl-editors/2002JanMar/0083.html

The exact syntax that the WG came up with is rather different from the
suggestions in that mail, but I believe that it can cope with most, if
not all, of the requirements. For example, one of the harder cases that I
floated was:

  RE-6c: HTML Markup.
    As above but with HTML, in particular with implied end tags. In general
    this requires a DTD and knowledge of SGML omitted tag rules. To handle
    general HTML as it appears in the wild, arbitrarily complicated "tag
    soup" parsing heuristics as implemented in the browsers would be
    needed. However this appears to be a very common requirement often
    generated by storing HTML fragments as strings in a database. One may
    hope that specific simple cases may be handled for example:



The regexp support in XSLT2/Saxon 8 is enough to do this; see for example
http://www.dcarlisle.demon.co.uk/htmlparse.xsl
which does the above (parsing dubiously formed HTML, together with some
support for embedded XML+namespace syntax). This provides an
alternative answer to the FAQ on what to do with
<foo><![CDATA[...<a href="#x">click here</a> ...]]></foo>
as you can now handle that quite reasonably with XSLT2.
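
To make the idea concrete, here is a minimal sketch of regex-driven up-conversion of an HTML string into a well-formed tree. It is written in Python rather than XSLT2 and is not the htmlparse.xsl stylesheet; an XSLT2 version would presumably do the tokenizing with xsl:analyze-string. The names up_convert, TOKEN, ATTR, VOID and AUTO_CLOSE, the "fragment" wrapper element, and the very short lists of void and auto-closing elements are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape

# A token is a start/end tag, a run of text, or a stray "<" kept as text.
TOKEN = re.compile(
    r"<(?P<close>/?)(?P<name>[A-Za-z][A-Za-z0-9]*)(?P<attrs>[^<>]*?)(?P<empty>/?)>"
    r"|(?P<text>[^<]+|<)"
)
# name="v", name='v' or name=v (unquoted)
ATTR = re.compile(
    r"([A-Za-z_:][-\w:.]*)\s*=\s*(?:\"([^\"]*)\"|'([^']*)'|([^\s\"'>]+))"
)

# Illustrative subsets only; real HTML needs the full DTD knowledge.
VOID = {"br", "hr", "img"}                 # elements that never have content
AUTO_CLOSE = {"p": {"p"}, "li": {"li"}}    # opening <p> implies </p>, etc.


def _append_text(parent, text):
    """Add text at the end of parent, following ElementTree's text/tail model."""
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def up_convert(markup, root_name="fragment"):
    """Turn a loosely formed HTML string into an ElementTree element."""
    root = ET.Element(root_name)
    stack = [root]                         # currently open elements
    for m in TOKEN.finditer(markup):
        if m.group("text") is not None:
            _append_text(stack[-1], unescape(m.group("text")))
            continue
        name = m.group("name").lower()
        if m.group("close"):
            # Close back to the nearest matching open element;
            # an end tag with no matching start tag is ignored.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].tag == name:
                    del stack[i:]
                    break
            continue
        # Implied end tag: e.g. a new <p> closes the <p> that is still open.
        if len(stack) > 1 and stack[-1].tag in AUTO_CLOSE.get(name, ()):
            stack.pop()
        attrs = {
            n.lower(): unescape(dq or sq or bare)
            for n, dq, sq, bare in ATTR.findall(m.group("attrs"))
        }
        element = ET.SubElement(stack[-1], name, attrs)
        if name not in VOID and not m.group("empty"):
            stack.append(element)
    return root


if __name__ == "__main__":
    # The string content of the CDATA section in the FAQ example.
    tree = up_convert('...<a href="#x">click here</a> ...')
    print(ET.tostring(tree, encoding="unicode"))
```

The sketch makes no attempt at browser-style tag soup recovery: attribute values containing ">" , comments, and script content would all need extra token rules, and anything still open at the end of the string is simply closed by the serializer.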

David