dynamic xhtml with xquery
Andrew Welch andrew.j.welch at gmail.com
Tue Jul 14 10:06:02 PDT 2009
If you want to process RSS then I'm afraid it's far too much of a mess not to preprocess, unless you are only transforming feeds from sources you know generate good feeds.

I recently wrote a small app to process feeds from all over the world. The goal was to process the contents of the feed, the stuff that is nearly always escaped or wrapped in CDATA. In the end I had to:

- Run each escaped/CDATA section through tagsoup to create well-formed XML (to convert the characters to markup)
- Post-process the result to convert entity references to character references
- Transform that with XSLT

Certain feeds would fail with parse errors using my app, but work with Google Reader (and various others), so I wonder if they use XML tools to process the feeds... (I can't see them jumping through the same hoops I did)

Basically RSS annoyed me a lot - what's the point of using XML if you can't process it as XML? It may be that the people tasked with authoring the feeds struggled to generate well-formed XML (perhaps because they used non-XML tools to create the feeds) and then took the easy but misguided option of escaping/CDATA. Or perhaps, because a lot of the feeds are loaded with adverts, they wanted readers to be forced to treat the contents as a single lump...

Either way, writing a feed reader that processes the feeds (rather than just copying them to the output) isn't trivial.

2009/7/13 Gary Lewis <http://x-query.com/mailman/listinfo/talk>:
> Michael got the diagnosis right (escaped markup) even though I didn't
> have the words to describe the problem accurately.
>
> Today I read Norman Walsh's 2003 article in xml.com on escaped markup.
> My simple question to this mailing list suddenly looks like a can of
> worms. And a google search of the topic turned up many other
> discussions of the problem.
>
> Yikes! Guess I'll let go of this problem temporarily.
>
> If anyone knows of an "easy-read" on escaped markup suitable for an
> xquery rookie, please let me know.
>
> Thanks.
> Gary Lewis
>
> On Fri, Jul 10, 2009 at 2:15 PM, Michael Kay<http://x-query.com/mailman/listinfo/talk> wrote:
>> If your input contains escaped markup that you want to treat as markup then
>> you have to unescape it. The easiest way to do that is to put it through an
>> XML parser. Unfortunately there's no standard XQuery function to do that.
>> Saxon has saxon:parse(), other products may have similar extensions, or you
>> may be able to write your own.
>>
>> A better solution is to persuade people not to escape their markup...
>>
>> Regards,
>>
>> Michael Kay
>> http://www.saxonica.com/
>> http://twitter.com/michaelhkay
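
For readers who want to see the unescape-then-parse idea in concrete form, here is a minimal sketch in Python. It is not the code from either poster. It assumes the escaped content is already well-formed XHTML, so it skips the tagsoup clean-up step entirely, and it only handles the HTML 4 named entities known to Python's `html.entities` table. The function names, the sample item, and the `div` wrapper element are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser understands; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_to_char_refs(markup: str) -> str:
    """Rewrite HTML named entities (e.g. &nbsp;) as numeric character references."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep XML's own entities and unknown names
        return f"&#{name2codepoint[name]};"
    return ENTITY_REF.sub(replace, markup)


def parse_escaped_description(item: ET.Element) -> ET.Element:
    """Turn the escaped markup in <description> back into an element tree."""
    # Parsing the feed already undid one level of escaping, so this text
    # is markup held in a plain string.
    escaped = item.findtext("description", default="")
    fragment = entities_to_char_refs(escaped)
    # Hypothetical wrapper element so the fragment has a single root.
    return ET.fromstring(f"<div>{fragment}</div>")


# Hypothetical feed item with escaped XHTML in its description.
RSS_ITEM = """<item>
  <title>Example</title>
  <description>&lt;p&gt;Caf&amp;eacute;&amp;nbsp;&lt;b&gt;open&lt;/b&gt;&lt;/p&gt;</description>
</item>"""

item = ET.fromstring(RSS_ITEM)
body = parse_escaped_description(item)
print(ET.tostring(body, encoding="unicode"))
```

On real-world feeds the `ET.fromstring` call on the fragment is where parse errors would surface, which is why a tag-soup style clean-up pass comes first in the pipeline described above.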