
dynamic xhtml with xquery

Andrew Welch andrew.j.welch at gmail.com
Tue Jul 14 10:06:02 PDT 2009


If you want to process RSS then I'm afraid it's far too much of a mess
to not preprocess, unless you are only transforming feeds from sources
you know generate good feeds.

I recently wrote a small app to process feeds from all over the world.
 The goal was to process the contents of the feed, the stuff that is
nearly always escaped or wrapped in CDATA.  In the end I had to:

- Run each escaped/CDATA section through tagsoup to create well-formed
XML (to convert the characters to markup)
- Post-process the result to convert entity references to character references
- Transform that with XSLT
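
The entity-reference step matters because XML predefines only five entities (amp, lt, gt, quot, apos), so an XML parser with no DTD rejects HTML names such as &nbsp;. A minimal sketch of that step, assuming Python and its standard library rather than the tools used in the app described above; the function name entities_to_char_refs is hypothetical:

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches a named entity reference such as &nbsp; or &eacute;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_to_char_refs(text):
    """Rewrite HTML named entity references as numeric character references.

    XML's predefined entities and unknown names are left untouched.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return ENTITY_REF.sub(replace, text)


# Hypothetical fragment, as it might look after the tidy-up step.
fragment = "<p>caf&eacute;&nbsp;&amp; bar</p>"
cleaned = entities_to_char_refs(fragment)  # "<p>caf&#233;&#160;&amp; bar</p>"

# The cleaned fragment now parses with a plain XML parser.
element = ET.fromstring(cleaned)
```

Passing the original fragment straight to ET.fromstring would fail on the undeclared eacute and nbsp entities, which is the kind of parse error this step avoids.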

Certain feeds would fail with parse errors using my app, but work with
Google Reader (and various others) so I wonder if they use XML tools
to process the feeds... (I can't see them jumping through the same hoops I
did)

Basically RSS annoyed me a lot - what's the point of using XML if you
can't process it as XML?  It may be that people tasked with authoring
the feeds struggled to generate well-formed XML (because perhaps they
used non-XML tools to create the feeds) and then took the easy but
misguided option of escaping/CDATA.  Or perhaps because a lot of the
feeds are loaded with adverts, they wanted readers to be forced to
treat the contents as a single lump... Either way, writing a feed
reader that processes the feeds (rather than just copies them to the
output) isn't trivial.



2009/7/13 Gary Lewis <http://x-query.com/mailman/listinfo/talk>:
> Michael got the diagnosis right (escaped markup) even though I didn't
> have the words to describe the problem accurately.
>
> Today I read Norman Walsh's 2003 article in xml.com on escaped markup.
> My simple question to this mailing list suddenly looks like a can of
> worms. And a google search of the topic turned up many other
> discussions of the problem.
>
> Yikes! Guess I'll let go of this problem temporarily.
>
> If anyone knows of an "easy-read" on escaped markup suitable for an
> xquery rookie, please let me know.
>
> Thanks.
> Gary Lewis
>
> On Fri, Jul 10, 2009 at 2:15 PM, Michael Kay <http://x-query.com/mailman/listinfo/talk> wrote:
>> If your input contains escaped markup that you want to treat as markup then
>> you have to unescape it. The easiest way to do that is to put it through an
>> XML parser. Unfortunately there's no standard XQuery function to do that.
>> Saxon has saxon:parse(), other products may have similar extensions, or you
>> may be able to write your own.
>>
>> A better solution is to persuade people not to escape their markup...
>>
>> Regards,
>>
>> Michael Kay
>> http://www.saxonica.com/
>> http://twitter.com/michaelhkay
>>
>>
>>
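
To make "put it through an XML parser" concrete, here is a rough sketch of the idea in Python with xml.etree.ElementTree, swapped in for saxon:parse() purely for illustration. The sample feed and the parse_escaped helper are made up for this example, and the approach only works when the escaped content is itself well-formed XML:

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS item whose description holds escaped markup.
feed = """<rss version="2.0"><channel><item>
<title>Example</title>
<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>
</item></channel></rss>"""


def parse_escaped(text):
    """Parse a string of unescaped markup into an element tree.

    The text is wrapped in a dummy root element so that fragments with
    several top-level elements (or leading text) still parse.
    """
    return ET.fromstring("<wrapper>" + text + "</wrapper>")


root = ET.fromstring(feed)
description = root.find("./channel/item/description")

# The first parse already turned &lt;p&gt; back into the characters <p>,
# so description.text is a plain string that looks like markup.
fragment = parse_escaped(description.text)

# After the second parse the content is real markup and can be queried.
bold = fragment.find("./p/b")
```

If the escaped content is tag soup rather than XML, the second parse raises ET.ParseError, which is where a tidy-up step before parsing becomes necessary.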

