Re: problem with processing CDATA tags in xml


Subject: Re: problem with processing CDATA tags in xml
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 08 Apr 2010 13:32:26 +0100

On 08/04/2010 13:01, Robby Pelssers wrote:

Ok....

I need to clarify one thing...

Their product schema does not allow a <Value> to have subtags...

or rather elements don't have element content.

That's why they use CDATA.

A bad workaround (compared to fixing the input schema), as in particular you lose a lot of validation that the input is at least well formed, which is the cause of the present difficulty.

And in my opinion that's not so bad, since from a data point of view these html tags are purely a rendition thing.

If you are going to quote the XML fragment as CDATA, it is your responsibility to check that what you are quoting is well-formed XML, since the XML parser will not do so. The posted fragment was not well formed, so it seems reasonable that an error is generated at some point once the fragment is unquoted.
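As a rough illustration of the kind of check meant here, the following Python sketch wraps the quoted text in a dummy root element and hands it to a real XML parser. The helper name is_well_formed_fragment and the <wrapper> element are made up for the example, and it assumes the fragment uses no entities beyond the five predefined ones (an HTML entity such as &nbsp; would be reported as an error).

```python
import xml.etree.ElementTree as ET


def is_well_formed_fragment(fragment: str) -> bool:
    """Return True if the text taken out of a CDATA section parses as XML content.

    The fragment may contain several top-level nodes, so it is wrapped in a
    throwaway root element (hypothetical name) before parsing.
    """
    try:
        ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed_fragment("H<sub>2</sub>O"))       # balanced tags
    print(is_well_formed_fragment("a < b <sub>x</sub>"))   # unquoted '<'
```

Running such a check at the point where the CDATA section is written, rather than where it is unquoted, puts the error next to its cause.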

If you want to do automatic fix-up of the quoted fragments (which is often necessary when processing feeds, for example, with spurious "html" markup in them), then the thing to do is parse the fragment using an excessively lenient parser such as TagSoup, tidy or my own htmlparse. Exactly what errors they will tolerate depends on the parser, though: I'm not sure what those three do with an unquoted < as occurs in your fragment, for example.
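To make the fix-up idea concrete, here is a minimal sketch that uses Python's standard html.parser module in place of the three parsers named above (it is a different tool, and its tolerance differs from theirs). It re-emits whatever the lenient parser recognizes as escaped, balanced markup. The names FragmentRepairer, repair_fragment and the small VOID set are assumptions for the example; comments, processing instructions and attribute-name problems are not handled.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Assumed, deliberately short list of HTML elements that never take an end tag.
VOID = {"br", "hr", "img"}


class FragmentRepairer(HTMLParser):
    """Re-serialize tag-soup input as balanced, escaped markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the repaired output
        self.open_tags = []  # stack of elements still waiting for an end tag

    def handle_starttag(self, tag, attrs):
        # Attributes without a value (e.g. "checked") get their name as value.
        rendered = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{rendered}/>")
        else:
            self.out.append(f"<{tag}{rendered}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: drop it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        # Text, including a bare '<' the parser passed through, gets escaped.
        self.out.append(escape(data))


def repair_fragment(fragment: str) -> str:
    parser = FragmentRepairer()
    parser.feed(fragment)
    parser.close()
    while parser.open_tags:  # close anything left open at the end
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

The repaired string can then be parsed by an ordinary XML parser. Whether the guesses such a repair makes (for instance, treating a lone < as text) match what the data supplier intended is a separate question.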

But basically I see 2 options from the responses:
(1): Use the cdata-section-elements attribute on <xsl:output>
(2): make changes to the schema for all elements which may have html tags as children
I still see a problem with (1)... in the end, when serializing to html, I still want to use disable-output-escaping so the browser will recognize <sub> and <sup> as tags instead of plain text... but then the greater-than sign '>' will result in invalid xml.

And I'm not sure if (2) will be accepted since this will involve quite a bit of work to implement the changes.
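On the worry under (1): if the quoted text is parsed into real elements before serializing, the serializer takes care of escaping '>' itself and disable-output-escaping is not needed. The following is only a sketch in Python of that idea, not what either poster did. It assumes the fragment is already well formed (otherwise ET.ParseError is raised), and expand_quoted_markup is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def expand_quoted_markup(elem: ET.Element) -> None:
    """Replace the quoted markup held as text in elem with real child elements."""
    # Wrap the text in a throwaway root so mixed content parses as one document.
    wrapper = ET.fromstring(f"<wrapper>{elem.text or ''}</wrapper>")
    elem.text = wrapper.text
    for child in list(wrapper):
        elem.append(child)


if __name__ == "__main__":
    value = ET.fromstring("<Value><![CDATA[x <sub>i</sub> &gt; 0]]></Value>")
    expand_quoted_markup(value)
    # <sub> is now an element; the '>' in the text is escaped by the serializer.
    print(ET.tostring(value, encoding="unicode"))
```

In an XSLT pipeline the equivalent step would need a processor-specific parsing extension, so which route is practical depends on the processor in use.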


