RE: CDATA conversion issue
- From: Michael Brennan <Michael_Brennan@a...>
- To: 'Lynda VanVleet' <lyndavv@e...>, xml-dev@l...
- Date: Wed, 15 Aug 2001 15:59:19 -0700

I haven't used Oracle's parser, and the last time I used the Forte 4GL parser
was about 2 years ago, so I can't comment specifically on either of these.
However, one thing you can investigate is whether either of them supports the
LexicalHandler interface (a SAX2 standard extension). If so, you can use SAX to
parse the document and build the DOM yourself in response to SAX events. This
is more work, but the LexicalHandler interface lets your application be
notified of the start and end of each CDATA section. If neither parser supports
LexicalHandler, you may want to explore using another parser. Sun's Crimson
(included in their JAXP distribution), Apache Xerces, and Aelfred all support
this interface. Microsoft's XML SDK version 3.0 and higher also supports it; if
you are running your code on the Microsoft platform, this may be an option.
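As a rough illustration of that approach, here is a minimal Java sketch that parses with SAX, listens for CDATA boundaries through LexicalHandler, and builds the DOM by hand. It assumes a JAXP-style environment whose SAX parser recognizes the standard lexical-handler property; the class name CdataPreservingBuilder and its build method are made up for this example and are not part of any parser's API. It ignores namespaces, comments, and processing instructions, so treat it as a starting point only.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: DefaultHandler2 implements both ContentHandler and
// LexicalHandler, so one object can receive element and CDATA events.
public class CdataPreservingBuilder extends DefaultHandler2 {
    private final Document doc;
    private Node current;            // node that new children are appended to
    private StringBuilder cdata;     // non-null only while inside a CDATA section

    CdataPreservingBuilder(Document doc) {
        this.doc = doc;
        this.current = doc;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(e);
        current = e;
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        current = current.getParentNode();
    }

    @Override
    public void startCDATA() {
        cdata = new StringBuilder();
    }

    @Override
    public void endCDATA() {
        // Emit one CDATA node per section, even if the text arrived in chunks.
        current.appendChild(doc.createCDATASection(cdata.toString()));
        cdata = null;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (cdata != null) {
            cdata.append(ch, start, length);
        } else {
            current.appendChild(doc.createTextNode(new String(ch, start, length)));
        }
    }

    // Parses the XML text with SAX and returns a DOM that keeps CDATA nodes.
    public static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        CdataPreservingBuilder handler = new CdataPreservingBuilder(doc);
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Standard SAX2 property id for registering a LexicalHandler.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build("<aDoc><![CDATA[<aTag>Hello</aTag>]]></aDoc>");
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out);
    }
}
```

If the parser rejects the lexical-handler property, setProperty throws, which is a quick way to find out whether a given parser supports the interface at all.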
I would advise against trying to write your own XML parser. There are a number
of hidden nuances that are not evident up front; writing an XML parser is not
as trivial as it may first appear.
We need to allow people to view/edit XML messages in our application. When we
import a document that has a CDATA section into an XML parser, the values are
converted using the escape characters and created in the DOM as a text node.
(We have tried this with 2 parsers: Sun Forte 4GL and Oracle's Java Parser.)
Thus, if the document is exported from the DOM, the CDATA section is gone, and
the data now contains the escape characters.
Example Starting Doc
<aDoc><![CDATA[<aTag>Hello</aTag>]]></aDoc>
Example Ending Doc
<aDoc>&lt;aTag&gt;Hello&lt;/aTag&gt;</aDoc>
I am sure this is the intended DOM behavior, but it clearly does not satisfy
our needs.
Note: If we add a CDATA section programmatically using the DOM, the export is
correct. This leads us to our only proposed solution so far, which is to write
our own code that parses the document and creates a DOM representation of the
data without converting the CDATA sections. Hopefully this is not the only
solution!
Here is the code we are using (this is Sun Forte 4GL, but you should be able
to read it for the logic):
//Open a Test File
aFile:File=New;
aFile.SetLocalName('c:\\temp\\ainfile.xml');
aFile.open(sp_am_read);
//Import the document
aDocument:Document=New;
aDocument.ImportDocument(aFile);
//Create an output file
aOutFile:File=New;
aOutFile.SetLocalName('c:\\temp\\aoutfile.xml');
aOutFile.open(sp_am_write);
//Export the document
aDocument.ExportDocument(aOutFile);
//Close the files.
aFile.close();
aOutFile.close();
I remembered this sort of question occurring on the list in June and looked in
the archives, but the solution was base64 encoding, and that won't work for me.
What really scares me is that in the archive thread Tim Bray says, "I'm always
happy to avoid using CDATA sections if I can."
Lynda VanVleet

Lynda Van Vleet
Software Design Engineer
lvanvleet@c...
http://www.classiq.com/