
Re: A simple guy with a simple problem

  • From: "Martin v. Loewis" <martin@l...>
  • To: sean@d...
  • Date: Thu, 15 Mar 2001 21:59:51 +0100

> >I think I'm missing your point. The document you got afterwards is the
> >same as it was before. Is that not what you wanted?
> 
> Therein lies the nub of the issue, the words "the same".

What ultimately matters is: Are they the same when processed by the
applications at Bob's company? Or, if Bob's company sends them to
somebody else: Do they still follow the application-level protocol
that was agreed between Bob's company and that somebody else?

Bob did not explain what the application was, or how the protocol was
defined (nor did he say exactly why it is desirable to replace STUFF
with stuff). Most likely, what a SAX parser would do would not break
the application.

> Lexical approach: Leaves lots of the document "the same" but it
> is very difficult to get the processing right in the face of all
> the things that are hidden beneath the term "DTD valid XML".
> foo1.xml is an example of these gotchas.

Clearly, if you do processing, the result *won't* be the same,
lexically - or else you would not need the processing. This is the
place where Bob's description of the problem comes into play: "all
occurrences of STUFF" implies "in text and attributes", which means
that rewriting CDATA sections is probably ok. Again, without knowing
what the application is, you cannot say for sure - but if CDATA
sections matter, I'd argue that the application is broken.
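
To make the "in text and attributes" reading concrete, here is a minimal
sketch of a parser-based replacement. It uses Python's
xml.etree.ElementTree rather than SAX, purely for brevity; the function
name replace_stuff and the sample document are my own illustrations, not
anything Bob posted. Note that CDATA sections come back as ordinary
escaped text, and element names are left alone.

```python
import xml.etree.ElementTree as ET


def replace_stuff(xml_text, old="STUFF", new="stuff"):
    """Replace `old` with `new` in character data and attribute values.

    Hypothetical helper: element and attribute *names* are not touched.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Text directly inside the element (CDATA content ends up here too).
        if elem.text:
            elem.text = elem.text.replace(old, new)
        # Text that follows the element's end tag.
        if elem.tail:
            elem.tail = elem.tail.replace(old, new)
        # Attribute values; copy the items so we can assign while looping.
        for key, value in list(elem.attrib.items()):
            elem.attrib[key] = value.replace(old, new)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative input only: a CDATA section, an attribute, and an
    # element that happens to be named STUFF.
    sample = (
        '<doc a="STUFF"><![CDATA[STUFF & more]]>'
        "<STUFF>x STUFF</STUFF> tail STUFF</doc>"
    )
    print(replace_stuff(sample))
```

The output carries the same character data as the input after
replacement, but it is not lexically the same: the CDATA markup is gone
and the ampersand is written as an entity reference. An application that
cares about that difference is the kind I would call broken.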

> Parser based approach: A lot easier to get the processing
> right but fiendishly difficult to leave unprocessed parts of
> the document "the same" in the face of all the things
> hidden beneath the term "DTD valid XML".
> The output of a SAX or XSLT transform of foo1 is an example
> of the problem.

I lost track of what foo1 is here. Why is the output of the parser an
example? My point is that it *does* leave "unprocessed parts" the same
(even though they change lexically). The task was to globally replace
STUFF, so there are essentially no "unprocessed parts".

I'd suggest a more pragmatic route: If Bob is reasonably confident
that documents that will appear in the context of his applications are
not "broken" by the processing it does, then the processing is
fine. There may be cases where it can be proven that global
search-and-replace will do no harm (e.g. if the negotiated protocol
implies restrictions on the use of XML).
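
Short of a proof, one can at least test individual documents. The sketch
below is an assumption-laden illustration, not a definitive check: it
compares a plain string replace against a parser-based replace by
canonicalizing both results (xml.etree.ElementTree.canonicalize, Python
3.8 or later). The helper names parsed_replace and
lexical_replace_is_harmless are hypothetical.

```python
import xml.etree.ElementTree as ET


def parsed_replace(xml_text, old, new):
    """Parser-based replace in character data and attribute values only."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text:
            elem.text = elem.text.replace(old, new)
        if elem.tail:
            elem.tail = elem.tail.replace(old, new)
        for key, value in list(elem.attrib.items()):
            elem.attrib[key] = value.replace(old, new)
    return ET.tostring(root, encoding="unicode")


def lexical_replace_is_harmless(xml_text, old="STUFF", new="stuff"):
    """True if a plain string replace yields the same canonical document
    as the parser-based replace, for this one document."""
    try:
        lexical = ET.canonicalize(xml_text.replace(old, new))
    except ET.ParseError:
        # The string replace broke well-formedness.
        return False
    return lexical == ET.canonicalize(parsed_replace(xml_text, old, new))


if __name__ == "__main__":
    # Replacement only hits text and an attribute value.
    print(lexical_replace_is_harmless('<doc a="STUFF">some STUFF here</doc>'))
    # Replacement would also rename an element.
    print(lexical_replace_is_harmless("<STUFF>STUFF</STUFF>"))
```

This says nothing about documents you have not tested; a restriction in
the negotiated protocol (say, that STUFF never occurs in markup) is what
would turn such spot checks into an actual guarantee.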

If you really need an environment where more reliable data
transformations are possible, you might need to look for alternatives
to XML :-)

Regards,
Martin
