Re: A simple guy with a simple problem
> >I think I'm missing your point. The document you got afterwards is the
> >same as it was before. Is that not what you wanted?
>
> Therein lies the nub of the issue, the words "the same".

What ultimately matters is: Are they the same when processed by the applications at Bob's company? Or, if Bob's company sends them to somebody else: Do they still follow the application-level protocol that was set between Bob's company and that somebody else?

Bob did not explain what the application was, or how the protocol was defined (nor did he say exactly why it is desirable to replace STUFF with stuff). Most likely, what a SAX parser would do would not break the application.

> Lexical approach: Leaves lots of the document "the same" but it
> is very difficult to get the processing right in the face of all
> the things that are hidden beneath the term "DTD valid XML".
> foo1.xml is an example of these gotchas.

Clearly, if you do processing, the result *won't* be the same, lexically - or else you would not need the processing. This is the place where Bob's description of the problem comes into play: "all occurrences of STUFF" implies "in text and attributes", which means that rewriting CDATA sections is probably ok. Again, without knowing what the application is, you cannot say for sure - but if CDATA sections matter, I'd argue that the application is broken.

> Parser based approach: A lot easier to get the processing
> right but fiendishly difficult to leave unprocessed parts of
> the document "the same" in the face of all the things
> hidden beneath the term "DTD valid XML".
> The output of a SAX or XSLT transform of foo1 is an example
> of the problem.

I lost track of what foo1 is, here. Why is the output of the parser an example? My point is that it *does* leave "unprocessed parts" the same (even though they change lexically). The task was to globally replace STUFF, so there are essentially no "unprocessed parts".
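To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It is an illustration only, not anything Bob or the thread proposed: it uses the standard library's `xml.etree.ElementTree` in place of SAX, the function name `replace_in_text_and_attributes` and the sample document are made up, and it assumes "all occurrences of STUFF" means text and attribute values only.

```python
import xml.etree.ElementTree as ET


def replace_in_text_and_attributes(xml_text, old="STUFF", new="stuff"):
    """Parser-based replacement (hypothetical helper).

    Only character data and attribute values are touched; element
    names are left alone.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text:
            elem.text = elem.text.replace(old, new)
        if elem.tail:
            elem.tail = elem.tail.replace(old, new)
        # Copy the items first so the attribute dict is not
        # modified while it is being iterated.
        for name, value in list(elem.attrib.items()):
            elem.set(name, value.replace(old, new))
    return ET.tostring(root, encoding="unicode")


# Made-up sample: STUFF appears in an attribute, in a CDATA section,
# and as an element name.
source = '<doc kind="STUFF"><![CDATA[STUFF & more]]><STUFF>x</STUFF></doc>'

parsed_result = replace_in_text_and_attributes(source)

# Lexical approach: a blind global search-and-replace on the raw text.
lexical_result = source.replace("STUFF", "stuff")

print(parsed_result)
print(lexical_result)
```

In this sketch the parser-based result no longer contains a CDATA section (the text comes back as ordinary escaped character data), so it differs lexically from the input while carrying the same text content. The lexical result keeps the CDATA section but also renames the `STUFF` element, which may or may not be what the application expects.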
I'd suggest a more pragmatic route: If Bob is reasonably confident that documents that will appear in the context of his applications are not "broken" by the processing it does, then the processing is fine. There may be cases where it can be proven that global search-and-replace will do no harm (e.g. if the negotiated protocol implies restrictions on the use of XML).

If you really need an environment where more reliable data transformations are possible, you might need to look for alternatives to XML :-)

Regards,
Martin