
  • To: xml-dev@l...
  • Subject: XML's Scylla and Charybdis - parse and regexp
  • From: Sean McGrath <sean.mcgrath@p...>
  • Date: Tue, 01 Apr 2003 09:48:59 +0100

Here is the conundrum that is at the heart of the border guard thread:

Option 1: Using regexps to process my XML
1.1 I will not be able to say for sure that it works for all WF inputs - 
false negatives/positives possible
1.2 I will be able to say that my processing will leave things like entity 
references, whitespace, attribute delimiters etc. unharmed.

Option 2: Using a parser to process my XML
2.1 I will be able to say for sure that it works for all WF inputs - no 
false positives/negatives possible
2.2 I will not be able to say that my processing will leave things like 
entity references, whitespace, attribute delimiters etc. unharmed.

Correctness or input fidelity - pick one - you cannot have both.
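
A small Python sketch of the trade-off, assuming only the standard library's re and xml.etree.ElementTree modules. The sample document and the two function names are made up for illustration; the task is to rename price elements to cost.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: a commented-out <price>, single-quoted attribute,
# extra whitespace inside the start tag, and a character reference.
SOURCE = "<doc><!-- <price>1</price> --><price  cur='EUR'>10 &#38; up</price></doc>"

def rename_with_regexp(text):
    # Option 1: a purely textual edit. Bytes that do not match are untouched,
    # but the pattern cannot tell a real tag from one inside a comment.
    return re.sub(r"(</?)price\b", r"\g<1>cost", text)

def rename_with_parser(text):
    # Option 2: parse, edit the tree, serialize. Only real elements are
    # renamed, but the serializer chooses its own lexical form.
    root = ET.fromstring(text)
    for elem in list(root.iter("price")):
        elem.tag = "cost"
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(rename_with_regexp(SOURCE))
    print(rename_with_parser(SOURCE))
```

With this input, the regexp version keeps the single quotes, the double space and the &#38; reference exactly as written, but it also renames the tags inside the comment (a false positive). The parser version renames only the real element, but ElementTree's default settings drop the comment, switch to double quotes, collapse the space and write the reference back as &amp;.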

This is at the core of why I've always argued that we *do* need a data 
model for XML and we *do* need something like
common XML because I want my processing to be both correct *and* non-lossy 
(high input fidelity).

Is that too much to ask?

Am I the only one who wants both?

Sean



http://seanmcgrath.blogspot.com


