Re: patterns vs. identifiers
8/19/2002 11:55:33 PM, "Thomas B. Passin" <tpassin@c...> wrote:

>Actually, I think we agree on just about everything except perhaps how
>possible it would be to have the computer end figure things out from
>context, which I still see as fairly hard.

I may well be over-optimistic; I'm trying to put together some code to explore the issue. For what it's worth, my suspicion that there *is* a lot one could do with fairly simple heuristics was strengthened by reading http://www.paulgraham.com/spam.html (a discussion of spam filtering):

"A few simple rules will take a big bite out of your incoming spam. Merely looking for the word "click" will catch 79.7% of the emails in my spam corpus, with only 1.2% false positives."

Also check out Eugene Kuznetzov's article in XML Journal on XML-aware network equipment:
http://www.sys-con.com/xml/articleprint.cfm?id=459

In discussing the challenge of recognizing a specific XML vocabulary and routing messages in that vocabulary to a specialized processor, he says:

"the same device could send messages in a particular XML vocabulary to the server capable of processing them, or it could send separate XML-RPC and SOAP messages. The routing rules are specified using either proprietary pattern-matching languages or a limited subset of XPath."

This must be done under severe performance constraints:

"Because enterprise network equipment is expected to function at wirespeed (at least Fast Ethernet or 100 megabits per second), the same is required of the XML processing core embedded in the device."

So, it looks to me that it is quite possible to use pattern matching and/or XPath "queries" to usefully perform tasks with heuristics that "logically" require much more complex namespace processing and schema type validation.

Also, I really hate to mention this :-) but think of the "wonderful" job that browsers do in making sense out of hideously invalid HTML.
Is there any reason to think that that level of creative hackery can't or won't be applied to the challenge of making sense out of business messages in XML, some of which will come from buggy software, some of which will be human edited, some of which will come from organizations that support newer versions of some spec than the receiver does, some of which will be generated by software that interprets the ambiguities in the spec differently from the receiver, some of which will come from software that "embraces and extends" the spec ... ad nauseam? A "draconian" error handling policy just won't be any more viable than it would have been in Netscape 1.0.

[I don't want to argue that any of this is a Good Thing ... just that it is technically possible, and the business geeks will probably think it desirable.]
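
A minimal sketch of the sort of heuristic routing described above, assuming Python's standard xml.etree.ElementTree module and the small XPath subset it supports. The routing table, the handler names, and the `route` function are all hypothetical, made up for illustration; this is not the pattern language of any product mentioned here. The idea is to ignore namespace URIs entirely and choose a destination from the shape of the message alone, so a message in a newer or slightly different version of a vocabulary still reaches a plausible handler:

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: (limited-XPath pattern, handler name).
# Patterns are tried in order, so the more specific ones come first.
ROUTES = [
    ("./Body/Fault", "soap-fault-handler"),
    ("./Body", "soap-server"),
    ("./methodName", "xmlrpc-server"),
    ("./OrderLine", "order-processor"),
]


def strip_namespaces(elem):
    """Reduce every element name to its local name, in place."""
    for node in elem.iter():
        if isinstance(node.tag, str) and "}" in node.tag:
            node.tag = node.tag.split("}", 1)[1]
    return elem


def route(message, default="manual-review"):
    """Pick a handler from structure alone; no schema validation is attempted."""
    try:
        root = strip_namespaces(ET.fromstring(message))
    except ET.ParseError:
        # Still "draconian" about well-formedness: unparseable input
        # is set aside rather than repaired.
        return default
    for pattern, handler in ROUTES:
        if root.find(pattern) is not None:
            return handler
    return default
```

Because the namespace URI is thrown away, an envelope in either the SOAP 1.1 or the SOAP 1.2 namespace matches the same "./Body" rule. That is exactly the kind of shortcut that full namespace and schema processing would forbid, and it will misroute any unrelated vocabulary that happens to use the same element names. Note also that the sketch only tolerates variation among well-formed messages; browser-style recovery from malformed markup is a much bigger job.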