Re: patterns vs. identifiers
[Mike Champion]

> 8/19/2002 7:06:40 PM, "Thomas B. Passin" <tpassin@c...> wrote:
>
> >Well, you do want to remember that we have both computers and people
> >involved. For people only, we want nice readable names and can make a lot
> >out of a little context - plus we understand about furniture when we see
> >"chair", etc. For a computer, you just about need CycL to do anything
> >human-like with "chair", absent a schema-like something or other.
>
> I think this gets to the heart of Simon's point: He's asserting, and
> I'm agreeing, that you DON'T need something like Cyc or a huge
> RDF ontology to disambiguate / figure out how to process markup
> via its context rather than an elaborate system of identifiers.
> You probably won't get the accuracy with a pattern matching
> approach as you do with an identity-determination approach, but
> you may well hit an 80/20 point in actual costs/benefits.
> Elliotte Rusty Harold seems to have made a similar point in the
> "generic xml" thread on the TAG list, and called down the wrath
> of various WAI people for his pains -- you may need strong AI
> to recognize what is a "headline" in a loose XML+CSS system
> rather than a well-known standard, but you can probably make
> a very good guess with some pattern matching heuristics.
>
> XML lives in the middle ground between purely human-driven systems
> and purely machine-driven systems. Compromises are necessary --
> it's got to be somewhat human-authorable, and somewhat machine-
> processable, but if you go too far in either direction you miss
> the point. If there are machines at both ends, you might as well
> use ASN.1 protocols; if there are humans at both ends you might
> as well use PDF or HTML. The point I take from this is that
> if an architecture requires human authors to type long URIs to
> get an unambiguous identity, there are inevitably going to be
> errors that make all that logic moot (Recall the recent post
> about a colleague practically going postal when he discovered that the
> bug he had spent days tracking down was due to using "w3c.org" rather
> than "w3.org" in a namespace URI). In other words, identity-based
> approaches pay for their accuracy and machine-friendliness with
> fragility if the identifiers get screwed up somehow.
>
> Finding the right balance between *easily* machine processable
> markup and *easily* human authorable markup is not trivial; I
> think all Simon's trying to say is to remember the human element,
> both as a part of the system you have to work with and as a
> metaphor for how data can be processed using patterns rather
> than formal identities to associate markup with processes.
>

Actually, I think we agree on just about everything except perhaps how
feasible it would be to have the computer end figure things out from
context, which I still see as fairly hard. And I am definitely aligned
with bringing the balance back more to the human side.

Hmm, heuristics giving less-than-perfect accuracy - it is starting to
sound like quantum computing, where the result is a probabilistic sum of
amplitudes rather than a single logically entailed result. Heuristic
quantum markup, I like it. Come to think of it, the Dirac bra-ket
notation is nearly markup already (e.g., <a>=<m|A>)

Cheers,

Tom P
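
For anyone who wants to see the "w3c.org" vs. "w3.org" failure mode concretely, here is a minimal sketch in Python. It is only an illustration of the trade-off discussed above, not anything from the thread: the function names, the sample documents, and the set of "headline-like" element names are all made up for the example.

```python
import xml.etree.ElementTree as ET

# The real XHTML namespace URI; the typo variant below swaps in "w3c.org".
XHTML_NS = "http://www.w3.org/1999/xhtml"

good_doc = '<h1 xmlns="http://www.w3.org/1999/xhtml">News</h1>'
typo_doc = '<h1 xmlns="http://www.w3c.org/1999/xhtml">News</h1>'

# Hypothetical list of names a heuristic might treat as a headline.
HEADLINE_NAMES = {"h1", "headline", "title"}


def is_headline_by_identity(doc):
    """Identity approach: the namespace URI must match exactly."""
    root = ET.fromstring(doc)
    return root.tag == "{%s}h1" % XHTML_NS


def is_headline_by_pattern(doc):
    """Pattern approach: ignore the namespace, guess from the local name."""
    root = ET.fromstring(doc)
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name.lower() in HEADLINE_NAMES


for label, doc in (("good", good_doc), ("typo", typo_doc)):
    print(label, is_headline_by_identity(doc), is_headline_by_pattern(doc))
```

The identity check is exact but rejects the mistyped document without any error message, while the pattern check accepts both documents and would also accept things that are not really headlines. That is the accuracy/fragility trade-off in miniature.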