Martin Bryan wrote:
> Walter Perry wrote:
> > In fact, aren't we ready to go the whole way and acknowledge that the
> > ontologies addressable at the nodes of a semantic web must in fact *be*
> > executable against the various inputs we put to them?
>
> Not unless we agree a fixed API for returning node data in a given
> language.

No. As I argue regularly at length (and will spare you the full version of here), the scalability and adaptability of a semantic web (or whatever we call this network of autonomous nodes) depends in the first instance on each node's ability to handle significant variation in the presentation of its input.

Considered from its own point of view, each node implements one or more processes. Within its autonomous black box, that node knows what input it requires at the threshold of executing a process, and it knows the output which successful completion of that process produces. From the node's viewpoint, the forms of both its input and its output are semantically fixed. That is, both input and output necessarily exhibit some structure, and such structure necessarily implies some particular understanding of the internal relationships of its constituent data and some epistemological perspective on that body of data as a whole. In other words, the form of the data, together with such incidentals as what is included or omitted, conveys significant semantics.

A 'fixed API', in exhibiting such a structure, necessarily conveys semantics of its own. Those API semantics, however, are unlikely to be the 'native' semantic understanding which both sender and receiver bring to the data that is the substance of any specific exchange between them. In a sufficiently complex network, an agreed fixed API is unlikely to represent fully and accurately the semantic understanding of either party. This has immediate and devastating consequences for the scalability and adaptability of the system as a whole.
If there is a single API across the entire semantic web, then as each node grows more specialized and the interactions among them more complex, an increasing percentage of each node's work will be wasted on conversions into and out of the API with each process it executes. The maintainer or designer of the function implemented at each node will face a similarly increasing percentage of effort squandered in figuring out how to get from the data presented to the data required, in building the ever more complex functionality demanded of each specialized node.

This problem should look very familiar: those of us who have been harnessing processes distributed across different enterprises, national practices and regulations, time zones, hardware and operating system platforms, and interchange protocols have already lived through three generations of proposed solutions just since 1980. Need I point out that it is just this problem which the semantic web proposes to solve, with nodes that understand the substance of their interaction at a semantic, rather than a purely syntactic, level (all of it based on the underlying ability of markup to assure self-describing data)? Fine; then get on with it, but don't introduce the eventually fatal bottleneck of conversion, in every case, through a static intermediate level.

It is just as futile to try to solve this problem by creating a separate API for each class of node-to-node pair interactions. This is the flaw in the agreed vertical-market data vocabularies (ESteel, FpML, etc.--more than 2000 of them when I last gave up trying to keep an accurate count, or to discover whether even one of them operated from a different premise).
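To make the conversion tax concrete, here is a minimal sketch (every name in it is hypothetical, standing in for no real vocabulary) of what a single fixed API imposes on each exchange: the sender flattens its native form into the API form, the receiver re-expands it, and anything outside the agreed API never arrives at all.

```python
# Hypothetical sketch of the fixed-API bottleneck: two conversions per
# exchange, with the sender's expert fields lost in transit.

# The agreed "fixed API" form: a lowest-common-denominator record.
API_FIELDS = {"id", "name", "value"}

def to_api(native: dict) -> dict:
    """Sender-side conversion: anything outside the agreed API is dropped."""
    return {k: v for k, v in native.items() if k in API_FIELDS}

def from_api(record: dict, defaults: dict) -> dict:
    """Receiver-side conversion: native fields the API never carried must
    be defaulted or guessed."""
    native = dict(defaults)
    native.update(record)
    return native

# A sender whose native semantics are richer than the API...
sender_native = {"id": 17, "name": "spot-price", "value": 42.0,
                 "method": "volume-weighted", "as_of": "1999-06-30"}

# ...pays a conversion on the way out, the receiver pays another on the way
# in, and the expert fields ("method", "as_of") are simply gone.
received = from_api(to_api(sender_native),
                    defaults={"method": "unknown", "as_of": None})
```

Multiply those two conversions by every process execution at every node, and the percentage of wasted work grows exactly as described above.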
To the extent that a particular node is truly expert--that is, to the extent that within the semantic web the introduction or application of its unique ontology facilitates a semantic outcome more elaborate, more nuanced, or in some other way richer than the common denominator of the vertical-market vocabulary--that node requires a richer semantic vocabulary to express the output of its process. To my mind, this is precisely why we use *extensible* markup as our basic syntax. For any node to make use of the particular expertise of another surely means using primarily what is unique in the output of that node. This means using as input what falls outside the standard API in the output of that node.

So, in order both to have access to the particular expertise of other nodes, and also to avoid in the general case the proliferating waste of constant conversions into and out of standard APIs, why don't we try a different premise entirely: that it is the responsibility of each node to instantiate for its own purposes (and therefore in its unique native semantics) whatever it might take as input from the unique output of another node. Not only does this put the solution to the problem in the right place philosophically, but as a practical matter it correctly factors the process design, implementation and maintenance tasks incumbent on the expertise of each node.

The node's expectations for input data necessarily reflect its unique understanding of the process it is to perform, the unique ontological structure it instantiates. Part of what is unique about that node is the process by which it gets data from a form in currency outside of it into the unique internal form which directly expresses its epistemology. To share that form with an upstream data provider might well mean sharing, for example, the algorithmic processes by which that form is achieved, which may be the very raison d'etre of this node.
Why undo the useful factoring of a problem which incorporates this node into its solution? It may also be that the data provider's understanding of its output is fundamentally at odds with this node's understanding of that same data as input. This touches directly upon the very nature of reuse. Surely one crucial premise of a semantic web is that different nodes may take the output product of any one node and do vastly different things with it, predicated on entirely different notions of how, within their individual ontologies, the product of that node should be understood. This is why it is a semantic web and not a semantic pipeline, though the thread of any particular sequence of processes within it might be most easily understood as a unidimensional pipe.

And, finally, the practical business of instantiating input data afresh to fit the ontology of each node just isn't that difficult in most cases. The node already knows what it needs, so the 'output' side of the 'transform' (both contentious terms here, I know) is already fixed. To the extent that it is looking for some component of the input which is unique to the data source node, behavior which utilizes that unique element has already been built into this receiver, and with it, presumably, some hints to identify what it is looking for. To the extent that it has previous experience with this data source node, this receiver is already familiar with the forms of data it has received and with how it has previously instantiated that data for its own purposes. Finally, even where input data may appear at first incomprehensible, it is often possible for the receiver to instantiate it by brute force into something usable. The receiver knows, after all, the data semantics which its own processes require.
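A crude sketch of that brute-force instantiation--assuming, purely for illustration, a receiver that needs two named slots and judges candidates solely by its own acceptance test--might look like this:

```python
# Hypothetical sketch: the receiver knows the semantics its own process
# requires (here, two slots) and tries mappings of unfamiliar input fields
# onto those slots until one yields a usable instantiation.
from itertools import permutations
from typing import Optional

REQUIRED_SLOTS = ("quantity", "unit_price")

def usable(instance: dict) -> bool:
    """The receiver's own acceptance test -- the only judge that matters."""
    return (isinstance(instance["quantity"], int) and instance["quantity"] > 0
            and isinstance(instance["unit_price"], float))

def instantiate(foreign: dict) -> Optional[dict]:
    """Work the permutations of the foreign fields against the required
    slots; return the first instantiation the receiver's process can use."""
    fields = list(foreign)
    for candidate in permutations(fields, len(REQUIRED_SLOTS)):
        instance = {slot: foreign[f]
                    for slot, f in zip(REQUIRED_SLOTS, candidate)}
        if usable(instance):
            return instance
    return None

# Input from an unfamiliar node, expressed in that node's own vocabulary:
foreign = {"Anzahl": 12, "Einzelpreis": 9.5, "Waehrung": "EUR"}
result = instantiate(foreign)
```

The receiver never consults the sender's vocabulary at all; it simply keeps trying until it has something its own process accepts, which is exactly the standing claimed in the next paragraph.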
If, by working the permutations of instantiating what it receives into possible versions of what it needs, and then by performing its unique processes upon one of those versions, the receiver is able to achieve a useful--for it--outcome, no one else has the standing to say that this was not a correct instantiation of the input data. The point is that in all these cases there is processing code--unique to each node--to be written, and nothing will magically obviate that chore. This is why I agreed so wholeheartedly with Jonathan Borden's analogy of the specification of an ontology to source code. If the 'ontological node' is going to *do* anything useful then yes, of course, executable code is what is required.

> > Let us submit the same body of input simultaneously to various different
> > diagnostic methodologies--each expressed as an ontology to which we can
> > form a nexus at an addressable node--and, provided that we can retrieve
> > or address the output of each, we can ignore the particulars of what
> > happens in those opaque boxes.
>
> Works OK for short term data, but try looking at medical records over the
> 90 year life of a patient on this basis and you will run into problems.
> Even Jonathan will admit that drugs get reclassified in their life-time.
> You need to know the classification at the time they were administered,
> not the classification today. Opium was de rigueur in our grandparents'
> time. Do you want it administered to your grandchildren?

As I hope I have very nearly exhaustively covered above (and I said that this would not be the version in full!--sorry), this is simply a task fulfilled by the proper factoring of the problem at each node, based upon the unique expertise of that node.
In the example case, if a constellation of symptoms is submitted to multiple diagnostic nodes (good practice), there must then be a node which effectively proxies for this particular patient--or perhaps for this particular patient as treated under the philosophy of his trusted primary care physician--and which evaluates the various diagnoses and prescriptions from the unique perspective which none of the diagnostic nodes can have: that of the unique individual who will implement the next step of the process.

Respectfully,
Walter Perry