
RE: Percentage of XML documents exclusively processed by machines?

  • From: "Cox, Bruce" <Bruce.Cox@USPTO.GOV>
  • To: "Costello, Roger L." <costello@mitre.org>, "xml-dev@l..."<xml-dev@l...>
  • Date: Mon, 12 Dec 2011 13:42:07 -0500

As others have observed, it might depend on the class of documents.  We publish 10,000 to 15,000 documents a week in XML.  These won't be looked at by a person unless there is a problem (rendering with XSLT, loading to database, complaint from content owner, complaint from value-added reseller), or an exception (mega content, i.e., more than 100,000 paragraphs, more than 100,000 embedded images, sequence listings with 1,000,000 or more sequences).  Resolving the problem, or finding a workaround for the exceptions, invariably leads to studying the instance to determine the root cause.
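
To make the "exception" case concrete, here is a minimal sketch of how such a size check could be automated. It is not a description of the actual USPTO pipeline. The thresholds are the ones quoted above; the element names (`p`, `img`, `sequence`) and the function name are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Thresholds from the message above. Element names are hypothetical.
# Each entry maps tag -> (limit, inclusive); inclusive means "limit or more".
LIMITS = {
    "p": (100_000, False),          # more than 100,000 paragraphs
    "img": (100_000, False),        # more than 100,000 embedded images
    "sequence": (1_000_000, True),  # 1,000,000 or more sequences
}


def exception_reasons(source, limits=LIMITS):
    """Return the tags whose counts make this instance a 'mega content' exception.

    `source` is a file name or a binary file object. The document is streamed
    with iterparse so that very large instances are not held fully in memory.
    """
    counts = {tag: 0 for tag in limits}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in counts:
            counts[elem.tag] += 1
        elem.clear()  # drop children, text and attributes already processed

    reasons = []
    for tag, (limit, inclusive) in limits.items():
        over = counts[tag] >= limit if inclusive else counts[tag] > limit
        if over:
            reasons.append(tag)
    return reasons
```

An empty result means the instance can stay on the unattended path; a non-empty one names the counts that would route it to a person.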

At the other extreme, we've just about completed a system that supports the editing of the Manual of Patent Examining Procedure (MPEP) and other similar documents.  A handful of editors use oXygen linked to a Documentum repository to create and update the content.  Even though oXygen provides a WYSIWYG editor, some of the editors prefer to edit certain types of content in the raw XML, especially the hyperlinks (where the target is always a GUID).  They do so even though it meant learning enough about XML to edit it successfully, and even though they are fully aware of the risks.  On the other hand, they have a law degree in addition to their technical degree, and, after all, XML is not rocket science.

Maybe one could generalize: for any class of documents where the volume of instances is large (RSS feeds, news feeds, various kinds of business transactions), the XML will most likely not be viewed by humans unless there is an anomaly; where the volume of instances is low, the XML is likely to be viewed by humans much more frequently.

But even in the low volume case, the intention is that an instance should be processed without human intervention.  When the MPEP content has been revised, and an editor issues the Publish order, the remainder of the processing (creating PDF, HTML, and loading the search engine) should proceed with no further human intervention.

Even though the XML is supposed to be hidden in most circumstances, using business terminology for tag names saves significant time and confusion in getting the business logic in the schema structure right.

Bruce B Cox
OCIO/AED/Software Architecture and Engineering Division
571-272-9004


-----Original Message-----
From: Costello, Roger L. [mailto:costello@mitre.org] 
Sent: 2011 December 3, Saturday 08:13
To: xml-dev@lists.xml.org
Subject: Percentage of XML documents exclusively processed by machines?

Hi Folks,

What percentage of XML documents are exclusively processed by machines?

Allow me to explain what I mean by "exclusively processed by machines."

First, consider the opposite - XML documents that are processed by humans. An XML document is received and then displayed directly to a human. Or the XML document is received, transformed to a visually friendly form such as HTML, and then the visually friendly form is displayed to a human.

XML documents that are exclusively processed by machines don't have a human in the loop. An XML document is received and then processed by a machine. No human ever sees the XML.

Are there any statistics on the percentage of XML documents that are exclusively processed by machines?

I'll take a wild guess and say that 99% of all XML documents are exclusively processed by machines. Is that a reasonable estimate?

/Roger

