RE: Word processors and semantic content
Bruce Cox wrote:

> While the software itself was problematic, the bigger issue was that the
> person creating the document did not use the styles (structures)
> appropriately (abstract tagged as the last claim, for example).

If people can do something 'wrong', they will. The environment allowed them to tag the abstract as the last claim. They are just creating a nice-looking document. They can do so because the only connection between the styles and the way the environment responds is the way things are styled.

> These were folks who knew all about the structure of a patent application
> (professional clerks in very large IP law firms), but had no economic
> motivation to be careful with the markup.

One approach I've heard of is actually paying authors to create valid XML. But I don't think such motivation needs to be economic. It is our experience that people will work in a controlled environment if it helps them do their job. The available structure can actually be used to help people, to guide them while editing. If the complexities of the structure are hidden and the benefits of the structure are emphasized, you will get people motivated to create structured content.

> Creating a successful patent application is the art of conforming to the
> rules of the MPEP, correctly using language to which the courts have
> assigned specific interpretations, disclosing the invention to one of
> ordinary skill in the art while escaping the attention of competitors, and
> still compelling the examiner to allow the application. How do you create
> an authoring tool that enables that process without sacrificing
> sufficient, correct structure?

I think you shouldn't sacrifice structure. The key feature is prevalidation: the ability to hide all actions that would render the document invalid, so that the user never needs to fix validation problems. This is the main reason non-technical people can use Xopus to create structured content.
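
To make the idea of prevalidation concrete, here is a minimal Python sketch. It is not how Xopus is implemented; it only shows the principle of offering the user nothing but the edits that keep the document valid. The `Slot` class, the `PATENT_MODEL` content model, its element names and the cap of 50 claims are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One position in a fixed-order content model (hypothetical)."""
    name: str
    min_occurs: int
    max_occurs: int


# Hypothetical, heavily simplified content model; the cap of 50 is arbitrary.
PATENT_MODEL = [
    Slot("title", 1, 1),
    Slot("description", 1, 1),
    Slot("claim", 1, 50),
    Slot("abstract", 1, 1),
]


def allowed_insertions(model, children):
    """Names the editor may offer for insertion without exceeding max_occurs."""
    counts = Counter(children)
    return [s.name for s in model if counts[s.name] < s.max_occurs]


def allowed_deletions(model, children):
    """Names the editor may offer for deletion without dropping below min_occurs."""
    counts = Counter(children)
    return [s.name for s in model if counts[s.name] > s.min_occurs]


def insert(model, children, name):
    """Insert `name` at the position the fixed element order dictates.

    Assumes `children` already follows the model's order and contains
    only names that appear in the model.
    """
    if name not in allowed_insertions(model, children):
        raise ValueError(f"inserting <{name}> would make the document invalid")
    order = {s.name: i for i, s in enumerate(model)}
    position = len(children)
    for i, child in enumerate(children):
        if order[child] > order[name]:
            position = i  # first child that must come after the new element
            break
    return children[:position] + [name] + children[position:]


if __name__ == "__main__":
    doc = ["title", "description", "claim", "abstract"]
    print(allowed_insertions(PATENT_MODEL, doc))
    doc = insert(PATENT_MODEL, doc, "claim")
    print(doc)
    print(allowed_deletions(PATENT_MODEL, doc))
```

In this sketch the user interface would build its menus from `allowed_insertions` and `allowed_deletions`, so an action such as adding a second abstract is simply never shown instead of being rejected afterwards.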

Prevalidation allows you to specify a document structure that supports the logical structure of a patent application. Specify the required substructures and the element order, and the tool will build and maintain those structures for the user. Less flexibility in the document type definition allows the tool to make more decisions about the structure, which lets the user focus on the content. For instance, fix the element order if reordering doesn't add information; the software will then maintain the order.

Now you can use the structure to add context-sensitive help that explains to the user what type of information is needed in a certain local structure. Since the information and the structure reinforce each other, the rules that define the structure will be clear to the user. The documentation contains the guidelines that can't be specified in the document type definition. You could add checkboxes for content properties that are hard to validate automatically (objectivity, use of correct legal terms, use of jargon, etc.) so the user can manually validate the content according to the rules stated in the documentation.

We have a study on our website [1] of a less complex but somewhat similar case where law reporters use Xopus to create semantically structured case digests.

> the blank-page paradigm would evolve into something friendlier to explicit
> structure

Many people don't understand folder structures in file dialogs and just save their files in My Documents. We can't expect people to mentally map a document structure to their view of the information. This is why WYSIWYG editors are so popular. In order to be able to create structured content, people need to see the end result of their actions. But the whole point of structured content is that it can be more than a screen/page of flowing text. Therefore, to get the general public to start creating semantically structured XML documents, they need to start using applications that require valid mixed-content XML.
Social profiles, wikis or mashups might evolve in that direction.

> Perhaps Google has the best opportunity to do otherwise, but I've seen
> nothing yet to suggest that they will.

So far they have been big advocates of analyzing unstructured content using brute force. The result is that their spider currently can't tell the difference between the main content of a page and the sidebars (or ads!). My guess is that they will continue to add CPU cycles to solve that problem.

[1] http://xopus.com/blog/2007/lexisnexis-buttersworth-case-study/

Best,

Laurens van den Oever
CEO Xopus Company
laurens at xopus.com
http://xopus.com
+31 70 4452345
Waldorpstraat 17G
2521 CA Den Haag
The Netherlands
KvK 27308787