
Re: Identifying XML Document Types (was XML media types revisited)

  • From: Walter Underwood <wunder@i...>
  • To: XML-Dev Mailing list <xml-dev@i...>
  • Date: Fri, 15 Jan 1999 13:57:10 -0800

At 03:12 PM 1/15/99 -0500, Simon St.Laurent wrote:
>
>With XML the expectations (for being able to process documents with both
>specific and generic tools) are much higher, yet the tools for identifying
>document types are actually weaker in many ways.

I'm not sure that things are all that bad. An Excel spreadsheet
can be a lot of different things, but it is always parsed the
same way. Word documents or FrameMaker documents may use different
templates, but the file format is the same. MIME types do a fine
job at that level.

More ambitious schemes for description become more and more
application-specific.

For example, my application is reading XML so that our search
engine can index it. The document features that are important
to a search engine are not specified in DTDs, style-sheets,
schemas, or anything else. We need to know which element is the
title, which is the description, and whether some parts of
the document are more important for search purposes (a bibliography
is less important, a problem description might be more important).
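As a rough illustration of the kind of information meant here, the sketch below maps element names to search fields and weights. It is not how Ultraseek Server works; the element names (headline, summary, problem, bibliography), the weights, and the function name extract_fields are all hypothetical, chosen only to show that this mapping lives outside the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-application mapping: element name -> (search field, weight).
# None of this information comes from a DTD, style sheet, or schema.
FIELD_MAP = {
    "headline": ("title", 3.0),
    "summary": ("description", 2.0),
    "problem": ("body", 1.5),        # problem descriptions count for more
    "bibliography": ("body", 0.5),   # bibliographies count for less
}


def extract_fields(xml_text, field_map=FIELD_MAP):
    """Return (field, weight, text) tuples for every mapped element."""
    root = ET.fromstring(xml_text)  # non-validating parse; no DTD needed
    fields = []
    for elem in root.iter():
        if elem.tag in field_map:
            field, weight = field_map[elem.tag]
            # Flatten nested markup and normalize whitespace.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                fields.append((field, weight, text))
    return fields


SAMPLE = (
    "<report>"
    "<headline>Disk full</headline>"
    "<summary>Server ran out of space</summary>"
    "<problem>Logs grew <b>fast</b></problem>"
    "<bibliography>Ref 1</bibliography>"
    "</report>"
)

for field, weight, text in extract_fields(SAMPLE):
    print(field, weight, text)
```

A different application (legal discovery, say) would want a different mapping for the same document, which is the point: the mapping belongs to the application, not to the document.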

The search engine does not care whether the document is valid
or has a DTD at all, but it does care whether XLink is used
in the document (namespaces do help in this case).
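A minimal sketch of that check, assuming a namespace-aware but non-validating parser: it looks for any attribute in an XLink namespace and never consults a DTD. The namespace URI used here is the one from the later XLink 1.0 Recommendation, which postdates this message, and the function name uses_xlink is made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the XLink 1.0 Recommendation namespace (published after this post).
XLINK_NS = "http://www.w3.org/1999/xlink"


def uses_xlink(xml_text, namespace=XLINK_NS):
    """Return True if any element carries an attribute in the XLink namespace."""
    root = ET.fromstring(xml_text)  # well-formedness only; no validation
    prefix = "{" + namespace + "}"  # ElementTree spells names as {uri}local
    for elem in root.iter():
        if any(name.startswith(prefix) for name in elem.attrib):
            return True
    return False


LINKED = (
    '<doc xmlns:xl="http://www.w3.org/1999/xlink">'
    '<ref xl:type="simple" xl:href="other.xml">see also</ref>'
    "</doc>"
)
PLAIN = '<doc><ref href="other.xml">see also</ref></doc>'

print(uses_xlink(LINKED), uses_xlink(PLAIN))
```

Because the test is on the namespace URI rather than the prefix, it works whatever prefix the author picked, which is where namespaces help.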

Documents are often put to unexpected uses--indexing for
search, legal discovery, corpus linguistics, whatever.
Committing to a document description too early can actually
make a document harder to use. 

In case you're curious, the search engine is a commercial 
product (Ultraseek Server), and has supported simple XML
searching since last September.

wunder

Walter R. Underwood
wunder@i...
wunder@b... (home)
http://www.best.com/~wunder/
1-408-543-6946

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@i... the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

