
Re: XML indexing/search engine


Tim, et al.,

The major new release, Oracle9i Release 2, has new features
for registering XML Schemas with the database and using schema
annotations (or letting the system default for you) to control
how your XML is stored. It can be stored in CLOBs, object/relationally,
or as a mix of the two, with XPath search access (optionally indexed)
over both the CLOB-stored document chunks and the
object/relationally stored chunks. You can work with non-schema
related XML, too, but only with CLOB storage/indexing, not
object/relational.
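
To make "XPath search access" over a pile of small event documents concrete, here is a minimal sketch. It is not Oracle-specific and does not use any Oracle API: it assumes Python 3.7+ and the limited XPath subset in the standard library's ElementTree, and the file names and element names (event, type, agency, title) are hypothetical, not taken from any published schema. It returns the names of the files whose content matches the expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical event documents keyed by file name.
# The element names are assumptions made for illustration only.
EVENTS = {
    "event-001.xml": "<event><type>hearing</type><agency>DPH</agency>"
                     "<title>Public hearing on water quality</title></event>",
    "event-002.xml": "<event><type>workshop</type><agency>RMV</agency>"
                     "<title>License renewal workshop</title></event>",
    "event-003.xml": "<event><type>hearing</type><agency>RMV</agency>"
                     "<title>Hearing on fee changes</title></event>",
}


def matching_files(docs, xpath):
    """Return the sorted names of documents in which the expression matches."""
    hits = []
    for name, text in docs.items():
        root = ET.fromstring(text)
        # find() evaluates the expression relative to the root <event> element.
        if root.find(xpath) is not None:
            hits.append(name)
    return sorted(hits)


print(matching_files(EVENTS, "agency[.='RMV']"))
# prints ['event-002.xml', 'event-003.xml']
```

This scans every document on each query, which is exactly the cost that an indexed store is meant to avoid.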

My XMLEurope 2002 talk and demos on it are at:

  http://www.geocities.com/smuench/ann-xdb-demo.zip

More info about the XML Database features at:

  http://otn.oracle.com/tech/xml/xmldb/

__________________________________________________________________
Steve Muench - Developer, Product Mgr, Java/XML Evangelist, Author
Simplify J2EE and EJB Development with BC4J
http://otn.oracle.com/products/jdev/htdocs/j2ee_bc4j.html
Building Oracle XML Apps, www.oreilly.com/catalog/orxmlapp
----- Original Message -----
From: "Tim Bray" <tbray@t...>
To: "Roth, Scott (ITD)" <Scott.Roth@s...>
Cc: <xml-dev@l...>
Sent: Friday, August 23, 2002 8:04 PM
Subject: Re:  XML indexing/search engine


| Roth, Scott (ITD) wrote:
| > Hi -
| >
| > I am starting to design an application that will be a
| > calendaring/event engine for the State of Massachusetts and all of
| > its agencies (Department of Public Health, Registry of Motor
| > Vehicles, etc...).  We plan on putting an appropriate calendar event
| > schema in place, and then starting to generate 1 XML file per event
| > (public hearing, course, forum, workshop, whatever...).  This will
| > build up quite a large amount of small XML files quickly.  My
| > question is this - what is the best way to store these files for
| > easy indexing and searching?  The actual files will be stored in our
| > content management system, so I am not worried about updating the
| > information - merely being able to efficiently query the collection.
| > Apache's Xindice seems to be the frontrunner so far.  I am
| > envisioning storing the collection in Xindice and returning a
| > nodeset to my XSL that contains file names that match whatever the
| > query was.  The XSL is then free to iterate through each matching
| > file using the document function and grab whatever information for
| > display that the current page requires.  Is there other software
| > that I should be considering?  Other approaches?
|
| The idea of making this information available in XML is a good one and I
| salute Massachusetts for this progressive and sensible move.  Publishing
| the schema is smart too.  Of course, just because you're going to make
| it available in XML doesn't mean you have to store/maintain the data in
| XML.  Could you put an output filter on your content-management system
| and hook it up to the web with one of the many gateway products?
|
| Of course, many CM systems don't take kindly to a high volume of queries
| & exports (as in choke, fall over, die, lock up)... maybe you could
| batch-dump this stuff into a simple rdbms (oracle, mysql, whatever), and
| gateway to that while XMLifying the export; these things tend to search
| well and hold up under query loads.  Does the retrieval really need to
| be full-text or could fielded query search out of an RDBMS handle it?
|
| Summary: XML for export and interchange is totally the way to go.  How
| you get there?  Acronyms that begin with X aren't that relevant.
|
| Now all the XDBMS vendors are going to complain about my lack of
| fidelity to the religion of the XML data model, oh well.
|
| > I am anxious to get this right, as this will be the model for other
| > statewide templatizing applications - for example, press releases.
|
| It shouldn't be *that* hard.  Once you do it, let us know how it went,
| or submit a paper to one of the conferences or something.  -Tim
|
|
| -----------------------------------------------------------------
| The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
| initiative of OASIS <http://www.oasis-open.org>
|
| The list archives are at http://lists.xml.org/archives/xml-dev/
|
| To subscribe or unsubscribe from this list use the subscription
| manager: <http://lists.xml.org/ob/adm.pl>
|
|
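
Tim's suggestion above (batch-dump the events into a simple RDBMS, answer fielded queries there, and XMLify the export) can be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite stands in for "oracle, mysql, whatever", and the table layout, column names, sample rows, and output element names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical rows, as they might be batch-dumped from the CM system.
ROWS = [
    ("event-001.xml", "DPH", "hearing", "2002-09-10", "Public hearing on water quality"),
    ("event-002.xml", "RMV", "workshop", "2002-09-12", "License renewal workshop"),
    ("event-003.xml", "RMV", "hearing", "2002-10-01", "Hearing on fee changes"),
]


def load(conn, rows):
    """Create the (assumed) event table and bulk-insert the dumped rows."""
    conn.execute(
        "CREATE TABLE event ("
        "file TEXT PRIMARY KEY, agency TEXT, type TEXT, day TEXT, title TEXT)"
    )
    conn.executemany("INSERT INTO event VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def export_xml(conn, agency):
    """Run a fielded query and serialize the matching rows as XML text."""
    root = ET.Element("events")
    cursor = conn.execute(
        "SELECT file, agency, type, day, title FROM event "
        "WHERE agency = ? ORDER BY day",
        (agency,),
    )
    for file_name, agency_code, kind, day, title in cursor:
        event = ET.SubElement(root, "event", {"file": file_name})
        ET.SubElement(event, "agency").text = agency_code
        ET.SubElement(event, "type").text = kind
        ET.SubElement(event, "date").text = day
        ET.SubElement(event, "title").text = title
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
load(conn, ROWS)
print(export_xml(conn, "RMV"))
```

The query side stays plain SQL, and XML appears only at the export boundary, which is the division of labor the message argues for.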

