
Re: XML indexing/search engine



Hi Scott,

An emerging technology that may address your needs very well is the
OASIS ebXML Registry [1].

You can think of the ebXML Registry as a union of:

1. A database server capable of storing arbitrary content (including XML
content) and supporting ad hoc queries in SQL and XML query syntax

2. A web server with a content management API defined as a
standards-based web service

3. A Directory server for the internet

4. A publish/subscribe event bus for the internet

The ebXML Registry can be accessed via the Java API for XML Registries
(JAXR) [2]. Royalty-free open source implementations of both the ebXML
Registry and JAXR are available from the ebxmlrr project [3] on
SourceForge and are described in detail at [4].

The ebXML Registry is an international standard approved by OASIS [5]
and developed by the OASIS ebXML Registry TC [6]. JAXR is a standard
Java API developed within the Java Community Process [7].

If you are interested in exploring the ebXML Registry further, please
contact me. Given our close proximity, we can explore this further in a
face-to-face meeting if needed.

[1] OASIS ebXML Registry V2.1 specifications
http://www.oasis-open.org/committees/regrep/documents/2.1/specs/

[2] JAXR API 1.0 Specification
http://java.sun.com/xml/jaxr
http://jcp.org/jsr/detail/093.jsp

[3] ebxmlrr Open Source Project
http://ebxmlrr.sourceforge.net/

[4] Announcement of Open Source ebXML Registry and JAXR Provider
http://sourceforge.net/forum/forum.php?forum_id=197238

[5] OASIS
http://www.oasis-open.org

[6] OASIS ebXML Registry TC
http://www.oasis-open.org/committees/regrep

[7] Java Community Process
http://jcp.org

--
Regards,
Farrukh


<scott.roth@s...>
I am starting to design an application that will be a calendaring/event
engine for the State of Massachusetts and all of its agencies
(Department of Public Health, Registry of Motor Vehicles, etc...).  We
plan on putting an appropriate calendar event schema in place, and then
starting to generate 1 XML file per event (public hearing, course,
forum, workshop, whatever...).  This will build up quite a large amount
of small XML files quickly.  My question is this - what is the best way
to store these files for easy indexing and searching?  The actual files
will be stored in our content management system, so I am not worried
about updating the information - merely being able to efficiently query
the collection.  Apache's Xindice seems to be the frontrunner so far.  I
am envisioning storing the collection in Xindice and returning a nodeset
to my XSL that contains file names that match whatever the query was.
The XSL is then free to iterate through each matching file using the
document function and grab whatever information for display that the
current page requires.  Is there other software that I should be
considering?  Other approaches?

I am anxious to get this right, as this will be the model for other
statewide templatizing applications - for example, press releases.
</scott.roth@s...>
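
For reference, the query-then-return-file-names flow Scott describes
could be sketched roughly as below, using only the XPath support that
ships with the JDK. This is an illustration under assumptions, not
Xindice or ebXML Registry code: the class name EventQuery, the file
names, and the <event>/<agency>/<type>/<title> element names are all
hypothetical, since the actual event schema has not been defined yet.
It also scans every document in memory rather than using an index, so
it only shows the shape of the lookup, not how a real store would scale.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class EventQuery {

    /**
     * Returns the names of all documents for which the XPath expression
     * evaluates to true (for a node-set, that means "is non-empty").
     * Documents are scanned one by one; no index is involved.
     */
    static List<String> matchingFiles(Map<String, String> docs, String xpathExpr) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            InputSource source = new InputSource(new StringReader(entry.getValue()));
            try {
                Boolean hit = (Boolean) xpath.evaluate(xpathExpr, source, XPathConstants.BOOLEAN);
                if (hit) {
                    matches.add(entry.getKey());
                }
            } catch (XPathExpressionException e) {
                // Bad expression or unparsable document: report which file failed.
                throw new IllegalArgumentException("Query failed for " + entry.getKey(), e);
            }
        }
        return matches;
    }

    /** Three tiny event documents using a made-up schema (assumption). */
    static Map<String, String> sampleEvents() {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("event-001.xml",
            "<event><agency>DPH</agency><type>hearing</type><title>Public hearing</title></event>");
        docs.put("event-002.xml",
            "<event><agency>RMV</agency><type>workshop</type><title>Licensing workshop</title></event>");
        docs.put("event-003.xml",
            "<event><agency>RMV</agency><type>hearing</type><title>Fee schedule hearing</title></event>");
        return docs;
    }
}
```

Calling EventQuery.matchingFiles(EventQuery.sampleEvents(),
"/event[type='hearing']") yields the list of matching file names, which
is the kind of node set the XSL would then walk with the document
function.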




