Industrial Strength XML Serving

  • From: John Robert Gardner <jrgardn@e...>
  • To: "'xml-mailinglist'" <xml-dev@i...>
  • Date: Thu, 7 Oct 1999 09:23:56 -0400 (EDT)

I'm venturing this question as a general call for input--and pitches--with
regard to the following project we're undertaking:

	750,000 pages of journals, in both text form and gif images for 
		"canonical preservation" and cross-check

	Typed text version,  in XML (using TEI largely) yielding 
		~400,000,000 words (our initial estimates suggest 
		something in the range of 30-50 gigs of total content 
		including gifs), averaging ~60,000,000 tag nodes, 
		searchable based on content of tags (word strings), 
		element hierarchy, and attribute values, with final form
		changing infrequently (archival/institutional memory)

	Primary access point being MARC records we're rendering into
		highly granular XML, for crosswalking to DC/RDF/GILS
		(we're starting with some 200 megs of MARC records alone)
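
For a rough sense of scale, the figures above can be turned into per-page averages. This is only a back-of-envelope sketch: the three totals are the estimates quoted above, and the derived averages are arithmetic, not measurements.

```python
# Totals as estimated in the message above.
pages = 750_000
words = 400_000_000
tag_nodes = 60_000_000

# Derived averages (pure arithmetic on the estimates).
words_per_page = words / pages      # average words on one journal page
nodes_per_page = tag_nodes / pages  # average tag nodes on one journal page
words_per_node = words / tag_nodes  # average words of text per tag node

print(f"words per page: {words_per_page:.0f}")
print(f"tag nodes per page: {nodes_per_page:.0f}")
print(f"words per tag node: {words_per_node:.1f}")
```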

I've been asking offlist for possible consultants as our systems staff has
a strong inclination to Oracle 8i and I'm hardly fluent enough on such
software to argue based upon what I know.  Based on Oracle's white paper,
it sounds viable . . . however:

In some of my offlist correspondence, I've detected a dichotomy.  One view
is that "it doesn't matter if it's XML, pizzas, or washing machines
you're storing, it's the size that counts (no pun intended)" -- so
Oracle's great.  On the other side is a sense that 8i's newness makes it a
potential unknown at this size in XML (we'll also likely be
subcontracting the serving of the gifs, likely out-of-state).  The
implication was that there are more SGML/XML-native packages out there if
we have the budget (we do, within limits: commissioning a whole new
software package, say, is out of the question). :)

Our project is perhaps one of the best funded efforts in the humanities in
markup for some time, and surely in a class by itself vis-a-vis XML.  As it's
likely to serve as a model and case study in various senses, I really want
to be sure we commit to the "right" road on this, and be sure of our options
along that road.  The vision I'm implementing from the XML side is meant
to go beyond another research resource to a full-scale research
environment which exploits XSLT to make our material accessible--e.g., the
MARC--in multiple tag vocabularies (DC, RDF, GILS, etc.), as well as very
sophisticated construction of the resources found through the search (e.g.,
with DOM, etc.).
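
To make the crosswalking idea concrete, here is a minimal sketch of mapping one MARC-derived XML record into Dublin Core elements. It uses Python's standard ElementTree in place of XSLT, purely for illustration. The input element names (`record`, `datafield`, `subfield`), the sample record, and the small mapping table are all assumptions made for the example, not the project's actual schema or crosswalk.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical crosswalk table: (MARC tag, subfield code) -> DC element name.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}

# Hypothetical sample record; the real granular XML may look quite different.
SAMPLE = """<record>
  <datafield tag="100"><subfield code="a">Doe, Jane</subfield></datafield>
  <datafield tag="245"><subfield code="a">Journal of Example Studies</subfield></datafield>
  <datafield tag="260">
    <subfield code="b">Example Press</subfield>
    <subfield code="c">1899</subfield>
  </datafield>
</record>"""


def marc_to_dc(record_xml):
    """Return a <metadata> element holding DC elements for mapped subfields."""
    record = ET.fromstring(record_xml)
    dc = ET.Element("metadata")
    for field in record.findall("datafield"):
        for sub in field.findall("subfield"):
            name = CROSSWALK.get((field.get("tag"), sub.get("code")))
            if name is None:
                continue  # subfield has no entry in the crosswalk table
            element = ET.SubElement(dc, f"{{{DC_NS}}}{name}")
            element.text = (sub.text or "").strip()
    return dc


dc_record = marc_to_dc(SAMPLE)
print(ET.tostring(dc_record, encoding="unicode"))
```

Keeping the mapping in a table is what lets the same record be rendered into several vocabularies: a second table would cover another target without touching the traversal.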

At any rate, this question in no way replaces either my offlist
inquiries for a consultant or the input received thus far.  Instead, since
the vichyssoise is not yet ready to be stirred, nor even on the stove,
all chefs are needed-- if there is a better mousetrap to be made without
a reinvention of the wheel, now's the time to know.

TYIA, 

jr

=-=-=-=-=-=-=-=-=-==-=-=-=
John Robert Gardner, Ph.D.
XML Engineer
ATLA-CERTR
------------------------------------------------------------
http://vedavid.org/



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)


