Industrial Strength XML Serving


From: John Robert Gardner <jrgardn@e...>
To: "'xml-mailinglist'" <xml-dev@i...>
Date: Thu, 7 Oct 1999 09:23:56 -0400 (EDT)

I'm venturing this question as a general call for input--and pitches--with
regard to the following project we're undertaking:

750,000 pages of journals, in both text form and gif images for
"canonical preservation" and cross-check

Typed text version, in XML (using TEI largely) yielding
~400,000,000 words (our initial estimates suggest
something in the range of 30-50 gigs of total content
including gifs), averaging out to ~60,000,000 tag nodes,
searchable based on content of tags (word strings),
element hierarchy, and attribute values, with final form
changing infrequently (archival/institutional memory)

Primary access point being MARC records we're rendering into
highly granular XML, for crosswalking to DC/RDF/GILS
(we're starting with some 200 megs of MARC records alone)
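
To make the three search axes above concrete (content of tags, element hierarchy, and attribute values), here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The sample document, its `type` and `n` attributes, and the `search` helper are all hypothetical illustrations, not part of the project. It parses everything into memory, so it only shows the shape of the queries; at the sizes quoted above, an indexed store of some kind would be needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like fragment; element and attribute names are assumptions.
SAMPLE = """<TEI>
  <text>
    <body>
      <div type="article" n="1">
        <head>On Preservation</head>
        <p>Canonical preservation of journals matters.</p>
      </div>
      <div type="review" n="2">
        <head>Short Notice</head>
        <p>A review of markup practice.</p>
      </div>
    </body>
  </text>
</TEI>"""


def search(root, path, attr=None, value=None, phrase=None):
    """Return elements matching a hierarchy path, an optional attribute
    value, and an optional case-insensitive word string in their text."""
    hits = []
    for elem in root.findall(path):            # element hierarchy
        if attr is not None and elem.get(attr) != value:
            continue                           # attribute value
        text = "".join(elem.itertext())        # content of the tag and its children
        if phrase is not None and phrase.lower() not in text.lower():
            continue
        hits.append(elem)
    return hits


root = ET.fromstring(SAMPLE)
articles = search(root, "./text/body/div", attr="type", value="article")
with_phrase = search(root, ".//div/p", phrase="markup practice")
```

Here `articles` holds the `div` elements found by hierarchy plus attribute value, and `with_phrase` holds the paragraphs found by hierarchy plus word string.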

I've been asking offlist for possible consultants as our systems staff has
a strong inclination to Oracle 8i and I'm hardly fluent enough on such
software to argue based upon what I know. Based on Oracle's white paper,
it sounds viable . . . however:

In some of my offlist correspondence, I've detected a dichotomy between
the view that "it doesn't matter if it's XML, pizzas, or washing machines
you're storing, it's the size that counts (no pun intended)" -- so
Oracle's great. On the other side is a sense that 8i's newness is a
potential unknown for such size in XML (we'll also likely be
subcontracting the serving of the gifs, likely out-of-state). The
implication was that there were more SGML/XML-native packages out there if
we have the budget (we do, within the limits that, say, commissioning a
whole new software package is out of the question). :)

Our project is perhaps one of the best funded efforts in the humanities in
markup for some time, and surely in a class by itself with respect to XML.
As it's likely to be a model and case study in various senses, I really
want to be sure we commit to the "right" road on this, and be sure of our
options along that road. The vision I'm implementing from the XML side is
meant to go beyond another research resource to a full-scale research
environment which exploits XSLT for having our stuff accessible--e.g., the
MARC--in multiple tag vocabularies (DC, RDF, GILS, etc.), as well as very
sophisticated construction of the resources found through the search (e.g.,
with DOM, etc.).
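
As a rough illustration of what exposing the MARC records in another tag vocabulary involves, the sketch below maps three MARC fields to Dublin Core elements. It is a table-driven stand-in written in Python, not XSLT, and the `record`/`datafield`/`subfield` element names are assumed for the example, since the project's own XML rendering of MARC is not described here. The three mappings used (245 $a to title, 100 $a to creator, 260 $c to date) are common crosswalk choices; a real crosswalk would cover many more fields.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# (MARC tag, subfield code) -> Dublin Core element name.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "c"): "date",
}

# Hypothetical XML rendering of one MARC record; the markup is an assumption.
MARC_SAMPLE = """<record>
  <datafield tag="100"><subfield code="a">Doe, Jane</subfield></datafield>
  <datafield tag="245"><subfield code="a">A Sample Journal</subfield></datafield>
  <datafield tag="260"><subfield code="c">1950</subfield></datafield>
  <datafield tag="300"><subfield code="a">12 p.</subfield></datafield>
</record>"""


def marc_to_dc(record):
    """Build a Dublin Core element tree from the mapped subfields of a record."""
    ET.register_namespace("dc", DC_NS)
    out = ET.Element("metadata")
    for field in record.findall("datafield"):
        for sub in field.findall("subfield"):
            dc_name = CROSSWALK.get((field.get("tag"), sub.get("code")))
            if dc_name is None:
                continue  # unmapped fields are skipped in this sketch
            child = ET.SubElement(out, "{%s}%s" % (DC_NS, dc_name))
            child.text = (sub.text or "").strip()
    return out


dc = marc_to_dc(ET.fromstring(MARC_SAMPLE))
dc_xml = ET.tostring(dc, encoding="unicode")
```

Keeping the mapping in a table means a second target vocabulary could be added as another table rather than another code path, which is roughly the role separate XSLT stylesheets would play.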

At any rate, this question in no way replaces either my offlist
inquiries for a consultant or their input thus far. Instead, since
the vichyssoise is not yet ready to be stirred, nor even on the stove,
all chefs are needed -- if there is a better mousetrap to be made without
a reinvention of the wheel, now's the time to know.

TYIA,

=-=-=-=-=-=-=-=-=-==-=-=-=
John Robert Gardner, Ph.D.
XML Engineer
ATLA-CERTR
------------------------------------------------------------
http://vedavid.org/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

Follow-Ups:
- RE: Industrial Strength XML Serving
  - From: "Steve Muench" <smuench@u...>
- Re: Industrial Strength XML Serving
  - From: "Michael Champion" <michael_champion@a...>

References:
- problem with input form
  - From: sunker@t...

