
Re: More on Vector Models

lost vectors
On Wed, 4 May 2005 3:39 am, Bullard, Claude L (Len) wrote:
> Not at all.  There is no implied or explicit profit
> motive to 'move on'.  There are
> problems that markup doesn't solve, but possibly
> there are also old solutions that can improve markup.
> That is why I was querying Steve DeRose.  His is a
> world class mind with lots of experience in this
> and other fields.  From time to time, the idea of
> combining vector techniques with markup comes up.
> Bosworth's presentation is another stimulus.

ok, well I'm lost. Vectors are a simple mathematical
paradigm. How do they apply to xml? Or is it just
a new type of marketing speak?

> 1.  Vector space models are old.  (See Salton et al).
> VSM technologies incorporate a set of techniques
> that have been refined over the years to enable such
> things as normalization, increased use of probability,
> relaxed constraints on term independence, use of the
> document vectors to get relevance feedback, etc.

I've obviously been out doing other things... :-)

> 2. One doesn't move on to the next big thing.  One
> looks at the data environment and builds systems that
> cope with what is as is and then possibly, pushes it
> to be otherwise.


> Again, in the record systems I see, there is far more
> unstructured text data than any other kind.  

What kind of record systems are they?

> XML is a step
> above the level of 'bag o' words' which is the level
> where VSM thrives.   The question is, given VSM and
> 'bag o' words', when should one move on to markup?

Good question.

Traditionally, that would have been after hundreds of
thousands of dollars had gone into an IT project that didn't
quite work, and it was "implementation" time.

But if you have something like an Accounting system
that runs on XML, maybe the time is when you need
to do that next customer invoice.

As a contractor, it always seemed dumb to me that
big companies would spend lotsa money on some
sort of business 'communications' system, but when
you wanted to put in your own invoice to get paid, 
there was no easy system to do it.

And then they all got fed up, and that was the end
of the IT boom. ca-boom as they say.

As IT workers, we are now left with all these medium-sized
enterprises, as opposed to large ones, which the
university graduates don't want to go and work
for because they don't pay the big bucks (like we
used to be able to easily get).

But conversely, these medium-sized orgs are much
less rigid than their larger counterparts. They'll give
any sort of xml technology a go provided it works.

So I think the IT world has moved on. It's a far more
interesting place than it was even in 1999 I would

I'm yet to figure out vector models, but it sounds
like what people used to refer to as "document
indexing". Maybe I haven't caught up with the
latest terminologies.
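
For anyone else trying to picture it, here is a rough sketch of the
'bag o' words' vector idea from the quoted text: each document becomes
a vector of term counts, and a query is matched by the cosine of the
angle between vectors. This is only an illustration under simple
assumptions. The function names and the sample documents are made up,
the counts are raw (no tf-idf weighting or relevance feedback), and it
is not taken from any particular product or from Salton's systems.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: maps each lowercased term to its raw count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, name) pairs, best match first."""
    q = term_vector(query)
    scored = [(cosine_similarity(q, term_vector(text)), name)
              for name, text in docs.items()]
    return sorted(scored, reverse=True)


# Hypothetical sample documents, invented for this example only.
DOCS = {
    "invoice": "customer invoice for accounting system",
    "memo": "staff memo about the office picnic",
    "spec": "xml markup specification for document indexing",
}

if __name__ == "__main__":
    for score, name in rank("customer invoice", DOCS):
        print(f"{name}: {score:.3f}")
```

Note that nothing here knows about markup: any element structure would
be flattened into the same bag of terms, which is the level the quoted
text says VSM works at.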


Computergrid : The ones with the most connections win.

