Re: indexing and querying XML (not XQuery)
Alan Gutierrez wrote:
> * Robert Koberg <rob@k...> [2005-08-23 10:42]:
>
>> Bullard, Claude L (Len) wrote:
>>
>>> Index what? Ideas, ideas emerging from conversations, the conversations?
>>> So far, what you are describing seems to be Google. Can you out Google
>>> Google?
>>
>> It is not like Google. Google indexes HTML and it gives better rankings
>> to well marked up (according to Google) HTML (which is why small
>> companies like us can get page rankings as high or higher than much
>> larger companies).
>>
>> With an XML indexer, you can index glossentries, faqs, quizzes, whatever
>> and keep them separate so if you want to run a query against just faqs,
>> you can.
>>
>> You can do a search to get all external links (we distinguish between
>> external, internal and whatever other kind of links there might be) and
>> validate them.
>>
>> You can also use the searches to do things you might do with XQuery
>> (again, I don't know XQuery...). For example, in our CMS we have the
>> concept of page regions. Content pieces are assigned to folder/page
>> regions. Say I want to find out where a content piece has been assigned.
>> I can run a query on all assignments to return references to the
>> pages/folders where it has been assigned. You can do searches for all
>> users in a particular group, all projects that a user has access to,
>> etc., etc.
>
> Which is why I'd propose defining a full-text schema language,
> so XML content can be described to a full-text search engine.

It does sound very interesting. How would it work? What would it look like?

I have tried doing this with XML Schema but gave up. I tried to use
annotations to give weight to different things, then I tried to make a
type system. For me, it was just easier to write Java to handle it. Now I
write org.xml.sax.ext.DefaultHandler2 subclasses that suit my needs. I
know, not very scalable or user friendly.

best,
-Rob

> The language would permit ranking based on markup, define what
> constitutes a document, what constitutes a document collection, etc.
>
> --
> Alan Gutierrez - alan@e...
> - http://engrm.com/blogometer/index.html
> - http://engrm.com/blogometer/rss.2.0.xml
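
For readers wondering what the hand-written SAX approach mentioned above might look like, here is a minimal sketch. It is not Rob's actual code: the class name TypedIndexer, its methods indexDocument and query, and the sample document are all hypothetical. It only illustrates the idea of keeping element types such as faq and glossentry in separate buckets, so that a query can be run against just one of them. Ranking or weighting by markup is not covered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical sketch: collects the text of selected element types into
// separate buckets so that each type can be searched on its own.
class TypedIndexer extends DefaultHandler2 {
    private final Set<String> indexedTypes;
    private final Map<String, List<String>> entries = new HashMap<>();
    private final StringBuilder buffer = new StringBuilder();
    private String currentType; // non-null while inside an indexed element
    private int depth;          // nesting depth within that element

    TypedIndexer(Set<String> indexedTypes) {
        this.indexedTypes = indexedTypes;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (currentType == null) {
            if (!indexedTypes.contains(qName)) {
                return; // not an element type we index
            }
            currentType = qName;
            buffer.setLength(0);
            depth = 0;
        } else {
            buffer.append(' '); // keep text of sibling child elements apart
        }
        depth++;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (currentType != null) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (currentType == null) {
            return;
        }
        depth--;
        if (depth == 0) { // the indexed element itself has closed
            String text = buffer.toString().replaceAll("\\s+", " ").trim();
            entries.computeIfAbsent(currentType, k -> new ArrayList<>()).add(text);
            currentType = null;
        }
    }

    // Case-insensitive substring search restricted to one element type.
    List<String> query(String type, String term) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase();
        for (String text : entries.getOrDefault(type, Collections.emptyList())) {
            if (text.toLowerCase().contains(needle)) {
                hits.add(text);
            }
        }
        return hits;
    }

    // Parses an XML string and returns the populated indexer.
    static TypedIndexer indexDocument(String xml, Set<String> types) throws Exception {
        TypedIndexer handler = new TypedIndexer(types);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><faq><q>What is SAX?</q><a>A streaming API.</a></faq>"
                + "<glossentry>XQuery: a query language.</glossentry></doc>";
        TypedIndexer idx = indexDocument(xml, Set.of("faq", "glossentry"));
        System.out.println(idx.query("faq", "sax"));
    }
}
```

A handler like this hard-codes what counts as an indexable unit, which is the scalability problem the reply acknowledges. A declarative full-text schema of the kind proposed in the quoted text would move that choice out of the code.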