Regular Expression search
Michael Kay mhk at mhk.me.uk
Fri Dec 16 10:10:53 PST 2005

> Search and querying are very different. Search is basically for
> document-centric XML (like XHTML), whereas querying is for
> data-centric XML (like invoices, etc). If you're using regular
> expressions for data-centric XML, then I'd say you have a design
> flaw - but not if you are using them for document-centric XML.

That seems very simplistic to me, for a number of reasons.

(1) The distinction between document-centric and data-centric is not a hard-and-fast one. If you take any real application, for example a system for managing insurance claims, then it contains a spectrum of information from highly-structured to very loosely-structured. One of the big benefits of XML is that we can now handle this full spectrum using a single technology.

(2) XML structures are often designed primarily for information interchange, not for storage and query. The database often needs to contain the message as transmitted or received. The fact that the XML design is not optimized for query is not a design flaw; it is a consequence of the fact that information interchange rather than query is now the primary driver.

(3) I can think of many perfectly good reasons for using regular expressions to search highly structured data, even when it was designed primarily for querying. For example, if I receive an invoice that's damaged in the post and I can't quite read the purchase order number, I might want to do a regular expression search for the parts of the number that I can read.

(4) Any argument that says "in data-centric XML there should be no implicit structure in textual fields, it should all be denoted by explicit markup" can be applied equally well to document-centric XML. In both cases the argument is false: it's entirely reasonable to store a UK postcode such as "RG4 7BS" as a single string even though the "RG4" on its own carries meaning; similarly dates, part numbers, etc. The granularity of markup involves a design compromise; you can't argue that finer-grained markup is always better.

Michael Kay
http://www.saxonica.com/
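
As a rough illustration of the damaged-invoice case in point (3), here is a minimal Python sketch rather than anything taken from the post. The purchase order numbers, the `?` placeholder convention, and the `candidates` helper are all hypothetical. In XQuery itself the standard `fn:matches` function would play the role that `re.fullmatch` plays here.

```python
import re

# Hypothetical purchase order numbers already held in the system.
stored_po_numbers = ["PO-104729", "PO-104829", "PO-204729", "PO-104720"]


def candidates(readable, known):
    """Return the stored numbers consistent with a partly legible one.

    `readable` uses '?' for each character that could not be read
    (an assumed convention); every other character is matched literally.
    """
    # Build a regex: '?' becomes '.', anything else is escaped verbatim.
    pattern = "".join("." if ch == "?" else re.escape(ch) for ch in readable)
    # fullmatch ensures the whole stored number fits the legible parts.
    return [number for number in known if re.fullmatch(pattern, number)]


# One digit of the number on the damaged invoice is illegible.
matches = candidates("PO-104?29", stored_po_numbers)
print(matches)
```

The search narrows the field to the numbers that agree with every legible character, which is the kind of query that stays useful even when the data is highly structured.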






