Re: JITTs and DOM
Hi Rick,

> Would it be more "XML"-ish to hang preprocessing off namespaces?
>
> So that you provide a preprocessor with a list of allowed namespaces
> or prefixes and strip the tags of the rest? "Ignore html1: and strip
> rdf:" for example.

Filtering by namespace is definitely what I believe people term "low-hanging fruit" (that phrase always reminds me of a guy who mistakenly talked about his "low-hanging plums" [1]). Using different namespaces (or combinations of namespaces) is certainly a very easy way of identifying different hierarchies, but I don't think that it covers all the use cases.

The main problem is what happens when you have markup that should be shared between two hierarchies. Taking the bible example, we might want to extract two hierarchies:

  /bible/testament/book/chapter/verse
  /bible/testament/book/section/para

I guess that you could place anything that's "common" in yet another namespace, so we end up with three:

  /bib:bible/bib:testament/bib:book/log:chapter/log:verse
  /bib:bible/bib:testament/bib:book/phys:section/phys:para

but (a) too many namespaces spoil the markup (make it harder to read etc.) and (b) if you're making divisions like these, you're effectively dictating from the outset which structures can be extracted from the data, which I think goes against the principle of descriptive markup (i.e. describe the data; let the applications choose what to do with it).

> There would then be kind of overlapping WF check easily possible,
> just checking that all elements of each prefix/namespace form a
> balanced tree: one tree per namespace. (With some scoping
> conventions for xmlns declarations.)

If we're "coming down from the trees" (to use Patrick's phrase), I'm also uncomfortable with the notion of stating that the elements/ranges in each namespace must form a tree. That would prevent, for example, a "CommentryML" namespace for comments that overlap, which I think is an important use case for these technologies.

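To make the two ideas above concrete, here is a rough sketch of prefix-based stripping and of the "one tree per namespace" check. It is only an illustration under simplifying assumptions: tags are matched with a regular expression, only plain prefixed start and end tags are recognised (no attributes, no empty-element tags, no xmlns scoping), and the function names strip_prefixes and unbalanced_prefixes, like the c: prefix in the usage note, are made up for this example.

```python
import re
from collections import defaultdict

# Matches simple prefixed tags such as <log:verse> or </phys:para>.
# Assumption: no attributes, no empty-element tags, no xmlns handling.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*):([A-Za-z_][\w.-]*)>")


def strip_prefixes(text, keep):
    """Remove tags whose prefix is not in `keep`, leaving their content in place."""
    return TAG.sub(lambda m: m.group(0) if m.group(2) in keep else "", text)


def unbalanced_prefixes(text):
    """Return the prefixes whose own tags do not nest as a balanced tree."""
    stacks = defaultdict(list)  # one stack of open element names per prefix
    bad = set()
    for m in TAG.finditer(text):
        closing, prefix, name = m.group(1), m.group(2), m.group(3)
        if not closing:
            stacks[prefix].append(name)
        elif stacks[prefix] and stacks[prefix][-1] == name:
            stacks[prefix].pop()
        else:
            # End tag that does not close the innermost open tag of this prefix.
            bad.add(prefix)
    # Anything still open at the end is also unbalanced.
    bad.update(prefix for prefix, stack in stacks.items() if stack)
    return bad
```

With an input such as "<log:verse>In the <phys:para>beginning</log:verse> was</phys:para>", keeping only the log prefix leaves a single well-nested log:verse element, and unbalanced_prefixes reports nothing, because each prefix is balanced on its own. Two overlapping comments in one namespace, for example "<c:note>a <c:remark>b</c:note> c</c:remark>", are reported as unbalanced, which is exactly the case a per-namespace tree rule would forbid.
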
[By the way, the scoping of namespace declarations is a tricky area when you get into overlapping markup; in LMNL we manage it by distinguishing namespaces-for-markup (which are declared with [!ns] declarations that scope to "the rest of the document") and namespaces-for-content (which are declared as normal annotations that scope to "the content of the range" or whatever else is appropriate for the particular application).]

Cheers,

Jeni

[1] 'plums' is slang for 'testes' in England.

---
Jeni Tennison
http://www.jenitennison.com/