Subject: Re: Was: mode and moved to Namespaces
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 20 Apr 2011 09:35:13 +0200
Hi ac,
This thread is rather long, so please forgive me if I've misunderstood
anything, but I'd like to add my thoughts to the discussion.
You seem to want to use namespaces as tags for names, which is not what
they're intended for. Your reason for doing so is space saving,
but if space is an issue, don't use XML. If it's about the memory
footprint, it doesn't matter whether your nodes have namespaces, because
practically every node takes approximately the same amount of memory,
regardless of its name or kind (there's an older thread by Michael Kay
where he explains how much memory each node takes). In other words, your
argument about size doesn't hold.
A namespace is prefix-agnostic. That means that, if <en:word /> is
bound to namespace "http://example.com/french", and <fr:word /> is
bound to it too, both qualified names are equal, because names are
compared by namespace URI and local name, not by prefix. Treating them
differently is wrong design.
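A minimal sketch of that point, using Python's standard-library ElementTree parser purely as an illustration (the two one-element documents are made up for the example): a namespace-aware parser reports the same expanded name for both spellings.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: two different prefixes bound to the same URI.
doc_a = '<en:word xmlns:en="http://example.com/french"/>'
doc_b = '<fr:word xmlns:fr="http://example.com/french"/>'

tag_a = ET.fromstring(doc_a).tag
tag_b = ET.fromstring(doc_b).tag

# ElementTree exposes the expanded name as "{namespace-uri}local-name";
# the prefix is not part of it.
print(tag_a)           # {http://example.com/french}word
print(tag_a == tag_b)  # True
```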
The real problem, however, comes from portability and understandability.
You redefine namespaces as nothing more than a tag or
prefix. That makes your solution unportable and no longer machine-readable.
For example, if a simple identity transform took all namespace
prefixes and replaced them with ns1, ns2, etc. (but left the namespace
itself, and hence the qualified names, intact), your application would
fail. However, such transformations are quite common in XML and totally
legal.
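The prefix-rewriting scenario can be reproduced in a few lines. This is only a sketch: the namespace URIs and the two helper functions are assumptions for illustration, and Python's ElementTree stands in for "some tool in the pipeline" because its serializer happens to invent fresh prefixes (ns0, ns1, ...) for namespaces it has no registered prefix for.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the thread never states the real ones.
EN = "http://example.com/lang/en"
EN_F = "http://example.com/lang/en/feminine"

source = (
    f'<dic xmlns:en="{EN}" xmlns:f-en="{EN_F}">'
    '<word en:title="Mr" f-en:title="Mrs"/>'
    '</dic>'
)

# Parse and re-serialize: the URIs are preserved, the prefixes are not.
rewritten = ET.tostring(ET.fromstring(source), encoding="unicode")


def feminine_by_prefix(xml_text):
    """Prefix-based check, in the spirit of starts-with(name(.), 'f-')."""
    return "f-en:title" in xml_text


def feminine_by_uri(xml_text):
    """URI-based lookup: independent of whatever prefix is in use."""
    word = ET.fromstring(xml_text).find("word")
    return word.get(f"{{{EN_F}}}title")


print(feminine_by_prefix(source), feminine_by_prefix(rewritten))  # True False
print(feminine_by_uri(source), feminine_by_uri(rewritten))        # Mrs Mrs
```

Both documents are equivalent as far as XML namespaces are concerned; only the code that looks at prefixes notices a difference.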
By redefining what a namespace means (or, more specifically, by ignoring
its real meaning and making it part of the local-name, which is
basically what you are doing), you stop using XML the way it was meant to
be used. Your XML in and of itself is still compliant, but your applications
and how they treat XML are not. That's a choice, but if you go down that
path, you may as well choose your own format, which will give you
far better results in performance, space and requirements.
----
Back to your real problem: suppose we accept that you need to use XML
and that you do not want to abuse namespaces for something they're not.
How could we tackle your issue? I'd go for a straight structure and use
what's already there:
<word type="title" xml:lang="en-GB" gender="female">Mrs</word>
<word type="title" xml:lang="fr-FR" gender="female">Mme</word>
This is the approach Microsoft chooses (or at least a similar one) in Word-ML,
which looks big, but is quite workable. Now, suppose you want to
minimize the disk footprint (as already said, the memory footprint will
be largely the same regardless); you could do something like this:
<word type="title" en-m="Mr" en-f="Mrs" fr-m="M." fr-f="Mme"/>
As it turns out, this is actually smaller than your
namespace-oriented approach. If you really want the type of the word in
each and every attribute name, and to split the attribute name later, you can
do that, but code smell ahead! Something like:
<word en-title="Mr" en-f-title="Mrs" fr-title="M." fr-f-title="Mme" ... />
But really, you shouldn't go down that path: it has exactly the same
drawbacks as your namespace approach (albeit slightly more
extensible). It will backfire once you start using it.
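To illustrate why the first structure (one <word> element per language, tagged with xml:lang) is workable, here is a rough sketch in Python's standard-library ElementTree. The wrapping <dic> element and the `lookup` helper are assumptions made for the example; the two <word> entries are the ones shown earlier. Language, type and gender are ordinary values here, so they can be chosen at runtime without building names or queries on the fly.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the reserved XML namespace, which parsers bind implicitly.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The <dic> wrapper is assumed; the <word> entries come from the message.
dictionary = ET.fromstring(
    '<dic>'
    '<word type="title" xml:lang="en-GB" gender="female">Mrs</word>'
    '<word type="title" xml:lang="fr-FR" gender="female">Mme</word>'
    '</dic>'
)


def lookup(root, lang, **criteria):
    """Text of every <word> in `lang` whose attributes match `criteria`."""
    return [
        word.text
        for word in root.iter("word")
        if word.get(XML_LANG) == lang
        and all(word.get(name) == value for name, value in criteria.items())
    ]


print(lookup(dictionary, "fr-FR", type="title", gender="female"))  # ['Mme']
```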
Moral of this story: use XML for what it is for: a verbose and
descriptive method of describing data. If space is of the essence, don't use
XML, as it will work against you. Use namespaces for what they're
meant for: separating semantically different sets of names that are
supposed to be treated differently (compare the XSLT namespace and the SVG
namespace: they require different applications).
Kind regards,
Abel Braaksma
On 20-4-2011 3:01, ac wrote:
Hi Jirka,
I appreciate your time, consideration, suggestions, and arguments.
You are right, there is a lookup cost, and this is not the way I
prefer to use namespaces. OTOH, the space saving and associated
overhead saving can justify the lookup cost for something that can get
large and needs to stay in busy memory, at least for a while.
It would be much nicer if namespaces could be further supported,
including support for hierarchical namespaces, as well as namespace
optimization. Namespaces are, apart from comments, one of the three
basic XML constructs. Three isn't much, which is fine, but each should
be maximized to help better satisfy application requirements.
I do not doubt that you are open-minded and I certainly appreciate
your constructive comments. In fact I agree with them. I do admit
that "simply wrong" did not allow me to understand and contribute
technically. But it is looking much better now.
I also realize that matching on the names is risky and would be better
addressed through the URIs. The added cost is not high enough to
justify the risk, and the space saving is probably still worth the
effort, depending on the number of languages that need to be
supported, the size of the vocabulary, and the memory constraints.
Still, as everything is a trade off, I would still maintain, given all
constraints, that this is another valid use case for namespaces, when
it applies. I would also recommend that we consider how namespaces
can better fulfill more useful roles in XML, including how they can be
expanded, and more efficiently supported.
There is a real conceptual need for namespaces and it may be that we
are just starting to better realize it.
Regards,
ac
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
ac wrote:
The current translation dictionary is set up somewhat like:
...
<word en:title="Mr" f-en:title="Mrs" fr:title="M." f-fr:title="Mme"
... />
<word en:noun="chair" fr:noun="chaise" ... />
...
all feminine variants can be returned with:
/dic/word/@*[starts-with(name(.), 'f-')]
Such lookups will tend to be quite slow, because matching on the name of an
element/attribute can't be done using the name dictionary. Highly efficient
XSLT implementations don't store the element/attribute name on each node;
they store just a number pointing into a dictionary that holds the real
qualified name. This saves memory and makes matching on a name very fast.
But if the name is not directly present in the XPath expression, such fast
matching can't be done.
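A toy model of the name dictionary described above, written in Python. It is a deliberately simplified sketch and does not reflect the internals of any particular XSLT processor; the `NamePool` class, the URIs and the sample attributes are all invented for illustration.

```python
class NamePool:
    """Toy model: interns each expanded name (uri, local) as a small integer."""

    def __init__(self):
        self._codes = {}
        self._names = []

    def code(self, uri, local):
        key = (uri, local)
        if key not in self._codes:
            self._codes[key] = len(self._names)
            self._names.append(key)
        return self._codes[key]

    def name(self, code):
        return self._names[code]


pool = NamePool()
EN, FR = "http://example.com/lang/en", "http://example.com/lang/fr"

# Each attribute node stores only (name code, value), not the name itself.
attributes = [
    (pool.code(EN, "title"), "Mr"),
    (pool.code(FR, "title"), "M."),
    (pool.code(EN, "noun"), "chair"),
    (pool.code(FR, "noun"), "chaise"),
]

# Exact-name test, like @en:noun: resolve the name once, then compare integers.
wanted = pool.code(EN, "noun")
exact = [value for code, value in attributes if code == wanted]

# Test on the name string, like @*[starts-with(local-name(), 'n')]: every
# node's code must be turned back into a string before it can be compared.
by_string = [
    value for code, value in attributes if pool.name(code)[1].startswith("n")
]

print(exact)      # ['chair']
print(by_string)  # ['chair', 'chaise']
```

The first query costs one dictionary lookup plus an integer comparison per node; the second needs a name lookup and a string operation for every node visited.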
all French feminine can be returned with
/dic/word/@f-fr:*
all French feminine adjectives can be returned with
/dic/word/@f-fr:adjective
all translated English words can be returned with
/dic/word/@en:*
The trouble with such an approach is that you can't change the language at
runtime. You have to pregenerate all queries before running the
transformation or use dynamic XPath evaluation (which is not part of
the XSLT standard yet).
all English nouns, whatever gender, can be obtained with something like
/dic/word/@*:nouns[contains(name(), 'en:')]
If you are using namespaces then this code is not correct. You should
match on the namespace name, not the actual prefix used. So the query should
be more like:
/dic/word/@*:nouns[namespace-uri() = 'whatever URI was assigned to en']
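The same idea, matching on the namespace URI and treating the language as data, can be sketched outside XPath as well. The example below uses Python's standard-library ElementTree; the URIs, the `LANG_URI` table and the `attributes_in` helper are assumptions standing in for "whatever URI was assigned to en", and the sample data follows the dictionary layout quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs; the real ones are not given in the thread.
EN = "http://example.com/lang/en"
FR = "http://example.com/lang/fr"
LANG_URI = {"en": EN, "fr": FR}

dic = ET.fromstring(
    f'<dic xmlns:en="{EN}" xmlns:fr="{FR}">'
    '<word en:title="Mr" fr:title="M."/>'
    '<word en:noun="chair" fr:noun="chaise"/>'
    '</dic>'
)


def attributes_in(root, lang, local=None):
    """Values of all attributes in the namespace for `lang`, optionally
    restricted to one local name (the @en:* and @en:noun cases)."""
    ns = "{" + LANG_URI[lang] + "}"
    return [
        value
        for word in root.iter("word")
        for name, value in word.attrib.items()
        if name.startswith(ns) and (local is None or name == ns + local)
    ]


print(attributes_in(dic, "en"))          # ['Mr', 'chair']
print(attributes_in(dic, "fr", "noun"))  # ['chaise']
```

Because the language is looked up in a table rather than written into the query as a prefix, it can change at runtime, and the result does not depend on which prefixes the document happens to use.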
It must be good to know what is right from what is wrong,
especially with an absolute perspective.
I have to admit that I have always had some disbelief about absolute
beliefs,
but I will keep an open mind, at least just in case.
I consider myself very open-minded. Your usage of namespaces in this
particular case surely works for you, but it's a misuse of namespaces.
They were not designed for this, and using them this way has several
engineering flaws.
- --
- ------------------------------------------------------------------
Jirka Kosek e-mail: jirka@xxxxxxxx http://xmlguru.cz
- ------------------------------------------------------------------
Professional XML consulting and training services
DocBook customization, custom XSLT/XSL-FO document processing
- ------------------------------------------------------------------
OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
- ------------------------------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk2t+6cACgkQzwmSw7n0dR4F/ACfRIwtkthd9SXVzk4fV+iKoHoe
XbkAnR6T4sWLdIzdyi/+J9gjIr/V8jEd
=1Loa
-----END PGP SIGNATURE-----