RE: Yet Another Entity Ref question!

Subject: RE: Yet Another Entity Ref question!
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Fri, 20 Dec 2002 13:13:40 -0500
Marco,

Clarifying this a bit further, you have two separate problems here:

(1) How to have your users represent special characters
    a. <entity>ent</entity>, or
    b. &ent;

(For various reasons, including authority control, validation, code overhead, etc., I think b. is better.)
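
As a rough sketch of the two input styles in (1): the `doc`, `label` and `entity` element names and the `copy` entity are borrowed from elsewhere in this thread, while the internal DTD subset is only an illustrative assumption (in practice the declarations would more likely live in an external DTD).

```xml
<?xml version="1.0"?>
<!-- Hypothetical input; the internal subset below stands in for a real DTD. -->
<!DOCTYPE doc [
  <!ENTITY copy "&#169;">
]>
<doc>
  <!-- Option a: a wrapper element that the stylesheet must resolve itself -->
  <label>Foobar<entity>copy</entity></label>
  <!-- Option b: an ordinary entity reference, expanded by the XML parser -->
  <label>Foobar&copy;</label>
</doc>
```

With option b, a misspelled or undeclared name is normally reported by the parser; with option a, it is just text that the stylesheet has to check for itself.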

(2) How to get these entities represented as "true" entities, rather than
    numeric character references, in the output of your transformation

You'll have problem no. 2 no matter which approach you choose to problem no. 1.
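
To illustrate why this is so: by the time a stylesheet sees the input, the parser has already expanded an entity reference into the character itself, so the serializer has no record of the entity name. A minimal XSLT 1.0 sketch follows. It is a plain identity transform written for illustration, not a stylesheet taken from this thread.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- With an ASCII output encoding, characters outside ASCII have to be
       written as character references. XSLT 1.0 offers no switch for
       asking the XML output method to use a named form instead. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Identity template: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to an input containing &copy; (with that entity declared), this should give a numeric character reference in the result, whichever way the user wrote the character in the source.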

I addressed problem no. 2 in a post just sent. Problem no. 1 is, strictly speaking, not an XSLT question but an XML design question (but b. is still better IMHO).

BTW -- don't decide on a solution to 1. based on guesses about which one will perform better (which "takes more memory" etc.). These are not the only factors (in your case, I bet usability and validation are more critical issues), and your guesses may not be right (e.g. declaring and using an entity isn't as expensive as you seem to think).

Cheers,
Wendell

At 09:16 AM 12/20/2002, you wrote:
Hi Michael,
maybe my brain has been drinking!!! :-P
...maybe I didn't explain what I would like to produce very well!
So, let me explain again:
The user may write an XML document, possibly containing entity references.
So there are two solutions:
A) include all the possible symbolic entities in the DTD:
      <!ENTITY ent "&#xNN;">
   and then produce the output with these entities either encoded or not
   (using "us-ascii" encoding)
B) the user uses a special XML element, say "entity", to refer to the
   entity:
     <entity>ent</entity>
   XSLT then handles this element so that the resulting output contains
   the entity. There are two ways to manage this:
   B.1) the solution previously proposed by David, which is to insert an
        entity dictionary in the XML doc and refer to it with an XPath query.
   B.2) the solution I originally proposed, which is to construct the
        entity by outputting "&", "ent", ";".
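
For reference, B.2 might look roughly like the sketch below. This is a reconstruction for illustration only, since the original template is not quoted in this message; the `html` output method and the reliance on built-in templates for everything except `entity` are assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="us-ascii"/>

  <!-- B.2: emit a raw ampersand, then the entity name, then a semicolon -->
  <xsl:template match="entity">
    <!-- d-o-e is optional in XSLT 1.0, so a processor may ignore it
         and write the escaped form of the ampersand instead -->
    <xsl:text disable-output-escaping="yes">&amp;</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Given <label>Foobar<entity>copy</entity></label>, a processor that honors d-o-e would be expected to write Foobar&copy; here. Nothing checks that the name is actually a known entity, so a typo in the source goes straight into the output.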

Here are my considerations; (+) means good and (-) means bad.
Solution A:
(+) is the more elegant solution
(+) in general faster than B (at least than B.1), even if ... [see (-)s
    below]
(-) consumes more memory to store the entities
(-) I have to write down all possible symbolic entities to
    construct the DTD
(-) even if the user does not insert entities, the document will contain
    the DTD, consuming time and space (the DTD will in fact be automatically
    included in the doc by the content management engine).

Solution B.1:
(-) conceptually equivalent to "A"; however, instead of having the
    entity representations in the DTD, it stores them in XML elements.
    So, in addition to the disadvantages inherited from "A", there is the
    cost of template processing in XSLT; furthermore, compared with
    "B.2" we have one more XPath query.

Solution B.2:
(+) I don't have to write down all possible symbolic
    entities
(+) doesn't consume additional memory (except for storing the template)
(-) produces a non-well-formed document, since it has to output the "&"
    symbol.
(-) in general slower than "A" since we have to apply the
    template; however, when the user does not insert an entity, the
    XML parser doesn't have to parse the DTD for the entities (as it
    must in "A").
(-) It seems that when the output of the entity template processing is
    stored in a variable I have to use xsl:value-of with d-o-e (see my
    original email).

Thanks very much!!!

--------------------------------
Marco Guazzone
Software Engineer
Kerbero S.r.L. - Gruppo TC
Viale Forlanini, 36
Garbagnate M.se (MI)
20024 - Italy
mail: marco.guazzone@xxxxxxxxxxx
www: http://www.kerbero.com
Tel. +39 02 99514.247
Fax. +39 02 99514.399
--------------------------------

On Fri, 20 Dec 2002, Michael Kay wrote:

> You are making things ridiculously complicated. If you are producing
> output that you want to view in an editor that can't understand UTF-8,
> just set <xsl:output encoding="us-ascii"/> as you were initially
> advised.
>
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
>
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Marco Guazzone
> > Sent: 20 December 2002 11:55
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: Re:  Yet Another Entity Ref question!
> >
> >
> > Hi David,
> > your idea is good.
> > Currently I'm using the LibXSLT processor version 1.0.23
> > (with libxml2 version 2.4.30). However, with this I will
> > produce the encoded Unicode char in the output
> > (HTML) doc; i.e.
> > XML:
> > <doc>
> >   <label>Foobar<entity>copy</entity></label>
> >   <entity-dict>
> >      <entity-item name="copy" value="&#169;" />
> >   </entity-dict>
> > </doc>
> >
> > XSL:
> > <!-- ... like the previous except for: -->
> > <xsl:template match="entity">
> >   <xsl:value-of
> >     select="/doc/entity-dict/entity-item[@name = current()]/@value" />
> > </xsl:template>
> >
> > This produces as output
> > Foobar(C)Foobar(C)Foobar(C)
> > where (C) is the encoded value of &#169;
> > This may cause problems in non-Unicode editors or browsers,
> > especially if I include the result in a source document (e.g.
> > Perl, C) as the return value of a function (problems may arise
> > in the compiling/interpreting phase). Instead, what I would like to
> > generate is: Foobar&copy; or more generally: Foobar&ent;
> > where "ent" is specified by an anonymous user in XML via:
> > <entity>ent</entity>
> > What do you think about it?
> >
> >
> > On Fri, 20 Dec 2002, David Carlisle wrote:
> >
> > > <xsl:template match="doc">
> > >    <xsl:apply-templates select="label" /> <!-- ok! -->
> > >    <xsl:variable name="label">
> > >       <xsl:apply-templates select="label" />
> > >    </xsl:variable>
> > >    <xsl:value-of select="$label" />  <!-- not ok -->
> > >    <xsl:value-of disable-output-escaping="yes" select="$label" /> <!-- ok -->
> > > </xsl:template>
> > >
> > > which processor are you using?
> > >
> > > d-o-e is optional so a processor can ignore it altogether, but if it
> > > supports it at all I think that in xslt1 the character should keep the
> > > d-o-e property even when it goes through the variable.
> > >
> > > Is your input form fixed?
> > >
> > > It would be easier if your
> > > <ent>xxx</ent>
> > > only took entity names, as then you could convert them easily to
> > > characters without using d-o-e just by looking them up in a document
> > > of the form
> > >
> > > <entity name="copy" char="&#169;"/>
> > > ...
> > >
> > >
> > > There is no need to have an input form of
> > > <ent>#x0A</ent>
> > >
> > > as the user can more simply write
> > > &#x0A;
> > > which then doesn't need any processing at all at the xslt level.
> > >
> > > David
> > >
> > >
> > > XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
> > >

======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

