Re: SGML, XML and SML

From: Paul Tchistopolskii <paul@q...>
To: xml-dev@i...
Date: Mon, 22 Nov 1999 23:56:18 -0800


> From: Paul Tchistopolskii <paul@q...
> 
> >1. Why should entities live in the core, if one can use
> >any macroprocessor to get *more* flexible functionality?
> 
> 1) So that I can reference entities in attribute values

I'm talking about SML. ;-) No attributes there ( I think ).
Anyway - you can do it with m4.

> 2) So that I can have all my URIs in a header at the top,
> for maintainability

m4
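
To make the m4 suggestion a bit more concrete, here is a minimal sketch
of the same idea in Python rather than m4 itself. It assumes a
hypothetical convention in which every URI is declared once in a header
table and referenced as @name@ in the body; the names MACROS and expand
are illustrative only. The expansion runs before any parser sees the
text, so it works inside attribute values as well as in content.

```python
import re

# Hypothetical header table: every URI is declared once, at the top.
MACROS = {
    "home": "http://example.org/",
    "spec": "http://example.org/spec.html",
}

# A reference looks like @name@ (an assumed convention, not m4 syntax).
_REF = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@")


def expand(text, macros=MACROS):
    """Replace each @name@ reference with its definition before parsing."""
    def lookup(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError("undefined macro: " + name)
        return macros[name]
    return _REF.sub(lookup, text)


if __name__ == "__main__":
    print(expand('<link href="@spec@">see @home@</link>'))
```

The sketch does no escaping, so a definition containing & or < would
have to be written already escaped.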

> 3) Because I am not using UNIX pipes: is there a java
> version of M4?

You could wrap m4 with JNI. A one-day task, I think.

> 4) Because if I want to, I can already use a preprocessor;
> removing entities does not increase my options;

Why was SGML stripped down to XML? That did not
increase anyone's options either ....

> 5) Because no macro preprocessors are internationalized;

Oh... I also thought that XML was internationalized,
until I happened to run Expat on a file declared
with <?xml encoding="windows-something">.
It appears that it may be better to talk about
Java-internationalized XML and MS-internationalized XML?

> 6) Because entities allow tracking of file and line-number:
> if a macro package does this and it is implemented using
> pipes then there must be some kind of PI or conventional
> comment embedded (I guess Don and the Elementalists would
> want this to be an element which a post-processor would
> then hide!)

I agree, it is sometimes inconvenient to use
macroprocessing. However, I think that because
most users are *not* using macroprocessing
anyway, they'll not notice that entities are
gone. ;-)

Also, one can use a beloved XML macroprocessor on the server side
(together with the beloved weak DTD-based validation), but
should all that code go to the client?

I already know people who are writing simplified XML parsers
(right now ;-) to fit into some hardware. There is no
validation, no entities and ... well ... they have hacked together
something very close to SML ;-)

The next step for them will be to write a simplified XT, DOM, etc.

As far as I remember, there has always been some device with
'small' memory. Over the last 10 years I have moved from 2-16-32K to
megabytes and then back to kilobytes again ;-)

I don't know why it happens; I had better consider it to be
a rule.
 
> >2. How often do we need entities outside the DTD's ?
> 
> Often.

Big question. For example, as one places a bigger and bigger
load on XT, macros become templates, so it could result
in plain XML and a heavy stylesheet. Macros give us nothing
with a database. Stylesheets give a lot.

I guess macroprocessing was so important to SGML
because of the manual editing of many documents, a long
time ago. Right?

Actually, a macroprocessor does not look like a good thing
when it comes to dataflow. C++ tried to avoid
macroprocessing as a bad practice.

Having an 'internal' macroprocessor is ... maybe nice ...
but suspicious (even though I like macroprocessing itself,
I prefer not to overuse it). I don't think it should live
in the core, because it's something 'extra'. 'Optional'.

I think it's a matter of taste.

> >> I suspect CDATA sections are hard to live
> >> without if you're writing XML documents about HTML or XML, though.
> 
> Or any time you are writing programming code that uses <, & or >
> 
> >Let us have <CDATA> element ? I think up to 3-5 elements with
> >'hardcoded' semantics will not cause a big problem.
> 
> So the parent element of data changes depending on which way the
> person has marked the data up?   I suppose that in DOM & XPath
> "parent" will become useless; instead it would be "first ancestor that
> is not a CDATA element".

Perfect. I'm glad I have already agreed to remove CDATA from SML -
there is something to think about.
 
> This is the same problem that open
> content models give: with open content models, the "previousSibling"
> and "nextSibling" elements are not so useful (and, indeed, "firstChild"
> and "lastChild") because we cannot be sure what they are without
> checking.  But that is a small cost.  However, moving us from a
> "parent" to a "virtualParent" relationship would be quite a major
> change.

I agree - it is not easy to find a workaround. It's better to drop
CDATA ;-)
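
For what it's worth, the "virtualParent" lookup described above is
short to write, but it would have to replace every plain "parent"
lookup. A minimal sketch, assuming Python's xml.dom.minidom and a
hypothetical element literally named CDATA; the helper name
virtual_parent is made up here and comes from no DOM or XPath spec:

```python
from xml.dom import minidom


def virtual_parent(node, wrapper="CDATA"):
    """Return the first ancestor that is not a <CDATA> wrapper element."""
    parent = node.parentNode
    while (parent is not None
           and parent.nodeType == parent.ELEMENT_NODE
           and parent.tagName == wrapper):
        parent = parent.parentNode
    return parent


# The same text, marked up with the hypothetical <CDATA> element.
doc = minidom.parseString("<code><CDATA>if (a &lt; b) return;</CDATA></code>")
text = doc.getElementsByTagName("CDATA")[0].firstChild

print(text.parentNode.tagName)       # the real parent is the wrapper
print(virtual_parent(text).tagName)  # the parent the author meant
```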

> >It's why I think that SML vs XML is very similar to
> >XML vs SGML.
> >
> >At some point it would be easier to break the
> >compatibility than to support legacy. As far as I
> >understand, exactly that thing happened with
> >XML vs SGML.
> 
> No, SGML was enhanced to allow XML.  (By the way, I think
> WebSGML already allows SML: I think it is legit to
> map the MDO and PIO delimiter to some "shunned
> characters" for example. The only difficulty is if multiple
> headers are involved.)

> SGML was based on the
> idea that it is impossible to get agreement on lexical
> issues from everyone, so the important issue is that
> different requirements can be described formally/legally.
> So XML did not break compatibility with the standard;
> but no-one has ever thought that all SGML products
> should support all possible SGML syntaxes (despite James
> Clark's heroic efforts).
> 
> XML is interesting not because it is simpler (my first
> simplified-SGML text-processing system was more than
> 10 years ago) but because there is *agreement*.  That
> has been the new thing.

Oh ... That's something I cannot understand ...
It's too complex for me...

The only reason why I supported the idea of SML
was that 2 weeks ago I had a couple of long
discussions with one real-life client.

As a result I realized that they are reimplementing
almost *every*part* of the XML standard, because
the existing XML framework is too 'bloated' for them.

So when I saw the SML posting, I offered some
thoughts I had after that discussion. The thoughts
are: there is some place for SML.

How big is it ?

I don't know.

Is XML good? Of course it is. Is it 'bloated' .... well ...
I don't know. It's a matter of taste.

It *could* be considered bloated.
 
> I think this list will contain many old SGML hacks who
> will be disappointed but not surprised if eventually
> XML fragments.
> ---
> By the way, I missed sending this before: it gives the
> canonical XML productions with a guess at SML
> productions: they are the same size. (And, if we take
> out namespaces from one, we can take them out
> from the other, so no effective difference.)
> 
> The current canonical XML grammar is this:
> [1]    canonXML    ::=    (PI #xA)* element #xA (PI #xA)*
> [2]    element    ::=    Stag (Datachar | element | PI)* Etag
> [3]    Stag    ::=    '<' Name NSDecl? (Att NSDecl?)* '>'
> [4]    Etag    ::=    '</' Name '>'
> [5]    NSDecl    ::=    #x20 'xmlns:' Prefix '=' '"' Attvalchar* '"'
> [6]    Att    ::=    #x20 Name '=' '"' Attvalchar* '"'
> [7]    Datachar    ::=    '&amp;' | '&lt;' | '&gt;' | '&#xD;'
>    | (Char - ('&' | '<' | '>' | #xD ))
> [8]    Attvalchar    ::=    '&amp;' | '&lt;' | '&quot;' | '&#x9;' | '&#xA;' | '&#xD;'
>    | (Char - ('&' | '<' | '"' | #x9 | #xA | #xD))
> [9]    Name    ::=    (Prefix ':')? NCName
> [10]    Prefix    ::=    'n' [1-9] [0-9]*
> [11]    PI    ::=    '<?' PITarget (#x20 (Char+ - (Char* '?>' Char*)))? '?>'
> [12]    PITarget    ::=    NCName - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
> 
> which is hardly complex. Presumably, your SML would be
> something like this
> 
> [1]    canonXML    ::=    (Comment | #xA)* element (Comment | #xA)*
> [2]    element    ::=    (Stag (Datachar | element | Comment)* Etag) | Mtag
> [3]    Stag    ::=    '<' Name NSDecl? (Att NSDecl?)* '>'
> [4]    Etag    ::=    '</' Name '>'
> [5]    NSDecl    ::=    #x20 'xmlns:' Prefix '=' '"' Attvalchar* '"'
> [6]    Att    ::=    #x20 Name '=' '"' Attvalchar* '"'
> [7]    Datachar    ::=    '&amp;' | '&lt;' | '&gt;'
>    | (Char - ('&' | '<' | '>' | #x9 | #xA | #xD))
> [8]    Attvalchar    ::=    '&amp;' | '&lt;' | '&quot;'
>    | (Char - ('&' | '<' | '"' | #x9 | #xA | #xD))
> [9]    Name    ::=    (Prefix ':')? NCName
> [10]    Prefix    ::=    'n' [1-9] [0-9]*
> [11]    Comment    ::=    '<!--' (Char+ - (Char* '-->'))? '-->'
> [12]    Mtag    ::=    '<' Name NSDecl? (Att NSDecl?)* '/>'
> 
> Which is the same level of complexity. You take out PIs and
> add comments and empty start tags.

As far as I understand, SML has no attributes ... and also
maybe no Mtag ... so the rules are simpler ... I suggest
"take no prisoners":

+  Stag and Etag 
+  Datachar

+  Comment ( optional )
+  Mtag for empty element ( optional )
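
To give a feel for how small this subset is, here is a rough sketch of
a parser for it in Python. It is only an illustration under stated
assumptions: no attributes, no namespaces, no encoding handling, names
matched by a simplified pattern rather than the real Name production,
and a raw '>' in character data is tolerated. The function names parse
and unescape and the (name, children) tuple shape are invented for this
example.

```python
import re

# One token per match: comment, end tag, empty tag, start tag, or data.
_TOKEN = re.compile(
    r"<!--(?P<comment>.*?)-->"
    r"|</(?P<etag>[A-Za-z_][\w.-]*)>"
    r"|<(?P<mtag>[A-Za-z_][\w.-]*)/>"
    r"|<(?P<stag>[A-Za-z_][\w.-]*)>"
    r"|(?P<data>[^<]+)",
    re.DOTALL,
)
_ESCAPES = {"&lt;": "<", "&gt;": ">", "&amp;": "&"}


def unescape(data):
    """Decode the three built-in escapes; any other use of '&' is an error."""
    def decode(match):
        if match.group(0) not in _ESCAPES:
            raise ValueError("unknown escape: " + match.group(0))
        return _ESCAPES[match.group(0)]
    return re.sub(r"&[^;]*;?", decode, data)


def parse(text):
    """Parse a document into nested (name, children) tuples."""
    root = ("#document", [])
    stack = [root]
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("malformed markup at offset %d" % pos)
        pos = match.end()
        kind = match.lastgroup
        value = match.group(kind)
        children = stack[-1][1]
        if kind == "stag":
            node = (value, [])
            children.append(node)
            stack.append(node)
        elif kind == "mtag":
            children.append((value, []))
        elif kind == "etag":
            if len(stack) == 1 or stack[-1][0] != value:
                raise ValueError("unexpected </%s>" % value)
            stack.pop()
        elif kind == "data":
            if len(stack) > 1:
                children.append(unescape(value))
            elif value.strip():
                raise ValueError("text outside the root element")
        # Comments carry no data, so they are simply dropped.
    if len(stack) != 1 or len(root[1]) != 1:
        raise ValueError("expected exactly one closed root element")
    return root[1][0]


if __name__ == "__main__":
    print(parse("<a>x &lt; y<b/><!-- note --><c>z</c></a>"))
```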

I feel very suspicious when I see a project that
could not be implemented by one 'ideal' developer.

All the programs I love could be implemented by
one developer (to the shape of a reasonable prototype).

Implementing the XML framework to get something
handy looks like a *huge* task. Implementing a reasonable
SML framework looks doable.

Rgds.Paul.





xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@i...
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@i... the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@i... the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@i...)

References:
- Re: SGML, XML and SML
  - From: "Rick Jelliffe" <ricko@a...>
