
Re: Strong Typing in SGML and XML

  • From: Richard Light <richard@l...>
  • To: Eric Albright <eric_albright@s...>
  • Date: Wed, 7 May 1997 11:38:06 +0100

In message <199705070346.UAA02957@m...>, Eric Albright
<eric_albright@s...> writes
>
>Having said that, I ask when is strong data typing necessary? As far as I
>can tell there is only one place where it is useful -- when the document is
>being created or altered. There will always be data validation that cannot
>be handled by data typing and as such must be delegated to a validating
>application or a human. e.g.
><NAME><FIRST>Albright</FIRST><LAST>Eric</LAST></NAME>

From a museum perspective, we have found the need for two types of data
validation/strong typing, which we call 'syntax control' and 'vocabulary
control'.  

Syntax control deals with things like the form of personal names.  These
are _not_ analysed in our application, but expressed in a consistent way
suitable for alphabetical sorting, e.g.:

        Light, Richard B.
rather than
        Richard B. Light

The syntax check would pick up non-capitalised words (apart from a 'stop
list' of known weak prefixes), inconsistent use of full stop and/or
spaces after initials, etc.  This starts to be hard work for a regular
expression, and might more easily be supported as a 'notation', for
which an external helper applet is called up in the context of editing.
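As a rough illustration of the kind of check such a helper might run, here
is a minimal Python sketch. It is not the application's actual code: the
function name check_persname, the contents of the stop list, and the exact
rules are assumptions chosen to mirror the description above (inverted
'Surname, Forenames' form, capitalised words apart from weak prefixes, and
initials that carry their full stop).

```python
# Hypothetical stop list of weak prefixes that may stay in lower case.
WEAK_PREFIXES = {"de", "la", "van", "von", "der", "du"}

def check_persname(name):
    """Return a list of problems found in a name such as 'Light, Richard B.'."""
    # The sortable form has exactly one comma: 'Surname, Forenames'.
    if name.count(",") != 1:
        return ["expected exactly one comma: 'Surname, Forenames'"]
    surname, forenames = (part.strip() for part in name.split(","))
    if not surname or not forenames:
        return ["surname and forenames must both be present"]

    problems = []
    for word in surname.split() + forenames.split():
        if word in WEAK_PREFIXES:
            continue  # lower-case weak prefix, allowed by the stop list
        if not word[0].isupper():
            problems.append(f"word not capitalised: {word!r}")
        if len(word) == 1 and word.isupper():
            # A bare capital letter is an initial that lost its full stop.
            problems.append(f"initial without full stop: {word!r}")
    return problems

print(check_persname("Light, Richard B."))
print(check_persname("Richard B. Light"))
```

The sketch deliberately avoids a single regular expression: separate small
tests are easier to extend, for instance with a check on spacing between
initials.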

Vocabulary control involves checking the data content against an
external authority, which could be a simple termlist or a complex
thesaurus.
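
A minimal sketch of vocabulary control in Python, assuming the authority is
small enough to hold in memory. The termlist, the 'use instead' mapping and
the name preferred_term are all hypothetical; a real thesaurus would carry
many more relationships than this.

```python
# Hypothetical authority data, for illustration only.
ROLE_TERMS = {"maker", "designer", "collector", "donor"}
USE_INSTEAD = {"manufacturer": "maker", "giver": "donor"}

def preferred_term(value, terms, use_instead):
    """Return the preferred term for value, or None if the authority rejects it."""
    key = value.strip().lower()
    if key in terms:
        return key                # already a preferred term
    return use_instead.get(key)   # non-preferred term, or None if unknown

print(preferred_term("Manufacturer", ROLE_TERMS, USE_INSTEAD))
print(preferred_term("wizard", ROLE_TERMS, USE_INSTEAD))
```

The simple termlist case is the membership test on its own; the mapping
stands in for the 'use' relationship a thesaurus would provide.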

Another use we make of data syntax is as a short-cut for markup.  (This
was before we knew about SGML, by the way!  The conventions were
originally devised to make optimal use of A5 catalogue cards ...)  We
use colons as a 'field separator', e.g.:

        <person>maker : Light, R.B.</person>
implies:
        <person>
                <role>maker</role>
                <persname>Light, R.B.</persname>
        </person>

and ampersands (definitely pre-SGML!) as keyword separators:

        <place>Burgess Hill & W. Sussex & U.K.</place>
implies:
        <place>
                <placename>Burgess Hill</placename>
                <placename>W. Sussex</placename>
                <placename>U.K.</placename>
        </place>
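
The two conventions above could be expanded mechanically along these lines.
This is only a sketch using Python's standard xml.etree.ElementTree module;
the function names expand_person and expand_place are made up for the
example, and the handling of a string with no colon is an assumption.

```python
import xml.etree.ElementTree as ET

def expand_person(text):
    """Expand 'role : name' into a <person> with <role> and <persname> children."""
    role, sep, name = text.partition(":")
    person = ET.Element("person")
    if sep:
        ET.SubElement(person, "role").text = role.strip()
        ET.SubElement(person, "persname").text = name.strip()
    else:
        # Assumption: with no colon, the whole string is the name.
        ET.SubElement(person, "persname").text = text.strip()
    return person

def expand_place(text):
    """Expand 'a & b & c' into a <place> with one <placename> per keyword."""
    place = ET.Element("place")
    for keyword in text.split("&"):
        keyword = keyword.strip()
        if keyword:
            ET.SubElement(place, "placename").text = keyword
    return place

print(ET.tostring(expand_person("maker : Light, R.B."), encoding="unicode"))
print(ET.tostring(expand_place("Burgess Hill & W. Sussex & U.K."), encoding="unicode"))
```

Note that the functions take the raw field text, not parsed XML: a bare '&'
in element content is not well-formed XML, so the expansion has to happen
before (or instead of) parsing.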

These practices tie in with the SGML concept of short references, which
are not available in XML.  So a general conclusion I have come to is
that ':' and '&' need to be mapped to suitable subelements, and our
users need to come to terms with more heavily tagged records than they
are used to.  

This is relevant (really!) in the context of Tim's suggestion that
strong typing should apply only to PCDATA-only elements.  In the more
general case of 'data validation' we might well want to validate
elements with substructure.
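
To make that concrete, here is a hedged Python sketch of validating an
element that has substructure: a <person> whose <role> is checked against a
termlist and whose <persname> is checked for the inverted form. The
termlist and the function name validate_person are hypothetical, and the
name check is reduced to a single-comma test to keep the example short.

```python
import xml.etree.ElementTree as ET

ROLE_TERMS = {"maker", "designer", "donor"}  # hypothetical termlist

def validate_person(person):
    """Return a list of problems for a <person> with <role> and <persname> children."""
    problems = []
    role = person.findtext("role")
    name = person.findtext("persname")
    if role is None:
        problems.append("missing <role>")
    elif role.strip().lower() not in ROLE_TERMS:
        problems.append(f"role not in termlist: {role!r}")
    if name is None:
        problems.append("missing <persname>")
    elif name.count(",") != 1:
        problems.append(f"persname not in 'Surname, Forenames' form: {name!r}")
    return problems

record = ET.fromstring(
    "<person><role>maker</role><persname>Light, R.B.</persname></person>"
)
print(validate_person(record))
```

The point is that the rule belongs to the <person> element as a whole, not
to either PCDATA-only child taken alone.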

Richard Light
SGML and Museum Information Consultancy
richard@l...
3 Midfields Walk 
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)

