Re: Strong Typing in SGML and XML
In message <199705070346.UAA02957@m...>, Eric Albright <eric_albright@s...> writes
>
>Having said that, I ask when is strong data typing necessary? As far as I
>can tell there is only one place where it is useful -- when the document is
>being created or altered. There will always be data validation that cannot
>be handled by data typing and as such must be delegated to a validating
>application or a human. e.g.
><NAME><FIRST>Albright</FIRST><LAST>Eric</LAST></NAME>

From a museum perspective, we have found the need for two types of data validation/strong typing, which we call 'syntax control' and 'vocabulary control'.

Syntax control deals with things like the form of personal names. These are _not_ analysed in our application, but expressed in a consistent way suitable for alphabetical sorting, e.g.:

    Light, Richard B.

rather than

    Richard B. Light

The syntax check would pick up non-capitalised words (apart from a 'stop list' of known weak prefixes), inconsistent use of full stop and/or spaces after initials, etc. This starts to be hard work for a regular expression, and might more easily be supported as a 'notation', for which an external helper applet is called up in the context of editing.

Vocabulary control involves checking the data content against an external authority, which could be a simple termlist or a complex thesaurus.

Another use we make of data syntax is as a short-cut for markup. (This was before we knew about SGML, by the way! The conventions were originally devised to make optimal use of A5 catalogue cards ...) We use colons as a 'field separator', e.g.:

    <person>maker : Light, R.B.</person>

implies:

    <person>
      <role>maker</role>
      <persname>Light, R.B.</persname>
    </person>

and ampersands (definitely pre-SGML!) as keyword separators:

    <place>Burgess Hill & W. Sussex & U.K.</place>

implies:

    <place>
      <placename>Burgess Hill</placename>
      <placename>W. Sussex</placename>
      <placename>U.K.</placename>
    </place>

These practices tie in with the SGML concept of short references, which are not available in XML. So a general conclusion I have come to is that ':' and '&' need to be mapped to suitable subelements, and our users need to come to terms with more heavily tagged records than they are used to.

This is relevant (really!) in the context of Tim's suggestion that strong typing should apply only to PCDATA-only elements. In the more general case of 'data validation' we might well want to validate elements with substructure.

Richard Light
SGML and Museum Information Consultancy
richard@l...
3 Midfields Walk
Burgess Hill
West Sussex RH15 8JA
U.K.
tel. (44) 1444 232067

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)
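As a rough illustration of the conventions described in the message above, the following Python sketch maps the ':' and '&' shorthand to subelements and applies a very partial 'syntax control' check to a personal name. It is only a sketch under assumptions: the function names (`expand_person`, `expand_place`, `name_syntax_problems`) and the contents of the stop list are hypothetical and do not come from the original application, and the name check covers only a few of the cases mentioned (the message itself notes that this gets hard for a regular expression).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical 'stop list' of weak prefixes that may stay in lower case.
WEAK_PREFIXES = {"de", "la", "van", "von", "der"}


def expand_person(text):
    """Expand 'role : name' shorthand into <role> and <persname> subelements."""
    person = ET.Element("person")
    role, sep, name = text.partition(":")
    if sep:
        ET.SubElement(person, "role").text = role.strip()
        ET.SubElement(person, "persname").text = name.strip()
    else:
        # No field separator: treat the whole string as the name.
        ET.SubElement(person, "persname").text = text.strip()
    return person


def expand_place(text):
    """Expand '&'-separated keywords into one <placename> per keyword."""
    place = ET.Element("place")
    for keyword in text.split("&"):
        ET.SubElement(place, "placename").text = keyword.strip()
    return place


def name_syntax_problems(name):
    """Return a list of problems found in a 'Surname, Forenames' style name."""
    problems = []
    if "," not in name:
        problems.append("missing comma between surname and forenames")
    # Flag non-capitalised words unless they are on the stop list.
    for word in re.findall(r"[^\W\d_]+", name):
        if word[0].islower() and word.lower() not in WEAK_PREFIXES:
            problems.append("non-capitalised word: " + word)
    # Flag single-letter initials that are not followed by a full stop.
    for match in re.finditer(r"\b[A-Z]\b(?!\.)", name):
        problems.append("initial without full stop: " + match.group())
    return problems


if __name__ == "__main__":
    print(ET.tostring(expand_person("maker : Light, R.B."), encoding="unicode"))
    print(ET.tostring(expand_place("Burgess Hill & W. Sussex & U.K."), encoding="unicode"))
    print(name_syntax_problems("Richard B. Light"))
```

The expansion functions only split on the separator characters, so a colon or ampersand that is part of the data itself would be misread; a real conversion would presumably need an escaping rule or manual review for such records.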