
Re: Announce: XML Schema,


font foundary
From: "Jonathan Borden" <jborden@a...>


> I suppose that if I went to the trouble to specify "text-en" that I probably
> wouldn't want that to validate. Come to think of it, the French would
> probably pay good money to obtain a reliable validator that fails on words
> that smack of English, so that pattern * - text-en (or something akin) might
> become quite popular  :-))

The company Alis has tools which they say can reliably detect many
different languages (and even some encodings) based on statistics.

But, again, the point of validating a character repertoire would be to 
assert which characters *are* expected, so that you can be told
when an unexpected character is found and so that programmers
can cope.  

The issue of deciding "What is in English?" or "What is in French?"
is a red herring.  An English language document may well have
an unmarked Greek character, for example.  By being able to validate
that, say, only ASCII characters are used for English, we force the
special character to be marked up specially, or we alert the typesetter
or whatever that the data contains something that a programmer was
told not to expect. 
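
To make the ASCII example concrete, here is a minimal sketch in Python. The function name and the `allowed` parameter are my own inventions for illustration, not part of any schema language; the point is only that the check reports where an unexpected character occurs rather than guessing what language the text is in.

```python
def unexpected_characters(text, allowed=lambda ch: ord(ch) < 128):
    """Return (offset, character) pairs for characters outside the repertoire.

    `allowed` is a hypothetical predicate describing the expected
    repertoire; the default accepts ASCII only.
    """
    return [(i, ch) for i, ch in enumerate(text) if not allowed(ch)]


# An English sentence with an unmarked Greek theta.
sample = "The angle \u03b8 is small."
for offset, ch in unexpected_characters(sample):
    print(f"offset {offset}: unexpected U+{ord(ch):04X}")
```

A typesetter or programmer can then decide whether to mark the character up specially or reject the data.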

Another example might be in Chinese.  A military document type
for the Taiwanese army might say, for example, that only characters
found in Big5 or only characters learnt by the end of year 10 should
be allowed in the body text of training manuals, to correspond to 
baseline literacy of conscripts. 
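
The Big5 half of that constraint could be approximated as below. This is a sketch that assumes Python's built-in `big5` codec is an acceptable stand-in for "characters found in Big5"; a real document type would need to pin down exactly which Big5 variant it means, and the function name is hypothetical.

```python
def outside_big5(text):
    """Return (offset, character) pairs that Python's big5 codec cannot encode."""
    problems = []
    for i, ch in enumerate(text):
        try:
            ch.encode("big5")
        except UnicodeEncodeError:
            # The character has no mapping in this codec's Big5 table.
            problems.append((i, ch))
    return problems


# The Arabic letter ain (U+0639) is not part of Big5.
for offset, ch in outside_big5("\u4e2d\u6587\u0639"):
    print(f"offset {offset}: U+{ord(ch):04X} is outside Big5")
```

The "learnt by year 10" list would be checked the same way, with a set lookup against the published list in place of the codec.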

Very few fonts have all Unicode characters. And with good reason:
fonts are large and high-quality publishing fonts will often come
from regional type foundries: we wouldn't expect that a Chinese font from
a Singaporean font foundry will support Polish orthography well or
have Arabic characters, for example.  The modern trend, spearheaded
from Asia and now part of Java, is to have virtual fonts, where you 
mix and match ranges of existing fonts. 
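
The idea of a virtual font can be sketched as a table from code point ranges to existing fonts. Everything below is illustrative: the font names are made up, the ranges are deliberately coarse, and this is not how Java or any particular system implements it.

```python
# Hypothetical virtual font: each entry maps an inclusive code point
# range to the (made-up) name of a real font that covers it.
VIRTUAL_FONT = [
    (0x0000, 0x024F, "ExampleLatin"),   # Basic Latin through Latin Extended-B
    (0x0600, 0x06FF, "ExampleArabic"),  # Arabic block
    (0x4E00, 0x9FFF, "ExampleCJK"),     # CJK Unified Ideographs
]


def font_for(ch, table=VIRTUAL_FONT):
    """Return the name of the font covering `ch`, or None if no range matches."""
    code = ord(ch)
    for low, high, name in table:
        if low <= code <= high:
            return name
    return None


for ch in "A\u0142\u0639\u4e2d\u03b8":
    print(f"U+{ord(ch):04X} -> {font_for(ch)}")
```

A character for which `font_for` returns None is exactly the kind of thing a repertoire check should catch before the document reaches the typesetter.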

So again it comes down to what a schema is for. If it is to express
the static and dynamic constraints that a given production flow
requires to be checked for high quality operation, then 
things like range-checking mixed content are something that
*some* schema module should do. 
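
As a rough picture of what such a module might do, the sketch below walks an XML document with Python's standard `xml.etree.ElementTree` and range-checks every run of character data in mixed content. The function name, the report format and the ASCII default are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def check_mixed_content(xml_source, allowed=lambda ch: ord(ch) < 128):
    """Report characters outside the repertoire in element text and tails.

    Each report is (tag, where, character). `where` is "text" for data
    directly inside the element's start tag, or "tail" for data that
    follows the element's end tag inside its parent.
    """
    root = ET.fromstring(xml_source)
    problems = []
    for elem in root.iter():
        for where, part in (("text", elem.text), ("tail", elem.tail)):
            for ch in part or "":
                if not allowed(ch):
                    problems.append((elem.tag, where, ch))
    return problems


doc = "<p>angle <sym>\u03b8</sym> and \u03b1</p>"
for tag, where, ch in check_mixed_content(doc):
    print(f"{tag} ({where}): unexpected U+{ord(ch):04X}")
```

A real module would presumably let the repertoire vary by element, so that the theta inside `sym` is allowed while the stray alpha in the paragraph body is flagged.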

Cheers
Rick Jelliffe




