Re: Announce: XML Schema,
From: "Jonathan Borden" <jborden@a...>

> I suppose that if I went to the trouble to specify "text-en" that I probably
> wouldn't want that to validate. Come to think of it, the French would
> probably pay good money to obtain a reliable validator that fails on words
> that smack of English, so that pattern * - text-en (or something akin) might
> become quite popular :-))

The company Alis has tools which they say can reliably detect many different languages (and even some encodings) based on statistics.

But, again, the point of validating a character repertoire would be to assert which characters *are* expected, so that you can be told when an unexpected character is found and so that programmers can cope. The issue of deciding "What is in English?" or "What is in French?" is a red herring.

An English language document may well have an unmarked Greek character, for example. By being able to validate that, say, only ASCII characters are used for English, we force the special character to be marked up specially, or we alert the typesetter or whoever that the data contains something that a programmer was told not to expect.

Another example might be in Chinese. A military document type for the Taiwanese army might say, for example, that only characters found in Big5, or only characters learnt by the end of year 10, should be allowed in the body text of training manuals, to correspond to the baseline literacy of conscripts.

Very few fonts have all Unicode characters. And with good reason: fonts are large, and high-quality publishing fonts will often come from regional type foundries. We wouldn't expect that a Chinese font from a Singaporean font foundry will support Polish orthography well or have Arabic characters, for example. The modern trend, spearheaded from Asia and now part of Java, is to have virtual fonts, where you mix and match ranges of existing fonts.

So again it comes down to what a schema is for.
If it is to express the static and dynamic constraints that a given production flow requires to be checked for high-quality operation, then things like range-checking mixed content are something that *some* schema module should do.

Cheers
Rick Jelliffe
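
As a rough illustration of the kind of range check on mixed content discussed above, here is a minimal Python sketch. It assumes the expected repertoire is plain ASCII and reports every character in element content that falls outside it. The names `unexpected_characters` and `is_ascii` and the sample document are hypothetical, made up for this example; they are not part of any schema language or tool mentioned in the post.

```python
import xml.etree.ElementTree as ET


def is_ascii(ch):
    """Assumed repertoire for this sketch: the 128 ASCII code points."""
    return ord(ch) < 128


def unexpected_characters(xml_text, allowed):
    """Return (element tag, character) pairs for text outside the repertoire.

    `allowed` is any predicate on a single character, so a different
    repertoire (for example, a list of permitted Chinese characters)
    could be swapped in without changing the traversal.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():
        # Mixed content: elem.text is the text before the first child,
        # and each child's tail is the text that follows that child.
        chunks = [elem.text or ""] + [child.tail or "" for child in elem]
        for chunk in chunks:
            for ch in chunk:
                if not allowed(ch):
                    problems.append((elem.tag, ch))
    return problems


# An English paragraph with an unmarked Greek character in its mixed content.
doc = "<para>The angle <emph>theta</emph> is written \u03b8 here.</para>"
for tag, ch in unexpected_characters(doc, is_ascii):
    print(f"unexpected U+{ord(ch):04X} in <{tag}>")
```

A report like this could go to the typesetter, or be used to insist that the character is marked up specially before the document enters the production flow.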