
MicroASCII proposal

  • From: rjelliffe <rjelliffe@allette.com.au>
  • To: <xml-dev@lists.xml.org>
  • Date: Thu, 13 Jan 2011 18:36:44 +1100

 One of the major complications in software is that there are simply too 
 many characters. Think of how many hours (and reputations!) are lost 
 to spelling errors, how many bugs are due to typos, and the extra 
 parsing costs. We need to move XML (and computing) away from this 
 unfortunate legacy, which is really just a set of niche publishing 
 "requirements" and which ultimately made SGML fail.

 In order to do this, I am proposing MicroASCII. This would restore 
 ASCII to its Latin essentials and cut out the insane repetition. Syntactic 
 sugar such as K, Y and Z is a no-brainer of course: I doubt that anyone 
 will really miss them. But more recent fads such as J, W and U are better 
 off treated as presentation forms and taken care of by another layer: 
 ASCII violates this basic separation of concerns. Indeed, the whole of 
 lower case is redundant.
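
 A minimal sketch of that folding, in Python. The proposal names the 
 letters to retire but not what replaces them, so the mapping here is an 
 assumption: J folds to I, U to V, W to VV, and K, Y and Z are simply 
 dropped. The name fold_to_microascii is hypothetical.

```python
# Hypothetical folding table: the post says which letters go,
# not what they turn into, so these choices are assumptions.
PRESENTATION_FORMS = {"J": "I", "U": "V", "W": "VV"}
DROPPED = set("KYZ")


def fold_to_microascii(text: str) -> str:
    """Upper-case the text, fold J/U/W, and drop K, Y and Z."""
    out = []
    for ch in text.upper():
        if ch in DROPPED:
            continue  # retired letters vanish entirely
        out.append(PRESENTATION_FORMS.get(ch, ch))
    return "".join(out)


print(fold_to_microascii("Rick Jelliffe"))  # prints RIC IELLIFFE
```

 Under these assumed rules the sender's name comes out the way it is 
 signed at the end of this message.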

 What about internationalization? Well, we often think that 
 internationalization requires *more* features than any one alphabet 
 could get away with, but it ain't necessarily so. Let's say we support 
 Hebrew and the other Semitic languages, and use letters for digits. We 
 can then get rid of the Hindu digits from ASCII too.

 We can learn from the world of computing too. In LISP S-expressions, 
 parentheses are all that is needed for grouping. So out go {} and 
 []. We don't need the control characters either. With all this, we 
 should be able to get down to 32 (2^5) characters: MicroASCII will have 
 1/8 the number of code points that a byte of ordinary ASCII takes up and 
 will therefore be 8 times faster to parse and 8 times simpler to 
 understand! This is enough of a speedup that Moore's law can be 
 restarted, at least for a year or two. Mobile phone keyboards will be 
 simplified.
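
 The arithmetic behind those figures can be checked in a few lines of 
 Python. This is only a sketch of the counting: the constant names are 
 mine, and the 1/8 figure holds when the comparison is against the 256 
 values of a full byte; against the 128 code points of 7-bit ASCII the 
 ratio is 1/4.

```python
MICROASCII_SIZE = 32   # target repertoire from the post (2^5)
BYTE_VALUES = 256      # values representable in one 8-bit byte
ASCII_SIZE = 128       # code points in 7-bit ASCII

# Bits needed to index 32 code points (0..31).
bits_per_char = (MICROASCII_SIZE - 1).bit_length()

ratio_vs_byte = BYTE_VALUES // MICROASCII_SIZE
ratio_vs_ascii = ASCII_SIZE // MICROASCII_SIZE

# Letters left after retiring K, Y, Z and J, W, U from the 26.
letters_left = 26 - len("KYZ") - len("JWU")

print(bits_per_char, ratio_vs_byte, ratio_vs_ascii, letters_left)
# prints 5 8 4 20
```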

 The other advantage is that it frees up many code points in the byte 
 that can be used for other purposes, such as sending around strings of 
 nulls and nils, for which the database community has a voracious 
 appetite. We could dedicate the whole of the codespace 0x90-0xFF to 
 different kinds of nulls and nils and NELs.

 If someone did want other characters, I suppose we could insert them 
 using a convenient URL, such as
  (-!http://www.unicode.org/tables/Unicode5.0/ampersand!-)


 Cheers
 RIC IELLIFFE

