dareo@m... (Dare Obasanjo) writes:
>Same here. I wonder what Simon considered food for thought.

Notes on markup practice, mostly. To spare everyone else the slogging:

> SGML is a good idea when the markup overhead is less than 2%. Even
> attributes is a good idea when the textual element contents is the
> "real meat" of the document and attributes only aid processing, so
> that the printed version of a fully marked-up document has the same
> characters as the document sans tags. Explicit end-tags is a good
> idea when the distance between start- and end-tag is more than the
> 20-line terminal the document is typed on. Minimization is a good
> idea in an already sparsely tagged document, both because tags are
> hard to keep track of and because clusters of tags are so intrusive.
> Character entities is a good idea when your entire character set is
> EBCDIC or ASCII. Validating the input prior to processing is a good
> idea when processing would take minutes, if not hours, and consume
> costly resources, only to abend. SGML had an important potential in
> its ability to let the information survive changes in processing
> equipment or software where its predecessors clearly failed.
> ...
> We are clearly not at the stage of human development
> where writers are willing to accept the burden of communicating to
> the machine what they are thinking. One has to marvel at the wide
> acceptance of our existing punctuation marks and the sociology of
> their acceptance. "Tagging" text for semantic constructs that the
> human mind is able to discern from context must be millennia off.

And in a totally different direction:

> But the one thing I would change the most from a markup language
> suitable for marking up the incidental instruction to a type-setter
> to the data representation language suitable for the "market" that
> XML wants, is to go for a binary representation. The reasons for
> /not/ going binary when SGML competed with ODA have been reversed:
> When information should survive changes in the software, it was an
> important decision to make the data format verbose enough that it
> was easy to implement a processor for it and that processors could
> liberally accept what other processors conservatively produced, but
> now that the data formats that employ XML are so easily changed
> that the software can no longer keep up with it, we need to slam on
> the breaks and tell the redefiners to curb their enthusiasm, get it
> right before they share their experiments with the world, and show
> some respect for their users. One way to do that is to increase the
> cost of changes to implementations without sacrificing readability
> and without making the data format more "brittle", by going binary.
> Our information infrastructure has become so much better that the
> nature of optimization for survivability has changed qualitatively.
> The question of what we humans need to read and write no longer has
> any bearing on what the computers need to work with. One of the
> most heinous crimes against computing machinery is therefore to
> force them to parse XML when all they want is the binary data.

I'll certainly confess that much of my interest stems from Naggum's
infamous reputation from SGML days, though.

--
Simon St.Laurent
Ring around the content, a pocket full of brackets
Errors, errors, all fall down!
http://simonstl.com -- http://monasticxml.org