Re: Re Whitespace
At 3:36 PM -0500 8/28/97, Sean Mc Grath wrote:
>>> Bye bye the entire Unix family of line oriented text processing apps:-(
>>
>>Come on, This is a crock.
>
>[Discussion about a *single* tool - Perl - from the genus "Unix family of
>line oriented text processing apps" elided]

Perl is of course the tool whose usage was made a part of the design goals of XML. It's also the most common language of web-hackers, by far.

>Since when is Perl == Unix family of line oriented text processing apps?
>The world is littered with s/w tools that have line length
>limits. These tools are *blown* by WS-less XML.

The mainframe world was littered with tools that couldn't edit anything other than 80 character fixed length records -- but that eventually changed.

I think a little less passion is in order here: there's _no requirement_ that XML tools not use whitespace, nor is there a requirement that they _do_ use whitespace. People will do what is convenient for them, and for the people whose convenience they care about. This is as it always is.

I suspect that line-breaks will in fact be common in XML files for some time to come. The thing that worries me is that most tools are not as smart as the editor I use on my Mac, which can edit and save files in their native line-ending convention without even worrying about it. And it is unfortunately true that stupid processors (like emailers and non-XML editors) _are_ going to "convert" files. This won't mess up PCDATA chunk counts, but will destroy character offsets (a risky linking mechanism anyway). It is likely to cause problems for verbatim-style formatting in carelessly written stylesheets, and I don't see any way (other than painful experience) that solutions are to be found to this -- because the solutions are either reformed behavior (don't convert line-end strings) or smarter processing software (prepare to accept CR, LF, or CRLF at any time).
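
To make the "smarter processing software" option concrete, here is a minimal sketch in Python of a reader that accepts CR, LF, or CRLF at any time. It is only an illustration under stated assumptions: the function name `normalize_line_ends` and the sample strings are made up for this example, and it is not taken from the XML spec or from any particular processor. It also shows why a line-end conversion shifts character offsets.

```python
import re

# CRLF is tried first so that it counts as ONE line end, not two.
_LINE_END = re.compile(r"\r\n|\r|\n")


def normalize_line_ends(text: str) -> str:
    """Return text with every CR, LF, or CRLF replaced by a single LF.

    Hypothetical helper for illustration only.
    """
    return _LINE_END.sub("\n", text)


# The same (made-up) fragment saved under three line-end conventions.
unix = "<p>one\ntwo</p>"
dos = "<p>one\r\ntwo</p>"
mac = "<p>one\rtwo</p>"

# After normalization the three variants are identical.
assert normalize_line_ends(dos) == normalize_line_ends(mac) == unix

# Before normalization, the raw character offset of "two" depends on
# the convention, which is why offset-based links break on conversion.
print(unix.index("two"), dos.index("two"), mac.index("two"))  # prints: 7 8 7
```

Software that normalizes on input this way sees the same text whichever convention the file was last saved in, but any offset computed against the raw CRLF file is still off by one for each line end that precedes it.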
This is a problem that XML has not created, but simply tries not to make worse, by at least picking a simple rule that can be understood.

>Throw out that grep, that text editor, that fgets(), that diff,sort,uniq
>utility
>They're all busted for XML use.

gets() is of course Broken As Designed, as the cause of most security bugs in Unix systems. Again, they are broken for XML use with files created a particular way. They are also broken for HTML files created the same way, and I don't hear the weeping and wailing.

>"Crock". I'll add that to my collection of spicy ripostes that I have
>accumulated over the course of this thread.:-)

I meant it as a description, in a similar (but IMHO slightly less frantic) tone.

>Time to end.

Can you suggest any solution to the "grep" problem other than requiring a fixed line-max in XML? Do you think that that hideous hack to accommodate defective (if very useful) tools is really worth it? Can you suggest how we would determine that buffer size? (Test grep and AWK on our favorite 5 unices? What about wc, and Minix?) There are too many arbitrary lines that would have to be drawn in the sand to "solve" that problem. What about card-format editors like XEDIT, where editing lines of more than 72 characters is inconvenient (and editing lines of more than about 1800 characters is unbelievably inconvenient)? There's still a lot of IBM iron out there. Or should we only worry about _your_ favorite tools being able to handle any XML document?

Certainly authors can work within the limits of their chosen tools with XML. I don't see that we can realistically provide them with more.

>If nothing else, David's five paragraphs have been born from this.
>I suggest they should be mandatory reading for anyone approaching
>XML development.

Edits for clarity would be appreciated, and if they pass muster by other experts, maybe they should move to a section of the XML-FAQ for developers. If there isn't such a section, maybe we should start one!
>It is clear that I see a problem that others don't.
>Thus the odds are I am wrong.
>I hope so.

Actually, I agree with you that there are problems (there are legal XML documents that won't work with grep). There are plausible and common file operations, like changing line-end marking conventions, that _may well_ cause problems with some documents and stylesheets. I just don't see any solutions to these problems other than to let them work themselves out in the many different environments where they must be worked out. There is no solution that isn't so complex in its ramifications and details that it wouldn't simply be another problem for some reasonable application of XML.

-- David

_________________________________________
David Durand              dgd@c...  \  david@d...
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@i... the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@i...)