Re: Formatless files
- From: David Carlisle <d.p.carlisle@gmail.com>
- To: Roger L Costello <costello@mitre.org>
- Date: Tue, 9 Aug 2022 22:15:12 +0100

A file has no inherent format.
The format of a file is determined by the programs that use it.
Since file types are not determined by the file system, the "kernel" can't tell you the type of file: it doesn't know.
You might wonder why the system doesn't track file types more carefully, so that, for example, the "sort" program is never given a directory as input. One reason is to avoid precluding some useful computations. Although
> sort /bin
doesn't make much sense, there are many commands that can operate on any file at all, and there's no reason to restrict their capabilities. Octal dump (od), word count (wc), copy (cp), compare (cmp), and many others process files regardless of their contents. But the formatless idea goes deeper than that. If, say, the input to LaTeX were distinguished from Java source, a text editor would be forced to make the distinction when it created a file, and probably when it read in a file for editing again.
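As a rough illustration of the point about content-agnostic commands, here is a minimal Python sketch of wc-like and cmp-like helpers that treat a file purely as a sequence of bytes. The names `count`, `same_bytes`, and `demo` are hypothetical, and this is not how the real utilities are implemented; it only shows that neither helper needs to know whether the file holds LaTeX or Java.

```python
import os
import tempfile


def count(path):
    """Return (lines, words, bytes) for any file, read as raw bytes."""
    with open(path, "rb") as f:
        data = f.read()
    # Lines are counted as newline bytes; words are whitespace-separated runs.
    return data.count(b"\n"), len(data.split()), len(data)


def same_bytes(path_a, path_b):
    """Return True if the two files hold identical byte sequences."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return a.read() == b.read()


def demo():
    """Run both helpers on a LaTeX-like file and a Java-like file."""
    with tempfile.TemporaryDirectory() as d:
        tex = os.path.join(d, "a.tex")
        java = os.path.join(d, "A.java")
        with open(tex, "wb") as f:
            f.write(b"\\documentclass{article}\n\\begin{document}\nHi\n\\end{document}\n")
        with open(java, "wb") as f:
            f.write(b"class A {\n}\n")
        return count(tex), count(java), same_bytes(tex, java)
```

Calling `demo()` returns the counts for both files plus the comparison result; the same two helpers run unchanged on either file because they never inspect what the bytes mean.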
Instead of creating distinctions, the system tries to erase/lessen them. All text consists of lines terminated by newline characters, and most programs understand this simple format. This uniformity is unusual; most systems have several file formats, even for text, and require negotiation by a program or a user to create a file of a particular type. In the system there is just one kind of file, and all that is required to access a file is its name.
There's a good test of file system uniformity, due originally to Doug McIlroy. Can the output of a FORTRAN program be used as input to the FORTRAN compiler? A remarkable number of systems have trouble with this test.
-------
The above are excerpts from the book The Art of UNIX Programming, pages 46-47. The "system" being referred to is the UNIX system.
How do those excerpts apply to XML? Why are there so many file formats - the XML file format, the JSON file format, the CSV file format, and so on? Isn't that contrary to the idea of formatless files?
You changed the definition of format at this point. Above, you quote LaTeX, Java, and FORTRAN as all having the same format: text file.
XML, JSON, and CSV are also text in that sense. They have different syntax within the text file, just as Java has a different syntax to LaTeX.
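To make the distinction concrete, here is a small Python sketch, assuming two made-up sample strings (`JSON_TEXT` and `CSV_TEXT` are hypothetical data, not from the thread). Both are plain text in the "lines terminated by newlines" sense, so a generic line-based function handles either one; only the syntax-aware parsers differ.

```python
import csv
import io
import json

# Hypothetical sample data: two text files with different syntax.
JSON_TEXT = '{"name": "Ada", "year": 1843}\n'
CSV_TEXT = "name,year\nAda,1843\n"


def as_lines(text):
    """Generic view: any text file is a sequence of newline-terminated lines."""
    return text.splitlines()


def parse_json(text):
    """JSON-specific view of the same kind of text."""
    return json.loads(text)


def parse_csv(text):
    """CSV-specific view of the text (the first row is taken as the header)."""
    return list(csv.DictReader(io.StringIO(text)))
```

`as_lines` works on both strings without knowing their syntax, much as `wc` or `sort` would, while `parse_json` and `parse_csv` each accept only the syntax they were written for.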
David
/Roger