
Re: Why is Encoding Metadata (e.g. encoding="UTF-8) putIns

  • From: "Rick Jelliffe" <rjelliffe@a...>
  • To: "Jonathan Robie" <jonathan.robie@r...>
  • Date: Thu, 20 Sep 2007 11:54:00 +1000 (EST)

Jonathan Robie said:
> Michael Kay wrote:
>>> Why? Shouldn't metadata be external to a document?
>>>
>>
>> Sadly, most of us are using file systems based on 1960s thinking that
>> don't allow metadata to be held anywhere other than in the content of
>> the file (or potentially in its name).
>
> I love the fact that the Gnome help system recognizes my DocBook files,
> and lets me view them as help files, with tables of contents, links, and
> very nice formatting ... it makes it a little nicer to write DocBook
> documents when working on Fedora ....

There has always been a split between systems based on "magic numbers" (in
the UNIX sense), of which the XML encoding header is an elaborate example,
systems based on richer file structures (e.g. old Mac), and systems using
registries. But it is the file read and write APIs that are the weak links
in the chain: information about encoding is lost when writing out a file,
and the only way to maintain it is to write it somewhere. And the only
place to write it that is cross-platform and cross-application and
transparent is inside the file itself.
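
As a rough illustration of the "magic number" idea (not anything from the XML spec's own text), a reader can recover the encoding from the bytes of the file alone. The sketch below assumes the document starts in an ASCII-compatible encoding and that a missing declaration means UTF-8; `declared_encoding` is a hypothetical helper name, and a real parser also has to handle UTF-16, UTF-32 and EBCDIC starts.

```python
import re

# Matches an XML declaration at the very start of the data and captures
# the encoding name, if one is given. Assumes ASCII-compatible bytes.
_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)

def declared_encoding(data: bytes, default: str = "UTF-8") -> str:
    """Hypothetical helper: return the encoding named in the XML declaration."""
    # Skip a UTF-8 byte order mark if present.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = _DECL.match(data)
    if match and match.group(1):
        return match.group(1).decode("ascii")
    return default

# The file carries its own encoding label, so no external metadata is needed.
doc = '<?xml version="1.0" encoding="ISO-8859-1"?><p>caf\u00e9</p>'.encode("iso-8859-1")
name = declared_encoding(doc)
text = doc.decode(name)
```

The point of the design is that the label travels with the bytes: copying the file through any file system or API that drops external metadata leaves it intact.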

Actually, it continues the trend of web resources being self-identifying
rather than requiring external metadata; this was the same trend that
allowed XML to omit data attributes from entity declarations. (Which then
had to be re-invented in element syntax again, being useful, e.g. XLinks
and OPC.)

For XML we looked at two different mechanisms: Gavin Nicol suggested that
we should just use the existing MIME header syntax at the start of the
file. This had two drawbacks: first, when you use EBCDIC it means a file
in two different encodings, and second the file was no longer an
acceptable SGML entity. So the PI syntax was adopted instead, even though
it meant a disconnect from MIME header syntax.
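
The EBCDIC drawback can be sketched as follows, on the assumption (mine, not stated above) that a MIME-style header would have been written in ASCII. Python's `cp037` codec stands in for "EBCDIC" here, and the `IBM037` label, the sample document and `looks_like_ebcdic_xml` are illustrative choices only.

```python
# With the PI syntax, the declaration and the content share one encoding.
decl = '<?xml version="1.0" encoding="IBM037"?>'
body = "<greeting>hello</greeting>"
entity = (decl + body).encode("cp037")

# "<?xm" encoded in EBCDIC; a reader can test for these bytes first.
EBCDIC_SIGNATURE = bytes([0x4C, 0x6F, 0xA7, 0x94])

def looks_like_ebcdic_xml(data: bytes) -> bool:
    """Hypothetical check: does the data start with '<?xm' in EBCDIC?"""
    return data.startswith(EBCDIC_SIGNATURE)

# Hypothetical MIME-style alternative: an ASCII header followed by
# EBCDIC content, so one file holds bytes in two different encodings.
mime_style = (
    b"Content-Type: text/xml; charset=IBM037\r\n\r\n" + body.encode("cp037")
)

# Decoding the whole file with the content's codec garbles the header.
garbled = mime_style.decode("cp037")
```

With the PI form, the whole entity decodes cleanly with a single codec once the first four bytes have been recognised; with the header form, no single codec reads both parts.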

Cheers
Rick Jelliffe

