Subject: Re: Challenges with unparsed-text() and reading UTF-8 file as UTF-16
From: Colin Paul Adams <colin@xxxxxxxxxxxxxxxxxx>
Date: 13 Oct 2006 12:13:51 +0100
>>>>> "Michael" == Michael Kay <mike@xxxxxxxxxxxx> writes:
Michael> operating systems without any reliable metadata (in fact,
Michael> without any metadata at all). An implementation could
Michael> interpret this rule as allowing "I know this file is on
Michael> an AS400 and I know that on an AS400 the default file
Michael> encoding is IBM-EBCDIC". But I would hope that this would
Michael> be done under rule 4 (implementation-defined heuristics)
Michael> rather than under rule 1 (external encoding information).
My thoughts are different - I believe that doing it under rule 1 is
the best way, and that we should make up for the lack of decent O/S
support for metadata (OS/2 had it a decade ago!) by adding hooks so
that the user can supply the encoding information in some way.
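
A minimal sketch of what such a hook might look like, in Python purely
for illustration. The names read_text, EncodingHook and sniff_bom are
hypothetical and do not belong to any XSLT processor; the BOM check is
only a stand-in for whatever heuristics an implementation might apply
under rule 4, and the final UTF-8 default is an assumption of this
sketch.

```python
import codecs
from typing import Callable, Optional

# Hypothetical hook type: given a file path, return the encoding the
# user knows the file to be in, or None if the user has nothing to say.
EncodingHook = Callable[[str], Optional[str]]


def sniff_bom(data: bytes) -> Optional[str]:
    """Stand-in heuristic: guess the encoding from a byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"     # the codec reads the BOM to pick byte order
    return None


def read_text(path: str, hook: Optional[EncodingHook] = None) -> str:
    """Read a file, preferring user-supplied encoding information."""
    with open(path, "rb") as f:
        data = f.read()

    # User-supplied information is treated as external encoding
    # information (the "rule 1" position argued for above).
    encoding = hook(path) if hook is not None else None

    # Only when the hook is absent or silent do heuristics apply.
    if encoding is None:
        encoding = sniff_bom(data)

    # Assumed last resort for this sketch.
    if encoding is None:
        encoding = "utf-8"

    return data.decode(encoding)
```

A caller who knows the file came from an AS400 could then pass
something like hook=lambda path: "cp037" (one of the EBCDIC codecs
shipped with Python), and the heuristics would never be consulted.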
--
Colin Adams
Preston Lancashire