
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>


On Wed, May 23, 2018 at 8:31 AM, Peter J. Holzer <hjp-python at hjp.at> wrote:
> On 2018-05-23 07:38:27 +1000, Chris Angelico wrote:
>> On Wed, May 23, 2018 at 7:23 AM, Peter J. Holzer <hjp-python at hjp.at> wrote:
>> >> The best you can do is to go ask the canonical source of the
>> >> file what encoding the file is _supposed_ to be in.
>> >
>> > I disagree on both counts.
>> >
>> > 1) For any given file it is almost always possible to find the correct
>> >    encoding (or *a* correct encoding, as there may be more than one).
>>
>> You can find an encoding which is capable of decoding a file. That's
>> not the same thing.
>
> If the result is correct, it is the same thing.
>
> If I have an input file
>
>     4c 69 65 62 65 20 47 72 fc df 65 0a
>
> and I decode it correctly to
>
>     Liebe Grüße
>
> it doesn't matter whether I used ISO-8859-1 or ISO-8859-2. The mapping
> for all bytes in the input file is the same in both encodings.

Sure, but if you try it as ISO-8859-5 or -7, you won't get an error,
but you also won't get that string. So it DOES matter.
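
To make the point concrete, here is a minimal sketch that decodes the twelve bytes quoted above with each of the four encodings mentioned in this thread. The names `data` and `results` are only illustrative, and the codec names are the standard Python aliases for those character sets.

```python
# The bytes from the quoted example: "Liebe Gr" + 0xfc 0xdf + "e\n"
data = bytes.fromhex("4c 69 65 62 65 20 47 72 fc df 65 0a")

results = {}
for enc in ("iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7"):
    # None of these raises UnicodeDecodeError for this input,
    # because every byte in it is defined in all four encodings.
    results[enc] = data.decode(enc)
    print(f"{enc}: {results[enc]!r}")

# Latin-1 and Latin-2 agree on 0xfc and 0xdf, so they give the same text.
print(results["iso-8859-1"] == results["iso-8859-2"])

# The Cyrillic and Greek encodings decode without complaint,
# but to different characters.
print(results["iso-8859-5"] == results["iso-8859-1"])
print(results["iso-8859-7"] == results["iso-8859-1"])
```

All four calls succeed, so "it decoded without an error" cannot by itself tell you which encoding was intended; only the first two produce the German greeting.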

ChrisA