Python 3.2 has some deadly infection

On Fri, 06 Jun 2014 18:32:39 +0300, Marko Rauhamaa wrote:

> Michael Torrie <torriem at gmail.com>:
>> On 06/06/2014 08:10 AM, Marko Rauhamaa wrote:
>>> Ethan Furman <ethan at stoneleaf.us>:
>>>> ASCII is *not* the state of "this string has no encoding" -- that
>>>> would be Unicode; a Unicode string, as a data type, has no encoding.
>>> Huh?
>> [...]
>> What part of his statement are you saying "Huh?" about?
> Unicode, like ASCII, is a code. Representing text in unicode is
> encoding.

A Unicode string as an abstract data type has no encoding. It is a 
Platonic ideal, a pure form like the real numbers. There are no bytes, no 
bits, just code points. That is what Ethan means. A Unicode string like 

s = u"NOBODY expects the Spanish Inquisition!"

should not be thought of as a bunch of bytes in some encoding, but as an 
array of code points. Eventually the abstraction will leak, as all 
abstractions do, but not for a very long time.
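
To make the distinction concrete, here is a minimal sketch, assuming Python 3.3 or later (where the u"" prefix is accepted again). The second string t and the particular encodings are my own choices for illustration, not anything from earlier in the thread. Bytes only appear once an encoding is picked explicitly, and different encodings give different bytes for the same code points:

```python
s = u"NOBODY expects the Spanish Inquisition!"

# The string itself is a sequence of code points (plain integers).
code_points = [ord(ch) for ch in s]

# Bytes exist only after an encoding has been chosen explicitly.
as_utf8 = s.encode("utf-8")
as_utf16 = s.encode("utf-16-le")
as_utf32 = s.encode("utf-32-le")

# Same code points, three different byte lengths.
print(len(s), len(as_utf8), len(as_utf16), len(as_utf32))

# Decoding with the matching codec recovers the same abstract string.
assert as_utf8.decode("utf-8") == s
assert as_utf16.decode("utf-16-le") == s
assert as_utf32.decode("utf-32-le") == s

# Hypothetical non-ASCII example: one code point (U+00F3) needs two
# bytes in UTF-8, so the byte count and the code point count differ.
t = u"Inquisici\u00f3n"
t_utf8 = t.encode("utf-8")
print(len(t), len(t_utf8))
```

Note that len() on the string counts code points, while len() on the encoded result counts bytes. Nothing about s itself changes between the three encode() calls.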

Steven D'Aprano