
Valid encodings for a Python source file

On 2018-06-07 22:40, Daniel Glus wrote:
> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263
> <https://www.python.org/dev/peps/pep-0263/> encoding specification, like #
> -*- encoding: foo -*-.
> Is this list the same as the list given in the documentation for the codecs
> library, under "Standard Encodings"
> <https://docs.python.org/3.6/library/codecs.html#standard-encodings>? If
> not, where can I find the actual list?
> (I know that list is the same as the set of unique values in CPython's
> /Lib/encodings/aliases.py
> <https://github.com/python/cpython/blob/master/Lib/encodings/aliases.py>,
> or equivalently, the set of filenames in /Lib/encodings/
> <https://github.com/python/cpython/blob/master/Lib/encodings/>, but again
> I'm not sure.)
> -Daniel

It's none of these.

To quote PEP 263:

> Any encoding which allows processing the first two lines in the way indicated above is allowed as source code encoding, this includes ASCII compatible encodings as well as certain multi-byte encodings such as Shift_JIS. It does not include encodings which use two or more bytes for all characters like e.g. UTF-16. The reason for this is to keep the encoding detection algorithm in the tokenizer simple.

All of the lists above include encodings like UTF-16 that are not
sufficiently ASCII-compatible.
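
As a rough illustration of what "ASCII-compatible" means here, the sketch below checks whether a coding cookie line, once encoded with a given codec, is still byte-for-byte the same as its ASCII encoding. This is only a heuristic of my own, not the algorithm the tokenizer uses, and the helper name `cookie_survives` is hypothetical.

```python
COOKIE_TEMPLATE = "# -*- coding: {name} -*-\n"


def cookie_survives(name):
    """Heuristic: True if the cookie line encoded with codec `name`
    is byte-identical to the same line encoded as ASCII.

    This approximates the PEP 263 requirement that the first two
    lines can be processed before the encoding is known. It is an
    illustrative assumption, not CPython's actual check.
    """
    line = COOKIE_TEMPLATE.format(name=name)
    try:
        return line.encode(name) == line.encode("ascii")
    except (LookupError, UnicodeError):
        # Unknown codec, a codec that is not a text encoding,
        # or a codec that cannot represent the line at all.
        return False


for candidate in ("utf-8", "latin-1", "shift_jis", "utf-16", "utf-32"):
    print(candidate, cookie_survives(candidate))
```

By this test UTF-8, Latin-1 and Shift_JIS pass, while UTF-16 and UTF-32 fail because every character takes two or more bytes (and a BOM is prepended), so the tokenizer could not find the cookie.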

Of course, as Terry Reedy writes,
> For new code for python 3, don't use an encoding cookie.  Use an editor that can save in utf-8 and tell it to do so if it does not do so by default. 
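
To see why the cookie is unnecessary for UTF-8 files, here is a small sketch using the standard library's `tokenize.detect_encoding`, which reports the encoding Python would assume for a source file. The helper name `declared_encoding` and the sample byte strings are made up for this example.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the source encoding tokenize detects for these bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


no_cookie = b"print('hello')\n"
with_cookie = b"# -*- coding: latin-1 -*-\nprint('hello')\n"

print(declared_encoding(no_cookie))    # utf-8
print(declared_encoding(with_cookie))  # iso-8859-1
```

With no cookie (and no BOM) the result is already utf-8, so a file saved as UTF-8 needs no declaration.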

-- Thomas