osdir.com


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Python-Dev] PEP 587: Python Initialization Configuration


Hi,

Since Steve Dower asked me to write a formal PEP for my proposal of a
new C API to initialize Python, here you have!

This new PEP will be shortly rendered as HTML at:
https://www.python.org/dev/peps/pep-0587/

The Rationale might be a little bit weak at this point and there are
still 3 small open questions, but the whole PEP should give you a
better overview of my proposal than my previous email :-)

I chose to start with a new PEP rather than updating PEP 432. I didn't
feel able to just "update" the PEP 432, my proposal is different. The
main difference between my PEP 587 and Nick Coghlan's PEP 432 is that
my PEP 587 only allows to configure Python *before* its
initialization, whereas Nick wanted to give control between its
"Initializing" and "Initialized" initialization phases. Moreover, my
PEP only uses C types rather Nick wanted to use Python types.

By setting the private PyConfig._init_main field to 1, my PEP 587
allows to stop Python initialization early to get a bare minimum
working Python only with builtin types and no importlib. If I
understood correctly, with my PEP, it remains possible to *extend*
Python C API later to implement what Nick wants. Nick: please correct
me if I'm wrong :-)

Victor


PEP: 587
Title: Python Initialization Configuration
Author: Victor Stinner <vstinner at redhat.com>
Discussions-To: python-dev at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2018
Python-Version: 3.8

Abstract
========

Add a new C API to configure the Python Initialization providing finer
control on the whole configuration and better error reporting.


Rationale
=========

Python is highly configurable but its configuration evolved organically:
configuration parameters is scattered all around the code using
different ways to set them (mostly global configuration variables and
functions).  A straightforward and reliable way to configure Python is
needed. Some configuration parameters are not accessible from the C API,
or not easily.

The C API of Python 3.7 Initialization takes ``wchar_t*`` strings as
input whereas the Python filesystem encoding is set during the
initialization.


Python Initialization C API
===========================

This PEP proposes to add the following new structures, functions and
macros.

New structures (4):

* ``PyConfig``
* ``PyInitError``
* ``PyPreConfig``
* ``PyWideCharList``

New functions (8):

* ``Py_PreInitialize(config)``
* ``Py_PreInitializeFromArgs(config, argc, argv)``
* ``Py_PreInitializeFromWideArgs(config, argc, argv)``
* ``Py_InitializeFromConfig(config)``
* ``Py_InitializeFromArgs(config, argc, argv)``
* ``Py_InitializeFromWideArgs(config, argc, argv)``
* ``Py_UnixMain(argc, argv)``
* ``Py_ExitInitError(err)``

New macros (6):

* ``Py_INIT_OK()``
* ``Py_INIT_ERR(MSG)``
* ``Py_INIT_USER_ERR(MSG)``
* ``Py_INIT_NO_MEMORY()``
* ``Py_INIT_EXIT(EXITCODE)``
* ``Py_INIT_FAILED(err)``


PyWideCharList
--------------

``PyWideCharList`` is a list of ``wchar_t*`` strings.

Example to initialize a string from C static array::

    static wchar_t* argv[2] = {
        L"-c",
        L"pass",
    };
    PyWideCharList config_argv = PyWideCharList_INIT;
    config_argv.length = Py_ARRAY_LENGTH(argv);
    config_argv.items = argv;

``PyWideCharList`` structure fields:

* ``length`` (``Py_ssize_t``)
* ``items`` (``wchar_t**``)

If *length* is non-zero, *items* must be non-NULL and all strings must
be non-NULL.

.. note::
   The "WideChar" name comes from the existing Python C API. Example:
   ``PyUnicode_FromWideChar(const wchar_t *str, Py_ssize_t size)``.

PyInitError
-----------

``PyInitError`` is a structure to store an error message or an exit code
for the Python Initialization.

Example::

    PyInitError alloc(void **ptr, size_t size)
    {
        *ptr = PyMem_RawMalloc(size);
        if (*ptr == NULL) {
            return Py_INIT_NO_MEMORY();
        }
        return Py_INIT_OK();
    }

    int main(int argc, char **argv)
    {
        void *ptr;
        PyInitError err = alloc(&ptr, 16);
        if (Py_INIT_FAILED(err)) {
            Py_ExitInitError(err);
        }
        PyMem_Free(ptr);
        return 0;
    }

``PyInitError`` fields:

* ``exitcode`` (``int``): if greater or equal to zero, argument passed to
  ``exit()``
* ``msg`` (``const char*``): error message
* ``prefix`` (``const char*``): string displayed before the message,
  usually rendered as ``prefix: msg``.
* ``user_err`` (int): if non-zero, the error is caused by the user
  configuration, otherwise it's an internal Python error.

Macro to create an error:

* ``Py_INIT_OK()``: success
* ``Py_INIT_NO_MEMORY()``: memory allocation failure (out of memory)
* ``Py_INIT_ERR(MSG)``: Python internal error
* ``Py_INIT_USER_ERR(MSG)``: error caused by user configuration
* ``Py_INIT_EXIT(STATUS)``: exit Python with the specified status

Other macros and functions:

* ``Py_INIT_FAILED(err)``: Is the result an error or an exit?
* ``Py_ExitInitError(err)``: call ``exit(status)`` for
  an error created by ``Py_INIT_EXIT(status)``,
  call ``Py_FatalError(msg)`` for other errors. Must not be called for
  an error created by ``Py_INIT_OK()``.

Pre-Initialization with PyPreConfig
-----------------------------------

``PyPreConfig`` structure is used to pre-initialize Python:

* Set the memory allocator
* Configure the LC_CTYPE locale
* Set the UTF-8 mode

Example using the pre-initialization to enable the UTF-8 Mode::

    PyPreConfig preconfig = PyPreConfig_INIT;
    preconfig.utf8_mode = 1;

    PyInitError err = Py_PreInitialize(&preconfig);
    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }

    /* at this point, Python will speak UTF-8 */

    Py_Initialize();
    /* ... use Python API here ... */
    Py_Finalize();

Functions to pre-initialize Python:

* ``PyInitError Py_PreInitialize(const PyPreConfig *config)``
* ``PyInitError Py_PreInitializeFromArgs( const PyPreConfig *config,
int argc, char **argv)``
* ``PyInitError Py_PreInitializeFromWideArgs( const PyPreConfig
*config, int argc, wchar_t **argv)``

These functions can be called with *config* set to ``NULL``. The caller
is responsible to handler error using ``Py_INIT_FAILED()`` and
``Py_ExitInitError()``.

``PyPreConfig`` fields:

* ``allocator``: name of the memory allocator (ex: ``"malloc"``)
* ``coerce_c_locale_warn``: if non-zero, emit a warning if the C locale
  is coerced.
* ``coerce_c_locale``: if equals to 2, coerce the C locale; if equals to
  1, read the LC_CTYPE to decide if it should be coerced.
* ``dev_mode``: see ``PyConfig.dev_mode``
* ``isolated``: see ``PyConfig.isolated``
* ``legacy_windows_fs_encoding`` (Windows only): if non-zero, set the
  Python filesystem encoding to ``"mbcs"``.
* ``use_environment``: see ``PyConfig.use_environment``
* ``utf8_mode``: if non-zero, enable the UTF-8 mode

The C locale coercion (PEP 538) and the UTF-8 Mode (PEP 540) are
disabled by default in ``PyPreConfig``. Set ``coerce_c_locale``,
``coerce_c_locale_warn`` and ``utf8_mode`` to ``-1`` to let Python
enable them depending on the user configuration.

Initialization with PyConfig
----------------------------

The ``PyConfig`` structure contains all parameters to configure Python.

Example of Python initialization enabling the isolated mode::

    PyConfig config = PyConfig_INIT;
    config.isolated = 1;

    PyInitError err = Py_InitializeFromConfig(&config);
    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }
    /* ... use Python API here ... */
    Py_Finalize();


Functions to initialize Python:

* ``PyInitError Py_InitializeFromConfig(const PyConfig *config)``
* ``PyInitError Py_InitializeFromArgs(const PyConfig *config, int
argc, char **argv)``
* ``PyInitError Py_InitializeFromWideArgs(const PyConfig *config, int
argc, wchar_t **argv)``

These functions can be called with *config* set to ``NULL``. The caller
is responsible to handler error using ``Py_INIT_FAILED()`` and
``Py_ExitInitError()``.

PyConfig fields:

* ``argv``: ``sys.argv``
* ``base_exec_prefix``: ``sys.base_exec_prefix``
* ``base_prefix``: ``sys.base_prefix``
* ``buffered_stdio``: if equals to 0, enable unbuffered mode, make
  stdout and stderr streams to be unbuffered.
* ``bytes_warning``: if equals to 1, issue a warning when comparing
  ``bytes`` or ``bytearray`` with ``str``, or comparing ``bytes`` with
  ``int``. If equal or greater to 2, raise a ``BytesWarning`` exception.
* ``dll_path`` (Windows only): Windows DLL path
* ``dump_refs``: if non-zero, display all objects still alive at exit
* ``exec_prefix``: ``sys.exec_prefix``
* ``executable``: ``sys.executable``
* ``faulthandler``: if non-zero, call ``faulthandler.enable()``
* ``filesystem_encoding``: Filesystem encoding,
  ``sys.getfilesystemencoding()``
* ``filesystem_errors``: Filesystem encoding errors,
  ``sys.getfilesystemencodeerrors()``
* ``hash_seed``, ``use_hash_seed``: randomized hash function seed
* ``home``: Python home
* ``import_time``: if non-zero, profile import time
* ``inspect``: enter interactive mode after executing a script or a
  command
* ``install_signal_handlers``: install signal handlers?
* ``interactive``: interactive mode
* ``legacy_windows_stdio`` (Windows only): if non-zero, use
  ``io.FileIO`` instead of ``WindowsConsoleIO`` for ``sys.stdin``,
  ``sys.stdout`` and ``sys.stderr``.
* ``malloc_stats``: if non-zero, dump memory allocation statistics
  at exit
* ``module_search_path_env``: ``PYTHONPATH`` environment variale value
* ``module_search_paths``, ``use_module_search_paths``: ``sys.path``
* ``optimization_level``: compilation optimization level
* ``parser_debug``: if non-zero, turn on parser debugging output (for
  expert only, depending on compilation options).
* ``prefix``: ``sys.prefix``
* ``program_name``: Program name
* ``program``: ``argv[0]`` or an empty string
* ``pycache_prefix``: ``.pyc`` cache prefix
* ``quiet``: quiet mode (ex: don't display the copyright and version
  messages even in interactive mode)
* ``run_command``: ``-c COMMAND`` argument
* ``run_filename``: ``python3 SCRIPT`` argument
* ``run_module``: ``python3 -m MODULE`` argument
* ``show_alloc_count``: show allocation counts at exit
* ``show_ref_count``: show total reference count at exit
* ``site_import``: import the ``site`` module at startup?
* ``skip_source_first_line``: skip the first line of the source
* ``stdio_encoding``, ``stdio_errors``: encoding and encoding errors of
  ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``
* ``tracemalloc``: if non-zero, call ``tracemalloc.start(value)``
* ``user_site_directory``: if non-zero, add user site directory to
  ``sys.path``
* ``verbose``: if non-zero, enable verbose mode
* ``warnoptions``: options of the ``warnings`` module to build filters
* ``write_bytecode``: if non-zero, write ``.pyc`` files
* ``xoptions``: ``sys._xoptions``

There are also private fields which are for internal-usage only:

* ``_check_hash_pycs_mode``
* ``_frozen``
* ``_init_main``
* ``_install_importlib``

New Py_UnixMain() function
--------------------------

Python 3.7 provides a high-level ``Py_Main()`` function which requires
to pass command line arguments as ``wchar_t*`` strings. It is
non-trivial to use the correct encoding to decode bytes. Python has its
own set of issues with C locale coercion and UTF-8 Mode.

This PEP adds a new ``Py_UnixMain()`` function which takes command line
arguments as bytes::

    int Py_UnixMain(int argc, char **argv)

Memory allocations and Py_DecodeLocale()
----------------------------------------

New pre-initialization and initialization APIs use constant
``PyPreConfig`` or ``PyConfig`` structures. If memory is allocated
dynamically, the caller is responsible to release it.  Using static
strings is just fine.

Python memory allocation functions like ``PyMem_RawMalloc()`` must not
be used before Python pre-initialization.  Using ``malloc()`` and
``free()`` is always safe.

``Py_DecodeLocale()`` must only be used after the pre-initialization.


XXX Open Questions
==================

This PEP is still a draft with open questions which should be answered:

* Do we need to add an API for import ``inittab``?
* What about the stable ABI? Should we add a version into
  ``PyPreConfig`` and ``PyConfig`` structures somehow? The Windows API
  is known for its ABI stability and it stores the structure size into
  the structure directly. Do the same?
* The PEP 432 stores ``PYTHONCASEOK`` into the config. Do we need
  to add something for that into ``PyConfig``? How would it be exposed
  at the Python level for ``importlib``? Passed as an argument to
  ``importlib._bootstrap._setup()`` maybe?


Backwards Compatibility
=======================

This PEP only adds a new API: it leaves the existing API unchanged and
has no impact on the backwards compatibility.


Alternative: PEP 432
====================

This PEP is inspired by Nick Coghlan's PEP 432 with a main difference:
it only allows to configure Python before its initialization.

The PEP 432 uses three initialization phases: Pre-Initialization,
Initializing, Initialized. It is possible to configure Python between
Initializing and Initialized phases using Python objects.

This PEP only uses C types like ``int`` and ``wchar_t*`` (and
``PyWideCharList`` structure). All parameters must be configured at once
before the Python initialization using the ``PyConfig`` structure.


Annex: Python Configuration
===========================

Priority and Rules
------------------

Priority of configuration parameters, highest to lowest:

* ``PyConfig``
* ``PyPreConfig``
* Configuration files
* Command line options
* Environment variables
* Global configuration variables

Priority of warning options, highest to lowest:

* ``PyConfig.warnoptions``
* ``PyConfig.dev_mode`` (add ``"default"``)
* ``PYTHONWARNINGS`` environment variables
* ``-W WARNOPTION`` command line argument
* ``PyConfig.bytes_warning`` (add ``"error::BytesWarning"`` if greater
  than 1, or add ``"default::BytesWarning``)

Rules on ``PyConfig`` and ``PyPreConfig`` parameters:

* If ``isolated`` is non-zero, ``use_environment`` and
  ``user_site_directory`` are set to 0
* If ``legacy_windows_fs_encoding`` is non-zero, ``utf8_mode`` is set to
  0
* If ``dev_mode`` is non-zero, ``allocator`` is set to ``"debug"``,
  ``faulthandler`` is set to 1, and ``"default"`` filter is added to
  ``warnoptions``. But ``PYTHONMALLOC`` has the priority over
  ``dev_mode`` to set the memory allocator.

Configuration Files
-------------------

Python configuration files:

* ``pyvenv.cfg``
* ``python._pth`` (Windows only)
* ``pybuilddir.txt`` (Unix only)

Global Configuration Variables
------------------------------

Global configuration variables mapped to ``PyPreConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_LegacyWindowsFSEncodingFlag``        ``legacy_windows_fs_encoding``
``Py_LegacyWindowsFSEncodingFlag``        ``legacy_windows_fs_encoding``
``Py_UTF8Mode``                           ``utf8_mode``
``Py_UTF8Mode``                           ``utf8_mode``
========================================  ================================

Global configuration variables mapped to ``PyConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_BytesWarningFlag``                   ``bytes_warning``
``Py_DebugFlag``                          ``parser_debug``
``Py_DontWriteBytecodeFlag``              ``write_bytecode``
``Py_FileSystemDefaultEncodeErrors``      ``filesystem_errors``
``Py_FileSystemDefaultEncoding``          ``filesystem_encoding``
``Py_FrozenFlag``                         ``_frozen``
``Py_HasFileSystemDefaultEncoding``       ``filesystem_encoding``
``Py_HashRandomizationFlag``              ``use_hash_seed``, ``hash_seed``
``Py_IgnoreEnvironmentFlag``              ``use_environment``
``Py_InspectFlag``                        ``inspect``
``Py_InteractiveFlag``                    ``interactive``
``Py_IsolatedFlag``                       ``isolated``
``Py_LegacyWindowsStdioFlag``             ``legacy_windows_stdio``
``Py_NoSiteFlag``                         ``site_import``
``Py_NoUserSiteDirectory``                ``user_site_directory``
``Py_OptimizeFlag``                       ``optimization_level``
``Py_QuietFlag``                          ``quiet``
``Py_UnbufferedStdioFlag``                ``buffered_stdio``
``Py_VerboseFlag``                        ``verbose``
``_Py_HasFileSystemDefaultEncodeErrors``  ``filesystem_errors``
``Py_BytesWarningFlag``                   ``bytes_warning``
``Py_DebugFlag``                          ``parser_debug``
``Py_DontWriteBytecodeFlag``              ``write_bytecode``
``Py_FileSystemDefaultEncodeErrors``      ``filesystem_errors``
``Py_FileSystemDefaultEncoding``          ``filesystem_encoding``
``Py_FrozenFlag``                         ``_frozen``
``Py_HasFileSystemDefaultEncoding``       ``filesystem_encoding``
``Py_HashRandomizationFlag``              ``use_hash_seed``, ``hash_seed``
``Py_IgnoreEnvironmentFlag``              ``use_environment``
``Py_InspectFlag``                        ``inspect``
``Py_InteractiveFlag``                    ``interactive``
``Py_IsolatedFlag``                       ``isolated``
``Py_LegacyWindowsStdioFlag``             ``legacy_windows_stdio``
``Py_NoSiteFlag``                         ``site_import``
``Py_NoUserSiteDirectory``                ``user_site_directory``
``Py_OptimizeFlag``                       ``optimization_level``
``Py_QuietFlag``                          ``quiet``
``Py_UnbufferedStdioFlag``                ``buffered_stdio``
``Py_VerboseFlag``                        ``verbose``
``_Py_HasFileSystemDefaultEncodeErrors``  ``filesystem_errors``
========================================  ================================


``Py_LegacyWindowsFSEncodingFlag`` and ``Py_LegacyWindowsStdioFlag`` are
only available on Windows.

Command Line Arguments
----------------------

Usage::

    python3 [options]
    python3 [options] -c COMMAND
    python3 [options] -m MODULE
    python3 [options] SCRIPT


Command line options mapped to pseudo-action on ``PyConfig`` fields:

================================  ================================
Option                            ``PyConfig`` field
================================  ================================
``-b``                            ``bytes_warning++``
``-B``                            ``write_bytecode = 0``
``-c COMMAND``                    ``run_module = COMMAND``
``--check-hash-based-pycs=MODE``  ``_check_hash_pycs_mode = MODE``
``-d``                            ``parser_debug++``
``-E``                            ``use_environment = 0``
``-i``                            ``inspect++`` and ``interactive++``
``-I``                            ``isolated = 1``
``-m MODULE``                     ``run_module = MODULE``
``-O``                            ``optimization_level++``
``-q``                            ``quiet++``
``-R``                            ``use_hash_seed = 0``
``-s``                            ``user_site_directory = 0``
``-S``                            ``site_import``
``-t``                            ignored (kept for backwards compatibility)
``-u``                            ``buffered_stdio = 0``
``-v``                            ``verbose++``
``-W WARNING``                    add ``WARNING`` to ``warnoptions``
``-x``                            ``skip_source_first_line = 1``
``-X XOPTION``                    add ``XOPTION`` to ``xoptions``
================================  ================================

``-h``, ``-?`` and ``-V`` options are handled outside ``PyConfig``.

Environment Variables
---------------------

Environment variables mapped to ``PyPreConfig`` fields:

=================================  =============================================
Variable                           ``PyPreConfig`` field
=================================  =============================================
``PYTHONCOERCECLOCALE``            ``coerce_c_locale``, ``coerce_c_locale_warn``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONLEGACYWINDOWSFSENCODING``  ``legacy_windows_fs_encoding``
``PYTHONMALLOC``                   ``allocator``
``PYTHONUTF8``                     ``utf8_mode``
=================================  =============================================

Environment variables mapped to ``PyConfig`` fields:

=================================  ====================================
Variable                           ``PyConfig`` field
=================================  ====================================
``PYTHONDEBUG``                    ``parser_debug``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONDONTWRITEBYTECODE``        ``write_bytecode``
``PYTHONDUMPREFS``                 ``dump_refs``
``PYTHONEXECUTABLE``               ``program_name``
``PYTHONFAULTHANDLER``             ``faulthandler``
``PYTHONHASHSEED``                 ``use_hash_seed``, ``hash_seed``
``PYTHONHOME``                     ``home``
``PYTHONINSPECT``                  ``inspect``
``PYTHONIOENCODING``               ``stdio_encoding``, ``stdio_errors``
``PYTHONLEGACYWINDOWSSTDIO``       ``legacy_windows_stdio``
``PYTHONMALLOCSTATS``              ``malloc_stats``
``PYTHONNOUSERSITE``               ``user_site_directory``
``PYTHONOPTIMIZE``                 ``optimization_level``
``PYTHONPATH``                     ``module_search_path_env``
``PYTHONPROFILEIMPORTTIME``        ``import_time``
``PYTHONPYCACHEPREFIX,``           ``pycache_prefix``
``PYTHONTRACEMALLOC``              ``tracemalloc``
``PYTHONUNBUFFERED``               ``buffered_stdio``
``PYTHONVERBOSE``                  ``verbose``
``PYTHONWARNINGS``                 ``warnoptions``
=================================  ====================================

``PYTHONLEGACYWINDOWSFSENCODING`` and ``PYTHONLEGACYWINDOWSSTDIO`` are
specific to Windows.

``PYTHONDEVMODE`` is mapped to ``PyPreConfig.dev_mode`` and
``PyConfig.dev_mode``.


Annex: Python 3.7 API
=====================

Python 3.7 has 4 functions in its C API to initialize and finalize
Python:

* ``Py_Initialize()``, ``Py_InitializeEx()``: initialize Python
* ``Py_Finalize()``, ``Py_FinalizeEx()``: finalize Python

Python can be configured using scattered global configuration variables
(like ``Py_IgnoreEnvironmentFlag``) and using the following functions:

* ``PyImport_AppendInittab()``
* ``PyImport_ExtendInittab()``
* ``PyMem_SetAllocator()``
* ``PyMem_SetupDebugHooks()``
* ``PyObject_SetArenaAllocator()``
* ``Py_SetPath()``
* ``Py_SetProgramName()``
* ``Py_SetPythonHome()``
* ``Py_SetStandardStreamEncoding()``
* ``PySys_AddWarnOption()``
* ``PySys_AddXOption()``
* ``PySys_ResetWarnOptions()``

There is also a high-level ``Py_Main()`` function.


Copyright
=========

This document has been placed in the public domain.