[Python-Dev] Python startup time

On Tue, May 1, 2018 at 11:55 PM, Ray Donnelly <mingw.android at gmail.com> wrote:

> Is your Python interpreter statically linked? The Python 3 ones from the
> anaconda distribution (use Miniconda!) are for Linux and macOS and that
> roughly halved our startup times.

My Python interpreters use a shared library. I'll definitely investigate
the performance of a statically-linked interpreter.

Correct me if I'm wrong, but aren't there downsides with regards to C
extension compatibility to not having a shared libpython? Or does all the
packaging tooling "just work" without a libpython? (It's possible I have my
wires crossed up with something else regarding a statically linked Python.)

On Wed, May 2, 2018 at 2:26 AM, Victor Stinner <vstinner at redhat.com> wrote:

> What do you propose to make Python startup faster?

That's a very good question. I'm not sure I'm able to answer it because I
haven't dug into CPython's internals much further than what is
required to implement C extensions. But I can share insight from what the
Mercurial project has collectively learned.

> As I wrote in my previous emails, many Python core developers care of
> the startup time and we are working on making it faster.
> INADA Naoki added -X importtime to identify slow imports and
> understand where Python spent its startup time.

-X importtime is a great start! For a follow-up enhancement, it would be
useful to see what aspects of import are slow. Is it finding modules
(involves filesystem I/O)? Is it unmarshaling pyc files? Is it executing
the module code? If executing code, what part is slow? Inline
statements/expressions? Compiling types? Printing the microseconds it takes
to import a module is useful. But it only gives me a general direction: I
want to know what parts of the import made it slow so I know if I should be
focusing on code running during module import, slimming down the size of a
module, eliminating the module import from fast paths, pursuing alternative
module importers, etc.
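
To make the kind of breakdown I'm asking for concrete, here is a minimal
sketch that times the three phases separately for a single module. The
function name and the phase labels are my own invention, not anything
CPython provides, and it assumes a top-level pure Python module. If the
module is already in sys.modules, the "find" number will be meaninglessly
small because find_spec() just returns the cached spec.

```python
import importlib.util
import time


def time_import_phases(name):
    """Roughly time finding, loading code for, and executing one module.

    Assumes `name` is a top-level pure Python module. The function name and
    phase labels are illustrative, not part of any CPython API.
    """
    start = time.perf_counter()
    spec = importlib.util.find_spec(name)  # path search (filesystem I/O)
    if spec is None or not hasattr(spec.loader, "get_code"):
        raise ImportError(f"cannot inspect {name!r}")
    found = time.perf_counter()

    # Reads and unmarshals the cached .pyc, or compiles the source.
    code = spec.loader.get_code(name)
    if code is None:
        raise ImportError(f"{name!r} has no Python code object")
    loaded = time.perf_counter()

    # Run the module body in a throwaway module object that is never
    # registered in sys.modules (nested imports still are).
    module = importlib.util.module_from_spec(spec)
    exec(code, module.__dict__)
    executed = time.perf_counter()

    return {
        "find": found - start,
        "load_code": loaded - found,
        "exec": executed - loaded,
    }
```

Calling time_import_phases("colorsys") returns a dict of seconds per phase.
It says nothing about *which* statements in the module body are slow, which
is the part I'd most like tooling for.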

> Recent example: Barry Warsaw identified that pkg_resources is slow and
> added importlib.resources to Python 3.7:
> https://docs.python.org/dev/library/importlib.html#module-importlib.resources
> Brett Cannon is also working on a standard solution for lazy imports
> since many years:
> https://pypi.org/project/modutil/
> https://snarky.ca/lazy-importing-in-python-3-7/

Mercurial has used lazy module imports for years. On 2.7.14, it reduces `hg
version` from ~160ms to ~55ms (~34% of original). On Python 3, we're using
`importlib.util.LazyLoader` and it reduces `hg version` on 3.7 from ~245ms
to ~120ms (~49% of original). I'm not sure why Python 3's built-in module
importer doesn't yield the speedup that our custom Python 2 importer does.
One explanation is our custom importer is more advanced than importlib.
Another is that Python 3's import mechanism is slower (possibly due to
being written in Python instead of C). We haven't yet spent much time
optimizing Mercurial for Python 3: our immediate goal is to get it working
first. Given the startup performance problem on Python 3, it is only a
matter of time before we dig into this further.
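
For anyone who hasn't used it, the general shape of a LazyLoader-based
import looks like the sketch below. This follows the recipe in the importlib
documentation and is not Mercurial's actual importer; the helper name
lazy_import is just for illustration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body only runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"no module named {name!r}")
    # Wrap the real loader so that executing the module is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness, does not run the body yet
    return module
```

Note that finding the module still happens eagerly here. Only the
execution of the module body is deferred.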

It's worth noting that lazy module importing can be undone via common
patterns. The most common is `from foo import X`. It's *really* difficult to
implement a proper object proxy, so Mercurial's lazy importer gives up in this
case: it imports the module and exports the symbol. (But if the imported
module is a package, we detect that and make the exported names proxies to
a lazy module.)

Another common way to undermine the lazy importer is code that accesses a
module attribute while the importing module is being executed, e.g.

    import foo

    class myobject(foo.Foo):
        pass

Mercurial goes out of its way to avoid these patterns so modules can be
delay imported as much as possible. As long as import times are
problematic, it would be helpful if the standard library adopted similar
patterns. Although I recognize there are backwards compatibility concerns
that tie your hands a bit.
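
Here is a small sketch of the difference, using importlib.util.LazyLoader
rather than Mercurial's importer. The module name lazydemo_foo and the
environment variable used as an "executed" marker are made up for the
demonstration. Subclassing at module level would touch foo.Foo immediately;
moving the attribute access into a function keeps the module unloaded until
the function is actually called.

```python
import importlib
import importlib.util
import os
import sys
import tempfile

MARKER = "LAZYDEMO_FOO_EXECUTED"  # hypothetical flag set by the demo module


def lazy_import(name):
    """Lazy import helper following the importlib documentation recipe."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


# Write a throwaway module whose body records that it was executed.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "lazydemo_foo.py"), "w") as fh:
    fh.write(f"import os\nos.environ[{MARKER!r}] = '1'\n\nclass Foo:\n    pass\n")
sys.path.insert(0, demo_dir)
importlib.invalidate_caches()
os.environ.pop(MARKER, None)

foo = lazy_import("lazydemo_foo")
executed_after_import = MARKER in os.environ


def make_subclass():
    # foo.Foo is only touched when this function runs, not at import time.
    class MyObject(foo.Foo):
        pass

    return MyObject


executed_after_def = MARKER in os.environ
make_subclass()
executed_after_call = MARKER in os.environ
```

The module body only runs once make_subclass() is called. A module-level
`class myobject(foo.Foo)` would have forced it to run right away.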

> Nick Coghlan is working on the C API to configure Python startup: PEP
> 432. When it will be ready, maybe Mercurial could use a custom Python
> optimized for its use case.

That looks great!

The direction Mercurial is going in is that `hg` will likely become a Rust
binary (instead of a #!python script) that will use an embedded Python
interpreter. So we will have low-level control over the interpreter via the
C API. I'd also like to see us distribute a copy of Python in our official
builds. This will allow us to take various shortcuts, such as not having to
probe various sys.path entries since certain packages can only exist in one
place. I'd love to get to the state Google is at where they have
self-contained binaries with ELF sections containing Python modules. But
that requires a bit of very low-level hacking. We'll likely have a Rust
binary (that possibly static links libpython) and a separate JAR/zip-like
file containing resources.

But many people obtain Python via their system package manager and no
matter how hard we scream that Mercurial is a standalone application, they
will configure their packages to link against the system libpython and use
the system Python's standard library. This will potentially undo many of
our startup time wins.

> IMHO Python import system is inefficient. We try too many alternative
> names.
> Example with Python 3.8
> $ ./python -vv:
> >>> import dontexist
> # trying /home/vstinner/prog/python/master/dontexist.cpython-38dm-x86_64-linux-gnu.so
> # trying /home/vstinner/prog/python/master/dontexist.abi3.so
> # trying /home/vstinner/prog/python/master/dontexist.so
> # trying /home/vstinner/prog/python/master/dontexist.py
> # trying /home/vstinner/prog/python/master/dontexist.pyc
> # trying /home/vstinner/prog/python/master/Lib/dontexist.cpython-38dm-x86_64-linux-gnu.so
> # trying /home/vstinner/prog/python/master/Lib/dontexist.abi3.so
> # trying /home/vstinner/prog/python/master/Lib/dontexist.so
> # trying /home/vstinner/prog/python/master/Lib/dontexist.py
> # trying /home/vstinner/prog/python/master/Lib/dontexist.pyc
> # trying /home/vstinner/prog/python/master/build/lib.linux-x86_64-3.8-pydebug/dontexist.cpython-38dm-x86_64-linux-gnu.so
> # trying /home/vstinner/prog/python/master/build/lib.linux-x86_64-3.8-pydebug/dontexist.abi3.so
> # trying /home/vstinner/prog/python/master/build/lib.linux-x86_64-3.8-pydebug/dontexist.so
> # trying /home/vstinner/prog/python/master/build/lib.linux-x86_64-3.8-pydebug/dontexist.py
> # trying /home/vstinner/prog/python/master/build/lib.linux-x86_64-3.8-pydebug/dontexist.pyc
> # trying /home/vstinner/.local/lib/python3.8/site-packages/dontexist.cpython-38dm-x86_64-linux-gnu.so
> # trying /home/vstinner/.local/lib/python3.8/site-packages/dontexist.abi3.so
> # trying /home/vstinner/.local/lib/python3.8/site-packages/dontexist.so
> # trying /home/vstinner/.local/lib/python3.8/site-packages/dontexist.py
> # trying /home/vstinner/.local/lib/python3.8/site-packages/dontexist.pyc
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<frozen importlib._bootstrap>", line 983, in _find_and_load
>   File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
> ModuleNotFoundError: No module named 'dontexist'
> Why do we still check for the .pyc file outside __pycache__ directories?
> Why do we have to check for 3 different names for .so files?

Yes, I also cringe every time I trace Python's system calls and see these
needless stats and file opens. Unless Python adds the ability to tell the
import mechanism what type of module to import, Mercurial will likely
modify our custom importer to only look for specific files. We do provide
pure Python modules for modules that have C implementations. But we have
code that ensures that the C version is loaded for certain Python
configurations because we don't want users accidentally using the non-C
modules and then complaining about Mercurial's performance! We already
denote the set of modules backed by C. What we're missing (but is certainly
possible to implement) is code that limits the module finding search
depending on whether the module is backed by Python or C. But this only
really works for Mercurial's modules: we don't really know what the
standard library is doing and coding assumptions into Mercurial about
standard library behavior feels dangerous.
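
As a rough illustration of limiting the search by module type, importlib's
FileFinder already accepts the set of (loader, suffixes) pairs it should
consider. The sketch below is not Mercurial code; the two helper names are
hypothetical, and taking only the first entry of EXTENSION_SUFFIXES assumes
that entry is the most specific ABI-tagged suffix, which is what I see on
CPython but haven't verified everywhere.

```python
import importlib.machinery as machinery


def source_only_finder(directory):
    """Finder that only considers .py files (and packages) in `directory`."""
    return machinery.FileFinder(
        directory,
        (machinery.SourceFileLoader, [".py"]),
    )


def extension_only_finder(directory):
    """Finder that only considers one extension module suffix.

    Assumption: the first entry of EXTENSION_SUFFIXES is the most specific
    (ABI-tagged) suffix for the running interpreter.
    """
    return machinery.FileFinder(
        directory,
        (machinery.ExtensionFileLoader, machinery.EXTENSION_SUFFIXES[:1]),
    )
```

A custom importer could pick one of these per module based on the existing
list of C-backed modules, so a pure Python module never causes a probe for
.so names and vice versa. finder.find_spec("name") returns None when nothing
with an allowed suffix exists.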

If we ship our own Python distribution, we'll likely have a jar-like file
containing all modules. Determining which file to load will read an
in-memory file index and not require any expensive system calls to look for
files.

> Does Mercurial need all directories of sys.path?

No and yes. Mercurial by itself can get by with just the standard library
and Mercurial's own packages. But extensions change everything, because
they can import from anywhere on sys.path. An extension could always modify
sys.path itself, though, so limiting sys.path inside Mercurial is somewhat
reasonable. That said, it's definitely unexpected for a Python application
to remove entries from sys.path when it starts.

> What's the status of the "system python" project? :-)
> I also would prefer Python without the site module. Can we rewrite
> this module in C maybe? Until recently, the site module was needed on
> Python to create the "mbcs" encoding alias. Hopefully, the feature has
> been removed into Lib/encodings/__init__.py (new private _alias_mbcs()
> function).

I also lament the startup time effects of site.py. When `hg` is a Rust
binary, we will almost certainly skip site.py and manually perform any
required actions that it was performing.
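
For anyone who wants a rough feel for what site costs on their machine, the
interpreter's -S option disables the import of site. Below is a minimal
sketch comparing best-of-N wall time for `python -c pass` with and without
it. The helper name is made up, and the numbers include process creation
overhead, so treat them as indicative only.

```python
import subprocess
import sys
import time


def best_startup_seconds(extra_args=(), runs=5):
    """Best-of-N wall time for starting the interpreter and running `pass`."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, *extra_args, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("with site:   ", best_startup_seconds())
    print("without site:", best_startup_seconds(["-S"]))
```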

> Python 3.7b3+:
> $ python3.7 -X importtime -c pass
> import time: self [us] | cumulative | imported package
> import time:        95 |         95 | zipimport
> import time:       589 |        589 | _frozen_importlib_external
> import time:        67 |         67 |     _codecs
> import time:       498 |        565 |   codecs
> import time:       425 |        425 |   encodings.aliases
> import time:       641 |       1629 | encodings
> import time:       228 |        228 | encodings.utf_8
> import time:       143 |        143 | _signal
> import time:       335 |        335 | encodings.latin_1
> import time:        58 |         58 |     _abc
> import time:       265 |        322 |   abc
> import time:       298 |        619 | io
> import time:        69 |         69 |       _stat
> import time:       196 |        265 |     stat
> import time:       169 |        169 |       genericpath
> import time:       336 |        505 |     posixpath
> import time:      1190 |       1190 |     _collections_abc
> import time:       600 |       2557 |   os
> import time:       223 |        223 |   _sitebuiltins
> import time:       214 |        214 |   sitecustomize
> import time:        74 |         74 |   usercustomize
> import time:       477 |       3544 | site

As for things Python could do to make things better, one idea is for
"package bundles." Instead of using .py, .pyc, .so, etc files as separate
files on the filesystem, allow Python packages to be distributed as
standalone "archive" files. Like Java's jar files. This has the advantage
that there is only a single place to look for files in a given Python
package. And since the bundle is immutable, you can index it so imports
don't need to touch the filesystem to discover what is present: you do a
quick memory lookup and jump straight to the available file. If you go this
route, please don't require the use of zlib for file compression, as zlib
is painfully slow compared to alternatives like lz4 and zstandard.
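
To sketch what I mean by an indexed bundle: the importer below resolves
modules with a single dict lookup and never touches the filesystem. It is a
toy under stated assumptions. The class name BundleFinder is hypothetical,
the "bundle" is just a dict of module name to source text, and it ignores
packages, bytecode caching, compression, and extension modules.

```python
import importlib.abc
import importlib.util
import sys


class BundleFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer serving modules from an in-memory index."""

    def __init__(self, index):
        # Maps fully qualified module names to Python source text.
        self._index = dict(index)

    def find_spec(self, fullname, path=None, target=None):
        # One memory lookup replaces the per-directory, per-suffix probing.
        if fullname not in self._index:
            return None
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        source = self._index[module.__name__]
        code = compile(source, f"<bundle:{module.__name__}>", "exec")
        exec(code, module.__dict__)


def install_bundle(index):
    """Put a BundleFinder first on sys.meta_path and return it."""
    finder = BundleFinder(index)
    sys.meta_path.insert(0, finder)
    return finder
```

After install_bundle({"bundled_demo": "VALUE = 42\n"}), a plain
`import bundled_demo` is served from the dict. A real implementation would
store precompiled code and map the index from a single immutable file.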

I know this kinda/sorta exists with zipimporter. But zipimporter uses zlib
(slow) and only allows .py/.pyc files. And I think some Python application
distribution tools have also solved this problem. I'd *really* like to see
a proper/robust solution in Python itself. In that vein, it would be
really nice if the "standalone Python application" story were a bit more
formalized. From my perspective, it is insanely difficult to package and
distribute an application that happens to use Python. It requires vastly
different solutions for different platforms. I want to declare a minimal
boilerplate somewhere (perhaps in setup.py) and run a command that produces
an as-self-contained-as-possible application complete with platform-native
installers. Presumably such a self-contained application could take many
shortcuts with regards to process startup and mitigate this general
problem. Again, Mercurial is trending in the direction of making `hg` a
Rust binary and distributing its own Python. Since we have to solve this
packaging+distribution problem on multiple platforms, I'll try to keep an
eye towards making whatever solution we concoct reusable by other projects.