[Python-Dev] Compile-time resolution of packages [Was: Another update for PEP 394...]
On 2019-02-26, Gregory P. Smith wrote:
> On Tue, Feb 26, 2019 at 9:55 AM Barry Warsaw <barry at python.org> wrote:
> For an OS distro provided interpreter, being able to restrict its use to
> only OS distro provided software would be ideal (so ideal that people who
> haven't learned the hard distro maintenance lessons may hate me for it).
Interesting idea. I remember when I was helping develop Debian
packaging guides for Python software. I had to fight with people
to convince them that Debian packages should use the distro's own
interpreter path (e.g. /usr/bin/python) rather than whatever
"#!/usr/bin/env python" happens to find first.
The situation is much better now but I still sometimes have
packaged software fail because it picks up my version of
/usr/local/bin/python. I don't understand how people can believe
grabbing /usr/local/bin/python is going to be a way to build a
reliable system.
> Such a restriction could be implemented within the interpreter itself. For
> example: Say that only this set of fully qualified path whitelisted .py
> files are allowed to invoke it, with no interactive, stdin, or command line
> "-c" use allowed.
I think this is related to an idea I was tinkering with on the
weekend. Why shouldn't we do more compile-time linkage of Python
packages? At least, I think we should give people the option to
do it. Obviously you still need to also support run-time import
search (interactive REPL, __import__(unknown_at_compiletime)).
Here is a sketch of the idea (probably half-baked, as most of my
ideas are):
- add a PYTHONPACKAGES envvar and a -p option to 'python'
- the argument for these would be a colon-separated list of
  Python package archives (crates, bales, bundles?). The -p
  option could take a colon-separated list or be provided
  multiple times to specify more packages.
- the modules/packages contained in those archives become the
  preferred source of bytecode when those names are imported. We
  look there first. The crawling-around behavior (dynamic import
  based on sys.path) happens only if a module is not found, and
  could be turned off. (A rough sketch of this lookup order
  follows the list.)
- the linking of the modules could be computed when the code is
  compiled and the package archive created, rather than when the
  'import' statement gets executed. This would provide a number
  of advantages. It would be faster. Code analysis tools could
  statically determine which module imported code corresponds to.
  E.g. if your code calls module.foo then, assuming no monkey
  patching, you know what code 'foo' actually is.
- to get extra fancy, the package archives could be dynamic link
  libraries containing "frozen modules", like this FB experiment.
  That way, you avoid the unmarshal step and just execute the
  module bytecode directly. On startup, Python would dlopen all
  of the package archives specified by PYTHONPACKAGES. On init,
  it would build an index of the package tree and it would have
  the memory location of the code object for each module.
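To make the lookup order concrete, here is a toy version built on
today's importlib machinery. The PACKAGE_INDEX dict is a stand-in
for the index built from the archives; it and the class names are
made up for illustration:

    import sys
    from importlib.abc import Loader, MetaPathFinder
    from importlib.machinery import ModuleSpec

    # Stand-in for the compile-time index: module name -> code object.
    PACKAGE_INDEX = {
        "fastmod": compile("VALUE = 42", "<fastmod>", "exec"),
    }

    class ArchiveLoader(Loader):
        def create_module(self, spec):
            return None  # default module creation is fine

        def exec_module(self, module):
            # Execute the pre-compiled code object directly; no
            # filesystem crawl, no unmarshal of a .pyc.
            exec(PACKAGE_INDEX[module.__name__], module.__dict__)

    class ArchiveFinder(MetaPathFinder):
        def find_spec(self, fullname, path, target=None):
            if fullname in PACKAGE_INDEX:
                return ModuleSpec(fullname, ArchiveLoader())
            return None  # fall through to the normal sys.path search

    # Consulted before the path-based finders, so archives win.
    sys.meta_path.insert(0, ArchiveFinder())

    import fastmod
    print(fastmod.VALUE)  # 42

Because the index is fixed when the archive is linked, a static
analysis tool could consult the same table to resolve module.foo
to a concrete code object without executing any imports.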
That would seem like quite a useful thing. For an application
like Mercurial, they could build all the modules/packages
required into a single package archive. Or, there could be a
small number of archives (one for the standard Python library,
one for everything else that Mercurial needs).
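Usage might then look something like this (the -p flag and the
archive format/extension are, again, hypothetical):

    $ python -p /usr/lib/python3/stdlib.pypkg:/usr/lib/mercurial/hg.pypkg /usr/bin/hg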
Now that I write this, it sounds a lot like the debate between
static linking and dynamic linking. Golang does static linking and
people seem to like the single executable distribution.