[Python-Dev] Store startup modules as C structures for 20%+ startup speed improvement?
On Fri, 14 Sep 2018 at 23:28, Neil Schemenauer <nas-python at arctrix.com> wrote:
> On 2018-09-14, Larry Hastings wrote:
> > [..] adding the stat calls back in costs you half the startup. So
> > any mechanism where we're talking to the disk _at all_ simply
> > isn't going to be as fast.
> Okay, so if we use hundreds of small .pyc files scattered all over
> the disk, that's bad? Who would have thunk it. ;-P
> We could have a new format, .pya (compiled python archive) that has
> data for many .pyc files in it. In normal runs you would have one
> or just a handful of these things (e.g. one for stdlib, one for
> your app and all the packages it uses). Then you mmap these just
> once and rely on OS page faults to bring in the data as you need it.
> The .pya would have a hash table at the start or end that tells you
> the offset for each module.
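
For concreteness, the archive idea quoted above could be sketched roughly as follows. No ".pya" format exists in CPython; the layout here (an 8-byte offset header, marshalled code blobs, then a marshalled index dict at the end) and the names write_pya and PyaArchive are assumptions made up for this illustration.

```python
import marshal
import mmap
import os
import struct
import tempfile

# Hypothetical layout, invented for this sketch (not a real CPython format):
#   [8-byte little-endian index offset][code blobs ...][marshalled index dict]
HEADER = struct.Struct("<Q")


def write_pya(path, sources):
    """Compile each source string and pack the results into one archive."""
    index = {}
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))  # placeholder, patched once the index offset is known
        for name, source in sources.items():
            blob = marshal.dumps(compile(source, name, "exec"))
            index[name] = (f.tell(), len(blob))  # (offset, size) of this module
            f.write(blob)
        index_offset = f.tell()
        f.write(marshal.dumps(index))  # index sits at the end of the file
        f.seek(0)
        f.write(HEADER.pack(index_offset))


class PyaArchive:
    """Map the archive once; pages are faulted in as modules are requested."""

    def __init__(self, path):
        with open(path, "rb") as f:
            self._map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        (index_offset,) = HEADER.unpack_from(self._map, 0)
        # The loaded dict plays the role of the hash table of offsets.
        self._index = marshal.loads(self._map[index_offset:])

    def load_code(self, name):
        offset, size = self._index[name]
        # Slicing copies the bytes out of the mapping before unmarshalling.
        return marshal.loads(self._map[offset:offset + size])

    def close(self):
        self._map.close()
```

Looking up a module costs one dict lookup and one slice of the mapping, with no per-module stat or open call. A real implementation would still need an import hook and some invalidation scheme, neither of which is shown here.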
Isn't that essentially what putting the stdlib in a zipfile does? (See
the Windows embeddable distribution for an example). It probably uses
normal I/O rather than mmap, but maybe adding a "use mmap" flag to the
zipfile module would be a more general enhancement that zipimport
could use for free.
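
As a small illustration of the zipfile route, the sketch below packs a module into a zip archive and imports it through the standard zipimport machinery by putting the archive on sys.path. The helper names build_zip and import_from_zip are made up for this example; the "use mmap" flag mentioned above is only a suggestion and is not shown, since zipfile has no such option.

```python
import importlib
import os
import sys
import tempfile
import zipfile
import zipimport


def build_zip(path, modules):
    """Write {module_name: source} into an uncompressed zip archive."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, source in modules.items():
            zf.writestr(name + ".py", source)


def import_from_zip(zip_path, module_name):
    """Import module_name with the archive temporarily at the front of sys.path."""
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(zip_path)


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    archive = os.path.join(tmp, "bundle.zip")
    # "zipdemo_mod" is a hypothetical module name used only for this demo.
    build_zip(archive, {"zipdemo_mod": "VALUE = 6 * 7\n"})
    mod = import_from_zip(archive, "zipdemo_mod")
    print(mod.VALUE, type(mod.__loader__).__name__)
```

The module is served by a zipimport.zipimporter instead of the normal file-based loader, so its lookup goes through the archive's directory and not through per-file stat calls.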