
[Python-Dev] A fast startup patch (was: Python startup time)


On May 7, 2018 9:15:32 PM Steve Dower <steve.dower at python.org> wrote:

> "the data shows that a focused change to address file system inefficiencies 
> has the potential to broadly and transparently deliver benefit to users 
> without affecting existing code or workflows."
>
> This is consistent with a Node.js experiment I heard about where they 
> compiled an entire application in a single (HUGE!) .js file. Reading a 
> single large file from disk is quicker than many small files on every 
> significant file system I'm aware of. Is there benefit to supporting import 
> of .tar files as we currently do .zip? Or perhaps having a special 
> fast-path for uncompressed .zip files?
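
On the uncompressed .zip question: the existing zip import support already accepts archives whose members are stored without compression, so the idea can be tried today. A minimal sketch, assuming a made-up module name (bundled_mod) and archive name (bundle.zip) that exist only for this illustration:

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Hypothetical module name and contents, used only for this illustration.
MODULE_NAME = "bundled_mod"
MODULE_SOURCE = "VALUE = 42\n"


def build_stored_zip(directory):
    """Write a zip archive whose members are stored without compression."""
    archive_path = os.path.join(directory, "bundle.zip")
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(MODULE_NAME + ".py", MODULE_SOURCE)
    return archive_path


def import_from_archive(archive_path, name):
    """Temporarily put the archive on sys.path and import a module from it."""
    sys.path.insert(0, archive_path)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(name)
    finally:
        sys.path.remove(archive_path)


tmp_dir = tempfile.mkdtemp()
archive = build_stored_zip(tmp_dir)
module = import_from_archive(archive, MODULE_NAME)
print(module.VALUE)
```

This only shows that a single stored archive can stand in for many small files; it says nothing about how much time that saves on a given file system.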

I kind of built something like this, though I haven't really put in the 
effort to make it overly usable yet:

https://github.com/kirbyfan64/bluesnow

(Bonus points to anyone who gets the character reference in the name, 
though I seriously doubt it.)

Main thing I noticed was that reading compiled .pyc files is far faster 
than uncompiled Python code, even if you eliminate the disk access. Kind of 
obvious in retrospect, but still something to note.
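
A rough way to see this with disk access taken out of the picture is to time compiling source text against unmarshaling the equivalent code object, both held in memory. This is only a sketch: the generated module body is hypothetical, and the marshaled payload stands in for the body of a .pyc file (a real .pyc also carries a small header in front of it).

```python
import marshal
import timeit

# Hypothetical module body; any reasonably sized source would do.
SOURCE = "\n".join(
    "def func_{0}(x):\n    return x * {0} + 1\n".format(i) for i in range(200)
)

# Stand-in for the body of a .pyc file: the marshaled code object.
code_obj = compile(SOURCE, "<example>", "exec")
payload = marshal.dumps(code_obj)


def load_from_source():
    """Parse and compile the source text, as for an uncompiled .py file."""
    return compile(SOURCE, "<example>", "exec")


def load_from_bytecode():
    """Rebuild the code object from its marshaled form, as for a .pyc."""
    return marshal.loads(payload)


source_time = timeit.timeit(load_from_source, number=50)
bytecode_time = timeit.timeit(load_from_bytecode, number=50)
print("compile from source: {:.4f}s".format(source_time))
print("unmarshal bytecode:  {:.4f}s".format(bytecode_time))
```

The exact numbers depend on the interpreter version and the shape of the source, so treat the output as indicative only.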

However, there are more obstacles to this in the Python world than the JS 
world. C extensions have a heavier prevalence here, distribution is a bit 
weirder (sorry, even with Pipfiles), and JavaScript already has an entire 
ecosystem built around packing files together from the web world.

>
> Top-posted from my Windows phone
>
> From: Carl Shapiro
> Sent: Monday, May 7, 2018 14:36
> To: Nathaniel Smith
> Cc: Nick Coghlan; Python Dev
> Subject: Re: [Python-Dev] A fast startup patch (was: Python startup time)
>
> On Fri, May 4, 2018 at 6:58 PM, Nathaniel Smith <njs at pobox.com> wrote:
> What are the obstacles to including "preloaded" objects in regular .pyc 
> files, so that everyone can take advantage of this without rebuilding the 
> interpreter?
>
> The system we have developed can create a shared object file for each 
> compiled Python file. However, such a representation is not directly 
> usable. First, certain shared constants, such as interned strings, must be 
> kept globally unique across object code files. Second, some marshaled 
> objects, such as the hashed collections, must be initialized with 
> randomization state that is not available until after the hosting runtime 
> has been initialized.
>
> We are able to work around the first issue by generating a heap image with 
> the transitive closure of all modules that will be loaded which allows us 
> to easily maintain uniqueness guarantees. We are able to work around the 
> second issue with some unobservable changes to the affected data structures.
>
> Based on our numbers, it appears there should be some hesitancy--at this 
> time--to changing the format of compiled Python file for the sake of 
> load-time performance. In contrast, the data shows that a focused change 
> to address file system inefficiencies has the potential to broadly and 
> transparently deliver benefit to users without affecting existing code or 
> workflows.
>


--
Ryan (????)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
https://refi64.com/