[Python-Dev] Usage of the multiprocessing API and object lifetime
On Tue, 11 Dec 2018 16:33:54 +0100
Victor Stinner <vstinner at redhat.com> wrote:
> On Tue, 11 Dec 2018 at 16:14, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > What you are proposing here starts to smell like an anti-pattern to
> > me. Python _is_ a garbage-collected language, so by definition, there
> > _are_ going to be resources that are automatically collected when an
> > object disappears. If I'm allocating a 2GB bytes object, then PyPy may
> > delay the deallocation much longer than CPython. Do you propose we add
> > a release() method to bytes objects to avoid this issue (and emit a
> > warning for people who don't call release() on bytes objects)?
> We are not talking about simple strings, but processes and threads.
Right, but do those have an impact on the program's correctness, or
simply on its performance (or memory consumption)?
> "user-visible consequences" are that resources are kept alive longer
> than I would expect. When I use a context manager, I expect that
> Python will magically release everything for me.
I think there's a balancing act here: between "with pool" releasing
everything, and not taking too much time to execute the __exit__ method.
Currently, threads and processes may finish quietly between __exit__
and __del__, without adding significant latencies to your program's
execution.
> I prefer to explicitly manager resources like processes and threads
> since they can exit with error: killed by a signal, waitpid() failure
> (exit status already read by a different function), etc.
But multiprocessing.Pool manages them implicitly _by design_. People
who want to manage processes explicitly can use the Process class.