[Python-Dev] Usage of the multiprocessing API and object lifetime

On Tue, Dec 11, 2018, 07:13 Antoine Pitrou <solipsis at pitrou.net> wrote:

> What you are proposing here starts to smell like an anti-pattern to
> me.  Python _is_ a garbage-collected language, so by definition, there
> _are_ going to be resources that are automatically collected when an
> object disappears.  If I'm allocating a 2GB bytes object, then PyPy may
> delay the deallocation much longer than CPython.  Do you propose we add
> a release() method to bytes objects to avoid this issue (and emit a
> warning for people who don't call release() on bytes objects)?

I know this question is rhetorical, but it does actually have a principled
answer. Memory is a special resource, because the GC has a complete picture
of memory use in the program, and if there's a danger of running out of
memory then the GC will detect that and quickly run a collection before
anyone has a chance to notice. But it doesn't know about other resources
like descriptors, threads, processes, etc., so it can't detect or prevent
unbounded leaks of these resources.

Therefore, in a classic GC-ed language, bytes() doesn't need to be
explicitly released, but all other kinds of resources do. And according to
the language spec, Python is a classic GC-ed language.
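To make the "explicitly released" half concrete, here is a minimal sketch
of what that looks like for a non-memory resource. PipePair and roundtrip
are hypothetical names invented for this illustration, not anything in
multiprocessing; the only real calls are os.pipe, os.write, os.read and
os.close. The point is just that the descriptors are closed at a known
place in the code rather than whenever a collector gets around to it.

```python
import os


class PipePair:
    """Hypothetical wrapper that owns two OS-level file descriptors."""

    def __init__(self):
        self.read_fd, self.write_fd = os.pipe()
        self.closed = False

    def close(self):
        # Idempotent explicit release: safe to call more than once.
        if not self.closed:
            os.close(self.read_fd)
            os.close(self.write_fd)
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Release on leaving the with-block, whether or not an error occurred.
        self.close()


def roundtrip(payload):
    """Send a small payload through the pipe, then release the descriptors."""
    with PipePair() as pipe:
        os.write(pipe.write_fd, payload)
        return os.read(pipe.read_fd, len(payload))
```

Written this way, the code behaves the same on any implementation,
because nothing depends on when the PipePair object itself is collected.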

But things are complicated, because CPython isn't a classic GC-ed language,
exactly. In practice it's a sort of hybrid RAII/GC language. People
regularly write programs that rely on the refcount quick-release semantics for
correctness. A quick way to check: the only thing a reference cycle does is
make CPython start acting like an ordinary GC-ed language, so if you're
worrying about reference cycles, that's a strong sign that you're writing
CPython, not Python.
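A rough way to see the difference, as a sketch: Resource and
released_without_gc are hypothetical names made up for this example, and
the behaviour described in the comments is what CPython's refcounting
does, not something the language spec promises. On another
implementation both cases may well look the same.

```python
import gc
import weakref


class Resource:
    """Hypothetical stand-in for an object holding a descriptor or process."""


def released_without_gc(make_cycle):
    """Report whether an object is freed by `del` alone, with the cyclic
    collector switched off so that only refcounting can free it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        obj = Resource()
        if make_cycle:
            obj.self_ref = obj  # the object now keeps itself alive
        probe = weakref.ref(obj)  # lets us observe the object without owning it
        del obj
        # On CPython: freed right away without a cycle, still alive with one.
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

Calling released_without_gc(False) and released_without_gc(True) on
CPython should give different answers; the object left behind by the
second call is only reclaimed once the cyclic collector runs, for
example via gc.collect().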

This puts libraries like multiprocessing in a tricky position, because some
users are writing CPython, and some are writing Python, and the two groups
have contradictory expectations for how resource management should be
handled, yet somehow we have to make both groups happy.

I don't know what multiprocessing should do here, but I certainly admire
the problem :-).

