[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Python-Dev] PEP 556 threaded garbage collection & linear recursion in gc

[Gregory P. Smith <greg at krypto.org>]
> ...
> A situation came up the other day where I believe this could've helped.
> Scenario (admittedly not one most environments run into): A Python process
> with a C++ extension module implementing a threaded server (threads
> spawned by C++) that could call back into CPython within server request
> handlers.  (ie: how all threaded servers tend to work regardless of core
> loop implementation language)
> Python code in the application had done something (unknown to me, I
> didn't dive into their code) that built up large enough presumably nested
> or recursive data structures that the garbage collector, when triggered,
> would wind up in very deep recursion.  This caused a stack overflow
>  as the C++ spawned threads were only being given a 256k stack (to
> conserve virtual address space - there can potentially be a _ton_ of
> threads in this code).
> That had a C++ stack trace 1000+ levels deep repeating the pattern of
> ...
>     @     0x564d59bd21de         32  func_dealloc
>     @     0x564d59bce0c1         32  cell_dealloc
>     @     0x564d5839db41         48  tupledealloc
>     @     0x564d59bd21de         32  func_dealloc
>     @     0x564d59bce0c1         32  cell_dealloc
>     @     0x564d5839db41         48  tupledealloc
> ...
> If our gc were done on a thread of its own spawned by Python, with a typical
> normal larger default stack size (8mb) this would've been less likely
> to trigger a crash (though obviously still possible if the recursion is linear).

Just noting that gc is probably a red herring in that scenario.  gc
(cleverly!) determines the set of trash objects without needing any
recursive graph crawls.  And gc doesn't deallocate anything - not
directly.  It simply applies Py_DECREF to trash objects, once for each
PyObject* found pointing to them.  _If_ deallocation occurs as a
result, it's the same as if the user had done `del` on the appropriate
object manually.  The recursion then occurs in the chain of
XXX_dealloc calls, which in turn do more Py_DECREFs of their own, and
which have nothing in particular to do with gc.  Approximately the
same stack depth would be hit in a Python built without gc if the user
did the same sequence of `del`s by hand to break trash cycles.

The good news is that the "trashcan" mechanism - which predates gc -
was introduced to limit call depth in the presence of some potentially
unbounded chains of dealloc calls.  So teach trashcan about the
pattern here, and the stack depth problem goes away, with or without

The bad news is that the traschcan mechanism is excruciating, a
long-time source of subtle bugs of its own :-(

Note:  without actual code to run, it's possible that trashcan is
_already_ trying to limit stack depth in this specific case, but the
stack size is too small to accommodate the largest depth of recursive
calls trashcan is willing to allow before interfering.  Or some bug
snuck into trashcan, or its invoking code, that causes it to fail to
work as intended due to some unknown detail of the code above.
There's just no way to guess without code to provoke it.

> Another take away from this is that it appears possible to cause our gc
> to go into linear recursion.

As above, it's probably not gc, but deallocation as a side effect of
Py_DECREF dropping a refcount to 0.  gc is just one way Py_DECREF can
get invoked.