osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[no subject]


Given that any PyTypeObject's base is of type PyType_Type, setting PyVarObject_HEAD_INIT(&PyType_Ready) is superfluous. Therefore, setting all static PyTypeObjects to their ob_type to NULL should be a safe assumption to make.

Uninitialized Types:
A quick s/PyVarObject_HEAD_INIT(&PyType_Type/PyVarObject_HEAD_INIT(NULL/  shows that some objects do need to have their ob_type set from the outset, violating the previous assumption. After writing a quick script, I found that out of the ~300 PyVarObject_HEAD_INIT present in CPython, only these 14 types segfaulted:

PyByteArrayIter_Type
PyBytesIter_Type
PyDictIterKey_Type
PyDictIterValue_Type
PyDictIterItem_Type
PyClassMethod_Type
PyAsyncGen_Type
PyListIter_Type
PyListRevIter_Type
PyODictIter_Type
PyLongRangeIter_Type
PySetIter_Type
PyTupleIter_Type
PyUnicodeIter_Type


Bug:
It turns out that these PyTypeObjects are never initialized through PyType_Ready. However, they are used as fully initialized types. It is by pure chance that the work without having to call the initializer on them. This though is undefined behavior. This not only might result in a weird future bug which is hard to chase down but also, it affects other runtimes as this behavior depends on implementation details of CPython.

This is a pervasive pattern that should be removed from the codebase and ideally extensions should follow as well.

Solution:
Here are my proposed solutions in order from less controversial to most controversial. Note that all of them I already tried in a local branch and are working:

1) Initialize all uninitialized types.

Example: https://github.com/eduardo-elizondo/cpython/commit/bc53db3cf4e5a6923b0b1afa6181305553faf173

2) Move all PyVarObject_HEAD_INIT to NULL except PyType_Type, PyBaseObject_Type and the booleans.

3) Special case the initialization of PyType_Type and PyBaseObject_Type within PyType_Ready to now make all calls to PyVarObject_HEAD_INIT use NULL. To enable this a small change within PyType_Ready is needed to initialize PyType_Type PyBaseObject:
if (base == NULL) {
    Py_TYPE(&PyType_Type) = &PyType_Type;
    Py_TYPE(type) = &PyType_Type;
}

Also, the booleans have to be fully initialized without calling PyVarObject_HEAD_INIT. I propose:

struct _longobject _Py_FalseStruct = {
    PyObject_HEAD_INIT(&PyBool_Type), 0, { 0 }
};

This will clean-up the entire codebase of this anti-pattern.

Example: https://github.com/eduardo-elizondo/cpython/commit/542fd79e4279c64c077c127b175a8d856d3c5f0b

4) Modify PyVarObject_HEAD_INIT to ignore the input and initialize to NULL and 0.
In order to prevent this antipattern within extension code as well, we should make PyVarObject_HEAD_INIT ignore the inputs and just set the value to NULL.

#define PyVarObject_HEAD_INIT(type, size)       \
    { PyObject_HEAD_INIT(NULL) 0 },

This will prevent external code to have a semi-initialized type that is not initialized through PyType_Ready.

5) Finally, I would go even further and suggest making PyVarObject_HEAD_INIT argumentless.

#define PyVarObject_HEAD_INIT       \
    { PyObject_HEAD_INIT(NULL) 0 },

However, this breaks backward compatibility. That being said, this will make extension maintainers aware of this antipattern.

Example: https://github.com/eduardo-elizondo/cpython/commit/3869e53843008ff096764f4adaf26efbb5625996

Thoughts?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20180806/702762a1/attachment.html>