[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Thread-safe way to add a key to a dict only if it isn't already there?

Chris Angelico <rosuav at gmail.com>:

> On Mon, Jul 9, 2018 at 5:18 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>> Chris Angelico <rosuav at gmail.com>:
>>> Are you assuming that Python's semantics are defined by the semantics
>>> of one possible implementation language?
>> What are Python's semantics defined by? I've been using these:
>>    <URL: https://docs.python.org/3/reference/>
>>    <URL: https://docs.python.org/3/library/>
>> Unfortunately, neither spec says anything about the atomicity of
>> dict.setdefault().
>> Therefore, the application programmer must assume it is not atomic. In
>> fact, as brought up in this discussion, the consultation of
>> object.__hash__() and object.__eq__() almost guarantee the
>> *non*-atomicity of dict.setdefault().
> If by "atomic" you mean that absolutely no context switch can occur
> during setdefault, then it probably isn't. But the point of an atomic
> query/update operation is that there are exactly two possibilities:
> 1) The key did not exist in the dictionary. It now does, with the
> provided default, which was returned.
> 2) The key did exist in the dictionary. The provided default is
> ignored, and the previous value is returned.

This is a classic test-and-set race condition:

    # Initially
    assert "A" not in d

    # Thread 1
    d.setdefault("A", 1)

    # Thread 2
    d["A"] = 2

If dict.setdefault() is atomic, the end result in all timings must be:

   d["A"] == 2

However, if dict.setdefault() is not atomic, the end result may also be:

   d["A"] == 1

> Neither object.__hash__ nor object.__eq__ gives any way for this to be
> violated (unless you mess with the concept of "the key did exist in
> the dictionary" by breaking the definition of equality, but that's
> nothing to do with atomicity). Here's the definition of setdefault:
> setdefault(key, default=None, /) method of builtins.dict instance
>     Insert key with a value of default if key is not in the dictionary.
>     Return the value for key if key is in the dictionary, else default.
> Simple semantics. Straight-forward. Now, maybe there's a bug in some
> implementation whereby threads can violate this; but that would be a
> bug to be fixed, and nothing more. You should be able to assume that
> it will behave as stated.

Since atomicity is not guaranteed by the definition of the method, the
application programmer must be prepared for the end result to be 1 (or
even something completely different because the Language Specification
doesn't say anything about the outcome of data races).