Thread-safe way to add a key to a dict only if it isn't already there?

On Sun, Jul 8, 2018 at 12:12 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Sun, 08 Jul 2018 11:15:17 +1000, Chris Angelico wrote:
> [...]
>> Python threads don't switch only between lines of code,
> As I understand it, there could be a switch between any two byte codes,
> or maybe only between certain byte codes. But certainly more fine-grained
> than just between lines of code.
>> so the actual
>> interaction is a bit more complicated than you say. In CPython, the
>> increment operation is:
>>   3           0 LOAD_GLOBAL              0 (i)
>>               2 LOAD_CONST               1 (1)
>>               4 INPLACE_ADD
>>               6 STORE_GLOBAL             0 (i)
>> A context switch could happen between any pair of statements.
> If you actually mean *statements* as opposed to byte codes, then the only
> place there could be a switch would be either before the LOAD_GLOBAL or
> after the STORE_GLOBAL (given that i is a built-in int and cannot have a
> custom __iadd__ method).
> Is that what you mean?

I may be wrong, but I always assume that a context switch could happen
between any two bytecode operations - or, if you're reading the
disassembly, between any two lines *of disassembly*. So there could be
a switch before LOAD_GLOBAL, a switch between that and LOAD_CONST,
another switch before the ADD, another before the STORE, and another
right at the end. Well, there won't be *all* of those, but there could
be any of them.

This might not be entirely correct - there might be pairs that are
functionally atomic - but it's the safe assumption.
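
If you want to see this on your own interpreter, here is a minimal
sketch using the standard dis module. The function name increment() is
just for illustration, and the opcode names are an assumption that
depends on the CPython version (newer releases show BINARY_OP where the
listing above shows INPLACE_ADD). All it shows is that the load and the
store are separate instructions with other instructions between them.

```python
import dis

i = 0

def increment():
    # Illustrative function: the same "i += 1" on a global as above.
    global i
    i += 1

# Collect the opcode names; the exact names vary between CPython versions.
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)

load = ops.index("LOAD_GLOBAL")
store = ops.index("STORE_GLOBAL")
# Every instruction from the load to the store is a separate step.
print(store - load + 1, "instructions from load to store")
```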

>> For instance, if you replace "i
>> += 1" with "i += i", to double the value, you'll get this:
>>   3           0 LOAD_GLOBAL              0 (i)
>>               2 LOAD_GLOBAL              0 (i)
>>               4 INPLACE_ADD
>>               6 STORE_GLOBAL             0 (i)
>> and that could potentially have both of them load the initial value,
>> then one of them runs to completion, and then the other loads the result
>> - so it'll add 1 and 2 and have a result of 3, rather than 2 or 4.
> Some people, when confronted with a problem, say, "I know, I'll use
> threads". Nothhtwo probw ey ave lems.

Right. Now they have to deal with interleaving, but that's all. And
honestly, MOST CODE wouldn't notice interleaving; it's only when you
change (either by rebinding or by mutating) something that can be seen
by multiple threads. Which basically means "mutable globals are a
risk, pretty much everything else is safe".
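
To make the "i += i" case quoted above concrete, here is a hand-written
replay of that one interleaving. It's only a sketch: no real threads are
involved, the dict named shared stands in for the global, the starting
value of 1 is taken from the "add 1 and 2" in the quoted text, and each
bytecode step is written out as an ordinary statement in the order
described.

```python
# Hand-simulated replay of one possible interleaving of two threads
# that each execute "i += i". No real threads are used here.
shared = {"i": 1}                  # stands in for the global i

# Both threads perform their first LOAD_GLOBAL before either stores.
a_first = shared["i"]              # thread A loads 1
b_first = shared["i"]              # thread B loads 1

# Thread A runs to completion.
a_second = shared["i"]             # A's second LOAD_GLOBAL sees 1
shared["i"] = a_first + a_second   # A stores 1 + 1 = 2

# Thread B resumes; its second load sees A's result.
b_second = shared["i"]             # B's second LOAD_GLOBAL sees 2
shared["i"] = b_first + b_second   # B stores 1 + 2

print(shared["i"])  # 3
```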

>> But you're absolutely right that there are only a small handful of
>> plausible results, even with threading involved.
> Indeed. Even though threading is non-deterministic, it isn't *entirely*
> unconstrained.

Yeah. Quite far from it, in fact. Python threading is well-defined and
fairly easy to work with. Only in a handful of operations do you need
to worry about atomicity - like the one that started this thread.
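
For the add-only-if-absent case in the subject line, the conservative
approach is to put the check and the store under one lock. A minimal
sketch, where add_if_absent, registry and the "config" key are all
made-up names for illustration:

```python
import threading

_lock = threading.Lock()
registry = {}

def add_if_absent(key, value):
    """Store value under key unless key exists; return the stored value."""
    # The membership test and the assignment happen under the same lock,
    # so no other thread can slip in between them.
    with _lock:
        if key not in registry:
            registry[key] = value
        return registry[key]

results = [None] * 8

def worker(n):
    # Each thread writes only its own slot of the results list.
    results[n] = add_if_absent("config", n)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Whichever thread got there first wins; every thread sees that value.
print(registry, set(results))
```

dict.setdefault() is often suggested for the same job, but whether it
is atomic for a given key type is a CPython implementation detail, so
the lock is the version that doesn't depend on that.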