[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Thread-safe way to add a key to a dict only if it isn't already there?

On Sun, 08 Jul 2018 10:52:15 +0300, Marko Rauhamaa wrote:

> You are on the right track, Chris, but you are still deducing behavior
> from a particular implementation. For example Java's definition is
> approximately the one I give above:
>    Without correct synchronization, very strange, confusing and
>    counterintuitive behaviors are possible.

Indeed, and that applies to all languages with threading, including 
Python. But notice that it doesn't say:

    Without correct synchronization, impossible things are possible.

>    <URL: https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.htm
>    l#jls-17.4.5>

You seem to have read this as "Java threading can do impossible things", 
but I think the correct reading of the article is "Here are the possible 
things that Java threading can do to mess with your mind". That's a much 
smaller subset of things which can go wrong.

(But still pretty big.)

See, for example Example 17.4-1, which explicitly shows that the Java 
execution model is permitted to reorder statements:

    r1 = A;  B = 1;

may be reordered to 

    B = 1;  r1 = A;

That's harmless in single-threaded code, but risky in multi-threaded 
code. The result is that in the example given, r1 could end up being set 
to 1 instead of the expected values -- but it cannot possibly be set to 
the string "impossible", since that goes against the Java type system.

In Python, there's no possible order of operations of assigning 1 and 
adding 1 that could result in a string. No Python compiler or interpreter 
is permitted to evaluate `1 + 1` or `2 + 1` as the string "impossible", 
even if threads are involved.

(That's not to say that threads might not assign a string to the variable 
we expected to contain 2, but it would have to do so by assignment, not 
by mucking about with the execution order of integer addition.)

In the case of Example 17.4-1, starting with A = B = 0, two threads 

    Thread 1:    r2 = A; B = 1;     
    Thread 2:    r1 = B; A = 2;

Java compilers are permitted to inspect each thread *in isolation*, 
decide that the order of lines is invisible to the caller, and so re-
order them. The result of that is that those four lines can be executed 
in any permutation, giving a total of 24 separate possibilities.

py> from itertools import permutations
py> instructions = ["r2 = A;", "B = 1;", "r1 = B;", "A = 2;"]
py> len(list(permutations(instructions)))

But Python (at least for now) cannot do this. It has a more deterministic 
execution order which guarantees top-to-bottom execution of Python 
statements. That implies that there are only *six* possible execution 
orders of those two threads (which is still enough to add non-determinism 
to your code!) not twenty-four:

    r2 = A;  B = 1;  r1 = B;  A = 2;
    r2 = A;  r1 = B;  B = 1;  A = 2;
    r2 = A;  r1 = B;  A = 2;  B = 1;
    r1 = B;  A = 2;  r2 = A;  B = 1;
    r1 = B;  r2 = A;  A = 2;  B = 1;
    r1 = B;  r2 = A;  A = 2;  B = 1;

and that remains true even if the underlying implementation is written in 
Java, or Malbolge for that matter.

> And we know that Python has been implemented using Java...

That's irrelevant. A legal Python does not inherit the quirks of the 
underlying implementation -- it must still follow the rules of the 
language, or it isn't Python. Java is statically typed, with machine 
ints, Python is dynamically typed, with no machine ints.




Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson