Multithread and locking issue
On 14May2020 22:36, Stephane Tougard <stephane at sdf.org> wrote:
>A multithreaded software written in Python is connected with a Postgres
>database. To avoid concurrent access issue with the database, it starts
>a thread who receive all SQL request via queue.put and queue.get (it
>makes only insert, so no issue with the return of the SQL request).
Just a remark that PostgreSQL lets callers work in parallel, just make
multiple connections. But if you're coming from just one connection, yes
serialising the SQL queries is good.
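For the parallel route, here's a minimal sketch with one connection per thread. sqlite3 stands in for PostgreSQL so the sketch runs anywhere; with psycopg2 you would open a psycopg2.connect() per thread in the same way (connections must not be shared across threads, but separate connections are fine):

```python
import sqlite3
import threading

def worker(results, n):
    # Each thread opens its OWN connection - no sharing, no serialising.
    # (sqlite3 in-memory DBs here; a real app would connect to PostgreSQL.)
    conn = sqlite3.connect(":memory:")
    conn.execute("create table b2_user (id integer, credit integer)")
    conn.execute("insert into b2_user values (?, ?)", (n, 100))
    conn.commit()
    row = conn.execute(
        "select credit from b2_user where id = ?", (n,)
    ).fetchone()
    results[n] = row[0]
    conn.close()

results = {}
threads = [
    threading.Thread(target=worker, args=(results, n)) for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```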
>As long as it runs with 10 threads, no issues. At 100 threads, the
>software is blocked by what I think is a locking issue.
Or a timeout or busy loop. More print statements will help you.
>I guess Python multithreading and queue are working good enough that it
>can handle 100 threads with no issue (give me wrong here),
Yep, that is just fine.
>so I guess
>the problem is in my code.
>The function (thread) who handles SQL requests.
>     print("Start SQL Thread")
>     while True:
>         try:
>             data = q.get(True,5)
>         except:
2 problems here:
1: a "bare except" catches _any_ exception instead of just the one you
want (Empty). Don't do that - catch _only_ what you expect to handle.
2: since you're catching anything you never know why an exception
occurred - at least report it:
    except Exception as e:
        print("No data: %s" % e)
>             print("No data")
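Spelled out as a runnable sketch (the short timeout is kept here purely so the demo finishes; the queue is assumed empty):

```python
import queue

q = queue.Queue()             # empty, so this get() will raise Empty
try:
    data = q.get(True, 0.1)   # short timeout just for the demo
except queue.Empty as e:
    # catch only the exception we expect, and report what happened
    data = None
    print("No data: %r" % e)
```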
The first thing I would do is get rid of the timeout (5) parameter. And
since block=True is the default, also that parameter:
    data = q.get()
This never times out, so there's no need to test for a timeout.
I would also put a print() _immediately_ before _and_ after the q.get()
so that you know it was called and that it completed, and what it got.
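For example (with a value pre-queued so the sketch doesn't block forever):

```python
import queue

q = queue.Queue()
q.put("select 1")            # pre-load so the get() below returns at once

print("calling q.get ...")   # proves we reached the get()
data = q.get()               # blocks until something arrives
print("q.get returned %r" % data)
```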
To indicate no more SQL, send a sentinel such as None, and test for that:
    data = q.get()
    if data is None:
        break
>         print("RECEIVED SQL ORDER")
>         if data == "EXIT":
Ah, here's your sentinel. I prefer None or some other special value
rather than a magic string ("EXIT"). Partly because some other queue you
use might be processing arbitrary strings, so a string won't do.
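Here's a sketch of the whole worker loop with a None sentinel; the execute step is replaced by appending to a list, standing in for the real SQL execution:

```python
import queue
import threading

def execute_sql(q, done):
    while True:
        data = q.get()
        if data is None:     # sentinel: no more SQL is coming
            break
        # the real code would run cursor.execute(data['request'], data['arg'])
        done.append(data)

q = queue.Queue()
done = []
t = threading.Thread(target=execute_sql, args=(q, done))
t.start()
q.put({'request': "insert ...", 'arg': (1,)})
q.put(None)                  # send the sentinel
t.join()                     # worker exits cleanly, no timeout needed
```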
>         request = data['request']
>         arg = data['arg']
Another bare except. Catch specific exceptions and report the exception!
> print("Can not execute SQL request")
>The code to send the SQL request.
> sql = dict()
> sql['request'] = "update b2_user set credit = credit -%s where id = %s"
> sql['arg'] = (i,username,)
Again the bare except. But just drop the timeout and go:
    q.put(sql)
and forget the try/except.
> print("Can not insert data")
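Since a plain queue.Queue() is unbounded, put() can never block or time out, so the bare call needs no try/except at all. A sketch (the argument values are made up for illustration):

```python
import queue

q = queue.Queue()            # unbounded: put() never blocks
sql = dict()
sql['request'] = "update b2_user set credit = credit -%s where id = %s"
sql['arg'] = (5, "alice")    # example values standing in for i, username
q.put(sql)                   # no timeout, no try/except needed
```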
>The launch of the SQL thread (nothing fancy here).
>q = qu.Queue()
>t = th.Thread(target = execute_sql, args = (q,))
That looks superficially good. Take out the timeouts, put in more
print()s, and see what happens.
You also need to send the end-of-SQL sentinel:
    q.put(None)
or:
    q.put("EXIT")
depending on what you decide to use.
My problem with the timeouts is that they make your code far less
predictable. If you get rid of them then your code must complete or
deadlock, there's no fuzzy timeouts-may-occur middle ground. Timeouts
are also difficult to choose correctly (if "correct" is even a
meaningful term here), and it is often better not to try to choose
them at all.
Cameron Simpson <cs at cskk.id.au>