
Race condition with lwip_select (and potential fix): msg#00034

network.lwip.general

Subject: Race condition with lwip_select (and potential fix)

Hi all,

I've been noticing occasional crashes within lwip when using select(). I think I've tracked the problem down to a race condition hidden in the interaction between lwip_select() and event_callback() (both in sockets.c).

The failure case is as follows:

User Thread: Calls lwip_select():
- Creates a select_cb on the stack.
- Creates select_cb.sem.
- Takes selectsem.
- Inserts the select_cb into select_cb_list.
- Releases selectsem.
- Waits for select_cb.sem to be signalled (or timeout to expire).

LwIP Thread: Data arrives, so calls event_callback():
- Takes selectsem.
- Examines select_cb_list and stores a pointer to the matching select_cb in scb.
- Releases selectsem.

User Thread: Timeout expires, so resumes execution of lwip_select():
- Takes selectsem.
- Removes the select_cb from select_cb_list.
- Releases selectsem.
- Frees select_cb.sem.

LwIP Thread: Resumes execution of event_callback():
- Tries to signal scb->sem, but select_cb and/or sem no longer exist.
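
The interleaving above can be replayed deterministically in a single thread. What follows is a minimal model under stated assumptions, not lwIP code: the model_* types and the user_* and callback_* helpers are hypothetical names, the semaphore is reduced to a valid flag plus a counter, and selectsem is omitted because the steps already run one after another. The select_cb is kept in scope for the whole replay so that the stale pointer can be inspected safely, which the real code cannot do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lwIP structures; not the real definitions. */
struct model_sem {
    bool valid;   /* false once the semaphore has been freed */
    int  count;   /* number of signals delivered */
};

struct model_select_cb {
    struct model_select_cb *next;
    struct model_sem sem;
};

static struct model_select_cb *model_list = NULL;

/* User thread, first part: create the semaphore and insert into the list. */
static void user_register(struct model_select_cb *cb)
{
    assert(cb != NULL);
    cb->sem.valid = true;
    cb->sem.count = 0;
    cb->next = model_list;
    model_list = cb;
}

/* LwIP thread, first part: look up a waiter "under selectsem", then let go. */
static struct model_select_cb *callback_lookup(void)
{
    return model_list;
}

/* User thread, second part: timeout expired, unlink and free the semaphore. */
static void user_timeout(struct model_select_cb *cb)
{
    struct model_select_cb **pp;

    for (pp = &model_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == cb) {
            *pp = cb->next;
            break;
        }
    }
    cb->sem.valid = false;
}

/* LwIP thread, second part: returns true only if the signal was safe. */
static bool callback_signal(struct model_select_cb *scb)
{
    if (scb == NULL || !scb->sem.valid) {
        return false;   /* in the real code this is a use-after-free */
    }
    scb->sem.count++;
    return true;
}

/* Replays the four steps in the order described above. */
static bool replay_failure_case(void)
{
    struct model_select_cb cb;
    struct model_select_cb *scb;

    user_register(&cb);
    scb = callback_lookup();
    user_timeout(&cb);
    return callback_signal(scb);
}
```

In this model replay_failure_case() reports that the final signal targets a semaphore that has already been freed, which corresponds to the crash.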

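Based on the above analysis, the fix would appear to be to swap the order in which event_callback() signals the two semaphores: signal scb->sem first, while still holding selectsem, and only then release selectsem. That way lwip_select() cannot remove the select_cb from the list and free its semaphore until the signal has been delivered. I've tried this and so far it appears to work. I'd appreciate any thoughts from other LwIP users though!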
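
The swapped ordering can be sketched with POSIX primitives standing in for the lwIP sys_arch layer. Everything named *_sketch here is an assumption made for illustration: a pthread mutex plays the role of selectsem, a sem_t plays the role of select_cb.sem, and the callback signals every registered waiter instead of checking which sockets each one is interested in. It is not the actual sockets.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lwip_select_cb. */
struct select_cb_sketch {
    struct select_cb_sketch *next;
    sem_t sem;
};

static pthread_mutex_t selectsem_sketch = PTHREAD_MUTEX_INITIALIZER;
static struct select_cb_sketch *select_cb_list_sketch = NULL;

/* User thread: create the semaphore, then insert under the lock. */
static void sketch_register(struct select_cb_sketch *cb)
{
    assert(cb != NULL);
    sem_init(&cb->sem, 0, 0);
    pthread_mutex_lock(&selectsem_sketch);
    cb->next = select_cb_list_sketch;
    select_cb_list_sketch = cb;
    pthread_mutex_unlock(&selectsem_sketch);
}

/* User thread: unlink under the lock, and only then free the semaphore.
 * Taking the lock here waits out any callback that is mid-signal. */
static void sketch_unregister(struct select_cb_sketch *cb)
{
    struct select_cb_sketch **pp;

    assert(cb != NULL);
    pthread_mutex_lock(&selectsem_sketch);
    for (pp = &select_cb_list_sketch; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == cb) {
            *pp = cb->next;
            break;
        }
    }
    pthread_mutex_unlock(&selectsem_sketch);
    sem_destroy(&cb->sem);
}

/* LwIP thread: signal each waiter BEFORE releasing the lock, so no waiter
 * can be unlinked and freed between the lookup and the signal.
 * Returns the number of waiters signalled. */
static int sketch_event_callback(void)
{
    struct select_cb_sketch *scb;
    int signalled = 0;

    pthread_mutex_lock(&selectsem_sketch);
    for (scb = select_cb_list_sketch; scb != NULL; scb = scb->next) {
        sem_post(&scb->sem);
        signalled++;
    }
    pthread_mutex_unlock(&selectsem_sketch);
    return signalled;
}
```

The cost of this ordering is that the signal is issued with the lock held, so a waiter that wakes immediately may briefly block on the lock before it can remove itself from the list.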

Regards,

Chris.

