[NET]: Fix race between poll_napi() and net_rx_action()
netpoll_poll_lock() synchronizes the ->poll() invocation
code paths, but once we have the lock we have to make
sure that NAPI_STATE_SCHED is still set. Otherwise we
get:
cpu 0 cpu 1
net_rx_action() poll_napi()
netpoll_poll_lock() ... spin on ->poll_lock
->poll()
netif_rx_complete
netpoll_poll_unlock() acquire ->poll_lock()
->poll()
netif_rx_complete()
CRASH
Based upon a bug report from Tina Yang.
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/core/dev.c b/net/core/dev.c
index 853c8b5..02e7d83 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2172,7 +2172,15 @@
weight = n->weight;
- work = n->poll(n, weight);
+ /* This NAPI_STATE_SCHED test is for avoiding a race
+ * with netpoll's poll_napi(). Only the entity which
+ * obtains the lock and sees NAPI_STATE_SCHED set will
+ * actually make the ->poll() call. Therefore we avoid
+ * accidently calling ->poll() when NAPI is not scheduled.
+ */
+ work = 0;
+ if (test_bit(NAPI_STATE_SCHED, &n->state))
+ work = n->poll(n, weight);
WARN_ON_ONCE(work > weight);