mac80211: fix race between the AGG SM and the Tx data path
When a packet is supposed to sent be as an a-MPDU, mac80211 sets
IEEE80211_TX_CTL_AMPDU to let the driver know. On the other
hand, mac80211 configures the driver for aggregration with the
ampdu_action callback.
There is race between these two mechanisms since the following
scenario can occur when the BA agreement is torn down:
Tx softIRQ drv configuration
========== =================
check OPERATIONAL bit
Set the TX_CTL_AMPDU bit in the packet
clear OPERATIONAL bit
stop Tx AGG
Pass Tx packet to the driver.
In that case the driver would get a packet with TX_CTL_AMPDU set
although it has already been notified that the BA session has been
torn down.
To fix this, we need to synchronize all the Qdisc activity after we
cleared the OPERATIONAL bit. After that step, all the following
packets will be buffered until the driver reports it is ready to get
new packets for this RA / TID. This buffering allows not to run into
another race that would send packets with TX_CTL_AMPDU unset while
the driver hasn't been requested to tear down the BA session yet.
This race occurs in practice and iwlwifi complains with a WARN_ON
when it happens.
Cc: stable@kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c
index eea6e5c..331472c 100644
--- a/net/mac80211/agg-tx.c
+++ b/net/mac80211/agg-tx.c
@@ -192,6 +192,20 @@
*/
clear_bit(HT_AGG_STATE_OPERATIONAL, &tid_tx->state);
+ /*
+ * There might be a few packets being processed right now (on
+ * another CPU) that have already gotten past the aggregation
+ * check when it was still OPERATIONAL and consequently have
+ * IEEE80211_TX_CTL_AMPDU set. In that case, this code might
+ * call into the driver at the same time or even before the
+ * TX paths calls into it, which could confuse the driver.
+ *
+ * Wait for all currently running TX paths to finish before
+ * telling the driver. New packets will not go through since
+ * the aggregation session is no longer OPERATIONAL.
+ */
+ synchronize_net();
+
tid_tx->stop_initiator = initiator;
tid_tx->tx_stop = tx;