IB/qib: Modify software pma counters to use percpu variables

The counters, unicast_xmit, unicast_rcv, multicast_xmit, multicast_rcv
are now maintained as percpu variables.

The mad code is modified to add a z_ latch so that the percpu counters
monotonically increase with appropriate adjustments in the reset,
read logic to maintain the z_ latch.

This patch also corrects the fact the unitcast_xmit wasn't handled
at all for UC and RC QPs.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 092b0bb..1b00734 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -662,7 +662,7 @@
 		mcast = qib_mcast_find(ibp, &hdr->u.l.grh.dgid);
 		if (mcast == NULL)
 			goto drop;
-		ibp->n_multicast_rcv++;
+		this_cpu_inc(ibp->pmastats->n_multicast_rcv);
 		list_for_each_entry_rcu(p, &mcast->qp_list, list)
 			qib_qp_rcv(rcd, hdr, 1, data, tlen, p->qp);
 		/*
@@ -689,7 +689,7 @@
 			rcd->lookaside_qpn = qp_num;
 		} else
 			qp = rcd->lookaside_qp;
-		ibp->n_unicast_rcv++;
+		this_cpu_inc(ibp->pmastats->n_unicast_rcv);
 		qib_qp_rcv(rcd, hdr, lnh == QIB_LRH_GRH, data, tlen, qp);
 	}
 	return;