perf: Store active software events in a hashlist
Each time a software event triggers, we need to walk through
the entire list of events from the current cpu and task contexts
to retrieve a running perf event that matches.
We also need to check a matching perf event is actually counting.
This walk is wasteful and makes the event fast path scaling
down with a growing number of events running on the same
contexts.
To solve this, we store the running perf events in a hashlist to
get an immediate access to them against their type:event_id when
they trigger.
v2: - Fix SWEVENT_HLIST_SIZE definition (and re-learn some basic
maths along the way)
- Only allocate hlist for online cpus, but keep track of the
refcount on offline possible cpus too, so that we allocate it
if needed when it becomes online.
- Drop the kref use as it's not adapted to our tricks anymore.
v3: - Fix bad refcount check (address instead of value). Thanks to
Eric Dumazet who spotted this.
- While exiting cpu, move the hlist release out of the IPI path
to lock the hlist mutex sanely.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6e96cc8..bf896d0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -589,6 +589,14 @@
PERF_GROUP_SOFTWARE = 0x1,
};
+#define SWEVENT_HLIST_BITS 8
+#define SWEVENT_HLIST_SIZE (1 << SWEVENT_HLIST_BITS)
+
+struct swevent_hlist {
+ struct hlist_head heads[SWEVENT_HLIST_SIZE];
+ struct rcu_head rcu_head;
+};
+
/**
* struct perf_event - performance event kernel representation:
*/
@@ -597,6 +605,7 @@
struct list_head group_entry;
struct list_head event_entry;
struct list_head sibling_list;
+ struct hlist_node hlist_entry;
int nr_siblings;
int group_flags;
struct perf_event *group_leader;
@@ -744,6 +753,9 @@
int active_oncpu;
int max_pertask;
int exclusive;
+ struct swevent_hlist *swevent_hlist;
+ struct mutex hlist_mutex;
+ int hlist_refcount;
/*
* Recursion avoidance: