fs: icache RCU free inodes
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index b2df490..8c5c0ed 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -351,9 +351,16 @@
return &ip->ip_vfs_inode;
}
+static void dlmfs_i_callback(struct rcu_head *head)
+{
+ struct inode *inode = container_of(head, struct inode, i_rcu);
+ INIT_LIST_HEAD(&inode->i_dentry);
+ kmem_cache_free(dlmfs_inode_cache, DLMFS_I(inode));
+}
+
static void dlmfs_destroy_inode(struct inode *inode)
{
- kmem_cache_free(dlmfs_inode_cache, DLMFS_I(inode));
+ call_rcu(&inode->i_rcu, dlmfs_i_callback);
}
static void dlmfs_evict_inode(struct inode *inode)
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index cfeab7c..17ff46f 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -569,9 +569,16 @@
return &oi->vfs_inode;
}
+static void ocfs2_i_callback(struct rcu_head *head)
+{
+ struct inode *inode = container_of(head, struct inode, i_rcu);
+ INIT_LIST_HEAD(&inode->i_dentry);
+ kmem_cache_free(ocfs2_inode_cachep, OCFS2_I(inode));
+}
+
static void ocfs2_destroy_inode(struct inode *inode)
{
- kmem_cache_free(ocfs2_inode_cachep, OCFS2_I(inode));
+ call_rcu(&inode->i_rcu, ocfs2_i_callback);
}
static unsigned long long ocfs2_max_file_offset(unsigned int bbits,