d7263704edf4ee2783b116f64f407752d5b2f2bf - LeafOS-Devices/android_kernel_samsung_gta4xl

commit	d7263704edf4ee2783b116f64f407752d5b2f2bf	[log] [tgz]
author	Yu Kuai <yukuai3@huawei.com>	Fri Mar 22 16:10:05 2024 +0800
committer	Vegard Nossum <vegard.nossum@oracle.com>	Mon Jul 15 17:44:33 2024 +0000
tree	dffa6c8a94921a7a4be1cbfcedf727ae6c50cb85
parent	aa9c43942fc69f5e652d6b4f68e0e2bf75868c7e [diff]

md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING commit 151f66bb618d1fd0eeb84acb61b4a9fa5d8bb0fa upstream. Xiao reported that lvm2 test lvconvert-raid-takeover.sh can hang with small possibility, the root cause is exactly the same as commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"") However, Dan reported another hang after that, and junxiao investigated the problem and found out that this is caused by plugged bio can't issue from raid5d(). Current implementation in raid5d() has a weird dependence: 1) md_check_recovery() from raid5d() must hold 'reconfig_mutex' to clear MD_SB_CHANGE_PENDING; 2) raid5d() handles IO in a deadloop, until all IO are issued; 3) IO from raid5d() must wait for MD_SB_CHANGE_PENDING to be cleared; This behaviour is introduce before v2.6, and for consequence, if other context hold 'reconfig_mutex', and md_check_recovery() can't update super_block, then raid5d() will waste one cpu 100% by the deadloop, until 'reconfig_mutex' is released. Refer to the implementation from raid1 and raid10, fix this problem by skipping issue IO if MD_SB_CHANGE_PENDING is still set after md_check_recovery(), daemon thread will be woken up when 'reconfig_mutex' is released. Meanwhile, the hang problem will be fixed as well. Fixes: 5e2cf333b7bd ("md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d") Cc: stable@vger.kernel.org # v5.19+ Reported-and-tested-by: Dan Moulding <dan@danm.net> Closes: https://lore.kernel.org/all/20240123005700.9302-1-dan@danm.net/ Investigated-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20240322081005.1112401-1-yukuai1@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b32aa95843cac6b12c2c014d40fca18aef24a347) Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>