tree cdc0703f446cbc14692577ce01e8df8bb23076e4
parent 00b03d3d6d53897fa55a18a51e6cb20800e83fe5
author Peter Xu <peterx@redhat.com> 1557793001 -0700
committer Tim Zimmermann <tim@linux4.de> 1721966658 +0200
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuwYxAyE4TRnuhLgPbcIaY/gZxe8FAmajIEIACgkQbcIaY/gZ
 xe/qYA/+OTpqOCtNyvmq7ses0MKCRRLwyK2nxac/5qrXI8gsRdJ5p6c5Zz3VEYhD
 l5Qz9WqJTup+zhN6muGhXkM7jCK3R5yDJIsi0zDk9bnGxlTYT7RzXTr4+XiBwkmi
 NLQQWNvBDLrofGA+Jh1IOROiO0Ipmi16Vl7zwimGzTK11qcykXz0jAdQ7La9Qo5v
 gKuwM13QRdgCWBy2kX+QfGMXwOc79KQ5HmFCDcEnNvJER9XThU0Bu/i2AFnc9rot
 VV0KpQPJPhtWa0MoJyjbVr7FtN7aM87Ud9qhdjsdMEsVTJuSy844uOn6QWCxmEdA
 P79iJSpyEPk/XH0HhS9nPyZ+xlhOG4EcB2LFN2nyZpN9vzcecfkFFp9jcdPpBHXy
 hgznmUfAUvNm7FrolFTw/OqbB5AWRulpfvivjOXNCqIpiep4U+OxCHJfG2khNTOr
 knC9CphI5riktYo4R9MWdvb0RcMN6mFm4ZUo5PuI2zhTHEng31BUC0ZRr8jjlBzP
 Bm7DK8l1RFC+Z1GfXafvBNOa8ddTs4mVickjmdX5OYSXx534ftUN2M8t61ryNp/A
 rM+xncuDfHBctUPdkQbnP6LGTcN4rEe3AJTHFoNTSsTTBsFh8BfccUbDY4OwJs0+
 fN2t9SumjMuO/Jieo+UMuV+9xJYr29S0cnn94RwEQK7K1ykBxuo=
 =PX13
 -----END PGP SIGNATURE-----

BACKPORT: userfaultfd/sysctl: add vm.unprivileged_userfaultfd

Userfaultfd can be misused to make it easier to exploit existing
use-after-free (and similar) bugs that might otherwise only make a
short window or race condition available.  By using userfaultfd to
stall a kernel thread, a malicious program can keep some state that it
wrote stable for an extended period, which it can then access using an
existing exploit.  While it doesn't cause the exploit itself, and while
it's not the only thing that can stall a kernel thread when accessing a
memory location, it's one of the few that never needs privilege.

We can add a flag that allows userfaultfd to be restricted, so that in
general it won't be usable by arbitrary user programs, but in
environments that require userfaultfd it can be turned back on.

Add a global sysctl knob "vm.unprivileged_userfaultfd" to control
whether unprivileged users are allowed to use userfaultfd.  When this
is set to zero, only privileged users (root user, or users with the
CAP_SYS_PTRACE capability) will be able to use the userfaultfd
syscall.
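
As a rough userspace illustration of the policy described above (not the
kernel code from this patch), the sketch below reads the knob through
procfs and models the resulting permission decision.  The helper names
read_int_sysctl() and userfaultfd_permitted() are hypothetical and exist
only for this example; the procfs path assumes the usual mapping of
"vm.unprivileged_userfaultfd" under /proc/sys.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed procfs location of the knob on kernels that carry this patch. */
#define UFFD_SYSCTL_PATH "/proc/sys/vm/unprivileged_userfaultfd"

/*
 * Hypothetical helper: read an integer sysctl from `path`.
 * Returns the value, or -1 if the file is missing or cannot be parsed
 * (for example on a kernel without this patch).
 */
static int read_int_sysctl(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/*
 * Hypothetical model of the policy: a nonzero knob lets any user
 * call userfaultfd; a zero knob requires CAP_SYS_PTRACE.  A missing
 * knob (-1) is treated as unrestricted, matching kernels that predate
 * the sysctl.
 */
static bool userfaultfd_permitted(int knob, bool has_cap_sys_ptrace)
{
	return knob != 0 || has_cap_sys_ptrace;
}
```

A caller could combine the two as
userfaultfd_permitted(read_int_sysctl(UFFD_SYSCTL_PATH), false) to
estimate whether an unprivileged process is likely to get EPERM; the
authoritative answer is still the return value of the syscall itself.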

Andrea said:

: The only difference between the bpf sysctl and the userfaultfd sysctl
: this way is that the bpf sysctl adds the CAP_SYS_ADMIN capability
: requirement, while userfaultfd adds the CAP_SYS_PTRACE requirement,
: because the userfaultfd monitor is more likely to need CAP_SYS_PTRACE
: already if it's doing other kind of tracking on processes runtime, in
: addition of userfaultfd.  In other words both syscalls works only for
: root, when the two sysctl are opt-in set to 1.

[dgilbert@redhat.com: changelog additions]
[akpm@linux-foundation.org: documentation tweak, per Mike]
Link: http://lkml.kernel.org/r/20190319030722.12441-2-peterx@redhat.com
Change-Id: Ied2500a773b06ac1fdc378e61fd5403a270114a6
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
