This presentation starts with an overview of recent Linux CVEs that involve the kernel stack, and explains why they cannot be exploited without weakening certain Linux security controls. For instance, overwriting the return address is unlikely to succeed without also corrupting the stack canary, while exploitation through syscalls must succeed in one shot, because the stack offset shifts on the next system call. I will then describe what a perfect vulnerability would look like: it should work on the first attempt and ideally give you both a KASLR leak and the virtual address of a controlled structure.
nftables bugs involving structures such as NFT registers offer just that: a read primitive can be used to calculate the address of those structures along with other static kernel addresses. Because nftables hooks run in softirq mode, a fresh, static per-CPU stack is used on entering soft interrupt mode, which defeats the purpose of per-system-call stack randomization. After corrupting some memory and gaining RIP control, I will show how to pivot the stack to our controlled NFT registers, which bypasses SMEP/SMAP because no user-space payload is ever involved. As only 60 bytes are left in the registers, a technique to gain more space for the ROP chain is necessary; it involves duplicating the initial registers and changing their values in a controlled way.
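To make the address arithmetic behind such a leak concrete, here is a minimal sketch. Every constant in it is a hypothetical placeholder rather than a value from a real kernel build, and the 2 MiB slide alignment is an assumption about a typical x86-64 configuration. It only shows how one leaked pointer rebases other static addresses, and how few 8-byte slots fit into 60 bytes.

```python
# Sketch only: all addresses below are hypothetical placeholders.
MASK64 = 0xFFFFFFFFFFFFFFFF
KASLR_ALIGN = 0x200000   # assumed 2 MiB alignment of the KASLR slide on x86-64
SLOT_SIZE = 8            # one 64-bit address per ROP chain entry


def kaslr_slide(leaked_addr: int, link_time_addr: int) -> int:
    """Return the KASLR slide implied by one leaked pointer to a known symbol."""
    slide = leaked_addr - link_time_addr
    if slide < 0 or slide % KASLR_ALIGN != 0:
        raise ValueError("leak does not match the assumed symbol or alignment")
    return slide


def rebase(link_time_addr: int, slide: int) -> int:
    """Translate a link-time (non-randomized) address to its runtime address."""
    return (link_time_addr + slide) & MASK64


def usable_slots(free_bytes: int) -> int:
    """Number of whole 64-bit entries that fit into the remaining space."""
    return free_bytes // SLOT_SIZE


# Hypothetical example values.
link_time_symbol = 0xFFFFFFFF822C5E40   # symbol address in the unrandomized image
leaked_pointer = 0xFFFFFFFF9A2C5E40     # the same symbol as seen at runtime
link_time_text = 0xFFFFFFFF81000000     # assumed unrandomized kernel text base

slide = kaslr_slide(leaked_pointer, link_time_symbol)
runtime_text = rebase(link_time_text, slide)
slots = usable_slots(60)
```

With 60 bytes left, only seven full 64-bit entries fit, which is why extra room for the chain is needed.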
The last attack stage in my presentation walks through the softirq exit routine, which must be completed to avoid stack guard checks and other kernel panics.
I will share some methods kernel developers could use to mitigate these kinds of attacks. I start with the naive approach of enabling CONFIG_STATIC_USERMODEHELPER by default, then make a case for more sophisticated techniques such as pointer authentication (similar to PAC on ARMv8.3), which would prevent jumping to arbitrary instruction pointers. Finally, I will show my implementation of per-softirq kernel stack randomization.
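The idea of randomizing the stack on each softirq entry can be illustrated with a small user-space simulation. This is not the implementation presented in the talk: the function names, the 10-bit offset mask, and the 16-byte alignment are all assumptions chosen for illustration.

```python
import secrets

# Assumed parameters, chosen for illustration only.
OFFSET_MASK = 0x3FF   # keep 10 bits of randomness
STACK_ALIGN = 16      # keep the stack pointer 16-byte aligned


def randomized_stack_top(base_top: int, rand_bits: int) -> int:
    """Lower the starting stack pointer by a bounded, aligned random offset."""
    offset = (rand_bits & OFFSET_MASK) & ~(STACK_ALIGN - 1)
    return base_top - offset


def enter_softirq(base_top: int) -> int:
    """Simulate one softirq entry: draw fresh randomness each time."""
    return randomized_stack_top(base_top, secrets.randbits(32))


# Hypothetical top of a per-CPU softirq stack.
BASE_TOP = 0x10000000
tops = [enter_softirq(BASE_TOP) for _ in range(1000)]
```

Because the offset changes on every entry, an address leaked during one softirq run no longer predicts where the frames of the next run will sit.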