Accidentally uncovering a seven-year-old vulnerability in the Linux kernel


Vulnerability research is at the core of Allele Security Intelligence. We have been actively researching for more than a decade, and we offer our expertise to our clients. Among the services we offer are 0day and nday vulnerability research.

In nday vulnerability research projects on the Linux kernel, we look for vulnerabilities that were patched upstream but still affect major distributions, even in their latest release. We usually find vulnerabilities patched over a year ago that still affect popular Linux distributions. We do that by auditing the Linux kernel source code, monitoring vulnerabilities submitted to mailing lists and patched upstream, checking the findings of the syzkaller fuzzer, and by other means.

While doing that research, we accidentally discovered a vulnerability in the core of the TCP subsystem of the Linux kernel. It had been introduced seven years earlier. We reported it upstream, and it was patched in May of last year. In this blog post, we’ll share how it happened and briefly analyze the vulnerability.

KCM networking protocol race condition on sk_receive_queue

The following vulnerability caught our attention when checking the dashboard of the syzkaller fuzzer. The commit that fixed it is also shown below.

general protection fault in skb_unlink
https://syzkaller.appspot.com/bug?extid=278279efdd2730dd14bf

kcm: close race conditions on sk_receive_queue
https://github.com/torvalds/linux/commit/5121197ecc5db58c07da95eb1ff82b98b121a221

We checked the source code of the latest release of one of the Red Hat Enterprise Linux derivatives, and it was still vulnerable. The kernels of the Red Hat Enterprise Linux derivatives are similar to the Red Hat Enterprise Linux kernel, so a vulnerability in one usually ends up affecting all of them. There were some recent changes in the Red Hat Enterprise Linux ecosystem, and we haven’t checked them since then, but this was true for a while. Fortunately, the bug had a reproducer available, and we let it run on our systems. After a while, the following splat was generated. The full splat can be found here.

ShellSession
[111607.195289] ------------[ cut here ]------------
[111607.195305] refcount_t: addition on 0; use-after-free.
...
[111607.197802] CPU: 2 PID: 3130808 Comm: poc6 Kdump: loaded Not tainted 5.14.0-362.24.2.el9_3.x86_64 #1
...
[111607.201030] Call Trace:
[111607.201182] 
[111607.201350] ? show_trace_log_lvl+0x1c4/0x2df 
[111607.201623] ? show_trace_log_lvl+0x1c4/0x2df 
[111607.201800] ? tcp_twsk_unique+0x183/0x190 
[111607.201993] ? refcount_warn_saturate+0x74/0x110 
[111607.202164] ? __warn+0x81/0x110 
[111607.202442] ? refcount_warn_saturate+0x74/0x110 
[111607.202692] ? report_bug+0x10a/0x140 
[111607.202931] ? handle_bug+0x3c/0x70 
[111607.203212] ? exc_invalid_op+0x14/0x70 
[111607.203425] ? asm_exc_invalid_op+0x16/0x20 
[111607.203679] ? refcount_warn_saturate+0x74/0x110 
[111607.203924] tcp_twsk_unique+0x183/0x190 
[111607.204091] __inet_check_established+0x158/0x2c0 
[111607.204335] __inet_hash_connect+0xb7/0x540 
[111607.204590] ? __pfx___inet_check_established+0x10/0x10 
[111607.206806] tcp_v4_connect+0x24e/0x520 
[111607.207707] __inet_stream_connect+0xcb/0x3b0 
[111607.208583] ? release_sock+0x40/0x90 
[111607.209469] ? selinux_netlbl_socket_connect+0x2b/0x40 
[111607.210342] inet_stream_connect+0x37/0x60 
[111607.211177] __sys_connect+0xa3/0xd0 
[111607.211985] __x64_sys_connect+0x14/0x20 
...
[111607.223742] ------------[ cut here ]------------
[111607.223743] refcount_t: underflow; use-after-free.
...
[111607.228620] CPU: 2 PID: 3130808 Comm: poc6 Kdump: loaded Tainted: G W ------- --- 5.14.0-362.24.2.el9_3.x86_64 #1
...
[111607.239689] Call Trace:
[111607.240472] 
[111607.241313] ? show_trace_log_lvl+0x1c4/0x2df 
[111607.242156] ? show_trace_log_lvl+0x1c4/0x2df 
[111607.242989] ? __inet_check_established+0x23a/0x2c0 
[111607.243838] ? refcount_warn_saturate+0xba/0x110 
[111607.244669] ? __warn+0x81/0x110 
[111607.245484] ? refcount_warn_saturate+0xba/0x110 
[111607.246331] ? report_bug+0x10a/0x140 
[111607.247155] ? handle_bug+0x3c/0x70 
[111607.248005] ? exc_invalid_op+0x14/0x70 
[111607.248834] ? asm_exc_invalid_op+0x16/0x20 
[111607.249629] ? refcount_warn_saturate+0xba/0x110 
[111607.250435] __inet_check_established+0x23a/0x2c0 
[111607.251279] __inet_hash_connect+0xb7/0x540 
[111607.252135] ? __pfx___inet_check_established+0x10/0x10 
[111607.252962] tcp_v4_connect+0x24e/0x520 
[111607.253786] __inet_stream_connect+0xcb/0x3b0 
[111607.254601] ? release_sock+0x40/0x90 
[111607.255361] ? selinux_netlbl_socket_connect+0x2b/0x40 
[111607.256105] inet_stream_connect+0x37/0x60 
[111607.256913] __sys_connect+0xa3/0xd0 
[111607.257709] __x64_sys_connect+0x14/0x20 
...
[116082.336931] ------------[ cut here ]------------
[116082.336934] refcount_t: decrement hit 0; leaking memory.
...
[116082.342044] CPU: 1 PID: 3866568 Comm: poc5 Kdump: loaded Tainted: G W ------- --- 5.14.0-362.24.2.el9_3.x86_64 #1
...
[116082.352815] Call Trace:
[116082.353638] 
[116082.354410] ? show_trace_log_lvl+0x1c4/0x2df 
[116082.355236] ? show_trace_log_lvl+0x1c4/0x2df 
[116082.356055] ? __inet_check_established+0x29c/0x2c0 
[116082.356874] ? refcount_warn_saturate+0xfb/0x110 
[116082.357692] ? __warn+0x81/0x110 
[116082.358505] ? refcount_warn_saturate+0xfb/0x110 
[116082.359302] ? report_bug+0x10a/0x140 
[116082.360124] ? handle_bug+0x3c/0x70 
[116082.360905] ? exc_invalid_op+0x14/0x70 
[116082.361697] ? asm_exc_invalid_op+0x16/0x20 
[116082.362491] ? refcount_warn_saturate+0xfb/0x110 
[116082.363300] __inet_check_established+0x29c/0x2c0 
[116082.364098] __inet_hash_connect+0xb7/0x540 
[116082.364893] ? __pfx___inet_check_established+0x10/0x10 
[116082.365689] tcp_v4_connect+0x24e/0x520 
[116082.366454] ? pgtable_trans_huge_deposit+0x88/0x110 
[116082.367247] __inet_stream_connect+0xcb/0x3b0 
[116082.368000] ? release_sock+0x40/0x90 
[116082.368746] ? selinux_netlbl_socket_connect+0x2b/0x40 
[116082.369488] inet_stream_connect+0x37/0x60 
[116082.370197] __sys_connect+0xa3/0xd0 
[116082.370942] __x64_sys_connect+0x14/0x20 
...
[116082.381500] ---[ end trace f02e72c43eeca11a ]---

We then started to simplify the syzkaller reproducer to get cleaner code. We reduced it step by step, identifying the pieces that were needed and removing the ones that were not, until we reached a simplified version. The splat we had obtained already made us suspicious, and during this process we confirmed that our trigger had nothing to do with the networking protocol named in the syzkaller splat. The call trace in the syzkaller splat contains the KCM protocol, but our simplified trigger, with everything related to the KCM protocol removed, still triggered a splat.

Syzkaller splat:

ShellSession
Call Trace:

 kcm_recvmsg+0x462/0x560 net/kcm/kcmsock.c:1161
 sock_recvmsg_nosec+0x89/0xb0 net/socket.c:871
 ___sys_recvmsg+0x271/0x5c0 net/socket.c:2480
 do_recvmmsg+0x27e/0x7a0 net/socket.c:2601
 __sys_recvmmsg+0x259/0x270 net/socket.c:2680
 __do_sys_recvmmsg net/socket.c:2703 [inline]
 __se_sys_recvmmsg net/socket.c:2696 [inline]
 __x64_sys_recvmmsg+0xe6/0x140 net/socket.c:2696
 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The splat we triggered by running the syzkaller reproducer:

ShellSession
Call Trace:

show_trace_log_lvl+0x1c4/0x2df 
show_trace_log_lvl+0x1c4/0x2df 
tcp_twsk_unique+0x183/0x190 
refcount_warn_saturate+0x74/0x110 
__warn+0x81/0x110 
refcount_warn_saturate+0x74/0x110 
report_bug+0x10a/0x140 
handle_bug+0x3c/0x70 
exc_invalid_op+0x14/0x70 
asm_exc_invalid_op+0x16/0x20 
refcount_warn_saturate+0x74/0x110 
tcp_twsk_unique+0x183/0x190 
__inet_check_established+0x158/0x2c0 
__inet_hash_connect+0xb7/0x540 
__pfx___inet_check_established+0x10/0x10 
tcp_v4_connect+0x24e/0x520 
__inet_stream_connect+0xcb/0x3b0 
release_sock+0x40/0x90 
selinux_netlbl_socket_connect+0x2b/0x40 
inet_stream_connect+0x37/0x60 
__sys_connect+0xa3/0xd0 
__x64_sys_connect+0x14/0x20 
do_syscall_64+0x59/0x90 
handle_mm_fault+0xc5/0x2a0 
do_user_addr_fault+0x1d6/0x6a0 
exc_page_fault+0x62/0x150 
entry_SYSCALL_64_after_hwframe+0x72/0xdc 

As shown above, the bug we triggered has nothing to do with the KCM protocol. The syzkaller splat was generated by KASAN, whereas ours was a warning raised by the protection mechanisms implemented in the reference counter API of the Linux kernel.

We confirmed the bug we were triggering was different after checking that the module for the KCM protocol wasn’t even available in the system. We had not checked this beforehand because, to reduce the time needed to analyze issues, we run proofs of concept or reproducers on several distributions simultaneously before inspecting each system, and we only start analyzing when they trigger something.

ShellSession
$ cat /boot/config-5.14.0-362.24.2.el9_3.x86_64 | grep CONFIG_AF_KCM
# CONFIG_AF_KCM is not set
$

Use-after-free in the TCP subsystem of the Linux kernel due to a race condition between tcp_twsk_unique() and inet_twsk_hashdance() (CVE-2024-36904)

Having confirmed it was a different bug, we analyzed it to understand it better. The problem happens because the object, a time-wait TCP socket, has its reference counter initialized only after it has been inserted into a hash table and the lock has been released. If a lookup is performed before the reference counter is initialized, the object is found with a zeroed reference counter, and the warning is triggered. The commit that introduced the problem:

tcp/dccp: avoid one atomic operation for timewait hashdance
https://github.com/torvalds/linux/commit/ec94c2696f0bcd5ae92a553244e4ac30d2171a2d#diff-901c0d3a066cb54add11c34dabb345263e8fc6e7fbfd702843be6b7435345b59R139

Once we understood the issue, the source code told us that upstream should be affected as well. We then tried it on Fedora 39 running kernel 6.8.6-200.fc39.x86_64, which was a recent version at that time. The same simplified reproducer triggered the following splat. The link for the full splat is here.

ShellSession
[433522.338983] ------------[ cut here ]------------
[433522.339033] refcount_t: addition on 0; use-after-free.
...
[433522.340141] CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1
...
[433522.340278] Call Trace:
[433522.340282]  <TASK>
[433522.340307]  ? refcount_warn_saturate+0xe5/0x110
[433522.340313]  ? __warn+0x81/0x130
[433522.340462]  ? refcount_warn_saturate+0xe5/0x110
[433522.340492]  ? report_bug+0x171/0x1a0
[433522.340723]  ? refcount_warn_saturate+0xe5/0x110
[433522.340731]  ? handle_bug+0x3c/0x80
[433522.340781]  ? exc_invalid_op+0x17/0x70
[433522.340785]  ? asm_exc_invalid_op+0x1a/0x20
[433522.340838]  ? refcount_warn_saturate+0xe5/0x110
[433522.340843]  tcp_twsk_unique+0x186/0x190
[433522.340945]  __inet_check_established+0x176/0x2d0
[433522.340974]  __inet_hash_connect+0x74/0x7d0
[433522.340980]  ? __pfx___inet_check_established+0x10/0x10
[433522.340983]  tcp_v4_connect+0x278/0x530
[433522.340989]  __inet_stream_connect+0x10f/0x3d0
[433522.341019]  inet_stream_connect+0x3a/0x60
[433522.341024]  __sys_connect+0xa8/0xd0
[433522.341186]  __x64_sys_connect+0x18/0x20
...
[433522.341703] ---[ end trace 0000000000000000 ]---
[433522.341709] ------------[ cut here ]------------
[433522.341710] refcount_t: underflow; use-after-free.
...
[433522.341820] CPU: 0 PID: 1039313 Comm: trigger Tainted: G        W          6.8.6-200.fc39.x86_64 #1
...
[433522.341887] Call Trace:
[433522.341889]  <TASK>
[433522.341890]  ? refcount_warn_saturate+0xbe/0x110
[433522.341894]  ? __warn+0x81/0x130
[433522.341899]  ? refcount_warn_saturate+0xbe/0x110
[433522.341903]  ? report_bug+0x171/0x1a0
[433522.341907]  ? console_unlock+0x78/0x120
[433522.341977]  ? handle_bug+0x3c/0x80
[433522.341981]  ? exc_invalid_op+0x17/0x70
[433522.342007]  ? asm_exc_invalid_op+0x1a/0x20
[433522.342011]  ? refcount_warn_saturate+0xbe/0x110
[433522.342015]  __inet_check_established+0x24d/0x2d0
[433522.342019]  __inet_hash_connect+0x74/0x7d0
[433522.342023]  ? __pfx___inet_check_established+0x10/0x10
[433522.342026]  tcp_v4_connect+0x278/0x530
[433522.342031]  __inet_stream_connect+0x10f/0x3d0
[433522.342035]  inet_stream_connect+0x3a/0x60
[433522.342039]  __sys_connect+0xa8/0xd0
[433522.342044]  __x64_sys_connect+0x18/0x20
...
[433522.342097] ---[ end trace 0000000000000000 ]---
[435060.554199] ------------[ cut here ]------------
[435060.554243] refcount_t: decrement hit 0; leaking memory.
...
[435060.554426] CPU: 2 PID: 879478 Comm: trigger Tainted: G        W          6.8.6-200.fc39.x86_64 #1
...
[435060.554603] Call Trace:
[435060.554607]  <TASK>
[435060.554608]  ? refcount_warn_saturate+0xff/0x110
[435060.554614]  ? __warn+0x81/0x130
[435060.554625]  ? refcount_warn_saturate+0xff/0x110
[435060.554630]  ? report_bug+0x171/0x1a0
[435060.554638]  ? console_unlock+0x78/0x120
[435060.554670]  ? handle_bug+0x3c/0x80
[435060.554676]  ? exc_invalid_op+0x17/0x70
[435060.554682]  ? asm_exc_invalid_op+0x1a/0x20
[435060.554694]  ? refcount_warn_saturate+0xff/0x110
[435060.554699]  __inet_check_established+0x29b/0x2d0
[435060.554707]  __inet_hash_connect+0x74/0x7d0
[435060.554712]  ? __pfx___inet_check_established+0x10/0x10
[435060.554716]  tcp_v4_connect+0x278/0x530
[435060.554723]  __inet_stream_connect+0x10f/0x3d0
[435060.554729]  inet_stream_connect+0x3a/0x60
[435060.554734]  __sys_connect+0xa8/0xd0
[435060.554744]  __x64_sys_connect+0x18/0x20
...
[435060.555160] ---[ end trace 0000000000000000 ]---

We initially intended to reproduce a known bug on a Linux distribution, and we accidentally uncovered a seven-year-old vulnerability in the Linux kernel. Now, let’s understand the vulnerability. We reached the code below. On line 149, the object’s reference counter is initialized, but by then the object has already been added to the hash table and the lock has been released. A lookup on the hash table can therefore find the object with a zeroed reference counter.

C
100 void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
101                            struct inet_hashinfo *hashinfo)
102 {
...
105         struct inet_ehash_bucket *ehead = inet_ehash_bucket(hashinfo, sk->sk_hash);
106         spinlock_t *lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
...
130         spin_lock(lock);
...
132         inet_twsk_add_node_rcu(tw, &ehead->chain);
...
138         spin_unlock(lock);
...
149         refcount_set(&tw->tw_refcnt, 3);
150 }

The lookup is performed by __inet_check_established(). It finds the object in the loop starting on line 561 and then calls twsk_unique().

C
538 static int __inet_check_established(struct inet_timewait_death_row *death_row,
539                                     struct sock *sk, __u16 lport,
540                                     struct inet_timewait_sock **twp)
541 {
542         struct inet_hashinfo *hinfo = death_row->hashinfo;
543         struct inet_sock *inet = inet_sk(sk);
544         __be32 daddr = inet->inet_rcv_saddr;
545         __be32 saddr = inet->inet_daddr;
546         int dif = sk->sk_bound_dev_if;
547         struct net *net = sock_net(sk);
...
549         INET_ADDR_COOKIE(acookie, saddr, daddr);
550         const __portpair ports = INET_COMBINED_PORTS(inet->inet_dport, lport);
551         unsigned int hash = inet_ehashfn(net, daddr, lport,
552                                          saddr, inet->inet_dport);
553         struct inet_ehash_bucket *head = inet_ehash_bucket(hinfo, hash);
...
555         struct sock *sk2;
556         const struct hlist_nulls_node *node;
...
561         sk_nulls_for_each(sk2, node, &head->chain) {
562                 if (sk2->sk_hash != hash)
563                         continue;
...
565                 if (likely(inet_match(net, sk2, acookie, ports, dif, sdif))) {
566                         if (sk2->sk_state == TCP_TIME_WAIT) {
567                                 tw = inet_twsk(sk2);
568                                 if (twsk_unique(sk, sk2, twp))
569                                         break;
570                         }
571                         goto not_unique;
572                 }
573         }

The function twsk_unique() calls a dynamic function that points to tcp_twsk_unique() in our case.

C
23 static inline int twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
24 {
25         if (sk->sk_prot->twsk_prot->twsk_unique != NULL)
26                 return sk->sk_prot->twsk_prot->twsk_unique(sk, sktw, twp);
27         return 0;
28 }

The function tcp_twsk_unique() performs its first operation on the object on line 177, where it calls sock_hold() to obtain a reference to the object.

C
110 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
111 {
...
151         if (tcptw->tw_ts_recent_stamp &&
152             (!twp || (reuse && time_after32(ktime_get_seconds(),
153                                             tcptw->tw_ts_recent_stamp)))) {
...
177                 sock_hold(sktw);
178                 return 1;
179         }
180
181         return 0;
182 }

The function sock_hold() should increment the reference counter of the time-wait socket. However, as the object has a zeroed reference counter, the reference counter API detects it and triggers a warning before acting on the object.

Exploitability

The Linux kernel contains protection to detect reference counter issues. That protection triggered the splats shown above; they are not the result of invalid memory accesses. Even though the warnings mention use-after-free, the warnings alone do not prove a real use-after-free, because the object might not have been freed at the time of the access. In this case, however, a real use-after-free can happen even with the reference counter protection in use.

For this to happen, the operations on the socket need to occur in a precise order. When the reference counter API detects an issue, like the one triggered by sock_hold(), it prints the warning and sets the reference counter to a hardcoded value, 0xc0000000. We show the simplified execution flow below:

C
774 static __always_inline void sock_hold(struct sock *sk)
775 {
776         refcount_inc(&sk->sk_refcnt);
777 }

C
191 static inline void __refcount_add(int i, refcount_t *r, int *oldp)
192 {
193         int old = atomic_fetch_add_relaxed(i, &r->refs);
194 
195         if (oldp)
196                 *oldp = old;
197 
198         if (unlikely(!old))
199                 refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
200         else if (unlikely(old < 0 || old + i < 0))
201                 refcount_warn_saturate(r, REFCOUNT_ADD_OVF);
202 }

C
13 void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t)
14 {
15         refcount_set(r, REFCOUNT_SATURATED);
16 
17         switch (t) {
18         case REFCOUNT_ADD_NOT_ZERO_OVF:
19                 REFCOUNT_WARN("saturated; leaking memory");
20                 break;
21         case REFCOUNT_ADD_OVF:
22                 REFCOUNT_WARN("saturated; leaking memory");
23                 break;
24         case REFCOUNT_ADD_UAF:
25                 REFCOUNT_WARN("addition on 0; use-after-free");
26                 break;
27         case REFCOUNT_SUB_UAF:
28                 REFCOUNT_WARN("underflow; use-after-free");
29                 break;
30         case REFCOUNT_DEC_LEAK:
31                 REFCOUNT_WARN("decrement hit 0; leaking memory");
32                 break;
33         default:
34                 REFCOUNT_WARN("unknown saturation event!?");
35         }
36 }

The hardcoded value REFCOUNT_SATURATED is defined as (INT_MIN / 2), which is equal to 0xc0000000 when viewed as an unsigned 32-bit value.

C
117 #define REFCOUNT_SATURATED      (INT_MIN / 2)

As that value is far from zero and the object never holds that many references, the release operations never bring the counter back to zero, so the object isn’t freed. Reference counter issues like this usually become a memory leak rather than a use-after-free. But this vulnerability has an interesting twist. The refcount_set() call in inet_twsk_hashdance() doesn’t check the reference counter value before initializing it, so it overwrites the hardcoded value that sock_hold() stored when it detected the zeroed counter. In the end, if the reference counter is initialized after sock_hold() runs, the reference that sock_hold() should have obtained is lost: the counter becomes unbalanced, and the release operations free the object while it is still in use.

C
134 static inline void refcount_set(refcount_t *r, int n)
135 {
136         atomic_set(&r->refs, n);
137 }

In the normal scenario, the object’s reference counter is initialized to 3, the lookup happens, and sock_hold() increments it to 4. If the execution order occurs as described, sock_hold() finds the object with a zeroed reference counter and sets it to 0xc0000000, and then refcount_set() overwrites it with 3. The execution flow continues with the reference counter set to 3 when it should be 4, leading to a real use-after-free after the sock_hold() warning is triggered. The GDB output for this execution order is shown below: inet_twsk_hashdance() is about to store 3 into the counter, whose 32-bit value is already 0xc0000000.

ShellSession
(gdb) c
Continuing.
[Switching to Thread 3]

Thread 3 hit Breakpoint 6, arch_atomic_set (i=3, v=0xffff8880243ddb50) at ./arch/x86/include/asm/atomic.h:41
41		__WRITE_ONCE(v->counter, i);
(gdb) x/i $rip
=> 0xffffffff81a96b0c <inet_twsk_hashdance+220>:	movl   $0x3,0x80(%rbx)
(gdb) x/2gx $rbx + 0x80
0xffff8880243ddb50:	0x2d43199ac0000000	0xba5c070600000000
(gdb)

To confirm this execution flow could happen, we built the same kernel with some modifications. We added an mdelay() call before inet_twsk_hashdance() initializes the reference counter and another one after sock_hold(), to give inet_twsk_hashdance() enough time to execute before the release operations. We also enabled KASAN in that build to get a KASAN report confirming the real use-after-free. There’s an additional detail: the affected cache has the SLAB_TYPESAFE_BY_RCU flag set, and KASAN can’t detect issues on caches with this flag set, so we manually removed that flag from the cache. We are aware that kernel 6.12 contains a feature that teaches KASAN to detect issues on those caches, enabled by CONFIG_SLUB_RCU_DEBUG, but that kernel version also includes the fix for the vulnerability and many other changes. As the subsystem has changed since then, we didn’t want to spend time re-introducing the vulnerability there or back-porting the feature. We obtained the following report. The link for the full file is here.

ShellSession
[   64.602085] ------------[ cut here ]------------
[   64.602088] refcount_t: addition on 0; use-after-free.
...
[   64.603847] CPU: 2 PID: 5282 Comm: trigger Not tainted 5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[   64.796078] ---[ end trace 0b5cc4dcce1de1a1 ]---
[   65.800530] ==================================================================
[   65.800534] BUG: KASAN: use-after-free in inet_twsk_put+0x1f/0x60
[   65.800544] Write of size 4 at addr ffff88801b2e9840 by task exploit02/5282
[   65.800550] CPU: 2 PID: 5282 Comm: trigger Tainted: G        W         -------  ---  5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[   65.800559] Call Trace:
[   65.800561]  <TASK>
[   65.800563]  ? inet_twsk_put+0x1f/0x60
[   65.800569]  dump_stack_lvl+0x34/0x48
[   65.800574]  print_address_description.constprop.0+0x1f/0x1e0
[   65.800600]  ? inet_twsk_put+0x1f/0x60
[   65.800606]  print_report.cold+0x55/0x244
[   65.800612]  ? _raw_spin_lock_irqsave+0x87/0xe0
[   65.800617]  kasan_report+0xb5/0x130
[   65.800623]  ? inet_twsk_put+0x1f/0x60
[   65.800628]  kasan_check_range+0xfd/0x1e0
[   65.800633]  inet_twsk_put+0x1f/0x60
[   65.800638]  __inet_check_established+0x3e0/0x4b0
[   65.800645]  __inet_hash_connect+0x179/0x7d0
[   65.800651]  ? tcp_set_state+0x125/0x340
[   65.800655]  ? __pfx_tcp_set_state+0x10/0x10
[   65.800659]  ? __pfx___inet_check_established+0x10/0x10
[   65.800664]  ? __pfx_ip_route_output_flow+0x10/0x10
[   65.800672]  ? __pfx___inet_hash_connect+0x10/0x10
[   65.800681]  ? __pfx_avc_has_perm+0x10/0x10
[   65.800687]  ? futex_hash+0xa0/0x120
[   65.800693]  ? selinux_sk_getsecid+0x42/0x50
[   65.800700]  tcp_v4_connect+0x5e0/0xaf0
[   65.800710]  ? __pfx_tcp_v4_connect+0x10/0x10
[   65.800716]  ? __pfx_selinux_socket_connect_helper.isra.0+0x10/0x10
[   65.800726]  __inet_stream_connect+0x1ba/0x680
[   65.800734]  ? _raw_spin_lock_bh+0x85/0xe0
[   65.800741]  ? __pfx___inet_stream_connect+0x10/0x10
[   65.800749]  ? _raw_spin_lock_bh+0x85/0xe0
[   65.800756]  ? __pfx__raw_spin_lock_bh+0x10/0x10
[   65.800761]  ? selinux_netlbl_socket_connect+0x2b/0x40
[   65.800768]  inet_stream_connect+0x44/0x70
[   65.800773]  __sys_connect+0x101/0x130
[   65.800779]  ? __pfx___sys_connect+0x10/0x10
[   65.800784]  ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[   65.800790]  ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[   65.800797]  ? __pfx___x64_sys_futex+0x10/0x10
[   65.800801]  ? __audit_syscall_entry+0x178/0x200
[   65.800807]  ? ktime_get_coarse_real_ts64+0x4a/0x70
[   65.800813]  __x64_sys_connect+0x3c/0x50
...
[   65.800938] Allocated by task 5281:
[   65.800940]  kasan_save_stack+0x1e/0x40
[   65.800945]  __kasan_slab_alloc+0x66/0x80
[   65.800949]  kmem_cache_alloc+0x155/0x310
[   65.800954]  inet_twsk_alloc+0x88/0x340
[   65.800958]  tcp_time_wait+0x41/0x510
[   65.800962]  tcp_fin+0x1c2/0x240
[   65.800965]  tcp_data_queue+0x882/0xb20
[   65.800969]  tcp_rcv_state_process+0x4a3/0xe20
[   65.800973]  tcp_v4_do_rcv+0x169/0x3e0
[   65.800980]  tcp_v4_rcv+0x1871/0x1910
[   65.800985]  ip_protocol_deliver_rcu+0x41/0x4b0
[   65.800992]  ip_local_deliver_finish+0xfc/0x130
[   65.800998]  ip_local_deliver+0x1d0/0x1e0
[   65.801005]  ip_rcv+0x255/0x270
[   65.801010]  __netif_receive_skb_one_core+0x123/0x140
[   65.801016]  process_backlog+0xf1/0x280
[   65.801020]  __napi_poll+0x59/0x260
[   65.801025]  net_rx_action+0x433/0x540
[   65.801033]  __do_softirq+0xf3/0x39d
[   65.801040] Freed by task 5282:
[   65.801042]  kasan_save_stack+0x1e/0x40
[   65.801048]  kasan_set_track+0x21/0x30
[   65.801052]  kasan_set_free_info+0x20/0x40
[   65.801058]  ____kasan_slab_free+0x14e/0x1b0
[   65.801064]  kmem_cache_free+0x1b7/0x430
[   65.801068]  inet_twsk_free+0x90/0xa0
[   65.801075]  inet_twsk_deschedule_put+0x2a/0x40
[   65.801082]  __inet_check_established+0x3e0/0x4b0
[   65.801089]  __inet_hash_connect+0x179/0x7d0
[   65.801097]  tcp_v4_connect+0x5e0/0xaf0
[   65.801101]  __inet_stream_connect+0x1ba/0x680
[   65.801107]  inet_stream_connect+0x44/0x70
[   65.801112]  __sys_connect+0x101/0x130
[   65.801117]  __x64_sys_connect+0x3c/0x50
[   65.801122]  do_syscall_64+0x59/0x90
[   65.801128]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[   65.801137] The buggy address belongs to the object at ffff88801b2e97c0
                which belongs to the cache tw_sock_TCP of size 256
[   65.801143] The buggy address is located 128 bytes inside of
                256-byte region [ffff88801b2e97c0, ffff88801b2e98c0)
[   65.801150] The buggy address belongs to the physical page:
[   65.801154] page:ffffea00006cba00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b2e8
[   65.801160] head:ffffea00006cba00 order:1 compound_mapcount:0 compound_pincount:0
[   65.801164] memcg:ffff88800fbf0e01
[   65.801166] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[   65.801177] raw: 000fffffc0010200 0000000000000000 dead000000000122 ffff88805fad12c0
[   65.801183] raw: 0000000000000000 0000000080190019 00000001ffffffff ffff88800fbf0e01
[   65.801186] page dumped because: kasan: bad access detected
[   65.801190] Memory state around the buggy address:
[   65.801194]  ffff88801b2e9700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   65.801199]  ffff88801b2e9780: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[   65.801203] >ffff88801b2e9800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   65.801207]                                            ^
[   65.801210]  ffff88801b2e9880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[   65.801214]  ffff88801b2e9900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   65.801217] ==================================================================
[   65.801272] Disabling lock debugging due to kernel taint
[   65.801275] ------------[ cut here ]------------
[   65.801276] refcount_t: underflow; use-after-free.
...
[   65.801407] CPU: 2 PID: 5282 Comm: trigger Tainted: G    B   W         -------  ---  5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[   65.801457] Call Trace:
[   65.801460]  <TASK>
[   65.801462]  ? show_trace_log_lvl+0x1c4/0x2df
[   65.801468]  ? show_trace_log_lvl+0x1c4/0x2df
[   65.801475]  ? __inet_check_established+0x3e0/0x4b0
[   65.801481]  ? refcount_warn_saturate+0xcd/0x120
[   65.801486]  ? __warn+0x9c/0x150
[   65.801492]  ? refcount_warn_saturate+0xcd/0x120
[   65.801497]  ? report_bug+0x15e/0x180
[   65.801504]  ? handle_bug+0x3c/0x70
[   65.801509]  ? exc_invalid_op+0x14/0x50
[   65.801514]  ? asm_exc_invalid_op+0x16/0x20
[   65.801519]  ? irq_work_claim+0x1e/0x40
[   65.801526]  ? refcount_warn_saturate+0xcd/0x120
[   65.801531]  __inet_check_established+0x3e0/0x4b0
[   65.801539]  __inet_hash_connect+0x179/0x7d0
[   65.801544]  ? tcp_set_state+0x125/0x340
[   65.801548]  ? __pfx_tcp_set_state+0x10/0x10
[   65.801553]  ? __pfx___inet_check_established+0x10/0x10
[   65.801557]  ? __pfx_ip_route_output_flow+0x10/0x10
[   65.801563]  ? __pfx___inet_hash_connect+0x10/0x10
[   65.801569]  ? __pfx_avc_has_perm+0x10/0x10
[   65.801573]  ? futex_hash+0xa0/0x120
[   65.801577]  ? selinux_sk_getsecid+0x42/0x50
[   65.801602]  tcp_v4_connect+0x5e0/0xaf0
[   65.801610]  ? __pfx_tcp_v4_connect+0x10/0x10
[   65.801615]  ? __pfx_selinux_socket_connect_helper.isra.0+0x10/0x10
[   65.801621]  __inet_stream_connect+0x1ba/0x680
[   65.801627]  ? _raw_spin_lock_bh+0x85/0xe0
[   65.801632]  ? __pfx___inet_stream_connect+0x10/0x10
[   65.801636]  ? _raw_spin_lock_bh+0x85/0xe0
[   65.801640]  ? __pfx__raw_spin_lock_bh+0x10/0x10
[   65.801645]  ? selinux_netlbl_socket_connect+0x2b/0x40
[   65.801652]  inet_stream_connect+0x44/0x70
[   65.801657]  __sys_connect+0x101/0x130
[   65.801661]  ? __pfx___sys_connect+0x10/0x10
[   65.801667]  ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[   65.801673]  ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[   65.801678]  ? __pfx___x64_sys_futex+0x10/0x10
[   65.801682]  ? __audit_syscall_entry+0x178/0x200
[   65.801688]  ? ktime_get_coarse_real_ts64+0x4a/0x70
[   65.801694]  __x64_sys_connect+0x3c/0x50
...
[   65.801799] ---[ end trace 0b5cc4dcce1de1a2 ]---

Conclusion

Besides triggering the KASAN splat, we didn’t spend any time researching the exploitability of the vulnerability. Ultimately, we confirmed that a real use-after-free can happen even with the reference counter protection mechanism in place. We are not aware of any other vulnerability that could trigger a real use-after-free in that scenario. We had to understand the vulnerability to write a trigger from scratch. The simplified syzkaller reproducer works, but its code is odd, and it seems to take much longer to trigger the bug. Without the delays and in a KASAN-enabled kernel, even the warning is hard to trigger: we let it run for more than 48 hours, several times, and the warnings appeared only on some of those runs. As the goal of the work was to confirm the potential for a real use-after-free, we inserted the delays, but the semantics of the code remained intact.

This vulnerability showed us once again that interesting findings are sometimes discovered by chance. That’s why we value playing around when researching. We never know what we might discover.

The materials for this research can be found on our GitHub at the following link: https://github.com/alleleintel/research/tree/master/CVE-2024-36904

References

general protection fault in skb_unlink
https://syzkaller.appspot.com/bug?extid=278279efdd2730dd14bf

kcm: close race conditions on sk_receive_queue
https://github.com/torvalds/linux/commit/5121197ecc5db58c07da95eb1ff82b98b121a221

tcp/dccp: avoid one atomic operation for timewait hashdance
https://github.com/torvalds/linux/commit/ec94c2696f0bcd5ae92a553244e4ac30d2171a2d

tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().
https://github.com/torvalds/linux/commit/f2db7230f73a80dbb179deab78f88a7947f0ab7e

use-after-free warnings in tcp_v4_connect() due to inet_twsk_hashdance() inserting the object into ehash table without initializing its reference counter
https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/

slub: Introduce CONFIG_SLUB_RCU_DEBUG
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b8c8ba73c68bb3c3e9dad22f488b86c540c839f9