Vulnerability research is at the core of Allele Security Intelligence. We have been actively researching for more than a decade, and we offer our expertise to our clients. Among the services we offer are 0day and nday vulnerability research.
In nday vulnerability research projects on the Linux kernel, we look for vulnerabilities that have been patched upstream but still affect major distributions, even in their latest releases. We often find vulnerabilities patched more than a year ago that still affect popular Linux distributions. We do that by auditing the Linux kernel source code, monitoring vulnerabilities submitted to mailing lists and patched upstream, reviewing the findings of the syzkaller fuzzer, and other means.
While doing that research, we accidentally discovered a vulnerability in the core of the TCP subsystem of the Linux kernel. It had been introduced seven years earlier. We reported it upstream, and it was patched in May of last year. In this blog post, we’ll share how it happened and briefly analyze the vulnerability.
KCM networking protocol race condition on sk_receive_queue
The following vulnerability caught our attention when checking the dashboard of the syzkaller fuzzer. The commit that fixed it is also shown below.
general protection fault in skb_unlink
https://syzkaller.appspot.com/bug?extid=278279efdd2730dd14bf
kcm: close race conditions on sk_receive_queue
https://github.com/torvalds/linux/commit/5121197ecc5db58c07da95eb1ff82b98b121a221
We checked the source code of the latest release of one of the Red Hat Enterprise Linux derivatives, and it was still vulnerable. The kernels of Red Hat Enterprise Linux derivatives are similar to the Red Hat Enterprise Linux kernel, so a vulnerability in one usually ends up affecting all of them. There were some recent changes in the Red Hat Enterprise Linux ecosystem, and we haven’t checked whether this still holds, but it was true for a while. Fortunately, the bug had a reproducer available, and we let it run on our systems. After a while, the following splat was generated. The full splat can be found here.
[111607.195289] ------------[ cut here ]------------
[111607.195305] refcount_t: addition on 0; use-after-free.
...
[111607.197802] CPU: 2 PID: 3130808 Comm: poc6 Kdump: loaded Not tainted 5.14.0-362.24.2.el9_3.x86_64 #1
...
[111607.201030] Call Trace:
[111607.201182]
[111607.201350] ? show_trace_log_lvl+0x1c4/0x2df
[111607.201623] ? show_trace_log_lvl+0x1c4/0x2df
[111607.201800] ? tcp_twsk_unique+0x183/0x190
[111607.201993] ? refcount_warn_saturate+0x74/0x110
[111607.202164] ? __warn+0x81/0x110
[111607.202442] ? refcount_warn_saturate+0x74/0x110
[111607.202692] ? report_bug+0x10a/0x140
[111607.202931] ? handle_bug+0x3c/0x70
[111607.203212] ? exc_invalid_op+0x14/0x70
[111607.203425] ? asm_exc_invalid_op+0x16/0x20
[111607.203679] ? refcount_warn_saturate+0x74/0x110
[111607.203924] tcp_twsk_unique+0x183/0x190
[111607.204091] __inet_check_established+0x158/0x2c0
[111607.204335] __inet_hash_connect+0xb7/0x540
[111607.204590] ? __pfx___inet_check_established+0x10/0x10
[111607.206806] tcp_v4_connect+0x24e/0x520
[111607.207707] __inet_stream_connect+0xcb/0x3b0
[111607.208583] ? release_sock+0x40/0x90
[111607.209469] ? selinux_netlbl_socket_connect+0x2b/0x40
[111607.210342] inet_stream_connect+0x37/0x60
[111607.211177] __sys_connect+0xa3/0xd0
[111607.211985] __x64_sys_connect+0x14/0x20
...
[111607.223742] ------------[ cut here ]------------
[111607.223743] refcount_t: underflow; use-after-free.
...
[111607.228620] CPU: 2 PID: 3130808 Comm: poc6 Kdump: loaded Tainted: G W ------- --- 5.14.0-362.24.2.el9_3.x86_64 #1
...
[111607.239689] Call Trace:
[111607.240472]
[111607.241313] ? show_trace_log_lvl+0x1c4/0x2df
[111607.242156] ? show_trace_log_lvl+0x1c4/0x2df
[111607.242989] ? __inet_check_established+0x23a/0x2c0
[111607.243838] ? refcount_warn_saturate+0xba/0x110
[111607.244669] ? __warn+0x81/0x110
[111607.245484] ? refcount_warn_saturate+0xba/0x110
[111607.246331] ? report_bug+0x10a/0x140
[111607.247155] ? handle_bug+0x3c/0x70
[111607.248005] ? exc_invalid_op+0x14/0x70
[111607.248834] ? asm_exc_invalid_op+0x16/0x20
[111607.249629] ? refcount_warn_saturate+0xba/0x110
[111607.250435] __inet_check_established+0x23a/0x2c0
[111607.251279] __inet_hash_connect+0xb7/0x540
[111607.252135] ? __pfx___inet_check_established+0x10/0x10
[111607.252962] tcp_v4_connect+0x24e/0x520
[111607.253786] __inet_stream_connect+0xcb/0x3b0
[111607.254601] ? release_sock+0x40/0x90
[111607.255361] ? selinux_netlbl_socket_connect+0x2b/0x40
[111607.256105] inet_stream_connect+0x37/0x60
[111607.256913] __sys_connect+0xa3/0xd0
[111607.257709] __x64_sys_connect+0x14/0x20
...
[116082.336931] ------------[ cut here ]------------
[116082.336934] refcount_t: decrement hit 0; leaking memory.
...
[116082.342044] CPU: 1 PID: 3866568 Comm: poc5 Kdump: loaded Tainted: G W ------- --- 5.14.0-362.24.2.el9_3.x86_64 #1
...
[116082.352815] Call Trace:
[116082.353638]
[116082.354410] ? show_trace_log_lvl+0x1c4/0x2df
[116082.355236] ? show_trace_log_lvl+0x1c4/0x2df
[116082.356055] ? __inet_check_established+0x29c/0x2c0
[116082.356874] ? refcount_warn_saturate+0xfb/0x110
[116082.357692] ? __warn+0x81/0x110
[116082.358505] ? refcount_warn_saturate+0xfb/0x110
[116082.359302] ? report_bug+0x10a/0x140
[116082.360124] ? handle_bug+0x3c/0x70
[116082.360905] ? exc_invalid_op+0x14/0x70
[116082.361697] ? asm_exc_invalid_op+0x16/0x20
[116082.362491] ? refcount_warn_saturate+0xfb/0x110
[116082.363300] __inet_check_established+0x29c/0x2c0
[116082.364098] __inet_hash_connect+0xb7/0x540
[116082.364893] ? __pfx___inet_check_established+0x10/0x10
[116082.365689] tcp_v4_connect+0x24e/0x520
[116082.366454] ? pgtable_trans_huge_deposit+0x88/0x110
[116082.367247] __inet_stream_connect+0xcb/0x3b0
[116082.368000] ? release_sock+0x40/0x90
[116082.368746] ? selinux_netlbl_socket_connect+0x2b/0x40
[116082.369488] inet_stream_connect+0x37/0x60
[116082.370197] __sys_connect+0xa3/0xd0
[116082.370942] __x64_sys_connect+0x14/0x20
...
[116082.381500] ---[ end trace f02e72c43eeca11a ]---
We then started simplifying the syzkaller reproducer to get cleaner code. We reduced it step by step, identifying the necessary pieces and removing the rest, until we reached a simplified version. We had already suspected it from the splat we obtained, but during this process we noticed that our trigger had nothing to do with the networking protocol described in the syzkaller splat. The call trace in the syzkaller splat involves the KCM protocol, but we had removed everything related to KCM from our simplified trigger, and it still triggered a splat.
Syzkaller splat:
Call Trace:
kcm_recvmsg+0x462/0x560 net/kcm/kcmsock.c:1161
sock_recvmsg_nosec+0x89/0xb0 net/socket.c:871
___sys_recvmsg+0x271/0x5c0 net/socket.c:2480
do_recvmmsg+0x27e/0x7a0 net/socket.c:2601
__sys_recvmmsg+0x259/0x270 net/socket.c:2680
__do_sys_recvmmsg net/socket.c:2703 [inline]
__se_sys_recvmmsg net/socket.c:2696 [inline]
__x64_sys_recvmmsg+0xe6/0x140 net/socket.c:2696
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The splat we triggered by running the syzkaller reproducer:
Call Trace:
show_trace_log_lvl+0x1c4/0x2df
show_trace_log_lvl+0x1c4/0x2df
tcp_twsk_unique+0x183/0x190
refcount_warn_saturate+0x74/0x110
__warn+0x81/0x110
refcount_warn_saturate+0x74/0x110
report_bug+0x10a/0x140
handle_bug+0x3c/0x70
exc_invalid_op+0x14/0x70
asm_exc_invalid_op+0x16/0x20
refcount_warn_saturate+0x74/0x110
tcp_twsk_unique+0x183/0x190
__inet_check_established+0x158/0x2c0
__inet_hash_connect+0xb7/0x540
__pfx___inet_check_established+0x10/0x10
tcp_v4_connect+0x24e/0x520
__inet_stream_connect+0xcb/0x3b0
release_sock+0x40/0x90
selinux_netlbl_socket_connect+0x2b/0x40
inet_stream_connect+0x37/0x60
__sys_connect+0xa3/0xd0
__x64_sys_connect+0x14/0x20
do_syscall_64+0x59/0x90
handle_mm_fault+0xc5/0x2a0
do_user_addr_fault+0x1d6/0x6a0
exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x72/0xdc
As shown above, the bug we triggered has nothing to do with the KCM protocol. The syzkaller splat was generated by KASAN, whereas ours was a warning raised by the protection mechanisms implemented in the reference counter API of the Linux kernel.
We confirmed that the bug we were triggering was a different one after checking that the module for the KCM protocol wasn’t even available on the system. To reduce the time needed to analyze issues, we run proofs of concept or reproducers on several distributions simultaneously before inspecting each system, and only if they trigger something do we start analyzing it.
$ cat /boot/config-5.14.0-362.24.2.el9_3.x86_64 | grep CONFIG_AF_KCM
# CONFIG_AF_KCM is not set
$
Use-after-free in the TCP subsystem of the Linux kernel due to a race condition between tcp_twsk_unique() and inet_twsk_hashdance() (CVE-2024-36904)
We confirmed it was a different bug and analyzed it to understand it better. The problem happens because the object, a time-wait TCP socket, has its reference counter initialized only after it has been inserted into a hash table and the lock has been released. If a lookup is performed before the reference counter is initialized, the object is found with a zeroed reference counter, and the warning is triggered. The commit that introduced the problem:
tcp/dccp: avoid one atomic operation for timewait hashdance
https://github.com/torvalds/linux/commit/ec94c2696f0bcd5ae92a553244e4ac30d2171a2d#diff-901c0d3a066cb54add11c34dabb345263e8fc6e7fbfd702843be6b7435345b59R139
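To make the window concrete, the ordering can be modeled in userspace. The following is a minimal, single-threaded sketch, not kernel code: `tw_model`, `bucket_model`, and every function name are hypothetical stand-ins, and the interleaving is replayed deterministically rather than raced across CPUs. It only shows that a lookup landing between insertion and initialization observes a counter of zero. The second function is a comparison order, not the upstream fix.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a time-wait socket; not the kernel structure. */
struct tw_model {
    atomic_int refcnt;      /* stays 0 until the owner initializes it */
    struct tw_model *next;  /* hash-chain link */
};

/* Hypothetical single-bucket hash table. */
struct bucket_model {
    struct tw_model *head;
};

/* Make the object reachable by lookups (models the insertion under the lock). */
static void publish(struct bucket_model *b, struct tw_model *tw)
{
    tw->next = b->head;
    b->head = tw;
}

/* What a lookup sees: the counter of the first entry, or -1 for an empty bucket. */
static int lookup_refcnt(struct bucket_model *b)
{
    return b->head ? atomic_load(&b->head->refcnt) : -1;
}

/* Buggy order: publish, (lookup runs here), then initialize the counter. */
int observe_buggy_order(void)
{
    struct bucket_model b = { NULL };
    struct tw_model tw;
    int seen;

    atomic_init(&tw.refcnt, 0);
    tw.next = NULL;
    publish(&b, &tw);
    seen = lookup_refcnt(&b);       /* the lookup lands inside the window */
    atomic_store(&tw.refcnt, 3);    /* late initialization */
    return seen;
}

/* Comparison order: initialize the counter before publishing. */
int observe_initialized_first(void)
{
    struct bucket_model b = { NULL };
    struct tw_model tw;

    atomic_init(&tw.refcnt, 3);
    tw.next = NULL;
    publish(&b, &tw);
    return lookup_refcnt(&b);
}
```

In this model, `observe_buggy_order()` returns 0, the zeroed counter a lookup would see inside the window, while `observe_initialized_first()` returns 3.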
Having understood the issue, we concluded from the source code that upstream should also be affected. We then tested Fedora 39 running kernel 6.8.6-200.fc39.x86_64; kernel 6.8 was a recent version at that time. The same simplified reproducer triggered the following splat. The link for the full splat is here.
[433522.338983] ------------[ cut here ]------------
[433522.339033] refcount_t: addition on 0; use-after-free.
...
[433522.340141] CPU: 0 PID: 1039313 Comm: trigger Not tainted 6.8.6-200.fc39.x86_64 #1
...
[433522.340278] Call Trace:
[433522.340282] <TASK>
[433522.340307] ? refcount_warn_saturate+0xe5/0x110
[433522.340313] ? __warn+0x81/0x130
[433522.340462] ? refcount_warn_saturate+0xe5/0x110
[433522.340492] ? report_bug+0x171/0x1a0
[433522.340723] ? refcount_warn_saturate+0xe5/0x110
[433522.340731] ? handle_bug+0x3c/0x80
[433522.340781] ? exc_invalid_op+0x17/0x70
[433522.340785] ? asm_exc_invalid_op+0x1a/0x20
[433522.340838] ? refcount_warn_saturate+0xe5/0x110
[433522.340843] tcp_twsk_unique+0x186/0x190
[433522.340945] __inet_check_established+0x176/0x2d0
[433522.340974] __inet_hash_connect+0x74/0x7d0
[433522.340980] ? __pfx___inet_check_established+0x10/0x10
[433522.340983] tcp_v4_connect+0x278/0x530
[433522.340989] __inet_stream_connect+0x10f/0x3d0
[433522.341019] inet_stream_connect+0x3a/0x60
[433522.341024] __sys_connect+0xa8/0xd0
[433522.341186] __x64_sys_connect+0x18/0x20
...
[433522.341703] ---[ end trace 0000000000000000 ]---
[433522.341709] ------------[ cut here ]------------
[433522.341710] refcount_t: underflow; use-after-free.
...
[433522.341820] CPU: 0 PID: 1039313 Comm: trigger Tainted: G W 6.8.6-200.fc39.x86_64 #1
...
[433522.341887] Call Trace:
[433522.341889] <TASK>
[433522.341890] ? refcount_warn_saturate+0xbe/0x110
[433522.341894] ? __warn+0x81/0x130
[433522.341899] ? refcount_warn_saturate+0xbe/0x110
[433522.341903] ? report_bug+0x171/0x1a0
[433522.341907] ? console_unlock+0x78/0x120
[433522.341977] ? handle_bug+0x3c/0x80
[433522.341981] ? exc_invalid_op+0x17/0x70
[433522.342007] ? asm_exc_invalid_op+0x1a/0x20
[433522.342011] ? refcount_warn_saturate+0xbe/0x110
[433522.342015] __inet_check_established+0x24d/0x2d0
[433522.342019] __inet_hash_connect+0x74/0x7d0
[433522.342023] ? __pfx___inet_check_established+0x10/0x10
[433522.342026] tcp_v4_connect+0x278/0x530
[433522.342031] __inet_stream_connect+0x10f/0x3d0
[433522.342035] inet_stream_connect+0x3a/0x60
[433522.342039] __sys_connect+0xa8/0xd0
[433522.342044] __x64_sys_connect+0x18/0x20
...
[433522.342097] ---[ end trace 0000000000000000 ]---
[435060.554199] ------------[ cut here ]------------
[435060.554243] refcount_t: decrement hit 0; leaking memory.
...
[435060.554426] CPU: 2 PID: 879478 Comm: trigger Tainted: G W 6.8.6-200.fc39.x86_64 #1
...
[435060.554603] Call Trace:
[435060.554607] <TASK>
[435060.554608] ? refcount_warn_saturate+0xff/0x110
[435060.554614] ? __warn+0x81/0x130
[435060.554625] ? refcount_warn_saturate+0xff/0x110
[435060.554630] ? report_bug+0x171/0x1a0
[435060.554638] ? console_unlock+0x78/0x120
[435060.554670] ? handle_bug+0x3c/0x80
[435060.554676] ? exc_invalid_op+0x17/0x70
[435060.554682] ? asm_exc_invalid_op+0x1a/0x20
[435060.554694] ? refcount_warn_saturate+0xff/0x110
[435060.554699] __inet_check_established+0x29b/0x2d0
[435060.554707] __inet_hash_connect+0x74/0x7d0
[435060.554712] ? __pfx___inet_check_established+0x10/0x10
[435060.554716] tcp_v4_connect+0x278/0x530
[435060.554723] __inet_stream_connect+0x10f/0x3d0
[435060.554729] inet_stream_connect+0x3a/0x60
[435060.554734] __sys_connect+0xa8/0xd0
[435060.554744] __x64_sys_connect+0x18/0x20
...
[435060.555160] ---[ end trace 0000000000000000 ]---
We set out to reproduce a known bug on a Linux distribution and accidentally uncovered a seven-year-old vulnerability in the Linux kernel. Now, let’s understand the vulnerability. We reached the code below. On line 149, the reference counter of the object is initialized, but by then the object has already been inserted into the hash table and the lock has been released. A lookup on the hash table during that window finds the object with a zeroed reference counter.
100 void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
101                          struct inet_hashinfo *hashinfo)
102 {
...
105     struct inet_ehash_bucket *ehead = inet_ehash_bucket(hashinfo, sk->sk_hash);
106     spinlock_t *lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
...
130     spin_lock(lock);
...
132     inet_twsk_add_node_rcu(tw, &ehead->chain);
...
138     spin_unlock(lock);
...
149     refcount_set(&tw->tw_refcnt, 3);
150 }
The lookup is performed by __inet_check_established(). It finds the object in the loop on line 561 and then calls twsk_unique().
538 static int __inet_check_established(struct inet_timewait_death_row *death_row,
539                                     struct sock *sk, __u16 lport,
540                                     struct inet_timewait_sock **twp)
541 {
542     struct inet_hashinfo *hinfo = death_row->hashinfo;
543     struct inet_sock *inet = inet_sk(sk);
544     __be32 daddr = inet->inet_rcv_saddr;
545     __be32 saddr = inet->inet_daddr;
546     int dif = sk->sk_bound_dev_if;
547     struct net *net = sock_net(sk);
...
549     INET_ADDR_COOKIE(acookie, saddr, daddr);
550     const __portpair ports = INET_COMBINED_PORTS(inet->inet_dport, lport);
551     unsigned int hash = inet_ehashfn(net, daddr, lport,
552                                      saddr, inet->inet_dport);
553     struct inet_ehash_bucket *head = inet_ehash_bucket(hinfo, hash);
...
555     struct sock *sk2;
556     const struct hlist_nulls_node *node;
...
561     sk_nulls_for_each(sk2, node, &head->chain) {
562         if (sk2->sk_hash != hash)
563             continue;
...
565         if (likely(inet_match(net, sk2, acookie, ports, dif, sdif))) {
566             if (sk2->sk_state == TCP_TIME_WAIT) {
567                 tw = inet_twsk(sk2);
568                 if (twsk_unique(sk, sk2, twp))
569                     break;
570             }
571             goto not_unique;
572         }
573     }
The function twsk_unique() calls through a function pointer that, in our case, points to tcp_twsk_unique().
23 static inline int twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
24 {
25     if (sk->sk_prot->twsk_prot->twsk_unique != NULL)
26         return sk->sk_prot->twsk_prot->twsk_unique(sk, sktw, twp);
27     return 0;
28 }
The function tcp_twsk_unique() performs the first operation on the object on line 177: it calls sock_hold() to obtain a reference to the object.
110 int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
111 {
...
151     if (tcptw->tw_ts_recent_stamp &&
152         (!twp || (reuse && time_after32(ktime_get_seconds(),
153                                         tcptw->tw_ts_recent_stamp)))) {
...
177         sock_hold(sktw);
178         return 1;
179     }
180
181     return 0;
182 }
The function sock_hold() should increment the reference counter of the time-wait socket. However, because the counter is still zero, the reference counter API detects it and triggers a warning.
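The fix listed in the references is titled "tcp: Use refcount_inc_not_zero() in tcp_twsk_unique()". As a rough illustration of what an "increment unless zero" acquire means, here is a userspace sketch built on C11 atomics. It is an assumption-laden model, not the kernel implementation: `model_inc_not_zero()` is a hypothetical name, and saturation and overflow handling are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace model of an "increment unless zero" acquire.
 * Returns true and takes a reference only if the counter was non-zero.
 */
bool model_inc_not_zero(atomic_int *refs)
{
    int old = atomic_load(refs);

    while (old != 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(refs, &old, old + 1))
            return true;
    }
    return false;   /* counter was zero: the caller must not use the object */
}
```

With such a primitive, a lookup that races with the late initialization sees the acquire fail and can treat the entry as unusable, instead of incrementing a zeroed counter.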
Exploitability
The Linux kernel contains protections that detect reference counter issues. Those protections triggered the splats shown above; the splats are not the result of invalid memory accesses. Even though the warnings mention use-after-free, the warnings alone do not prove a real use-after-free: the object might not have been freed at the time of the access. In this case, however, a real use-after-free can happen even with the reference counter protection in use.
For this to happen, the operations on the socket need to occur in a precise order. When the reference counter API detects an issue, like the one triggered by sock_hold(), it prints the warning and sets the reference counter to a hardcoded saturation value, 0xc0000000. We show the simplified execution flow below:
774 static __always_inline void sock_hold(struct sock *sk)
775 {
776     refcount_inc(&sk->sk_refcnt);
777 }

191 static inline void __refcount_add(int i, refcount_t *r, int *oldp)
192 {
193     int old = atomic_fetch_add_relaxed(i, &r->refs);
194
195     if (oldp)
196         *oldp = old;
197
198     if (unlikely(!old))
199         refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
200     else if (unlikely(old < 0 || old + i < 0))
201         refcount_warn_saturate(r, REFCOUNT_ADD_OVF);
202 }

13 void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t)
14 {
15     refcount_set(r, REFCOUNT_SATURATED);
16
17     switch (t) {
18     case REFCOUNT_ADD_NOT_ZERO_OVF:
19         REFCOUNT_WARN("saturated; leaking memory");
20         break;
21     case REFCOUNT_ADD_OVF:
22         REFCOUNT_WARN("saturated; leaking memory");
23         break;
24     case REFCOUNT_ADD_UAF:
25         REFCOUNT_WARN("addition on 0; use-after-free");
26         break;
27     case REFCOUNT_SUB_UAF:
28         REFCOUNT_WARN("underflow; use-after-free");
29         break;
30     case REFCOUNT_DEC_LEAK:
31         REFCOUNT_WARN("decrement hit 0; leaking memory");
32         break;
33     default:
34         REFCOUNT_WARN("unknown saturation event!?");
35     }
36 }
The hardcoded value REFCOUNT_SATURATED is defined as (INT_MIN / 2), which equals 0xc0000000 when viewed as a 32-bit unsigned value.
117 #define REFCOUNT_SATURATED (INT_MIN / 2)
Because that value is so far from zero and the object never holds nearly that many references, the release operations cannot bring the counter to zero, so the object is not freed. Reference counter issues like this one therefore usually become memory leaks rather than use-after-frees. This vulnerability is different, though. The refcount_set() call in inet_twsk_hashdance() does not check the current value of the reference counter before initializing it, so it overwrites the saturation value written when sock_hold() detected the zeroed counter. As a result, if the initialization runs after sock_hold(), the reference counter becomes unbalanced and the release operations free the object too early. Execution continues, and the reference that sock_hold() should have obtained is lost.
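The arithmetic and the overwrite described above can be checked with a small userspace model. This is a sketch under assumptions, not the kernel implementation: `model_inc()`, `model_set()`, `normal_order()`, and `racy_order()` are hypothetical names, only the add-on-zero branch of the checked increment is modeled, the default sequentially consistent C11 atomics stand in for the kernel's relaxed operations, a 32-bit `int` is assumed, and the two orders are replayed sequentially instead of racing.

```c
#include <limits.h>
#include <stdatomic.h>

#define MODEL_SATURATED (INT_MIN / 2)   /* same expression as REFCOUNT_SATURATED */

/* Model of the checked increment: adding to a zero counter saturates it. */
void model_inc(atomic_int *refs)
{
    int old = atomic_fetch_add(refs, 1);

    if (old == 0)
        atomic_store(refs, MODEL_SATURATED);
}

/* Model of refcount_set(): an unconditional store, no check of the old value. */
void model_set(atomic_int *refs, int n)
{
    atomic_store(refs, n);
}

/* Normal order: initialize to 3, then the lookup takes a reference. */
int normal_order(void)
{
    atomic_int refs;

    atomic_init(&refs, 0);
    model_set(&refs, 3);
    model_inc(&refs);
    return atomic_load(&refs);
}

/*
 * Racy order: the lookup's increment runs first, then the late initialization.
 * *after_inc receives the counter value right after the increment.
 */
int racy_order(unsigned int *after_inc)
{
    atomic_int refs;

    atomic_init(&refs, 0);
    model_inc(&refs);
    *after_inc = (unsigned int)atomic_load(&refs);
    model_set(&refs, 3);            /* overwrites the saturation value */
    return atomic_load(&refs);
}
```

In this model, `normal_order()` ends at 4, while `racy_order()` reports 0xc0000000 right after the increment and ends at 3, one reference short of what the holders expect.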
134 static inline void refcount_set(refcount_t *r, int n)
135 {
136     atomic_set(&r->refs, n);
137 }
In the normal scenario, the object’s reference counter is initialized to 3, the lookup happens, and sock_hold() increments it to 4. In the racy order, sock_hold() finds the object with a zeroed reference counter and saturates it to 0xc0000000, and refcount_set() then overwrites it with 3. Execution continues with the reference counter at 3 when it should be 4, which leads to a real use-after-free after the sock_hold() warning has been triggered. The GDB output captured when this execution order happens is shown below.
(gdb) c
Continuing.
[Switching to Thread 3]
Thread 3 hit Breakpoint 6, arch_atomic_set (i=3, v=0xffff8880243ddb50) at ./arch/x86/include/asm/atomic.h:41
41 __WRITE_ONCE(v->counter, i);
(gdb) x/i $rip
=> 0xffffffff81a96b0c <inet_twsk_hashdance+220>: movl $0x3,0x80(%rbx)
(gdb) x/2gx $rbx + 0x80
0xffff8880243ddb50: 0x2d43199ac0000000 0xba5c070600000000
(gdb)
To confirm that this execution flow could happen, we built the same kernel with some modifications. We added an mdelay() call before inet_twsk_hashdance() initializes the reference counter and another one after sock_hold(), to give inet_twsk_hashdance() enough time to execute before the release operations. We also enabled KASAN in that build to get a KASAN report confirming the real use-after-free. There is an additional detail: the affected cache has the SLAB_TYPESAFE_BY_RCU flag set, and KASAN can’t detect issues on caches with this flag, so we manually removed the flag from the cache. We are aware that kernel 6.12 contains a feature that teaches KASAN to detect issues on those caches, enabled by CONFIG_SLUB_RCU_DEBUG, but that kernel version also includes the fix for the vulnerability and many other changes. We didn’t want to spend time either re-introducing the vulnerability, as the subsystem has changed since then, or back-porting the feature. We obtained the following report. The link for the full file is here.
[ 64.602085] ------------[ cut here ]------------
[ 64.602088] refcount_t: addition on 0; use-after-free.
...
[ 64.603847] CPU: 2 PID: 5282 Comm: trigger Not tainted 5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[ 64.796078] ---[ end trace 0b5cc4dcce1de1a1 ]---
[ 65.800530] ==================================================================
[ 65.800534] BUG: KASAN: use-after-free in inet_twsk_put+0x1f/0x60
[ 65.800544] Write of size 4 at addr ffff88801b2e9840 by task exploit02/5282
[ 65.800550] CPU: 2 PID: 5282 Comm: trigger Tainted: G W ------- --- 5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[ 65.800559] Call Trace:
[ 65.800561] <TASK>
[ 65.800563] ? inet_twsk_put+0x1f/0x60
[ 65.800569] dump_stack_lvl+0x34/0x48
[ 65.800574] print_address_description.constprop.0+0x1f/0x1e0
[ 65.800600] ? inet_twsk_put+0x1f/0x60
[ 65.800606] print_report.cold+0x55/0x244
[ 65.800612] ? _raw_spin_lock_irqsave+0x87/0xe0
[ 65.800617] kasan_report+0xb5/0x130
[ 65.800623] ? inet_twsk_put+0x1f/0x60
[ 65.800628] kasan_check_range+0xfd/0x1e0
[ 65.800633] inet_twsk_put+0x1f/0x60
[ 65.800638] __inet_check_established+0x3e0/0x4b0
[ 65.800645] __inet_hash_connect+0x179/0x7d0
[ 65.800651] ? tcp_set_state+0x125/0x340
[ 65.800655] ? __pfx_tcp_set_state+0x10/0x10
[ 65.800659] ? __pfx___inet_check_established+0x10/0x10
[ 65.800664] ? __pfx_ip_route_output_flow+0x10/0x10
[ 65.800672] ? __pfx___inet_hash_connect+0x10/0x10
[ 65.800681] ? __pfx_avc_has_perm+0x10/0x10
[ 65.800687] ? futex_hash+0xa0/0x120
[ 65.800693] ? selinux_sk_getsecid+0x42/0x50
[ 65.800700] tcp_v4_connect+0x5e0/0xaf0
[ 65.800710] ? __pfx_tcp_v4_connect+0x10/0x10
[ 65.800716] ? __pfx_selinux_socket_connect_helper.isra.0+0x10/0x10
[ 65.800726] __inet_stream_connect+0x1ba/0x680
[ 65.800734] ? _raw_spin_lock_bh+0x85/0xe0
[ 65.800741] ? __pfx___inet_stream_connect+0x10/0x10
[ 65.800749] ? _raw_spin_lock_bh+0x85/0xe0
[ 65.800756] ? __pfx__raw_spin_lock_bh+0x10/0x10
[ 65.800761] ? selinux_netlbl_socket_connect+0x2b/0x40
[ 65.800768] inet_stream_connect+0x44/0x70
[ 65.800773] __sys_connect+0x101/0x130
[ 65.800779] ? __pfx___sys_connect+0x10/0x10
[ 65.800784] ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[ 65.800790] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[ 65.800797] ? __pfx___x64_sys_futex+0x10/0x10
[ 65.800801] ? __audit_syscall_entry+0x178/0x200
[ 65.800807] ? ktime_get_coarse_real_ts64+0x4a/0x70
[ 65.800813] __x64_sys_connect+0x3c/0x50
...
[ 65.800938] Allocated by task 5281:
[ 65.800940] kasan_save_stack+0x1e/0x40
[ 65.800945] __kasan_slab_alloc+0x66/0x80
[ 65.800949] kmem_cache_alloc+0x155/0x310
[ 65.800954] inet_twsk_alloc+0x88/0x340
[ 65.800958] tcp_time_wait+0x41/0x510
[ 65.800962] tcp_fin+0x1c2/0x240
[ 65.800965] tcp_data_queue+0x882/0xb20
[ 65.800969] tcp_rcv_state_process+0x4a3/0xe20
[ 65.800973] tcp_v4_do_rcv+0x169/0x3e0
[ 65.800980] tcp_v4_rcv+0x1871/0x1910
[ 65.800985] ip_protocol_deliver_rcu+0x41/0x4b0
[ 65.800992] ip_local_deliver_finish+0xfc/0x130
[ 65.800998] ip_local_deliver+0x1d0/0x1e0
[ 65.801005] ip_rcv+0x255/0x270
[ 65.801010] __netif_receive_skb_one_core+0x123/0x140
[ 65.801016] process_backlog+0xf1/0x280
[ 65.801020] __napi_poll+0x59/0x260
[ 65.801025] net_rx_action+0x433/0x540
[ 65.801033] __do_softirq+0xf3/0x39d
[ 65.801040] Freed by task 5282:
[ 65.801042] kasan_save_stack+0x1e/0x40
[ 65.801048] kasan_set_track+0x21/0x30
[ 65.801052] kasan_set_free_info+0x20/0x40
[ 65.801058] ____kasan_slab_free+0x14e/0x1b0
[ 65.801064] kmem_cache_free+0x1b7/0x430
[ 65.801068] inet_twsk_free+0x90/0xa0
[ 65.801075] inet_twsk_deschedule_put+0x2a/0x40
[ 65.801082] __inet_check_established+0x3e0/0x4b0
[ 65.801089] __inet_hash_connect+0x179/0x7d0
[ 65.801097] tcp_v4_connect+0x5e0/0xaf0
[ 65.801101] __inet_stream_connect+0x1ba/0x680
[ 65.801107] inet_stream_connect+0x44/0x70
[ 65.801112] __sys_connect+0x101/0x130
[ 65.801117] __x64_sys_connect+0x3c/0x50
[ 65.801122] do_syscall_64+0x59/0x90
[ 65.801128] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 65.801137] The buggy address belongs to the object at ffff88801b2e97c0
which belongs to the cache tw_sock_TCP of size 256
[ 65.801143] The buggy address is located 128 bytes inside of
256-byte region [ffff88801b2e97c0, ffff88801b2e98c0)
[ 65.801150] The buggy address belongs to the physical page:
[ 65.801154] page:ffffea00006cba00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b2e8
[ 65.801160] head:ffffea00006cba00 order:1 compound_mapcount:0 compound_pincount:0
[ 65.801164] memcg:ffff88800fbf0e01
[ 65.801166] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 65.801177] raw: 000fffffc0010200 0000000000000000 dead000000000122 ffff88805fad12c0
[ 65.801183] raw: 0000000000000000 0000000080190019 00000001ffffffff ffff88800fbf0e01
[ 65.801186] page dumped because: kasan: bad access detected
[ 65.801190] Memory state around the buggy address:
[ 65.801194] ffff88801b2e9700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 65.801199] ffff88801b2e9780: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[ 65.801203] >ffff88801b2e9800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 65.801207] ^
[ 65.801210] ffff88801b2e9880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 65.801214] ffff88801b2e9900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 65.801217] ==================================================================
[ 65.801272] Disabling lock debugging due to kernel taint
[ 65.801275] ------------[ cut here ]------------
[ 65.801276] refcount_t: underflow; use-after-free.
...
[ 65.801407] CPU: 2 PID: 5282 Comm: trigger Tainted: G B W ------- --- 5.14.0-362.24.2.el9_3.x86_64-RESEARCH-KASAN #20
...
[ 65.801457] Call Trace:
[ 65.801460] <TASK>
[ 65.801462] ? show_trace_log_lvl+0x1c4/0x2df
[ 65.801468] ? show_trace_log_lvl+0x1c4/0x2df
[ 65.801475] ? __inet_check_established+0x3e0/0x4b0
[ 65.801481] ? refcount_warn_saturate+0xcd/0x120
[ 65.801486] ? __warn+0x9c/0x150
[ 65.801492] ? refcount_warn_saturate+0xcd/0x120
[ 65.801497] ? report_bug+0x15e/0x180
[ 65.801504] ? handle_bug+0x3c/0x70
[ 65.801509] ? exc_invalid_op+0x14/0x50
[ 65.801514] ? asm_exc_invalid_op+0x16/0x20
[ 65.801519] ? irq_work_claim+0x1e/0x40
[ 65.801526] ? refcount_warn_saturate+0xcd/0x120
[ 65.801531] __inet_check_established+0x3e0/0x4b0
[ 65.801539] __inet_hash_connect+0x179/0x7d0
[ 65.801544] ? tcp_set_state+0x125/0x340
[ 65.801548] ? __pfx_tcp_set_state+0x10/0x10
[ 65.801553] ? __pfx___inet_check_established+0x10/0x10
[ 65.801557] ? __pfx_ip_route_output_flow+0x10/0x10
[ 65.801563] ? __pfx___inet_hash_connect+0x10/0x10
[ 65.801569] ? __pfx_avc_has_perm+0x10/0x10
[ 65.801573] ? futex_hash+0xa0/0x120
[ 65.801577] ? selinux_sk_getsecid+0x42/0x50
[ 65.801602] tcp_v4_connect+0x5e0/0xaf0
[ 65.801610] ? __pfx_tcp_v4_connect+0x10/0x10
[ 65.801615] ? __pfx_selinux_socket_connect_helper.isra.0+0x10/0x10
[ 65.801621] __inet_stream_connect+0x1ba/0x680
[ 65.801627] ? _raw_spin_lock_bh+0x85/0xe0
[ 65.801632] ? __pfx___inet_stream_connect+0x10/0x10
[ 65.801636] ? _raw_spin_lock_bh+0x85/0xe0
[ 65.801640] ? __pfx__raw_spin_lock_bh+0x10/0x10
[ 65.801645] ? selinux_netlbl_socket_connect+0x2b/0x40
[ 65.801652] inet_stream_connect+0x44/0x70
[ 65.801657] __sys_connect+0x101/0x130
[ 65.801661] ? __pfx___sys_connect+0x10/0x10
[ 65.801667] ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[ 65.801673] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[ 65.801678] ? __pfx___x64_sys_futex+0x10/0x10
[ 65.801682] ? __audit_syscall_entry+0x178/0x200
[ 65.801688] ? ktime_get_coarse_real_ts64+0x4a/0x70
[ 65.801694] __x64_sys_connect+0x3c/0x50
...
[ 65.801799] ---[ end trace 0b5cc4dcce1de1a2 ]---
Conclusion
Besides triggering the KASAN splat, we didn’t spend any time researching the exploitability of the vulnerability. Ultimately, we confirmed that a real use-after-free can happen even with the reference counter protection mechanism in place. We are not aware of any other vulnerability that triggers a real use-after-free in that scenario. We had to understand the vulnerability to write a trigger from scratch. The simplified syzkaller reproducer works, but its code is odd and it seems to take much longer to trigger the bug. Without the delays, and on a KASAN-enabled kernel, even the warning is hard to trigger: we let the trigger run for more than 48 hours, several times, and the warnings appeared only on some of those runs. Since the goal of the work was to confirm the potential for a real use-after-free, we inserted the delays, but the semantics of the code remained intact.
This vulnerability showed us once again that interesting findings are sometimes discovered by chance. That’s why we value playing around when researching: we never know what we might discover.
The materials for this research can be found on our GitHub at the following link: https://github.com/alleleintel/research/tree/master/CVE-2024-36904
References
general protection fault in skb_unlink
https://syzkaller.appspot.com/bug?extid=278279efdd2730dd14bf
kcm: close race conditions on sk_receive_queue
https://github.com/torvalds/linux/commit/5121197ecc5db58c07da95eb1ff82b98b121a221
tcp/dccp: avoid one atomic operation for timewait hashdance
https://github.com/torvalds/linux/commit/ec94c2696f0bcd5ae92a553244e4ac30d2171a2d
tcp: Use refcount_inc_not_zero() in tcp_twsk_unique().
https://github.com/torvalds/linux/commit/f2db7230f73a80dbb179deab78f88a7947f0ab7e
use-after-free warnings in tcp_v4_connect() due to inet_twsk_hashdance() inserting the object into ehash table without initializing its reference counter
https://lore.kernel.org/netdev/37a477a6-d39e-486b-9577-3463f655a6b7@allelesecurity.com/
slub: Introduce CONFIG_SLUB_RCU_DEBUG
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b8c8ba73c68bb3c3e9dad22f488b86c540c839f9
