The Futex Vulnerability

By |2018-09-25T16:26:12+00:00September 11th, 2014|

Here at Appdome, we take mobile security seriously. Naturally, the news of a new rooting mechanism aroused our curiosity and concern.

Our first instinct was to search the web for any additional information about this exploit and the vulnerability behind it. However, the scarcity of publicly available resources prompted us to perform our own analysis of the vulnerability and the exploit, and try to distill some elements which could be used in order to boost the security level we provide in our solutions.The results of this analysis, we decided, ought to be shared with the public.

Due to the complexity of the vulnerability and exploit, and the level of detail we had set to ourselves as a goal, this report will be divided into several posts, beginning with the vulnerability and the bugs which made it possible.

A brief history

On May 26th, 2014, comex reported a potential bug in the Linux kernel futex mechanism. Later, as the bug was verified and logged as CVE-2014-3153, geohot managed to leverage this vulnerability into a full blown privilege-escalation which he packaged as an Android application and dubbed TowelRoot. This TowelRoot essentially allowed one to root devices which up until that time were unrootable, such as the Samsung S5.

Vulnerability

The vulnerability is based on two logical bugs in the Linux kernel futex mechanism: one which allows releasing a pi-lock from userspace, and another which allows requeuing a lock twice.

For those unfamiliar with futex, futex stands for Fast Userspace muTEX, and I suggest reading about it in Wikipedia and here.

For the purpose of recreating and testing the vulnerability, I had set up an Android emulator. The following section will discuss the setting up a emulator for kernel debugging, if you already know how to do that or are only interested in the analysis, you may skip over to the “Relock bug” section.

Just one thing before starting, I’ll be heavily referencing functions (by name) from the kernel, so you should be ready with the source files for the futex (futex.c, futex.h) and rtmutex (rtmutex.c, rtmutex_common.h) modules at hand.

Setting up the environment

Required software:

  1. Android studio. We’ll need this for the android emulator and other supporting tools.
    Since I am using Linux as my setup, I will be picking the Linux package.
    After downloading the package, unpack it to your $ANDROID_HOME directory. You should then add it to your $PATH environment variable:

    ?
    1
    2
    export PATH=${PATH}:${ANDROID_HOME}/android-studio/sdk/tools
    export LD_LIBRARY_PATH=${ANDROID_HOME}/android-studio/sdk/tools/lib
  2. Git scm. Android’s sources are hosted in a git repository.
  3. Cross-compilation toolchain:
    ?
    1
    2
    3
    cd $ANDROID_HOME
    $ git clone https://android.googlesource.com/platform/prebuilt
    export PATH=${PATH}:${ANDROID_HOME}/prebuilt/linux-x86/toolchain/arm-eabi-4.4.3/bin

Compiling the Android kernel

The official guide to compiling the Android kernel is https://source.android.com/source/building-kernels.html, the following is based on that guide with some minor modifications:

  1. First, fetch a copy of the goldfish kernel source (according to the guide this is the kernel suitable for emulated environments):
    ?
    1
    2
    cd $ANDROID_HOME
    $ git clone https://android.googlesource.com/kernel/goldfish.git
  2. We will be interested in version 3.4 of the kernel:
    ?
    1
    2
    cd ${ANDROID_HOME}/goldfish
    $ git checkout -t origin/android-goldfish-3.4 -b goldfish 3.4
  3. And we also want to undo the fixes applied to the futex module. A quick glance at the history of kernel/futex.c will reveal that the last revision before the fixes following CVE-2014-3153 is e8c92d268b8b8feb550ca8d24a92c1c98ed65ace:
    ?
    1
    2
    cd ${ANDROID_HOME}/goldfish
    $ git checkout e8c92d268b8b8feb550ca8d24a92c1c98ed65ace kernel/futex.c
  4. We are now ready to start compiling the kernel. First we need to generate the build configuration:
    ?
    1
    2
    cd ${ANDROID_HOME}/goldfish
    make ARCH=arm SUBARCH=arm goldfish_armv7_defconfig
  5. Before we continue, we want to enable debug symbols. In ${ANDROID_HOME}/.config, set CONFIG_DEBUG_INFO parameter to y.
  6. And we’re ready to make:
    ?
    1
    2
    3
    4
    5
    6
    cd ${ANDROID_HOME}/goldfish
    make ARCH=arm SUBARCH=arm CROSS_COMPILE=${ANDROID_HOME}/prebuilt/linux-x86/toolchain/arm-eabi-4.4.3/bin/arm-eabi-
    ...
    Compile the kernel with debug info (DEBUG_INFO) [Y/n/?] y
      Reduce debugging information (DEBUG_INFO_REDUCED) [N/y/?] (NEW) N
    ...

Running  the emulator

First we need to set up a machine:

?
1
2
3
4
5
6
7
8
9
$ android create avd --name debug --target android-19
Auto-selecting single ABI armeabi-v7a
Android 4.4.2 is a basic Android platform.
Do you wish to create a custom hardware profile [no]
Created AVD 'debug' based on Android 4.4.2, ARM (armeabi-v7a) processor,
with the following hardware config:
hw.lcd.density=240
hw.ramSize=512
vm.heapSize=48

And we’re ready to run:

?
1
emulator-arm -verbose -debug init -show-kernel -kernel arch/arm/boot/zImage -avd debug -no-boot-anim -no-skin -no-audio -no-window -qemu -gdb tcp::1234

Here’s a breakdown of all the switches:

?
1
2
3
4
5
6
7
8
9
10
11
$ emulator-arm                    
>     -verbose                    
>     -debug init                   # Enable debug messages
>     -show-kernel                  # Show kernel messages
>     -kernel arch/arm/boot/zImage  # Path to the image of the kernel we just compiled
>     -avd debug                    # Use the android device setup named "debug"
>     -no-boot-anim                 # We don't want-
>     -no-skin                      # the emulated machine-
>     -no-audio                     # to have any sort-
>     -no-window                    # of GUI
>     -qemu -gdb tcp::1234           # Notify qemu to set up a gdbserver on localhost:1234

We can now fire up gdb from another terminal, and attach to the gdbserver:

?
1
2
3
4
5
6
7
8
9
10
cd ${ANDROID_HOME}/goldfish
$ gdb vmlinux
...
Reading symbols from vmlinux...done.
(gdb) target remote :1234
Remote debugging using :1234
warning: Architecture rejected target-supplied description
0xc000dcc4 in sys_call_table () at arch/arm/kernel/entry-common.S:494
494             b       ret_slow_syscall
(gdb)

And we’re ready to start working.

Relock bug

The core of futex_lock_pi is futex_lock_pi_atomic which attempts to acquire the lock at uaddr by an atomic cmpxchg of the contents of uaddr and the tid of the calling thread. Basically this (roughly) translates to:

?
1
2
3
if (*uaddr == 0) {
    *uaddr = tid; /* Lock acquired */
}

If the lock was acquired, the system call returns and the thread holds the lock. If not, then the system call blocks and waits for the lock to be released:

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
static int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucket *hb,
                union futex_key *key,
                struct futex_pi_state **ps,
                struct task_struct *task, int set_waiters)
{
...
    /* Attempt a cmpxchg */
    if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, 0, newval)))
        return -EFAULT;
...
    /* Check that the cmpxchg succeeded in a 0->TID transition */
    if (unlikely(!curval))
        return 1;
...
}

The interesting point about the lock acquisition is that the only requisite is that the cmpxchg succeeds, which is completely unrelated to the acquisition state of the futex key related to that address. This means that a futex lock can be released by simply loading the lock’s (userspace) address with 0.

This is more readily understood with an example. Consider the following test program (apologies in advance for the excessive verbosity):

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
#include
#include
#include
 
#include "futaux.h"
 
int mutex = 0;
 
void *thread(void *arg)
{
    int ret = 0;
    pid_t tid = gettid();
    printf("Entering threadn", tid);
    while (1) {
        sleep(1);
    }
}
 
int main(int argc, char *argv[])
{
    pthread_t t;
    printf("pid = %dn", getpid());
    printf("mutex = %dn", mutex);
    printf("Acquiring lockn");
    if (futex_lock_pi(&mutex)) {
        perror("Lock error");
        return -1;
    }
    printf("Mutex acquired (mutex = %d)n", mutex);
    if (argc > 1) {
        printf("Releasing lockn");
        mutex = 0;
        printf("mutex = %dn", mutex);
    }
    if (pthread_create(&t, NULL, thread, NULL)) {
        perror("Could not create thread");
        return -1;
    }
    while (1) {
        sleep(1);
    }
    return 0;
}

The output when running this on the emulator is:

?
1
2
3
4
5
6
7
$ adb shell /data/relock
pid = 394
mutex = 0
Acquiring lock
Mutex acquired (mutex = 394)
Entering thread[396]
^C

Now the same execution but with a manual release of the lock after it is acquired, and before the thread is spawned:

?
1
2
3
4
5
6
7
8
9
10
11
$ adb shell /data/relock a
pid = 453
mutex = 0
Acquiring lock
Mutex acquired (mutex = 453)
Releasing lock
mutex = 0
Entering thread[459]
Mutex acquired (mutex = 459)
Exiting thread[459]
^C

I’d like to add just a few words about why this is a bug. The entire purpose of the futex module is to manage lock/release/contention of locks on behalf of the user. The cmpxchg is performed as a sort of optimization in order to grab a lock quickly, however this does not absolve the kernel from managing the internal state of the lock. You may examine the fix for this bug by yourself to better understand what I meant.

Requeue bug

The syscall futex_wait_requeue_pi basically means “Take the lock uaddr1 and wait for uaddr2 to be available, once it is, take it and release uaddr1″.

This call must be paired with futex_requeue. While futex_wait_requeue_pi just sets up the queued waiter, futex_requeue does the heavy lifting and adds that waiter to various waiting lists and takes care of waking it up.

For example, a normal usage would be:

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
/* Locks */
int lock1 = 0, lock2 = 0;
 
void thread_a(void *arg)
{
...
    int ret = futex_wait_requeue_pi(&lock1, &lock2);
    /* The system call will now block and waits for some other thread to acquire
    lock2 on its behalf and wake it up */
...
}
 
void thread_b(void *arg)
{
...
    int ret = futex_requeue(&lock1, &lock2, lock1);
    /* The kernel will now try to acquire lock2 for thread_a. On success, it will
    wake up the kernel thread, while on failure it will add a waiter object to a
    list of waiters waiting on lock2, and let the rtmutex module handle the wakeup */
...
}

From the above description, it’s clear that each futex_wait_requeue_pi is paired with a single futex_requeue call, and there are measures implemented to prevent further calls with the same parameters.

To be more precise, if we call futex_wait_requeue_pi(&lock1, &lock2) and then futex_requeue(&lock1, &lock2, lock1), we can not call another futex_requeue(&lock1, &lock2, lock1) as it will return an error.

However, if we were to follow the first requeue with a futex_requeue(&lock2, &lock2, lock2) the code will not kick us out of the system call.

Combining the bugs

It’s important to understand that there are two ways a sleeping futex_wait_requeue_pi can be woken, and both are (apparently) mutually exclusive:

  1. The lock was acquired during futex_proxy_trylock_atomic, in which case the sleeper is awoken and futex_requeue will return. End of story.
  2. The lock was acquired at the bottom of the futex_requeue (during the foreach clause) by rt_mutex_start_proxy_lock. Again, end of story.
    By the way, if rt_mutex_start_proxy_lock failed to acquire the lock immediately, it will just add a waiter to the list.

However, I must (and did) emphasize the “apparently” in the previous sentence. Because as it turns out, both relock and requeue bug allow us to both add a waiter, and wake up the sleeping kernel thread (while the waiter is still in the list).

Consider the following usage which takes advantage of both bugs:

requeue

  1. (thread-1futex_lock_pi(&B)
    This will make thread-1 acquire B, so that the next line will block
  2. (thread-2futex_wait_requeue_pi(&A, &B)
    Since B is owned (by thread-1), this call will set up the futex_q object with a pointer to a temporary rt_waiter object on the kernel’s stack (that’s Chekhov’s gun right there).
  3. (thread-1futex_requeue(&A, &B)
    This will try to acquire B on behalf of thread-1, fail, and proceed to add the rt_waiter object in the futex_q object associated with the requeue to list of waiters on B.
    This is a rough outline of the code flow futex_requeue:

    ?
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    static int futex_requeue(u32 __user *uaddr1, unsigned int flags,
                 u32 __user *uaddr2, int nr_wake, int nr_requeue,
                 u32 *cmpval, int requeue_pi)
    {
    ...
        if (requeue_pi && (task_count - nr_wake < nr_requeue)) {
            /* Branch taken */
            ret = futex_proxy_trylock_atomic(uaddr2, hb1, hb2, &key1,
                    &key2, &pi_state, nr_requeue);
            /* The call will return 0 (the lock is occupied) */
    ...
            switch (ret) {
            case 0:
                /* End of this clause */
                break;
    ...
            }
        }
        head1 = &hb1->chain;
        plist_for_each_entry_safe(this, next, head1, list) {
    ...
            /* requeue_pi == 1 */
            if (requeue_pi) {
                atomic_inc(&pi_state->refcount);
                this->pi_state = pi_state;
                ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
                                this->rt_waiter,
                                this->task, 1);
                /* This call will return 0, as the lock was not taken
                 * so rt_waiter in futex_wait_requeue_pi's stack frame
                 * has been added to the list of waiters on the lock */
    ...
            }
    ...
        }
    ...
    }
  4. (thread-1) set B to 0,
    Release B (relock bug)
  5. (thread-1futex_requeue(&B, &B).
    The requeue bug here allows us to issue this call, and survive long enough in the system call to reach futex_proxy_trylock_atomic, which will, again, attempt to acquire B for thread-1. This time, the call will succeed, as the lock is apparently free, and so the next time this that happens the kernel thread of thread-2 is woken.
    This time, the flow through futex_requeue is:

About the Author:

Scroll Up
WordPress Video Lightbox Plugin