The heap is a data segment, just like the stack or bss, except that it's used to store dynamic data.
Also, unlike the other data segments, its size isn't fixed and can be extended using the s/brk system calls.
Those are actually used internally by the glibc's malloc implementation, also known as ptmalloc.
Heap based overflows differ from stack based ones in that we can't overwrite EIP/RIP.
It's important to understand how malloc and free work in order to exploit a heap based overflow, so I'll
give a quick explanation about this. Keep in mind we're talking about ptmalloc here, since that's the
implementation the glibc uses.
ptmalloc uses a struct malloc_chunk to represent allocated memory regions. That is:
When a new memory allocation is made, it's stored right after the previous one, so
we have a double linked list to manage these chunks. We could basically picture memory chunks like this:
- The prev_size
field has different meanings depending on whether the previous chunk is
used (i.e. not free-ed) or not. If it's still being used, this field is part of
its (the previous chunk) data, else, it holds the previous chunk size.
- The size
field, as you guessed, indicates the current chunk data size, which, as we'll see,
isn't the size you actually passed to malloc. When a memory allocation is requested through malloc,
4 is added to the size you pass it, and it's then padded up to the next 8 factor. This is how malloc
works for x86, but more generally, it adds sizeof(int *), then adjusts it to the next sizeof(int *)*2
factor, i.e. add word size and pad it to a double-word size.
To clear things up, I made a simple program while working on this. Here's the interesting
part, which sums up what I was saying about malloc's size parameter:
We can test this function with the following program:
Output for x86:
Output for x86_64:
We can compare end_* with found_* addresses, and they match, as expected.
There's something else about that size
field. Because of that padding, the 3 LSB are always free, so they're used
to store some additional information.
The first LSB is used to indicate whether the previous chunk is in use or not,
it's set to PREV_INUSE if it is. The second LSB indicates whether the allocated chunk was mmap-ed or not, and
the third LSB is not used.
I wrote a simple vulnerable program to test and understand heap based overflow, here it is:
This x86 program simply reads a password from a text file, which we'll assume to be only readable by root (so
the program is SUID),
and compare it with the user input, retrieved with gets(). To exploit this, we won't try to retrieve the
original password, but
rather overflow the buffer where its stored so that both input and passwd buffers match.
Our buffers are 28 bytes long, so actually 32 bytes long in total (padding, see above).
These 32 bytes are laid out this way: input buffer + both prev_size
and size
fields from the passwd
buffer.
Our goal is to fill input and passwd buffers with the same string, say "pwn", so our injection will look like:
pwn\x00AAAAAAAAAAAAAAAAAAAABBBBCCCCpwn\x00
We have our null terminated string, 4 bytes, then 28 bytes of data (A, B & C), B overwriting passwd->prev_size,
and
C passwd->size, finally our null terminated string again. strcmp() will now succeed:
$ ./sploit | ./vuln
Password:
Win!
$
This was a very basic heap based overflow exploit, but I think it's a good start to some more advanced
techniques, which I may introduce later, especially regarding the security checks performed by free().
Resources