Latest Tweets

Defeating an ELF32 binary with absolutely no leaks without using the ret2_dlresolve technique

 The binary

I was presented with an ELF32 binary with the following protections:

ch77 protections

Disassembling the binary with r2, I quickly recognized a classic stack overflow by abusing the call to read:

There’s a buffer overflow in the read function.

read() would read up to 0x64 bytes into the buffer pointed to by *buf, clearly overwriting EBP and RET. Apart from this more than obvious stack overflow, this binary did not have any interesting functions to analyse. As you can imagine, there was no feasible way to leak data, and we had to assume that ASLR on the remote server was enabled. With no puts, write, printf or alike calls within the binary, we were supposed to use the ret2_dlresolve technique. This, of course, would be the only technique giving us a 100% chance of success, no matter which version of the GNU/Libc6 the remote server was running, and without the need of leaking some data from the program (which, in turn, was not an obvious thing to accomplish).

So I read about this technique and wrote the exploit. But then I sort of though: what if I can leak some data anyways? Wouldn’t that be a good way of improving my knowledge about powning, even if my new exploit was obscenely a lot less reliable? I mean, why not?

Write-what-where

Let’s suppose that the binary we were provided with was running on the very same system where it was originally built. I started by reading the data put into the .comment section of the binary by gcc itself:

readelf -p .comment ch77

String dump of section ‘.comment’:
[ 0] GCC: (Ubuntu 4.8.5-4ubuntu8) 4.8.5

A quick search on Google gave me that that version of the compiler shipped with Ubuntu Xenial Xerus (16.04), and the GNU/Libc6 version for this GNU/Distro was 2.23. So I downloaded this version of the library by using the libc database. Well, at this point I knew that in order to leak data, I needed to replace the call to read with a call to, maybe, write. But of course you cannot do that by just overwriting the .got.plt entry for read() with the OFFSET of the write() function. Then I though: but of course you can, as long as that offset fits within one byte! So I checked both OFFSETS:

[*] Libc6 read offset: 0xd4350
[*] Libc6 write offset: 0xd43c0

These OFFSETS where obtained from the  libc6-i386 package on Xenial Xerus; when it came to the libc6:i386 package (after adding the i386 architecture with dpkg –add-architecture i386), I got the following offsets:

[*] Libc6 read offset: 0xd5b00
[*] Libc6 write offset: 0xd5b70

I tried this on a Debian Jessie LTS with libc6 version 2.19 as well and I got the following OFFSETS:

[*] Libc6 read offset: 0xc8810
[*] Libc6 write offset: 0xc8880

So yes; by overwriting the LSB of the .got.plt entry for read() with the right byte offset of write(), I would have a call to write() any time I forced the binary to execute call read@plt. Thanks to the first read call, I had a write-what-where primitive, so overwriting the LSB was a piece of cake:

ROP to overwrite LSB of .got.plt for read() with the LSB of write’s OFFSET.

After that, the leak was easy too: because now read() was in fact a call to write(), all I had to do was to call read@plt yet again, forging write() arguments so that I could get write’s libc6 address leaked into stdout:

ROP to leak to stdout the address of write.

By running the exploit I had so far, my leak was working fine and I could get all the interesting addresses (system(), the string “/bin/sh”) and so on. But I had a new problem now: because the call to read had been overwritten with write(), now I had no way of inputting more data to the program!

Leaking write’s address right off the binary’s memory.

Restoring read()

The total number of bytes read by the call to read() was 0x64. I needed 20 bytes of padding in order to reach EBP and then RET. Which means I was left with a total of 0x64 -0x18 = 0x4c bytes (76 in decimal) for my ROP chain. Because I had already wasted 0x28 bytes to overwrite read@got.plt and then call write to leak its own address to stdout, I was in fact left with 0x24 bytes of a ROP chain. My idea was, of course, to yet again overwrite LSB of read@got.plt but this time with its original value. This would restore the read() function and so I would be able to input a classic ROP-shell attack to spawn a shell and powned the program. But, how?

I started looking for ROP gadgets within the binary. I sort of though that a good way of restoring the original byte would be to XOR it. So I looked for gadgets that would allow me to XOR something with an address of my control. I found these:

ROP gadgets to XOR a value on memory.

The second ROP gadget (skipping the first instruction, push cs), seemed like a good choice. I needed to make sure that EBP held the value read@got.plt-0xe, and CL the right byte (key) used for the XOR operation in order to restore the original LSB of read(). Which byte that should be? Easy; I computed a XOR key like this:

xor_byte_cl = libc6_read_byte ^ libc6_write_byte

I kept looking for ROP gadgets that would allow me to write the XOR key to CL, or ECX, to no avail. Putting the address read@got.plt-0xe into EBP was easy; a pop ebp; ret gadget would do it. I had that ROP, so I focused all my attention to get the XOR key into CL. After a while, I realized the following truth: if you do not have the proper ROP gadget that suits you, go abuse a legitimate call to a function that will do all the dirty job for you. Internally, the parameters passed to a libc6 function by the stack (on 32 bits) will eventually ended up into registers when making a system call (int 0x80). So for the call to write I had the following:

Before making a system call, the parameters are passed into registers. ECX will hold the buffer to output from.

The third parameter to write, i.e., the total number of bytes to write to, would be put into the EDX register. The buffer where to read data from would be put into the ECX register. The file descriptor to output data to would be put into the EBX register and finally EAX will hold the system call id, 0x4, for write. What I found out by debugging my exploit was that the value put into ECX would hold right after calling write. So by forging a new ROP chain after the leak that would make a new call to write with the second parameter set to a valid address within the program’s memory ending with the XOR key, I would have the XOR key in the CL register.

I wrote this ROP chain:

rop_cl_90 = p32(binary_read_plt)
rop_cl_90 += p32(clean_3_args_stack)
rop_cl_90 += p32(0x1) + p32(patch_address) + p32(0x4)

The offset patch_address was computed this way:

patch_address = 0x804a000 + xor_byte_cl

The only problem with this ROP chain was that I was left with a total of 12 bytes. I still needed to put read@got.plt-0xe into EBP by popping off the stack its value. That meant 8 more bytes (4 for the ROP gadget pop ebp; ret) and 4 more bytes for the actual address I wanted: 0x8049ffe. Then, I needed 4 more bytes to call the ROP gadget that would XOR the LSB of read@got.plt. So it was obvious that I could indeed restore read with those 12 left-over bytes of ROP magic, but from that point on I would not be able to call main() again to get another chance of inputting my ROP-shell! Stuck again.

Let’s clean 4 registers instead of 3

I needed to save 4 bytes and that would be all. Instead of using a ROP gadget that would pop 3 registers, I needed one that would pop 4 registers (the three write parameters already on the stack plus a new one, the address I wanted into EBP). For this to work, of course, EBP should be the last register in the ROP gadget sequence of pops. I found that ROP gadget:

0x080484b8: pop ebx; pop esi; pop edi; pop ebp; ret;

So I rewrote my original ROP like this:

The final ROP chain to restore read’s LSB.

I crafted the third parameter of write to be that of a valid memory address within the program (as a matter of fact, the .bss symbol workaround ;-)) because this value would be popped into EDI, and then the ROP gadget I was about to use to XOR the LSB of .got.plt had a second instruction that might segfault the program:

The second instruction could segfault the program if EDI+0xe was pointing to a non-valid memory address.

At the same time, this was a large value. Because it was used first as the third parameter of write, that meant write will read up to 0x804a040 (workaround address within the .bss segment) bytes of data. As long as this was not segfaulting, I did not care because I could just discard any junk coming from the program.

So the last part of my first ROP attack would be to put the desired value of EBP into the stack, call the ROP gadget that would XOR the key with the LSB of read@got.plt and then call back to main in order to get a second chance of inputting my ROP-shell:

The last part of my ROP attack to restore read().

Trying it out

Once my first ROP attack was written, I tried it out by sending my exploit to the program, waiting for the second call to read and then sending it the LSB of the write function. After that, the program should be sending me 4 bytes back, the address of libc6’s write:

exploit = “A”*padding + p32(0xdeadbeef) + rop_overwrite + rop_write + rop_cl_90 + rop_restore_read
r.send(exploit)
r.send(p8(libc6_write_byte))
leaked_write = int(r.recv(4)[::-1].encode(“hex”),16)

After this first attack, performed in a single shot, the program was back to main and thus waiting for me to input more data. This time, my ROP-shell that was finally populated with the right addresses for system() and the string “/bin/sh”:

Sending my ROP-shell and gaining a remote shell.

Once my exploit was completed, I ran it against Xenial Xerus and Debian Jessie LTS with success:

Running my exploit and gaining a shell without ret2_dlresolve.

You can get the binary and my exploit and try it out yourselves. Of course, this exploit is not reliable when you do not know the libc6 version of the remote server. Besides, it only works with libc6 versions where the offset between write and read are less than 0xff bytes. The intended way was to use ret2_dlresolve, which is 100% reliable for this case. But anyway, I think it’s been interesting to pull this off and I tried harder, I think. No doubt there may be some other ROP gadgets or some other possible combinations so that I could do the same thing with even less bytes, I’m sure. So feel free to let me know!