Two basic ways to run and test shellcode


There are a lot of authors, books, and online tutorials covering shell-coding. I am myself in the process of learning about the difficult and hazardous craft of exploiting software, so I thought it would be a good idea to post my impressions on this interesting subject in my blog. This first post is a sort of introduction: the two basic ways of executing shell-code inside a C program, for testing purposes.

First technique: overwriting the return address for main()

If you search for examples of shell-code being tested (i.e., executed) with a C wrapper program, you could end up with something like this:

char shellcode[] =
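int main(int argc, char **argv) {
	int *ret;                 /* local variable stored in main's stack frame */
	ret = (int *)&ret + 2;    /* two ints (8 bytes) above ret: the saved return address */
	(*ret) = (int)shellcode;  /* replace the return address with the shellcode address */
}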

We know that char shellcode[] stores the opcodes of our shellcode as hexadecimal bytes. To execute this shellcode, the main function plays a trick. First of all, let’s check that the shellcode runs successfully:

# gcc -m32 -z execstack shell.c -o shell

After compiling the program as an ELF32 binary with the -z execstack flag set, if we run it we get our shell:

root@kali:~# ./shell
# exit

Why does this work? The trick mentioned above does the following:

  1. First, it defines a variable of type int *, that is, a pointer to int, inside the main() function. In this unoptimized build the variable lives in main’s stack frame at ebp-4, just below the saved ebp register.
  2. Because the stack grows towards lower addresses, the saved ebp sits 4 bytes above ret (at ebp), and the saved return address, pushed by the call to main(), sits 4 bytes above that (at ebp+4).
  3. So we can make our ret pointer point to the saved return address mentioned in 2 and overwrite it with the address of our shellcode.

A stack layout for our shell.c program is shown below:

The stack layout for our shell-code test program in C

So, as clearly shown in the picture above, our ret variable can point to the address where the saved return address is stored (EIP in the picture), and write whatever we want at that address. Because we have defined ret as int *, first we need to set up where it is pointing:

(int *)&ret +2;

That is, the address of the ret variable itself plus 2. Because the operand is an int *, adding 2 on a 32-bit target actually adds 8 bytes to the address of ret. Have a look at the lea and add instructions at <+6> and <+9> in the next code snippet (the assembly code for our main function):

(gdb) disassemble main
Dump of assembler code for function main:
0x080483dc <+0>: push %ebp
0x080483dd <+1>: mov %esp,%ebp
0x080483df <+3>: sub $0x10,%esp
0x080483e2 <+6>: lea -0x4(%ebp),%eax
0x080483e5 <+9>: add $0x8,%eax
0x080483e8 <+12>: mov %eax,-0x4(%ebp)
0x080483eb <+15>: mov -0x4(%ebp),%eax
0x080483ee <+18>: mov $0x8049640,%edx
0x080483f3 <+23>: mov %edx,(%eax)
0x080483f5 <+25>: leave
0x080483f6 <+26>: ret

Well, it now comes as no surprise that computing &ret + 2 gives us the address where the return address was saved before main was called. This is a 32-bit memory address, and we store it inside our ret pointer variable. Now, ret is pointing there:

ret = (int *)&ret +2;

To conclude, we need to overwrite the saved return address with the address of our shell-code. Because ret is a pointer to int (4 bytes on a 32-bit architecture) and it already points to the saved return address, all we need to do is dereference it and write our shell-code address there:

(*ret) = (int)shellcode;

If we run the program inside a gdb debugging session, we can look at what is stored at the address pointed to by ret. Bear in mind that, after the mov at <+15> in the disassembly of main, the value of ret is held in the EAX register:

(gdb) x/x $eax
0xffffd72c: 0x08049640
(gdb) disassemble *0xffffd72c
Dump of assembler code for function shellcode:
0x08049640 <+0>: xor %eax,%eax
0x08049642 <+2>: xor %ebx,%ebx
0x08049644 <+4>: mov $0x17,%al
0x08049646 <+6>: int $0x80
0x08049648 <+8>: jmp 0x8049669 <shellcode+41>
0x0804964a <+10>: pop %esi
0x0804964b <+11>: mov %esi,0x8(%esi)
0x0804964e <+14>: xor %eax,%eax
0x08049650 <+16>: mov %al,0x7(%esi)
0x08049653 <+19>: mov %eax,0xc(%esi)
0x08049656 <+22>: mov $0xb,%al
0x08049658 <+24>: mov %esi,%ebx
0x0804965a <+26>: lea 0x8(%esi),%ecx
0x0804965d <+29>: lea 0xc(%esi),%edx
0x08049660 <+32>: int $0x80
0x08049662 <+34>: xor %ebx,%ebx
0x08049664 <+36>: mov %ebx,%eax
0x08049666 <+38>: inc %eax
0x08049667 <+39>: int $0x80
0x08049669 <+41>: call 0x804964a <shellcode+10>
0x0804966e <+46>: das
0x0804966f <+47>: bound %ebp,0x6e(%ecx)
0x08049672 <+50>: das
0x08049673 <+51>: jae 0x80496dd
0x08049675 <+53>: add %al,(%eax)
End of assembler dump.

So, EAX holds ret, that is, the address of the saved return address slot (0xffffd72c), and that slot now contains the address of our shell-code (0x08049640). The last lines, from <+46> onwards, are not real instructions. We have the string “/bin/sh” at the end of our shell-code, as seen in the C source code for our program, so when gdb interprets these bytes as instructions we get this nonsensical op-code sequence.

So, that’s it! The trick has been explained.

Second technique: use a function pointer!

Now we will use a function pointer to call our shell-code. Basically, we change our main function so that it now contains:

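int main(int argc, char **argv){
        void (*fp) (void);       /* pointer to a function taking and returning nothing */
        fp = (void *)shellcode;  /* point it at the first byte of the shellcode */
        fp();                    /* call through the pointer */
}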

As you can clearly see in the code snippet above, we define a function pointer using the standard syntax in C, and then we point the function pointer to the address of our shellcode. Finally, we call the function and therefore we end up executing the shell-code.