EvilOctal Security Team Technical Discussion Group's Archiver

EvilOctal 2006-4-15 17:29

[Repost] Intro to Functions in Assembly

Author: lhall

|=----------------------=[ Functions and Linux Assembly ]=--------------------=|
|=----------------------------------------------------------------------------=|
|=-------------------------=[ lhall@telegenetic.net ]=------------------------=|


---[ Intro

Start off with a simple C program with a super simple function:

entropy@phalaris {~/asm/functions} cat function.c

#include <stdlib.h>  /* exit */
#include <unistd.h>  /* write */

void
functOne (void) {
  write(1,"in functOne\n",12); /* write out our string */
  return;               /* and just return */
}

int
main (void) {
  write(1,"in _start\n",10); /* write from main (our _start later) */
  functOne();           /* call functOne */
  write(1,"in _start\n",10); /* write the same string again */
  exit(0);             /* call exit, return value 0 */
}

All this program does is write a string telling us we are in the function
main (we call it _start as we will be using `as` not `gcc` later), call function
functOne which writes a string, and return back to main to write the string
we wrote before the call again, then it calls exit with return value 0.

You could generate the assembly here from gcc (with gcc -S -O0 function.c) but
the asm that is output is a bit confusing. For instance if you look at the asm
generated by `gcc` you'll see it reserving space for local variables:

[...snip...]

functOne:
      pushl  %ebp       /* save the base pointer */
      movl   %esp, %ebp   /* make the stack pointer the base pointer */
      subl   $8, %esp    /* <--- subtract 8 from the stack pointer */
      subl   $4, %esp    /* <--- subtract another 4 from the stack pointer */
      pushl  $12        /* string length */
      pushl  $.LC0      /* address of string */
      pushl  $1        /* to stdout */
      call   write      /* call libc write */
      addl   $16, %esp    /* fix up stack (subl $4 + 3 pushl's) */
      leave            /* restore esp and ebp, see Notes */
      ret             /* return to caller */

[...snip...]

Notes:

leave, also known as the procedure epilog, is the same as:

movl  %ebp, %esp
popl  %ebp

enter $0, $0, which can serve as the procedure prolog, is the same as:

pushl  %ebp
movl   %esp, %ebp

I suppose the extra subl instructions keep the stack cleaner, but the
`addl   $16, %esp` seems to take care of that. Also, while learning I think it's
better to do everything yourself and not rely on libc calls, using syscalls
instead. If you're ever confused or can't figure out a part of the asm,
generating it from gcc like this is a good way to at least get an idea of what
to do.

So here's our simple function call program:

entropy@phalaris {~/asm/functions} cat funct.s

.section .data

.equ SYS_WRITE, 4
.equ SYS_EXIT, 1
.equ LINUX_KERNEL, 0x80
.equ STDOUT, 1

_startStr:
  .ascii "in _start\n\0"
functStr:
  .ascii "in functOne\n\0"

.section .text

.type functOne, @function
functOne:
                  # begin procedure prolog
  pushl %ebp         # save the base pointer
  movl  %esp, %ebp     # make the stack pointer the base pointer
                  # end procedure prolog
  movl  $SYS_WRITE, %eax # mov WRITE(4) into eax
  movl  $12, %edx      # length of the string
  movl  $functStr, %ecx  # address of our string
  movl  $STDOUT, %ebx   # writing to stdout
  int  $LINUX_KERNEL   # call the kernel
                  # begin procedure epilog
  movl  %ebp, %esp     # restore the stack pointer
  popl  %ebp         # restore the base pointer
  ret

.globl _start
_start:
  nop              # so our breakpoint will break in gdb
  movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
  movl $10, %edx      # length of the string
  movl $_startStr, %ecx  # address of our string
  movl $STDOUT, %ebx    # writing to stdout
  int  $LINUX_KERNEL    # call the kernel
  call functOne       # call functOne
  movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
  movl $10, %edx      # length of the string
  movl $_startStr, %ecx  # address of our string
  movl $STDOUT, %ebx    # writing to stdout
  int  $LINUX_KERNEL    # call the kernel
  movl $SYS_EXIT, %eax  # mov EXIT(1) into eax
  movl $0, %ebx       # 0 is the return value
  int  $LINUX_KERNEL    # call the kernel

A couple of things to notice: we use .equ (equates) to make the code a bit
easier to read; these are similar to #define's in C. Again this is pretty
simple: we write a string in _start, call functOne which prints a string,
return, print the same string as before in _start, and then call exit with
return value 0. Everything should be readable, though a few things need
explanation.


---[ Call

The call instruction is how you call functions. It does two things:

1) Pushes the address of the next instruction, the return address, onto the stack.
2) Points %eip to the start of the function, the function's symbol.


---[ Procedure Prolog

OK, so our functions have no arguments or parameters; they are just void. The
first thing a function has to do is called the procedure prolog. It first
saves the current base pointer (ebp) with the instruction pushl %ebp (remember
ebp is the register used for accessing function parameters and local variables).
Then it copies the stack pointer (esp) to the base pointer (ebp) with the
instruction movl %esp, %ebp. This allows you to access the function parameters
as offsets from the base pointer. Local variables are always at a negative
offset from ebp, such as -4(%ebp) for the first local variable; the return
address is always at 4(%ebp); parameter N (counting from 1) is at N*4+4(%ebp),
such as 8(%ebp) for the first argument; and the old ebp is at (%ebp). A more
visual diagram of this may be clearer:

argument 2        12(%ebp)
argument 1        8(%ebp)
return address    4(%ebp)
old ebp           (%ebp)
local variable 1  -4(%ebp)
local variable 2  -8(%ebp)

Note:
%ebp is the value held in the ebp register; (%ebp) is the value in memory at
the address held in ebp.

Moving the stack pointer into the base pointer allows the base pointer to be a
constant reference to the stack frame while in a function. We could not use esp
in a function as we will most likely change it during the execution of the
function itself.

---[ Procedure Epilog

The procedure epilog must do the opposite of the prolog before a function can
exit, so everything is returned to how it was at the time of the call. Without
restoring the stack frame, the ret instruction would return to an incorrect
address because the pushed return address wouldn't be at the top of the stack.
To reset the stack pointer we do:

  movl %ebp, %esp  # restore the stack pointer
  popl %ebp      # pop the old ebp back into ebp
  ret          # grab the return address from the stack and jmp to it

Assemble and link funct.s and open it in gdb.

entropy@phalaris {~/asm/functions} as -g funct.s -o funct.o

entropy@phalaris {~/asm/functions} ld funct.o -o funct

entropy@phalaris {~/asm/functions} gdb funct
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-pc-linux-gnu"...Using host libthread_db library
"/lib/libthread_db.so.1".

(gdb) list functOne
13    .section .text
14
15    .type functOne, @function
16    functOne:
17                      # begin procedure prolog
18      pushl %ebp         # save the base pointer
19      movl  %esp, %ebp     # make the stack pointer the base pointer
20                      # end procedure prolog
21      movl  $SYS_WRITE, %eax # mov WRITE(4) into eax
22      movl  $12, %edx      # length of the string
(gdb) <enter>
23      movl  $functStr, %ecx  # address of our string
24      movl  $STDOUT, %ebx   # writing to stdout
25      int  $LINUX_KERNEL   # call the kernel
26                      # begin procedure epilog
27      movl  %ebp, %esp     # restore the stack pointer
28      popl  %ebp         # restore the base pointer
29      ret
(gdb) list _start
28      popl  %ebp         # restore the base pointer
29      ret
30
31    .globl _start
32    _start:
33      nop              # so our breakpoint will break in gdb
34      movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
35      movl $10, %edx      # length of the string
36      movl $_startStr, %ecx  # address of our string
37      movl $STDOUT, %ebx    # writing to stdout
(gdb) <enter>
38      int  $LINUX_KERNEL    # call the kernel
39      call functOne       # call functOne
40      movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
41      movl $10, %edx      # length of the string
42      movl $_startStr, %ecx  # address of our string
43      movl $STDOUT, %ebx    # writing to stdout
44      int  $LINUX_KERNEL    # call the kernel
45      movl $SYS_EXIT, %eax  # mov EXIT(1) into eax
46      movl $0, %ebx       # 0 is the return value
47      int  $LINUX_KERNEL    # call the kernel

Break at the address of _start + 1.

(gdb) break *_start+1
Breakpoint 1 at 0x80480b2: file funct.s, line 34.

Start the program executing.

(gdb) run
Starting program: /home/entropy/asm/functions/funct

Breakpoint 1, _start () at funct.s:34
34      movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
Current language:  auto; currently asm

Breakpoint was hit. Up until the call everything is pretty clear, so I'm just
going to step until the call.

(gdb) step
_start () at funct.s:35
35      movl $10, %edx      # length of the string
(gdb) step
_start () at funct.s:36
36      movl $_startStr, %ecx  # address of our string
(gdb) step
_start () at funct.s:37
37      movl $STDOUT, %ebx    # writing to stdout
(gdb) step
_start () at funct.s:38
38      int  $LINUX_KERNEL    # call the kernel
(gdb) step
in _start
_start () at funct.s:39
39      call functOne       # call functOne

At this point we have written out the string "in _start\n", and the next
instruction will be our call functOne. Disassemble _start and find the address
of the call functOne instruction; the address of the instruction right after it
is the return address that the call instruction should push onto the stack.

(gdb) disassemble _start
Dump of assembler code for function _start:
0x080480b1 <_start+0>:  nop
0x080480b2 <_start+1>:  mov   $0x4,%eax
0x080480b7 <_start+6>:  mov   $0xa,%edx
0x080480bc <_start+11>: mov   $0x80490f0,%ecx
0x080480c1 <_start+16>: mov   $0x1,%ebx
0x080480c6 <_start+21>: int   $0x80
0x080480c8 <_start+23>: call  0x8048094 <functOne>
0x080480cd <_start+28>: mov   $0x4,%eax
0x080480d2 <_start+33>: mov   $0xa,%edx
0x080480d7 <_start+38>: mov   $0x80490f0,%ecx
0x080480dc <_start+43>: mov   $0x1,%ebx
0x080480e1 <_start+48>: int   $0x80
0x080480e3 <_start+50>: mov   $0x1,%eax
0x080480e8 <_start+55>: mov   $0x0,%ebx
0x080480ed <_start+60>: int   $0x80
End of assembler dump.

The call is at address 0x080480c8, as shown by the line
0x080480c8 <_start+23>: call  0x8048094 <functOne>, while the symbol functOne
is at 0x8048094. The next instruction after 0x080480c8 is at 0x080480cd,
so this is what we should see at the top of the stack immediately after the call
instruction is executed.

(gdb) step
functOne () at funct.s:18
18      pushl %ebp         # save the base pointer

Our call has been executed, so take a look at the registers. In the output
below esp holds the address 0xbfdc17ec; after that we will look at what it
points to.

(gdb) info reg
eax        0xa    10
ecx        0x80490f0      134516976
edx        0xa    10
ebx        0x1    1
esp        0xbfdc17ec     0xbfdc17ec
ebp        0x0    0x0
esi        0x0    0
edi        0x0    0
eip        0x8048094      0x8048094
eflags      0x246   582
cs         0x73    115
ss         0x7b    123
ds         0x7b    123
es         0x7b    123
fs         0x0    0
gs         0x0    0

Examine in hex the address at 0xbfdc17ec.

(gdb) x/x 0xbfdc17ec
0xbfdc17ec:    0x080480cd

And it's the return address seen in the disassembly. Everything in the function
should now be understandable, so just step through it.

(gdb) step
functOne () at funct.s:19
19      movl  %esp, %ebp     # make the stack pointer the base pointer
(gdb) step
functOne () at funct.s:21
21      movl  $SYS_WRITE, %eax # mov WRITE(4) into eax
(gdb) step
22      movl  $12, %edx      # length of the string
(gdb) step
23      movl  $functStr, %ecx  # address of our string
(gdb) step
24      movl  $STDOUT, %ebx   # writing to stdout
(gdb) step
25      int  $LINUX_KERNEL   # call the kernel
(gdb) step
in functOne
27      movl  %ebp, %esp     # restore the stack pointer
(gdb) step
28      popl  %ebp         # restore the base pointer
(gdb) step
functOne () at funct.s:29
29      ret

Here the ret instruction is going to pop the return address off the stack and
jump to it (in effect a jmp to (%esp)), so take a look at the value at the
address held in %esp.

(gdb) x/x $esp
0xbfdc17ec:    0x080480cd

Return address from before, so we will return to the instruction right after
the call to functOne.

(gdb) step
_start () at funct.s:40
40      movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
(gdb) list
35      movl $10, %edx      # length of the string
36      movl $_startStr, %ecx  # address of our string
37      movl $STDOUT, %ebx    # writing to stdout
38      int  $LINUX_KERNEL    # call the kernel
39      call functOne       # call functOne
40      movl $SYS_WRITE, %eax  # mov WRITE(4) into eax
41      movl $10, %edx      # length of the string
42      movl $_startStr, %ecx  # address of our string
43      movl $STDOUT, %ebx    # writing to stdout
44      int  $LINUX_KERNEL    # call the kernel

(gdb)

You can see we are at line 40 now, and in the disassembly:
(gdb) disassemble _start
Dump of assembler code for function _start:
0x080480b1 <_start+0>:  nop
0x080480b2 <_start+1>:  mov   $0x4,%eax
0x080480b7 <_start+6>:  mov   $0xa,%edx
0x080480bc <_start+11>: mov   $0x80490f0,%ecx
0x080480c1 <_start+16>: mov   $0x1,%ebx
0x080480c6 <_start+21>: int   $0x80
0x080480c8 <_start+23>: call  0x8048094 <functOne>
0x080480cd <_start+28>: mov   $0x4,%eax
0x080480d2 <_start+33>: mov   $0xa,%edx
0x080480d7 <_start+38>: mov   $0x80490f0,%ecx
0x080480dc <_start+43>: mov   $0x1,%ebx
0x080480e1 <_start+48>: int   $0x80
0x080480e3 <_start+50>: mov   $0x1,%eax
0x080480e8 <_start+55>: mov   $0x0,%ebx
0x080480ed <_start+60>: int   $0x80
End of assembler dump.

We are at the line 0x080480cd <_start+28>: mov   $0x4,%eax.

The rest is pretty easy, we just write out our string and call exit.

(gdb) step
_start () at funct.s:41
41      movl $10, %edx      # length of the string
(gdb)
_start () at funct.s:42
42      movl $_startStr, %ecx  # address of our string
(gdb)
_start () at funct.s:43
43      movl $STDOUT, %ebx    # writing to stdout
(gdb)
_start () at funct.s:44
44      int  $LINUX_KERNEL    # call the kernel
(gdb)
in _start
_start () at funct.s:45
45      movl $SYS_EXIT, %eax  # mov EXIT(1) into eax
(gdb)
_start () at funct.s:46
46      movl $0, %ebx       # 0 is the return value
(gdb)
_start () at funct.s:47
47      int  $LINUX_KERNEL    # call the kernel
(gdb)

Program exited normally.
