[Repost] Advanced Function Hooking

Source: www.phrack.org

==Phrack Inc.==

          Volume 0x0b, Issue 0x3a, Phile #0x08 of 0x0e

|=-----------------=[ IA32 ADVANCED FUNCTION HOOKING ]=------------------=|
|=-----------------------------------------------------------------------=|
|=-------------------=[ mayhem  <mayhem@hert.org> ]=---------------------=|
|=-----------------------=[ December 08th 2001 ]=------------------------=|


--[ Contents

1 - Introduction
  1.1 - History
  1.2 - New requirements

2 - Hooking basics
  2.1 - Usual techniques
  2.2 - Things not to forget

3 - The code explained

4 - Using the library
  4.1 - The API
  4.2 - Kernel symbol resolution
  4.3 - The hook_t object

5 - Testing the code
  5.1 - Loading the module
  5.2 - Playing around a bit
  5.3 - The code

6 - References
  



--[ 1 - Introduction

  
  Abusing, logging, patching, or even debugging: obvious reasons to think
  that hooking matters. We will try to understand how it works. The
  demonstration context is the Linux kernel environment. The article ends
  with a general purpose hooking library for the Linux kernel 2.4 series,
  developed on 2.4.5 and running on IA32. It is called LKH, the Linux Kernel
  Hooker.


----[ 1.1 - History

  One of the references on the function hijacking subject was released
  in November 1999 and written by Silvio Cesare (hi dude ;-). That
  implementation was pretty straightforward: the hook consisted of
  overwriting the first bytes of the function with a jump to other code,
  in order to filter access to the acct_process function of the kernel,
  keeping specific processes from being accounted.


----[ 1.2 - New requirements


Some work has been done since that time:

- Pragmatic use of redirection often (always?) needs access to the
  original parameters, whatever their number and size (for example
  if we want to modify and forward IP packets).

- We may need to disable the hook on demand, which is perfect for runtime
  kernel configuration. We may want to call the original function
  (discrete hooking, used by monitoring programs) or not (aggressive hooking,
  used by security patches to manage ACLs - Access Control Lists - on kernel
  objects).

- In some cases, we may also want to destroy the hook just after the first
  call, for example to gather statistics (we can hook once every second or
  once every minute).



--[ 2 - Hooking basics


----[ 2.1 - Usual techniques


Of course, the core hooking code must be written in assembly language, but the
hook wrapping code is written in C. The LKH high level interface is described
in the API section. Let us first go through some hooking basics.

This is basically what hooking is:

- Modify the beginning of a function's code so that it points to other code
  (called the 'hooking code'). This is a very old and efficient way
  to do what we want. The other way is to patch every call
  in the code segment that references the function. This second method
  has some advantages (it is very stealthy) but the implementation is a bit
  complex (memory area blocks parsing, then code scanning) and not very
  fast.

- Modify the function return address at runtime to take control when the
  hooked function execution is over.

- The hook code must have two different parts: the first one must be
  executed before the function (prepare the stack for accessing
  parameters, launch callbacks, restore the old function code), the second
  one must be executed after (reset the hook again if needed).

- Default parameters (defining the hook behaviour) must be set during
  the hook creation (before modifying the function code). Function
  dependent parameters must be fixed at that point.

- Add callbacks. Each callback can access and even modify the original
  function parameters.

- Enable, disable, change parameters, add or remove callbacks whenever we want.




----[ 2.2 - Things not to forget


  -> Functions without frame pointer:

  An important feature is the ability to hook functions compiled with the
  -fomit-frame-pointer gcc option. This feature requires the hooking code to
  be %ebp free, which is why only %esp is used for stack operations.
  We also have to update some parts (some bytes here and there) to fix %ebp
  relative offsets in the hook code. Look at khook_create() in lkh.c for more
  details on that subject.

  The hook code also has to be position independent. That is why so many
  offsets in the hook code are fixed at runtime (since we are in the kernel,
  offsets have to be fixed during the hook creation, but very similar
  techniques can be used for function hooking in *runtime* processes).


  -> Recursion

  We must be able to call the original function from a callback, so the
  original code has to be restored before the execution of any callback.


  -> Return values

  We must return the correct value in %eax, whether we have callbacks or not,
  whether the original function is called or not. In the demonstration, the
  return value of the last executed callback is returned if the original
  function is not called. If there are no callbacks and the original function
  is not called, the return value is beyond control.


  -> POST callbacks

  You cannot access function parameters if you execute callbacks after the
  original function. That is why it is a bad idea. However, here is the
  technique to do it:

  - Set the hook as aggressive.

  - Call the PRE callbacks.

  - Call the original function from a callback with its own parameters.

  - Call the POST callbacks.




--[ 3 - The code explained


   First we install the hook.

   A - Overwrite the first 7 bytes of the hijacked routine
      with an indirect jump pointing to the hook code area.

      The value put in %eax is the absolute address of the hook
      code, so each time we call the hijack_me() function,
      the hook code takes control.

      Before hijack:

      0x80485ec <hijack_me>:       mov   0x4(%esp,1),%eax
      0x80485f0 <hijack_me+4>:      push  %eax
      0x80485f1 <hijack_me+5>:      push  $0x8048e00
      0x80485f6 <hijack_me+10>:     call  0x80484f0 <printf>
      0x80485fb <hijack_me+15>:     add   $0x8,%esp


      After the hijack:

      0x80485ec <hijack_me>:       mov   $0x804a323,%eax
      0x80485f1 <hijack_me+5>:      jmp   *%eax
      0x80485f3 <hijack_me+7>:      movl  (%eax,%ecx,1),%es
      0x80485f6 <hijack_me+10>:     call  0x80484f0 <printf>
      0x80485fb <hijack_me+15>:     add   $0x8,%esp
      
      The 3 instructions displayed after the jmp do not mean anything,
      since gdb is fooled by our hook.
   

   B - Restore the original bytes of the hooked function. We need this if
      we want to call the original function without breaking things.

        pusha
        movl      $0x00, %esi              (1)
        movl      $0x00, %edi              (2)
        push      %ds
        pop      %es
        cld
        xor      %ecx, %ecx
        movb      $0x07, %cl
        rep movsb


      The two NULL offsets are actually modified during the hook
      creation (since their values depend on the hooked function offset,
      we have to patch the hook code at runtime). (1) is fixed with
      the address of the buffer containing the first 7 saved bytes of the
      original function. (2) is fixed with the original function address.
      If you are familiar with the x86 assembly language, you should know
      that these instructions copy %ecx bytes from %ds:%esi to
      %es:%edi. Refer to [2] for the INTEL instruction specifications.

      
   C - Initialise the stack to allow read/write access to the parameters and
      launch our callbacks. We move the address of the first original
      parameter into %eax, then we push it.

        leal      8(%esp), %eax
        push      %eax
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop
        nop; nop; nop; nop; nop         


      Note that empty slots are full of NOP instructions (opcode 0x90),
      which mean "no operation". When a slot is filled (using the
      khook_add_entry function), 5 bytes are used:

      - The call opcode (opcode 0xE8)

      - The callback offset (4 bytes relative address)

      We chose to set a maximum of 8 callbacks. Each inserted
      callback is called with one parameter (the pushed %eax value contains
      the address of the original function parameters, residing on the stack).
            



   D - Reset the stack .

        add $0x04, %esp         

      We now remove the original function's parameter address
      pushed in (C). That way, %esp is reset to its old value (the
      one before entering step C). At this moment, the stack
      does not contain the original function's stack frame, since the
      first bytes of the function were overwritten in step (A).


   E - Modify the return address of the original function on the stack.
      On INTEL processors, function return addresses are saved on the stack,
      which is not a very good idea for security reasons ;-). This
      modification makes us return where we want (to the hook code)
      after the original function execution. Then we call the original
      function. On return, the hook code regains control. Let's look at
      that carefully:


      -> First we get our current %eip and save it in %esi (the 'end'
        label points to some code you can easily identify in
        step E5). This trick is always used in position independent
        code.

      1.  jmp      end
        begin:
        pop      %esi              


      -> Then we retrieve the old return address residing
        at 4(%esp) and save it in %eax.

      2.  movl      4(%esp), %eax

      -> We store that saved return address as a 4 byte offset
        at the end of the hook code (see the NULL pointer in
        step G), so we can return to the right place at the
        end of the hooking process.

      3.  movl      %eax, 20(%esi)      


      -> We modify the return address of the original function
        so that it returns just after the 'call begin' instruction.

      4.  movl      %esi, 4(%esp)
        movl      $0x00, %eax


      -> We call the original function. The 'end' label is used
        in step 1, and the 'begin' label points to the code just
        after the "jmp end" (still in step 1).
        The original function will return just after the 'call begin'
        instruction since we changed its return address.


      5.  jmp      *%eax
        end:
        call      begin


   F - Back to the hooking code. We set the 7 evil bytes again in the
      original function's code. These bytes were reset to their original
      values before calling the function, so we need to hook the function
      again (like in step A).

      This step is nop'ed (replaced by NOP instructions) if the hook is
      single-shot (not permanent), so the 7 bytes of our evil indirect
      jump (step A) are not copied again. This step is very close to
      step (B) since it uses the same copy mechanism (using rep movs*
      instructions), so refer to that step for explanations. NULL
      offsets in the code must be fixed during the hook creation:

      - The first one (the source buffer) is replaced by the address of
        the evil bytes buffer.

      - The second one (the destination buffer) is replaced by the original
        function entry point address.


        movl      $0x00, %esi
        movl      $0x00, %edi
        push      %ds
        pop      %es
        cld
        xor      %ecx, %ecx
        movb      $0x07, %cl
        rep movsb            


   G - Use the original return address (retrieved in step E2 and stored
      in step E3) and get back to the original calling function. The NULL
      offset you can see (*) is fixed at runtime in step E3 with the
      original function return address. The %ecx value is then pushed on
      the stack so the next ret instruction uses it as if it were a saved
      %eip register on the stack. This returns to the (correct) original
      place.

        movl      $0x00, %ecx  *
        pushl     %ecx
        ret



--[ 4 - Using the library


----[ 4.1 - The API


The LKH API is pretty easy to use :
  
hook_t      *khook_create(int addr, int mask);

      Create a hook on the address 'addr'. Also give the default type
      (HOOK_PERMANENT or HOOK_SINGLESHOT), the default state
      (HOOK_ENABLED or HOOK_DISABLED) and the default mode (HOOK_AGGRESSIVE
      or HOOK_DISCRETE). The type, state and mode are OR'd in the
      'mask' parameter.



void khook_destroy(hook_t *h);

      Disable, destroy, and free the hook resources.


int khook_add_entry(hook_t *h, char *routine, int range);

      Add a callback to the hook, at the 'range' rank. Return -1 if the
      given rank is invalid. Otherwise, return 0.


int khook_remove_entry(hook_t *h, int range);

      Remove the callback put in slot 'range'. Return -1 if the given rank
      is invalid. Otherwise, return 0.


void khook_purge(hook_t *h);

      Remove all callbacks on this hook .


int khook_set_type(hook_t *h, char type);

      Change the type for the hook 'h'. The type can be HOOK_PERMANENT
      (the hook code is executed each time the hooked function is called) or
      HOOK_SINGLESHOT (the hook code is executed only for 1 hijack, then the
      hook is cleanly removed).


int khook_set_state(hook_t *h, char state);

      Change the state for the hook 'h'. The state can be HOOK_ENABLED
      (the hook is enabled) or HOOK_DISABLED (the hook is disabled).


int khook_set_mode(hook_t *h, char mode);

      Change the mode for the hook 'h'. The mode can be HOOK_AGGRESSIVE
      (the hook does not call the hijacked function) or HOOK_DISCRETE
      (the hook calls the hijacked function after having executed the
      callback routines). Some parts of the hook code are nop'ed
      (overwritten by no operation instructions) if the hook is aggressive
      (step E and step H).


int khook_set_attr(hook_t *h, int mask);

      Change the mode, state, and/or type using a unique function call.
      The function returns 0 in case of success or -1 if the specified
      mask contains incompatible options .


Note that you can add or remove entries whenever you want, whatever the
state, type and mode of the hook.



----[ 4.2 - Kernel symbol resolution

A symbol resolution function has been added to LKH, allowing you to access
the values of exported functions.

int ksym_lookup(char *name);

Note that it returns NULL if the symbol remains unresolved. This lookup
can resolve symbols contained in the __ksymtab section of the kernel. An
exhaustive list of these symbols is printed when executing 'ksyms -a':

bash-2.03# ksyms -a | wc -l
  1136
bash-2.03# wc -l /boot/System.map
  14647 /boot/System.map
bash-2.03# elfsh -f /usr/src/linux/vmlinux -s  # displaying sections

[SECTION HEADER TABLE]

(nil)    ---         foffset:   (nil)      0 bytes [*Unknown*]
(...)
0xc024d9e0 a-- __ex_table  foffset: 0x14e9e0    5520 bytes [Program data]
0xc024ef70 a-- __ksymtab  foffset: 0x14ff70    9008 bytes [Program data]
0xc02512a0 aw- .data     foffset: 0x1522a0   99616 bytes [Program data]
(...)
(nil)    --- .shstrtab  foffset: 0x1ad260    216 bytes [String table]
(nil)    --- .symtab    foffset: 0x1ad680  245440 bytes [Symbol table]
(nil)    --- .strtab    foffset: 0x1e9540  263805 bytes [String table]

[END]


As a matter of fact, the memory mapped section __ksymtab does not contain
every kernel symbol we would like to hijack.
On the other hand, the non-mapped section .symtab is definitely bigger
(245440 bytes vs 9008 bytes). When using 'ksyms', the __NR_query_module
syscall (or __NR_get_kernel_syms for older kernels) is used internally. This
syscall can only access the __ksymtab section, since the complete kernel
symbol table contained in .symtab is not loaded in memory. The solution
to access the whole symbol table is to pick up offsets in our System.map
file (create it using `nm -a vmlinux > System.map`).

bash-2.03# ksyms -a | grep sys_fork
bash-2.03# grep sys_fork /boot/System.map
c0105898 T sys_fork
bash-2.03#


#define      SYS_FORK      0xc0105898

  if ((s = khook_create((int) SYS_FORK, HOOK_PERMANENT | HOOK_ENABLED)) == NULL)
   KFATAL("init_module: Cant set hook on function *sys_fork* ! \n", -1);
  khook_add_entry(s, (int) fork_callback, 0);

#undef SYS_FORK


For systems without a System.map or an uncompressed kernel image (vmlinux),
it is acceptable to uncompress the vmlinuz file (take care, it is not a
standard gzip format! [3] contains very useful information about this)
and manually create a new System.map file.

Another way to resolve non-exported kernel symbols could
be a statistics based lookup: analysing references in the kernel
machine code could allow us to predict the symbol values (by fetching
call or jmp instructions). The difficulty with such a tool would be
portability, since the kernel code changes from one version to another.

Don't forget to change SYS_FORK to your own sys_fork offset value.


----[ 4.3 - LKH Internals: the hook_t object

Let's look at the hook_t structure (the hook entity in memory):

typedef struct      s_hook
{
  int            addr;               
  int            offset;               
  char           saved_bytes[7];           
  char           voodoo_bytes[7];      
  char           hook[HOOK_SIZE];      
  char           cache1[CACHE1_SIZE];   
  char           cache2[CACHE2_SIZE];      
}        hook_t;



h->addr          The address of the original function, used to
                 enable or disable the hook.

h->offset        The offset from h->addr where the overwrite
                 begins to set the hijack. Its value is 3 or
                 0, depending on whether the function has a stack
                 frame or not.

h->saved_bytes   The seven overwritten bytes of the original
                 function.

h->voodoo_bytes  The seven bytes we need to put at the beginning of the
                 function to redirect it (contains the indirect jump code
                 seen in step A of section 3).

h->hook          The opcode buffer containing the hooking code,
                 where we insert callback references using
                 khook_add_entry().


The cache1 and cache2 buffers are used to back up some hook code when we
set the mode to HOOK_AGGRESSIVE (since we have to nop out the original
function call, saving this code is necessary to be able to reset the hook
as discrete later).



Each time you create a hook, an instance of hook_t is declared and
allocated . You have to create one hook per function you want to
hijack .




----[ 5 - Testing the code


Please check http://www.devhell.org/~mayhem/ for fresh code first. The
package (version 1.1) is given at the end of the article.

Just do #include "lkh.c" and play! In this example module using LKH,
we want to hook:

- the hijack_me() function: here you can check that parameters are passed
  correctly and properly modified through the callbacks.

- the schedule() function, SINGLESHOT hijack .

- the sys_fork() function, PERMANENT hijack .


------[ 5.1 - Loading the module

bash-2.03# make load
insmod lkh.o
Testing a permanent, aggressive, enabled hook with 3 callbacks:
A in hijack_one  = 0 -OK-
B in hijack_one  = 1 -OK-
A in hijack_zero = 1 -OK-
B in hijack_zero = 2 -OK-
A in hijack_two  = 2 -OK-
B in hijack_two  = 3 -OK-
--------------------
Testing a disabled hook:
A in HIJACKME!!! = 10 -OK-
B in HIJACKME!!! = 20 -OK-
--------------------
Calling hijack_me after the hook destruction
A in HIJACKME!!! = 1  -OK-
B in HIJACKME!!! = 2  -OK-
SCHEDULING!

------[ 5.2 - Playing around a bit

bash-2.05# ls
FORKING!
Makefile  doc  example.c  lkh.c  lkh.h  lkh.o  user  user.c  user.h  user.o
bash-2.05# pwd
/usr/src/coding/LKH


(FORKING! was not printed since pwd is a shell builtin command :)


bash-2.05# make unload
FORKING!
rmmod lkh;
LKH unloaded - sponsorized by the /dev/hell crew!
bash-2.05# ls
Makefile  doc  example.c  lkh.c  lkh.h  lkh.o  user  user.c  user.h  user.o
bash-2.05#


You can see "FORKING!" each time the sys_fork() kernel function is called
(the hook is permanent) and "SCHEDULING!" when the schedule() kernel function
is called for the first time (since this hook is SINGLESHOT, the schedule()
function is hijacked only one time, then the hook is removed) .

Here is the commented code for this demo :


------[ 5.3 - The code

/*
** LKH demonstration code, developed and tested on Linux x86 2.4.5
**
** The Library code is attached .
** Please check http://www.devhell.org/~mayhem/ for updates .
**
** This tarball includes a userland code (runnable from GDB), the LKH
** kernel module and its include file, and this file (lkm-example.c)
**
** Suggestions {and,or} bug reports are welcomed ! LKH 1.2 already
** in development .
**
** Special thanks to b1nf for quality control ;)
** Shoutout to kraken, keep the good work on psh man !
**
** Thanks to csp0t (one word to describe you : *elite*)
** and cma4 (EPITECH powa, favorite win32 kernel hax0r)
**
** BigKaas to the devhell crew (r1x and nitrogen fux0r)
** Lightman, Gab and Xfred from chx-labs (stop smoking you junkies ;)
**
** Thanks to the phrackstaff and particularly skyper for his
** great support . Le Havre en force ! Case mais oui je t'aime ;)
*/
#include "lkh.c"


int      hijack_me(int a, int b);  /* hooked function */
int      hijack_zero(void *ptr);  /* first callback */
int      hijack_one(void *ptr);  /* second callback */
int      hijack_two(void *ptr);  /* third callback */
void     hijack_fork(void *ptr);  /* sys_fork callback */
void     hijack_schedule(void *ptr);  /* schedule callback */

static  hook_t      *h = NULL;
static  hook_t      *i = NULL;
static  hook_t      *j = NULL;


int
init_module()
{
  int           ret;

  printk(KERN_ALERT "Change the SYS_FORK value then remove the return \n");
  return (-1);

  /*
  ** Create the hooks
  */

#define      SYS_FORK 0xc010584c

  j = khook_create(SYS_FORK
            , HOOK_PERMANENT
            | HOOK_ENABLED
            | HOOK_DISCRETE);

#undef      SYS_FORK

  h = khook_create(ksym_lookup("hijack_me")
            , HOOK_PERMANENT
            | HOOK_ENABLED
            | HOOK_AGGRESSIVE);

  i = khook_create(ksym_lookup("schedule")
            , HOOK_SINGLESHOT
            | HOOK_ENABLED
            | HOOK_DISCRETE);


  /*
  ** Yet another check
  */
  if (!h || !i || !j)
   {
    printk(KERN_ALERT "Cannot hook kernel functions \n");
    return (-1);
   }


  /*
  ** Adding some callbacks for the sys_fork and schedule functions
  */
  khook_add_entry(i, (int) hijack_schedule, 0);
  khook_add_entry(j, (int) hijack_fork, 0);



  /*
  ** Testing the hijack_me() hook .
  */
  printk(KERN_ALERT "LKH: perm, aggressive, enabled hook, 3 callbacks:\n");
  khook_add_entry(h, (int) hijack_zero, 1);
  khook_add_entry(h, (int) hijack_one, 0);
  khook_add_entry(h, (int) hijack_two, 2);
  ret = hijack_me(0, 1);

  printk(KERN_ALERT "--------------------\n");
  printk(KERN_ALERT "Testing a disabled hook :\n");
  khook_set_state(h, HOOK_DISABLED);
  ret = hijack_me(10, 20);

  khook_destroy(h);
  printk(KERN_ALERT "------------------\n");
  printk(KERN_ALERT "Calling hijack_me after the hook destruction\n");
  hijack_me(1, 2);

  return (0);
}



void
cleanup_module()
{
  khook_destroy(i);
  khook_destroy(j);
  printk(KERN_ALERT "LKH unloaded - sponsorized by the /dev/hell crew!\n");
}




/*
** Function to hijack
*/
int
hijack_me(int a, int b)
{
  printk(KERN_ALERT "A in HIJACKME!!! = %u \t -OK- \n", a);
  printk(KERN_ALERT "B in HIJACKME!!! = %u \t -OK- \n", b);
  return (42);
}



/*
** First callback for hijack_me()
*/
int
hijack_zero(void *ptr)
{
  int      *a;
  int      *b;

  a = ptr;
  b = a + 1;
  printk(KERN_ALERT "A in hijack_zero = %u \t -OK- \n", *a);
  printk(KERN_ALERT "B in hijack_zero = %u \t -OK- \n", *b);
  (*b)++;
  (*a)++;
  return (0);
}



/*
** Second callback for hijack_me()
*/
int
hijack_one(void *ptr)
{
  int      *a;
  int      *b;
  
  a = ptr;
  b = a + 1;
  printk(KERN_ALERT "A in hijack_one  = %u \t -OK- \n", *a);
  printk(KERN_ALERT "B in hijack_one  = %u \t -OK- \n", *b);
  (*a)++;
  (*b)++;
  return (1);
}



/*
** Third callback for hijack_me()
*/
int
hijack_two(void *ptr)
{
  int      *a;
  int      *b;

  a = ptr;
  b = a + 1;
  printk(KERN_ALERT "A in hijack_two  = %u \t -OK- \n", *a);
  printk(KERN_ALERT "B in hijack_two  = %u \t -OK- \n", *b);
  (*a)++;
  (*b)++;
  return (2);
}




/*
** Callback for schedule() (kernel exported symbol)
*/
void      hijack_schedule(void *ptr)
{
  printk(KERN_ALERT "SCHEDULING! \n");
}



/*
** Callbacks for sys_fork() (kernel non exported symbol)
*/
void
hijack_fork(void *ptr)
{
  printk(KERN_ALERT "FORKING! \n");
}




--[ 6 - References

[1] Kernel function hijacking
    http://www.big.net.au/~silvio/
[2] INTEL Developers manual
    http://developers.intel.com/design/pentium4/manuals/
[3] Linux Kernel Internals
    http://www.linuxdoc.org/guides.html


|=[ EOF ]=---------------------------------------------------------------=|
