Debug Fundamentals Exercise 2: Some reverse engineering for Thanksgiving

Debug Fundamentals Exercise 2: Some reverse engineering for Thanksgiving

  • Comments 42

 

Continuing our series on “Fundamentals Exercises”, we have some more reverse engineering for you!  Again, these exercises are designed more as learning experiences rather than simply puzzlers.  We hope you find them interesting and educational!  Feel free to post your responses here, but we won’t put them on the site until after we post the “official” responses, to avoid spoilers.

 

Examine the following code, registers, and stack values to determine the following:

 

1.       What is the return value from DemoFunction2?

2.       What is the purpose of DemoFunction2?

3.       Bonus: Both the last exercise and this week’s exercise involved accessing data at ebp+8.  Why ebp+8?

 

Hints:

1.       You probably don’t want to manually walk through every instruction that executes in the loop.  Instead, walk through a few iterations to determine the intent of the code.

2.       The bracket notation [] in the assembly means to treat the value in brackets as a memory address, and access the value at that address.

3.       32-bit integer return values are stored in eax

 

 

0:000> uf 010024d0

asmdemo2!DemoFunction2:

010024d0 55              push    ebp

010024d1 8bec            mov     ebp,esp

010024d3 8b5508          mov     edx,dword ptr [ebp+8]

010024d6 33c0            xor     eax,eax

010024d8 b920000000      mov     ecx,20h

010024dd d1ea            shr     edx,1

010024df 7301            jnc     asmdemo2!DemoFunction2+0x12 (010024e2)

010024e1 40              inc     eax

010024e2 e2f9            loop    asmdemo2!DemoFunction2+0xd (010024dd)

010024e4 5d              pop     ebp

010024e5 c3              ret

 

0:000> r

eax=80002418 ebx=7ffd7000 ecx=00682295 edx=00000000 esi=80002418 edi=00000002

eip=010024d0 esp=0006fe98 ebp=0006fea8 iopl=0         nv up ei pl zr na pe nc

cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

asmdemo2!DemoFunction2:

010024d0 55              push    ebp

 

0:000> dps esp

0006fe98  0100251c asmdemo2!main+0x20

0006fe9c  80002418

0006fea0  00000002

0006fea4  00000000

0006fea8  0006ff88

0006feac  01002969 asmdemo2!_mainCRTStartup+0x12c

0006feb0  00000002

0006feb4  00682270

0006feb8  006822b8

0006febc  f395c17d

0006fec0  00000000

0006fec4  00000000

0006fec8  7ffd7000

0006fecc  00000000

0006fed0  00000000

0006fed4  00000000

0006fed8  00000094

0006fedc  00000006

0006fee0  00000000

0006fee4  00001771

0006fee8  00000002

0006feec  76726553

0006fef0  20656369

0006fef4  6b636150

0006fef8  00003120

0006fefc  00000000

0006ff00  00000000

0006ff04  00000000

0006ff08  00000000

0006ff0c  00000000

0006ff10  00000000

0006ff14  00000000

 


[Update: our answer. Posted 12/04/2008]

We had a great response to this exercise!  It was good to see so many of you going through this.  There were some readers that found this a good exercise for beginners, and others were looking for a return to Puzzlers.  As an FYI, we may do more Puzzlers in the future, but for now we are going to continue on the “Fundamentals Exercise” track to help all our readers build up a solid foundation for debugging.

 

It was interesting to read how several of you not only gave the answers, but made suggestions for how the code could be optimized!  I want to point out that the code we post for these exercises isn’t intended to be the optimal solution; it is written as a learning tool.   That said, keep that in-depth feedback coming; I think everyone will benefit from a discussion of optimization.

 

Answers to exercise 2:

 

  1. DemoFunction2 returns 5, which is the number of bits set in 80002418, the value at ebp+8.
  2. DemoFunction2 finds the hamming weight of the 32-bit value passed to the function (it returns the count of bits that are equal to 1).
  3. ebp points to the base of the stack frame (the value stored there points to the previous frame's ebp), ebp+4 points to the return address, and ebp+8 points to the first parameter passed to the function.  Note that the value of ebp changes in the function prolog, at instruction 010024d1.  At this point ebp is set to 0006fe94, so at instruction 010024d3, ebp+8 is 0006fe9c, and [ebp+8] = 80002418.
Leave a Comment
  • Please add 7 and 1 and type the answer here:
  • Post
  • It looks like the function counts the number of set bits in its argument. Since the argument here is 0x80002418, the return value is 5.

    After the prologue, the first argument is at ebp+8 and the stack looks like this

    0006fe94 0006fea8 <- ebp,esp (pushed ebp value)

    0006fe98 0100251c [ebp+4] (return address)

    0006fe9c 80002418 [ebp+8] (first argument)

  • This looks like naive implemenation of bit counting in 32-bit unsigned value. DemoFunction2(0x80002418) should return 5.

    [ebp+8] is used to retrieve the input parameter, because [ebp+4] holds the return address, and [ebp] is used to keep the previous value of ebp.

    PS: "Hacker's Delight" by Warren has great chapter on bit counting.

  • This routine counts the number of bits in the 32 bit integer input variable

    int DemoFunction2(int val)

    {

      return NumberOfBitsTurnedOn(val);

    }

    int DemoFunction2(int val)

    {

    int ret = 0;

    for (int =0;i<0x20;i++)

    {

    if (val%2 == 1)

    ret++;

    val/=2;

    }

           return ret;

    }

    ebp+8 is the first argument on the stack, since ebp is initialized from esp. The first 32 bits on the stack is the return address. The next 32 bits is the first argument (the only argument in this case).

  • 1. The return value is 5 for the input 80002418h

    2. The function calculates the number of bits set in the argument.

    3. EBP is set to the top of stack after the old EBP is pushed. Thus [EBP+0] is old EBP, [EBP+4] is the return address and [EBP+8] is the first argument.

  • Return value is :0x1.

    The purpose of DemoFunction2 is to count number of bits set in given DWORD.

    --rc

  • return value is :0x5.

    function calculates number of bits in DWORD which is passed as parameter.

    Why ebp+8?

    Because as soon as we enter function we push ebp on stack, that move parameter further way by 4 bytes.

    --rc

  • as far as I understand, it counts the set bits in a 32 bit (20h) number.

    The parameter passed is 80002418h, which contains 5 set bits, so the return will be 5.

    [ebp+8]: as the ebp will contain the stack pointer (mov ebp, esp), this expression will point to one of the elements on the stack.

    The top element of the stack is the return address, and the next is the parameter passed.

    So ebp+8 will point to the parameter. Now why +8, why not +4? That beats me.

  • ok, I think I figured out:

    esp points to the top of the stack (where the next element will be stored),

    esp-4 points to the last element (the return address)

    and esp-8 points to the next-to-last element, which is the parameter.

  • Return value is 5.  The function is counts the 1-bits in the 32-bit value passed to the function, which was 0x80002418.  EBP+8 accesses the first parameter passed to the function.  EBP+4 is the return address to the calling function.  EBP points to the EBP of the previous function.

  • This function can be documented as follows:

    ; Prototype of the function:

    ; unsigned int DemoFunction2(unsigned int Value)

    asmdemo2!DemoFunction2:

    ; Create the stack frame

    ; not really needed here, as we could access relative to ESP, but for completeness... ;)

    010024d0 55              push    ebp

    010024d1 8bec            mov     ebp,esp

    ; Get the first (and only) argument to the function ("Value")

    010024d3 8b5508          mov     edx,dword ptr [ebp+8]

    ; Clear the internal counter

    010024d6 33c0            xor     eax,eax

    ; Make sure to loop for exacty 32 bits

    010024d8 b920000000      mov     ecx,20h

    CountLoop:

    ; logically shift the Value one bit to the right

    ; the rightmost bit will be put into the Carry

    010024dd d1ea            shr     edx,1

    ; if carry is not set, skip the following command

    010024df 7301            jnc     asmdemo2!DemoFunction2+0x12 (010024e2)

    ; Increment the counter

    ; Due to the "jnc" before, this is only executed if the rightmost bit

    ; before shifting ("shr") was set

    010024e1 40              inc     eax

    ; decrement CX; as long as CX has not reached 0, continue the loop

    010024e2 e2f9            loop    asmdemo2!DemoFunction2+0xd (010024dd)

    ; "undo" the stack frame

    010024e4 5d              pop     ebp

    010024e5 c3              ret

    So:

    2. Obviously, this function counts the number of "1" bits in the lower 32 bit of the binary representation of its argument.

    1. The return value will be 5, as 0x80002418 contains exactly five "1" bits (the hex digits 8, 2, 4, 1, and 8 contain exactly one "1" bit each)

    3. EBP+8 is the argument.

      When the function just entered (at 010024d0), the stack frame looks like:

      ++++++++++++++++++

      + return address + esp+0

      +----------------+

      + argument       + esp+4

      ++++++++++++++++++

      Now, ebp is pushed onto the stack, so we get the following outline:

      ++++++++++++++++++

      + "old" EBP      + esp+0

      +----------------+

      + return address + esp+4

      +----------------+

      + argument       + esp+8

      ++++++++++++++++++

      With the following mov (010024d1), ebp is the same as esp, thus, ebp+8 is the argument.

    If you want to use assembler commands seldomly used ("LOOP"), you could also have used ENTER and LEAVE.

    Anyway, as I love to generate C code out of assembler, here is some code which would produce the same result (but different instructions):

    int DemoFunction2(unsigned int bitmask)

    {

       int i;

       int count = 0;

       __asm int 3;

       for (i = 32; i > 0; i--) {

           if ( (bitmask >> 1) != 0) {

               ++count;

           }

       }

       return count;

    }

    Why is bitmask declared as "unsigned int" instead of "int"? Well, because in C, shifting a signed value to the right provokes undefined behaviour. With "unsigned int", I can be sure that it is a logical shift right, and not an arithmetic one or something completely different.

  •  1. What is the return value from DemoFunction2?

    5

     2. What is the purpose of DemoFunction2?

    Return the number of bits that are set, in a 32-bit value

     3. Why ebp+8?

    It's the first param to the function - ebp+4 is the return address

  • (1) The return value is 5

    (2) The purpose is to count the number of set bits in its argument.

    (3) The preamble sets EBP to point to the stack frame for the function and offset +8 is the first argument.

  • Ah -- counting the number of bits set to 1 in a 32 bit integer value.

    You should have a or edx, edx check before the mov ecx, 0x20h and do a jz to 0x010024e4 (in case someone passed in 0 to begin with). Also, you can try using the BSF (bit scan forward instruction), but the performance is likely to the same.

    Finally, a reminder to make this things fastcalls -- it's a pity that the 32-bit compiler can't default to ecx and edx for parameter passing.

    Isn't legacy a wonderful thing?

  • Hi,

    My answers are:

    1. 5

    2. It counts the number of set (1) bits in the 0x80002418 number

    3. After the function epilog (pusb ebp | mov ebp, esp) ebp+8 is the way to access the first parameter

    -George

  • 1. 1

    2. Count the bits

    3. First argument for this function

    Please, more puzzlers instead of exercises. :)

Page 1 of 3 (42 items) 123