Uninformed: Informative Information for the Uninformed

Vol 4» 2006.Jun


Stack Frame Annotation

The unwind information associated with each non-leaf function contains a good deal of useful meta-information about the structure of the stack. It describes the amount of stack space allocated, the location of saved non-volatile registers, and whether or not a frame register is used and what relation it has to the rest of the stack. Each of these items is also tied to the location of the instruction that actually performs the corresponding operation. Take the following unwind information obtained through dumpbin /unwindinfo as an example:

  0000060C 00006E50 00006FF0 000081FC  _resetstkoflw
    Unwind version: 1
    Unwind flags: None
    Size of prologue: 0x47
    Count of codes: 18
    Frame register: rbp
    Frame offset: 0x20
    Unwind codes:
      3C: SAVE_NONVOL, register=r15 offset=0x98
      38: SAVE_NONVOL, register=r14 offset=0xA0
      31: SAVE_NONVOL, register=r13 offset=0xA8
      2A: SAVE_NONVOL, register=r12 offset=0xD8
      23: SAVE_NONVOL, register=rdi offset=0xD0
      1C: SAVE_NONVOL, register=rsi offset=0xC8
      15: SAVE_NONVOL, register=rbx offset=0xC0
      0E: SET_FPREG, register=rbp, offset=0x20
      09: ALLOC_LARGE, size=0xB0
      02: PUSH_NONVOL, register=rbp

First and foremost, one can immediately see that the size of the prologue used in the _resetstkoflw function is 0x47 bytes. This prologue accounts for all of the operations described in the unwind codes array. Furthermore, one can also tell that the function uses a frame pointer, as conveyed through rbp, and that the frame pointer offset is 0x20 bytes relative to the current stack pointer at the time the frame pointer register is established.

As one would expect with an unwind operation, the unwind codes themselves are stored in the reverse of the order in which they are executed. This is necessary because of the effect each unwind code can have on the stack. If they are processed in the wrong order, then the unwind operation will get invalid data. For example, the value obtained through a pop rbp instruction will differ depending on whether it is done before or after an add rsp, 0xb0.
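To make the ordering problem concrete, the following is a minimal sketch that models the first two prologue operations (push rbp and sub rsp, 0B0h) on a toy stack and then undoes them in both orders. The function names, the dictionary-based stack, and the sample values are illustrative assumptions and are not part of any real unwinder.

```python
# Toy model: the stack is a dict mapping addresses to 8-byte values.
ALLOC_SIZE = 0xB0

def run_prologue(rsp, rbp, stack):
    rsp -= 8                 # push rbp
    stack[rsp] = rbp
    rsp -= ALLOC_SIZE        # sub rsp, 0B0h
    return rsp

def unwind(rsp, stack, ops):
    rbp = None
    for op in ops:
        if op == "add":      # add rsp, 0B0h
            rsp += ALLOC_SIZE
        elif op == "pop":    # pop rbp
            rbp = stack.get(rsp)   # None stands in for unrelated data
            rsp += 8
    return rsp, rbp

stack = {}
rsp = run_prologue(0x1000, 0xCAFE, stack)

good = unwind(rsp, stack, ["add", "pop"])   # reverse of execution order
bad = unwind(rsp, stack, ["pop", "add"])    # wrong order

print(hex(good[0]), good[1] == 0xCAFE)      # saved rbp is recovered
print(hex(bad[0]), bad[1])                  # rbp is read from the wrong slot
```

In this model both orders restore the same stack pointer, but only the reverse-of-execution order reads rbp from the slot where it was actually saved.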

For the purposes of annotation, however, the important thing to keep in mind is how all of the useful information can be extracted. In this case, it is possible to take all of the information the unwind codes provide and break it down into a definition of the stack frame layout for a function. This can be accomplished by processing the unwind codes in the order that they would be executed rather than the order that they appear in the array. There's one important thing to keep in mind when doing this. Since unwind information can be chained, it is a requirement that the full chain of unwind codes be processed in execution order. This can be accomplished by walking the chain of unwind information and building an execution order list of all of the unwind codes.
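The chain walk described above can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the tuple layout (code offset, operation, register, value) and the name execution_order are hypothetical, and the sketch assumes that the entry being analyzed comes first in the chain, followed by each entry it chains to, with every array kept in its stored (unwind) order.

```python
def execution_order(chain):
    """Flatten a chain of unwind-code arrays into execution order.

    chain[0] holds the codes of the entry being analyzed; each later
    element holds the codes of the next entry in the chain. Every array
    is in stored order, which is the reverse of execution order.
    """
    # Unwinding processes the first entry's codes, then the chained ones.
    unwind_order = [code for codes in chain for code in codes]
    # Execution order is simply the reverse of the full unwind order.
    return unwind_order[::-1]

# Unwind codes for _resetstkoflw, copied from the dumpbin output above.
resetstkoflw = [
    (0x3C, "SAVE_NONVOL", "r15", 0x98),
    (0x38, "SAVE_NONVOL", "r14", 0xA0),
    (0x31, "SAVE_NONVOL", "r13", 0xA8),
    (0x2A, "SAVE_NONVOL", "r12", 0xD8),
    (0x23, "SAVE_NONVOL", "rdi", 0xD0),
    (0x1C, "SAVE_NONVOL", "rsi", 0xC8),
    (0x15, "SAVE_NONVOL", "rbx", 0xC0),
    (0x0E, "SET_FPREG", "rbp", 0x20),
    (0x09, "ALLOC_LARGE", None, 0xB0),
    (0x02, "PUSH_NONVOL", "rbp", None),
]

ordered = execution_order([resetstkoflw])
for offset, op, reg, value in ordered:
    print(f"{offset:02X}: {op} {reg} {value}")
```

Since _resetstkoflw is not chained, the chain here has a single element and the result is just the reversed array, starting with the push of rbp at code offset 0x02.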

Once the execution order list of unwind codes is collected, the next step is to simply enumerate each code, checking to see what operation it performs and building out the stack frame across each iteration. Prior to enumerating each code, the state of the stack pointer should be initialized to 0 to indicate an empty stack frame. As data is allocated on the stack, the stack pointer should be adjusted by the appropriate amount. The actions that need to be taken for each unwind operation that directly affects the stack pointer are described below.

  1. UWOP_PUSH_NONVOL

    When a non-volatile register is pushed onto the stack, such as through a push rbp, the current stack pointer needs to be decremented by 8 bytes.

  2. UWOP_ALLOC_LARGE and UWOP_ALLOC_SMALL

    When stack space is allocated, the current stack pointer needs to be adjusted by the amount indicated.

  3. UWOP_SET_FPREG

    When a frame pointer is defined, its offset relative to the base of the stack should be saved using the current value of the stack pointer.
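The three rules above can be sketched as a small loop over the execution-order list. The names build_frame, pushed, fp_reg, and fp_base are hypothetical, and the sketch assumes that the frame pointer's position is the current stack pointer plus the frame offset reported in the unwind information. Operations that do not move the stack pointer, such as SAVE_NONVOL, are ignored here.

```python
def build_frame(codes):
    """codes: (operation, operand) pairs in execution order."""
    sp = 0  # empty stack frame at function entry
    frame = {"pushed": {}, "fp_reg": None, "fp_base": None}
    for op, operand in codes:
        if op == "PUSH_NONVOL":
            sp -= 8                       # one 8-byte register slot
            frame["pushed"][operand] = sp
        elif op in ("ALLOC_LARGE", "ALLOC_SMALL"):
            sp -= operand                 # stack grows toward lower addresses
        elif op == "SET_FPREG":
            reg, offset = operand
            frame["fp_reg"] = reg
            frame["fp_base"] = sp + offset
    frame["size"] = -sp
    return frame

# The stack-pointer-affecting codes of _resetstkoflw, in execution order.
frame = build_frame([
    ("PUSH_NONVOL", "rbp"),
    ("ALLOC_LARGE", 0xB0),
    ("SET_FPREG", ("rbp", 0x20)),
])
print(hex(frame["size"]), frame["fp_reg"], hex(-frame["fp_base"]))
```

For the example function this yields a frame of 0xB8 bytes with rbp pointing 0x98 bytes below the stack pointer value at function entry.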

As the enumeration of unwind codes occurs, it is also possible to annotate the different locations on the stack where non-volatile registers are preserved. For instance, given the example unwind information above, it is known that the R15 register is preserved at [rsp + 0x98]. Therefore, we can annotate this location as [rsp + SavedR15].
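A minimal sketch of this labeling step is shown below. The name annotate_saves and the tuple layout are hypothetical. The Saved prefix follows the naming used in this section, and the frame-pointer displacement is assumed to be the rsp-relative offset minus the frame offset (0x20 in the example), which agrees with the rbp-relative operands in the function's prologue.

```python
def annotate_saves(codes, frame_offset=0):
    """Map each rsp-relative save slot to (label, frame-pointer displacement)."""
    labels = {}
    for op, reg, rsp_offset in codes:
        if op == "SAVE_NONVOL":
            name = "Saved" + reg.upper()
            labels[rsp_offset] = (name, rsp_offset - frame_offset)
    return labels

# SAVE_NONVOL codes of _resetstkoflw, taken from the dumpbin output.
saves = [
    ("SAVE_NONVOL", "rbx", 0xC0),
    ("SAVE_NONVOL", "rsi", 0xC8),
    ("SAVE_NONVOL", "rdi", 0xD0),
    ("SAVE_NONVOL", "r12", 0xD8),
    ("SAVE_NONVOL", "r13", 0xA8),
    ("SAVE_NONVOL", "r14", 0xA0),
    ("SAVE_NONVOL", "r15", 0x98),
]

labels = annotate_saves(saves, frame_offset=0x20)
for rsp_offset in sorted(labels):
    name, fp_disp = labels[rsp_offset]
    print(f"[rsp+{rsp_offset:X}h] = [rbp+{fp_disp:X}h] -> {name}")
```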

Beyond annotating preserved register locations on the stack, we can also annotate the instructions that perform operations that affect the stack. For instance, when a non-volatile register is pushed, such as through push rbp, we can annotate the instruction that performs that operation as preserving rbp on the stack. The location of the instruction that's associated with the operation can be determined by taking the BeginAddress associated with the unwind information and adding it to the CodeOffset attribute of the UNWIND_CODE that is being processed. It is important to note, however, that the CodeOffset attribute actually points to the first byte of the instruction immediately following the one that performs the actual operation, so it is necessary to backtrack in order to determine the start of the instruction that actually performs the operation.
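One way to perform this backtracking is sketched below. It assumes that a sorted list of instruction start addresses is already available from a disassembler; that list, the function name annotated_instruction, and the image base of 0x100000000 (inferred from the addresses in the listing for this function) are assumptions made for illustration.

```python
from bisect import bisect_left

def annotated_instruction(image_base, begin_address, code_offset, insn_starts):
    """Return the start of the instruction that ends at the code offset.

    insn_starts must be a sorted list of instruction start addresses.
    """
    end = image_base + begin_address + code_offset
    # Index of the first start address >= end; the one before it is the
    # instruction that immediately precedes the address the code points to.
    i = bisect_left(insn_starts, end)
    return insn_starts[i - 1] if i > 0 else None

IMAGE_BASE = 0x100000000
BEGIN_ADDRESS = 0x6E50   # first RUNTIME_FUNCTION field in the dumpbin output

# Instruction starts in the _resetstkoflw prologue, from the disassembly.
starts = [
    0x100006E50, 0x100006E52, 0x100006E59, 0x100006E5E, 0x100006E65,
    0x100006E6C, 0x100006E73, 0x100006E7A, 0x100006E81, 0x100006E88,
]

push_rbp = annotated_instruction(IMAGE_BASE, BEGIN_ADDRESS, 0x02, starts)
save_r15 = annotated_instruction(IMAGE_BASE, BEGIN_ADDRESS, 0x3C, starts)
print(hex(push_rbp), hex(save_r15))
```

With the example data, code offset 0x02 resolves to the push rbp at 100006E50 and code offset 0x3C resolves to the mov that saves r15 at 100006E88.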

As a result of this analysis, one can take the prologue of the _resetstkoflw function and automatically convert it from:

.text:100006E50     push rbp
.text:100006E52     sub rsp, 0B0h
.text:100006E59     lea rbp, [rsp+0B0h+var_90]
.text:100006E5E     mov [rbp+0A0h], rbx
.text:100006E65     mov [rbp+0A8h], rsi
.text:100006E6C     mov [rbp+0B0h], rdi
.text:100006E73     mov [rbp+0B8h], r12
.text:100006E7A     mov [rbp+88h], r13
.text:100006E81     mov [rbp+80h], r14
.text:100006E88     mov [rbp+78h], r15

to a version with better annotation:

.text:100006E50     push rbp                      ; SavedRBP
.text:100006E52     sub rsp, 0B0h
.text:100006E59     lea rbp, [rsp+20h]
.text:100006E5E     mov [rbp+0A0h], rbx           ; SavedRBX
.text:100006E65     mov [rbp+98h+SavedRSI], rsi   ; SavedRSI
.text:100006E6C     mov [rbp+98h+SavedRDI], rdi   ; SavedRDI
.text:100006E73     mov [rbp+98h+SavedR12], r12   ; SavedR12
.text:100006E7A     mov [rbp+98h+SavedR13], r13   ; SavedR13
.text:100006E81     mov [rbp+98h+SavedR14], r14   ; SavedR14
.text:100006E88     mov [rbp+98h+SavedR15], r15   ; SavedR15

While such annotation may not be entirely useful for understanding the behavior of the binary, it at least simplifies the process of understanding the layout of the stack.