|Informative Information for the Uninformed|
The unwind information associated with each non-leaf function contains useful meta-information about the structure of the stack. It describes the amount of stack space allocated, the location of saved non-volatile registers, and whether a frame register is used and how it relates to the rest of the stack. Each of these details is also tied to the location of the instruction that performs the corresponding operation. Take the following unwind information obtained through dumpbin /unwindinfo as an example:
    0000060C 00006E50 00006FF0 000081FC  _resetstkoflw
      Unwind version: 1
      Unwind flags: None
      Size of prologue: 0x47
      Count of codes: 18
      Frame register: rbp
      Frame offset: 0x20
      Unwind codes:
        3C: SAVE_NONVOL, register=r15 offset=0x98
        38: SAVE_NONVOL, register=r14 offset=0xA0
        31: SAVE_NONVOL, register=r13 offset=0xA8
        2A: SAVE_NONVOL, register=r12 offset=0xD8
        23: SAVE_NONVOL, register=rdi offset=0xD0
        1C: SAVE_NONVOL, register=rsi offset=0xC8
        15: SAVE_NONVOL, register=rbx offset=0xC0
        0E: SET_FPREG, register=rbp, offset=0x20
        09: ALLOC_LARGE, size=0xB0
        02: PUSH_NONVOL, register=rbp
First and foremost, one can immediately see that the size of the prologue used in the _resetstkoflw function is 0x47 bytes. This prologue accounts for all of the operations described in the unwind codes array. Furthermore, one can also tell that the function uses a frame pointer, as conveyed through rbp, and that the frame pointer offset is 0x20 bytes relative to the current stack pointer at the time the frame pointer register is established.
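To make the header fields concrete, the following is a minimal sketch of decoding the fixed four-byte header that begins an UNWIND_INFO structure. The function name and the sample bytes are illustrative: the bytes were hand-encoded from the dumpbin fields shown above rather than read from the actual binary. The frame offset is stored in the upper four bits of the last byte in units of 16 bytes, which is why the encoded value 2 is reported as 0x20.

```python
import struct

# x64 register numbering as used by the unwind data
REGISTER_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
                  "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def parse_unwind_header(data):
    """Decode the 4-byte header at the start of an UNWIND_INFO blob."""
    ver_flags, prolog_size, code_count, frame = struct.unpack_from("<BBBB", data, 0)
    frame_reg = frame & 0x0F          # low nibble: frame register, 0 means none
    return {
        "version": ver_flags & 0x07,  # low 3 bits
        "flags": ver_flags >> 3,      # high 5 bits
        "prolog_size": prolog_size,
        "code_count": code_count,
        "frame_register": REGISTER_NAMES[frame_reg] if frame_reg else None,
        "frame_offset": (frame >> 4) * 16,  # high nibble, scaled by 16
    }

# Hand-encoded from the dumpbin output above (an assumption, not a real dump):
# version 1, no flags, prologue 0x47, 18 codes, rbp (5) with scaled offset 2.
header = parse_unwind_header(bytes([0x01, 0x47, 0x12, 0x25]))
print(header)
```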
As one would expect with an unwind operation, the unwind codes themselves are stored in the reverse of the order in which they are executed. This is necessary because of the effect each unwind code can have on the stack. If they are processed in the wrong order, then the unwind operation will get invalid data. For example, the value obtained through a pop rbp instruction will differ depending on whether it is done before or after an add rsp, 0xb0.
For the purposes of annotation, however, the important thing to keep in mind is how all of the useful information can be extracted. In this case, it is possible to take all of the information the unwind codes provide and break it down into a definition of the stack frame layout for a function. This can be accomplished by processing the unwind codes in the order that they would be executed rather than the order that they appear in the array. There's one important thing to keep in mind when doing this. Since unwind information can be chained, it is a requirement that the full chain of unwind codes be processed in execution order. This can be accomplished by walking the chain of unwind information and building an execution order list of all of the unwind codes.
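Building the execution order list can be sketched as follows. The dictionary layout (a "codes" list plus a "chained" link) and the sample entries are hypothetical stand-ins for parsed UNWIND_INFO structures. The sketch assumes that the entry reached through the chain describes operations that executed earlier than those of the entry pointing to it, so the codes are gathered in unwind order across the whole chain and then reversed once.

```python
def execution_order(unwind_info):
    """Flatten a chain of unwind info into one list in prologue execution order."""
    unwind_order = []
    node = unwind_info
    while node is not None:
        # Codes are stored in reverse execution order within each entry.
        unwind_order.extend(node["codes"])
        # Follow the chain toward the primary entry (None ends the chain).
        node = node.get("chained")
    unwind_order.reverse()
    return unwind_order

# Hypothetical two-entry chain: the secondary entry chains to the primary one.
primary = {"codes": ["ALLOC_SMALL size=0x20", "PUSH_NONVOL rbx"], "chained": None}
secondary = {"codes": ["SAVE_NONVOL rsi offset=0x30"], "chained": primary}

ordered = execution_order(secondary)
for code in ordered:
    print(code)
```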
Once the execution order list of unwind codes is collected, the next step is to simply enumerate each code, checking to see what operation it performs and building out the stack frame across each iteration. Prior to enumerating each code, the state of the stack pointer should be initialized to 0 to indicate an empty stack frame. As data is allocated on the stack, the stack pointer should be adjusted by the appropriate amount. The unwind operations that directly affect the stack pointer are PUSH_NONVOL, which moves it by 8 bytes for the pushed register, and ALLOC_SMALL and ALLOC_LARGE, which move it by the size of the allocation.
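The stack pointer bookkeeping can be sketched as below. The tuple representation of the unwind codes is an illustrative choice, and only a subset of the example's SAVE_NONVOL codes is listed for brevity. The sketch counts allocated bytes upward from 0 and treats only PUSH_NONVOL and the ALLOC operations as moving the stack pointer.

```python
def frame_size(codes):
    """Walk unwind codes in execution order and return the bytes of stack allocated."""
    allocated = 0  # stack pointer state; 0 represents an empty frame
    for op, info in codes:
        if op == "PUSH_NONVOL":
            allocated += 8                # one 8-byte register slot
        elif op in ("ALLOC_SMALL", "ALLOC_LARGE"):
            allocated += info["size"]     # explicit stack allocation
        # SAVE_NONVOL and SET_FPREG do not move the stack pointer
    return allocated

# The _resetstkoflw prologue from the example, in execution order
# (only one of the seven SAVE_NONVOL codes is shown).
PROLOGUE = [
    ("PUSH_NONVOL", {"register": "rbp"}),
    ("ALLOC_LARGE", {"size": 0xB0}),
    ("SET_FPREG", {"register": "rbp", "offset": 0x20}),
    ("SAVE_NONVOL", {"register": "rbx", "offset": 0xC0}),
]

total = frame_size(PROLOGUE)
print(hex(total))
```

Under these assumptions the frame comes to 0xB8 bytes: 8 for the pushed rbp plus the 0xB0-byte allocation.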
As the unwind codes are enumerated, it is also possible to annotate the different locations on the stack where non-volatile registers are preserved. For instance, given the example unwind information above, it is known that the R15 register is preserved at [rsp + 0x98]. Therefore, we can annotate this location as [rsp + SavedR15].
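A minimal sketch of deriving these labels from the SAVE_NONVOL codes follows. The function and dictionary names are illustrative. Because this function establishes rbp at rsp + 0x20, the sketch also computes the rbp-relative displacement by subtracting the frame offset, which is the displacement the prologue's mov instructions actually use.

```python
FRAME_OFFSET = 0x20  # "Frame offset" from the example unwind information

# SAVE_NONVOL codes from the example: register -> offset from rsp
SAVED = {
    "r15": 0x98, "r14": 0xA0, "r13": 0xA8, "r12": 0xD8,
    "rdi": 0xD0, "rsi": 0xC8, "rbx": 0xC0,
}

def annotate_saved_registers(saved, frame_offset):
    """Map each saved register to a label and its rsp- and rbp-relative offsets."""
    labels = {}
    for reg, rsp_offset in saved.items():
        labels["Saved" + reg.upper()] = {
            "rsp_offset": rsp_offset,
            "rbp_offset": rsp_offset - frame_offset,  # rbp = rsp + frame_offset
        }
    return labels

labels = annotate_saved_registers(SAVED, FRAME_OFFSET)
for name, offsets in labels.items():
    print(f"{name}: [rsp+{offsets['rsp_offset']:#x}] = [rbp+{offsets['rbp_offset']:#x}]")
```

For R15 this gives [rsp+0x98] and [rbp+0x78], matching the mov [rbp+78h], r15 instruction in the function's prologue.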
Beyond annotating preserved register locations on the stack, we can also annotate the instructions that perform operations that affect the stack. For instance, when a non-volatile register is pushed, such as through push rbp, we can annotate the instruction that performs that operation as preserving rbp on the stack. The location of the instruction that's associated with the operation can be determined by taking the BeginAddress associated with the unwind information and adding to it the CodeOffset attribute of the UNWIND_CODE that is being processed. It is important to note, however, that the CodeOffset attribute actually points to the first byte of the instruction immediately following the one that performs the actual operation, so it is necessary to backtrack in order to determine the start of the instruction that actually performs the operation.
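One way to do the backtracking, sketched here under stated assumptions, is to look up the instruction that ends at the computed address in a sorted list of instruction start addresses. That list would normally come from a disassembler; here it is copied by hand from the _resetstkoflw prologue. The image base of 0x100000000 is an assumption inferred from the .text:100006E50 style addresses, and the function name is illustrative.

```python
import bisect

IMAGE_BASE = 0x100000000   # assumption: inferred from the listing's addresses
BEGIN_ADDRESS = 0x6E50     # BeginAddress (RVA) of _resetstkoflw

# Instruction start addresses of the prologue, as a disassembler would report them.
INSTRUCTION_STARTS = [
    0x100006E50, 0x100006E52, 0x100006E59, 0x100006E5E, 0x100006E65,
    0x100006E6C, 0x100006E73, 0x100006E7A, 0x100006E81, 0x100006E88,
]

def instruction_for_code(code_offset, starts=INSTRUCTION_STARTS):
    """Return the start address of the instruction that an unwind code describes."""
    # CodeOffset points just past the instruction that performed the operation.
    end = IMAGE_BASE + BEGIN_ADDRESS + code_offset
    # Backtrack to the last instruction that starts before that address.
    index = bisect.bisect_left(starts, end) - 1
    if index < 0:
        raise ValueError("code offset precedes the first known instruction")
    return starts[index]

print(hex(instruction_for_code(0x02)))  # PUSH_NONVOL rbp
print(hex(instruction_for_code(0x0E)))  # SET_FPREG rbp
```

With the example data, CodeOffset 0x02 resolves to the push rbp at 100006E50 and CodeOffset 0x0E resolves to the lea rbp at 100006E59.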
As a result of this analysis, one can take the prologue of the _resetstkoflw function and automatically convert it from:
    .text:100006E50 push rbp
    .text:100006E52 sub rsp, 0B0h
    .text:100006E59 lea rbp, [rsp+0B0h+var_90]
    .text:100006E5E mov [rbp+0A0h], rbx
    .text:100006E65 mov [rbp+0A8h], rsi
    .text:100006E6C mov [rbp+0B0h], rdi
    .text:100006E73 mov [rbp+0B8h], r12
    .text:100006E7A mov [rbp+88h], r13
    .text:100006E81 mov [rbp+80h], r14
    .text:100006E88 mov [rbp+78h], r15
to a version with better annotation:
    .text:100006E50 push rbp                        ; SavedRBP
    .text:100006E52 sub rsp, 0B0h
    .text:100006E59 lea rbp, [rsp+20h]
    .text:100006E5E mov [rbp+0A0h], rbx             ; SavedRBX
    .text:100006E65 mov [rbp+98h+SavedRSI], rsi     ; SavedRSI
    .text:100006E6C mov [rbp+98h+SavedRDI], rdi     ; SavedRDI
    .text:100006E73 mov [rbp+98h+SavedR12], r12     ; SavedR12
    .text:100006E7A mov [rbp+98h+SavedR13], r13     ; SavedR13
    .text:100006E81 mov [rbp+98h+SavedR14], r14     ; SavedR14
    .text:100006E88 mov [rbp+98h+SavedR15], r15     ; SavedR15
While such annotation may not be entirely useful for understanding the behavior of the binary, it at least simplifies the process of understanding the layout of the stack.