Uninformed: Informative Information for the Uninformed

Vol 4» 2006.Jun


Functions

One of the most obvious uses for the information stored in the exception directory is discovering all of the non-leaf functions in a binary. This is cool because it works regardless of whether or not you actually have symbols for the binary, which makes it an easy technique for identifying the majority of the functions in a binary. The process is simply to enumerate the array of IMAGE_RUNTIME_FUNCTION_ENTRY structures stored within the exception directory. In the common case, the BeginAddress attribute of an entry marks the starting point of a non-leaf function. There's a catch, though: not all of the runtime function entries are actually associated with the entry point of a function. Entries can also be associated with portions of a function body where stack modifications are deferred until necessary. In these cases, the unwind information associated with the runtime function entry is chained to that of another runtime function entry.
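The enumeration step can be sketched in a few lines of Python. This is a minimal sketch, assuming the raw bytes of the exception directory have already been located and read out of the image; the names RuntimeFunction and parse_exception_directory are hypothetical and exist only for this illustration. Each entry is treated as three little-endian 32-bit RVAs.

```python
import struct
from typing import List, NamedTuple


class RuntimeFunction(NamedTuple):
    """One IMAGE_RUNTIME_FUNCTION_ENTRY: three 32-bit RVAs."""
    begin_address: int
    end_address: int
    unwind_info_address: int


# Little-endian layout: BeginAddress, EndAddress, UnwindInfoAddress.
ENTRY = struct.Struct("<III")


def parse_exception_directory(data: bytes) -> List[RuntimeFunction]:
    """Split the raw exception directory bytes into runtime function entries.

    `data` is assumed to hold only the directory contents; any trailing
    bytes that do not form a whole 12-byte entry are ignored.
    """
    usable = len(data) - len(data) % ENTRY.size
    return [
        RuntimeFunction(*ENTRY.unpack_from(data, offset))
        for offset in range(0, usable, ENTRY.size)
    ]
```

The begin_address of each returned entry is only a candidate function start. Entries whose unwind information is chained still have to be filtered out.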

The chaining of runtime function entries is documented as being indicated through the UNW_FLAG_CHAININFO flag in the Flags attribute of the UNWIND_INFO structure. If this flag is set, the area of memory immediately following the last UNWIND_CODE in the UNWIND_INFO structure is an IMAGE_RUNTIME_FUNCTION_ENTRY structure (the array of unwind codes is padded to an even number of entries, so the chained entry remains aligned). The UnwindInfoAddress of this structure indicates the chained unwind information. Aside from this, chaining can also be indicated through an undocumented flag that is stored in the least-significant bit of the UnwindInfoAddress. If the least-significant bit is set, the runtime function entry is directly chained to the IMAGE_RUNTIME_FUNCTION_ENTRY structure found at the RVA conveyed by the UnwindInfoAddress attribute with the least-significant bit masked off. Chaining can be indicated in this fashion because unwind information is required to be four-byte aligned, which leaves the low bits of a valid UnwindInfoAddress unused.
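Both forms of chaining can be checked with a short routine. This is a rough sketch rather than a definitive implementation. It assumes the image is available as a flat buffer that can be indexed directly by RVA (as when the file has been mapped the way the loader would map it), and the function name chained_parent is hypothetical. The UNWIND_INFO layout used here follows the documented x64 format: the Flags occupy the upper five bits of the first byte, the count of unwind codes is the third byte, and the two-byte unwind codes begin at offset 4.

```python
import struct
from typing import Optional, Tuple

UNW_FLAG_CHAININFO = 0x4

# BeginAddress, EndAddress, UnwindInfoAddress as little-endian 32-bit RVAs.
RUNTIME_FUNCTION = struct.Struct("<III")


def chained_parent(image: bytes,
                   unwind_info_address: int) -> Optional[Tuple[int, int, int]]:
    """Return the runtime function entry this one is chained to, or None.

    `image` is assumed to be indexable by RVA (an assumption of this sketch).
    """
    if unwind_info_address & 1:
        # Undocumented form: with the low bit masked off, the value is the
        # RVA of another runtime function entry.
        return RUNTIME_FUNCTION.unpack_from(image, unwind_info_address & ~1)

    # Documented form: Version is the low 3 bits, Flags the upper 5 bits.
    flags = image[unwind_info_address] >> 3
    if not flags & UNW_FLAG_CHAININFO:
        return None

    # The chained entry follows the unwind code array, whose length is
    # rounded up to an even number of 2-byte slots.
    count_of_codes = image[unwind_info_address + 2]
    padded_count = (count_of_codes + 1) & ~1
    parent_offset = unwind_info_address + 4 + 2 * padded_count
    return RUNTIME_FUNCTION.unpack_from(image, parent_offset)
```

A return value of None means the unwind information is not chained. Otherwise the returned tuple is the entry to follow next, and its own UnwindInfoAddress may be checked the same way.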

With chaining in mind, it is safe to assume that a runtime function entry is associated with the entry point of a function if its unwind information is not chained. This makes it possible to deterministically identify the entry points of all of the non-leaf functions. From there, it should be possible to identify the leaf functions through the calls that non-leaf functions make to them. This requires code flow analysis, though.
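The filtering rule described above amounts to keeping the BeginAddress of every entry whose unwind information is not chained. A minimal sketch follows, assuming the entries are available as (BeginAddress, EndAddress, UnwindInfoAddress) tuples and that the caller supplies an is_chained predicate covering both chaining forms. Both function_entry_points and is_chained are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# (BeginAddress, EndAddress, UnwindInfoAddress), all RVAs.
Entry = Tuple[int, int, int]


def function_entry_points(entries: Iterable[Entry],
                          is_chained: Callable[[int], bool]) -> List[int]:
    """Return the sorted RVAs of entries whose unwind info is not chained.

    `is_chained` receives an entry's UnwindInfoAddress and reports whether
    that unwind information is chained to another runtime function entry.
    """
    return sorted(
        {begin for begin, _end, unwind in entries if not is_chained(unwind)}
    )
```

For example, passing `lambda unwind: bool(unwind & 1)` as the predicate filters on the undocumented low-bit form only. A complete predicate would also inspect UNW_FLAG_CHAININFO in the referenced UNWIND_INFO structure.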