|Informative Information for the Uninformed|
Function hooking is the process of intercepting calls to a given function by redirecting those calls to an alternative function. The concept has been around for quite some time, and it is unclear who originally presented the idea. A number of libraries and papers exist that facilitate the hooking of functions. With respect to local kernel-mode backdoors, function hooking is an easy and reliable method of creating a backdoor. Functions can be hooked in a few different ways. One of the most common techniques involves overwriting the prologue of the function to be hooked with an architecture-specific jump instruction that transfers control to an alternative function elsewhere in memory. This is the approach taken by Microsoft's Detours library. While prologue hooks are conceptually simple, implementing them properly requires quite a bit of code.
To implement a prologue hook in a portable and reliable manner, it is often necessary to use a disassembler that can determine the size, in bytes, of individual instructions. This is because the prologue overwrite replaces the first few bytes of the function to be hooked with a control transfer instruction (typically a jump). On 32-bit Intel processors, the jump used for the overwrite can take one of three operand types: a register, a 32-bit relative offset, or a memory operand. The operand type determines the size of the jump instruction: 2 bytes, 5 bytes, and 6 bytes, respectively. The disassembler makes it possible to copy the first n instructions from the function's prologue prior to performing the overwrite. The value of n is determined by disassembling each instruction in the prologue until the number of bytes disassembled is greater than or equal to the number of bytes that will be overwritten when hooking the function.
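The way n is determined can be sketched in C. This is a minimal illustration, not how Detours or any other library does it: `toy_insn_len` and `prologue_copy_len` are hypothetical names, and the length decoder is a deliberately tiny stand-in that recognizes only a handful of 32-bit x86 prologue encodings. A real implementation would need a complete disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy length decoder (hypothetical): knows only a few 32-bit x86 encodings
 * often seen in prologues. Returns 0 for anything it does not recognize. */
static size_t toy_insn_len(const uint8_t *p, size_t avail)
{
    if (avail == 0)
        return 0;
    switch (p[0]) {
    case 0x55: return 1;                      /* push ebp  */
    case 0x90: return 1;                      /* nop       */
    case 0x6A: return avail >= 2 ? 2 : 0;     /* push imm8 */
    case 0x68: return avail >= 5 ? 5 : 0;     /* push imm32 */
    case 0x8B: /* mov r32, r/m32: register-to-register form only */
        return (avail >= 2 && (p[1] & 0xC0) == 0xC0) ? 2 : 0;
    case 0x83: /* op r/m32, imm8: register form only, e.g. sub esp, imm8 */
        return (avail >= 3 && (p[1] & 0xC0) == 0xC0) ? 3 : 0;
    default:   return 0;
    }
}

/* Walk whole instructions until at least patch_len bytes are covered.
 * Returns the number of bytes to save, or 0 if decoding fails first. */
static size_t prologue_copy_len(const uint8_t *code, size_t avail,
                                size_t patch_len)
{
    size_t total = 0;

    assert(code != NULL);
    while (total < patch_len) {
        size_t len = toy_insn_len(code + total, avail - total);
        if (len == 0)
            return 0;   /* unknown instruction or ran out of bytes */
        total += len;
    }
    return total;
}
```

For the prologue `push ebp; mov ebp, esp; sub esp, 0x10` and a 5-byte overwrite, the walk covers 1, then 3, then 6 bytes, so 6 bytes must be saved even though only 5 are overwritten.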
The reason the first n instructions must be saved in their entirety is to make it possible for the original function to be called by the hook function. In order to call the original version of the function, a small stub of code must be generated that will execute the first n instructions of the function's original prologue followed by a jump to instruction n + 1 in the original function's body. This stub of code has the effect of allowing the original function to be called without it being diverted by the prologue overwrite. This method of implementing function prologue hooks is used extensively by Detours and other hooking libraries.
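The stub described above can be sketched as follows. This is an illustrative model that works on plain byte buffers with simulated 32-bit addresses rather than on live code. The names `emit_jmp_rel32` and `build_trampoline` are hypothetical. The sketch also assumes the saved instructions are position independent: a copied relative branch would need its offset adjusted, which is not handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emit "jmp rel32" (opcode 0xE9) into dst. dst is assumed to reside at
 * address dst_addr; the jump transfers control to target_addr. */
static void emit_jmp_rel32(uint8_t *dst, uint32_t dst_addr,
                           uint32_t target_addr)
{
    /* The offset is relative to the end of the 5-byte instruction. */
    uint32_t rel = target_addr - (dst_addr + 5u);

    dst[0] = 0xE9;
    dst[1] = (uint8_t)(rel);
    dst[2] = (uint8_t)(rel >> 8);
    dst[3] = (uint8_t)(rel >> 16);
    dst[4] = (uint8_t)(rel >> 24);
}

/* Build a stub holding the first n bytes of the original prologue followed
 * by a jump back to the first instruction that was not overwritten.
 * Returns the stub size in bytes, or 0 if the buffer is too small. */
static size_t build_trampoline(uint8_t *stub, size_t stub_cap,
                               uint32_t stub_addr,
                               const uint8_t *func, uint32_t func_addr,
                               size_t n)
{
    assert(stub != NULL && func != NULL);
    if (n == 0 || stub_cap < n + 5)
        return 0;

    memcpy(stub, func, n);                      /* saved prologue bytes   */
    emit_jmp_rel32(stub + n,                    /* jump back to func + n  */
                   stub_addr + (uint32_t)n,
                   func_addr + (uint32_t)n);
    return n + 5;
}
```

A hook function that wants to invoke the original behavior would call the stub instead of the hooked function, which is what keeps the call from being diverted again.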
Recent versions of Windows, such as XP SP2 and Vista, include image files that come with a more elegant way of hooking a function with a function prologue overwrite. In fact, these images have been built with a compiler enhancement that was designed specifically to improve Microsoft's ability to hook its own functions during runtime. The enhancement involves creating functions with a two byte no-op instruction, such as a mov edi, edi, as the first instruction of a function's prologue. In addition to having this two byte instruction, the compiler also prefixes 5 no-op instructions to the function itself. The two byte no-op instruction provides the necessary storage for a two byte relative short jump instruction to be placed on top of it. The relative short jump, in turn, can then transfer control into another relative jump instruction that has been placed in the 5 bytes that were prefixed to the function itself. The end result is a more deterministic way of hooking a function using a prologue overwrite that does not rely on a disassembler.
A common question is why a two byte no-op instruction was used rather than two individual no-op instructions. The answer for this has two parts. First, a two byte no-op instruction can be overwritten in an atomic fashion whereas other prologue overwrites, such as a 5 byte or 6 byte overwrite, cannot. The second part has to do with the fact that having a two byte no-op instruction prevents race conditions associated with any thread executing code from within the set of bytes that are overwritten when the hook is installed. This race condition is common to any type of function prologue overwrite.
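The byte layout of this two-stage overwrite can be modeled on a buffer. The sketch below is an assumption-laden illustration, not Microsoft's implementation: `hotpatch_install` is a hypothetical name, addresses are simulated 32-bit values, and the padding is assumed to be unused. On a live system the final two-byte store would have to be a single atomic write, whereas this model simply assigns two bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* pad points at the 5 padding bytes that precede the function; the
 * function entry (assumed to sit at address func_addr) is at pad + 5.
 * Returns 1 on success, 0 if the entry is not "mov edi, edi" (8B FF). */
static int hotpatch_install(uint8_t *pad, uint32_t func_addr,
                            uint32_t hook_addr)
{
    uint8_t *entry;
    uint32_t rel;

    assert(pad != NULL);
    entry = pad + 5;
    if (entry[0] != 0x8B || entry[1] != 0xFF)
        return 0;

    /* Step 1: place "jmp rel32" in the padding. The instruction ends
     * exactly at func_addr, so the offset is hook_addr - func_addr. */
    rel = hook_addr - func_addr;
    pad[0] = 0xE9;
    pad[1] = (uint8_t)(rel);
    pad[2] = (uint8_t)(rel >> 8);
    pad[3] = (uint8_t)(rel >> 16);
    pad[4] = (uint8_t)(rel >> 24);

    /* Step 2: replace the two byte no-op with "jmp short -7" (EB F9),
     * which lands on the long jump written in step 1. */
    entry[0] = 0xEB;
    entry[1] = 0xF9;
    return 1;
}
```

Writing the long jump first means the padding already holds a valid instruction before any thread can be redirected into it by the short jump.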
To better understand this race condition, consider what might happen if the prologue of a function had two single byte no-op instructions. Prior to this function being hooked, a thread executes the first no-op instruction. In between the execution of this first no-op and the second no-op, the function in question is hooked in the context of a second thread and the first two bytes are overwritten with the bytes of a relative short jump instruction, such as 0xeb and 0xf9. After the prologue overwrite occurs, the first thread begins executing what was originally the second no-op instruction. However, now that the function has been hooked, the no-op instruction may have been changed from 0x90 to 0xf9, which the processor decodes as an entirely different instruction (stc) rather than as part of the jump. This may have disastrous effects depending on the context in which the hooked function is executing. While this race condition may seem unlikely, it is nevertheless feasible and can therefore directly impact the reliability of any solution that uses prologue overwrites in order to hook functions.
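One way to reason about this exposure is to count the instruction boundaries that fall strictly inside the overwritten bytes, since each one is an offset at which a thread could resume into modified code. The helper below is a hypothetical sketch (`count_interior_boundaries` is not taken from any library). It does not model the separate problem of a non-atomic write being observed halfway through.

```c
#include <assert.h>
#include <stddef.h>

/* lens[] holds the sizes of the instructions at the start of a prologue.
 * Returns how many instruction boundaries lie strictly between offset 0
 * and patch_len, i.e. how many places a thread could be about to execute
 * from when the first patch_len bytes are replaced. */
static size_t count_interior_boundaries(const size_t *lens, size_t count,
                                        size_t patch_len)
{
    size_t offset = 0;
    size_t boundaries = 0;

    assert(lens != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        offset += lens[i];
        if (offset >= patch_len)
            break;          /* this boundary is at or past the patch end */
        boundaries++;
    }
    return boundaries;
}
```

Two single byte no-ops under a 2-byte overwrite give one interior boundary (offset 1), while a single two byte no-op gives none. A typical prologue with instruction sizes 1, 2, and 3 under a 5-byte overwrite gives two.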
Category: Type I
Origin: The concept of patching code has "existed since the dawn of digital computing".
Capabilities: Kernel-mode code execution
Considerations: The reliability of a function prologue hook is directly related to the reliability of the disassembler used and the number of bytes that are overwritten in a function prologue. If the two byte no-op instruction is not present, then it is unlikely that a function prologue overwrite can be made multiprocessor safe. Likewise, if a disassembler computes instruction sizes that differ from what the processor actually decodes, then the function prologue hook may fail, leading to an unexpected crash of the system. One other point worth mentioning is that authors of hook functions must be careful not to introduce instability into the operating system by failing to properly sanitize and check the parameters passed to the hooked function. There have been many examples where legitimate software has hooked functions without taking these considerations into account.
Covertness: At the time of this writing, the use of function prologue overwrites is not considered covert. It is trivial for tools, such as Joanna Rutkowska's System Virginity Verifier, to compare the in-memory version of system images with the on-disk versions in an effort to detect in-memory alterations. The Windows Debugger (windbg) will also make an analyst aware of differences between in-memory code segments and their on-disk counterparts.
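The core of such a comparison can be sketched as a byte-wise diff between a code section as loaded and the same section as read from disk. This is only an illustration of the idea, not how System Virginity Verifier or windbg work internally: `struct diff_range` and `find_modified_ranges` are hypothetical names, and a real tool would first have to account for legitimate differences such as relocation fixups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of bytes that differs between the two copies. */
struct diff_range {
    size_t offset;
    size_t length;
};

/* Compare len bytes of the in-memory copy against the on-disk copy.
 * Records up to out_cap differing runs in out and returns the total
 * number of runs found (which may exceed out_cap). */
static size_t find_modified_ranges(const uint8_t *mem, const uint8_t *disk,
                                   size_t len,
                                   struct diff_range *out, size_t out_cap)
{
    size_t count = 0;
    size_t i = 0;

    assert(mem != NULL && disk != NULL);
    while (i < len) {
        size_t start;

        if (mem[i] == disk[i]) {
            i++;
            continue;
        }
        start = i;
        while (i < len && mem[i] != disk[i])
            i++;                            /* extend the differing run */
        if (count < out_cap) {
            out[count].offset = start;
            out[count].length = i - start;
        }
        count++;
    }
    return count;
}
```

A two byte run at the very start of a function, or a five byte run immediately before it, is the kind of pattern that a prologue overwrite leaves behind.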