Informative Information for the Uninformed
Subverting PatchGuard

PatchGuard currently possesses a formidable array of defensive mechanisms that are aimed at making it difficult to reverse engineer and debug. Given that Microsoft does not currently have the infrastructure in place to enforce PatchGuard in hardware, this is arguably the best that Microsoft will be able to do in the short term. They are only able to build a system based on obfuscation and anti-debugging techniques in an attempt to make it difficult for third parties to detect, disable, or bypass it.

There are other classes of software that seek to create defenses similar to those of PatchGuard. However, these other classes usually have far more nefarious purposes than preventing third parties from patching the kernel. Specifically, anti-debugging, anti-reverse-engineering, and self-decrypting code have often been used to hide viruses, rootkits, and other malicious software on compromised systems. Although Microsoft may have intended the defensive mechanisms employed by PatchGuard for an (arguably) good cause, these same anti-debugging, anti-detection, and anti-reverse-engineering techniques that protect PatchGuard from attack by third party drivers can also be subverted to protect custom code from detection or analysis by anti-virus or anti-rootkit software. In this respect, Microsoft has created a double-edged sword, as the same elaborate obfuscation and anti-debugging schemes that guard PatchGuard against third party software can also be used to guard malicious software from system security software.

It is in fact quite possible to subvert PatchGuard version 2's myriad defenses to execute custom code instead of PatchGuard's system integrity check routine. While doing so could not exactly be called trivial, it is far from impossible.

In order to subvert PatchGuard to do one's bidding, one must first catch PatchGuard in the act, so to speak.
To accomplish this, the author recommends turning to one of the proposed bypass techniques as a starting place. For example, consider the first proposed bypass technique, wherein the author recommends hooking _C_specific_handler to intercept control of execution at the exception that the PatchGuard DPC routine generates in order to trigger execution of the system integrity check. An implementation of this bypass technique provides direct access to the machine context inside the PatchGuard DPC routine, and this machine context contains the information necessary to locate the PatchGuard system integrity check routine. Since the objective is to repurpose the system integrity check routine to execute custom code, this is a good starting point.

However, determining the location of the system integrity check routine is much more involved than simply skipping over PatchGuard's checks entirely; the pointer to the routine in question is encrypted based on the original arguments to the DPC (the Dpc and DeferredContext arguments). Additionally, the original arguments to the PatchGuard DPC have at this point already been moved from registers to the stack and obfuscated (rotated left or right by a magic constant). As the original contents of the argument registers are deliberately overwritten by the DPC routine before the access violation is triggered, there is no choice other than to somehow fish the DPC arguments out of the caller's stack. This is actually somewhat of a challenge, given that such an approach must work for all kernel versions, and must also work for all of the different DPC permutations. Since this set of possibilities represents an unmaintainably large number of routines to reverse engineer in order to determine rotate obfuscation values and stack offsets, a more generalized approach to locating the original arguments on the stack must be taken.
In order to create such a generic approach, one must take a closer look at the first few instructions of each DPC routine (leading up to the intentional access violation). Although PatchGuard has put into place several barriers to prevent easy retrieval of the original arguments from this context, there might be a pattern or weakness that could be exploited in order to recover the arguments in question. The basic things common to each DPC routine, when it comes to the machine context at the time of the access violation, are:
Although the situation may initially appear grim, it is in fact still possible to locate the Dpc argument given the above information; all that is needed is a bit of work (and getting one's hands dirty with some ugly tricks). Specifically, it is possible to search the stack frame of the DPC routine for the Dpc argument with a brute-force attack. This isn't exactly elegant, but it gets the job done. There are a number of hints that can be used to increase the chance of successfully finding the real Dpc argument on the stack:
By repeatedly applying these rules to every applicable location within a reasonable distance upward from the rsp value at the time of the exception, it is possible to recover the Dpc argument with virtual certainty. A search region of, say, 256 bytes is sufficient, although the exact size can be greater; the only requirement is that the search region completely contains the local variable space of the DPC routine with the largest local variable space. In the author's experience, this technique works quite reliably, even though one might intuit that a search of an unknown stack frame would be prone to failing or turning up false positives.

After both the Dpc and DeferredContext arguments to the PatchGuard DPC routine have been recovered, it is a simple matter of analyzing how PatchGuard invokes the system integrity check in order to determine how to locate it in memory. This has been discussed previously, and it amounts to the following set of statements:
    ULONG64 DecryptionKey, PatchGuardCheckFunction;

    DecryptionKey            = *(PULONG64)(Dpc + 0x40);
    PatchGuardCheckFunction  = DecryptionKey ^ DeferredContext;
    PatchGuardCheckFunction |= 0xFFFFF80000000000;

At this point, it's almost possible to replace the system integrity check routine with custom code. However, there is still the matter of the pesky self-decrypting stub that runs before the check function. Because the DPC routine's exception handler rewrites the first instruction of the stub before it is executed, one doesn't have a whole lot of choice but to implement at least a very basic version of the decryption stub for the system integrity check routine. Recall that the first instruction in the stub is set to the following:
    lock xor qword ptr [rcx],rdx

Looking at the prototype for the decryption stub, rcx corresponds to the address of the decryption stub itself, and rdx corresponds to the decryption key. Since this instruction modifies both itself and the next instruction (the instruction is four bytes long and the xor alters eight bytes), the replacement code for the system integrity check routine must allow the first instruction to be the above xor instruction, and must allow for the second instruction (at a minimum) to be initially xor-obfuscated. For simplicity's sake, the author has chosen to implement the simplest possible solution to this conundrum, which is to make the second instruction in the replacement code a duplicate of the first instruction. In other words, the replacement code would read as follows:
    ;
    ; This instruction is forced on us by PatchGuard,
    ; and cannot be altered; it is rewritten at runtime.
    ;
    lock xor qword ptr [rcx],rdx

    ;
    ; The next instruction, conveniently four bytes
    ; long, re-encrypts itself by xoring the first
    ; eight bytes of the decryption stub (which includes
    ; the second instruction) by the decryption key a
    ; second time;
    ;
    lock xor qword ptr [rcx],rdx

    ;
    ; (... any custom code may follow here ...)
    ;

As noted previously, after specially constructing the replacement code, it is necessary to initially encrypt the second instruction (as it will be immediately decrypted by the first instruction). This must be done before control is returned to PatchGuard. After the custom code is configured and the second instruction is encrypted, all that remains is to copy the custom code over the PatchGuard decryption stub. When this is accomplished, the PatchGuard DPC's exception handler will invoke the supplied custom code instead of the system integrity check routine.

However, this is not really all that interesting due to the fact that PatchGuard utilizes a one-shot timer. The custom code that was substituted for the decryption stub will never be run again. To account for this fact, it would be prudent to place a call to queue a timer with an associated DPC routine (pointing to the DPC routine that PatchGuard selected at boot) within the custom code block.

At this point, it is possible to simply allow the normal exception dispatching process to continue (i.e. to resume _C_specific_handler), after which the custom code will be invoked instead of PatchGuard. In essence, PatchGuard has been not only disabled, but completely subverted to call customized code under the control of a third party driver instead of the system integrity check.

Still, the situation is less than optimal. Presently, there is still a hook in _C_specific_handler that is there for anyone to see (and recognize that someone has tampered with the kernel).
Additionally, the driver that was used to subvert PatchGuard in the first place is still loaded, which may also be a tell-tale sign that someone has done something unsavory to the kernel. These problems are also solvable, however. It turns out that after PatchGuard has been subverted, it is safe to unhook from _C_specific_handler, and then simply call back into _C_specific_handler after the hook is removed.

Furthermore, everything necessary to run the subverted system integrity check routine could even reside within PatchGuard's own internal data structures; for example, one could simply utilize the extra space after the custom code, where the decryption stub and PatchGuard check routine would normally reside, as a parameter block. This is especially convenient, as the custom code block is given a pointer to itself in rcx (the first argument), and it is easy to add a known constant value to that pointer in order to retrieve the parameter block for the custom code.

At this point, all of the code and data necessary for the custom code that the driver has subverted PatchGuard with is located in dynamically allocated memory. Given this, the original driver is no longer needed and can even be unloaded (so as to further disguise the fact that any alterations to the kernel have taken place). After the driver has been unloaded, the only traces of the alterations that have taken place would be the unloaded module list (easily modified), and the rewritten PatchGuard system integrity routine itself (which could easily be bolstered to be self-decrypting, with a differing encryption key, in order to make for an extremely difficult-to-locate target in memory).

The end result is that PatchGuard has been disabled, and in its place, arbitrary custom code is periodically executed. Furthermore, no modifications or patches to kernel code or global data are present and no suspicious drivers (or even suspicious extraneous memory allocations) remain present in memory.
In essence, the traces of the fact that PatchGuard has been subverted would be visible only to someone (or something) that knows how to locate and disable PatchGuard.

The supplied example program for subverting PatchGuard is fairly simple, and it does not utilize all of the defensive technologies employed by PatchGuard. For instance, it does not change the decryption key on every execution, nor does it follow through with keeping the entire code block encrypted except just before execution. These features could be easily added, however, and would greatly increase the difficulty of locating the subverted PatchGuard code in memory.
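
As a small illustration of the arithmetic discussed in this section, the following user-mode C sketch models two things on an ordinary byte buffer: the doubled xor instruction at the start of the replacement code, and the statements that recover the check routine pointer. It is a simulation under stated assumptions, not part of the supplied example program. The names XOR_INSN, xor_qword, build_stub, and decode_check_pointer are hypothetical, and the byte encoding F0 48 31 11 for the instruction was assembled by hand for this sketch rather than taken from the text.

```c
#include <stdint.h>
#include <string.h>

/* Assumed (hand-assembled) encoding of `lock xor qword ptr [rcx],rdx`. */
static const uint8_t XOR_INSN[4] = { 0xF0, 0x48, 0x31, 0x11 };

/* Models the instruction's effect: xor the eight bytes at p with key. */
static void xor_qword(uint8_t *p, uint64_t key)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);   /* load the eight bytes */
    v ^= key;
    memcpy(p, &v, sizeof v);   /* store them back */
}

/*
 * Builds the first eight bytes of the replacement code as they would look
 * just before the first instruction runs: the first instruction in
 * plaintext (it is rewritten at runtime), the second one xor-obfuscated.
 */
static void build_stub(uint8_t stub[8], uint64_t key)
{
    memcpy(stub, XOR_INSN, 4);
    memcpy(stub + 4, XOR_INSN, 4);
    xor_qword(stub, key);        /* obfuscate both instructions */
    memcpy(stub, XOR_INSN, 4);   /* first instruction restored to plaintext */
}

/*
 * Mirrors the pointer-recovery statements from the text, operating on
 * caller-supplied values instead of reading a real DPC object.
 */
static uint64_t decode_check_pointer(uint64_t decryption_key,
                                     uint64_t deferred_context)
{
    return (decryption_key ^ deferred_context) | 0xFFFFF80000000000ULL;
}
```

Calling xor_qword once on a buffer prepared by build_stub leaves the second instruction in plaintext, and calling it a second time returns all eight bytes to their prepared state, which is the property the duplicated instruction relies on.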