Uninformed: Informative Information for the Uninformed

Vol 6» 2007.Jan

Obfuscation of System Integrity Check Calls via Structured Exception Handling

Much like PatchGuard version 1, this version of PatchGuard utilizes structured exception handling (SEH) support as an integral part of the process used to kick off execution of the system integrity check routine. The means by which this is accomplished have changed somewhat since the last PatchGuard version. In particular, there are several layers of obfuscation in each PatchGuard DPC that are used to shroud the actual call to the integrity check routine. In an effort to make matters more difficult for would-be attackers, the exact details of the obfuscation vary among the ten DPCs that may be repurposed for use with PatchGuard. They all exhibit a common pattern, however, which can be described at a high level.

The first step in invoking the PatchGuard system integrity checking routine is a KTIMER with an associated KDPC (indicating a DPC callback routine to be called when the timer lapses). This timer is primed for single-shot execution in an interval on the order of several minutes (with a random fuzz factor delta applied to increase the difficulty of performing a classic egghunt style attack to locate the KTIMER in non-paged pool). The DPC routine indicated by the KDPC that is associated with PatchGuard's KTIMER is one of the set of ten legitimate DPC routines that may be repurposed for use with PatchGuard. This particular invocation of the DPC routine is distinguished from a legitimate system invocation of the same routine by the use of a deliberately invalid kernel pointer as one of the arguments to the DPC routine.

The prototype for a DPC routine is described by PKDEFERRED_ROUTINE:

    typedef
    VOID
    (*PKDEFERRED_ROUTINE)(
        IN struct _KDPC *Dpc,       // pointer to parent DPC
        IN PVOID DeferredContext,   // arbitrary context - assigned at DPC initialization
        IN PVOID SystemArgument1,   // arbitrary context - assigned when DPC is queued
        IN PVOID SystemArgument2    // arbitrary context - assigned when DPC is queued
        );

Essentially, a DPC is a callback routine with a set of user-defined context parameters whose interpretation is entirely up to the DPC routine itself. The standard use for context arguments in callback functions is to point to a larger structure which contains information necessary for the callback routine to function, and this is exactly how the ten DPC routines that can be used by PatchGuard regard the DeferredContext argument during legitimate execution. It is this usage of the DeferredContext argument which allows PatchGuard to trigger its execution for each of the ten DPC routines via an exception; PatchGuard arranges for a bogus DeferredContext value to be passed to the DPC routine when it is called. The first time that the DPC routine tries to dereference the DPC-specific structure referred to by DeferredContext, an exception occurs (which transfers control to the exception dispatching system, and eventually to PatchGuard's integrity check routine).

While this may seem simple at first, readers familiar with kernel mode programming should notice a couple of red flags in this description; normally, it is not possible to catch bogus memory references at DISPATCH_LEVEL or above with SEH (usually, one of the PAGE_FAULT_IN_NONPAGED_AREA or IRQL_NOT_LESS_OR_EQUAL bugchecks will be raised, depending on whether the bogus reference was to a reserved non-paged region or a paged-out pageable memory region). As a result, one would expect that PatchGuard would be putting the system at risk of randomly bugchecking by passing bogus pointers that are referenced at DISPATCH_LEVEL, the IRQL at which DPC routines run. However, PatchGuard has a couple of tricks up its metaphorical sleeve. It takes advantage of an implementation-specific detail of the current generation of x64 processors shipped by AMD in order to form kernel mode addresses that, while bogus, will not result in a page fault when referenced. Instead, these bogus addresses will result in a general protection fault, which eventually manifests itself as a STATUS_ACCESS_VIOLATION SEH exception. This path to raising a STATUS_ACCESS_VIOLATION exception does in fact work even at DISPATCH_LEVEL, thus allowing PatchGuard to provide safe bogus pointer values for the DeferredContext argument in order to trigger SEH dispatching without risking bringing the system down with a bugcheck.

Specifically, the implementation detail that PatchGuard relies upon relates to the 48-bit address space limitation in AMD's Hammer family of processors [4]. Current AMD processors only implement 48 bits of the 64-bit address space presented by the x64 architecture. This is accomplished by requiring that bits 63 through the most significant bit implemented by the processor (bit 47 on current AMD processors, which implement 48 bits) of any given address be set to either all ones or all zeros. An address of this form is defined to be a canonical address, or a well-formed address. Attempts to reference non-canonical addresses result in the processor immediately raising a general protection fault. This restriction on the address space essentially splits the usable address space into two halves; one region at the high end of the address space, and one region at the low end of the address space, with a no-mans-land in between the two. Windows utilizes this split to divide user mode from kernel mode, with the high end of the address space being reserved for kernel mode usage and the low end of the address space being reserved for user mode usage. PatchGuard takes advantage of this processor-mandated no-mans-land to create bogus pointer values that can be safely dereferenced and caught by SEH, even at high IRQLs.
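
The canonical address rule described above can be expressed as a short check in C. This is only an illustrative sketch: the helper name is_canonical_address and the IMPLEMENTED_VA_BITS constant are assumptions made for this example (they are not taken from the kernel), and the code assumes the 48-bit implementation discussed in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits, as described in the text. */
#define IMPLEMENTED_VA_BITS 48

/*
 * Hypothetical helper: returns true when bits 63 through 47 of addr are
 * either all zeros or all ones, i.e. when the address is canonical.
 */
static bool is_canonical_address(uint64_t addr)
{
    /* Keep bits 63..47 (17 bits) and discard everything below them. */
    uint64_t upper = addr >> (IMPLEMENTED_VA_BITS - 1);

    return upper == 0 || upper == 0x1FFFF;
}
```

Under this model, a kernel address such as 0xFFFFF80001003540 passes the check, while any value whose upper 17 bits are mixed falls into the no-mans-land and would raise a general protection fault when dereferenced.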

All of the DPC routines that are in the set which may be repurposed for use by PatchGuard dereference the DeferredContext argument as the first part of work that does not involve shuffling stack variables around. In other words, the first real work involved in any of the PatchGuard-enabled DPC routines is to touch a structure or variable pointed to by the DeferredContext argument. In the execution path of PatchGuard attempting to trigger a system integrity check, the DeferredContext argument is invalid, which eventually results in an access violation exception that is routed to the SEH registrations for the DPC routine. If one examines any of the PatchGuard DPC routines, it is clear that all of them have several overlapping SEH registrations (a construct that normally indicates several levels of nested try/except and try/finally constructs):

1: kd> !fnseh nt!ExpTimeRefreshDpcRoutine
nt!ExpTimeRefreshDpcRoutine Lc8 0A,02 [EU ] nt!_C_specific_handler (C)
> fffff8000100358a La (fffff8000112c830 -> fffff80001000000)
> fffff8000100358a Lc (fffff8000112c870 -> fffff80001003596)
> fffff8000100358a L16 (fffff8000112c8a0 -> fffff80001000000)
> fffff8000100358a L18 (fffff8000112c8f0 -> fffff800010035a2)

These SEH registrations are integral to the operation of PatchGuard's system integrity checks. The specifics of how each handler registration works differ for each DPC routine (in an attempt to frustrate attempts to reverse engineer them), but the general idea is that each registered handler performs a portion of the work necessary to set up a call to the PatchGuard integrity check routine. This work is divided up among four different exception/unwind handlers in an effort to make it difficult to understand what is going on, but ultimately the end result is the same for each of the DPC routines; one of the exception/unwind handlers ends up making a direct call to the system integrity check decryption stub in-memory. The decryption stub decrypts itself, and then decrypts the PatchGuard check routine, following with a transfer of control to the integrity check routine so that PatchGuard can inspect various protected registers, MSRs, and kernel images (such as the kernel itself) for unauthorized modification.

Additionally, all of the PatchGuard DPCs have been enhanced to obfuscate the DPC routine arguments in stack variables (whose exact stack displacement varies from DPC routine to DPC routine, and furthermore from kernel flavor to kernel flavor; for example, the multiprocessor and uniprocessor kernel builds have different stack frame layouts for many of the PatchGuard DPC routines). Recall that in the x64 calling convention, the first four arguments are passed via registers (rcx, rdx, r8, and r9 respectively). Each PatchGuard DPC routine takes special care to save away significant register arguments onto the stack (in an obfuscated form). Several of the arguments remain obfuscated until just before the decryption stub for the system integrity check routine is called, in an effort to make it difficult for third parties to patch into the middle of a particular DPC routine and easily access the original arguments to the DPC. This is presumably intended to make it more difficult to differentiate DPC invocations that perform the DPC routine's legitimate function from DPC invocations that will call PatchGuard. It also makes it difficult, though not impossible, for a third party to recover the original arguments to the DPC routine from the context of any of the exception handlers registered to the DPC routine in a generalized fashion.

This obfuscation of arguments can be clearly seen by disassembling any of the PatchGuard DPC routines. For example, when looking at ExpTimeRefreshDpcRoutine, one can see that the routine saves away the Dpc (rcx) and DeferredContext (rdx) arguments on the stack, rotates them by a magical constant (this constant differs for each DPC routine flavor and is used to further complicate the task of recovering the original DPC arguments in a generalized fashion), and then overwrites the original argument registers:

0: kd> uf nt!ExpTimeRefreshDpcRoutine
; On entry, we have the following:
; rcx -> Dpc
; rdx -> DeferredContext (if this is being called for
;                         PatchGuard, then DeferredContext
;                         is a bogus kernel pointer).
; r8  -> SystemArgument1
; r9  -> SystemArgument2
; r11 is used as an ephemeral frame pointer here.
; Ephemeral frame pointers are an x64-specific compiler
; construct, wherein a volatile register is used as a
; frame pointer until the first function call is made.
fffff800`01003540 4c8bdc          mov     r11,rsp
fffff800`01003543 4881ecc8000000  sub     rsp,0C8h
fffff800`0100354a 4889642460      mov     qword ptr [rsp+60h],rsp
; This DPC routine does not use SystemArgument1 or
; SystemArgument2.  As a result, it is free to overwrite
; these argument registers immediately without preserving
; their value.
; r8  = Dpc
; rcx = Dpc
; rdx = DeferredContext
fffff800`0100354f 4c8bc1          mov     r8,rcx
fffff800`01003552 4889542448      mov     qword ptr [rsp+48h],rdx
; Set [rsp+20h] to zero.  This is a state variable that is
; used by the exception/unwind scope handlers in order to
; coordinate the PatchGuard execution process across the
; set of four exception/unwind scope handlers associated
; with this section of code.
fffff800`01003557 4533c9          xor     r9d,r9d
fffff800`0100355a 44894c2420      mov     dword ptr [rsp+20h],r9d
; PatchGuard zeros out various key fields in the DPC.
; This is an attempt to make it difficult to locate the DPC
; in-memory from the context of an exception handler called
; when a PatchGuard DPC accesses the bogus DeferredContext
; argument.  Specifically, PatchGuard zeros the Type and
; DeferredContext fields of the KDPC structure, shown below:
; 0: kd> dt nt!_KDPC
;   +0x000 Type             : UChar
;   +0x001 Importance       : UChar
;   +0x002 Number           : UChar
;   +0x003 Expedite         : UChar
;   +0x008 DpcListEntry     : _LIST_ENTRY
;   +0x018 DeferredRoutine  : Ptr64
;   +0x020 DeferredContext  : Ptr64 Void
;   +0x028 SystemArgument1  : Ptr64 Void
;   +0x030 SystemArgument2  : Ptr64 Void
;   +0x038 DpcData          : Ptr64 Void
; Dpc->Type = 0
fffff800`0100355f 448809          mov     byte ptr [rcx],r9b
; Dpc->DeferredContext = 0
fffff800`01003562 4c894920        mov     qword ptr [rcx+20h],r9
; Here, the DPC loads [r11-20h] with an obfuscated
; copy of the DeferredContext argument (rotated
; left by 0x34 bits).
; Recall that rsp == r11-0xc8 (r11 holds the value of
; rsp on entry), so this location can also be aliased
; by [rsp+0A8h].
; [rsp+0A8h] -> ROL(DeferredContext, 0x34)
fffff800`01003566 488bc2          mov     rax,rdx
fffff800`01003569 48c1c034        rol     rax,34h
fffff800`0100356d 498943e0        mov     qword ptr [r11-20h],rax
; Similarly, the DPC loads [r11-48h] with an
; obfuscated copy of the Dpc argument (rotated
; right by 0x48 bits).
; This location may be aliased as [rsp+80h].
; [rsp+80h] -> ROR(Dpc, 0x48)
fffff800`01003571 488bc1          mov     rax,rcx
fffff800`01003574 48c1c848        ror     rax,48h
fffff800`01003578 498943b8        mov     qword ptr [r11-48h],rax
; The following register context is now in place:
; r8         = Dpc
; rcx        = Dpc
; rdx        = DeferredContext
; rax        = ROR(Dpc, 0x48)
; [rsp+0A8h] = ROL(DeferredContext, 0x34)
; [rsp+80h]  = ROR(Dpc, 0x48)
; The DPC routine destroys the contents of rcx by
; overwriting it with a zero-extended copy of the
; low byte of the DeferredContext value.
fffff800`0100357c 0fb6ca          movzx   ecx,dl
; The DPC routine destroys the contents of r8 with
; a right shift (unlike a rotate, the incoming left
; bits are simply zero filled instead of set to the
; rightmost bits being shifted off).  The rightmost
; bits are thus lost forever, destroying the r8
; register as a useful source of the Dpc argument.
fffff800`0100357f 49d3e8          shr     r8,cl
; r8 is saved away on the stack, but it is no longer
; directly useful as a way to locate the Dpc argument
; due to the destructive right shift above.
fffff800`01003582 4c898424d8000000 mov     qword ptr [rsp+0D8h],r8
; r8         = Dpc >> (UCHAR)DeferredContext
; rcx        = (UCHAR)DeferredContext
; rdx        = DeferredContext
; rax        = ROR(Dpc, 0x48)
; [rsp+0A8h] = ROL(DeferredContext, 0x34)
; [rsp+80h]  = ROR(Dpc, 0x48)
; Here, we temporarily deobfuscate the DeferredContext
; argument stored at [r11-20h] above.  In this particular
; instance, rdx also happens to contain the deobfuscated
; DeferredContext value, but not all instances of
; PatchGuard's DPC routines share this property of
; retaining a plaintext copy of DeferredContext in rdx.
fffff800`0100358a 498b43e0        mov     rax,qword ptr [r11-20h]
fffff800`0100358e 48c1c834        ror     rax,34h
; Now, we have the following context in place:
; r8         = Dpc >> (UCHAR)DeferredContext
; rcx        = (UCHAR)DeferredContext
; rdx        = DeferredContext   (* But not valid for
;                                 all DPC routines.)
; rax        = DeferredContext
; [rsp+0A8h] = ROL(DeferredContext, 0x34)
; [rsp+80h]  = ROR(Dpc, 0x48)
; The next step is to dereference the DeferredContext value.
; For a legitimate DPC invocation, this operation is harmless;
; the DeferredContext value would point to valid kernel memory.
; For PatchGuard, however, this triggers an access violation
; that winds up with control being transferred to the exception
; handlers registered to the DPC routine.
fffff800`01003592 8b00            mov     eax,dword ptr [rax]
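
The rotate-based argument obfuscation seen in the listing above can be modeled in portable C. The following is a minimal sketch rather than a reconstruction of the kernel's source: the saved_args structure and all function names are hypothetical, and the rotate counts (0x34 and 0x48) are the ones used by this particular flavor of ExpTimeRefreshDpcRoutine. The helpers mask the rotate count to six bits, which mirrors how the x64 rol and ror instructions treat 64-bit operands (so a rotate by 0x48 behaves as a rotate by 8).

```c
#include <stdint.h>

/* Rotate left; the count is masked to 6 bits, as rol does for 64-bit operands. */
static uint64_t rol64(uint64_t value, unsigned count)
{
    count &= 63;
    return count ? (value << count) | (value >> (64 - count)) : value;
}

/* Rotate right; the count is masked to 6 bits, as ror does for 64-bit operands. */
static uint64_t ror64(uint64_t value, unsigned count)
{
    count &= 63;
    return count ? (value >> count) | (value << (64 - count)) : value;
}

/* Hypothetical model of the two obfuscated stack slots. */
typedef struct {
    uint64_t obfuscated_context;  /* models [rsp+0A8h] */
    uint64_t obfuscated_dpc;      /* models [rsp+80h]  */
} saved_args;

/* Models what the DPC routine stores before clobbering rcx and r8. */
static saved_args obfuscate_args(uint64_t dpc, uint64_t deferred_context)
{
    saved_args saved;

    saved.obfuscated_context = rol64(deferred_context, 0x34);
    saved.obfuscated_dpc     = ror64(dpc, 0x48);
    return saved;
}

/* Inverse operations, as an exception handler would have to apply them. */
static uint64_t recover_context(const saved_args *saved)
{
    return ror64(saved->obfuscated_context, 0x34);
}

static uint64_t recover_dpc(const saved_args *saved)
{
    return rol64(saved->obfuscated_dpc, 0x48);
}
```

Because a rotate loses no bits, both arguments can be recovered exactly by rotating in the opposite direction; the obfuscation only raises the cost of finding them in a generalized fashion, since the constants and stack offsets differ between routines.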

At this point, it is necessary to investigate the various exception/unwind handlers registered to the DPC routine in order to determine what happens next. Most of these handlers can be skipped as they are nothing more than minor layers of obfuscation that, while differing significantly between each DPC routine, have the same end result. One of the exception/unwind handlers, however, makes the call to PatchGuard's integrity check, and this handler is worthy of further discussion. Because the exception registrations for all of the PatchGuard DPC routines make use of nt!_C_specific_handler, the scope handlers conform to a standard prototype, defined below:

// Define the standard type used to describe a C-language exception handler,
// which is used with _C_specific_handler.
// The actual parameter values differ depending on whether the low byte of the
// first argument contains the value 0x1.  If this is the case, then the call
// is to the unwind handler for the routine; otherwise, the call is to the
// exception handler for the routine.  Each handler interprets the two
// arguments quite differently, though the prototypes are compatible as far
// as calling conventions go.

        __in    PEXCEPTION_POINTERS    ExceptionPointers,  // if low byte is 0x1, then we're an unwind
        __in    ULONG64                EstablisherFrame    // faulting routine stack pointer

In the case of nt!ExpTimeRefreshDpcRoutine, the fourth scope handler registration is the one that performs the call to PatchGuard's integrity check routine. Here, the routine only executes the integrity check if a state variable stored at [rsp+20h] in the DPC routine is set to a particular value. This state variable is modified as the access violation exception traverses each of the exception/unwind scope handlers until it reaches this handler, which eventually leads up to the execution of PatchGuard's system integrity check. For now, it is best to assume that this routine is being called with [rsp+20h] in the DPC routine having been set to 0x15 (the handler adds 7 to the state variable and compares the result against 0x1C). This signifies that PatchGuard should be executed.
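
The state variable check made by this handler (visible in the listing that follows) can be modeled in a few lines of C. This is only a sketch: the function name is hypothetical, and it models just the final add and compare performed by the fourth scope handler, not how the earlier handlers advance the state variable.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check in the fourth scope handler.
 * state models the DWORD at [rsp+20h] in the DPC routine's frame.
 * Returns true when the integrity check path would be taken.
 */
static bool final_handler_runs_patchguard(uint32_t *state)
{
    *state += 7;            /* add dword ptr [rbp+20h],7 */
    return *state == 0x1C;  /* cmp eax,1Ch / jne skip    */
}
```

With this model, an incoming state of 0x15 yields 0x1C and selects the PatchGuard path; any other incoming value causes the handler to skip the call.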

0: kd> uf fffff8000112c8f0
; mov eax, eax is a hotpatch stub and can be ignored.
fffff800`0112c8f0 8bc0            mov     eax,eax
fffff800`0112c8f2 55              push    rbp
fffff800`0112c8f3 4883ec20        sub     rsp,20h
; rdx corresponds to the EstablisherFrame argument.
; This argument is the stack pointer (rsp) value for
; the routine that this exception/unwind handler is
; associated with.  The typical use of this argument
; is to allow seamless access to local variables in
; the routine with which the try/except filter is
; associated.  This is what eventually ends up
; occurring here, with the rbp register being loaded
; with the stack pointer of the DPC routine at the
; point in time where the exception occurred.
fffff800`0112c8f7 488bea          mov     rbp,rdx
; We make the check against the state variable.
; Recall that when the DPC routine was first entered,
; [rsp+20h] in the DPC routine's context was set to
; zero.  That location corresponds to [rbp+20h] in
; this context, as rbp has been loaded with the stack
; pointer that was in use in the DPC routine.  This
; location is checked and altered by each of the
; registered exception/unwind handlers, and will
; eventually be set to 0x15 when this routine is called.
fffff800`0112c8fa 83452007        add     dword ptr [rbp+20h],7
fffff800`0112c8fe 8b4520          mov     eax,dword ptr [rbp+20h]
fffff800`0112c901 83f81c          cmp     eax,1Ch
; For the moment, consider the case where this jump is
; not taken.  The jump is taken when PatchGuard is not
; being executed (which is not the interesting case).
fffff800`0112c904 0f858c000000    jne     nt!ExpTimeRefreshDpcRoutine+0x215 (fffff800`0112c996)

; To understand the following instructions, it is
; necessary to look back at the stack variable context
; that was set up by the DPC routine prior to the
; faulting instruction that caused the access
; violation exception.  The following values were
; set on the stack at that time:
; [rsp+0A8h] = ROL(DeferredContext, 0x34)
; [rsp+80h]  = ROR(Dpc, 0x48)
; The following set of instructions utilize these
; obfuscated copies of the original arguments to the
; DPC routine in order to make the call to PatchGuard's
; integrity check routine.
; The first step taken is to deobfuscate the Dpc value
; that was stored at [rsp+80h], or [rbp+80h] as seen from
; this context.
fffff800`0112c90a 488b8580000000  mov     rax,qword ptr [rbp+80h]
; rax = Dpc
fffff800`0112c911 48c1c048        rol     rax,48h
; [rbp+50h] -> Dpc
fffff800`0112c915 48894550        mov     qword ptr [rbp+50h],rax
; Next, the DeferredContext argument is deobfuscated and
; stored plaintext.
fffff800`0112c919 488b85a8000000  mov     rax,qword ptr [rbp+0A8h]
; rax = DeferredContext
fffff800`0112c920 48c1c834        ror     rax,34h
; [rbp+58h] -> DeferredContext
fffff800`0112c924 48894558        mov     qword ptr [rbp+58h],rax
; rax = Dpc
fffff800`0112c928 488b4550        mov     rax,qword ptr [rbp+50h]
; The next instruction accesses memory after the KDPC
; object in memory.  Recall that a KDPC object is 0x40
; bytes in length on x64, so [Dpc+40h] is the first
; value beyond the DPC in memory.  In reality, the KDPC
; is a member of a larger structure, which is defined
; as follows:
; struct PATCHGUARD_DPC_CONTEXT {
;  KDPC      Dpc;            // +0x00
;  ULONGLONG DecryptionKey;  // +0x40
; };
; As a result, this instruction is equivalent to casting
; the Dpc argument to a PATCHGUARD_DPC_CONTEXT*, and then
; accessing the DecryptionKey member
; rcx = Dpc->DecryptionKey
fffff800`0112c92c 488b4840        mov     rcx,qword ptr [rax+40h]
; [rbp+40h] -> DecryptionKey
fffff800`0112c930 48894d40        mov     qword ptr [rbp+40h],rcx
; rax = DecryptionKey
fffff800`0112c934 488b4540        mov     rax,qword ptr [rbp+40h]
; The DeferredContext value is then xor'd with the
; decryption key stored in the PATCHGUARD_DPC_CONTEXT
; structure.  This yields the significant bits of the
; pointer to the PatchGuard decryption stub.  Recall
; that due to the "no-mans-land" region in between the
; kernel mode and user mode address space boundaries
; on current AMD64 processors, the rest of the bits
; are required to be either all ones or all zeros in
; order to form a valid address.  Because we are
; dealing with a kernel mode address, it can be safely
; assumed that all of the bits must be ones.
fffff800`0112c938 48334558        xor     rax,qword ptr [rbp+58h]
; [rbp+30h] -> DeferredContext ^ DecryptionKey
fffff800`0112c93c 48894530        mov     qword ptr [rbp+30h],rax
; Set the required bits to ones in the decrypted
; pointer, as required to form a canonical address on
; current AMD64 systems.
fffff800`0112c940 48b80000000000f8ffff mov rax,0FFFFF80000000000h
; [rbp+30h] -> [rbp+30h] | 0xFFFFF80000000000
; Now, [rbp+30h] is the pointer to the decryption stub.
fffff800`0112c94a 48094530        or      qword ptr [rbp+30h],rax
; The following instructions make extra copies of the pointer
; to the decryption stub on the stack of the DPC routine.  There
; is no real purpose to this, other than a half-hearted attempt
; to confuse anyone attempting to reverse engineer this section
; of PatchGuard.
; [rbp+38h] -> [rbp+30h] (Decryption stub)
fffff800`0112c94e 488b4530        mov     rax,qword ptr [rbp+30h]
fffff800`0112c952 48894538        mov     qword ptr [rbp+38h],rax
; [rbp+28h] -> [rbp+38h] (Decryption stub)
fffff800`0112c956 488b4538        mov     rax,qword ptr [rbp+38h]
fffff800`0112c95a 48894528        mov     qword ptr [rbp+28h],rax
; The next set of instructions rewrites the first
; four bytes of the decryption stub.  These bytes
; must encode the following instruction, which is
; the first instruction of the stub:
; f0483111        lock xor qword ptr [rcx],rdx
; The individual opcode bytes for the instruction are
; written to the decryption stub one byte at a time.
; *(PULONG)DecryptionStub = 0x113148f0
fffff800`0112c95e 488b4528        mov     rax,qword ptr [rbp+28h]
fffff800`0112c962 c600f0          mov     byte ptr [rax],0F0h
fffff800`0112c965 488b4528        mov     rax,qword ptr [rbp+28h]
fffff800`0112c969 c6400148        mov     byte ptr [rax+1],48h
fffff800`0112c96d 488b4528        mov     rax,qword ptr [rbp+28h]
fffff800`0112c971 c6400231        mov     byte ptr [rax+2],31h
fffff800`0112c975 488b4528        mov     rax,qword ptr [rbp+28h]
fffff800`0112c979 c6400311        mov     byte ptr [rax+3],11h
; Finally, a call to the decryption stub is made.  The
; decryption stub has a prototype that conforms to the
; following definition:
; PgDecryptionStub(
;       __in PVOID   PatchGuardRoutine,
;       __in ULONG64 DecryptionKey,
;       __in ULONG   Reserved0,
;       __in ULONG   Reserved1
;       );
; The two 'reserved' ULONG values are always set to zero.
; rcx is loaded with the address of the decryption stub,
; and rdx is loaded with the DecryptionKey value.
fffff800`0112c97d 4533c9          xor     r9d,r9d
fffff800`0112c980 4533c0          xor     r8d,r8d
fffff800`0112c983 488b5540        mov     rdx,qword ptr [rbp+40h]
fffff800`0112c987 488b4d38        mov     rcx,qword ptr [rbp+38h]
; At this point, control is transferred to the decryption
; stub, as described previously.  The decryption stub will
; decrypt itself, decrypt the PatchGuard integrity check
; routine, and then transfer control to the PatchGuard
; integrity check routine.  The integrity check routine is
; responsible for ensuring that the DPC is returned to a
; usable state (recall that parts of it were zeroed out
; by the DPC routine earlier), and that it is re-queued
; for execution.  It is also responsible for re-encrypting
; the decryption stub as desired.
fffff800`0112c98b ff5538          call    qword ptr [rbp+38h]
; After the call is made, the exception filter returns
; the EXCEPTION_EXECUTE_HANDLER manifest constant.  This
; causes one of the registered handlers to be invoked
; in order to handle the exception.  The handler will
; transfer control to the return point of the DPC routine,
; thus skipping the body of the DPC (since the call to
; the DPC was not a request for the legitimate function of
; the DPC to be performed).
fffff800`0112c98e 41b901000000    mov     r9d,1
fffff800`0112c994 eb03            jmp     nt!ExpTimeRefreshDpcRoutine+0x218 (fffff800`0112c999)

fffff800`0112c996 4533c9          xor     r9d,r9d

fffff800`0112c999 418bc1          mov     eax,r9d
fffff800`0112c99c 4883c420        add     rsp,20h
fffff800`0112c9a0 5d              pop     rbp
fffff800`0112c9a1 c3              ret
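
The pointer arithmetic performed by this handler can be summarized in C. The sketch below is a user mode model under stated assumptions, not PatchGuard's actual source: the function names are hypothetical, the rotate count and the 0xFFFFF80000000000 mask are the ones observed in this particular listing, and the key and stub address used in any experiment are made-up values.

```c
#include <stdint.h>

/* Bits forced to one to form a canonical kernel address (from the listing). */
#define KERNEL_CANONICAL_BITS 0xFFFFF80000000000ULL

/* Rotate helpers; counts are masked to 6 bits like the x64 rol/ror instructions. */
static uint64_t rol64(uint64_t value, unsigned count)
{
    count &= 63;
    return count ? (value << count) | (value >> (64 - count)) : value;
}

static uint64_t ror64(uint64_t value, unsigned count)
{
    count &= 63;
    return count ? (value >> count) | (value << (64 - count)) : value;
}

/*
 * Hypothetical model of the handler's decoding steps:
 *   1. undo the ROL 0x34 applied to DeferredContext by the DPC routine,
 *   2. xor the result with the key stored just past the KDPC,
 *   3. set the upper bits so that the result is a canonical kernel address.
 */
static uint64_t decode_stub_pointer(uint64_t obfuscated_context,
                                    uint64_t decryption_key)
{
    uint64_t deferred_context = ror64(obfuscated_context, 0x34);

    return (deferred_context ^ decryption_key) | KERNEL_CANONICAL_BITS;
}
```

Note that the final OR is what allows DeferredContext itself to be non-canonical: whatever the upper bits of the xor result happen to be, they are overwritten with ones before the pointer is used.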

This does represent a significant level of obfuscation, but it is not impenetrable, and there are various simple ways through which an attacker could bypass all of these layers of obfuscation entirely.