Improving Software Security Analysis using Exploitation Properties 12/2007 skape mmiller@hick.org Abstract Reliable exploitation of software vulnerabilities has continued to become more difficult as formidable mitigations have been established and are now included by default with most modern operating systems. Future exploitation of software vulnerabilities will rely on either discovering ways to circumvent these mitigations or uncovering flaws that are not adequately protected. Since the majority of the mitigations that exist today lack universal bypass techniques, it has become more fruitful to take the latter approach. It is in this vein that this paper introduces the concept of exploitation properties and describes how they can be used to better understand the exploitability of a system irrespective of a particular vulnerability. Perceived exploitability is of utmost importance to both an attacker and to a defender given the presence of modern mitigations. The ANI vulnerability (MS07-017) is used to help illustrate these points by acting as a simple example of a vulnerability that may have been more easily identified as code that should have received additional scrutiny by taking exploitation properties into consideration. 1) Introduction Modern exploit mitigations have become formidable opponents with respect to the effect they have on reliable exploitation. Some of the more substantial modern mitigations include GuardStack (GS), SafeSEH, DEP (NX), ASLR, pointer encoding, and various heap improvements[8, 9, 10, 15, 24, 3, 4]. The fact that there have been very few public exploits that have been able to universally bypass all of these mitigations at once is a testament to the resilience of these techniques working in concert with one another. It is obvious that the absence of a given mitigation directly contributes to the exploitability of the associated code. Likewise, it is also well known that most mitigations have situations in which they will offer little to no protection[5, 16, 18, 20, 2, 4]. For instance, in certain cases, it may be possible to perform a partial overwrite on Windows Vista to defeat ASLR due to the fact that only 15 bits of most 32-bit addresses may be affected by randomization[2, 17]. Other mitigations also have situations where they may not provide adequate coverage. Given the fact that the majority of mitigations have known limitations, it makes sense to consider where this information might be useful. In the field of program analysis, whether it be manual, static, or dynamic, the question of scoping is often pertinent. This question typically revolves around figuring out what areas of code should be reviewed and what precedence, if any, should be assigned to different regions. Typical approaches taken to accomplish this often involve identifying code that straddles a trust boundary or performs complex operations reachable from a trust boundary. However, depending on one's perspective, this type of approach is insufficient in the face of modern mitigations because it may result in areas of code being reviewed that are adequately protected by all mitigations. To help address this perceived deficiency, this paper introduces the concept of exploitation properties and describes how they can be used to provide a better understanding of exploitability of a system if a vulnerability is found to be present. Regions of code that are found to have a number of distinct exploitation properties may be more interesting from an exploitation standpoint and therefore may warrant additional scrutiny from a program analysis perspective. The use of exploitation properties may benefit both an attacker and a defender. For example, companies may wish to perform targeted reviews on areas of code that may be more trivially exploited in an effort to prevent reliable exploits from being released in the future. Likewise, an attacker searching for a vulnerability may wish to avoid auditing regions of code that are likely to be more difficult to exploit. Exploitation properties represent additional criteria that can be used when attempting to better understand the security aspects of a program. Annotating regions of code with exploitation properties makes it possible to use set unions and intersections to identify the subset of interesting regions of code for a particular analysis problem. For example, an attacker may wish to determine the regions of code that may permit the use of traditional stack-based buffer overflow techniques as well as permitting a partial overwrite of a return address in order to defeat ASLR. Using these two exploitation properties as criteria, a narrowed subset can be produced which contains only those regions which meet both criteria by intersecting those regions that have both exploitation properties. For the purpose of this paper, the term narrowing is not used in the strict mathematical sense; rather, this paper uses narrowing to describe the process of constraining the scope of analysis through the use of specific criteria. The concept of using automated analysis as a precursor to more strenuous program analysis is certainly not new. There have been many tools ranging from the simple detection of calls to strcpy to much more sophisticated forms of static analysis. Still, the use of exploitation properties can be seen as an additional set of data points which may be useful in the context of program analysis given the hypothesis that most reliably exploitable security vulnerabilities are being pushed into areas of code that are less affected by mitigations. The concept of exploitation properties is presented as follows. Section 2 categorizes and defines a limited number of concrete exploitation properties. Section 3 provides a concrete example of using exploitation properties to help identify the function that contained the ANI vulnerability. Section 4 describes some potential ways in which exploitation properties can be applied. Section 5 gives a brief description of future work involving exploitation properties. 2) Exploitation Properties Exploitation properties describe the ease with which an arbitrary vulnerability might be exploited. An understanding of a system's perceived exploitability can provide useful insights when attempting to establish the risk factors associated with it. An example of this can be seen in threat modeling where the DREAD model of classifying risk includes a high-level evaluation of exploitability as one of the risk factors[14]. It is important to note that exploitation properties do not provide any indication that a vulnerability exists; instead, they are only meant to convey information about how easily a vulnerability could be exploited. The concept of an exploitation property can be broken into different categories which are tied to the configuration or context that the property is associated with. Examples of these categories include platforms, processes, binary modules, functions, and so on. The following subsections provide concrete examples to better illustrate the concept of an exploitation property. These examples are given by showing what implications a property has with respect to exploitation as well as how a property might be derived. It should be noted that the examples given in this paper do not represent a complete, exhaustive set of exploitation properties. 2.1) Platform Properties Exploitation properties associated with a platform are meant to illustrate how easily a vulnerability may be exploited when a given platform configuration, such as the operating system or architecture, is used. For example, Windows 2000 does not include support for enforcing non-executable pages. This implies that any vulnerability found within an application that runs in the context of the Windows 2000 platform may be exploited more easily. An understanding of exploitation properties that are associated with a platform may be useful when attempting to assess the risk of applications that might run on multiple platforms. There are many other examples of exploitation properties that are tied to platforms. In order to limit the scope of this document, platform exploitation properties are not discussed at length. 2.2) Process Properties Process exploitation properties carry some information about how easily vulnerabilities found within the context of a running process may be exploited. For example, Internet Explorer running on 32-bit versions of Windows Vista do not make use of hardware-enforced DEP (NX) by default. This means that any vulnerabilities found within code that runs in the context of Internet Explorer will not be protected by non-executable regions. An understanding of exploitation properties that are associated with a process context can help to provide a better understanding of the risks associated with code that may run in the context of a given process. In order to limit the scope of this document, process exploitation properties are not discussed at length. 2.3) Module Properties Module exploitation properties are used to illustrate the effect that a particular binary module has on ease of exploitation. This category of properties is useful when attempting to identify binaries that may be more easily exploited if a vulnerability is found within them or in code that depends on them. This subsection describes two examples of module exploitation properties. 2.3.1) No Support for ASLR Windows Vista was the first major release of Windows to include a built-in implementation of Address Space Layout Randomization (ASLR)[15,24]. In order to head off potential application compatibility issues, Microsoft chose to make ASLR an opt-in feature by requiring binaries to be compiled with a new compiler switch (/dynamicbase)[21]. This compiler switch is responsible for setting a bit (0x40) in the DllCharacteristics that are defined within a binary. If this bit is set, the Windows kernel will attempt to randomize the base address of the binary when it is mapped into memory the first time. If the bit is not set, the binary will not have its base address randomized, although it could be relocated in memory if the binary's preferred region is already occupied by another allocation. As such, any binary that does not support ASLR may be mapped at a predictable location within a process address space at execution time. This can allow an attacker to make assumptions about the address space which may make exploitation easier if a vulnerability is found within any code that is mapped into the same address space as the module of interest. 2.3.2) No Support for SafeSEH With Visual Studio 2003, Microsoft introduced a compile-time change known as SafeSEH which attempts to act as a mitigation for the SEH overwrite attack vector[5,9]. SafeSEH works by adding a static list of known good exception handlers that are considered valid as metadata within a given binary. Binaries that support SafeSEH allow the exception dispatcher to perform additional checks when dispatching exceptions. The most important check involves determining if an exception handler that is found to exist within the mapped region of a given binary is actually considered to be one of the safe exception handlers. If the exception handler is not a safe exception handler, the exception dispatcher can take steps to prevent it from being called. This behavior works to mitigate the potential exploitation vector. In order to communicate this information to the exception dispatcher, modern PE files include fields in the load config data directory which hold the offset of the safe exception handler table and the number of elements found within the table. The load config data directory contains meta data that is useful to the dynamic loader such as information about safe exception handlers, the module's global security cookie address, and so on[13]. The following output from dumpbin.exe illustrates what this might look like: 310751E0 Safe Exception Handler Table 1 Safe Exception Handler Count Safe Exception Handler Table Address -------- 310357D1 __except_handler4 Unfortunately, as with ASLR, the benefits offered by SafeSEH are not complete unless every binary that is loaded into an address space has been compiled to make use of SafeSEH. If a binary has not been compiled to make use of SafeSEH, an attacker may be able to use any address found within the binary's memory mapping as an exception handler in conjunction with an SEH overwrite. 2.4) Function Properties Function exploitation properties convey information about how a function contributes to the exploitability of an application. For example, a function might make it possible to use certain exploitation techniques that might otherwise be prevented if mitigations were present. Alternatively, a function might simply assist in the exploitation process. Function exploitation properties are especially useful because they provide more detailed information than exploitation properties that are derived from the platform, process, or module context. 2.4.1) Absence of GuardStack The GuardStack (GS) support included with versions of the Microsoft Visual Studio compiler since 2002 offers a compile-time mitigation to traditional stack-based buffer overflows[23]. It supports this through a combination of a random canary inserted into a stack frame at runtime and an intelligent stack frame layout algorithm. The random canary is pushed onto the stack when a function is called and then popped off the stack and validated prior to function return. If the canary does not match the expected value, it is assumed that a stack-based buffer overflow occurred and that the process should be terminated. Since the initial release of GS support a number of techniques have been described that could be used to bypass or weaken it[5, 16, 20]. While these techniques were at one time useful or have not yet been fully realized, the author assumes that most would agree that the GS implementation provided by the most recent compiler is robust (with the exception of SEH). There is currently no publicly known universal bypass technique for GS that the author is aware of. Given this assumption, functions that are protected by GS become less interesting from the standpoint of identifying stack-based buffer overflows. On the other hand, functions that are not protected by GS can instantly be qualified as interesting targets for review. This is especially true with binaries that have been compiled with GS support but contain a number of functions that the compiler has chosen not to compile with GS protections. This choice is made by taking into account certain conditions such as the presence or absence of local variables that are declared as fixed-size arrays. As previous research has illustrated[27], it is possible to identify functions that have not been compiled to use GS through the use of simple static analysis tools. It is also possible to further refine the approaches described in previous research if one has symbols and one assumes that the most recent compiler was used. This can be accomplished by analyzing the call graph of an executable and noting the set of functions that do not call securitycheckcookie. Considered another way, the same set of functions can be identified by taking the set of all functions contained within a binary less the subset that call securitycheckcookie. The set of functions that is identified by either approach can be annotated with an exploitation property that indicates that they may contain stack-based buffer overflows that would not be hindered by GS. It may also be prudent to take the compiler version that was used into consideration when analyzing binaries. This is important due to the fact that older versions of the compiler used a GS implementation that could be trivially defeated in certain circumstances[16]. For example, previous versions of GS did not layout the stack frame in a manner that would prevent an attacker from overwriting other local variables and function arguments. In scenarios where this occurred and an overwritten local variable or parameter was dereferenced (such as by invoking a function pointer), the mitigation offered by GS would be meaningless. Thus, a secondary exploitation property could involve identifying functions where attacks such as the one described above could be possible. 2.4.2) Partial Overwrite Feasibility One of the unique consequences of implementing Address Space Layout Randomization (ASLR) on Windows is the limitation that the system allocation granularity imposes on the number of bits that can be randomized within most memory allocations. In particular, the allocation granularity used by Windows enforces strict 16-page alignment for the base addresses of most memory mappings in user-mode. This restriction means that it is only possible to introduce entropy into the low 15 bits of the high-order 16 bits of a 32-bit memory mapping[17]. While this may sound odd at first glance, the high-order two bits are not randomized due to the divide between kernel and user-mode. This assumes that a machine is booted without /3GB. The low-order 16 bits remain unchanged relative to the high-order bits. This caveat means that it may be possible to perform a partial overwrite of an address and thus bypass the security features offered by ASLR[2]. However, the ability to perform a partial overwrite also relies on the presence of useful code or data within a region that is relative to the address that is being overwritten. To visualize how this type of information might be useful, consider a scenario where an attacker is performing a partial overwrite of a return address on the stack. In this situation, it is often necessary for one or more useful opcodes to be present at an address that is 16-page relative to the return address. For example, consider a scenario where the function may have a vulnerability that would permit a partial overwrite. In this example, is called by and . In order to permit the use of a partial overwrite, a useful opcode must be found within the same 16-page aligned region that either or reside on. If a useful opcode is present, an exploitation property can be attached to in order to indicate that a partial overwrite may be feasible due to the presence of a useful opcode within the same 16-page aligned region as either or . For example, consider the following pseudo-disassembly illustrating a case where the call f instruction in is on the same 16-page region as a useful opcode: ... useful jmp on same 16-page region 0x14c1XXXX 0x14c1fc04 jmp esp ... entry point to h() 0x14c1a910 push ebp 0x14c1a911 mov ebp, esp 0x14c1a914 call f ... entry point to y(), not on same 16-page region 0x137f44c8 push ebp While this captures the basic concept, a better approach might be to view a binary in a different way. For example, consider the following approach to drawing the same conclusion: for each code region that contains a useful opcode, identify the subset of functions that are called from call sites within the same 16-page aligned region as the useful opcode. This has the effect of annotating all of the child functions that could potentially leverage a partial overwrite of the return address with respect to a particular collection of opcodes. One important point that must be made about this exploitation property is that is entirely dependent upon the definition of "useful code or data". Exploitation is very much an art and it goes without saying that attempting to constrain the approaches that an attacker might make use of is likely to be folly. However, defining a known-set of useful opcodes and using that set as a base with which to draw the above conclusion can be said to be better than not doing so at all. 2.4.3) Function or Parent Registers an Exception Handler One of the unique exploitation vectors that exists in 32-bit programs that run on Windows is known as an SEH overwrite[5]. An SEH overwrite makes it possible to gain control of execution flow by overwriting an exception registration record on the stack. From an exploitation perspective, the act of registering an exception handler within a function opens up the possibility of making use of an SEH overwrite. Since exception handlers are chained, the act of registering an exception handler also implicates any functions that are children of a function that registers the exception handler. This makes it possible to define an exploitation property that illustrates the possibility of an SEH overwrite being abused within the scope of a specific set of functions. Detecting this property can be as simple as signaturing the compiler generated code that is used to generate and register an exception handler within a function. An example of two functions, and , that would meet this criteria can be seen below: void f() { __try { g(); } __except(EXCEPTION_EXECUTE_HANDLER) { } } void g() { ... } In addition to this information being useful from an SEH overwrite perspective, it may also benefit an attacker in situations where an exception handler simply swallows any exceptions that are dispatched without crashing the process[1]. In the example given above, any exception that occurs in the context of will be swallowed by without necessarily crashing the process. This behavior may allow an attacker to retry their exploitation attempt multiple times, thus enabling a bruteforce attack that would otherwise not be feasible. This can make defeating ASLR more feasible. 2.4.4) Function is an Exception Handler The introduction of SafeSEH as a modern compile-time mitigation has caused the particulars of how exception handlers are implemented to become more interesting. This has to do with the fact that SafeSEH restricts the set of exception handlers that may be called by the exception dispatcher to those that are specified as being valid within the scope of a given binary. As discussed previously in this paper, SafeSEH prevents traditional SEH overwrites from being able to use any address as the overwritten exception handler. While this is effective in its primary intent, there is still the possibility that a valid exception handler can be abused to make exploitation more feasible[1]. This scenario is restricted to EH3 and prior exception handlers as EH4 includes a check of a cookie before dispatching exceptions. As such, it may be useful to flag the regions of code that are associated with EH3 and prior exception handlers, including language-specific exception handlers, as being potentially interesting from an exploitation perspective. Unfortunately, as with ASLR, the benefits offered by SafeSEH are not complete unless every binary that is loaded into a process address space has been compiled to make use of SafeSEH. If a binary has not been compiled to make use of SafeSEH, an attacker may be able to use any address found within the binary's memory mapping as an exception handler in the context of an SEH overwrite. This may make exploitation more feasible. 3) Case Study: MS07-017 The animated cursor (ANI) vulnerability was discovered by Alexander Sotirov in late 2006 and patched by Microsoft with the MS07-017 critical update in April, 2007 . Apart from being a client-side vulnerability that was exposed through web-browsers and other mediums, the ANI vulnerability was one of the first notable security issues that affected Windows Vista. It was notable due to the simple fact that even though Microsoft had touted Windows Vista as being the most secure operating system to date, the exploits that were released for the ANI vulnerability were very reliable. These exploits were able to ignore or defeat the protections offered by mitigations such as GS, DEP, and even Vista's newest mitigation: ASLR. To better understand how this was possible it is important to dive deeper into the details of the vulnerability itself. gives a brief description of the ANI vulnerability and some of the techniques that were used to successfully exploit it. Following this description, illustrates how exploitation properties, in combination with another class of properties, can be used to detect functions that may contain vulnerabilities similar to the ANI vulnerability. This is meant to help illustrate the perceived benefits of applying the concept of exploitation properties to aide in the process of identifying regions of code that may deserve additional scrutiny based on their perceived exploitability. 3.1) Background While the ANI vulnerability was certainly unique, it was not the first time the animated cursor code was found to have a security issue. Microsoft patched an issue that was almost exactly the same as MS07-017 with MS05-002 roughly two years prior. In both cases, the underlying security issue was related to a failure to properly validate input that was derived from the contents of an animated cursor file. Alexander Sotirov provided much of the initial research on the ANI vulnerability and also gave an excellent write-up to its effect[22]. This paper will only attempt to highlight the flaw. The vulnerability itself was found in user32!LoadAniIcon which is responsible for processing a number of different chunks that may be contained within an animated cursor file. Each chunk is a TLV (Type-Length-Value) as described by the following structure: struct ANIChunk { char tag[4]; // ASCII tag DWORD size; // length of data in bytes char data[size]; // variable sized data } Keeping this structure in mind, the flaw itself can be seen in the abbreviated pseudo-code below as modified slightly from Sotirov's original write-up: 01: int LoadAniIcon(struct MappedFile* file, ...) { 02: struct ANIChunk chunk; 03: struct ANIHeader header; // 36 byte structure 04: while (1) { 05: // read the first 8 bytes of the chunk 06: ReadTag(file, &chunk); 07: switch (chunk.tag) { 08: case 'anih': 09: // read chunk.size bytes into header 10: ReadChunk(file, &chunk, &header); On line 6, the chunk header is read into the local variable chunk using ReadTag which populates the chunk's tag and size fields. If the chunk's tag is equal to 'anih', the data associated with the chunk is read into the header local variable using ReadChunk on line 10. The problem is that ReadChunk uses the size field of the chunk as the amount of data to read from the file. Since header is a fixed-size (36 byte) data structure and the chunk's size can be variable, a trivial stack-based buffer overflow may occur if more than 36 bytes are specified as the chunk size. In terms of the vulnerability, that's all there is to it, but the implications from an exploitation perspective are where things start to get interesting. When attempting to exploit this vulnerability it may at first appear that all attempts to do so would be futile. Given Vista's security push, an attacker would be justified in thinking that surely the LoadAniIcon function is protected by a GS cookie. This point is especially justified considering the majority of all binaries shipped with Windows Vista have been compiled with GS enabled[27]. However, there are indeed circumstances where the compiler will choose to not enable GS for a specific function. As chance would have it, the compiler chose not to enable GS for the LoadAniIcon function because of the simple fact that it does not contain any characteristics that would suggest that a stack-based buffer overflow might be possible (such as the use of stack-allocated arrays). This means that an attacker is able to make use of exploitation techniques that are associated with traditional stack-based buffer overflows. While this drastically increases the chances of being able to produce a reliable exploit, there are still other mitigations that are of potential concern. Another mitigation that might be concerning in most circumstances is hardware-enforced DEP (NX). This would generally prevent an attacker from being able to run arbitrary code within regions that are not marked as executable (such as the stack and the heap). However, as fate would have it, Internet Explorer is configured to not run with DEP enabled. This immediately removes this concern from the equation for exploits that attempt to trigger the ANI vulnerability through Internet Explorer. With DEP out of the picture, ASLR becomes a weakened but still potentially significant hurdle. While it may appear that ASLR would be challenging to defeat in most circumstances, this particular vulnerability provides an example of two different ways in which ASLR can be bypassed. The simplest approach, as taken by Sotirov, involves making use of the fact that Internet Explorer is not compiled with support for ASLR and therefore can be found at a fixed address within the address space. This allows an attacker to make use of opcodes contained within iexplore.exe's memory mapping. A second approach, as taken by the author, involves using a partial overwrite to ignore the effects of ASLR completely. The details relating to how a partial overwrite works were explained in 2.4.2. In either case, an attacker is able to reliably defeat Vista's ASLR. To compound the problem, the particulars of the context in which this vulnerability occur make it easier to exploit even without the presence of mitigations. This improved reliability comes from the fact that the LoadAniIcon function is wrapped in an exception handling context that simply swallows exceptions that are encountered. This makes it possible for an exploit to fail without actually crashing the process, thus allowing the attacker to try multiple times without having to worry about making a mistake that crashes the process. When all is said and done, the simplicity of the vulnerability and the ease with which mitigations could be bypassed are what lead to the ANI vulnerability being quite unique. Given the fact that this vulnerability can be so easily exploited, it is prudent to describe how it could have been detected as being a high risk function. 3.2) Detection The ease of exploitability associated with the ANI vulnerability makes it an obvious candidate for study with respect to the exploitation properties that have been described in this paper. It should be possible to use extremely simple criteria to accomplish two things. First, the criteria must identify the LoadAniIcon function. Second, the criteria should be unique enough to limit the size of the narrowed subset. Reducing the subset size is beneficial as it may permit the use of more complex program analysis tools which can further constrain or explicitly identify instances of vulnerabilities. Determining the specific criteria that is needed to identify the LoadAniIcon function can help illustrate how one can make use of exploitation properties. Given the description of the ANI vulnerability, one can easily deduce some of the more interesting properties that it has. An exploitation property that one might immediately observe is that the LoadAniIcon function does not make use of GS (2.4.1). This makes it possible to define criteria which states that only functions that have not been compiled with GS should be considered. Functions that have been compiled with GS are inherently less interesting for the purpose of this exercise due to the fact that they are less likely to contain exploitable vulnerabilities. A second property that the ANI vulnerability had with regard to exploitation was that it was possible for an attacker to make use of a partial overwrite to defeat ASLR. The exploitation property described in 2.4.2 illustrates how one can make this determination statically. In the case of the ANI vulnerability, a partial overwrite can be performed by making use of a jmp [ebx] that is located within the same 16-page aligned region as the caller of LoadAniIcon. Thus, any functions that could potentially make use of a partial overwrite can be used as additional criteria. At this point, a subset can be produced that is constrained to the regions of code that are annotated with the GS and partial overwrite exploitation properties. It is possible to further refine the set of functions that should ultimately be considered by studying the form that the ANI vulnerability took. The first point to note is that the stack-based buffer overflow occurred when writing beyond the bounds of a struct that was allocated on the stack. Furthermore, the overflow did not actually occur in the immediate context of the LoadAniIcon itself. Instead, the overflow was triggered by passing a pointer to the stack-allocated struct as a parameter when calling the function ReadChunk. Based on these data points it is possible to define a third criteria. In this case, the third criteria is not an exploitation property but is instead an example of a vulnerability property. While not discussed in detail in this paper, many examples of vulnerability properties exist, though perhaps not categorized as such. A vulnerability property can be thought of as an annotation that illustrates whether or not a region of code has a form that is similar to that seen in vulnerabilities or has the potential of being a vulnerability. The complexity of a vulnerability property, as with the complexity of an exploitation property, can range from highly sophisticated to very simplistic. For the purpose of this paper, a vulnerability property can be used that is very simple and imprecise but nevertheless effective at further narrowing the set of functions that should be reviewed. This property is based on whether or not a function passes a pointer to a stack-allocated variable as a parameter to a child function. This property is directly derived from the general form that the ANI vulnerability takes. At a minimum, a region of code that matches this form suggests that a vulnerability could be present. Using these three properties, it should be possible to easily identify both the function that contains the ANI vulnerability as well as other functions that could contain similar vulnerabilities. However, it is important to note that this process does not produce functions that definitely have vulnerabilities. This can be plainly seen by the fact that both the vulnerable and fixed versions of the LoadAniIcon should be detected by the criteria described above. While this may seem to run counter to the purposes of this paper, it is important for the reader to remember that the goal of using these exploitation properties is not to identify specific instances of vulnerabilities. Instead, the goal is to identify regions of code that might warrant additional scrutiny due to the relative ease with which a vulnerability could be exploited if one is found to be present. 3.3) Test Case The author developed an analysis tool as an extension to Microsoft's Phoenix framework in order to test the ideas described in this paper[12]. Unfortunately, the current release (July 2007 SDK) of Phoenix requires private symbol information for native binaries. This limitation prevented the author from being able to run the analysis tool across the vulnerable version of user32.dll. In lieu of this ability, the author chose to generate a binary containing test cases that closely mirror the form of the function containing the ANI vulnerability. Using these test cases, the author used the features provided by the analysis tool to determine the exploitation and vulnerability properties described in the previous section and to identify the resulting subset of functions meeting all criteria. This was accomplished by first attempting to identify the subset of functions that do not contain GS within the scope of the target binary. After identifying the subset of functions without GS, a second subset was taken which consists of the functions that pass a pointer to a stack-allocated local variable as a parameter to a child routine. This was accomplished by using Phoenix's static single assignment (SSA) and alias implementations to collect the requisite data flow information[12,25]. Using this data flow information, it is possible to perform backwards data flow analysis to determine the potential storage location of the parameter being passed at each point along a given data flow path starting from the operand associated with a parameter at a call site. The analysis terminates either when a fixed point is reached or when it is determined that a pointer to a stack-allocated variable could be passed as the parameter. While the previous section described the potential for using the partial overwrite exploitation property to detect the function containing the ANI vulnerability[6], it is not possible to create a meaningful parallel between the test binary and that of the ANI vulnerability. This is due in part to the fact that while it would certainly be possible to artificially place a useful opcode at a specific location in the test binary, it would not add any value beyond showing that it is possible to detect useful opcodes within the same 16-page aligned region as the caller of a given function. The author feels that this point is somewhat moot given the fact that it has already been proven that a partial overwrite can be used with the ANI vulnerability. The only additional benefit that it could offer in this case would be to help further constrain the resultant set size. However, without being able to run this analysis against the vulnerable version of user32.dll, it is not possible to draw meaningful conclusions at this point in time. 3.4) Results The results of running the analysis tool against the test binary produced the expected behavior. To illustrate this, it is helpful to consider a sampling of the functions that were analyzed. The following functions have a form that is similar to the ANI vulnerability. These functions also match the criteria described in the previous subsection. Specifically, these functions do not make use of GS and pass a pointer to a stack-allocated local variable (var) to a child function: int tc_df_pass_local_ptr_to_callee() { int var; tc_df_pass_local_ptr_to_callee_func(&var); return 0; } int tc_df_pass_local_ptr_to_callee_alias() { int var; int *p = &var; tc_df_pass_local_ptr_to_callee_func(p); return 0; } int tc_df_pass_local_ptr_to_callee_alias_struct( struct _foo *foo) { int var; foo->ptr = &var; return tc_df_pass_local_ptr_to_callee_func( foo->ptr); return 0; } Additionally, a handful of different test functions were also included in the target binary in an effort to ensure that other scenarios were not improperly detected as matching the criteria. Some examples of these functions include: int tc_df_pass_local_to_callee_alias() { int var = 2; int p = var; tc_df_pass_local_to_callee_func(p); return 0; } int tc_df_pass_local_to_callee_deref() { int var = 2; int *p = &var; tc_df_pass_local_to_callee_func(*p); return 0; } int tc_df_pass_heap_ptr_to_callee(struct _foo *foo) { tc_df_pass_local_ptr_to_callee_func(&foo->val); return 0; } When running the analysis tool against the target binary, the following output is shown: >PhaseRunner.exe detectani.xml dfa.exe Running phase: ANI Detection ... 1 target(s) Displaying 3 normalizables at the ProgramElement.Method granularity... 00001: dfa!tc_df_pass_local_ptr_to_callee_alias 00002: dfa!tc_df_pass_local_ptr_to_callee 00003: dfa!tc_df_pass_local_ptr_to_callee_alias_struct While this unfortunately does not prove that these techniques could be used to identify the function containing the ANI vulnerability, it does nevertheless hint at the potential for detecting the function containing the ANI vulnerability using its suggested exploitation and vulnerability properties. As an side, another interesting way in which this type of detection can be accomplished is through the use of Language Integrated Queries (LINQ) which are now supported in Visual Studio 2008[11]. For instance, a simple LINQ expression for the above narrowing operation can be expressed as: var matches = from Method method in engine.GetScopeMethods() where !method.IsGuardStackEnabled() && method.IsPassingStackLocalPtrToChild() select method; foreach (var method in matches) Console.WriteLine("{0} matches", method); 4) Potential Uses Program analysis is one area that may benefit from the use of exploitation properties. In particular, an auditor can make use of exploitation properties to assist in the process of identifying regions of code that should be audited more closely or with greater precedence. This determination can be made by using exploitation properties to understand the ease of exploitation associated with specific binaries or functions. By combining this information with other data that is collected either manually or automatically, an auditor can get a better understanding of the security aspects that are associated with a system. This is beneficial both to an attacker and a defender. An attacker can identify regions of code that would be easier to exploit and thus devote more time to auditing those regions. Likewise, a defender can use this information to the same extent but for different purposes. This type of information is especially useful to a defender who needs to balance the cost associated with performing security reviews because it should offer a better understanding of what the business cost might be if a vulnerability is found in a region of code. This cost can be derived from the negative publicity and response effort needed to cope with a flaw that is found publicly in a region of code that is widely exploited. For example, consider some of the Windows flaws that have lead to wormable issues and the cost they have had relative to other issues. Exploitation properties may also benefit the security community by helping to identify ways in which future mitigations can be applied. This would involve analyzing regions of code that could be more easily exploited in an effort to determine what other forms of mitigations could help to protect these regions, if any. This information could be fed back to the compiler to make it possible for mitigations to be enabled that might otherwise be disabled by default. For example, a function that by default would not have GS but is subsequently found to be highly exploitable may benefit from having the compiler insert GS. 5) Future Work While this paper has defined exploitation properties and described a handful of concrete examples, it has not attempted to formally define the correlation between exploitation properties and the exploitation techniques they are associated with. Future research will attempt to concretely define this relationship as it should lead to a better understanding of the variables that permit the use of various exploitation techniques. Using more formal definitions of exploitation properties, a larger scale case study can be completed which collects data about the effect of using exploitation properties to improve program understanding for a variety of purposes. The author views exploitation properties as being one component in a larger model. This larger model could be used to join major areas of study within computer security including attack surface analysis, vulnerability analysis, and exploitation analysis to form a more complete understanding of the true risks associated with a system. 6) Conclusion This paper has introduced the general concept of exploitation properties and described how they can be used to better understand the exploitability of a system. The purpose of an exploitation property is to help convey the ease with which a vulnerability might be exploited if one is found to be present. Exploitation properties can be broken down into different categories based on the configuration or context that a given property is associated from. These categories include operating platforms, running processes, binary modules, and functions. Exploitation properties can be used to provide an alternative understanding of an application's attack surface from the perspective of which areas would be most trivially exploited. This can allow an attacker to focus on finding security issues in code that would be more easily exploited. Likewise, a defender can draw the same conclusions and direct resources of their own at reviewing the associated code. It may also be possible to use this information to augment existing mitigations or to come up with new mitigations. A contrived example based on the form of the ANI vulnerability was used to illustrate an automated approach to extracting exploitation properties and using them to help identify a constrained subset of regions of code that meet a specific criteria. Future research will attempt to better define the extent of exploitation properties and their uses. [1] Dowd, M., Metha, N., McDonald, J. Breaking C++ Applications. https://www.blackhat.com/presentations/bh-usa-07/Dowd_McDonald_and_Mehta/Whitepaper/bh-usa-07-dowd_mcdonald_and_mehta.pdf [2] Durden, Tyler. Bypassing PaX ASLR Protection. July, 2002. http://www.phrack.org/issues.html?issue=59&id=9 [3] Howard, Michael. Protecting against Pointer Subterfuge (Kinda!). http://blogs.msdn.com/michael_howard/archive/2006/01/30/520200.aspx [4] Johnson, Richard. Windows Vista: Exploitation Countermeasures. http://rjohnson.uninformed.org/ [5] Litchfield, David. Defeating the Stack Based Buffer Overflow Prevention Mechanism of Microsoft Windows 2003 Server. http://www.nextgenss.com/papers/defeating-w2k3-stack-protection.pdf [6] Metasploit. Exploiting the ANI vulnerability on Vista. http://blog.metasploit.com/2007/04/exploiting-ani-vulnerability-on-vista.html [7] Microsoft Corporation. Microsoft Security Bulletin MS05-002. Jan, 2005. http://www.microsoft.com/technet/security/Bulletin/MS05-002.mspx [8] Microsoft Corporation. /GS (Buffer Security Check). http://msdn2.microsoft.com/en-us/library/8dbf701c(VS.80).aspx [9] Microsoft Corporation. /SAFESEH (Image has Safe Exception Handlers). http://msdn2.microsoft.com/en-us/library/9a89h429.aspx [10] Microsoft Corporation. A detailed description of the Data Execution Prevention (DEP) feature. http://support.microsoft.com/kb/875352 [11] Microsoft Corporation. The LINQ Project. http://msdn2.microsoft.com/en-us/netframework/aa904594.aspx [12] Microsoft Corporation. Phoenix. http://research.microsoft.com/phoenix/ [13] Microsoft Corporation. Microsoft Portable Executable and Object File Format Specification. http://download.microsoft.com/download/9/c/5/9c5b2167-8017-4bae-9fde-d599bac8184a/pecoff_v8.doc [14] Microsoft Corporation. Threat Modeling. June, 2003. http://msdn2.microsoft.com/en-us/library/aa302419.aspx [15] PaX Team. ASLR. http://pax.grsecurity.net/docs/aslr.txt [16] Ren, Chris et al. Microsoft Compiler Flaw Technical Note. http://www.cigital.com/news/index.php?pg=art&artid=70 [17] Rahbar, Ali. An analysis of Microsoft Windows Vista's ASLR. Oct, 2006. http://www.sysdream.com/articles/Analysis-of-Microsoft-Windows-Vista's-ASLR.pdf [18] skape, Skywing. Bypassing Windows Hardware-enforced DEP. http://www.uninformed.org/?v=2&a=4&t=sumry [19] skape. Preventing the Exploitation of SEH Overwrites. http://www.uninformed.org/?v=5&a=2&t=sumry [20] skape. Reducing the Effective Entropy of GS Cookies. http://www.uninformed.org/?v=7&a=2&t=sumry [21] Skywing. Vista ASLR is not on by default for image base addresses. http://www.nynaeve.net/?p=100 [22] Sotirov, Alexander. Windows Animated Cursor Stack Overflow Vulnerability. March, 2007. http://www.determina.com/security.research/vulnerabilities/ani-header.html [23] Wikipedia. Stack-smashing protection. http://en.wikipedia.org/wiki/Stack-smashing_protection [24] Wikipedia. Address space layout randomization. http://en.wikipedia.org/wiki/ASLR [25] Wikipedia. Static single assignment form. http://en.wikipedia.org/wiki/Static_single_assignment_form [26] University of Wisconsin. Wisconsin Program-Slicing Project's Home Page. http://www.cs.wisc.edu/wpis/html/ [27] Whitehouse, Ollie. Analysis of GS protections in Microsoft Windows Vista. http://www.symantec.com/avcenter/reference/GS_Protections_in_Vista.pdf