Temporal Return Addresses: Exploitation Chronomancy

skape
mmiller@hick.org
Last modified: 8/6/2005

1) Foreword

Abstract: Nearly all existing exploitation vectors depend on some knowledge of a process' address space prior to an attack in order to gain meaningful control of execution flow. In cases where this is necessary, exploit authors generally make use of static addresses that may or may not be portable between various operating system and application revisions. This fact can make exploits unreliable depending on how well researched the static addresses were at the time that the exploit was implemented. In some cases, though, it may be possible to predict and make use of certain addresses in memory that do not have static contents. This document introduces the concept of temporal addresses and describes how they can be used, under certain circumstances, to make exploitation more reliable.

Disclaimer: This document was written in the interest of education. The author cannot be held responsible for how the topics discussed in this document are applied.

Thanks: The author would like to thank H D Moore, spoonm, thief, jhind, johnycsh, vlad902, warlord, trew, vax, uninformed, and all the friends of nologin!

With that, on with the show...

2) Introduction

A common impediment to the implementation of portable and reliable exploits is the location of a return address. It is often required that a specific instruction, such as a jmp esp, be located at a predictable location in memory so that control flow can be redirected into an attacker controlled buffer. This scenario is more common on Windows, but applicable scenarios exist on UNIX derivatives as well. Many times, though, the locations of the instructions will vary between individual versions of an operating system, thus limiting an exploit to a set of version-specific targets that may or may not be directly determinable at attack time.

In order to make an exploit independent of, or at least less dependent on, a target's operating system version, a shift in focus becomes necessary. Through the blur of rhyme and reason an attacker might focus and realize that not all viable return addresses will exist indefinitely in a target process' address space. In fact, viable return addresses can be found in a transient state throughout the course of a program's execution. For instance, a pointer might be stored at a location in memory that happens to contain a viable two byte instruction somewhere within the bytes that compose the pointer's address. Alternatively, an integer value somewhere in memory could be initialized to a value that is equivalent to a viable instruction. In both cases, though, the contents and locations of the values will almost certainly be volatile and unpredictable, thus making them unsuitable for use as return addresses.

Fortunately, however, there does exist at least one condition that can lend itself well to portable exploitation that is bounded not by the operating system version the target is running on, but instead by a defined window of time. In a condition such as this, a timer of some sort must exist at a predictable location in memory that is known to be updated at a constant time interval, such as every second. The location in memory that the timer resides at is known as a temporal address. On top of this, it is also important for the attacker to determine the scale of measurement the timer is operating on, such as whether it's measured in epoch time (from 1970 or 1601) or whether it's simply acting as a counter.
With these three elements identified, an attacker can attempt to predict the periods of time where a useful instruction can be found in the bytes that compose the future state of any timer in memory. To help illustrate this, suppose an attacker is attempting to find a reliable location of a jmp edi instruction. The attacker knows that the program being exploited has a timer that holds the number of seconds since Jan. 1, 1970 at a predictable location in memory. By doing some analysis, the attacker could determine that on Wednesday July 27th, 2005 at 3:39:12PM CDT, a jmp edi could be found within any four byte timer that stores the number of seconds since 1970. The window of opportunity, however, would only last for 4 minutes and 16 seconds assuming the timer is updated every second.

By accounting for timing as a factor in the selection of return addresses, an attacker can be afforded options beyond those normally seen when the address space of a process is viewed as unchanging over time. In that light, this document is broken into three portions. First, the steps needed to find, analyze, and make use of temporal addresses will be explained. Second, upcoming viable opcode windows will be shown and explained along with methods that can be used to determine target time information prior to exploitation. Finally, examples of commonly occurring temporal addresses on Windows NT+ will be described and analyzed to provide real world examples of the subject of this document.

Before starting, though, it is important to understand some of the terminology that will be used, or perhaps abused, in the interest of conveying the concepts. The term temporal address is used to describe a location in memory that contains a timer of some sort. The term opcode is used interchangeably with the term instruction to convey the set of viable bytes that could partially compose a given temporal state. The term update period is used to describe the amount of time that it takes for the contents of a temporal address to change. Finally, the term scale is used to describe the unit of measure for a given temporal address.

3) Locating Temporal Addresses

In order to make use of temporal addresses it is first necessary to devise a method of locating them. To begin this search it is necessary that one understand the attributes of a temporal address. All temporal addresses are defined as storing a time-associated counter that increments at a constant interval. For instance, an example would be a location in memory that stores the number of seconds since Jan. 1, 1970 and is incremented every second. As a more concrete definition, all time-associated counters found in memory are represented in terms of a scale (the unit of measure), an interval or period (how often they are updated), and a maximum storage capacity (variable size). If any of these parts are unknown or variable for a given memory location, it is impossible for an attacker to consistently leverage it as a time-bounded return address because of the inability to predict the byte values at the location for a given period of time.

With the three major components of a temporal address identified (scale, period, and capacity), a program can be written to search through a process' address space with the goal of identifying regions of memory that are updated at a constant period. From there, a scale and capacity can be inferred based on an arbitrarily complex set of heuristics, the simplest of which can identify regions that are storing epoch time.
It's important to note, though, that not all temporal addresses will have a scale that is measured as an absolute time period. Instead, a temporal address may simply store the number of seconds that have passed since the start of execution, among other scenarios. These temporal addresses are described as having a scale that is simply equivalent to their period and are for that reason referred to as counters.

To illustrate the feasibility of such a program, the author has implemented an algorithm that should be conceptually portable to all platforms, though the implementation itself is limited to Windows NT+. The approach taken by the author, at a high level, is to poll a process' address space multiple times with the intention of analyzing changes to the address space over time. In order to reduce the amount of memory that must be polled, the program is also designed to skip over regions that are backed against an image file or are otherwise inaccessible.

To accomplish this task, each polling cycle is designed to be separated by a constant (or nearly constant) time interval, such as 5 seconds. By increasing the interval between polling cycles the program can detect temporal addresses that have a larger update period. The granularity of this period of time is measured in nanoseconds in order to support high resolution timers that may exist within the target process' address space. This allows the program to detect timers measured in nanoseconds, microseconds, milliseconds, and seconds. The purpose of the delay between polling cycles is to give temporal address candidates the ability to complete one or more update periods. As each polling cycle occurs, the program reads the contents of the target process' address space for a given region and caches it locally within the scanning process. This is necessary for the next phase.

After at least two polling cycles have completed, the program can compare the cached memory region differences between the most recent view of the target process' address space and the previous view. This is accomplished by walking through the contents of each cached memory region in four byte increments to see if there is any difference between the two views. If a temporal address exists, the contents of the two views should have a difference that is no larger than the maximum period of time that occurred between the two polling cycles. It's important to remember that the maximum period can be conveyed down to nanosecond granularity. For instance, if the polling cycle period was 5 seconds, any portion of memory that changed by more than 5 seconds, 5000 milliseconds, or 5000000 microseconds is obviously not a temporal address candidate. To that point, any region of memory that didn't change at all is also most likely not a temporal address candidate, though it is possible that the region of memory simply has an update period that is longer than the polling cycle.

Once a memory location is identified that has a difference between the two views that is within or equal to the polling cycle period, the next step of analysis can begin. It's perfectly possible for memory locations that meet this requirement to not actually be timers, so further analysis is necessary to weed them out. At this point, though, memory locations such as these can be referred to as temporal address candidates.
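To make the comparison phase more concrete, the following sketch diffs two cached snapshots of a single memory region in four byte increments and flags words whose change is bounded by the polling period. This is a minimal approximation of the approach described above, not the author's actual implementation; the process handle, region base, and snapshot buffers are assumed to come from a prior VirtualQueryEx/ReadProcessMemory walk of the target's address space.

    #include <windows.h>
    #include <stdio.h>

    /*
     * Diff two cached snapshots of the same region, taken one polling
     * cycle apart, in four byte increments. deltaNs is the time between
     * the snapshots in nanoseconds; a word whose value changed by more
     * than deltaNs cannot be a timer with a period of one nanosecond or
     * coarser and is rejected outright.
     */
    static void FindCandidates(const unsigned char *prev,
                               const unsigned char *cur,
                               SIZE_T size, ULONG_PTR base,
                               unsigned long long deltaNs)
    {
        SIZE_T i;

        for (i = 0; i + 4 <= size; i += 4) {
            ULONG oldVal = *(const ULONG *)(prev + i);
            ULONG newVal = *(const ULONG *)(cur + i);
            unsigned long long diff;

            /* an unchanged word may simply have an update period longer
             * than the polling cycle; it is skipped rather than flagged */
            if (newVal == oldVal)
                continue;

            diff = (newVal > oldVal) ? newVal - oldVal : oldVal - newVal;

            if (diff <= deltaNs)
                printf("candidate at 0x%p (diff=%llu)\n",
                       (void *)(base + i), diff);
        }
    }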
The next step is to attempt to determine the period of the temporal address candidate. This is accomplished by some rather silly, but functional, logic. First, the delta between the polling cycles is calculated down to nanosecond granularity. In a best case scenario, the granularity of a polling cycle that is spaced apart by 5 seconds will be 5000000000 nanoseconds. It's not safe to assume this constant though, as thread scheduling and other non-constant parameters can affect the delta between polling cycles for a given memory region. The next step is to iteratively compare the difference between the two views to the current delta to see if the difference is greater than or equal to the current delta. If it is, it can be assumed that the difference is within the current unit of measure. If it's not, the current delta should be divided by 10 to progress to the next unit of measure. When broken down, the progressive transition in units of measurement is described in figure 3.1.

    Delta        Measurement
    ---------------------------
    1000000000   Nanoseconds
    100000000    10 Nanoseconds
    10000000     100 Nanoseconds
    1000000      Microseconds
    100000       10 Microseconds
    10000        100 Microseconds
    1000         Milliseconds
    100          10 Milliseconds
    10           100 Milliseconds
    1            Seconds

    Figure 3.1: Delta measurement reductions

Once a unit of measure for the update period is identified, the difference is divided by the current delta to produce the update period for a given temporal address candidate. For example, if the difference was 5 and the current delta was 5, the update period for the temporal address candidate would be 1 second (5 updates over the course of 5 seconds).

With the update period identified, the next step is to attempt to determine the storage capacity of the temporal address candidate. In this case, the author chose to take a shortcut, though there are most certainly better approaches that could be taken given sufficient interest. The author chose to assume that if the update period for a temporal address candidate was measured in nanoseconds, then it was almost certainly at least the size of a 64-bit integer (8 bytes on x86). On the other hand, all other update periods were assumed to imply a 32-bit integer (4 bytes on x86).

With the temporal address candidate's storage capacity identified in terms of bytes, the next step is to identify the scale that the temporal address may be conveying (the timer's unit of measure). To accomplish this, the program calculates the two ranges of seconds since 1970 and since 1601 that fall between the current time minus at least the polling cycle period and the current time itself. The temporal address candidate's current value (as stored in memory) is then converted to seconds using the determined update period and compared against the two epoch time ranges. If the candidate's converted current value is within either epoch time range then it can most likely be assumed that the temporal address candidate's scale is measured from epoch time, either from 1970 or 1601 depending on the range it was within. While this sort of comparison is rather simple, any other arbitrarily complex set of logic could be put into place to detect other types of time scales. In the event that none of the logic matches, the temporal address candidate is deemed to simply have a scale of a counter (as defined previously in this chapter).
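The period and scale heuristics lend themselves to a direct translation into code. The sketch below walks the observed difference down the units of measure from figure 3.1 and then applies the simple epoch comparison described above. It is an illustration rather than the author's tooling; the helper's inputs are assumptions made for the sake of the example, and 11644473600 is the number of seconds separating Jan. 1, 1601 from Jan. 1, 1970.

    #include <stdio.h>
    #include <time.h>

    /* unit names for each divide-by-10 reduction (figure 3.1) */
    static const char *units[] = {
        "nsec", "10 nsec", "100 nsec", "usec", "10 usec", "100 usec",
        "msec", "10 msec", "100 msec", "sec"
    };

    /*
     * diff     - raw change observed at the candidate between two views
     * deltaNs  - time between the two polling cycles, in nanoseconds
     * value    - candidate's current contents, widened to 64 bits
     * cycleSec - polling cycle period, in seconds
     */
    static void Classify(unsigned long long diff, unsigned long long deltaNs,
                         unsigned long long value, unsigned long long cycleSec)
    {
        unsigned long long delta = deltaNs; /* e.g. 5000000000 for 5 secs */
        unsigned long long tickNs = 1, secs;
        unsigned long long now = (unsigned long long)time(NULL);
        int unit = 0;

        /* reduce the delta by an order of magnitude until the observed
         * difference is at least as large as the current delta */
        while (unit < 9 && diff < delta) {
            delta /= 10;
            tickNs *= 10;
            unit++;
        }

        /* e.g. a difference of 5 against a delta of 5 is 1 sec */
        printf("period: %llu %s\n", diff / delta, units[unit]);

        /* scale heuristic: one tick spans tickNs nanoseconds; convert
         * the candidate's current value to seconds and see whether it
         * falls within the current epoch range for 1970 or 1601 */
        secs = value / (1000000000ULL / tickNs);

        if (secs <= now && secs >= now - cycleSec)
            printf("scale: epoch (1970)\n");
        else if (secs <= now + 11644473600ULL &&
                 secs >= now + 11644473600ULL - cycleSec)
            printf("scale: epoch (1601)\n");
        else
            printf("scale: counter\n");
    }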
Finally, with the period, scale, and capacity for the temporal address candidate identified, the only thing left is to check whether the three components are equivalent to the components collected for the candidate during previous polling cycles. If they differ by orders of magnitude then it is probably safe to assume that the candidate is not actually a temporal address. On the other hand, consistent components between polling cycles for a temporal address candidate are almost a sure sign that it is indeed a temporal address.

When everything is said and done, the program should collect every temporal address in the target process that has an update period less than or equal to the polling cycle period. It should also have determined the scale and size of the temporal address. When run on Windows against a program that stores the current epoch time since 1970, in seconds, in a variable every second, the following output is displayed:

    C:\>telescope 2620
    [*] Attaching to process 2620 (5 polling cycles)...
    [*] Polling address space........

    Temporal address locations:

    0x0012FE88 [Size=4, Scale=Counter, Period=1 sec]
    0x0012FF7C [Size=4, Scale=Epoch (1970), Period=1 sec]
    0x7FFE0000 [Size=4, Scale=Counter, Period=600 msec]
    0x7FFE0014 [Size=8, Scale=Epoch (1601), Period=100 nsec]

This output tells us that the address of the variable that is storing the epoch time since 1970 can be found at 0x0012FF7C and has an update period of one second. The other things that were found will be discussed later in this document.

3.1) Determining Per-byte Durations

Once the update period and size of a temporal address have been determined, it is possible to calculate the amount of time it takes to change each byte position in the temporal address. For instance, if a four byte temporal address with an update period of 1 second were found in memory, the first byte (or LSB) would change once every second, the second byte would change once every 256 seconds, the third byte would change once every 65536 seconds, and the fourth byte would change once every 16777216 seconds. The reason these properties are exhibited is because each byte position has 256 possibilities (0x00 to 0xff inclusive), meaning each byte position's duration increases by 256 raised to a given power. This can be described as shown in figure 3.2.

    Let x equal the byte index starting at zero for the LSB.

    duration(x) = 256 ^ x

    Figure 3.2: Period independent byte durations

The next step to take after determining period-specific byte durations is to convert the durations to a more readily usable measure when the period is more granular than a second. For instance, figure 3.3 shows the conversion that can be applied when each byte duration of an 8 byte temporal address is measured in 100 nanosecond intervals, converting a byte duration from 100 nanosecond intervals to seconds.

    tosec(x) = duration(x) / 10^7

    Figure 3.3: 100 nanosecond byte durations to seconds

This phase is especially important when it comes to calculating viable opcode windows because it is necessary to know for how long a viable opcode will exist, which is directly dependent on the duration of the opcode byte closest to the LSB. This will be discussed in more detail in chapter 4.

4) Calculating Viable Opcode Windows

Once a set of temporal addresses has been located, the next logical step is to attempt to calculate the windows of time during which one or more viable opcodes can be found within the bytes of the temporal address. It is also just as important to calculate the duration of each byte within the temporal address. This is the type of information that is required in order to determine when a portion of a temporal address can be used as a return address for an exploit.
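The duration and tosec equations from the previous chapter are sufficient to generate the per-byte duration tables used throughout this chapter. As a minimal illustration (not part of any actual tool), the following sketch prints the durations for an 8 byte timer with a 100 nanosecond update period; 256^x is computed as a shift since each byte index multiplies the duration by 256:

    #include <stdio.h>

    int main(void)
    {
        int x;

        for (x = 0; x < 8; x++) {
            /* duration(x) = 256 ^ x ticks (figure 3.2) */
            unsigned long long ticks = 1ULL << (8 * x);

            /* tosec(x) = duration(x) / 10^7 for a 100ns period (3.3) */
            printf("byte %d: %llu seconds\n", x, ticks / 10000000ULL);
        }

        return 0;
    }

Running this produces the values shown in figure 4.1 below.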
The approach taken to accomplish this is to make use of the equations provided in the previous chapter for calculating the number of seconds it takes for each byte to change based on the update period for a given temporal address. By using the tosec function for each byte index, a table can be created as illustrated in figure 4.1 for an 8 byte timer with a 100 nanosecond period.

    Byte Index  Seconds (ext)
    ------------------------
    0           0 (zero)
    1           0 (zero)
    2           0 (zero)
    3           1 (1 sec)
    4           429 (7 mins 9 secs)
    5           109951 (1 day 6 hours 32 mins 31 secs)
    6           28147497 (325 days 18 hours 44 mins 57 secs)
    7           7205759403 (228 years 179 days 23 hours 50 mins 3 secs)

    Figure 4.1: 8 byte 100ns per-byte durations in seconds

This shows that any opcodes starting at byte index 4 will have a 7 minute and 9 second window of time. The only thing left to do is figure out when to strike.

5) Picking the Time to Strike

The time to attack is entirely dependent on both the update period of the temporal address and its scale. In most cases, temporal addresses that have a scale that is relative to an arbitrary date (such as 1970 or 1601) are the most useful because they can be predicted or determined with some degree of certainty. Regardless, a generalized approach can be used to determine projected time intervals where useful opcodes will occur.

To do this, it is first necessary to identify the set of instructions that could be useful for a given exploit, such as a jmp esp. Once identified, the next step is to break the instructions down into their raw opcodes, such as 0xff 0xe4 for jmp esp. After all the raw opcodes have been collected, it is then necessary to begin calculating the projected time intervals that the bytes will occur at. The method used to accomplish this is rather simple.

First, a starting byte index must be determined in terms of the lowest acceptable window of time that an exploit can use. In the case of a 100 nanosecond timer, the best byte index to start at would be byte index 4, considering all previous indexes have a duration of less than or equal to one second. The bytes that occur at index 4 have a 7 minute and 9 second duration, thus making them feasible for use.

With the starting byte index determined, the next step is to create permutations of all subsequent opcode byte combinations. In simpler terms, this means producing all of the possible byte value combinations that contain the raw opcodes of a given instruction at a byte index equal to or greater than the starting byte index. To help visualize this, figure 5.1 provides a small sample of jmp esp byte combinations in relation to a 100 nanosecond timer.

    Byte combinations
    -----------------------
    00 00 00 00 ff e4 00 00
    00 00 00 00 ff e4 01 00
    00 00 00 00 ff e4 02 00
    ...
    00 00 00 00 ff e4 47 04
    00 00 00 00 ff e4 47 05
    00 00 00 00 ff e4 47 06
    ...
    00 00 00 00 00 ff e4 00
    00 00 00 00 00 ff e4 01
    00 00 00 00 00 ff e4 02

    Figure 5.1: 8 byte 100ns jmp esp byte combinations
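Generating these permutations amounts to fixing the opcode bytes at a given index, zeroing the bytes below them, and iterating over the bytes above them. The following sketch, an illustrative simplification rather than the author's tooling, prints the first few permutations for jmp esp at byte index 4 of an 8 byte timer:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long high;

        /* 0xff 0xe4 (jmp esp) at byte indexes 4 and 5 of a little
         * endian 64-bit value occupies bits 32-47 as 0xe4ff; the two
         * remaining high bytes (indexes 6 and 7) would be iterated
         * through all 65536 possibilities, while the low four bytes
         * are left as zero */
        for (high = 0; high < 4; high++) {
            unsigned long long state = (high << 48) | (0xe4ffULL << 32);

            printf("0x%016llx\n", state); /* e.g. 0x0000e4ff00000000 */
        }

        return 0;
    }

When viewed as raw bytes in memory, each printed state corresponds to one row of figure 5.1.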
Once all of the permutations have been generated, the next step is to convert them to meaningful absolute time representations. This is accomplished by converting all of the permutations, which represent past, present, or future states of the temporal address, to seconds. For instance, one of the permutations for a jmp esp instruction found within the 64-bit 100 nanosecond timer is 0x019de4ff00000000 (116500949249294336). Converting this to seconds is accomplished by doing:

    11650094924 = trunc(116500949249294336 / 10^7)

This tells us the number of seconds that will have passed when the stars align to form this byte combination, but it does not convey the scale in which the seconds are measured, such as whether they are based from an absolute date (such as 1970 or 1601) or are simply acting as a timer. In this case, if the scale were defined as being the number of seconds since 1601, the total number of seconds could be adjusted to indicate the number of seconds that have occurred since 1970 by subtracting the constant number of seconds between 1601 and 1970:

    5621324 = 11650094924 - 11644473600

This indicates that a total of 5621324 seconds will have passed since 1970 when 0xff is found at byte index 4 and 0xe4 is found at byte index 5. The window of opportunity will be 7 minutes and 9 seconds, after which point the 0xff will become a 0x00, the 0xe4 will become 0xe5, and the instruction will no longer be usable. If 5621324 is converted to a printable date format based on the number of seconds since 1970, one can find that the date that this particular permutation will occur at is Fri Mar 06 19:28:44 CST 1970.
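The conversion chain just described is short enough to express directly in code. A minimal sketch, using the standard 11644473600 second offset between the 1601 and 1970 epochs:

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        /* a jmp esp permutation of an 8 byte 100ns timer (figure 5.1) */
        unsigned long long state = 0x019de4ff00000000ULL;

        /* 100ns ticks to seconds, then re-base from 1601 to 1970 */
        unsigned long long secs1601 = state / 10000000ULL;     /* 11650094924 */
        time_t secs1970 = (time_t)(secs1601 - 11644473600ULL); /* 5621324 */

        /* prints the local time equivalent of Fri Mar 06 19:28:44 CST 1970 */
        printf("window opens: %s", ctime(&secs1970));

        return 0;
    }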
While it has now been shown that it is perfectly possible to predict specific times in the past, present, and future that a given instruction or instructions can be found within a temporal address, such an ability is not useful without being able to predict or determine the state of the temporal address on a target computer at a specific moment in time. For instance, while an exploitation chronomancer knows that a jmp esp can be found on March 6th, 1970 at about 7:30 PM, it must also be known what the target machine's system time is set to, down to a granularity of mere seconds, or at least minutes. While guessing is always an option, it is almost certainly going to be less fruitful than making use of existing tools and services that are more than willing to provide a would-be attacker with information about the current system time on a target machine. Some of the approaches that can be taken to gather this information will be discussed in the next section.

5.1) Determining System Time

There are a variety of techniques that can potentially be used to determine the system time of a target machine with varying degrees of accuracy. The techniques listed in this section are by no means all-encompassing but do serve as a good base. Each technique will be elaborated on in the following sub-sections.

5.1.1) DCERPC SrvSvc NetrRemoteTOD

One approach that can be taken to obtain very granular information about the current system time of a target machine is to use the SrvSvc's NetrRemoteTOD request. To transmit this request to a target machine, a NULL session (or authenticated session) must be established using the standard Session Setup AndX SMB request. After that, a Tree Connect AndX to the IPC$ share should be issued. From there, an NT Create AndX request can be issued on the srvsvc named pipe. Once the request is handled successfully, the file descriptor returned can be used for the DCERPC bind request to the SrvSvc's UUID. Finally, once the bind request has completed successfully, a NetrRemoteTOD request can be transacted over the named pipe using a TransactNmPipe request. The response to this request contains very granular information, such as day, hour, minute, second, and timezone, as well as other fields that are needed to determine the target machine's system time.

This vector is very useful because it provides easy access to the complete state of a target machine's system time, which in turn can be used to calculate the windows of time during which a temporal address can be used for exploitation. The negative to this approach is that it requires access to the SMB ports (either 139 or 445), which will most likely be inaccessible to an attacker.

5.1.2) ICMP Timestamps

The ICMP TIMESTAMP request (type 13) can be used to obtain a machine's measurement of the number of milliseconds that have occurred since midnight UT. If an attacker can infer or assume that a target machine's system time is set to a specific date and timezone, it may be possible to calculate the absolute system time down to a millisecond resolution. This would satisfy the timing requirements and make it possible to use temporal addresses that have a scale measured from an absolute time. According to the RFC, though, if a system is unable to determine the number of milliseconds since midnight UT then it can use another value capable of representing time (though it must set a high-order bit to indicate the non-standard value).

5.1.3) IP Timestamp Option

Like the ICMP TIMESTAMP request, IP also has a timestamp option (type 68) that measures the number of milliseconds since midnight UT. This could also be used to determine down to a millisecond resolution what the remote system's clock is set to. Since the measurement is the same, the limitations are the same as those of ICMP's TIMESTAMP request.

5.1.4) HTTP Server Date Header

In scenarios where a target machine is running an HTTP server, it may be possible to extract the system time by simply sending an HTTP request and checking whether the response contains a Date header. If present, the Date header conveys the server's system time down to a granularity of seconds, for example:

    Date: Sat, 06 Aug 2005 17:32:20 GMT

5.1.5) IRC CTCP TIME

Perhaps one of the more lame approaches to obtaining a target machine's time is by issuing a CTCP TIME request over IRC. This request is designed to instruct the responder to reply with a readable date string that is relative to the responder's system time. Unless spoofed, the response should be equivalent to the system time on the remote machine.

6) Determining the Return Address

Once all the preliminary work of calculating the viable opcode windows has been completed and a target machine's system time has been determined, the final step is to select the next available window for a compatible opcode group. For instance, if the next window for a jmp esp equivalent instruction is Sun Sep 25 22:37:28 CDT 2005, then the byte index to the start of the jmp esp equivalent must be determined based on the permutation that was generated. In this case, the permutation that would have been generated (assuming a 100 nanosecond period since 1601) is 0x01c5c25400000000. This means that the jmp esp equivalent is actually a push esp, ret which starts at byte index four. If the start of the temporal address is at 0x7ffe0014, then the return address that should be used in order to get the push esp, ret to execute would be 0x7ffe0018. This basic approach is common to all temporal addresses of varying capacity, period, and scale.
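Put concretely, the return address is nothing more than the temporal address base plus the byte index at which the opcode bytes begin. A trivial sketch using the example above:

    #include <stdio.h>

    int main(void)
    {
        unsigned long base  = 0x7ffe0014; /* temporal address (SystemTime) */
        unsigned long index = 4;          /* push esp, ret begins here     */

        /* 0x7ffe0018 is the value an attacker would overwrite eip with */
        printf("return address: 0x%08lx\n", base + index);

        return 0;
    }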
7) Case Study: Windows NT SharedUserData

With all the generic background information out of the way, a real world practical use of this technique can be illustrated through an analysis of a region of memory that happens to be found in every process on Windows NT+. This region of memory is referred to as SharedUserData and has a backward compatible format for all versions of NT, though new fields have been appended over time. At present, the data structure that represents SharedUserData is _KUSER_SHARED_DATA, which is defined as follows on Windows XP SP2:

    0:000> dt _KUSER_SHARED_DATA
       +0x000 TickCountLow                : Uint4B
       +0x004 TickCountMultiplier         : Uint4B
       +0x008 InterruptTime               : _KSYSTEM_TIME
       +0x014 SystemTime                  : _KSYSTEM_TIME
       +0x020 TimeZoneBias                : _KSYSTEM_TIME
       +0x02c ImageNumberLow              : Uint2B
       +0x02e ImageNumberHigh             : Uint2B
       +0x030 NtSystemRoot                : [260] Uint2B
       +0x238 MaxStackTraceDepth          : Uint4B
       +0x23c CryptoExponent              : Uint4B
       +0x240 TimeZoneId                  : Uint4B
       +0x244 Reserved2                   : [8] Uint4B
       +0x264 NtProductType               : _NT_PRODUCT_TYPE
       +0x268 ProductTypeIsValid          : UChar
       +0x26c NtMajorVersion              : Uint4B
       +0x270 NtMinorVersion              : Uint4B
       +0x274 ProcessorFeatures           : [64] UChar
       +0x2b4 Reserved1                   : Uint4B
       +0x2b8 Reserved3                   : Uint4B
       +0x2bc TimeSlip                    : Uint4B
       +0x2c0 AlternativeArchitecture     : _ALTERNATIVE_ARCHITECTURE_TYPE
       +0x2c8 SystemExpirationDate        : _LARGE_INTEGER
       +0x2d0 SuiteMask                   : Uint4B
       +0x2d4 KdDebuggerEnabled           : UChar
       +0x2d5 NXSupportPolicy             : UChar
       +0x2d8 ActiveConsoleId             : Uint4B
       +0x2dc DismountCount               : Uint4B
       +0x2e0 ComPlusPackage              : Uint4B
       +0x2e4 LastSystemRITEventTickCount : Uint4B
       +0x2e8 NumberOfPhysicalPages       : Uint4B
       +0x2ec SafeBootMode                : UChar
       +0x2f0 TraceLogging                : Uint4B
       +0x2f8 TestRetInstruction          : Uint8B
       +0x300 SystemCall                  : Uint4B
       +0x304 SystemCallReturn            : Uint4B
       +0x308 SystemCallPad               : [3] Uint8B
       +0x320 TickCount                   : _KSYSTEM_TIME
       +0x320 TickCountQuad               : Uint8B
       +0x330 Cookie                      : Uint4B

One of the purposes of SharedUserData is to provide processes with a global and consistent method of obtaining certain information that may be requested frequently, thus making it more efficient than having to incur the performance hit of a system call. Furthermore, as of Windows XP, SharedUserData acts as an indirect system call re-director such that the most optimized system call instructions can be used based on the current hardware's support, such as by using sysenter over the standard int 0x2e.

As can be seen right off the bat, SharedUserData contains a few fields that pertain to the timing of the current system. Furthermore, if one looks closely, it can be seen that these timer fields are actually updated constantly, as would be expected for any timer variable:

    0:000> dd 0x7ffe0000 L8
    7ffe0000  055d7525 0fa00000 93fd5902 00000cca
    7ffe0010  00000cca a78f0b48 01c59a46 01c59a46
    0:000> dd 0x7ffe0000 L8
    7ffe0000  055d7558 0fa00000 9477d5d2 00000cca
    7ffe0010  00000cca a808a336 01c59a46 01c59a46
    0:000> dd 0x7ffe0000 L8
    7ffe0000  055d7587 0fa00000 94e80a7e 00000cca
    7ffe0010  00000cca a878b1bc 01c59a46 01c59a46

The three timing-related fields of most interest are TickCountLow, InterruptTime, and SystemTime. These three fields will be explained individually later in this chapter.
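Because SharedUserData is mapped at the same address in every process, these fields can also be read directly from user-mode code rather than from a debugger. A minimal sketch, with offsets taken from the XP SP2 structure definition above:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* TickCountLow at +0x000; SystemTime.LowPart at +0x014 and
         * SystemTime.High1Time at +0x018; volatile because the kernel
         * updates these fields behind the process' back */
        volatile ULONG *tickCountLow  = (volatile ULONG *)0x7ffe0000;
        volatile ULONG *systemTimeLow = (volatile ULONG *)0x7ffe0014;
        volatile ULONG *systemTimeHi  = (volatile ULONG *)0x7ffe0018;
        int i;

        for (i = 0; i < 3; i++) {
            printf("TickCountLow=%08lx SystemTime=%08lx%08lx\n",
                   *tickCountLow, *systemTimeHi, *systemTimeLow);
            Sleep(1000);
        }

        return 0;
    }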
Prior to that, though, it is important to understand some of the properties of SharedUserData and why it is quite useful when it comes to temporal addresses.

7.1) The Properties of SharedUserData

There are a number of important properties of SharedUserData, some of which make it useful in terms of temporal addresses and others that make it somewhat infeasible depending on the exploit or hardware support. As far as the useful properties go, SharedUserData is located at a static address, 0x7ffe0000, in every version of Windows NT+. Furthermore, SharedUserData is mapped into every process. The reason for this is that NTDLL, and most likely other 3rd party applications, have been compiled and built with the assumption that SharedUserData is located at a fixed address. This is something many people are abusing these days when it comes to passing code from kernel-mode to user-mode. On top of that, SharedUserData is required to have a backward compatible data structure, which means that the offsets of all existing attributes will never shift, although new attributes may be, and have been, appended to the end of the data structure. Lastly, there are a few products for Windows that implement some form of ASLR. Unfortunately for these products, SharedUserData cannot be feasibly randomized, or at least the author is not aware of any approaches that wouldn't have severe performance impacts.

On the negative side of the house, perhaps the most limiting factor when it comes to making use of SharedUserData is that it has a null byte located at byte index one. Depending on the vulnerability, it may or may not be possible to use an attribute within SharedUserData as a return address due to NULL byte restrictions. Furthermore, as of XP SP2 and 2003 Server SP1, SharedUserData is no longer marked as executable, and using it as a return address will result in a DEP violation (if enabled) assuming the hardware supports PAE. While this is not very common yet, it is sure to become the norm over the course of time.

7.2) Locating Temporal Addresses

As seen previously in this document, using the telescope program on any Windows application will result in the same two (or three) timers being displayed:

    C:\>telescope 2620
    [*] Attaching to process 2620 (5 polling cycles)...
    [*] Polling address space........

    Temporal address locations:

    0x7FFE0000 [Size=4, Scale=Counter, Period=600 msec]
    0x7FFE0014 [Size=8, Scale=Epoch (1601), Period=100 nsec]

Referring to the structure definition described at the beginning of this chapter, it is possible to determine which attribute each of these addresses refers to. Each of the three timing-related attributes will be discussed in detail in the following sub-sections.

7.2.1) TickCountLow

The TickCountLow attribute is used, in combination with the TickCountMultiplier, to convey the number of milliseconds that have occurred since system boot. To calculate the number of milliseconds since system boot, the following equation is used:

    T = shr(TickCountLow * TickCountMultiplier, 24)

This attribute is representative of a temporal address that has a counter scale. It starts at an unknown time and increments at constant intervals. The biggest problem with this attribute is the interval at which it increases. It's possible that two machines in the same room with different hardware will have different update periods for the TickCountLow attribute. This makes it less feasible to use as a temporal address because the update period cannot be readily predicted. On the other hand, it may be possible to determine the current uptime of the machine through TCP timestamps or some alternative mechanism, but without the ability to determine the update period, the TickCountLow attribute seems unusable. This attribute is located at 0x7ffe0000 on all versions of Windows NT+.
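The tick count equation above translates directly into code; note that the multiplication must be performed with 64-bit precision before the shift, since it overflows 32 bits. A minimal sketch:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        volatile ULONG *tickCountLow        = (volatile ULONG *)0x7ffe0000;
        volatile ULONG *tickCountMultiplier = (volatile ULONG *)0x7ffe0004;

        /* T = shr(TickCountLow * TickCountMultiplier, 24) */
        unsigned long long ms =
            ((unsigned long long)*tickCountLow * *tickCountMultiplier) >> 24;

        printf("milliseconds since boot: %llu\n", ms);

        return 0;
    }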
7.2.2) InterruptTime

This attribute is used to store a 100 nanosecond timer starting at system boot that presumably counts the amount of time spent processing interrupts. The attribute itself is stored as a _KSYSTEM_TIME structure, which is defined as:

    0:000> dt _KSYSTEM_TIME
       +0x000 LowPart   : Uint4B
       +0x004 High1Time : Int4B
       +0x008 High2Time : Int4B

Depending on the hardware a machine is running, the InterruptTime's period may be exactly equal to 100 nanoseconds. However, testing has seemed to confirm that this is not always the case. Given this, both the update period and the scale of the InterruptTime attribute should be seen as limiting factors, which makes it less useful because it has the same limitations as the TickCountLow attribute. Specifically, without knowing when the system booted and when the counter started, or how much time has been spent processing interrupts, it is not possible to reliably predict when certain bytes will be at certain offsets. Furthermore, the machine would need to have been booted for a significant amount of time in order for some of the useful instructions to be feasibly found within the bytes that compose the timer. This attribute is located at 0x7ffe0008 on all versions of Windows NT+.

7.2.3) SystemTime

The SystemTime attribute is by far the most useful attribute when it comes to its temporal address qualities. The attribute itself is a 100 nanosecond timer that is measured from Jan. 1, 1601 and is stored as a _KSYSTEM_TIME structure like the InterruptTime attribute (see the previous sub-section for the structure definition). This means that it has an update period of 100 nanoseconds and a scale that measures from Jan. 1, 1601. The scale is also measured relative to the timezone that the machine is using (with the exclusion of daylight savings time). If an attacker is able to obtain information about the system time on a target machine, it may be possible to make use of the SystemTime attribute as a valid temporal address for exploitation purposes. This attribute is located at 0x7ffe0014 on all versions of Windows NT+.
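Since the three dwords of a _KSYSTEM_TIME cannot be read atomically, code that samples SystemTime typically re-reads until the two high parts agree. The following sketch reads the 8 bytes that matter for temporal address purposes; the retry loop is one commonly described way of getting a consistent snapshot and is shown here purely for illustration:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        volatile ULONG *low = (volatile ULONG *)0x7ffe0014; /* LowPart   */
        volatile LONG  *hi1 = (volatile LONG  *)0x7ffe0018; /* High1Time */
        volatile LONG  *hi2 = (volatile LONG  *)0x7ffe001c; /* High2Time */
        ULONG l;
        LONG h;

        do {
            h = *hi1;
            l = *low;
        } while (h != *hi2); /* retry if the kernel updated mid-read */

        /* 100ns units since Jan. 1, 1601; byte index 4 of this value
         * is the first with a window longer than one second (4.1) */
        printf("SystemTime: 0x%08lx%08lx\n", (ULONG)h, l);

        return 0;
    }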
7.3) Calculating Viable Opcode Windows

After analyzing SharedUserData for temporal addresses, it should be clear that the SystemTime attribute is by far the most useful and potentially feasible attribute due to its scale and update period. In order to successfully leverage it in conjunction with an exploit, though, the viable opcode windows must be calculated so that a time to strike can be selected. This can be done prior to determining what the actual date is on a target machine, but requires that the storage capacity (size of the temporal address in bytes), the update period, and the scale be known. In this case, the size of the SystemTime attribute is 12 bytes, though in reality the third field, High2Time, is exactly the same as the second, High1Time, so all that really matters are the first 8 bytes. Doing the math to calculate per-byte durations gives the results shown in figure 4.1. This indicates that it is only worth focusing on opcode permutations that start at byte index four, due to the fact that all previous byte indexes have a duration of less than or equal to one second.

By applying the scale as being measured since Jan 1, 1601, all of the possible permutations for the past, present, and future can be calculated as described in chapter 5. The results of these calculations for the SystemTime attribute are described in the following paragraphs.

In order to calculate the viable opcode windows it is necessary to have identified the viable set of opcodes. In this case study a total of 320 viable opcodes were used (recall that opcode in this case can mean one or more instruction). These viable opcodes were taken from the Metasploit Opcode Database. After performing the necessary calculations and generating all of the permutations, a total of 3615 viable opcode windows were found between Jan. 1, 1970 and Dec. 23, 2037.

When the windows are broken down into groupings of similar or equivalent opcodes, two large spikes stand out around 2002 and 2003 for the [esp + 8] => eip opcode group, which includes the pop/pop/ret instructions common to SEH overwrites. Looking more closely at those two years shows two significant periods of time during which the stars aligned and certain exploits could have used the SystemTime attribute as a temporal return address. It's a shame that this technique was not published during those time frames! Never again in the lifetime of anyone who reads this paper will there be such an occurrence.

Perhaps of more interest than past occurrences of certain opcode groups is what will come in the future. The table in figure 7.1 shows the upcoming viable opcode windows for 2005.

    Date                          Opcode Group
    ------------------------------------------
    Sun Sep 25 22:08:50 CDT 2005  eax => eip
    Sun Sep 25 22:15:59 CDT 2005  ecx => eip
    Sun Sep 25 22:23:09 CDT 2005  edx => eip
    Sun Sep 25 22:30:18 CDT 2005  ebx => eip
    Sun Sep 25 22:37:28 CDT 2005  esp => eip
    Sun Sep 25 22:44:37 CDT 2005  ebp => eip
    Sun Sep 25 22:51:47 CDT 2005  esi => eip
    Sun Sep 25 22:58:56 CDT 2005  edi => eip
    Tue Sep 27 04:41:21 CDT 2005  eax => eip
    Tue Sep 27 04:48:30 CDT 2005  ecx => eip
    Tue Sep 27 04:55:40 CDT 2005  edx => eip
    Tue Sep 27 05:02:49 CDT 2005  ebx => eip
    Tue Sep 27 05:09:59 CDT 2005  esp => eip
    Tue Sep 27 05:17:08 CDT 2005  ebp => eip
    Tue Sep 27 05:24:18 CDT 2005  esi => eip
    Tue Sep 27 05:31:27 CDT 2005  edi => eip
    Tue Sep 27 06:43:02 CDT 2005  [esp + 0x20] => eip
    Fri Oct 14 14:36:48 CDT 2005  eax => eip
    Sat Oct 15 21:09:19 CDT 2005  ecx => eip
    Mon Oct 17 03:41:50 CDT 2005  edx => eip
    Tue Oct 18 10:14:22 CDT 2005  ebx => eip
    Wed Oct 19 16:46:53 CDT 2005  esp => eip
    Thu Oct 20 23:19:24 CDT 2005  ebp => eip
    Sat Oct 22 05:51:55 CDT 2005  esi => eip
    Sun Oct 23 12:24:26 CDT 2005  edi => eip
    Thu Nov 03 23:17:07 CST 2005  eax => eip
    Sat Nov 05 05:49:38 CST 2005  ecx => eip
    Sun Nov 06 12:22:09 CST 2005  edx => eip
    Mon Nov 07 18:54:40 CST 2005  ebx => eip
    Wed Nov 09 01:27:11 CST 2005  esp => eip
    Thu Nov 10 07:59:42 CST 2005  ebp => eip
    Fri Nov 11 14:32:14 CST 2005  esi => eip
    Sat Nov 12 21:04:45 CST 2005  edi => eip

    Figure 7.1: Opcode windows for Sept 2005 - Jan 2006

8) Case study: Example application

Aside from Windows processes having SharedUserData present, it may also be possible, depending on the application in question, to find other temporal addresses at static locations across various operating system versions. Take for instance the following example program that simply calls time every second and stores the result in a local variable on the stack named t:

    #include <windows.h>
    #include <time.h>

    void main()
    {
        unsigned long t;

        while (1) {
            t = time(NULL);
            SleepEx(1000, TRUE);
        }
    }

When the telescope program is run against a running instance of this example program, the results produced are:

    C:\>telescope 3004
    [*] Attaching to process 3004 (5 polling cycles)...
    [*] Polling address space........
    Temporal address locations:

    0x0012FE24 [Size=4, Scale=Counter, Period=70 msec]
    0x0012FE88 [Size=4, Scale=Counter, Period=1 sec]
    0x0012FE9C [Size=4, Scale=Counter, Period=1 sec]
    0x0012FF7C [Size=4, Scale=Epoch (1970), Period=1 sec]
    0x7FFE0000 [Size=4, Scale=Counter, Period=600 msec]
    0x7FFE0014 [Size=8, Scale=Epoch (1601), Period=100 nsec]

Judging from the source code of the example application, it is clear that the address 0x0012ff7c coincides with the local variable t, which is used to store the number of seconds since 1970. Indeed, the t variable also has an update period of one second, as indicated by the telescope program. The other finds may be inaccurate or useless depending on the particular situation; the fact that they were identified as counters, rather than as being relative to one of the two epoch times, most likely makes them unusable.

In order to write an exploit that can leverage the temporal address t, it is first necessary to take the steps outlined in this document with regard to calculating the duration of each byte index and then building a list of all the viable opcode permutations. The duration of each byte index for a four byte timer with a one second period is shown in figure 8.1.

    Byte Index  Seconds (ext)
    ------------------------
    0           1 (1 sec)
    1           256 (4 mins 16 secs)
    2           65536 (18 hours 12 mins 16 secs)
    3           16777216 (194 days 4 hours 20 mins 16 secs)

    Figure 8.1: 4 byte 1sec per-byte durations in seconds

The starting byte index for this temporal address is byte index one due to the fact that it has the smallest feasible window of time for an exploit to be launched (4 mins 16 secs). After identifying this starting byte index, permutations for all the viable opcodes can be generated. Nearly all of the viable opcode windows have a duration of 4 minutes and 16 seconds; only a few have a window of 18 hours. To get a better idea of what the future has in store for a timer like this one, figure 8.2 shows the upcoming viable opcode windows for 2005.

    Date                          Opcode Group
    ------------------------------------------
    Fri Sep 02 01:28:00 CDT 2005  [reg] => eip
    Thu Sep 08 21:18:24 CDT 2005  [reg] => eip
    Fri Sep 09 15:30:40 CDT 2005  [reg] => eip
    Sat Sep 10 09:42:56 CDT 2005  [reg] => eip
    Sun Sep 11 03:55:12 CDT 2005  [reg] => eip
    Tue Sep 13 10:32:00 CDT 2005  [reg] => eip
    Wed Sep 14 04:44:16 CDT 2005  [reg] => eip

    Figure 8.2: Opcode windows for Sept 2005 - Jan 2006

9) Conclusion

Temporal addresses are locations in memory that are tied to a timer of some sort, such as a variable storing the number of seconds since 1970. Like a clock, temporal addresses have an update period, meaning the rate at which their contents change. They also have an inherent storage capacity which limits the amount of time they can convey before rolling back over to the start. Finally, temporal addresses will also always have a scale associated with them that indicates the unit of measure for their contents, such as whether they are simply being used as a counter or whether they are measuring the number of seconds since 1970. These three attributes together can be used to predict when certain byte combinations will occur within a temporal address. This type of prediction is useful because it can give an exploitation chronomancer the ability to wait until the time is right and then strike once predicted byte combinations occur in memory on a target machine.
In particular, the most useful byte combinations are ones that represent useful opcodes, or instructions, that could be used to gain control over execution flow and allow an attacker to exploit a vulnerability. Such an ability can give the added benefit of providing an attacker with universal return addresses in situations where a temporal address is found at a static location in memory across multiple operating system and application revisions.

An exploitation chronomancer is one who is capable of divining the best time to exploit something based on the alignment of certain bytes that occur naturally in a process' address space. By making use of the techniques described in this document, or perhaps ones that have yet to be described or disclosed, those who have yet to dabble in the field of chronomancy can begin to get their feet wet. Viable opcode windows will come and go, but the usefulness of temporal addresses will remain for eternity, or at least as long as computers as they are known today are around.

The fact of the matter is, though, that while the subject matter discussed in this document may have an inherent value, the likelihood of it being used for actual exploitation is slim to none due to the variance and delay between viable opcode windows for different periods and scales of temporal addresses. Or is it really that unlikely? Vlad902 suggested a scenario where an attacker could compromise an NTP server and configure it to constantly return a time that contains a useful opcode for exploitation purposes. All of the machines that synchronize with the compromised NTP server would then eventually have a predictable system time. While not completely foolproof, considering it's not always known how often NTP clients will synchronize (although logs could be used), it's nonetheless an interesting approach.

Regardless of feasibility, the slave that is knowledge demands to be free, and so it shall.

Bibliography

Mesander, Rollo, and Zeuge. The Client-To-Client Protocol (CTCP).
http://www.irchelp.org/irchelp/rfc/ctcpspec.html; accessed Aug 5, 2005.

Metasploit Project. The Metasploit Opcode Database.
http://metasploit.com/users/opcode/msfopcode.cgi; accessed Aug 6, 2005.

Postel, J. RFC 792 - Internet Control Message Protocol.
http://www.ietf.org/rfc/rfc0792.txt?number=792; accessed Aug 5, 2005.