|Informative Information for the Uninformed|
In the month of August, eEye released an advisory for a stack-based buffer overflow that was found in the McAfee Subscription Manager ActiveX control. The underlying vulnerability was in an insecure call to vsprintf that was exposed through scripting-accessible routines. At a glance, this vulnerability would appear trivial to exploit given that it's a very basic stack overflow. However, once it comes to transmitting a payload, or even a particular return address, certain limiting factors begin to appear. The focus of this paper will center around an exercise in implementing a custom encoder to overcome certain character set limitations. The McAfee Subscription Manager vulnerability will be used as a real-world example of a vulnerability that requires a custom encoder to exploit.
When it comes to exploiting this vulnerability, the first step is to reproduce the conditions reported in the advisory. As with most vulnerabilities, it's customary to send an arbitrary sequence of bytes, such as A's. However, in this particular exploit, sending a sequence of A's, each of which is the byte 0x41, actually causes the return address to be overwritten with 0x61's, which are lowercase a's. Judging from this, it seems obvious that the input string is undergoing a tolower operation and it will not be possible for the payload or return address to contain any uppercase characters.
Given these character restrictions, it's safe to go forward with writing the exploit. To simply get a proof of concept for code execution, it makes sense to put a series of int3's, represented by the 0xcc opcode, immediately following the return address. The return address could then be pointed to the location of a push esp / ret or some other type of instruction that transfers control to where the series of int3's should reside. Once the vulnerability is triggered, the debugger should break in at an int3 instruction, but that's not actually what happens. Instead, it breaks in on a completely different instruction:
(4f8.58c): Unknown exception - code c0000096 (!!! second chance !!!)
eax=00000f19 ebx=00000000 ecx=00139438 edx=0013a384 esi=00001b58 edi=0013a080
eip=0013a02c esp=0013a02c ebp=36213365 iopl=0
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000
0013a02c ec               in      al,dx
0:000> u eip
0013a02c ec               in      al,dx
0013a02d ec               in      al,dx
0013a02e ec               in      al,dx
0013a02f ec               in      al,dx
Again, it looks like the buffer is undergoing some sort of transformation. One quick thing to notice is that 0xcc + 0x20 = 0xec. This is similar to what would happen when changing an uppercase character to a lowercase character, such as where 'A', or 0x41, is converted to 'a', or 0x61, by adding 0x20. It appears that the operation that's performing the case lowering may also be inadvertently performing it on a specific high ASCII range.
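The arithmetic above can be reproduced with a small model of the transformation. This is only a sketch: it assumes that lowercasing each byte as a Latin-1 character is a reasonable stand-in for what the control does, and the function name model_lower is hypothetical. It is not the CRT code path itself.

```python
def model_lower(data: bytes) -> bytes:
    """Approximate the observed transform on a raw byte string."""
    # Assumption: Latin-1 semantics stand in for the code page in use,
    # so every byte maps to exactly one character and back again.
    return data.decode("latin-1").lower().encode("latin-1")


if __name__ == "__main__":
    # 'A' bytes and int3 bytes, as used in the two experiments above.
    for sample in (b"\x41\x41\x41\x41", b"\xcc\xcc\xcc\xcc"):
        print(sample.hex(), "->", model_lower(sample).hex())
```

Under this model both inputs are shifted by 0x20, which matches the 0x61 and 0xec bytes seen in the debugger.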
What's actually occurring is that the subscription manager control is calling _mbslwr, using the statically linked CRT, on a copy of the original input string. Internally, _mbslwr calls into __crtLCMapStringA. Eventually this will lead to a call out to kernel32!LCMapStringW. The second parameter to this routine is dwMapFlags which describes what sort of transformations, if any, should be performed on the buffer. The _mbslwr routine passes 0x100, or LCMAP_LOWERCASE. This is what results in the lowering of the string.
So, given this information, it can be determined that it will not be possible to use characters between and including 0x41 and 0x5A as well as, for the sake of clarity, between and including 0xc0 and 0xe0. The main reason this ends up causing problems is that many of the payload encoders out there for x86, including those in Metasploit, rely on characters from these two sets for their decoder stub and subsequent encoded data. For that reason, and for the challenge, it's worth pursuing the implementation of a custom encoder.
While this particular vulnerability will permit the use of many characters above 0x80, it makes the challenge that much more interesting, and particularly useful, to limit the usable character set to the characters described below. The reason this range is more useful is that the characters are UTF-8 safe and also tolower safe. Like most good payloads, the encoder will also avoid NULL bytes.
0x01 -> 0x40
0x5B -> 0x7f
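The two ranges above can be expressed as a simple membership check, which is handy for testing whether a candidate decoder stub or encoded buffer stays inside the allowed set. This is a minimal sketch; the names ALLOWED and disallowed_offsets are hypothetical and are not part of any existing tool.

```python
# Allowed bytes: 0x01-0x40 and 0x5B-0x7F (both ranges inclusive).
# NULL, uppercase ASCII, and everything at or above 0x80 are excluded.
ALLOWED = frozenset(range(0x01, 0x41)) | frozenset(range(0x5B, 0x80))


def disallowed_offsets(buf: bytes) -> list:
    """Return the offsets of every byte in buf outside the allowed set."""
    return [i for i, b in enumerate(buf) if b not in ALLOWED]


if __name__ == "__main__":
    # int3 (0xcc) and 'A' (0x41) are both rejected; 'a' (0x61) is fine.
    sample = b"\x61\x41\xcc\x5b"
    print(disallowed_offsets(sample))
```

An empty list means the buffer survives the lowering unchanged and is plain single-byte UTF-8.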
As with all encoded formats, there are actually two major pieces involved. The first part is the encoder itself. The encoder is responsible for taking a raw buffer and encoding it into the appropriate format. The second part is the decoder, which, as is probably obvious, takes the encoded form and converts it back into the raw form so that it can be executed as a payload. The implementation of these two pieces will be described in the following chapters.