Uninformed: Informative Information for the Uninformed

Vol 4» 2006.Jun


Replacing ptrace()

A lot of people seem to move to Mac OS X from a Linux or BSD background and therefore expect the ptrace() syscall to be useful. Unfortunately, this isn't the case on Mac OS X. For some ungodly reason, Apple decided to leave ptrace() incomplete, unable to do much more than make a feeble attempt at an anti-debug mechanism or single-step the process.

As it turns out, the anti-debug mechanism (PT_DENY_ATTACH) only stops future ptrace() calls from attaching to the process. Since ptrace() functionality is highly limited on Mac OS X anyway, and task_for_pid() is unrestricted, this basically has no purpose.

In this section I will run through the missing features from a real implementation of ptrace and show you how to implement them on Mac OS X.

The first and most useful thing we'll look at is how to get a port for a task. Assuming you have sufficient privileges to do so, you can call the task_for_pid() function with a Unix process ID and you will receive a port for that task.

This function is pretty straightforward to use and works as you'd expect.

	pid_t 	pid;
	task_t	port;

	task_for_pid(mach_task_self(), pid, &port);

After this call, if sufficient privileges were held, a port will be returned in ``port''. This can then be used with later API function calls in order to manipulate the target task's resources. Conceptually, this is pretty similar to the ptrace() PTRACE_ATTACH functionality.

One of the most noticeable changes to ptrace() on Mac OS X is the fact that it is no longer possible to retrieve register state as you would expect. Typically, the ptrace() commands PTRACE_GETREGS and PTRACE_GETFPREGS would be used to get register contents. Fortunately this can be achieved quite easily using the Mach API.

The task_threads() function can be used with a port in order to get a list of the threads in the task.

	thread_act_port_array_t thread_list;
	mach_msg_type_number_t thread_count;

	task_threads(port, &thread_list, &thread_count);

Once you have a list of threads, you can then loop over them and retrieve register contents from each. This can be done using the thread_get_state() function.

The code below shows the process involved for retrieving the register contents from a thread (in this case the first thread) of a thread_act_port_array_t list.

NOTE:
	This code will only work on ppc machines; the i386_thread_state_t
	type is used for Intel.


	ppc_thread_state_t ppc_state;
	mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT;
	long thread = 0;	// for first thread

	thread_get_state(
			  thread_list[thread],
			  PPC_THREAD_STATE,
			  (thread_state_t)&ppc_state,
			  &sc
	);

For PPC machines, you can then print out the contents of a desired register like so:

	printf(" lr: 0x%x\n",ppc_state.lr);

Now that register contents can be retrieved, we'll look at changing them and updating the thread to use our new contents.

This is similar to the ptrace PTRACE_SETREGS and PTRACE_SETFPREGS requests on Linux. We can use the Mach call thread_set_state() to do this. I have written some code to put these concepts together into a tiny sample program.

The following small assembly code will continue to loop until the r3 register is nonzero.

	.globl _main
	_main:

		li      r3,0
	up:
		cmpwi   cr7,r3,0
		beq-    cr7,up
		trap

The C code below attaches to the process and modifies the value of the r3 register to 0xdeadbeef.

/*
 * This sample code retrieves the old value of the
 * r3 register and sets it to 0xdeadbeef.
 *
 * - nemo
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <mach/mach.h>
#include <mach/mach_types.h>
#include <mach/ppc/thread_status.h>

void error(char *msg)
{
        printf("[!] error: %s.\n",msg);
        exit(1);
}

int main(int ac, char **av)
{
        ppc_thread_state_t ppc_state;
        mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT;
        long thread = 0;        // for first thread
        thread_act_port_array_t thread_list;
        mach_msg_type_number_t thread_count;
        task_t  port;
        pid_t   pid;

        if(ac != 2) {
                printf("usage: %s <pid>\n",av[0]);
                exit(1);
        }

        pid = atoi(av[1]);

        if(task_for_pid(mach_task_self(), pid, &port))
                error("cannot get port");

        // better shut down the task while we do this.
        if(task_suspend(port)) error("suspending the task");

        if(task_threads(port, &thread_list, &thread_count))
                error("cannot get list of threads");


        if(thread_get_state(
                          thread_list[thread],
                          PPC_THREAD_STATE,
                          (thread_state_t)&ppc_state,
                          &sc
        )) error("getting state from thread");

        printf("old r3: 0x%x\n",ppc_state.r3);

        ppc_state.r3 = 0xdeadbeef;

        if(thread_set_state(
                          thread_list[thread],
                          PPC_THREAD_STATE,
                          (thread_state_t)&ppc_state,
                          sc
        )) error("setting state");

        if(task_resume(port)) error("cannot resume the task");

        return 0;
}

A sample run of these two programs is as follows:

	-[nemo@gir:~/code]$ ./tst&
	[1] 5302
	-[nemo@gir:~/code]$ gcc chgr3.c -o chgr3
	-[nemo@gir:~/code]$ ./chgr3 5302
	old r3: 0x0
	-[nemo@gir:~/code]$
	[1]+  Trace/BPT trap          ./tst

As you can see, when the C code is run, ./tst has its r3 register modified and the loop exits, hitting the trap.

Another feature missing from the ptrace() call on Mac OS X is the ability to read and write memory. Again, we can achieve this functionality using Mach API calls. The functions vm_read() and vm_write() can (as expected) be used to read and write the address space of a target task.

These calls work pretty much how you would expect, and there are examples throughout the rest of this paper which use them. The functions are defined as follows:

kern_return_t   vm_read
                (vm_task_t                          target_task,
                 vm_address_t                           address,
                 vm_size_t                                 size,
                 pointer_t                            *data_out,
                 mach_msg_type_number_t             *data_count);


kern_return_t   vm_write
                (vm_task_t                          target_task,
                 vm_address_t                           address,
                 pointer_t                                 data,
                 mach_msg_type_number_t              data_count);

These functions provide similar functionality to the ptrace memory requests: vm_read() covers PTRACE_PEEKTEXT and PTRACE_PEEKDATA, while vm_write() covers PTRACE_POKETEXT and PTRACE_POKEDATA.

The memory being read/written must have the appropriate protection in order for these functions to work correctly. However, it is quite easy to set the protection attributes for the memory before the read or write takes place. To do this, the vm_protect() API call can be used.

kern_return_t   vm_protect
                 (vm_task_t           target_task,
                  vm_address_t            address,
                  vm_size_t                  size,
                  boolean_t           set_maximum,
                  vm_prot_t        new_protection);

The ptrace() syscall on Linux also provides a way to step a process up to the point where a syscall is executed. The PTRACE_SYSCALL request is used for this. This functionality is useful for applications such as "strace" which need to keep track of the system calls made by an application. Unfortunately, this feature does not exist on Mac OS X. The Mach API does define a function which would provide this functionality:

kern_return_t   task_set_emulation
                (task_t                                    task,
                 vm_address_t                  routine_entry_pt,
                 int                             syscall_number);

This function would allow you to set up a userspace handler for a syscall and log its execution. However, this function has not been implemented on Mac OS X.