voucher_swap - Exploit for P0 issue 1731 on iOS 12.1.2
Brandon Azad
---- Issue 1731: CVE-2019-6225 --------------------------------------------------------------------
iOS/macOS: task_swap_mach_voucher() does not respect MIG semantics leading to use-after-free
Consider the MIG routine task_swap_mach_voucher():
    routine task_swap_mach_voucher(
                    task        : task_t;
                    new_voucher : ipc_voucher_t;
            inout   old_voucher : ipc_voucher_t);
Here's the (placeholder) implementation:
    kern_return_t
    task_swap_mach_voucher(
            task_t          task,
            ipc_voucher_t   new_voucher,
            ipc_voucher_t   *in_out_old_voucher)
    {
        if (TASK_NULL == task)
            return KERN_INVALID_TASK;
        *in_out_old_voucher = new_voucher;
        return KERN_SUCCESS;
    }
The correctness of this implementation depends on exactly how MIG ownership semantics are defined for each of these parameters.
When dealing with Mach ports and out-of-line memory, ownership follows the traditional rules (the ones violated by the bugs above):
- All Mach ports (except the first) passed as input parameters are owned by the service routine if and only if the service routine returns success. If the service routine returns failure then MIG will deallocate the ports.
- All out-of-line memory regions passed as input parameters are owned by the service routine if and only if the service routine returns success. If the service routine returns failure then MIG will deallocate all out-of-line memory.
But this is only part of the picture. There are more rules for other types of objects:
- All objects with defined MIG translations that are passed as input-only parameters are borrowed by the service routine. For reference-counted objects, this means that the service routine is not given a reference, and hence a reference must be added if the service routine intends to keep the object around.
- All objects with defined MIG translations that are returned in output parameters must be owned by the output parameter. For reference-counted objects, this means that output parameters consume a reference on the object.
And most unintuitive of all:
- All objects with defined MIG translations that are passed as input in input-output parameters are owned (not borrowed!) by the service routine. This means that the service routine must consume the input object's reference.
Having defined MIG translations means that there is an automatic conversion defined between the object type and its Mach port representation. A task port is one example of such a type: you can convert a task port to the underlying task object using convert_port_to_task(), and you can convert a task to its corresponding port using convert_task_to_port().
Getting back to Mach vouchers, this is the MIG definition of ipc_voucher_t:
    type ipc_voucher_t = mach_port_t
        intran:     ipc_voucher_t convert_port_to_voucher(mach_port_t)
        outtran:    mach_port_t convert_voucher_to_port(ipc_voucher_t)
        destructor: ipc_voucher_release(ipc_voucher_t)
    ;
This definition means that MIG will automatically convert the voucher port input parameters to ipc_voucher_t objects using convert_port_to_voucher(), convert the ipc_voucher_t output parameters into ports using convert_voucher_to_port(), and discard any extra references using ipc_voucher_release(). Note that convert_port_to_voucher() produces a voucher reference without consuming a port reference, while convert_voucher_to_port() consumes a voucher reference and produces a port reference.
To confirm our understanding of the MIG semantics outlined above, we can look at the function _Xtask_swap_mach_voucher(), which is generated by MIG during the build process:
    mig_internal novalue _Xtask_swap_mach_voucher
        (mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP)
    {
    ...
        kern_return_t RetCode;
        task_t task;
        ipc_voucher_t new_voucher;
        ipc_voucher_t old_voucher;
    ...
        task = convert_port_to_task(In0P->Head.msgh_request_port);
        new_voucher = convert_port_to_voucher(In0P->new_voucher.name);
        old_voucher = convert_port_to_voucher(In0P->old_voucher.name);
        RetCode = task_swap_mach_voucher(task, new_voucher, &old_voucher);
        ipc_voucher_release(new_voucher);
        task_deallocate(task);
        if (RetCode != KERN_SUCCESS) {
            MIG_RETURN_ERROR(OutP, RetCode);
        }
    ...
        if (IP_VALID((ipc_port_t)In0P->old_voucher.name))
            ipc_port_release_send((ipc_port_t)In0P->old_voucher.name);
        if (IP_VALID((ipc_port_t)In0P->new_voucher.name))
            ipc_port_release_send((ipc_port_t)In0P->new_voucher.name);
    ...
        OutP->old_voucher.name = (mach_port_t)convert_voucher_to_port(old_voucher);
        OutP->Head.msgh_bits |= MACH_MSGH_BITS_COMPLEX;
        OutP->Head.msgh_size = (mach_msg_size_t)(sizeof(Reply));
        OutP->msgh_body.msgh_descriptor_count = 1;
    }
Tracing where each of the references is going, we can deduce that:
- The new_voucher parameter is deallocated with ipc_voucher_release() after invoking the service routine, so it is not owned by task_swap_mach_voucher(). In other words, task_swap_mach_voucher() is not given a reference on new_voucher.
- The old_voucher parameter has a reference on it before it gets overwritten by task_swap_mach_voucher(), which means task_swap_mach_voucher() is being given a reference on the input value of old_voucher.
- The value returned by task_swap_mach_voucher() in old_voucher is passed to convert_voucher_to_port(), which consumes a reference on the voucher. Thus, task_swap_mach_voucher() is giving _Xtask_swap_mach_voucher() a reference on the output value of old_voucher.
Finally, looking back at the implementation of task_swap_mach_voucher(), we can see that none of these rules are being followed:
    kern_return_t
    task_swap_mach_voucher(
            task_t          task,
            ipc_voucher_t   new_voucher,
            ipc_voucher_t   *in_out_old_voucher)
    {
        if (TASK_NULL == task)
            return KERN_INVALID_TASK;
        *in_out_old_voucher = new_voucher;
        return KERN_SUCCESS;
    }
This results in two separate reference counting issues:
- By overwriting the value of in_out_old_voucher without first releasing the reference, we are leaking a reference on the input value of old_voucher.
- By assigning the value of new_voucher to in_out_old_voucher without adding a reference, we are consuming a reference we don't own, leading to an over-release of new_voucher.
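Both imbalances can be reproduced in a small userspace model. The sketch below is not kernel code: toy_voucher, toy_stub(), swap_buggy(), and swap_fixed() are hypothetical names, toy_stub() only mimics the reference traffic of _Xtask_swap_mach_voucher() shown earlier, and swap_fixed() is just one way to satisfy the ownership rules, not necessarily the fix Apple shipped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for ipc_voucher_t: only the reference count is modeled. */
typedef struct toy_voucher {
    int refs;
} toy_voucher;

static void toy_retain(toy_voucher *v)  { if (v != NULL) v->refs++; }
static void toy_release(toy_voucher *v) { if (v != NULL) v->refs--; }

typedef int (*swap_fn)(toy_voucher *new_voucher, toy_voucher **in_out_old_voucher);

/* Mirrors the placeholder implementation: no reference is released or taken. */
static int swap_buggy(toy_voucher *new_voucher, toy_voucher **in_out_old_voucher)
{
    *in_out_old_voucher = new_voucher;
    return 0;
}

/* Hypothetical corrected routine that follows the ownership rules. */
static int swap_fixed(toy_voucher *new_voucher, toy_voucher **in_out_old_voucher)
{
    toy_release(*in_out_old_voucher); /* consume the owned input reference */
    toy_retain(new_voucher);          /* the output parameter must own a reference */
    *in_out_old_voucher = new_voucher;
    return 0;
}

/* Models the reference traffic of the MIG-generated stub. */
static void toy_stub(swap_fn routine, toy_voucher *new_in, toy_voucher *old_in)
{
    assert(routine != NULL);
    toy_voucher *new_voucher = new_in;
    toy_voucher *old_voucher = old_in;
    toy_retain(new_voucher);            /* convert_port_to_voucher() produces a reference */
    toy_retain(old_voucher);            /* convert_port_to_voucher() produces a reference */
    routine(new_voucher, &old_voucher);
    toy_release(new_voucher);           /* ipc_voucher_release(new_voucher) */
    toy_release(old_voucher);           /* convert_voucher_to_port() consumes a reference */
}
```

Starting from two vouchers whose only reference is the one held by their port (refs == 1), toy_stub(swap_buggy, &a, &b) leaves a.refs at 0 and b.refs at 2: the over-release of new_voucher and the leak on old_voucher. With swap_fixed both counts return to 1.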
---- Exploit flow ---------------------------------------------------------------------------------
First we allocate a bunch of pipes so that we can spray pipe buffers later.
Then we spray enough Mach ports to fill the ipc.ports zone and cause it to grow and allocate fresh pages from the zone map; 8000 ports is usually sufficient. That way, when we allocate our pipe buffers, there's a high chance the pipe buffers lie directly after the ports in kernel memory. The last port that we allocate is the base port.
Next we write a 16383-byte pattern to our pipe buffers, causing them to allocate from kalloc.16384. XNU limits the global amount of pipe buffer memory to 16 MB, but this is more than sufficient to fill kalloc.16384 and get some pipe buffers allocated after our base port in kernel memory.
We fill the pipes with fake Mach ports. For each pipe buffer we fill, we set the fake ports' ip_kotype bits to specify which pair of pipe file descriptors corresponds to this pipe buffer.
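As a rough sketch of the tagging idea, the model below stores a pipe index in the kotype bits and recovers it again. The TOY_ names are hypothetical, and the 12-bit mask is an assumption about the layout of the ip_kotype bits that should be checked against the XNU headers for the target build. With a 16 MB limit and 16384-byte buffers there are at most 1024 pipe buffers, so every index fits in 12 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for this sketch, not taken from the exploit source. */
#define TOY_IO_BITS_KOTYPE    0x00000fffu            /* assumed mask of the ip_kotype bits */
#define TOY_PIPE_BUFFER_SIZE  16384u                 /* one kalloc.16384 element */
#define TOY_PIPE_MEMORY_LIMIT (16u * 1024u * 1024u)  /* the 16 MB global limit */

/* Upper bound on how many full-size pipe buffers can exist at once. */
static uint32_t toy_max_pipe_buffers(void)
{
    return TOY_PIPE_MEMORY_LIMIT / TOY_PIPE_BUFFER_SIZE;
}

/* Writes the kotype tag for pipe_index and returns 1, or returns 0 if it does not fit. */
static int toy_kotype_for_pipe(uint32_t pipe_index, uint32_t *kotype_out)
{
    assert(kotype_out != NULL);
    if (pipe_index > TOY_IO_BITS_KOTYPE)
        return 0;
    *kotype_out = pipe_index & TOY_IO_BITS_KOTYPE;
    return 1;
}

/* Recovers the pipe index from a kotype value read back from the fake port. */
static uint32_t toy_pipe_for_kotype(uint32_t kotype)
{
    return kotype & TOY_IO_BITS_KOTYPE;
}
```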
Now that we've allocated some pipe buffers directly after the base port, we set up state for triggering the vulnerability. We spray several pages of Mach vouchers, and choose one near the end to be the target for use-after-free. We want the target voucher to lie on a page containing only sprayed vouchers, so that later we can free all the vouchers on that page and make the page available for zone garbage collection.
Then we spray 15% of physical memory size with allocations from kalloc.1024. We'll free this memory later to ensure that there are lots of free pages to encourage zone garbage collection.
Next we stash a pointer to the target voucher in our thread's ith_voucher field using thread_set_mach_voucher(), and then remove the added voucher reference using the task_swap_mach_voucher() vulnerability. This means that even though ith_voucher still points to the target voucher, there's only one reference on it, so just like the rest of the vouchers it'll be freed once we destroy all the voucher ports in userspace.
At this point we free the kalloc.1024 allocations, destroy the voucher ports to free all the vouchers, and start slowly filling kernel memory with out-of-line ports allocations to try and trigger a zone gc and get the page containing our freed target voucher (which ith_voucher still points to) reallocated with out-of-line ports. In my experiments, spraying 17% of physical memory size is sufficient.
We'll try and reallocate the page containing the freed voucher with a pattern of out-of-line Mach ports that overwrites certain fields of the voucher. Specifically, we overwrite the voucher's iv_port field, which specifies the Mach port that exposes this voucher to userspace, with NULL and overwrite the iv_refs field, which is the voucher's reference count, with the lower 32 bits of a pointer to the base port.
Overwriting iv_refs with the lower 32 bits of a pointer to the base port will ensure that the reference count is valid so long as the lower 32 bits of the base port's address form a small enough value. This is necessary for us to call thread_get_mach_voucher() later without triggering a panic. Additionally, the pointer to the base port does double duty since we'll later use the task_swap_mach_voucher() vulnerability again to increment iv_refs and change what was a pointer to the base port so that it points into our pipe buffers instead.
Once we've reallocated the voucher with our out-of-line ports spray, we call thread_get_mach_voucher(). This interprets ith_voucher, which points into the middle of our out-of-line ports spray, as a Mach voucher, and since iv_port is NULL, a new Mach voucher port is allocated to represent the freed voucher. Then thread_get_mach_voucher() returns the voucher port back to us in userspace, allowing us to continue manipulating the freed voucher while it still overlaps the out-of-line ports array.
Next we increment the voucher's iv_refs field using task_swap_mach_voucher(), which modifies the out-of-line pointer to the base port overlapping iv_refs so that it now points into the pipe buffers.
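The arithmetic behind this step can be sketched in a few lines. This is a model under stated assumptions: the function names are hypothetical, any addresses passed in are illustrative, and the only facts used are that iv_refs is a 32-bit field overlapping the lower 32 bits of the out-of-line port pointer.

```c
#include <assert.h>
#include <stdint.h>

/* The value a 32-bit field overlapping the low half of the pointer would hold. */
static uint32_t toy_overlapping_refs(uint64_t pointer)
{
    return (uint32_t)pointer;
}

/* Effect of `count` increments of that 32-bit field on the full 64-bit pointer. */
static uint64_t toy_pointer_after_increments(uint64_t pointer, uint32_t count)
{
    uint64_t high = pointer & UINT64_C(0xffffffff00000000);
    uint32_t low = (uint32_t)pointer + count; /* wraps modulo 2^32, never carries into the high half */
    return high | low;
}

/* Number of increments needed to move the pointer from `base` to `target`. */
static uint32_t toy_increments_needed(uint64_t base, uint64_t target)
{
    assert((base >> 32) == (target >> 32)); /* only the low half can change */
    assert(target >= base);
    return (uint32_t)(target - base);
}
```

Each increment of the reference count moves the overlapping pointer forward by one byte, so the distance from the base port to the chosen location in the pipe buffers determines how many increments are needed.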
And since we guaranteed that every possible fake port inside the pipe buffers looks valid, we can now safely receive the messages containing the out-of-line ports spray to recover a send right to a fake ipc_port overlapping our pipe buffers.
Our next step is to determine which pair of pipe file descriptors corresponds to the pipe buffer. Since we set each possible fake port's ip_kotype bits earlier while spraying pipe buffers, we can use mach_port_kobject() to retrieve the fake port's ip_kotype and determine the overlapping pipe. And at this point, we can now inspect and modify our fake port by reading and writing the pipe's contents.
We can now discard all the filler ports and pipes we allocated earlier, since they're no longer needed.
Our next step is to build a kernel memory read primitive. Although we have a send right to an ipc_port overlapping our pipe buffer, we don't actually know the address of our pipe buffer in kernel memory. And if we want to use the pid_for_task() trick to read memory, we'll need to build a fake task struct at a known address so that we can make our fake port's ip_kobject field point to it. So our next goal should be to find the address of our pipe buffer.
Unfortunately, unlike prior exploits that have produced a dangling port, we only have a send right to our fake port, not a receive right. This means we have few options for modifying the port's state in such a way that it stores a pointer inside the ipc_port struct that allows us to determine its address.
One thing we can do is call mach_port_request_notification() to generate a request that a dead name notification for the fake port be delivered to the base port. This will cause the kernel to allocate an array in the fake port's ip_requests field and store a pointer to the base port inside that array. Thus, we only need a single 8-byte read to get the address of the base port, and since the base port is at a fixed offset from the fake port (determined by how many times we incremented the freed voucher's iv_refs field), we can use the address of the base port to calculate the address of our pipe buffer.
Of course, that means that in order to build our arbitrary read primitive, we need ... another arbitrary read primitive. So why is this helpful? Because our first read primitive will leak memory every time we use it while the second one will not.
To use pid_for_task() to read kernel memory, we need a fake task struct at a known address in kernel memory whose bsd_info field points to the address we want to read. One way to do that is to simply send a Mach message containing our fake task struct to the fake port, and then read out the port's ip_messages.imq_messages field via the pipe to get the address of the ipc_kmsg struct containing the message. Then we can compute the address of the fake task inside the ipc_kmsg and rewrite the fake port to be a task port pointing to the fake task, allowing us to call pid_for_task() to read 4 bytes of kernel memory.
Using this technique, we can read the value of the base port pointer in the ip_requests array and then compute the address of the fake port and the containing pipe buffer. And once we know the address of the pipe buffer, we can create the fake task by writing to our pipe to avoid leaking memory on each read.
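Since each pid_for_task() call yields only 4 bytes, an 8-byte value such as the base port pointer takes two reads. Here is a minimal sketch of composing them, assuming a little-endian layout as on arm64. The read32_fn callback stands in for whichever 32-bit read primitive is available, and toy_memory and toy_read32() are hypothetical helpers that back the callback with an ordinary byte array so the logic can be exercised without touching a kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Any primitive that reads 32 bits at a 64-bit address. */
typedef uint32_t (*read32_fn)(void *context, uint64_t address);

/* Builds one 64-bit little-endian read out of two 32-bit reads. */
static uint64_t read64_via_read32(read32_fn read32, void *context, uint64_t address)
{
    assert(read32 != NULL);
    uint64_t low  = read32(context, address);
    uint64_t high = read32(context, address + 4);
    return low | (high << 32);
}

/* Toy backing store: `size` bytes that pretend to live at address `base`. */
typedef struct toy_memory {
    uint64_t base;
    const uint8_t *bytes;
    size_t size;
} toy_memory;

/* Toy 32-bit read; returns 0 for addresses outside the backing store. */
static uint32_t toy_read32(void *context, uint64_t address)
{
    const toy_memory *mem = (const toy_memory *)context;
    if (address < mem->base)
        return 0;
    uint64_t offset = address - mem->base;
    if (offset > mem->size || mem->size - offset < 4)
        return 0;
    const uint8_t *p = mem->bytes + offset;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```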
Now that we have a stable kernel read primitive, we can find the address of the host port and read out the host port's ip_receiver field to get the address of the kernel's ipc_space.
I then borrow Ian's technique of iterating through all the ipc_port elements in the host port's zalloc block looking for the kernel task port. Once we find the kernel task port, we can read the ip_kobject field to get the kernel task, and reading the task's map field gives us the kernel's vm_map.
At this point we have everything we need to build a fake kernel task inside our pipe buffer, giving us the ability to read and write kernel memory using mach_vm_read() and mach_vm_write().
The next step is to build a permanent fake kernel task port. We allocate some kernel memory with mach_vm_allocate() and then write a new fake kernel task into that allocation. We then modify the fake port's ipc_entry in our task so that it points to the new fake kernel task, which allows us to clean up the remaining resources safely.
We remove the extra reference on the base port, destroy the voucher port allocated by the call to thread_get_mach_voucher() on the freed voucher, deallocate the ip_requests array, and destroy the leaked ipc_kmsg structs used during our first kernel read primitive.
This leaves us with a stable system and a fake kernel task port with which we can read and write kernel memory.
---- Kernel function calling / PAC bypass ---------------------------------------------------------
In order to call kernel functions I use the iokit_user_client_trap() technique. This works without modification on non-PAC devices, but on PAC-enabled devices like the iPhone XS we need to do a little extra work.
First we get a handle to an IOAudio2DeviceUserClient. Since the container sandbox usually prevents us from accessing this class, we briefly replace our proc's credentials with the kernel proc's credentials to bypass the sandbox check.
Once we have an IOAudio2DeviceUserClient, we read the value of the user client's trap field, which points to a heap-allocated IOExternalTrap object. Then, to call an arbitrary kernel function, we simply overwrite the trap to point to the target function and then call IOConnectTrap6() from userspace.
This technique has several limitations at this stage: we only control the values of registers X1 - X6, the return value gets truncated to 32 bits, and the function pointer that we call must already have a valid PACIZA signature (that is, a PAC signature using the A-instruction key with context 0). Thus, we'll need to find a way to generate PACIZA signatures on arbitrary functions.
As it turns out, one way to do this is to call the module destructor for the com.apple.nke.l2tp kext. There is already a PACIZA'd pointer to the function l2tp_domain_module_stop() in kernel memory, so we already have the ability to call it. And as the final step in tearing down the module, l2tp_domain_module_stop() calls sysctl_unregister_oid() on the sysctl__net_ppp_l2tp global sysctl_oid struct, which resides in writable memory. And on PAC-enabled systems, sysctl_unregister_oid() executes the following instruction sequence on the sysctl_oid struct:
    LDR         X10, [X9,#0x30]!        ;; X10 = old_oidp->oid_handler
    CBNZ        X19, loc_FFFFFFF007EBD330
    CBZ         X10, loc_FFFFFFF007EBD330
    MOV         X19, #0
    MOV         X11, X9                 ;; X11 = &oid_handler
    MOVK        X11, #0x14EF,LSL#48     ;; X11 = 14EF`&oid_handler
    AUTIA       X10, X11                ;; X10 = AUTIA(oid_handler, 14EF`&handler)
    PACIZA      X10                     ;; X10 = PACIZA(X10)
    STR         X10, [X9]               ;; old_oidp->oid_handler = X10
That means that the field sysctl__net_ppp_l2tp->oid_handler will be replaced with the value PACIZA(AUTIA(sysctl__net_ppp_l2tp->oid_handler, context)), where the context is the address of the oid_handler field with 0x14EF in its top 16 bits.
Clearly we can't forge PACIA signatures at this point, so AUTIA will fail and produce an invalid pointer value. This isn't NULL or some constant sentinel, but rather is the XPAC'd value with two of the pointer extension bits replaced with an error code to make the resulting pointer invalid. And this is interesting because when PACIZA is used to sign a pointer with invalid extension bits, what actually happens is that first the corrected pointer is signed and then one bit of the PAC signature is flipped, rendering it invalid.
What this means for us is that even though sysctl__net_ppp_l2tp->oid_handler was not originally signed, this gadget overwrites the field with a value that is only one bit different from a valid PACIZA signature, allowing us to compute the true PACIZA signature. And if we use this gadget to sign a pointer to a JOP gadget like "mov x0, x4 ; br x5", then we can execute any kernel function we want with up to 4 arguments.
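Recovering the true signature from the value the gadget leaves behind is then a single XOR. In the sketch below, toy_flip_bit() is a hypothetical helper, and which bit of the PAC gets flipped is deliberately left as a parameter: the position depends on the pointer layout of the target and is an assumption that has to be determined for the device in question.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Flips one bit of a 64-bit pointer value. Applied to the almost-valid
 * PACIZA value written by the gadget, with `bit` set to the position that
 * the hardware corrupted, this yields the valid PACIZA signature.
 */
static uint64_t toy_flip_bit(uint64_t pointer, unsigned bit)
{
    assert(bit < 64);
    return pointer ^ (UINT64_C(1) << bit);
}
```

Because the operation is its own inverse, applying it twice with the same bit returns the original value.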
We then use the signed "mov x0, x4 ; br x5" gadget to build a PACIA-signing primitive. There are a small number of possible PACIA gadgets, of which we use one that starts:
    PACIA       X9, X10
    STR         X9, [X2,#0xF0]
In order to use this gadget, we execute the following JOP program:
    X1 = &"MOV X10, X3 ; BR X6"
    X2 = KERNEL_BUFFER
    X3 = CONTEXT
    X4 = POINTER
    X5 = &"MOV X9, X0 ; BR X1"
    X6 = &"PACIA X9, X10 ; STR X9, [X2,#0xF0]"
    PC = PACIZA("MOV X0, X4 ; BR X5")

    MOV X0, X4
    BR X5
    MOV X9, X0
    BR X1
    MOV X10, X3
    BR X6
    PACIA X9, X10
    STR X9, [X2,#0xF0]
This leaves us with the PACIA'd pointer in kernel memory, which we can read back using our read primitive. Thus, we can now perform arbitrary PACIA forgeries. And using a similar technique with a PACDA gadget, we can produce PACDA forgeries.
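To check how the registers flow through the chain, the JOP program can be stepped through in a toy register machine. Everything here is a model: gadget "addresses" are small hypothetical enum values, X2 is treated as an offset into a local buffer rather than a kernel address, the model stops at the store even though the real gadget continues past it, and toy_pacia() is a placeholder mixing function that has nothing to do with the real PAC algorithm.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { X0, X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, TOY_NREGS };

/* Toy "addresses" of the gadgets named in the program above. */
enum {
    G_MOV_X0_X4_BR_X5 = 1,
    G_MOV_X9_X0_BR_X1,
    G_MOV_X10_X3_BR_X6,
    G_PACIA_X9_X10_STR
};

/* Placeholder for PACIA: NOT the real algorithm, just a deterministic mix. */
static uint64_t toy_pacia(uint64_t pointer, uint64_t context)
{
    return pointer ^ (context * UINT64_C(0x9E3779B97F4A7C15));
}

/*
 * Runs the chain from the entry gadget. Returns 0 once the final store
 * into `buffer` has happened and -1 if the chain derails.
 */
static int toy_run_jop(uint64_t x[TOY_NREGS], uint8_t *buffer, size_t size)
{
    assert(x != NULL && buffer != NULL);
    uint64_t pc = G_MOV_X0_X4_BR_X5;
    for (int steps = 0; steps < 8; steps++) {
        switch (pc) {
        case G_MOV_X0_X4_BR_X5:  x[X0] = x[X4];  pc = x[X5]; break;
        case G_MOV_X9_X0_BR_X1:  x[X9] = x[X0];  pc = x[X1]; break;
        case G_MOV_X10_X3_BR_X6: x[X10] = x[X3]; pc = x[X6]; break;
        case G_PACIA_X9_X10_STR:
            x[X9] = toy_pacia(x[X9], x[X10]);
            if (x[X2] > size || size - x[X2] < 0xF0 + sizeof(uint64_t))
                return -1; /* the store would fall outside the buffer */
            memcpy(buffer + x[X2] + 0xF0, &x[X9], sizeof(uint64_t));
            return 0;
        default:
            return -1; /* branched to something that is not a known gadget */
        }
    }
    return -1;
}
```

With X1 through X6 set up as in the program above, the model ends with toy_pacia(POINTER, CONTEXT) stored at offset 0xF0 of the buffer, matching the trace: X4 reaches X9 by way of X0, and X3 reaches X10.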
All that's left is to get control over X0 when doing a function call. We read in the IOAudio2DeviceUserClient's vtable and use our forgery gadgets to replace IOAudio2DeviceUserClient::getTargetAndTrapForIndex() with IOUserClient::getTargetAndTrapForIndex() and replace IOUserClient::getExternalTrapForIndex() with IORegistryEntry::getRegistryEntryID(). Then we overwrite the user client's registry entry ID field with a pointer to the IOExternalTrap. Finally we write the patched vtable into allocated kernel memory and replace the user client's vtable pointer with a forged pointer to our fake vtable.
And at this point we now have the ability to call arbitrary kernel functions with up to 7 arguments using the iokit_user_client_trap() technique, just like on non-PAC devices.
---- Running the exploit --------------------------------------------------------------------------
For best results, reboot the device and wait a few seconds before running the exploit. I've seen reliability above 99.5% on my devices after a fresh boot (the completed exploit has never failed for me).
Running the exploit twice without rebooting will almost certainly panic, since it will mess up the heap groom and possibly result in the base port having an address that is too large.
After getting kernel read/write and setting up kernel function calling, the exploit will trigger a panic by calling an invalid address with special values in registers X0 - X6 to demonstrate that function calling is successful.
---- Platforms ------------------------------------------------------------------------------------
I've tested on an iPhone 8, iPhone XR, and iPhone XS running iOS 12.1.2. You can add support for other devices in the files voucher_swap/parameters.c and voucher_swap/kernel_call/kc_parameters.c. The exploit currently assumes a 16K kernel page size, although it should be possible to remove this requirement. The PAC bypass also relies on certain gadgets which may be different on other versions or devices.
This vulnerability was fixed in iOS 12.1.3, released January 22, 2019: https://support.apple.com/en-us/HT209443
---- Other exploits -------------------------------------------------------------------------------
This bug was independently discovered and exploited by Qixun Zhao (@S0rryMybad) as part of a remote jailbreak. He developed a clever exploit strategy that reallocates the voucher with OSStrings; you can read about it here: http://blogs.360.cn/post/IPC%20Voucher%20UaF%20Remote%20Jailbreak%20Stage%202%20(EN).html