How release works internally in Objective-C

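I have always known that release simply decrements the reference count. A quick look at the source some time ago showed that the implementation is not that simple, so I recently spent some time working through this part of the runtime.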

1. Background

1.1 TaggedPointer

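After moving to 64-bit processors, Apple introduced Tagged Pointers to save memory and improve performance. For 64-bit programs, Tagged Pointers let some basic data objects use roughly half the memory, with about 3x faster access and about 100x faster creation and destruction.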

For more details, see the article 深入理解Tagged Pointer (Understanding Tagged Pointer in Depth).

1.2 Pointer isa

The object's isa is a plain class pointer and carries no extra information, the exact opposite of a Nonpointer isa.

1.3 Nonpointer isa

The object's isa holds the class information and also uses some of its bits to store extra information.

The definition in the objc source:

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };
#endif
};

Expanding ISA_BITFIELD:

// arm64
#   define ISA_BITFIELD                                                      
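      uintptr_t nonpointer        : 1; // 1 = nonpointer isa, 0 = plain pointer isa
      uintptr_t has_assoc         : 1; // whether the object has associated objects
      uintptr_t has_cxx_dtor      : 1; // whether the object has a C++ destructor
      uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ // the class pointer value (stored shifted right by 3)
      uintptr_t magic             : 6; // magic value the debugger uses to tell a real object from uninitialized memory; 0x1a on arm64, 0x3b on x86_64
      uintptr_t weakly_referenced : 1; // whether the object has weak references
      uintptr_t deallocating      : 1; // whether the object is being deallocated
      uintptr_t has_sidetable_rc  : 1; // whether part of the reference count is stored in the side table
      uintptr_t extra_rc          : 19 // the reference count stored inline in the isa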
// x86_64
#   define ISA_BITFIELD                                                        
      uintptr_t nonpointer        : 1;                                         
      uintptr_t has_assoc         : 1;                                         
      uintptr_t has_cxx_dtor      : 1;                                         
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ 
      uintptr_t magic             : 6;                                         
      uintptr_t weakly_referenced : 1;                                         
      uintptr_t deallocating      : 1;                                         
      uintptr_t has_sidetable_rc  : 1;                                         
      uintptr_t extra_rc          : 8
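
To make the layout concrete, here is a minimal sketch that packs and unpacks a 64-bit word by hand using the x86_64 field widths from the listing above. The helper names (packIsa, classAddressOf, extraRCOf, isNonpointer) are made up for illustration and are not runtime APIs; the sketch also assumes the fields are laid out from bit 0 upward in the order listed.

```cpp
#include <cstdint>

// Field positions follow the x86_64 listing above, counted from bit 0
// (assumption: fields are allocated from the least significant bit upward).
constexpr uint64_t kNonpointerBit = 1ULL << 0;
constexpr uint64_t kClassMask     = ((1ULL << 44) - 1) << 3;  // shiftcls: 44 bits starting at bit 3
constexpr int      kExtraRCShift  = 56;                        // extra_rc: the top 8 bits

// Hypothetical helper: pack a class address and an inline count into one word.
constexpr uint64_t packIsa(uint64_t classAddress, uint64_t extraRC) {
    return kNonpointerBit | (classAddress & kClassMask) | (extraRC << kExtraRCShift);
}

// Hypothetical helpers: read the individual fields back out.
constexpr uint64_t classAddressOf(uint64_t bits) { return bits & kClassMask; }
constexpr uint64_t extraRCOf(uint64_t bits)      { return bits >> kExtraRCShift; }
constexpr bool     isNonpointer(uint64_t bits)   { return (bits & kNonpointerBit) != 0; }
```

Because class objects are at least 8-byte aligned, the three low bits of the address are always zero, which is what frees them up for the nonpointer, has_assoc and has_cxx_dtor flags.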

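ISA_BITFIELD is defined only when SUPPORT_PACKED_ISA is 1. As the condition below shows, 64-bit targets support nonpointer isa (except Win32 and simulator builds other than TARGET_OS_IOSMAC), while 32-bit targets do not.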

// Define SUPPORT_PACKED_ISA=1 on platforms that store the class in the isa 
// field as a maskable pointer with other data around it.
#if (!__LP64__  ||  TARGET_OS_WIN32  ||  \
     (TARGET_OS_SIMULATOR && !TARGET_OS_IOSMAC))
#   define SUPPORT_PACKED_ISA 0
#else
#   define SUPPORT_PACKED_ISA 1
#endif

2. The internals of release

- (oneway void)release {
    _objc_rootRelease(self);
}

NEVER_INLINE void
_objc_rootRelease(id obj)
{
    ASSERT(obj);

    obj->rootRelease();
}

ALWAYS_INLINE bool 
objc_object::rootRelease()
{
    return rootRelease(true, false);
}

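After a few hops, execution ends up in bool objc_object::rootRelease(bool performDealloc, bool handleUnderflow).
performDealloc: whether to send dealloc once this release drops the last reference.
handleUnderflow: whether to handle an extra_rc underflow by borrowing reference counts from the side table.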

2.1 TaggedPointer objects

if (isTaggedPointer()) return false;

Tagged pointers have no reference count at all, so release and dealloc return immediately; their memory is handled by the system.

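2.2 Releasing Pointer isa objects

For objects with a pointer isa (on architectures without nonpointer isa support, and on architectures that support it when the isa stores no extra information), the reference count lives in the side table, and release reads it from there and decrements it.

When nonpointer isa is not supported, rootRelease is implemented as: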

inline bool 
objc_object::rootRelease()
{
    if (isTaggedPointer()) return false;
    return sidetable_release(true);
}

When nonpointer isa is supported but the isa stores no extra information, the isa's nonpointer bit is 0.
The relevant logic, excerpted from rootRelease:

if (slowpath(!newisa.nonpointer)) { // nonpointer isa is supported, but this isa stores no extra information: the count lives in the side table, so call sidetable_release directly
     ClearExclusive(&isa.bits);
     if (rawISA()->isMetaClass()) return false;
     if (sideTableLocked) sidetable_unlock();
     return sidetable_release(performDealloc);
}

sidetable_release is implemented as follows:

uintptr_t
objc_object::sidetable_release(bool performDealloc)
{
#if SUPPORT_NONPOINTER_ISA
    ASSERT(!isa.nonpointer);
#endif
    SideTable& table = SideTables()[this];

    bool do_dealloc = false;

    table.lock();
    //try_emplace:https://en.cppreference.com/w/cpp/container/map/try_emplace
    auto it = table.refcnts.try_emplace(this, SIDE_TABLE_DEALLOCATING); // try_emplace does nothing if an entry for the key exists; otherwise it stores the value under the key
    auto &refcnt = it.first->second;
    if (it.second) { // there was no entry, so go straight to dealloc
        do_dealloc = true;
    } else if (refcnt < SIDE_TABLE_DEALLOCATING) { // 0b10
        // SIDE_TABLE_WEAKLY_REFERENCED may be set. Don't change it.
        do_dealloc = true;
        refcnt |= SIDE_TABLE_DEALLOCATING;
    } else if (! (refcnt & SIDE_TABLE_RC_PINNED)) { // #define SIDE_TABLE_RC_PINNED         (1UL<<(WORD_BITS-1))
        /*
         Worked example with WORD_BITS = 8, assuming the side table holds a count of 4 (0b100):
         SIDE_TABLE_RC_PINNED = 0b10000000
         refcnt = 0b00010000
         1. refcnt & SIDE_TABLE_RC_PINNED ==> 0b00010000 & 0b10000000 = 0b00000000
         2. The AND is nonzero only when the top bit of the counter is also 1; in that case the count has overflowed and is pinned, so this branch is skipped and the function returns false
        */
        refcnt -= SIDE_TABLE_RC_ONE; // 1UL<<2
        // Assuming the side table holds a count of 4 (0b100), i.e. refcnt = 0b00010000
        // 0b00010000 - 0b100 = 0b00001100 ==> rc = 0b11 = 3
        // 0b00001100 - 0b100 = 0b00001000 ==> rc = 0b10 = 2
    }
    }
    table.unlock();
    if (do_dealloc  &&  performDealloc) {
        ((void(*)(objc_object *, SEL))objc_msgSend)(this, @selector(dealloc));
    }
    return do_dealloc;
}

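2.2.1 The reference counter also carries flag bits

  • SIDE_TABLE_DEALLOCATING (1UL<<1): whether the object is being deallocated
  • SIDE_TABLE_WEAKLY_REFERENCED (1UL<<0): whether the object has weak references
  • SIDE_TABLE_RC_PINNED (1UL<<(WORD_BITS-1)): whether the count has hit its maximum; WORD_BITS is 64 on 64-bit and 32 on 32-bit
  • SIDE_TABLE_RC_ONE (1UL<<2): because the two low bits are flags, decrementing the count by 1 means subtracting this value

Assuming the side table holds a count of 4 (0b100), i.e. a raw value of 0b00010000:
0b00010000 - 0b100 = 0b00001100 ==> rc = 0b11 = 3
0b00001100 - 0b100 = 0b00001000 ==> rc = 0b10 = 2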
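
The same arithmetic can be written as a small sketch. The three constants copy the values listed above; encodeCount, decodeCount and flagsOf are hypothetical helpers that exist only for this illustration, and decodeCount ignores the pinned bit for simplicity.

```cpp
#include <cstdint>

// Values copied from the flag definitions listed above.
constexpr uintptr_t SIDE_TABLE_WEAKLY_REFERENCED = uintptr_t{1} << 0;
constexpr uintptr_t SIDE_TABLE_DEALLOCATING      = uintptr_t{1} << 1;
constexpr uintptr_t SIDE_TABLE_RC_ONE            = uintptr_t{1} << 2;

// Hypothetical helper: build a raw side table value from a count and flag bits.
constexpr uintptr_t encodeCount(uintptr_t count, uintptr_t flags) {
    return count * SIDE_TABLE_RC_ONE | flags;
}

// Hypothetical helper: the count sits above the two flag bits.
constexpr uintptr_t decodeCount(uintptr_t refcnt) { return refcnt >> 2; }

// Hypothetical helper: the two low bits are the flags.
constexpr uintptr_t flagsOf(uintptr_t refcnt) { return refcnt & (SIDE_TABLE_RC_ONE - 1); }
```

Subtracting SIDE_TABLE_RC_ONE lowers the decoded count by one and leaves the two flag bits untouched, which is why the runtime can decrement without masking the flags first.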

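2.2.2 Overall logic

  • The if (it.second) branch: when the side table has no entry for the object, go down the dealloc path; the counter is now marked SIDE_TABLE_DEALLOCATING.

Read this branch together with auto it = table.refcnts.try_emplace(this, SIDE_TABLE_DEALLOCATING). try_emplace does nothing if an entry for the key already exists, and otherwise stores the value under the key. This branch runs when there was no entry, in which case try_emplace has just stored SIDE_TABLE_DEALLOCATING.

  • The if (refcnt < SIDE_TABLE_DEALLOCATING) branch: a counter value below SIDE_TABLE_DEALLOCATING means at most the weak-reference flag is set, so the counter is marked SIDE_TABLE_DEALLOCATING and dealloc proceeds.
  • The if (! (refcnt & SIDE_TABLE_RC_PINNED)) branch: the count is greater than 0 and not pinned, so it is decremented by 1.

See the commented code above for the details.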
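
The three branches can be modelled outside the runtime with a short sketch. It is a simplified, single-threaded stand-in rather than the runtime's code: std::unordered_map replaces the runtime's RefcountMap, emplace stands in for try_emplace (for a plain integer value the two behave the same: nothing happens if the key already exists), and modelRelease and the k-prefixed constants are names invented for this example.

```cpp
#include <cstdint>
#include <unordered_map>

constexpr uintptr_t kDeallocating = uintptr_t{1} << 1;                            // SIDE_TABLE_DEALLOCATING
constexpr uintptr_t kRCOne        = uintptr_t{1} << 2;                            // SIDE_TABLE_RC_ONE
constexpr uintptr_t kPinned       = uintptr_t{1} << (sizeof(uintptr_t) * 8 - 1);  // SIDE_TABLE_RC_PINNED

using RefcountTable = std::unordered_map<const void *, uintptr_t>;

// Hypothetical model of sidetable_release without locking or the dealloc message.
// Returns true when the caller should go on to dealloc.
bool modelRelease(RefcountTable &table, const void *obj) {
    // Inserts kDeallocating only if obj has no entry yet.
    auto it = table.emplace(obj, kDeallocating);
    uintptr_t &refcnt = it.first->second;
    if (it.second) {
        return true;                    // branch 1: there was no entry
    }
    if (refcnt < kDeallocating) {
        refcnt |= kDeallocating;        // branch 2: only the weak flag (or nothing) is set
        return true;
    }
    if (!(refcnt & kPinned)) {
        refcnt -= kRCOne;               // branch 3: ordinary decrement
    }
    return false;
}
```

Note that an entry whose count has already dropped to zero still needs one more release before dealloc, which matches the side table holding the count beyond the first reference.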

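2.3 Releasing Nonpointer isa objects
2.3.1 A nonpointer isa stores reference count information in the isa itself. The rough flow is:

  1. First subtract 1 from extra_rc in the isa. If the subtraction does not underflow, store the result and return.
  2. If it underflows (extra_rc was already 0), check has_sidetable_rc. If it is set, borrow RC_HALF counts from the side table and store the borrowed count minus 1 into extra_rc. This step involves a series of retries and fallbacks whose only goal is to get the modified bits stored back into the isa; if the store fails, the borrowed counts are added back to the side table and the whole borrow sequence is retried.
  3. If has_sidetable_rc is false (or the side table turns out to be empty), there is nothing to borrow, and the dealloc path follows.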
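
The steps above can be sketched as a single-threaded model. It deliberately leaves out the atomic load/store retries and the locking, and it assumes the side table holds counts in whole RC_HALF chunks; ModelObject, modelRootRelease and kRCHalf are hypothetical names, with kRCHalf set to the x86_64 value 1<<7.

```cpp
#include <cstdint>

// Hypothetical stand-in for the fields of a nonpointer isa plus the side table count.
struct ModelObject {
    uint64_t extraRC = 0;           // inline count (extra_rc)
    bool hasSideTableRC = false;    // has_sidetable_rc
    uint64_t sideTableRC = 0;       // count parked in the side table
    bool deallocating = false;
};

constexpr uint64_t kRCHalf = 1ULL << 7;  // RC_HALF for an 8-bit extra_rc (assumption: x86_64)

// Returns true when this release is the one that deallocates the object.
bool modelRootRelease(ModelObject &obj) {
    // Step 1: the inline count can absorb the decrement.
    if (obj.extraRC > 0) {
        obj.extraRC--;
        return false;
    }
    // Step 2: underflow, so try to borrow RC_HALF from the side table.
    if (obj.hasSideTableRC) {
        uint64_t borrowed = obj.sideTableRC >= kRCHalf ? kRCHalf : 0;
        obj.sideTableRC -= borrowed;
        if (borrowed > 0) {
            obj.extraRC = borrowed - 1;  // redo the original decrement too
            return false;
        }
    }
    // Step 3: nothing left to borrow.
    obj.deallocating = true;
    return true;
}
```

With extraRC = 1 and one RC_HALF chunk in the side table, the model survives 1 + 1 + 127 releases and deallocates on the next one, 130 in total.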

2.3.2 Source code

bool 
objc_object::rootRelease(bool performDealloc, bool handleUnderflow) // true,false
{
    if (isTaggedPointer()) return false;

    bool sideTableLocked = false;

    isa_t oldisa;
    isa_t newisa;

 retry:
    do {
        oldisa = LoadExclusive(&isa.bits);
        newisa = oldisa;
        if (slowpath(!newisa.nonpointer)) { // nonpointer isa is supported, but this isa stores no extra information: the count lives in the side table, so call sidetable_release directly
            ClearExclusive(&isa.bits);
            if (rawISA()->isMetaClass()) return false;
            if (sideTableLocked) sidetable_unlock();
            return sidetable_release(performDealloc);
        }
        // don't check newisa.fast_rr; we already called any RR overrides
        // the nonpointer isa path
        uintptr_t carry;
        newisa.bits = subc(newisa.bits, RC_ONE, 0, &carry);  // extra_rc--
        if (slowpath(carry)) { // underflow: extra_rc was already 0
            // don't ClearExclusive()
            goto underflow; // for a nonpointer isa, go and borrow from the side table
        }
        }
    } while (slowpath(!StoreReleaseExclusive(&isa.bits, 
                                             oldisa.bits, newisa.bits)));

    if (slowpath(sideTableLocked)) sidetable_unlock();
    return false;

 underflow:
    // newisa.extra_rc-- underflowed: borrow from side table or deallocate
    // when extra_rc-- underflows, borrow RC_HALF counts at a time from the side table (1<<7 on x86_64)
    // abandon newisa to undo the decrement
    newisa = oldisa;

    if (slowpath(newisa.has_sidetable_rc)) { // part of the count is stored in the side table
        if (!handleUnderflow) {
            ClearExclusive(&isa.bits);
            return rootRelease_underflow(performDealloc); // calls rootRelease(performDealloc, true), i.e. re-enters this function with handleUnderflow=true, borrows counts from the side table [sidetable_subExtraRC_nolock(RC_HALF)] and stores them into extra_rc [StoreReleaseExclusive]
        }

        // Transfer retain count from side table to inline storage.

        if (!sideTableLocked) {
            ClearExclusive(&isa.bits);
            sidetable_lock();
            sideTableLocked = true;
            // Need to start over to avoid a race against 
            // the nonpointer -> raw pointer transition.
            goto retry;
        }

        // Try to remove some retain counts from the side table.        
        size_t borrowed = sidetable_subExtraRC_nolock(RC_HALF);

        // To avoid races, has_sidetable_rc must remain set 
        // even if the side table count is now zero.

        if (borrowed > 0) {
            // Side table retain count decreased.
            // Try to add them to the inline count.
            newisa.extra_rc = borrowed - 1;  // redo the original decrement too
            bool stored = StoreReleaseExclusive(&isa.bits, 
                                                oldisa.bits, newisa.bits);
            if (!stored) {
                // the store failed, so try storing once more
                // Inline update failed. 
                // Try it again right now. This prevents livelock on LL/SC 
                // architectures where the side table access itself may have 
                // dropped the reservation.
                isa_t oldisa2 = LoadExclusive(&isa.bits);
                isa_t newisa2 = oldisa2;
                if (newisa2.nonpointer) {
                    uintptr_t overflow;
                    newisa2.bits = 
                        addc(newisa2.bits, RC_ONE * (borrowed-1), 0, &overflow); // overflow is 1 when the sum exceeds what bits can represent, otherwise 0
                    if (!overflow) {
                        stored = StoreReleaseExclusive(&isa.bits, oldisa2.bits, 
                                                       newisa2.bits); // swap isa.bits from oldisa2.bits to newisa2.bits
                    }
                }
            }
            // still failed: put the counts back in the side table and run this flow again
            if (!stored) {
                // Inline update failed.
                // Put the retains back in the side table.
                sidetable_addExtraRC_nolock(borrowed);
                goto retry;
            }

            // Decrement successful after borrowing from side table.
            // This decrement cannot be the deallocating decrement - the side 
            // table lock and has_sidetable_rc bit ensure that if everyone 
            // else tried to -release while we worked, the last one would block.
            sidetable_unlock();
            return false;
        }
        else {
            // nothing left in the side table: no one holds the object any more, so fall through to deallocation
            // Side table is empty after all. Fall-through to the dealloc path.
        }
    }

    // Really deallocate.

    if (slowpath(newisa.deallocating)) { // release on an object that is already deallocating is an over-release error
        ClearExclusive(&isa.bits);
        if (sideTableLocked) sidetable_unlock();
        return overrelease_error();
        // does not actually return
    }
    newisa.deallocating = true;
    if (!StoreExclusive(&isa.bits, oldisa.bits, newisa.bits)) goto retry;

    if (slowpath(sideTableLocked)) sidetable_unlock();

    __c11_atomic_thread_fence(__ATOMIC_ACQUIRE);

    if (performDealloc) { // the last reference is gone, so send dealloc
        ((void(*)(objc_object *, SEL))objc_msgSend)(this, @selector(dealloc));
    }
    return true;
}

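The comments in the code give a rough picture of the flow. I could not find documentation on what two of the functions do, so I wrote some examples to test them.

subc: subtracts; carry is 1 when the subtraction underflows and 0 otherwise.
Internally subc calls __builtin_subcl. Test cases: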

void subcTestCase(void) {
    uintptr_t carryout;
    {
        uintptr_t left = 1;
        uintptr_t result = __builtin_subcl(left, 1, 0, &carryout);
        NSLog(@"case0: result = %lu carryout = %lu", result, carryout); // case0: result = 0 carryout = 0
    }
    {
        uintptr_t left = 2;
        uintptr_t result = __builtin_subcl(left, 1, 0, &carryout);
        NSLog(@"case1: result = %lu carryout = %lu", result, carryout); // case1: result = 1 carryout = 0
    }
    {
        uintptr_t left = 0;
        uintptr_t result = __builtin_subcl(left, 1, 0, &carryout);
        NSLog(@"case2: result = %lu carryout = %lu", result, carryout); // case2: result = 18446744073709551615 carryout = 1
    }
    {
        uintptr_t left = 0;
        uintptr_t result = __builtin_subcl(left, 5, 0, &carryout);
        NSLog(@"case3: result = %lu carryout = %lu", result, carryout); // case3: result = 18446744073709551611 carryout = 1
    }
}
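
The test cases above use small operands, while rootRelease subtracts RC_ONE from the whole isa word. A minimal sketch of why that only underflows when extra_rc is 0: extra_rc occupies the top bits, so any nonzero extra_rc makes the word at least RC_ONE. The sketch assumes the x86_64 value RC_ONE = 1<<56 and uses a plain comparison as a portable stand-in for the builtin; subtractRCOne is a hypothetical name.

```cpp
#include <cstdint>

constexpr uint64_t RC_ONE = 1ULL << 56;  // assumption: x86_64 layout, extra_rc is the top 8 bits

// Portable stand-in for subc(bits, RC_ONE, 0, &carry).
// carry is set exactly when bits < RC_ONE, i.e. when extra_rc is 0.
uint64_t subtractRCOne(uint64_t bits, bool *carry) {
    *carry = bits < RC_ONE;
    return bits - RC_ONE;  // wraps around on underflow, like the builtin's result
}
```

The bits below extra_rc pass through unchanged when there is no underflow, so the flags and the class pointer are not disturbed by the decrement.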

addc: adds; overflow is 1 when the sum exceeds what the first argument's type can represent, and 0 otherwise.
Test cases:

void addcTestCase(void) {
    // unsigned long uintptr_t
    uintptr_t borrowed = (1ULL<<7);
    uintptr_t right = (1ULL<<56) * (borrowed - 1);
    uintptr_t overflow;
    {
        uintptr_t left = 1ULL<<63;
        uintptr_t result = __builtin_addcl(left, right, 0, &overflow);
        NSLog(@"case0: result = %lu overflow = %lu", result, overflow); // case0: result = 18374686479671623680 overflow = 0
    }
    {
        uintptr_t left = NSUIntegerMax - 1;
        uintptr_t result = __builtin_addcl(left, right, 0, &overflow);
        NSLog(@"case1: result = %lu overflow = %lu", result, overflow); // case1: result = 9151314442816847870 overflow = 1
    }
}

3. The internals of dealloc

As noted above, when release drops the last reference it goes on to dealloc. How does dealloc treat TaggedPointer, Nonpointer isa, and Pointer isa objects?

dealloc calls _objc_rootDealloc(id obj), which in turn calls obj->rootDealloc();

Let's look directly at the implementation of rootDealloc.

On architectures without nonpointer isa support, it calls object_dispose directly:

inline void
objc_object::rootDealloc()
{
    if (isTaggedPointer()) return;
    object_dispose((id)this);
}

On architectures with nonpointer isa support:

inline void
objc_object::rootDealloc()
{
    if (isTaggedPointer()) return;  // fixme necessary?

    if (fastpath(isa.nonpointer  &&  
                 !isa.weakly_referenced  &&  
                 !isa.has_assoc  &&  
                 !isa.has_cxx_dtor  &&  
                 !isa.has_sidetable_rc))
    {
        // nothing in the isa needs extra teardown, so free the object directly
        assert(!sidetable_present());
        free(this);
    } 
    else {
        object_dispose((id)this);
    }
}

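3.1 Overall logic

On architectures without nonpointer isa support:

  • Call object_dispose directly

On architectures with nonpointer isa support:

  • A TaggedPointer returns immediately
  • A nonpointer isa object whose isa has no flags requiring extra work is freed directly
  • Otherwise object_dispose handles the teardown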
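
The decision for the nonpointer-capable case boils down to one predicate over the isa flags. A minimal sketch, with IsaFlags, DeallocPath and chooseDeallocPath invented for illustration (the tagged pointer check is left out because a tagged pointer never reaches this point):

```cpp
// Hypothetical plain-struct view of the isa flags that rootDealloc inspects.
struct IsaFlags {
    bool nonpointer;
    bool weaklyReferenced;
    bool hasAssoc;
    bool hasCxxDtor;
    bool hasSidetableRC;
};

enum class DeallocPath { FreeDirectly, ObjectDispose };

// Mirrors the fastpath condition: free directly only when the isa is a
// nonpointer isa and none of the flags asks for extra cleanup.
inline DeallocPath chooseDeallocPath(const IsaFlags &f) {
    bool nothingToCleanUp = f.nonpointer &&
                            !f.weaklyReferenced &&
                            !f.hasAssoc &&
                            !f.hasCxxDtor &&
                            !f.hasSidetableRC;
    return nothingToCleanUp ? DeallocPath::FreeDirectly : DeallocPath::ObjectDispose;
}
```

A pointer isa always takes the object_dispose path here, because without the inline flags the runtime cannot tell from the isa alone whether any cleanup is needed.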

3.2 object_dispose

id 
object_dispose(id obj)
{
    if (!obj) return nil;

    objc_destructInstance(obj);    
    free(obj);

    return nil;
}

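As you can see, when the isa carries special flags, dealloc has to do some extra work before calling free.

3.2.1 Extra teardown logic: objc_destructInstance

  • Call the C++ destructor, if any
  • Remove associated objects, if any
  • Clear the object's weak reference table entries and its side table count entry, if any
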
void *objc_destructInstance(id obj) 
{
    if (obj) {
        // Read all of the flags at once for performance.
        bool cxx = obj->hasCxxDtor(); // whether there is a C++ destructor
        bool assoc = obj->hasAssociatedObjects(); // whether there are associated objects

        // This order is important.
        if (cxx) object_cxxDestruct(obj); // call the C++ destructor
        if (assoc) _object_remove_assocations(obj); // remove associated objects
        obj->clearDeallocating(); // clear weak references and the side table entry
    }

    return obj;
}

3.2.2 clearDeallocating

inline void 
objc_object::clearDeallocating()
{
    if (slowpath(!isa.nonpointer)) {
        // Slow path for raw pointer isa.
        sidetable_clearDeallocating(); // pointer isa: look up the counter in the side table and clean up from there
    }
    else if (slowpath(isa.weakly_referenced  ||  isa.has_sidetable_rc)) {
        // Slow path for non-pointer isa with weak refs and/or side table data.
        clearDeallocating_slow(); // nonpointer isa with weak references or counts in the side table: do the matching cleanup
    }

    assert(!sidetable_present());
}

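Both paths handle objects that have weak references or reference counts stored in the side table. The difference is that a nonpointer isa carries these flags itself and can clean up based on them, whereas for a pointer isa the runtime has to look up the object's entry in the side table and, if one exists, clean up according to the flag bits in the counter.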

Handling for Pointer isa

void 
objc_object::sidetable_clearDeallocating()
{
    SideTable& table = SideTables()[this];

    // clear any weak table items
    // clear extra retain count and deallocating bit
    // (fixme warn or abort if extra retain count == 0 ?)
    table.lock();
    RefcountMap::iterator it = table.refcnts.find(this); // look up this object's entry in the side table
    if (it != table.refcnts.end()) { // found
        if (it->second & SIDE_TABLE_WEAKLY_REFERENCED) { // the weak flag bit tells whether there are weak references
            weak_clear_no_lock(&table.weak_table, (id)this); // clear the entries in the weak reference table
        }
        table.refcnts.erase(it); // remove the counter from the side table
    }
    table.unlock();
}

Handling for Nonpointer isa

NEVER_INLINE void
objc_object::clearDeallocating_slow()
{
    ASSERT(isa.nonpointer  &&  (isa.weakly_referenced || isa.has_sidetable_rc));

    SideTable& table = SideTables()[this];
    table.lock();
    if (isa.weakly_referenced) { // there are weak references
        weak_clear_no_lock(&table.weak_table, (id)this);
    }
    if (isa.has_sidetable_rc) { // count information is stored in the side table
        table.refcnts.erase(this);
    }
    table.unlock();
}

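4. Summary
  • To improve efficiency and save space, Apple introduced TaggedPointer, and on 64-bit it also adopted Nonpointer isa to make good use of the 64 bits: the isa is no longer just a pointer but also holds various flag bits plus a few bits of reference count, working together with the side table to store the full count.
  • Operating on the isa directly is clearly much faster than reading and updating a count in the side table, so Apple added a small optimization: when extra_rc in the isa runs out, the runtime borrows from the side table and stores the borrowed count in extra_rc. This improves efficiency and is worth learning from.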