A study of the dispatch_sync deadlock problem

jackjhu
2015.09.18 15:40

To begin, what does the following code output?

- (void)viewDidLoad {
    [super viewDidLoad];
    NSLog(@"Hello");
    dispatch_sync(dispatch_get_main_queue(), ^{
        NSLog(@"World");
    });
}

The answer: "Hello" is logged, and then the code deadlocks. Here is how the official documentation describes dispatch_sync:

Submits a block to a dispatch queue like dispatch_async(), however
dispatch_sync() will not return until the block has finished.

Calls to dispatch_sync() targeting the current queue will result
in dead-lock. Use of dispatch_sync() is also subject to the same
multi-party dead-lock problems that may result from the use of a mutex.
Use of dispatch_async() is preferred.

Unlike dispatch_async(), no retain is performed on the target queue. Because
calls to this function are synchronous, the dispatch_sync() "borrows" the
reference of the caller.

As an optimization, dispatch_sync() invokes the block on the current
thread when possible.

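If the target queue of dispatch_sync() is the current queue, a deadlock occurs (this does not apply to concurrent queues). dispatch_sync() is also exposed to the same deadlock problems we run into when using mutex locks with pthreads.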
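The rule above can be made concrete with a small single-threaded model in plain C. Everything in it (toy_queue, toy_sync, TOY_WOULD_DEADLOCK and the other toy_ names) is hypothetical and not part of libdispatch. The model only tracks which queue the current code is running on, and it refuses a synchronous submit to a serial queue that is already running the caller, which is the case where the real dispatch_sync would hang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types: NOT the libdispatch API. */
typedef void (*toy_function_t)(void *ctxt);

struct toy_queue {
    const char *label;
    int width;                 /* 1 = serial, >1 = concurrent */
};

enum toy_result { TOY_RAN = 0, TOY_WOULD_DEADLOCK = 1 };

/* The queue whose work item is executing right now (NULL = none). */
static struct toy_queue *toy_current_queue = NULL;

/* Synchronous submit: run func(ctxt) "on" dq and return when it is done.
 * A serial queue that is already running the caller cannot start another
 * item until the caller returns, so the real call would never return. */
enum toy_result
toy_sync(struct toy_queue *dq, toy_function_t func, void *ctxt)
{
    assert(dq != NULL && func != NULL);
    if (dq == toy_current_queue && dq->width == 1) {
        return TOY_WOULD_DEADLOCK;
    }
    struct toy_queue *old = toy_current_queue;
    toy_current_queue = dq;    /* func now runs "on" dq */
    func(ctxt);
    toy_current_queue = old;   /* restore the caller's queue */
    return TOY_RAN;
}

/* A work item that submits to `target` from inside another work item,
 * the way viewDidLoad submits to the main queue. */
struct toy_nested {
    struct toy_queue *target;
    enum toy_result inner;     /* result of the nested toy_sync */
    int ran;                   /* set to 1 if the nested work executed */
};

void toy_mark_ran(void *ctxt) { *(int *)ctxt = 1; }

void toy_outer(void *ctxt)
{
    struct toy_nested *n = ctxt;
    n->inner = toy_sync(n->target, toy_mark_ran, &n->ran);
}
```

Running toy_outer on a serial queue with that same queue as its target reports TOY_WOULD_DEADLOCK and leaves the nested work unexecuted; pointing it at a different queue lets the nested work run.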

That is what the documentation says; let's see how it is actually implemented. Code first:

source/queue.c

void
dispatch_sync(dispatch_queue_t dq, void (^work)(void))
{
    struct Block_basic *bb = (void *)work;
    dispatch_sync_f(dq, work, (dispatch_function_t)bb->Block_invoke);
}

DISPATCH_NOINLINE
void
dispatch_sync_f(dispatch_queue_t dq, void *ctxt, dispatch_function_t func)
{
    typeof(dq->dq_running) prev_cnt;
    dispatch_queue_t old_dq;

    if (dq->dq_width == 1) {
        return dispatch_barrier_sync_f(dq, ctxt, func);
    }

    // 1) ensure that this thread hasn't enqueued anything ahead of this call
    // 2) the queue is not suspended
    if (slowpath(dq->dq_items_tail) || slowpath(DISPATCH_OBJECT_SUSPENDED(dq))) {
        _dispatch_sync_f_slow(dq);
    } else {
        prev_cnt = dispatch_atomic_add(&dq->dq_running, 2) - 2;

        if (slowpath(prev_cnt & 1)) {
            if (dispatch_atomic_sub(&dq->dq_running, 2) == 0) {
                _dispatch_wakeup(dq);
            }
            _dispatch_sync_f_slow(dq);
        }
    }

    old_dq = _dispatch_thread_getspecific(dispatch_queue_key);
    _dispatch_thread_setspecific(dispatch_queue_key, dq);
    func(ctxt);
    _dispatch_workitem_inc();
    _dispatch_thread_setspecific(dispatch_queue_key, old_dq);

    if (slowpath(dispatch_atomic_sub(&dq->dq_running, 2) == 0)) {
        _dispatch_wakeup(dq);
    }
}

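Step 1. dispatch_sync casts our block to get at its invoke function pointer, then hands everything straight to dispatch_sync_f().

Step 2. dispatch_sync_f first checks the width of the queue passed in (dq_width). The main queue is a serial queue with a width of 1, so dispatch_barrier_sync_f is called next with three arguments: the target queue from dispatch_sync, the context (here the block itself), and func, the function pointer obtained from our block.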
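Steps 1 and 2 can be sketched in plain C without the Blocks extension. The names below (toy_block, toy_sync, toy_sync_f, toy_path) are hypothetical stand-ins, not libdispatch symbols. A block is modelled as a struct that carries its own invoke function pointer, the wrapper passes the block itself as the context, and the _f variant picks a path from the queue width.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_function_t)(void *ctxt);

/* Stand-in for a block: an invoke pointer plus captured state. */
struct toy_block {
    toy_function_t invoke;
    int captured;              /* example of a captured variable */
    int result;                /* written by the block body */
};

struct toy_queue { int width; };   /* 1 = serial, >1 = concurrent */

enum toy_path { TOY_BARRIER_PATH, TOY_CONCURRENT_PATH };

/* Mirrors the shape of dispatch_sync_f: a width of 1 selects the
 * barrier path. The toy runs the work inline on either path. */
enum toy_path
toy_sync_f(struct toy_queue *dq, void *ctxt, toy_function_t func)
{
    assert(dq != NULL && func != NULL);
    enum toy_path path =
        (dq->width == 1) ? TOY_BARRIER_PATH : TOY_CONCURRENT_PATH;
    func(ctxt);
    return path;
}

/* Mirrors the shape of dispatch_sync: the block is its own context. */
enum toy_path
toy_sync(struct toy_queue *dq, struct toy_block *work)
{
    return toy_sync_f(dq, work, work->invoke);
}

/* Body of the "block": doubles the captured value. */
void toy_double(void *ctxt)
{
    struct toy_block *b = ctxt;
    b->result = b->captured * 2;
}
```

Passing the block as the context is what lets a plain function pointer reach the captured variables: the invoke function receives the block and reads its fields.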

Next, let's look at the implementation of dispatch_barrier_sync_f:

source/queue.c

void
dispatch_barrier_sync_f(dispatch_queue_t dq, void *ctxt, dispatch_function_t func)
{
    dispatch_queue_t old_dq = _dispatch_thread_getspecific(dispatch_queue_key);

    // 1) ensure that this thread hasn't enqueued anything ahead of this call
    // 2) the queue is not suspended
    // 3) the queue is not weird
    if (slowpath(dq->dq_items_tail)
            || slowpath(DISPATCH_OBJECT_SUSPENDED(dq))
            || slowpath(!_dispatch_queue_trylock(dq))) {
        return _dispatch_barrier_sync_f_slow(dq, ctxt, func);
    }

    _dispatch_thread_setspecific(dispatch_queue_key, dq);
    func(ctxt);
    _dispatch_workitem_inc();
    _dispatch_thread_setspecific(dispatch_queue_key, old_dq);
    _dispatch_queue_unlock(dq);
}

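Step 3. dispatch_barrier_sync_f first checks three conditions:

  • whether the queue already has pending items (dq_items_tail is non-NULL);
  • whether the queue is suspended;
  • whether _dispatch_queue_trylock fails to lock the queue.

If none of these holds, that is, the queue is empty, not suspended, and the lock was acquired, the body of the if statement is skipped and the code below it runs. In short:

  • The queue lock is already held at this point, taken by _dispatch_queue_trylock in the condition (much like a successful mutex try-lock).
  • The function pointer for our block is invoked directly on the current thread.
  • The lock is released with _dispatch_queue_unlock, and the call returns.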
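As a rough illustration of that fast path, the sketch below uses C11 atomics. It assumes the try-lock is a compare-and-swap of a running counter from 0 to 1; that detail, and every name here (toy_queue, toy_trylock, toy_unlock, toy_barrier_sync_fast), is an assumption made for illustration and is not shown in the excerpt above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_function_t)(void *ctxt);

struct toy_queue {
    void *items_tail;          /* non-NULL when work is already queued */
    bool suspended;
    atomic_int running;        /* 0 = idle, 1 = held by one runner */
};

/* Assumed try-lock: succeed only if nobody is running on the queue. */
bool toy_trylock(struct toy_queue *dq)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&dq->running, &expected, 1);
}

void toy_unlock(struct toy_queue *dq)
{
    atomic_fetch_sub(&dq->running, 1);
}

/* Returns true if func ran on the fast path, false if the caller would
 * have to fall back to the slow path (which this sketch does not model). */
bool toy_barrier_sync_fast(struct toy_queue *dq, void *ctxt,
                           toy_function_t func)
{
    assert(dq != NULL && func != NULL);
    if (dq->items_tail != NULL || dq->suspended || !toy_trylock(dq)) {
        return false;          /* queued work, suspended, or already locked */
    }
    func(ctxt);                /* run the work inline while holding the lock */
    toy_unlock(dq);
    return true;
}

/* Example work item: increments the int that ctxt points to. */
void toy_increment(void *ctxt) { ++*(int *)ctxt; }
```

In this model a queue whose running counter is already 1 can never take the fast path, no matter which thread asks, so a caller that is itself the current runner is always pushed to the slow path.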

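In our example the main queue is clearly still busy with the view controller's work (viewDidLoad itself has not returned), so the fast path is not taken and execution falls into _dispatch_barrier_sync_f_slow().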

To get to the bottom of this, let's look at that function.

source/queue.c

static void
_dispatch_barrier_sync_f_slow(dispatch_queue_t dq, void *ctxt, dispatch_function_t func)
{
    
    // It's preferred to execute synchronous blocks on the current thread
    // due to thread-local side effects, garbage collection, etc. However,
    // blocks submitted to the main thread MUST be run on the main thread
    
    struct dispatch_barrier_sync_slow2_s dbss2 = {
        .dbss2_dq = dq,
#if DISPATCH_COCOA_COMPAT
        .dbss2_func = func,
        .dbss2_ctxt = ctxt,
#endif
        .dbss2_sema = _dispatch_get_thread_semaphore(),
    };
    struct dispatch_barrier_sync_slow_s {
        DISPATCH_CONTINUATION_HEADER(dispatch_barrier_sync_slow_s);
    } dbss = {
        .do_vtable = (void *)DISPATCH_OBJ_BARRIER_BIT,
        .dc_func = _dispatch_barrier_sync_f_slow_invoke,
        .dc_ctxt = &dbss2,
    };
//--------------- the key part is here ---------------
    _dispatch_queue_push(dq, (void *)&dbss);
    dispatch_semaphore_wait(dbss2.dbss2_sema, DISPATCH_TIME_FOREVER);
    _dispatch_put_thread_semaphore(dbss2.dbss2_sema);

#if DISPATCH_COCOA_COMPAT
    // Main queue bound to main thread
    if (dbss2.dbss2_func == NULL) {
        return;
    }
#endif
    dispatch_queue_t old_dq = _dispatch_thread_getspecific(dispatch_queue_key);
    _dispatch_thread_setspecific(dispatch_queue_key, dq);
    func(ctxt);
    _dispatch_workitem_inc();
    _dispatch_thread_setspecific(dispatch_queue_key, old_dq);
    dispatch_resume(dq);
}

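Step 4. Having established above that the main queue still has other work and the block cannot be run directly, execution enters _dispatch_barrier_sync_f_slow. How does it handle the block we submitted?

In _dispatch_barrier_sync_f_slow, _dispatch_queue_push pushes a continuation for our block onto the main queue's FIFO list, and the calling thread then waits on a semaphore, to be woken once the block is ready.

dispatch_semaphore_wait in turn hands off to _dispatch_semaphore_wait_slow(dsema, timeout), which keeps waiting until the semaphore is signaled; with DISPATCH_TIME_FOREVER there is no timeout.

So throughout this process the dispatch_sync we originally called never returns and the main queue is blocked (the main thread that drains it is stuck in the wait), while our block needs the main queue to execute it. And so the deadlock happily arises.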
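The cycle described above can be written down as a tiny waits-for graph. This is a hypothetical model in plain C, not libdispatch code: the node names and toy_is_deadlocked are made up for illustration, and each node is assumed to wait on at most one other node.

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_MAX_NODES = 8, TOY_NONE = -1 };

/* The three parties in the example. */
enum toy_node { TOY_MAIN_THREAD = 0, TOY_BLOCK = 1, TOY_MAIN_QUEUE = 2 };

/* waits_for[i] is the node that i is blocked on, or TOY_NONE if i can run. */
struct toy_wait_graph {
    int count;
    int waits_for[TOY_MAX_NODES];
};

/* Follow the waits-for chain from start. If the chain reaches a node that
 * can run, progress is possible. A chain with no cycle ends within `count`
 * steps, so a chain that is still going after that is stuck in a cycle. */
bool toy_is_deadlocked(const struct toy_wait_graph *g, int start)
{
    assert(g->count > 0 && g->count <= TOY_MAX_NODES);
    assert(start >= 0 && start < g->count);
    int node = start;
    for (int step = 0; step < g->count; ++step) {
        node = g->waits_for[node];
        if (node == TOY_NONE) {
            return false;      /* chain ends at a runnable node */
        }
        assert(node >= 0 && node < g->count);
    }
    return true;               /* never reached a runnable node */
}
```

For the example in this article the edges would be: the main thread waits for the block to signal the semaphore, the block waits for the main queue to dequeue it, and the main queue waits for the main thread to return. With an asynchronous submit the first edge disappears, the thread is runnable, and the chain ends.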

Finally:

Here is a diagram that illustrates the problem at a glance:

dispatch_sync.png