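An Introduction to Netty's Timing Wheel

Background

I have been working with Netty recently, and my team held a sharing session on Netty's timing wheel, so this article summarizes what I learned as a record. Timing wheels are widely used for timeout control, exception handling, lock control and similar tasks. The Netty timing wheel discussed here is one of the simpler implementations. More complex ones, such as Kafka's hierarchical (multi-level) timing wheel, are not covered.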

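How It Works

First, what is a timing wheel? Picture a watch: the hand advances at a fixed interval, the tick, each advance covers one slice of time, and pending tasks are bound to the marks on the dial. When the hand reaches a mark, the tasks bound to it can run. For example, suppose each mark represents one second and some timeout is 10 s. The timeout is then placed on the 10 s mark (assuming one full turn of the wheel lasts longer than 10 s, i.e. the wheel has more than 10 slots). When the hand moves from 0 to the 10 s mark, that timeout check runs.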

[Figure: timing wheel]

That is the principle of a simple timing wheel. Practical timing wheels are more complex.
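
The simple wheel described above can be sketched in a few lines of Java. This is only an illustration of the principle, not Netty code: the class `SimpleTimingWheel` and its methods are hypothetical names, ticks are advanced by hand instead of by a sleeping thread, and delays must fit within a single turn of the wheel.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal single-turn timing wheel (hypothetical, for illustration only).
class SimpleTimingWheel {
    private final List<List<Runnable>> slots = new ArrayList<>();
    private int current; // index of the slot the hand currently points at

    SimpleTimingWheel(int slotCount) {
        for (int i = 0; i < slotCount; i++) {
            slots.add(new ArrayList<>());
        }
    }

    // Bind a task to the slot that the hand reaches after delayTicks ticks.
    void schedule(int delayTicks, Runnable task) {
        if (delayTicks <= 0 || delayTicks >= slots.size()) {
            throw new IllegalArgumentException("delay must fit in one turn: " + delayTicks);
        }
        slots.get((current + delayTicks) % slots.size()).add(task);
    }

    // Advance the hand by one tick and run every task bound to the new slot.
    void tick() {
        current = (current + 1) % slots.size();
        List<Runnable> due = new ArrayList<>(slots.get(current));
        slots.get(current).clear();
        due.forEach(Runnable::run);
    }

    public static void main(String[] args) {
        SimpleTimingWheel wheel = new SimpleTimingWheel(16);
        wheel.schedule(10, () -> System.out.println("10-tick timeout fired"));
        for (int t = 1; t <= 10; t++) {
            wheel.tick(); // a real timer would sleep for one tick before each call
        }
    }
}
```

With 16 slots and one-second ticks, this wheel can only express delays shorter than 16 s, which is the limitation that round counting or multiple levels are meant to remove.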

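Design Reference

Almost all timing wheel implementations are based on the paper by George Varghese and Tony Lauck, listed in the references at the end. On top of the principle above, Netty's timing wheel records a round count in each scheduled task. A task runs only when the hand reaches its slot and its round count has dropped to 0. Otherwise the count is decremented and the task waits for the next turn.

Note: the Netty version used here is 4.1.51.Final.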
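
To make the round counting concrete, here is a sketch of how a deadline could be mapped to a slot index plus a number of remaining rounds. The class `WheelPlacement` and its `place` method are hypothetical names chosen for this illustration, and the arithmetic is a simplified model rather than a copy of Netty's source. It assumes the wheel length is a power of two.

```java
// Maps a deadline (in ticks) to a slot and a round count (hypothetical helper).
class WheelPlacement {
    final long remainingRounds;
    final int slot;

    WheelPlacement(long remainingRounds, int slot) {
        this.remainingRounds = remainingRounds;
        this.slot = slot;
    }

    // deadlineTicks: ticks after timer start at which the task is due
    // currentTick:   ticks the hand has already advanced
    // wheelLength:   number of slots, assumed to be a power of two
    static WheelPlacement place(long deadlineTicks, long currentTick, int wheelLength) {
        // Full turns the hand must complete before the task is due.
        long rounds = Math.max((deadlineTicks - currentTick) / wheelLength, 0);
        // A deadline that is already in the past goes into the current slot.
        long target = Math.max(deadlineTicks, currentTick);
        int slot = (int) (target & (wheelLength - 1));
        return new WheelPlacement(rounds, slot);
    }

    public static void main(String[] args) {
        // 150 ticks on a 64-slot wheel: two full turns, then slot 22.
        WheelPlacement p = place(150, 0, 64);
        System.out.println("rounds=" + p.remainingRounds + ", slot=" + p.slot);
    }
}
```

A task due in 150 ticks on a 64-slot wheel therefore sits in slot 22 and is skipped twice before it runs on the hand's third visit.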

Netty Implementation

  1. Constructing the timing wheel

    public HashedWheelTimer(
        ThreadFactory threadFactory,
        long tickDuration, TimeUnit unit, int ticksPerWheel, boolean leakDetection) {
        this(threadFactory, tickDuration, unit, ticksPerWheel, leakDetection, -1);
    }
    
        public HashedWheelTimer(
                ThreadFactory threadFactory,
                long tickDuration, TimeUnit unit, int ticksPerWheel, boolean leakDetection,
                long maxPendingTimeouts) {
    
            if (threadFactory == null) {
                throw new NullPointerException("threadFactory");
            }
            if (unit == null) {
                throw new NullPointerException("unit");
            }
            if (tickDuration <= 0) {
                throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
            }
            if (ticksPerWheel <= 0) {
                throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
            }
    
            // Normalize ticksPerWheel to power of two and initialize the wheel.
            wheel = createWheel(ticksPerWheel);
            mask = wheel.length - 1;
    
            // Convert tickDuration to nanos.
            long duration = unit.toNanos(tickDuration);
    
            // Prevent overflow.
            if (duration >= Long.MAX_VALUE / wheel.length) {
                throw new IllegalArgumentException(String.format(
                        "tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
                        tickDuration, Long.MAX_VALUE / wheel.length));
            }
    
            if (duration < MILLISECOND_NANOS) {
                if (logger.isWarnEnabled()) {
                    logger.warn("Configured tickDuration %d smaller then %d, using 1ms.",
                                tickDuration, MILLISECOND_NANOS);
                }
                this.tickDuration = MILLISECOND_NANOS;
            } else {
                this.tickDuration = duration;
            }
    
            // Create the worker (scheduling) thread.
            workerThread = threadFactory.newThread(worker);
    
            leak = leakDetection || !workerThread.isDaemon() ? leakDetector.track(this) : null;
    
            this.maxPendingTimeouts = maxPendingTimeouts;
    
            if (INSTANCE_COUNTER.incrementAndGet() > INSTANCE_COUNT_LIMIT &&
                WARNED_TOO_MANY_INSTANCES.compareAndSet(false, true)) {
                reportTooManyInstances();
            }
        }

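Apart from some parameter validation, the first method worth looking at, in order, is createWheel. It first passes the requested number of slots per turn through normalizeTicksPerWheel(), which rounds it up to the nearest power of two, so 7 becomes 8 and 13 becomes 16. This allows the remainder to be computed quickly with a bitwise &, using mask = wheel.length - 1 defined right afterwards. The same trick is common in HashMap and in many middleware projects. createWheel then allocates the bucket array (one linked list per slot) with the normalized length. Back in the constructor, a worker thread is created to walk the wheel and run the scheduled tasks.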


    private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException(
                    "ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }
        if (ticksPerWheel > 1073741824) {
            throw new IllegalArgumentException(
                    "ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
        }

        ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel);
        HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
        for (int i = 0; i < wheel.length; i ++) {
            wheel[i] = new HashedWheelBucket();
        }
        return wheel;
    }

    private static int normalizeTicksPerWheel(int ticksPerWheel) {
        int normalizedTicksPerWheel = 1;
        while (normalizedTicksPerWheel < ticksPerWheel) {
            normalizedTicksPerWheel <<= 1;
        }
        return normalizedTicksPerWheel;
    }
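
The following small check illustrates why the length is rounded up to a power of two: for such a length, `tick & (length - 1)` gives the same result as `tick % length` for non-negative ticks. The class name `PowerOfTwoMask` is hypothetical, and `normalize` simply repeats the loop shown above.

```java
// Demonstrates that masking equals modulo when the length is a power of two.
class PowerOfTwoMask {
    // Same rounding loop as normalizeTicksPerWheel above.
    static int normalize(int ticksPerWheel) {
        int normalized = 1;
        while (normalized < ticksPerWheel) {
            normalized <<= 1;
        }
        return normalized;
    }

    // True if (tick & mask) == (tick % length) for every tick in [0, upTo).
    static boolean maskMatchesModulo(int length, long upTo) {
        int mask = length - 1;
        for (long tick = 0; tick < upTo; tick++) {
            if ((tick & mask) != (tick % length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int requested : new int[] {7, 13, 64}) {
            int length = normalize(requested);
            System.out.println(requested + " -> " + length
                    + ", mask matches modulo: " + maskMatchesModulo(length, 1000));
        }
    }
}
```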

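  2. The worker thread
    The part of the worker thread to look at is the run method. It waits for the current tick to elapse and obtains that tick's deadline, maps the tick to a bucket with tick & mask, and then lets that bucket expire its timeouts against the deadline. The bucket checks whether each timeout's rounds have run out, whether it has been cancelled, and whether it is actually due (timeout.deadline <= deadline). See io.netty.util.HashedWheelTimer.HashedWheelBucket.expireTimeouts for the details.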

    private final class Worker implements Runnable {
        private final Set<Timeout> unprocessedTimeouts = new HashSet<Timeout>();

        private long tick;

        @Override
        public void run() {
            // Initialize the startTime.
            startTime = System.nanoTime();
            if (startTime == 0) {
                // We use 0 as an indicator for the uninitialized value here, so make sure it's not 0 when initialized.
                startTime = 1;
            }

            // Notify the other threads waiting for the initialization at start().
            startTimeInitialized.countDown();

            do {
                final long deadline = waitForNextTick();
                if (deadline > 0) {
                    int idx = (int) (tick & mask);
                    processCancelledTasks();
                    HashedWheelBucket bucket =
                            wheel[idx];
                    transferTimeoutsToBuckets();
                    bucket.expireTimeouts(deadline);
                    tick++;
                }
            } while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);

            // Fill the unprocessedTimeouts so we can return them from stop() method.
            for (HashedWheelBucket bucket: wheel) {
                bucket.clearTimeouts(unprocessedTimeouts);
            }
            for (;;) {
                HashedWheelTimeout timeout = timeouts.poll();
                if (timeout == null) {
                    break;
                }
                if (!timeout.isCancelled()) {
                    unprocessedTimeouts.add(timeout);
                }
            }
            processCancelledTasks();
        }
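
The loop above depends on waitForNextTick(), which is not part of the excerpt. As a rough sketch of what such a method has to do, assume that tick n ends at tickDuration * (n + 1) nanoseconds after the start time and that the caller sleeps until then. The class `TickWaiter` below is a hypothetical stand-in written for this article, not Netty's implementation. Unlike the excerpt, it advances the tick counter inside the wait method to keep the example short.

```java
import java.util.concurrent.TimeUnit;

// Sleeps until the end of each tick (simplified, hypothetical sketch).
class TickWaiter {
    private final long tickDurationNanos;
    private final long startTime = System.nanoTime();
    private long tick;

    TickWaiter(long tickDuration, TimeUnit unit) {
        this.tickDurationNanos = unit.toNanos(tickDuration);
    }

    // End of the given tick in nanoseconds, relative to the start time.
    static long tickDeadline(long tickDurationNanos, long tick) {
        return tickDurationNanos * (tick + 1);
    }

    // Blocks until the current tick has fully elapsed and returns the elapsed
    // nanoseconds since start, or Long.MIN_VALUE if the thread was interrupted.
    long waitForNextTick() {
        long deadline = tickDeadline(tickDurationNanos, tick);
        for (;;) {
            long elapsed = System.nanoTime() - startTime;
            // Round up to whole milliseconds so we never wake up too early.
            long sleepMs = (deadline - elapsed + 999_999) / 1_000_000;
            if (sleepMs <= 0) {
                tick++;
                return elapsed;
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Long.MIN_VALUE;
            }
        }
    }

    public static void main(String[] args) {
        TickWaiter waiter = new TickWaiter(50, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 3; i++) {
            long elapsed = waiter.waitForNextTick();
            System.out.println("tick " + i + " ended after "
                    + TimeUnit.NANOSECONDS.toMillis(elapsed) + " ms");
        }
    }
}
```

Under this model a bucket is handled only after its whole tick has passed, so a task can fire up to one tick later than its nominal delay.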
        
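  3. The bucket behind each slot
    Each slot of the wheel maps to the bucket below, whose elements are the actual scheduled tasks, organized as a doubly linked list. expireTimeouts is the method described above: when the bucket's tick arrives, it checks the tasks in the bucket and runs those that are due.
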
    private static final class HashedWheelBucket {
        // Used for the linked-list datastructure
        private HashedWheelTimeout head;
        private HashedWheelTimeout tail;

        /**
         * Add {@link HashedWheelTimeout} to this bucket.
         */
        public void addTimeout(HashedWheelTimeout timeout) {
            assert timeout.bucket == null;
            timeout.bucket = this;
            if (head == null) {
                head = tail = timeout;
            } else {
                tail.next = timeout;
                timeout.prev = tail;
                tail = timeout;
            }
        }

        /**
         * Expire all {@link HashedWheelTimeout}s for the given {@code deadline}.
         */
        public void expireTimeouts(long deadline) {
            HashedWheelTimeout timeout = head;

            // process all timeouts
            while (timeout != null) {
                HashedWheelTimeout next = timeout.next;
                if (timeout.remainingRounds <= 0) {
                    next = remove(timeout);
                    if (timeout.deadline <= deadline) {
                        timeout.expire();
                    } else {
                        // The timeout was placed into a wrong slot. This should never happen.
                        throw new IllegalStateException(String.format(
                                "timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
                    }
                } else if (timeout.isCancelled()) {
                    next = remove(timeout);
                } else {
                    timeout.remainingRounds --;
                }
                timeout = next;
            }
        }
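
expireTimeouts calls remove(timeout) and continues with the node it returns, but remove itself is not shown in the excerpt. The sketch below shows one way to unlink a node from a doubly linked list and hand back its successor. `LinkedBucket` and `Node` are hypothetical names for this illustration, and the code is not taken from Netty.

```java
// A doubly linked bucket whose remove() returns the next node (hypothetical).
class LinkedBucket {
    static final class Node {
        final String name;
        Node prev;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    private Node head;
    private Node tail;

    // Append a node at the tail of the list.
    void add(Node node) {
        if (head == null) {
            head = tail = node;
        } else {
            tail.next = node;
            node.prev = tail;
            tail = node;
        }
    }

    // Unlink the node and return its successor so iteration can continue.
    Node remove(Node node) {
        Node next = node.next;
        if (node.prev != null) {
            node.prev.next = next;
        }
        if (node.next != null) {
            node.next.prev = node.prev;
        }
        if (node == head) {
            head = next;
        }
        if (node == tail) {
            tail = node.prev;
        }
        // Clear the links so the removed node does not keep the list reachable.
        node.prev = null;
        node.next = null;
        return next;
    }

    // Node names from head to tail, joined with " -> ".
    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LinkedBucket bucket = new LinkedBucket();
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        bucket.add(a);
        bucket.add(b);
        bucket.add(c);
        bucket.remove(b);
        System.out.println(bucket.describe());
    }
}
```

Removal is O(1) because each node carries both neighbours, which is why cancelling or expiring a task in the middle of a bucket needs no list scan.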

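  4. The scheduled task element
    Below is the element that represents a scheduled task. It implements Netty's Timeout interface (io.netty.util.Timeout) and holds a reference to the timer, the task itself, its own deadline, plus its state, remaining rounds, bucket, and the previous and next list pointers. Besides methods such as cancel and remove, the core is expire(), which actually runs the task. Every element has its own deadline, the point in time at which it is expected to run.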

    private static final class HashedWheelTimeout implements Timeout {

        private static final int ST_INIT = 0;
        private static final int ST_CANCELLED = 1;
        private static final int ST_EXPIRED = 2;
        private static final AtomicIntegerFieldUpdater<HashedWheelTimeout> STATE_UPDATER =
                AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimeout.class, "state");

        private final HashedWheelTimer timer;
        private final TimerTask task;
        private final long deadline;

        @SuppressWarnings({"unused", "FieldMayBeFinal", "RedundantFieldInitialization" })
        private volatile int state = ST_INIT;

        // remainingRounds will be calculated and set by Worker.transferTimeoutsToBuckets() before the
        // HashedWheelTimeout will be added to the correct HashedWheelBucket.
        long remainingRounds;

        // This will be used to chain timeouts in HashedWheelTimerBucket via a double-linked-list.
        // As only the workerThread will act on it there is no need for synchronization / volatile.
        HashedWheelTimeout next;
        HashedWheelTimeout prev;

        // The bucket to which the timeout was added
        HashedWheelBucket bucket;

        HashedWheelTimeout(HashedWheelTimer timer, TimerTask task, long deadline) {
            this.timer = timer;
            this.task = task;
            this.deadline = deadline;
        }

        @Override
        public Timer timer() {
            return timer;
        }

        @Override
        public TimerTask task() {
            return task;
        }

        @Override
        public boolean cancel() {
            // only update the state it will be removed from HashedWheelBucket on next tick.
            if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
                return false;
            }
            // If a task should be canceled we put this to another queue which will be processed on each tick.
            // So this means that we will have a GC latency of max. 1 tick duration which is good enough. This way
            // we can make again use of our MpscLinkedQueue and so minimize the locking / overhead as much as possible.
            timer.cancelledTimeouts.add(this);
            return true;
        }

        void remove() {
            HashedWheelBucket bucket = this.bucket;
            if (bucket != null) {
                bucket.remove(this);
            } else {
                timer.pendingTimeouts.decrementAndGet();
            }
        }

        public void expire() {
            if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
                return;
            }

            try {
                task.run(this);
            } catch (Throwable t) {
                if (logger.isWarnEnabled()) {
                    logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
                }
            }
        }
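
Both cancel() and expire() above start with a compare-and-set from ST_INIT, so whichever runs first wins and the other becomes a no-op. The following stand-alone sketch reproduces just that state machine with AtomicIntegerFieldUpdater. The class `TimeoutState` is a hypothetical reduction made for this article and leaves out the task, the queue and the bucket.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Only the state transitions of a timeout (hypothetical reduction).
class TimeoutState {
    static final int ST_INIT = 0;
    static final int ST_CANCELLED = 1;
    static final int ST_EXPIRED = 2;

    // The updater needs a volatile int field, looked up by name via reflection.
    private static final AtomicIntegerFieldUpdater<TimeoutState> STATE_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(TimeoutState.class, "state");

    private volatile int state = ST_INIT;

    // Succeeds only if the timeout is still in its initial state.
    boolean cancel() {
        return STATE_UPDATER.compareAndSet(this, ST_INIT, ST_CANCELLED);
    }

    // Succeeds only if the timeout has been neither cancelled nor expired.
    boolean expire() {
        return STATE_UPDATER.compareAndSet(this, ST_INIT, ST_EXPIRED);
    }

    int state() {
        return state;
    }

    public static void main(String[] args) {
        TimeoutState timeout = new TimeoutState();
        System.out.println("cancel succeeded: " + timeout.cancel());
        System.out.println("expire after cancel succeeded: " + timeout.expire());
    }
}
```

Compared with one AtomicInteger per timeout, a static field updater avoids allocating an extra object for every scheduled task, which matters when many timeouts are pending.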

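  5. Adding a task
    Adding a task means creating the scheduling object shown above. This is also where the timer is started, via start(). The new element is first added to a queue, and during scheduling the worker moves it from the queue into the buckets. The queue is created with timeouts = PlatformDependent.newMpscQueue();, which is backed by JCTools, a library of high-performance concurrent queues.
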
    @Override
    public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }

        long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();

        if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
            pendingTimeouts.decrementAndGet();
            throw new RejectedExecutionException("Number of pending timeouts ("
                + pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
                + "timeouts (" + maxPendingTimeouts + ")");
        }

        start();

        // Add the timeout to the timeout queue which will be processed on the next tick.
        // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
        long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;

        // Guard against overflow.
        if (delay > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE;
        }
        HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
        timeouts.add(timeout);
        return timeout;
    }
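
The hand-off from producer threads to the worker can be pictured as follows: callers only append to a shared queue, and the single worker thread drains a bounded number of entries per tick into its own buckets, so the buckets themselves never need locking. This is a simplified sketch under assumptions of mine. `QueueHandoff` is a hypothetical class, `ConcurrentLinkedQueue` from the JDK stands in for the MPSC queue, and the per-tick limit is just a parameter here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Producers enqueue deadlines; one worker moves them into buckets (hypothetical).
class QueueHandoff {
    // Stand-in for the multi-producer single-consumer queue.
    private final Queue<Long> pending = new ConcurrentLinkedQueue<>();
    // Touched only by the worker thread, so plain lists are sufficient.
    private final List<List<Long>> buckets = new ArrayList<>();
    private final int mask;

    // wheelLength is assumed to be a power of two.
    QueueHandoff(int wheelLength) {
        for (int i = 0; i < wheelLength; i++) {
            buckets.add(new ArrayList<>());
        }
        mask = wheelLength - 1;
    }

    // Safe to call from any thread: it never touches the buckets.
    void submit(long deadlineTicks) {
        pending.add(deadlineTicks);
    }

    // Called by the worker once per tick; returns how many entries were moved.
    int transfer(int maxPerTick) {
        int moved = 0;
        while (moved < maxPerTick) {
            Long deadlineTicks = pending.poll();
            if (deadlineTicks == null) {
                break; // queue drained
            }
            buckets.get((int) (deadlineTicks & mask)).add(deadlineTicks);
            moved++;
        }
        return moved;
    }

    int bucketSize(int slot) {
        return buckets.get(slot).size();
    }

    public static void main(String[] args) throws InterruptedException {
        QueueHandoff wheel = new QueueHandoff(8);
        Thread first = new Thread(() -> {
            wheel.submit(3);
            wheel.submit(11);
        });
        Thread second = new Thread(() -> wheel.submit(4));
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println("moved " + wheel.transfer(100) + " timeouts");
        System.out.println("slot 3 holds " + wheel.bucketSize(3) + " timeouts");
    }
}
```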

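  6. Starting the timer
    The timer's start() method is public, but there is no need to call it explicitly because adding a task starts the timer. A wheel with no tasks has no reason to run. Note that all state checks here go through atomic classes, chiefly AtomicIntegerFieldUpdater, the JDK's reflection-based utility for atomically updating a volatile int field.

    public void start() {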
        switch (WORKER_STATE_UPDATER.get(this)) {
            case WORKER_STATE_INIT:
                if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
                    workerThread.start();
                }
                break;
            case WORKER_STATE_STARTED:
                break;
            case WORKER_STATE_SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid WorkerState");
        }

        // Wait until the startTime is initialized by the worker.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignore) {
                // Ignore - it will be ready very soon.
            }
        }
    }
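
The start-once pattern above can be tried in isolation. In the sketch below, a compare-and-set guarantees that only one caller launches the worker thread, and a CountDownLatch makes every caller wait until the worker has finished its initialization. `LazyStarter` is a hypothetical class for illustration, and it uses an AtomicInteger where the excerpt uses a field updater.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Starts a worker thread at most once, on first use (hypothetical sketch).
class LazyStarter {
    static final int INIT = 0;
    static final int STARTED = 1;
    static final int SHUTDOWN = 2;

    private final AtomicInteger state = new AtomicInteger(INIT);
    private final CountDownLatch initialized = new CountDownLatch(1);
    private final AtomicInteger launches = new AtomicInteger();
    private final Thread worker = new Thread(() -> {
        launches.incrementAndGet(); // stands in for recording the start time
        initialized.countDown();    // release every caller blocked in start()
    });

    void start() {
        switch (state.get()) {
            case INIT:
                // Only the caller that wins the CAS starts the thread.
                if (state.compareAndSet(INIT, STARTED)) {
                    worker.start();
                }
                break;
            case STARTED:
                break;
            case SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new IllegalStateException("invalid state");
        }
        // Wait until the worker has finished its initialization.
        boolean interrupted = false;
        while (initialized.getCount() > 0) {
            try {
                initialized.await();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    int launches() {
        return launches.get();
    }

    public static void main(String[] args) {
        LazyStarter starter = new LazyStarter();
        starter.start();
        starter.start(); // second call finds the STARTED state and does nothing
        System.out.println("worker launched " + starter.launches() + " time(s)");
    }
}
```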

Example

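import io.netty.util.HashedWheelTimer;
import org.junit.Test;

import java.time.LocalDateTime;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// The class name is illustrative; any JUnit 4 test class will do.
public class HashedWheelTimerTest {

    @Test
    public void testTimeWheel() throws InterruptedException {
        // One-second ticks, 64 slots, leak detection disabled.
        HashedWheelTimer wheelTimer = new HashedWheelTimer(Executors.defaultThreadFactory(), 1, TimeUnit.SECONDS, 64, false);
        // Counted down by each task so the test ends instead of blocking forever.
        CountDownLatch done = new CountDownLatch(2);
        System.out.println("time wheel start@" + LocalDateTime.now());
        wheelTimer.newTimeout(timeout -> {
            System.out.println("timeout 10s --> " + LocalDateTime.now());
            done.countDown();
        }, 10, TimeUnit.SECONDS);
        wheelTimer.newTimeout(timeout -> {
            System.out.println("timeout 20s --> " + LocalDateTime.now());
            done.countDown();
        }, 20, TimeUnit.SECONDS);
        done.await();
        wheelTimer.stop();
    }
}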

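Output. Each task fires about one second after its nominal delay, which is consistent with a bucket being processed only once its whole one-second tick has elapsed: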

time wheel start@2020-11-13T18:10:03.232
timeout 10s --> 2020-11-13T18:10:14.234
timeout 20s --> 2020-11-13T18:10:24.233

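Summary

That is all for this article, which gives a first look at Netty's timing wheel. Some of the deeper details still deserve further study. Thanks for reading. Interested readers can dig deeper with the references below.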

References

  1. Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility
  2. George Varghese and Tony Lauck's slide
  3. 时间轮详解 (Timing Wheels in Detail)
  4. Netty学习 (Learning Netty)
  5. netty-hashedwheeltimer
