Understanding HotSpot Safepoints

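Safepoints are a core mechanism in HotSpot. A safepoint is a position in the executing code, chosen in advance, at which a thread can be stopped: when the JVM needs to perform a stop-the-world (STW) operation, every thread runs to one of these positions and waits there until the operation completes. There must be neither too few safepoints nor too many. With too few, an operation that needs STW waits too long for the threads to arrive; with too many, the program checks for safepoints so often that the checks themselves add measurable overhead.

Inside HotSpot, the typical STW operation is GC (garbage collection), and the reason is easy to see. If user threads keep running during a collection, they keep producing new garbage, which is called "floating garbage". Worse, a running user thread may create a new reference to an object that the collector has already marked as garbage; the collector would then reclaim an object that is still referenced and the program would fail. So GC needs STW. As GC technology has evolved, some collectors (CMS, for example) can run concurrently with user threads, but this only means that the collection is split into phases: some phases run concurrently with the application, while others still need STW. CMS splits a collection into four phases (initial mark, concurrent mark, remark, and concurrent sweep), of which initial mark and remark are still STW.

Is GC the only operation that needs STW? No. The article "Java 动态调试技术原理及实践" discusses redefining classes in a running JVM and presents a dynamic debugging tool called Java-debug-tool. According to that article, the tool uses JVMTI (JVM Tool Interface) to implement a Java agent that can be attached to a target JVM at run time, accept commands, instrument classes, and reload them, so that the methods of the reloaded classes emit detailed debugging information when they run. In HotSpot this technique needs STW just as GC does (the corresponding VM_Operation is VM_RedefineClasses). This article concentrates on how safepoints are implemented, so VM_RedefineClasses and VM_Operation are not discussed further.

In HotSpot, safepoints are implemented by SafepointSynchronize. The following is a typical snippet for entering and leaving a safepoint: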

// ... enter the safepoint
SafepointSynchronize::begin();

// do the stw work here

// ... exit the safepoint
SafepointSynchronize::end();

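SafepointSynchronize::begin enters the safepoint; the VMThread can run the code that follows only after every thread has reached the safepoint. SafepointSynchronize::end leaves the safepoint. When a safepoint is requested, Java threads can be in different states, and every case has to be handled:

(1) Interpreting bytecode: the interpreter's dispatch table is changed so that the interpreter checks the safepoint state when it fetches the next bytecode;
(2) Running native code, that is, inside JNI: the VMThread does not wait for such a thread. When the thread returns from native code it must check the safepoint state itself; if a safepoint is in progress it cannot continue and has to wait until the safepoint is over;
(3) Running compiled code: the compiler inserts, at suitable positions (such as loop back-edges and method returns), instructions that read the global safepoint polling page. When a safepoint is requested the polling page is made unreadable, so the read faults and traps into the kernel; the signal handler registered in advance handles the signal and brings the thread to the safepoint;
(4) Blocked, for example waiting for a lock: the thread is not allowed to return from the blocked state until the safepoint is over;
(5) Transitioning between states (1), (2), and (3): the thread checks the safepoint state before completing the transition. If a safepoint is requested, the transition is not allowed and the thread waits until the safepoint is over.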
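What these cases have in common is that stopping is cooperative: a coordinator raises a request, and each thread notices it at a well-defined point and parks itself. The following is a minimal sketch of that idea using only the C++ standard library. `MiniSafepoint`, `poll`, and `world_stops_during_safepoint` are hypothetical names introduced for illustration; none of this is HotSpot code, and it assumes every mutator keeps calling `poll()`.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified model of a cooperative safepoint (not HotSpot code).
class MiniSafepoint {
 public:
  explicit MiniSafepoint(int mutators) : mutators_(mutators) {}

  // Mutator side: called at every "safepoint check" location.
  void poll() {
    if (!requested_.load(std::memory_order_acquire)) return;  // fast path
    std::unique_lock<std::mutex> lk(mu_);
    if (!requested_.load()) return;  // the safepoint already ended
    ++parked_;
    cv_.notify_all();  // let the coordinator re-check the count
    cv_.wait(lk, [this] { return !requested_.load(); });
    --parked_;
  }

  // Coordinator side: returns once every mutator is parked in poll().
  void begin() {
    std::unique_lock<std::mutex> lk(mu_);
    requested_.store(true, std::memory_order_release);
    cv_.wait(lk, [this] { return parked_ == mutators_; });
  }

  // Coordinator side: releases all parked mutators.
  void end() {
    {
      std::lock_guard<std::mutex> lk(mu_);
      requested_.store(false);
    }
    cv_.notify_all();
  }

 private:
  const int mutators_;
  std::atomic<bool> requested_{false};
  std::mutex mu_;
  std::condition_variable cv_;
  int parked_ = 0;  // guarded by mu_
};

// Starts n mutators that increment a counter and poll. Returns true if the
// counter did not move while the safepoint was held.
bool world_stops_during_safepoint(int n) {
  assert(n > 0);
  MiniSafepoint sp(n);
  std::atomic<long> counter{0};
  std::atomic<bool> stop{false};
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&] {
      while (!stop.load()) {
        counter.fetch_add(1);  // "application work"
        sp.poll();             // safepoint check
      }
    });
  }
  sp.begin();  // every mutator is now parked
  long before = counter.load();
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  bool stable = (counter.load() == before);
  stop.store(true);
  sp.end();
  for (auto& t : threads) t.join();
  return stable;
}
```

Calling `world_stops_during_safepoint(3)` runs three mutators through one begin/end cycle. HotSpot avoids the explicit flag test on the fast path in compiled code by using the polling page instead, as case (3) describes.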

SafepointSynchronize::begin: entering the safepoint

Here is the implementation of begin, the function HotSpot uses to enter a safepoint:

// Roll all threads forward to a safepoint and suspend them all
void SafepointSynchronize::begin() {
  
  // 1
  Thread* myThread = Thread::current();
  assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");

  // 2
  // By getting the Threads_lock, we assure that no threads are about to start or
  // exit. It is released again in SafepointSynchronize::end().
  Threads_lock->lock();

  // 3
  assert( _state == _not_synchronized, "trying to safepoint synchronize with wrong state");

  // 4
  int nof_threads = Threads::number_of_threads();

  // 5
  MutexLocker mu(Safepoint_lock);

  // 6
  // Reset the count of active JNI critical threads
  _current_jni_active_count = 0;

  // 7
  // Set number of threads to wait for, before we initiate the callbacks
  _waiting_to_block = nof_threads;
  
  // 8
  TryingToBlock     = 0 ;
  
  // 9
  int still_running = nof_threads;

  // Begin the process of bringing the system to a safepoint.
  // Java threads can be in several different states and are
  // stopped by different mechanisms:
  //
  //  1. Running interpreted
  //     The interpreter dispatch table is changed to force it to
  //     check for a safepoint condition between bytecodes.
  //  2. Running in native code
  //     When returning from the native code, a Java thread must check
  //     the safepoint _state to see if we must block.  If the
  //     VM thread sees a Java thread in native, it does
  //     not wait for this thread to block.  The order of the memory
  //     writes and reads of both the safepoint state and the Java
  //     threads state is critical.  In order to guarantee that the
  //     memory writes are serialized with respect to each other,
  //     the VM thread issues a memory barrier instruction
  //     (on MP systems).  In order to avoid the overhead of issuing
  //     a memory barrier for each Java thread making native calls, each Java
  //     thread performs a write to a single memory page after changing
  //     the thread state.  The VM thread performs a sequence of
  //     mprotect OS calls which forces all previous writes from all
  //     Java threads to be serialized.  This is done in the
  //     os::serialize_thread_states() call.  This has proven to be
  //     much more efficient than executing a membar instruction
  //     on every call to native code.
  //  3. Running compiled Code
  //     Compiled code reads a global (Safepoint Polling) page that
  //     is set to fault if we are trying to get to a safepoint.
  //  4. Blocked
  //     A thread which is blocked will not be allowed to return from the
  //     block condition until the safepoint operation is complete.
  //  5. In VM or Transitioning between states
  //     If a Java thread is currently running in the VM or transitioning
  //     between states, the safepointing code will wait for the thread to
  //     block itself when it attempts transitions to a new state.
  //
  {
    EventSafepointStateSynchronization sync_event;
    int initial_running = 0;

    // 10
    _state            = _synchronizing;
    
    // 11
    OrderAccess::fence();

    // 13
    // Flush all thread states to memory
    if (!UseMembar) {
      os::serialize_thread_states();
    }

    // 14
    // Make interpreter safepoint aware
    Interpreter::notice_safepoints();

    // 15
    os::make_polling_page_unreadable();

    // 16
    // Consider using active_processor_count() ... but that call is expensive.
    int ncpus = os::processor_count() ;

    // 17
    // Iterate through all threads until it have been determined how to stop them all at a safepoint
    unsigned int iterations = 0;
    int steps = 0 ;
    
    // 18
    while(still_running > 0) {
      // 19
      for (JavaThread *cur = Threads::first(); cur != NULL; cur = cur->next()) {
        
        // 20
        ThreadSafepointState *cur_state = cur->safepoint_state();
        
        // 21
        if (cur_state->is_running()) {
          
          // 22
          cur_state->examine_state_of_thread();
          
          // 23
          if (!cur_state->is_running()) {
            
            // 24
            still_running--;
            // consider adjusting steps downward:
            //   steps = 0
            //   steps -= NNN
            //   steps >>= 1
            //   steps = MIN(steps, 2000-100)
            //   if (iterations != 0) steps -= NNN
          }
        }
      }

      // 25
      if (still_running > 0) {

        // Spin to avoid context switching.
        // There's a tension between allowing the mutators to run (and rendezvous)
        // vs spinning.  As the VM thread spins, wasting cycles, it consumes CPU that
        // a mutator might otherwise use profitably to reach a safepoint.  Excessive
        // spinning by the VM thread on a saturated system can increase rendezvous latency.
        // Blocking or yielding incur their own penalties in the form of context switching
        // and the resultant loss of $ residency.
        //
        // Further complicating matters is that yield() does not work as naively expected
        // on many platforms -- yield() does not guarantee that any other ready threads
        // will run.   As such we revert to naked_short_sleep() after some number of iterations.
        // nakes_short_sleep() is implemented as a short unconditional sleep.
        // Typical operating systems round a "short" sleep period up to 10 msecs, so sleeping
        // can actually increase the time it takes the VM thread to detect that a system-wide
        // stop-the-world safepoint has been reached.  In a pathological scenario such as that
        // described in CR6415670 the VMthread may sleep just before the mutator(s) become safe.
        // In that case the mutators will be stalled waiting for the safepoint to complete and the
        // the VMthread will be sleeping, waiting for the mutators to rendezvous.  The VMthread
        // will eventually wake up and detect that all mutators are safe, at which point
        // we'll again make progress.
        //
        // Beware too that that the VMThread typically runs at elevated priority.
        // Its default priority is higher than the default mutator priority.
        // Obviously, this complicates spinning.
        //
        // Note too that on Windows XP SwitchThreadTo() has quite different behavior than Sleep(0).
        // Sleep(0) will _not yield to lower priority threads, while SwitchThreadTo() will.
        //
        // See the comments in synchronizer.cpp for additional remarks on spinning.
        //
        // In the future we might:
        // 1. Modify the safepoint scheme to avoid potentially unbounded spinning.
        //    This is tricky as the path used by a thread exiting the JVM (say on
        //    on JNI call-out) simply stores into its state field.  The burden
        //    is placed on the VM thread, which must poll (spin).
        // 2. Find something useful to do while spinning.  If the safepoint is GC-related
        //    we might aggressively scan the stacks of threads that are already safe.
        // 3. Use Solaris schedctl to examine the state of the still-running mutators.
        //    If all the mutators are ONPROC there's no reason to sleep or yield.
        // 4. YieldTo() any still-running mutators that are ready but OFFPROC.
        // 5. Check system saturation.  If the system is not fully saturated then
        //    simply spin and avoid sleep/yield.
        // 6. As still-running mutators rendezvous they could unpark the sleeping
        //    VMthread.  This works well for still-running mutators that become
        //    safe.  The VMthread must still poll for mutators that call-out.
        // 7. Drive the policy on time-since-begin instead of iterations.
        // 8. Consider making the spin duration a function of the # of CPUs:
        //    Spin = (((ncpus-1) * M) + K) + F(still_running)
        //    Alternately, instead of counting iterations of the outer loop
        //    we could count the # of threads visited in the inner loop, above.
        // 9. On windows consider using the return value from SwitchThreadTo()
        //    to drive subsequent spin/SwitchThreadTo()/Sleep(N) decisions.
        
        // 26
        os::make_polling_page_unreadable();

        // 27
        // Instead of (ncpus > 1) consider either (still_running < (ncpus + EPSILON)) or
        // ((still_running + _waiting_to_block - TryingToBlock)) < ncpus)
        ++steps ;
        if (ncpus > 1 && steps < SafepointSpinBeforeYield) {
          SpinPause() ;     // MP-Polite spin
        } else
          if (steps < DeferThrSuspendLoopCount) {
            os::naked_yield() ;
          } else {
            os::naked_short_sleep(1);
          }

        // 28
        iterations ++ ;
      }
    }
    
    // 29
    assert(still_running == 0, "sanity check");
  } //EventSafepointStateSync

  // wait until all threads are stopped
  {
    
    // 30
    int initial_waiting_to_block = _waiting_to_block;

    // 31
    while (_waiting_to_block > 0) {
      if (!SafepointTimeout || timeout_error_printed) {
        Safepoint_lock->wait(true);  // true, means with no safepoint checks
      } else {
        // Compute remaining time
        jlong remaining_time = safepoint_limit_time - os::javaTimeNanos();

        // If there is no remaining time, then there is an error
        if (remaining_time < 0 || Safepoint_lock->wait(true, remaining_time / MICROUNITS)) {
          print_safepoint_timeout(_blocking_timeout);
        }
      }
    }
    
    // 32
    assert(_waiting_to_block == 0, "sanity check");

    // 33
    // Record state
    _state = _synchronized;

    // 34
    OrderAccess::fence();
  } // EventSafepointWaitBlocked
}

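That is the code for entering a safepoint. The numbered comments are explained below (the numbers run from 1 to 34; number 12 is not used):

(1) Get the current thread and assert that it is the VMThread; only the VMThread may run safepoint code;
(2) Acquire the global Threads_lock, so that no thread can start or exit during the safepoint;
(3) Check the safepoint synchronization state. There are three states: 0 means the threads are not at a safepoint, 1 means the threads are being brought to a safepoint, and 2 means all threads have reached the safepoint. The assert requires the state to be _not_synchronized (0) on entry; starting a synchronization from any other state is an error: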

  enum SynchronizeState {
      _not_synchronized = 0,                   // Threads not synchronized at a safepoint
                                               // Keep this value 0. See the comment in do_call_back()
      _synchronizing    = 1,                   // Synchronizing in progress
      _synchronized     = 2                    // All Java threads are stopped at a safepoint. Only VM thread is running
  };

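(4) Get the number of threads currently in the JVM;
(5) Acquire Safepoint_lock, so that no other thread (a CMS GC thread, for example) can run the safepoint code at the same time;
(6) Reset the count of threads that are inside a JNI critical region. The VMThread does not wait for threads running native code; those threads check the safepoint themselves. The counter is incremented in roll_forward and in block for threads that are in a critical region, as shown later;
(7) The number of threads that still have to block at the safepoint;
(8) Reset TryingToBlock, a counter of block attempts;
(9) The number of threads that are still running;
(10) Change the synchronization state to _synchronizing, meaning that synchronization is in progress;
(11) Issue a full memory fence, so that the state change is visible to the other processors;
(13) If UseMembar is off, flush the thread states to memory with os::serialize_thread_states;
(14) Tell the bytecode interpreter to notice safepoints, so that it checks for and enters the safepoint when it executes the next bytecode; this is described in detail later;
(15) Make the global safepoint polling page unreadable. A thread executing compiled native code then faults when it reads the page, traps into the kernel, and is brought to the safepoint;
(16) Get the number of CPUs on the machine;
(17) Counters for the polling rounds, used to decide how to spin;
(18) Keep iterating as long as any thread is still running, until all threads have reached the safepoint;
(19) Visit every thread in turn;
(20) Get the thread's ThreadSafepointState, which is created when the thread is created;
(21) If the thread has not reached the safepoint yet, examine it;
(22) examine_state_of_thread checks whether the thread is already in a safepoint-safe state and, if so, updates the counters: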

void ThreadSafepointState::examine_state_of_thread() {
 // 1 Check the current safepoint state
 assert(is_running(), "better be running or just have hit safepoint poll");

 // 2 Read the thread's current state
 JavaThreadState state = _thread->thread_state();

 // Save the state at the start of safepoint processing.
 _orig_thread_state = state;

 // Check for a thread that is suspended. Note that thread resume tries
 // to grab the Threads_lock which we own here, so a thread cannot be
 // resumed during safepoint synchronization.

 // We check to see if this thread is suspended without locking to
 // avoid deadlocking with a third thread that is waiting for this
 // thread to be suspended. The third thread can notice the safepoint
 // that we're trying to start at the beginning of its SR_lock->wait()
 // call. If that happens, then the third thread will block on the
 // safepoint while still holding the underlying SR_lock. We won't be
 // able to get the SR_lock and we'll deadlock.
 //
 // We don't need to grab the SR_lock here for two reasons:
 // 1) The suspend flags are both volatile and are set with an
 //    Atomic::cmpxchg() call so we should see the suspended
 //    state right away.
 // 2) We're being called from the safepoint polling loop; if
 //    we don't see the suspended state on this iteration, then
 //    we'll come around again.
 //
 // Check whether the thread is already suspended
 bool is_suspended = _thread->is_ext_suspended();
 if (is_suspended) {

   // The thread is suspended, so mark it as _at_safepoint
   roll_forward(_at_safepoint);
   return;
 }

 // Some JavaThread states have an initial safepoint state of
 // running, but are actually at a safepoint. We will happily
 // agree and update the safepoint state here.
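 // For example, a thread running native code has a safepoint state of running,
 // but the VM thread does not wait for it, so its state can be updated directly
 if (SafepointSynchronize::safepoint_safe(_thread, state)) {
   SafepointSynchronize::check_for_lazy_critical_native(_thread, state);

   // Update the state
   roll_forward(_at_safepoint);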
   return;
 }

 // The thread is running VM code (_thread_in_vm): mark it as _call_back;
 // the state is updated later, when the thread blocks at the safepoint
 if (state == _thread_in_vm) {
   roll_forward(_call_back);
   return;
 }

 // All other thread states will continue to run until they
 // transition and self-block in state _blocked
 // Safepoint polling in compiled code causes the Java threads to do the same.
 // Note: new threads may require a malloc so they must be allowed to finish

 assert(is_running(), "examine_state_of_thread on non-running thread");
 return;
}

As its name suggests, roll_forward advances the thread's progress toward the safepoint:

// Returns true is thread could not be rolled forward at present position.
void ThreadSafepointState::roll_forward(suspend_type type) {
 _type = type;

 switch(_type) {
   // The thread is at a safepoint
   case _at_safepoint:
     
     // Decrements _waiting_to_block to record that this thread has reached the safepoint
     SafepointSynchronize::signal_thread_at_safepoint();
     
     // If the thread is inside a JNI critical region, increment _current_jni_active_count
     if (_thread->in_critical()) {
       // Notice that this thread is in a critical section
       SafepointSynchronize::increment_jni_active_count();
     }
     break;

   // The thread has not reached the safepoint yet: mark it and wait for it to block itself
   case _call_back:
     set_has_called_back(false);
     break;
 }
}

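(23, 24) Check again whether the thread has reached the safepoint; if it has, decrement the still_running counter;
(25) After one round of checks, test whether any thread has still not reached the safepoint; if so, keep going;
(26) In the original source, flags control after how many iterations the safepoint polling page is made unreadable, which is why there is a second call here; I removed those counters to keep the listing short;
(27) If the threads still have not reached the safepoint after many rounds, the VMThread limits its CPU consumption: it spins first, then yields, and finally sleeps briefly between polls;
(28) Update the iteration count;
(29) When the loop ends, every thread has been accounted for, so still_running must be 0;
(30, 31, 32) Wait until _waiting_to_block drops to 0;
(33) Change the state to _synchronized, meaning that entering the safepoint is complete;
(34) Issue another memory fence.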
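The escalation in step (27) can be isolated into a small decision function. The sketch below mirrors the branch structure of the listing; `WaitAction`, `choose_wait_action`, and `perform` are hypothetical names, and the two limits are plain parameters standing in for SafepointSpinBeforeYield and DeferThrSuspendLoopCount (the values used when calling it are up to the caller and are not HotSpot defaults).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// What the polling thread does between two rounds of checks.
enum class WaitAction { Spin, Yield, Sleep };

// Mirrors the if/else chain of step (27): spin only on multiprocessors and
// only for the first spin_limit steps, then yield, then fall back to sleeping.
WaitAction choose_wait_action(int steps, int ncpus,
                              int spin_limit, int yield_limit) {
  assert(steps >= 0 && ncpus >= 1);
  if (ncpus > 1 && steps < spin_limit) return WaitAction::Spin;
  if (steps < yield_limit) return WaitAction::Yield;
  return WaitAction::Sleep;
}

// Carries out the chosen action with standard-library equivalents.
void perform(WaitAction action) {
  switch (action) {
    case WaitAction::Spin:
      break;  // a real implementation would issue a CPU pause hint here
    case WaitAction::Yield:
      std::this_thread::yield();
      break;
    case WaitAction::Sleep:
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
      break;
  }
}
```

On a single-CPU machine the function never spins, because spinning there only takes the CPU away from the thread being waited for.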

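SafepointSynchronize::end: leaving the safepoint

Where there is code for entering a safepoint, there is code for leaving it. On entry the VMThread brought every thread into a blocked state, so on exit it is responsible for waking all of them so that they can continue with their own code. Here is the implementation of end: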

// Wake up all threads, so they are ready to resume execution after the safepoint
// operation has been carried out
void SafepointSynchronize::end() {
  // memory fence isn't required here since an odd _safepoint_counter
  // value can do no harm and a fence is issued below anyway.

  // 1
  DEBUG_ONLY(Thread* myThread = Thread::current();)
  assert(myThread->is_VM_thread(), "Only VM thread can execute a safepoint");

  // 2
  // Make polling safepoint aware
  os::make_polling_page_readable();

  // 3
  // Remove safepoint check from interpreter
  Interpreter::ignore_safepoints();

  {
    // 4
    MutexLocker mu(Safepoint_lock);

    // 5
    assert(_state == _synchronized, "must be synchronized before ending safepoint synchronization");

    // 6
    // Set to not synchronized, so the threads will not go into the signal_thread_blocked method
    // when they get restarted.
    _state = _not_synchronized;
    
    // 7
    OrderAccess::fence();

    // 8
    // Start suspended threads
    for(JavaThread *current = Threads::first(); current; current = current->next()) {
      // A problem occurring on Solaris is when attempting to restart threads
      // the first #cpus - 1 go well, but then the VMThread is preempted when we get
      // to the next one (since it has been running the longest).  We then have
      // to wait for a cpu to become available before we can continue restarting
      // threads.
      // FIXME: This causes the performance of the VM to degrade when active and with
      // large numbers of threads.  Apparently this is due to the synchronous nature
      // of suspending threads.
      //
      // TODO-FIXME: the comments above are vestigial and no longer apply.
      // Furthermore, using solaris' schedctl in this particular context confers no benefit

      // 9
      ThreadSafepointState* cur_state = current->safepoint_state();
      
      // 10
      assert(cur_state->type() != ThreadSafepointState::_running, "Thread not suspended at safepoint");
      
      // 11
      cur_state->restart();
      
      // 12
      assert(cur_state->is_running(), "safepoint state has not been reset");
    }

    // 13
    // Release threads lock, so threads can be created/destroyed again. It will also starts all threads
    // blocked in signal_thread_blocked
    Threads_lock->unlock();
  }
}

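(1) Again, assert that only the VMThread runs safepoint code;
(2) Make the safepoint polling page readable again. Otherwise threads executing compiled native code would keep faulting on the read and could not continue. This is the counterpart of making the page unreadable on entry;
(3) Tell the bytecode interpreter that the safepoint is over; this is explained in more detail below;
(4) Acquire Safepoint_lock for thread safety;
(5) On exit the state must be _synchronized, that is, all threads are at the safepoint; anything else is an error;
(6) Change the state to _not_synchronized, so that restarted threads do not go back into the blocking path;
(7) Issue a memory fence;
(8) Loop over all threads so that each one leaves the safepoint and runs again;
(9) Get the thread's safepoint state;
(10) Check the state: if the thread's safepoint state is already running at this point, something has gone wrong;
(11) Call restart, which only resets the thread's safepoint state and does nothing complicated;
(12) restart sets the safepoint state back to running; verify that here;
(13) Release Threads_lock. The lock was acquired on entry so that no thread could start or exit; once it is released those operations can proceed again.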

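Synchronizing a thread at the safepoint: SafepointSynchronize::block

So far we have seen how the VMThread enters and leaves a safepoint, but one piece is still missing: how a thread blocks at the safepoint. SafepointSynchronize::block does this. As noted above, threads can be in different execution states when a safepoint is requested; once a thread learns that it must stop, it calls block to block itself until the safepoint operation has finished. Here is the implementation of block: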

void SafepointSynchronize::block(JavaThread *thread) {

  // 1
  assert(thread != NULL, "thread must be set");
  assert(thread->is_Java_thread(), "not a Java thread");

  // 2
  JavaThreadState state = thread->thread_state();

  // 3
  // Check that we have a valid thread_state at this point
  switch(state) {
    case _thread_in_vm_trans:
    case _thread_in_Java:        // From compiled code

      // 4
      // We are highly likely to block on the Safepoint_lock. In order to avoid blocking in this case,
      // we pretend we are still in the VM.
      thread->set_thread_state(_thread_in_vm);

      // 5
      // We will always be holding the Safepoint_lock when we are examine the state
      // of a thread. Hence, the instructions between the Safepoint_lock->lock() and
      // Safepoint_lock->unlock() are happening atomic with regards to the safepoint code
      Safepoint_lock->lock_without_safepoint_check();

      // 6
      if (is_synchronizing()) {

        // 7
        // Decrement the number of threads to wait for and signal vm thread
        assert(_waiting_to_block > 0, "sanity check");

        // 8
        _waiting_to_block--;

        // 9
        thread->safepoint_state()->set_has_called_back(true);

        // 10
        if (thread->in_critical()) {
          // Notice that this thread is in a critical section
          increment_jni_active_count();
        }

        // 11
        // Consider (_waiting_to_block < 2) to pipeline the wakeup of the VM thread
        if (_waiting_to_block == 0) {
          Safepoint_lock->notify_all();
        }
      }

      // 12
      // We transition the thread to state _thread_blocked here, but
      // we can't do our usual check for external suspension and then
      // self-suspend after the lock_without_safepoint_check() call
      // below because we are often called during transitions while
      // we hold different locks. That would leave us suspended while
      // holding a resource which results in deadlocks.
      thread->set_thread_state(_thread_blocked);


      // 13
      Safepoint_lock->unlock();

      // 14
      // We now try to acquire the threads lock. Since this lock is hold by the VM thread during
      // the entire safepoint, the threads will all line up here during the safepoint.
      Threads_lock->lock_without_safepoint_check();

      // 15
      // restore original state. This is important if the thread comes from compiled code, so it
      // will continue to execute with the _thread_in_Java state.
      thread->set_thread_state(state);

      // 16
      Threads_lock->unlock();
      break;

    case _thread_in_native_trans:
    case _thread_blocked_trans:
    case _thread_new_trans:

      // 17
      // We transition the thread to state _thread_blocked here, but
      // we can't do our usual check for external suspension and then
      // self-suspend after the lock_without_safepoint_check() call
      // below because we are often called during transitions while
      // we hold different locks. That would leave us suspended while
      // holding a resource which results in deadlocks.
      thread->set_thread_state(_thread_blocked);

      // It is not safe to suspend a thread if we discover it is in _thread_in_native_trans. Hence,
      // the safepoint code might still be waiting for it to block. We need to change the state here,
      // so it can see that it is at a safepoint.

      // 18
      // Block until the safepoint operation is completed.
      Threads_lock->lock_without_safepoint_check();

      // 19
      // Restore state
      thread->set_thread_state(state);

      // 20
      Threads_lock->unlock();
      break;
  }
}

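(1) The thread must be a Java thread, not the VMThread;
(2) Get the thread's current state;
(3) Branch on the state, because different states are handled differently;
(4) Set the state to _thread_in_vm. As the source comment puts it, the thread is very likely to block on Safepoint_lock, and pretending to still be in the VM avoids blocking in this case;
(5) Acquire Safepoint_lock without a safepoint check. The VMThread holds this lock while it runs begin and releases it while it waits (in the wait call) after the polling loop; during that window Java threads can acquire the lock and run the synchronization code that follows;
(6) Check whether the state is still _synchronizing;
(7) At this point the _waiting_to_block counter must be greater than 0;
(8) Decrement the counter;
(9) Record that this thread has called block, so that the VMThread knows it has blocked at the safepoint;
(10) If the thread is inside a JNI critical region, update the corresponding counter;
(11) If this is the last thread to reach the safepoint, notify the VMThread so that it can continue;
(12) Mark the thread as _thread_blocked;
(13) Release Safepoint_lock, so that other threads can acquire it and run the same code;
(14) Acquire Threads_lock. The VMThread holds this lock for the whole safepoint, so every thread that gets here blocks until the VMThread runs end; only then can the thread acquire the lock and leave the safepoint;
(15) Restore the thread state to what it was before the safepoint;
(16) Release Threads_lock, so that the other threads can leave the safepoint as well;
(17) The thread is in the middle of a state transition, so it goes straight to the blocked state;
(18, 19, 20) Block on Threads_lock until the safepoint is over, restore the state, and release the lock.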
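The interplay of the two locks can be modeled in a few lines: one lock guards the counter that tells the coordinator when everyone has arrived, and the other is simply held for the duration of the safepoint so that arriving threads queue up on it. The sketch below is an illustration under simplifying assumptions, not HotSpot code: `TwoLockSafepoint` and `gate_holds_threads` are hypothetical names, thread states and JNI critical counting are left out, and mutators detect the request by reading an atomic state.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

enum class SyncState { NotSynchronized, Synchronizing, Synchronized };

// Hypothetical two-lock model of begin()/block()/end() (not HotSpot code).
struct TwoLockSafepoint {
  std::mutex threads_lock;              // plays the role of Threads_lock
  std::mutex safepoint_lock;            // plays the role of Safepoint_lock
  std::condition_variable all_blocked;  // signalled by the last thread to block
  std::atomic<SyncState> state{SyncState::NotSynchronized};
  int waiting_to_block = 0;             // guarded by safepoint_lock

  // Coordinator: must be paired with end() on the same thread.
  void begin(int nof_threads) {
    threads_lock.lock();  // held for the whole safepoint
    std::unique_lock<std::mutex> lk(safepoint_lock);
    waiting_to_block = nof_threads;
    state.store(SyncState::Synchronizing);
    all_blocked.wait(lk, [this] { return waiting_to_block == 0; });
    state.store(SyncState::Synchronized);
  }

  void end() {
    state.store(SyncState::NotSynchronized);
    threads_lock.unlock();  // opens the gate for the queued threads
  }

  // Mutator: called when it notices that a safepoint is in progress.
  void block() {
    {
      std::lock_guard<std::mutex> lk(safepoint_lock);
      if (state.load() == SyncState::Synchronizing) {
        assert(waiting_to_block > 0);
        if (--waiting_to_block == 0) all_blocked.notify_all();
      }
    }
    // Line up on the gate; this returns only after end() releases it.
    std::lock_guard<std::mutex> gate(threads_lock);
  }
};

// Returns true if no mutator made progress while the gate was held.
bool gate_holds_threads(int n) {
  TwoLockSafepoint sp;
  std::atomic<long> counter{0};
  std::atomic<bool> stop{false};
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    threads.emplace_back([&] {
      while (!stop.load()) {
        counter.fetch_add(1);
        if (sp.state.load() != SyncState::NotSynchronized) sp.block();
      }
    });
  }
  sp.begin(n);
  long before = counter.load();
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  bool stable = (counter.load() == before);
  stop.store(true);
  sp.end();
  for (auto& t : threads) t.join();
  return stable;
}
```

Using a held lock as the gate means that releasing the threads needs no per-thread wake-up logic: unlocking once in `end()` lets the queued threads through one after another, which is what steps (14) to (16) describe.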

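Bringing threads to the safepoint

As noted above, threads can be in different states when a safepoint is requested. The following analyzes how a thread reaches the safepoint in two of those states.

Thread interpreting bytecode

First, note that Java execution combines interpretation with compilation. Interpretation starts quickly but runs slowly; compiled code runs fast, but compiling to native code takes considerable time and involves many optimizations.

For interpreted execution, HotSpot uses a technique called the "template interpreter". Every bytecode instruction is mapped to an entry in the interpreter's template table, and that template translates the instruction into assembly code, which lets interpreted execution run reasonably fast as well.

With that background, let's look at how a thread that is interpreting bytecode reaches the safepoint. The discussion of begin above mentioned a call to Interpreter::notice_safepoints. This function tells the template interpreter that it must enter the safepoint when it executes the next bytecode: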

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

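Here _safept_table is copied into _active_table. What is _active_table? The template interpreter needs a table to dispatch bytecodes, and when threads have to reach a safepoint some extra work must be done before each bytecode executes. The following shows how that extra work is set up: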

  { CodeletMark cm(_masm, "safepoint entry points");
    Interpreter::_safept_entry =
      EntryPoint(
                 generate_safept_entry_for(btos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(ztos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(ctos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(stos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(atos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(itos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(ltos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(ftos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(dtos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint)),
                 generate_safept_entry_for(vtos, CAST_FROM_FN_PTR(address, InterpreterRuntime::at_safepoint))
                 );
  }
    // installation of code in other places in the runtime
  // (ExcutableCodeManager calls not needed to copy the entries)
  set_safepoints_for_all_bytes();
 
void TemplateInterpreterGenerator::set_safepoints_for_all_bytes() {
  for (int i = 0; i < DispatchTable::length; i++) {
    Bytecodes::Code code = (Bytecodes::Code)i;
    if (Bytecodes::is_defined(code)) Interpreter::_safept_table.set_entry(code, Interpreter::_safept_entry);
  }
}

This table is initialized when the interpreter templates are generated. Note the function that is passed in, at_safepoint:

IRT_ENTRY(void, InterpreterRuntime::at_safepoint(JavaThread* thread))
  // We used to need an explict preserve_arguments here for invoke bytecodes. However,
  // stack traversal automatically takes care of preserving arguments for invoke, so
  // this is no longer needed.

  // IRT_END does an implicit safepoint check, hence we are guaranteed to block
  // if this is called during a safepoint

  if (JvmtiExport::should_post_single_step()) {
    // We are called during regular safepoints and when the VM is
    // single stepping. If any thread is marked for single stepping,
    // then we may have JVMTI work to do.
    JvmtiExport::at_single_stepping_point(thread, method(thread), bcp(thread));
  }
IRT_END

This function is called when the next bytecode is dispatched:

address TemplateInterpreterGenerator::generate_safept_entry_for (TosState state,
                                                                address runtime_entry) {
  address entry = __ pc();
  __ push(state);
  
  // Call at_safepoint, which brings the thread to the safepoint
  __ call_VM(noreg, runtime_entry);
  
  __ dispatch_via(vtos, Interpreter::_normal_table.table_for (vtos));
  return entry;
}

Now let's see how at_safepoint brings the thread to the safepoint:

#define IRT_ENTRY(result_type, header)                               \
  result_type header {                                               \
    ThreadInVMfromJava __tiv(thread);                                \
    VM_ENTRY_BASE(result_type, header, thread)                       \
    debug_only(VMEntryWrapper __vew;)

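The IRT_ENTRY macro is the key. It creates a ThreadInVMfromJava object: the constructor runs when the object is created, and the destructor runs automatically when the function returns. Here are the constructor and destructor of this class: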

class ThreadInVMfromJava : public ThreadStateTransition {
 public:
  ThreadInVMfromJava(JavaThread* thread) : ThreadStateTransition(thread) {
    trans_from_java(_thread_in_vm);
  }
  ~ThreadInVMfromJava()  {
    if (_thread->stack_yellow_reserved_zone_disabled()) {
      _thread->enable_stack_yellow_reserved_zone();
    }
    trans(_thread_in_vm, _thread_in_Java);
    // Check for pending. async. exceptions or suspends.
    if (_thread->has_special_runtime_exit_condition()) _thread->handle_special_runtime_exit_condition();
  }
};

The constructor switches the thread state to _thread_in_vm.
The destructor calls trans, which switches the thread state from _thread_in_vm back to _thread_in_Java:

void trans(JavaThreadState from, JavaThreadState to)  {
   transition(_thread, from, to);
 }

  // Change threadstate in a manner, so safepoint can detect changes.
  // Time-critical: called on exit from every runtime routine
  static inline void transition(JavaThread *thread, JavaThreadState from, JavaThreadState to) {

    // 1
    assert(from != _thread_in_Java, "use transition_from_java");
    assert(from != _thread_in_native, "use transition_from_native");

    // 2
    // Change to transition state (assumes total store ordering!  -Urs)
    thread->set_thread_state((JavaThreadState)(from + 1));

    // 3
    if (SafepointSynchronize::do_call_back()) {

      // 4
      SafepointSynchronize::block(thread);
    }

    // 5
    thread->set_thread_state(to);
  }

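(1) Validate the source state to catch incorrect calls;
(2) Change to the transitional state;
(3) Check whether block has to be called to enter the safepoint;
(4) Call block to enter the safepoint; the thread blocks until the VMThread finishes the safepoint operation and releases Threads_lock;
(5) Set the thread to the target state.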
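The combination of an RAII guard and a transitional state that is "source state plus one" can be reproduced in a small single-threaded model. The sketch below is an assumption-laden illustration, not HotSpot code: the `State` enum, `MiniThread`, `InVmFromJava`, `runtime_entry`, and the global `g_safepoint_requested` are hypothetical, the enum values are chosen only so that each transitional state is one above its base state, the constructor simply sets the state, and `block` does not actually wait.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Each transitional state is numbered one above the state it leaves,
// which is what makes "from + 1" work in transition().
enum State { kInVm = 6, kInVmTrans = 7, kInJava = 8, kInJavaTrans = 9, kBlocked = 10 };

struct MiniThread {
  State state = kInJava;
  std::vector<State> history;  // every state the thread was set to
  void set_state(State s) { state = s; history.push_back(s); }
};

// Hypothetical stand-in for SafepointSynchronize::do_call_back().
std::atomic<bool> g_safepoint_requested{false};

// Stand-in for block(): remember the state, go to blocked, restore.
// A real implementation would wait here until the safepoint is over.
void block(MiniThread& t) {
  State saved = t.state;
  t.set_state(kBlocked);
  t.set_state(saved);
}

void transition(MiniThread& t, State from, State to) {
  assert(t.state == from);
  t.set_state(static_cast<State>(from + 1));  // enter the transitional state
  if (g_safepoint_requested.load()) block(t);
  t.set_state(to);
}

// RAII guard: "in VM" for the lifetime of the object, back to "in Java" on exit.
class InVmFromJava {
 public:
  explicit InVmFromJava(MiniThread& t) : t_(t) { t_.set_state(kInVm); }
  ~InVmFromJava() { transition(t_, kInVm, kInJava); }
 private:
  MiniThread& t_;
};

// A runtime entry point: the safepoint check happens when guard is destroyed.
void runtime_entry(MiniThread& t) {
  InVmFromJava guard(t);
  // ... VM work would go here ...
}
```

With `g_safepoint_requested` set, a call to `runtime_entry` drives the thread through in-VM, the transitional state, blocked, the transitional state again, and finally in-Java; without it the blocked step is skipped. Putting the check in the destructor guarantees that it runs on every path out of the function.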

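This completes the path by which a thread that is interpreting bytecode reaches the safepoint. In short, when the VMThread runs begin it replaces the interpreter's bytecode dispatch table, so that a safepoint check is executed before the next bytecode; the thread therefore enters the safepoint when it interprets its next bytecode.

Where there is an entry there is an exit. After the VMThread has finished its work it runs end to leave the safepoint, which calls ignore_safepoints to switch the dispatch table back to the original one, so that the safepoint check is no longer executed before the next bytecode: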

// switch from the dispatch table which notices safepoints back to the
// normal dispatch table.  So that we can notice single stepping points,
// keep the safepoint dispatch table if we are single stepping in JVMTI.
// Note that the should_post_single_step test is exactly as fast as the
// JvmtiExport::_enabled test and covers both cases.
void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}
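The table swap is easy to reproduce outside the JVM. The toy interpreter below is a sketch for illustration only, not HotSpot code: `MiniInterp`, its two bytecodes, and `safept_entry` are hypothetical, the tables are per-interpreter members rather than global, and everything runs on one thread. It keeps a normal table, a safepoint table whose entries all point at one handler, and an active table that is overwritten by copying one of the other two.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct MiniInterp;
using Handler = void (*)(MiniInterp&);

enum Bytecode { kInc = 0, kDec = 1 };
constexpr std::size_t kNumBytecodes = 2;

// Hypothetical toy interpreter with three dispatch tables.
struct MiniInterp {
  int acc = 0;               // the only piece of interpreter state
  int safepoint_checks = 0;  // how often the safepoint entry ran
  Bytecode current = kInc;   // bytecode currently being dispatched
  bool notice = false;       // mirrors _notice_safepoints
  std::array<Handler, kNumBytecodes> normal_table{};
  std::array<Handler, kNumBytecodes> safept_table{};
  std::array<Handler, kNumBytecodes> active_table{};

  MiniInterp();

  // Switch to the table whose entries perform the safepoint check.
  void notice_safepoints() {
    if (!notice) { notice = true; active_table = safept_table; }
  }
  // Switch back to the plain handlers.
  void ignore_safepoints() {
    if (notice) { notice = false; active_table = normal_table; }
  }
  void run(const std::vector<Bytecode>& program) {
    for (Bytecode bc : program) {
      assert(static_cast<std::size_t>(bc) < kNumBytecodes);
      current = bc;
      active_table[bc](*this);  // every dispatch goes through the active table
    }
  }
};

static void op_inc(MiniInterp& vm) { ++vm.acc; }
static void op_dec(MiniInterp& vm) { --vm.acc; }

// Safepoint entry: do the extra check, then dispatch through the normal table.
static void safept_entry(MiniInterp& vm) {
  ++vm.safepoint_checks;  // stands in for the call to at_safepoint
  vm.normal_table[vm.current](vm);
}

inline MiniInterp::MiniInterp() {
  normal_table[kInc] = op_inc;
  normal_table[kDec] = op_dec;
  safept_table.fill(safept_entry);
  active_table = normal_table;
}
```

Because the check lives in the table rather than in the handlers, the normal path pays nothing for it: between `notice_safepoints()` and `ignore_safepoints()` every bytecode goes through `safept_entry`, and outside that window none does.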

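Thread running compiled code

When a thread is executing code that has been compiled to native code, it reads the safepoint polling page at certain positions. When entering a safepoint the VMThread makes this page unreadable, so a thread that tries to read it triggers a fault signal. On Linux, the following signal handler deals with that signal: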

///////////////////////////////////////////////////////////////////////////////////
// signal handling (except suspend/resume)

// This routine may be used by user applications as a "hook" to catch signals.
// The user-defined signal handler must pass unrecognized signals to this
// routine, and if it returns true (non-zero), then the signal handler must
// return immediately.  If the flag "abort_if_unrecognized" is true, then this
// routine will never retun false (zero), but instead will execute a VM panic
// routine kill the process.
//
// If this routine returns false, it is OK to call it again.  This allows
// the user-defined signal handler to perform checks either before or after
// the VM performs its own checks.  Naturally, the user code would be making
// a serious error if it tried to handle an exception (such as a null check
// or breakpoint) that the VM was generating for its own correct operation.
//
// This routine may recognize any of the following kinds of signals:
//    SIGBUS, SIGSEGV, SIGILL, SIGFPE, SIGQUIT, SIGPIPE, SIGXFSZ, SIGUSR1.
// It should be consulted by handlers for any of those signals.
//
// The caller of this routine must pass in the three arguments supplied
// to the function referred to in the "sa_sigaction" (not the "sa_handler")
// field of the structure passed to sigaction().  This routine assumes that
// the sa_flags field passed to sigaction() includes SA_SIGINFO and SA_RESTART.
//
// Note that the VM will print warnings if it detects conflicting signal
// handlers, unless invoked with the option "-XX:+AllowUserSignalHandlers".
//
extern "C" JNIEXPORT int JVM_handle_linux_signal(int signo,
                                                 siginfo_t* siginfo,
                                                 void* ucontext,
                                                 int abort_if_unrecognized);

Inside JVM_handle_linux_signal there is a piece of safepoint-related handling code:

        // Java thread running in Java code => find exception handler if any
        // a fault inside compiled code, the interpreter, or a stub

        if ((sig == SIGSEGV) && checkPollingPage(pc, (address)info->si_addr, &stub)) {
          break;
        }
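
The trap itself can be reproduced outside the JVM. The following is a minimal sketch, assuming Linux and POSIX mmap/mprotect/sigaction; the poll_demo namespace and all names in it are hypothetical. It also simplifies heavily: this handler just makes the page readable again so the faulting load can be retried, whereas HotSpot redirects the thread to a stub that parks it at the safepoint. Note too that mprotect is not on the POSIX list of async-signal-safe functions, so calling it from a handler is acceptable for a demo only.

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

namespace poll_demo {

char* g_page = nullptr;             // the simulated polling page
std::size_t g_page_size = 0;
volatile sig_atomic_t g_traps = 0;  // how many polls ended in a fault

// SIGSEGV handler: if the fault address lies inside the polling page,
// count the trap and re-enable reads so the faulting load is retried.
void on_segv(int, siginfo_t* info, void*) {
  char* fault = static_cast<char*>(info->si_addr);
  if (g_page != nullptr && fault >= g_page && fault < g_page + g_page_size) {
    g_traps = g_traps + 1;
    mprotect(g_page, g_page_size, PROT_READ);
    return;
  }
  _exit(2);  // a genuine crash somewhere else: give up
}

// Map one readable page and install the handler with SA_SIGINFO.
bool setup() {
  g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, g_page_size, PROT_READ,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return false;
  g_page = static_cast<char*>(p);

  struct sigaction sa = {};
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO | SA_RESTART;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// "Arm" the safepoint: any later poll will fault.
void arm() { mprotect(g_page, g_page_size, PROT_NONE); }

// The poll is a plain load from the page; cheap while the page is readable.
void poll() { (void)*reinterpret_cast<volatile char*>(g_page); }

}  // namespace poll_demo
```

Calling poll_demo::setup(), then poll(), arm(), and poll() again shows the idea: the first poll is an ordinary load, while the poll after arm() enters on_segv exactly once.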

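checkPollingPage does not handle the signal itself. It looks up the stub that will handle the poll and returns its address through the stub parameter, and that stub is executed afterwards. Here are the implementation details of checkPollingPage and of the SharedRuntime::get_poll_stub function it calls: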

inline static bool checkPollingPage(address pc, address fault, address* stub) {
  if (fault == os::get_polling_page()) {
    *stub = SharedRuntime::get_poll_stub(pc);
    return true;
  }
  return false;
}

address SharedRuntime::get_poll_stub(address pc) {
  address stub;
  // Look up the code blob
  CodeBlob *cb = CodeCache::find_blob(pc);

  bool at_poll_return = ((CompiledMethod*)cb)->is_at_poll_return(pc);
  bool has_wide_vectors = ((CompiledMethod*)cb)->has_wide_vectors();
  if (at_poll_return) {

    stub = SharedRuntime::polling_page_return_handler_blob()->entry_point();
  } else if (has_wide_vectors) {

    stub = SharedRuntime::polling_page_vectors_safepoint_handler_blob()->entry_point();
  } else {

    stub = SharedRuntime::polling_page_safepoint_handler_blob()->entry_point();
  }

  return stub;
}
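
The selection order in get_poll_stub can be written down as a small pure function. This is an illustrative sketch only: the PollStub enum and select_poll_stub are hypothetical names, standing in for the three handler blobs chosen above.

```cpp
// Hypothetical labels for the three handler blobs chosen by get_poll_stub.
enum class PollStub { Return, VectorLoop, Loop };

// Same priority as the if/else chain: a poll at a method return wins,
// then a method using wide vectors, otherwise the ordinary loop handler.
constexpr PollStub select_poll_stub(bool at_poll_return, bool has_wide_vectors) {
  return at_poll_return ? PollStub::Return
                        : (has_wide_vectors ? PollStub::VectorLoop
                                            : PollStub::Loop);
}

// A return poll takes precedence even when the method has wide vectors.
static_assert(select_poll_stub(true, true) == PollStub::Return,
              "return poll has priority");
```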


  static SafepointBlob* polling_page_return_handler_blob()     { return _polling_page_return_handler_blob; }
  static SafepointBlob* polling_page_safepoint_handler_blob()  { return _polling_page_safepoint_handler_blob; }
  static SafepointBlob* polling_page_vectors_safepoint_handler_blob()  { return _polling_page_vectors_safepoint_handler_blob; }

These blobs are initialized in the SharedRuntime::generate_stubs function:

//----------------------------generate_stubs-----------------------------------
void SharedRuntime::generate_stubs() {
  _wrong_method_blob                   = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::handle_wrong_method),          "wrong_method_stub");
  _wrong_method_abstract_blob          = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::handle_wrong_method_abstract), "wrong_method_abstract_stub");
  _ic_miss_blob                        = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::handle_wrong_method_ic_miss),  "ic_miss_stub");
  _resolve_opt_virtual_call_blob       = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::resolve_opt_virtual_call_C),   "resolve_opt_virtual_call");
  _resolve_virtual_call_blob           = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::resolve_virtual_call_C),       "resolve_virtual_call");
  _resolve_static_call_blob            = generate_resolve_blob(CAST_FROM_FN_PTR(address, SharedRuntime::resolve_static_call_C),        "resolve_static_call");
  _resolve_static_call_entry           = _resolve_static_call_blob->entry_point();

#if defined(COMPILER2) || INCLUDE_JVMCI
  // Vectors are generated only by C2 and JVMCI.
  bool support_wide = is_wide_vector(MaxVectorSize);
  if (support_wide) {
    _polling_page_vectors_safepoint_handler_blob = generate_handler_blob(CAST_FROM_FN_PTR(address, SafepointSynchronize::handle_polling_page_exception), POLL_AT_VECTOR_LOOP);
  }
#endif // COMPILER2 || INCLUDE_JVMCI
  _polling_page_safepoint_handler_blob = generate_handler_blob(CAST_FROM_FN_PTR(address, SafepointSynchronize::handle_polling_page_exception), POLL_AT_LOOP);
  _polling_page_return_handler_blob    = generate_handler_blob(CAST_FROM_FN_PTR(address, SafepointSynchronize::handle_polling_page_exception), POLL_AT_RETURN);

  generate_deopt_blob();

#ifdef COMPILER2
  generate_uncommon_trap_blob();
#endif // COMPILER2
}

All three handler blobs end up calling the same function, handle_polling_page_exception:

void SafepointSynchronize::handle_polling_page_exception(JavaThread *thread) {
  assert(thread->is_Java_thread(), "polling reference encountered by VM thread");
  assert(thread->thread_state() == _thread_in_Java, "should come from Java code");
  assert(SafepointSynchronize::is_synchronizing(), "polling encountered outside safepoint synchronization");

  if (ShowSafepointMsgs) {
    tty->print("handle_polling_page_exception: ");
  }

  if (PrintSafepointStatistics) {
    inc_page_trap_count();
  }

  ThreadSafepointState* state = thread->safepoint_state();

  state->handle_polling_page_exception();
}

Inside state->handle_polling_page_exception, the thread calls the block function and stays blocked at the safepoint until the VMThread exits the safepoint.
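
The "block until the VMThread leaves the safepoint" behaviour can be sketched with a mutex and a condition variable. This is a simplified model under stated assumptions, not HotSpot's implementation: SafepointGate, block_if_requested, and gate_demo are hypothetical names, and the real block function handles thread states and other cases that are omitted here.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical gate: begin() requests a stop, end() releases parked threads.
class SafepointGate {
 public:
  void begin() {
    std::lock_guard<std::mutex> guard(mutex_);
    requested_ = true;
  }

  void end() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      requested_ = false;
    }
    released_.notify_all();
  }

  // Called by a worker thread at a poll: park while a stop is requested.
  void block_if_requested() {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!requested_) return;
    ++blocked_;
    released_.wait(lock, [this] { return !requested_; });
    --blocked_;
  }

  int blocked() {
    std::lock_guard<std::mutex> guard(mutex_);
    return blocked_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable released_;
  bool requested_ = false;
  int blocked_ = 0;
};

// One worker polls while the gate is closed; returns true if it resumed
// only after end() was called.
inline bool gate_demo() {
  SafepointGate gate;
  std::atomic<bool> resumed{false};

  gate.begin();
  std::thread worker([&] {
    gate.block_if_requested();
    resumed = true;
  });

  // Wait until the worker is parked inside the gate.
  while (gate.blocked() != 1) std::this_thread::yield();
  assert(!resumed.load());  // it cannot have passed the gate yet

  gate.end();
  worker.join();
  return resumed.load();
}
```

The stopping side here only waits for the blocked count to reach the number of workers, which matches the basic contract described for SafepointSynchronize::begin: it returns once every thread is known to be stopped.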

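That completes this overview of safepoints. Safepoints play an important role in the HotSpot VM; GC, for one, has to run at a safepoint. The analysis above shows how the VMThread brings every thread to a stop. Put simply, it is a lock used as a barrier, but there is a good deal of detail behind it that is worth studying further.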
