KafkaController Analysis 2: NetworkClient

扫帚的影子
2017.01.14 23:07
* NetworkClient: as the name suggests, a client-side wrapper for network connections and message sending. From the comment in the source:

>A network client for asynchronous request/response network i/o. This is an internal class used to implement the user-facing producer and consumer clients.

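* Where it is used:
1. Kafka ships Java implementations of the producer and consumer; their network connections and request sending are implemented with NetworkClient.
2. In KafkaController, communication between the controller and the other brokers is implemented with NetworkClient.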

# InFlightRequests

* File: clients/src/main/java/org/apache/kafka/clients/InFlightRequests.java
* Holds the set of in-flight requests: those currently being sent, and those already sent whose response has not yet been received;
* Main member variable: `private final Map<String, Deque<ClientRequest>> requests = new HashMap<String, Deque<ClientRequest>>();`
  Each connection keeps all of its requests in a `Deque<ClientRequest>`, a double-ended queue;
* Adding a new request: a new request is always placed at the head of the deque with `addFirst`:
```
public void add(ClientRequest request) {
        Deque<ClientRequest> reqs = this.requests.get(request.request().destination());
        if (reqs == null) {
            reqs = new ArrayDeque<>();
            this.requests.put(request.request().destination(), reqs);
        }
        reqs.addFirst(request);
    }
```
* Taking out the oldest request: it is removed from the tail with `pollLast()`, so requests complete in the order they were sent:
```
public ClientRequest completeNext(String node) {
        return requestQueue(node).pollLast();
    }
```
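* `public boolean canSendMore(String node)` decides whether another request may be sent to a node through NetworkClient.
  A new request is not allowed while the most recently queued request has not been completely written to the underlying socket, or once the number of in-flight requests on the connection has reached `maxInFlightRequestsPerConnection`: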
```
public boolean canSendMore(String node) {
        Deque<ClientRequest> queue = requests.get(node);
        return queue == null || queue.isEmpty() ||
               (queue.peekFirst().request().completed() && queue.size() < this.maxInFlightRequestsPerConnection);
    }
```
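To make the deque mechanics above concrete, here is a minimal, self-contained sketch. It is not Kafka's class: `InFlightSketch`, `Request`, and `sendCompleted` are hypothetical stand-ins for `InFlightRequests`, `ClientRequest`, and `request().completed()`, and `completeNext` returns `null` for an unknown node instead of failing.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for InFlightRequests; all names here are illustrative.
class InFlightSketch {
    // Hypothetical request type: an id plus a flag for "fully written to the socket".
    static final class Request {
        final int id;
        boolean sendCompleted;
        Request(int id) { this.id = id; }
    }

    private final Map<String, Deque<Request>> requests = new HashMap<>();
    private final int maxInFlightPerConnection;

    InFlightSketch(int maxInFlightPerConnection) {
        this.maxInFlightPerConnection = maxInFlightPerConnection;
    }

    // The newest request goes to the head of the node's deque.
    void add(String node, Request request) {
        requests.computeIfAbsent(node, k -> new ArrayDeque<>()).addFirst(request);
    }

    // The oldest request comes off the tail, so completion follows send order.
    Request completeNext(String node) {
        Deque<Request> queue = requests.get(node);
        return queue == null ? null : queue.pollLast();
    }

    // Sending is allowed when nothing is queued, or when the newest request has been
    // fully written and the per-connection limit has not been reached.
    boolean canSendMore(String node) {
        Deque<Request> queue = requests.get(node);
        return queue == null || queue.isEmpty()
                || (queue.peekFirst().sendCompleted && queue.size() < maxInFlightPerConnection);
    }

    public static void main(String[] args) {
        InFlightSketch inFlight = new InFlightSketch(2);
        Request first = new Request(1);
        inFlight.add("node-0", first);
        System.out.println(inFlight.canSendMore("node-0")); // false: request 1 is still being written
        first.sendCompleted = true;
        System.out.println(inFlight.canSendMore("node-0")); // true
        Request second = new Request(2);
        second.sendCompleted = true;
        inFlight.add("node-0", second);
        System.out.println(inFlight.canSendMore("node-0")); // false: the limit of 2 is reached
        System.out.println(inFlight.completeNext("node-0").id); // 1: the oldest request
    }
}
```

Because insertion happens at the head and removal at the tail, `peekFirst()` always inspects the most recently queued request, which is the only one that can still be partially written.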

# ClusterConnectionStates
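* File: clients/src/main/java/org/apache/kafka/clients/ClusterConnectionStates.java
* Tracks the connection state of each broker node:
`private final Map<String, NodeConnectionState> nodeState`
* Two successive connection attempts to the same node must be separated by a minimum interval, i.e. reconnects are delayed:
`private final long reconnectBackoffMs`
* There are three connection states: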
```
ConnectionState.DISCONNECTED -- not connected
ConnectionState.CONNECTING -- connection in progress
ConnectionState.CONNECTED -- connected
```
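* `canConnect`: decides whether a connection to a node may be initiated. It is allowed if the node has never been connected to, or if the connection is currently disconnected and at least `reconnectBackoffMs` has elapsed since the last connection attempt;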
```
public boolean canConnect(String id, long now) {
        NodeConnectionState state = nodeState.get(id);
        if (state == null)
            return true;
        else
            return state.state == ConnectionState.DISCONNECTED && now - state.lastConnectAttemptMs >= this.reconnectBackoffMs;
    }
```
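The backoff rule can be tried out in isolation with the sketch below. It is an illustration under assumptions, not the Kafka class: `ConnectionBackoffSketch` and its `connecting` / `connected` / `disconnected` helpers are hypothetical simplifications, and only `canConnect` mirrors the logic quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for ClusterConnectionStates; names and helpers are assumptions.
class ConnectionBackoffSketch {
    enum State { DISCONNECTED, CONNECTING, CONNECTED }

    private static final class NodeState {
        State state;
        long lastConnectAttemptMs;
        NodeState(State state, long lastConnectAttemptMs) {
            this.state = state;
            this.lastConnectAttemptMs = lastConnectAttemptMs;
        }
    }

    private final Map<String, NodeState> nodeState = new HashMap<>();
    private final long reconnectBackoffMs;

    ConnectionBackoffSketch(long reconnectBackoffMs) {
        this.reconnectBackoffMs = reconnectBackoffMs;
    }

    // Record that a connection attempt started at time `now`.
    void connecting(String id, long now) {
        nodeState.put(id, new NodeState(State.CONNECTING, now));
    }

    // These two assume connecting(id, ...) was called first for this node.
    void connected(String id) { nodeState.get(id).state = State.CONNECTED; }
    void disconnected(String id) { nodeState.get(id).state = State.DISCONNECTED; }

    // Same rule as the quoted canConnect: unknown node, or disconnected and past the backoff.
    boolean canConnect(String id, long now) {
        NodeState state = nodeState.get(id);
        if (state == null)
            return true;
        return state.state == State.DISCONNECTED
                && now - state.lastConnectAttemptMs >= reconnectBackoffMs;
    }

    public static void main(String[] args) {
        ConnectionBackoffSketch states = new ConnectionBackoffSketch(50);
        System.out.println(states.canConnect("node-0", 0));    // true: never connected
        states.connecting("node-0", 1000);
        states.disconnected("node-0");
        System.out.println(states.canConnect("node-0", 1020)); // false: only 20 ms since the attempt
        System.out.println(states.canConnect("node-0", 1050)); // true: the 50 ms backoff has elapsed
    }
}
```

Note that the backoff is measured from the last connection attempt, not from the moment the connection dropped.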

# NetworkClient
* File: clients/src/main/java/org/apache/kafka/clients/NetworkClient.java
* Not thread-safe
* Implements the `KafkaClient` interface
* Uses `org.apache.kafka.common.network.Selector` for network IO, [details here => Kafka source code analysis: network layer](http://www.jianshu.com/p/8cbc7618abcb)
* In short, this class manages connections to broker nodes, request sending, and response receiving:
>A network client for asynchronous request/response network i/o. This is an internal class used to implement the user-facing producer and consumer clients.
* Core function `poll`
Uses `selector.poll` to process the actual socket read and write events;
```
        long metadataTimeout = metadataUpdater.maybeUpdate(now);
        try {
            this.selector.poll(Utils.min(timeout, metadataTimeout, requestTimeoutMs));
        } catch (IOException e) {
            log.error("Unexpected error during I/O", e);
        }
```
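After the call to `selector.poll`, all **completed request sends**, **completed response receives**, **dropped connections**, and **newly established connections** are available in the corresponding queues of `selector`;
* Handling completed sends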
```
private void handleCompletedSends(List<ClientResponse> responses, long now) {
        // if no response is expected then when the send is completed, return it
        for (Send send : this.selector.completedSends()) {
            ClientRequest request = this.inFlightRequests.lastSent(send.destination());
            if (!request.expectResponse()) {
                this.inFlightRequests.completeLastSent(send.destination());
                responses.add(new ClientResponse(request, now, false, null));
            }
        }
    }
```
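Requests that do not expect a response are removed from `inFlightRequests` as soon as the send completes, and a `ClientResponse` with a null body is added to `responses`;
* Handling received responses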
```
private void handleCompletedReceives(List<ClientResponse> responses, long now) {
        for (NetworkReceive receive : this.selector.completedReceives()) {
            String source = receive.source();
            ClientRequest req = inFlightRequests.completeNext(source);
            ResponseHeader header = ResponseHeader.parse(receive.payload());
            // Always expect the response version id to be the same as the request version id
            short apiKey = req.request().header().apiKey();
            short apiVer = req.request().header().apiVersion();
            Struct body = ProtoUtils.responseSchema(apiKey, apiVer).read(receive.payload());
            correlate(req.request().header(), header);
            if (!metadataUpdater.maybeHandleCompletedReceive(req, now, body))
                responses.add(new ClientResponse(req, now, false, body));
        }
    }
```
If the response is a `metadata` update, `metadataUpdater.maybeHandleCompletedReceive` processes it and it is not added to `responses`;
* Handling newly established connections
```
 private void handleConnections() {
        for (String node : this.selector.connected()) {
            log.debug("Completed connection to node {}", node);
            this.connectionStates.connected(node);
        }
    }
```
* Collecting the responses produced by all the `handle***` functions and invoking their callbacks
```
        List<ClientResponse> responses = new ArrayList<>();
        handleCompletedSends(responses, updatedNow);
        handleCompletedReceives(responses, updatedNow);
        handleDisconnections(responses, updatedNow);
        handleConnections();
        handleTimedOutRequests(responses, updatedNow);

        // invoke callbacks
        for (ClientResponse response : responses) {
            if (response.request().hasCallback()) {
                try {
                    response.request().callback().onComplete(response);
                } catch (Exception e) {
                    log.error("Uncaught error in request completion:", e);
                }
            }
        }
```
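The last step of `poll`, where the collected responses are dispatched to callbacks, can be sketched on its own. This is a simplified illustration: `PollCycleSketch`, `Response`, and `invokeCallbacks` are hypothetical names, and the list is filled by hand where NetworkClient would fill it from the `handle***` methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch of "collect responses, then invoke callbacks"; not Kafka's API.
class PollCycleSketch {
    // Hypothetical response type: a name plus an optional completion callback.
    static final class Response {
        final String name;
        final Consumer<Response> callback; // null when the request registered no callback
        Response(String name, Consumer<Response> callback) {
            this.name = name;
            this.callback = callback;
        }
    }

    // Invokes every callback and isolates failures so that one bad callback
    // cannot stop the others. Returns how many callbacks threw.
    static int invokeCallbacks(List<Response> responses) {
        int failures = 0;
        for (Response response : responses) {
            if (response.callback != null) {
                try {
                    response.callback.accept(response);
                } catch (Exception e) {
                    failures++;
                    System.err.println("Uncaught error in request completion: " + e);
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> completed = new ArrayList<>();
        List<Response> responses = new ArrayList<>();
        // In NetworkClient the handle* methods append to this list; here it is filled by hand.
        responses.add(new Response("produce", r -> completed.add(r.name)));
        responses.add(new Response("broken", r -> { throw new IllegalStateException("boom"); }));
        responses.add(new Response("no-callback", null));
        responses.add(new Response("fetch", r -> completed.add(r.name)));

        int failures = invokeCallbacks(responses);
        System.out.println(completed + " failures=" + failures); // [produce, fetch] failures=1
    }
}
```

Catching the exception per callback is what lets the remaining responses in the same `poll` call still be delivered.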

# NetworkClientBlockingOps
* File: core/src/main/scala/kafka/utils/NetworkClientBlockingOps.scala
* Builds blocking methods on top of the non-blocking methods of `NetworkClient`;
* Blocking until the connection to a node is ready
```
def blockingReady(node: Node, timeout: Long)(implicit time: JTime): Boolean = {
    client.ready(node, time.milliseconds()) || pollUntil(timeout) { (_, now) =>
      if (client.isReady(node, now))
        true
      else if (client.connectionFailed(node))
        throw new IOException(s"Connection to $node failed")
      else false
    }
  }
```
* Sending a request and blocking until its response is received
```
def blockingSendAndReceive(request: ClientRequest, timeout: Long)(implicit time: JTime): Option[ClientResponse] = {
    client.send(request, time.milliseconds())

    pollUntilFound(timeout) { case (responses, _) =>
      val response = responses.find { response =>
        response.request.request.header.correlationId == request.request.header.correlationId
      }
      response.foreach { r =>
        if (r.wasDisconnected) {
          val destination = request.request.destination
          throw new IOException(s"Connection to $destination was disconnected before the response was read")
        }
      }
      response
    }
  }
```
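The general idea of layering a blocking call over a non-blocking `poll` can be sketched in Java as follows. This is an assumption-laden illustration, not a port of the Scala code: `NonBlockingClient`, `FakeClient`, and the clock parameter are hypothetical, and an unchecked wrapper replaces the `IOException` only to keep the signature simple.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.LongSupplier;

// Illustrative sketch of a blocking send-and-receive built on a non-blocking poll.
class BlockingOpsSketch {
    static final class Response {
        final int correlationId;
        final boolean wasDisconnected;
        Response(int correlationId, boolean wasDisconnected) {
            this.correlationId = correlationId;
            this.wasDisconnected = wasDisconnected;
        }
    }

    // Hypothetical non-blocking client: poll returns whatever completed within the timeout.
    interface NonBlockingClient {
        void send(int correlationId);
        List<Response> poll(long timeoutMs);
    }

    // Polls repeatedly until a response with the matching correlation id shows up
    // or the deadline passes.
    static Optional<Response> blockingSendAndReceive(NonBlockingClient client, int correlationId,
                                                     long timeoutMs, LongSupplier clock) {
        client.send(correlationId);
        long now = clock.getAsLong();
        long deadline = now + timeoutMs;
        do {
            for (Response r : client.poll(deadline - now)) {
                if (r.correlationId == correlationId) {
                    if (r.wasDisconnected)
                        throw new UncheckedIOException(
                                new IOException("Disconnected before the response was read"));
                    return Optional.of(r);
                }
            }
            now = clock.getAsLong();
        } while (now < deadline);
        return Optional.empty();
    }

    // Test double: every poll "takes" 10 ms; the response arrives on poll number respondOnPoll.
    static final class FakeClient implements NonBlockingClient {
        long clockMs = 0;
        int polls = 0;
        int pending = -1;
        final int respondOnPoll;
        FakeClient(int respondOnPoll) { this.respondOnPoll = respondOnPoll; }
        public void send(int correlationId) { pending = correlationId; }
        public List<Response> poll(long timeoutMs) {
            clockMs += 10;
            polls++;
            if (polls == respondOnPoll)
                return Collections.singletonList(new Response(pending, false));
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        FakeClient fast = new FakeClient(3);
        System.out.println(blockingSendAndReceive(fast, 42, 100, () -> fast.clockMs).isPresent()); // true
        FakeClient slow = new FakeClient(50);
        System.out.println(blockingSendAndReceive(slow, 43, 100, () -> slow.clockMs).isPresent()); // false: timed out
    }
}
```

The fake clock keeps the example deterministic; with a real client the supplier would typically be `System::currentTimeMillis`.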

##### [Kafka source code analysis: index](http://www.jianshu.com/p/aa274f8fe00f)