Flink Source Code: JobGraph Generation


JobGraph

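Compared with the StreamGraph, the JobGraph applies one optimization when it is generated: it combines as many operators as possible into a single task, forming an operator chain. Operators in the same chain run in the same thread, which significantly reduces thread-switching overhead, raises throughput, and lowers latency.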

Operator Chain

Entry point

The getJobGraph method of StreamGraph:

@Override
public JobGraph getJobGraph(@Nullable JobID jobID) {
    return StreamingJobGraphGenerator.createJobGraph(this, jobID);
}

The logic that generates the JobGraph lives in the StreamingJobGraphGenerator class.

The createJobGraph method of StreamingJobGraphGenerator is shown below:

public static JobGraph createJobGraph(StreamGraph streamGraph, @Nullable JobID jobID) {
    return new StreamingJobGraphGenerator(streamGraph, jobID).createJobGraph();
}

Following the call one step further, the main logic that builds the JobGraph is:

private JobGraph createJobGraph() {
    // Run some validation checks
    preValidate();

    // make sure that all vertices start immediately
    // Set the schedule mode of the JobGraph
    jobGraph.setScheduleMode(streamGraph.getScheduleMode());

    // Generate deterministic hashes for the nodes in order to identify them across
    // submission iff they didn't change.
    Map<Integer, byte[]> hashes = defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);

    // Generate legacy version hashes for backwards compatibility
    List<Map<Integer, byte[]>> legacyHashes = new ArrayList<>(legacyStreamGraphHashers.size());
    for (StreamGraphHasher hasher : legacyStreamGraphHashers) {
        legacyHashes.add(hasher.traverseStreamGraphAndGenerateHashes(streamGraph));
    }

    Map<Integer, List<Tuple2<byte[], byte[]>>> chainedOperatorHashes = new HashMap<>();

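    // This is the key step: the vertices and edges of the JobGraph are created in this method,
    // which tries to merge as many StreamNodes as possible into one JobGraph vertex.
    // The conditions for merging are analyzed later.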
    setChaining(hashes, legacyHashes, chainedOperatorHashes);

    // Set the physical edges
    setPhysicalEdges();

    // Set slot sharing and co-location. Tasks in the same coLocationGroup must run in the same slot
    setSlotSharingAndCoLocation();

    // Configure checkpointing
    configureCheckpointing();

    JobGraphGenerator.addUserArtifactEntries(streamGraph.getUserArtifacts(), jobGraph);

    // set the ExecutionConfig last when it has been finalized
    try {
        // Set the execution config
        jobGraph.setExecutionConfig(streamGraph.getExecutionConfig());
    }
    catch (IOException e) {
        throw new IllegalConfigurationException("Could not serialize the ExecutionConfig." +
                "This indicates that non-serializable types (like custom serializers) were registered");
    }

    return jobGraph;
}

The most important of these is the setChaining method. Starting from each source node of the StreamGraph, it generates the job vertices (chains).

private void setChaining(Map<Integer, byte[]> hashes, List<Map<Integer, byte[]>> legacyHashes, Map<Integer, List<Tuple2<byte[], byte[]>>> chainedOperatorHashes) {
    for (Integer sourceNodeId : streamGraph.getSourceIDs()) {
        createChain(sourceNodeId, sourceNodeId, hashes, legacyHashes, 0, chainedOperatorHashes);
    }
}

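The chain concept: the most important optimization in the JobGraph is building operator chains, which merges as many operations as possible into a single vertex and avoids unnecessary thread switches and network communication.

The main logic of the createChain method: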

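If the stream has several sources, createChain is called once for each source.
createChain takes two parameters, startNodeId and currentNodeId. If they are equal, a new chain is being created. If they differ, the current node is added to the chain that begins at the start node.
The builtVertices collection ensures that no StreamNode is processed twice.
The out-edges of each node are classified, using the isChainable function as the criterion.
There are three collections of out-edges: edges that can be chained, edges that cannot be chained, and transitiveOutEdges. The last one is filled during the recursive createChain calls and holds all out-edges of the whole chain (while a chain is being built, reaching an edge that cannot be chained means the chain ends there, and that StreamEdge is an out-edge of the chain).
createChain is recursive. If an out-edge of a StreamNode is chainable, createChain is called again with the same start node and the node that edge points to, and the recursion continues until no out-edge is chainable.
When a non-chainable edge is reached, its target node starts a new chain, and a job vertex is created for the start node of that chain.
The operator hashes of the nodes in a chain are stored in chainedOperatorHashes under the chain's start node. Its structure is Map<startNodeID, List<Tuple2<primaryHash, legacyHash>>>, where both hashes in a tuple belong to the same node of the chain.
The configuration of every StreamNode (whether or not it has its own job vertex) is held in the config variable, which the setVertexConfig method fills in.
The chainedConfigs variable records which node configs belong to which chain start node. Its structure is Map<startNodeID, Map<currentNodeID, StreamConfig>>.
The connect method links each job vertex (chain) to the next one. For example, if A and B are connected, an IntermediateDataSet is first appended after A, followed by a JobEdge, which finally connects to B.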
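
The recursion described above can be tried out on a toy graph. The following is a minimal sketch, not Flink code: ChainSketch, its Edge class, and the chains map are hypothetical names, and the chainable flag on each edge stands in for the result of isChainable. It keeps only the start/current node bookkeeping and the collection of transitive out-edges.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of the recursion in createChain; not Flink code.
class ChainSketch {
    // A directed edge; 'chainable' stands in for the result of isChainable(edge).
    static final class Edge {
        final int source;
        final int target;
        final boolean chainable;

        Edge(int source, int target, boolean chainable) {
            this.source = source;
            this.target = target;
            this.chainable = chainable;
        }
    }

    private final Map<Integer, List<Edge>> outEdges = new LinkedHashMap<>();
    private final Set<Integer> builtVertices = new HashSet<>();
    // chain start node ID -> IDs of the nodes in that chain, start node first
    final Map<Integer, List<Integer>> chains = new LinkedHashMap<>();

    void addEdge(int source, int target, boolean chainable) {
        outEdges.computeIfAbsent(source, k -> new ArrayList<>()).add(new Edge(source, target, chainable));
        outEdges.computeIfAbsent(target, k -> new ArrayList<>());
    }

    // Returns all out-edges of the chain that starts at startNodeId.
    List<Edge> createChain(int startNodeId, int currentNodeId) {
        if (builtVertices.contains(startNodeId)) {
            return new ArrayList<>(); // this chain was already built
        }
        chains.computeIfAbsent(startNodeId, k -> new ArrayList<>()).add(currentNodeId);

        List<Edge> transitiveOutEdges = new ArrayList<>();
        for (Edge edge : outEdges.get(currentNodeId)) {
            if (edge.chainable) {
                // Same chain: keep startNodeId and move on to the edge's target.
                transitiveOutEdges.addAll(createChain(startNodeId, edge.target));
            }
        }
        for (Edge edge : outEdges.get(currentNodeId)) {
            if (!edge.chainable) {
                // The chain ends at this edge; its target starts a new chain.
                transitiveOutEdges.add(edge);
                createChain(edge.target, edge.target);
            }
        }
        if (currentNodeId == startNodeId) {
            // Sketch assumption: mark the chain as built once its start node is finished.
            builtVertices.add(startNodeId);
        }
        return transitiveOutEdges;
    }

    public static void main(String[] args) {
        ChainSketch graph = new ChainSketch();
        graph.addEdge(1, 2, true);   // source -> map, chainable
        graph.addEdge(2, 3, false);  // map -> window, not chainable (e.g. after keyBy)
        graph.addEdge(3, 4, true);   // window -> sink, chainable
        List<Edge> out = graph.createChain(1, 1);
        System.out.println(graph.chains);  // prints {1=[1, 2], 3=[3, 4]}
        System.out.println(out.size());    // prints 1
    }
}
```

In this toy graph the only out-edge of the first chain is the edge from node 2 to node 3, which is also where the second chain begins.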

The createChain code is shown below:

private List<StreamEdge> createChain(
        Integer startNodeId,
        Integer currentNodeId,
        Map<Integer, byte[]> hashes,
        List<Map<Integer, byte[]>> legacyHashes,
        int chainIndex,
        Map<Integer, List<Tuple2<byte[], byte[]>>> chainedOperatorHashes) {

    // builtVertices holds the IDs of the StreamNodes that have already been built, to avoid processing them twice
    if (!builtVertices.contains(startNodeId)) {

        // Holds all out-edges of the whole chain
        List<StreamEdge> transitiveOutEdges = new ArrayList<StreamEdge>();
        // Holds the StreamEdges that can be chained
        List<StreamEdge> chainableOutputs = new ArrayList<StreamEdge>();
        // Holds the StreamEdges that cannot be chained
        List<StreamEdge> nonChainableOutputs = new ArrayList<StreamEdge>();

        // Get the node currently being processed
        StreamNode currentNode = streamGraph.getStreamNode(currentNodeId);

        // Split the out-edges into chainable and non-chainable ones using isChainable
        for (StreamEdge outEdge : currentNode.getOutEdges()) {
            if (isChainable(outEdge, streamGraph)) {
                chainableOutputs.add(outEdge);
            } else {
                nonChainableOutputs.add(outEdge);
            }
        }

        for (StreamEdge chainable : chainableOutputs) {
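            // For a chainable StreamEdge, call createChain recursively.
            // Note that the current node becomes chainable.getTargetId().
            // The recursion continues until the current node has no chainable out-edge. The next for loop
            // then adds the non-chainable edges to transitiveOutEdges, and the result returns to the outermost call.
            // In this way transitiveOutEdges ends up holding every out-edge of the whole chain.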
            transitiveOutEdges.addAll(
                    createChain(startNodeId, chainable.getTargetId(), hashes, legacyHashes, chainIndex + 1, chainedOperatorHashes));
        }

        for (StreamEdge nonChainable : nonChainableOutputs) {
            // A non-chainable StreamEdge is added to transitiveOutEdges
            transitiveOutEdges.add(nonChainable);
            // Call createChain to build a new chain that starts at the edge's target node
            createChain(nonChainable.getTargetId(), nonChainable.getTargetId(), hashes, legacyHashes, 0, chainedOperatorHashes);
        }

        List<Tuple2<byte[], byte[]>> operatorHashes =
            chainedOperatorHashes.computeIfAbsent(startNodeId, k -> new ArrayList<>());

        byte[] primaryHashBytes = hashes.get(currentNodeId);
        OperatorID currentOperatorId = new OperatorID(primaryHashBytes);

        for (Map<Integer, byte[]> legacyHash : legacyHashes) {
            operatorHashes.add(new Tuple2<>(primaryHashBytes, legacyHash.get(currentNodeId)));
        }

        // Set the chain's name
        chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs));
        // Set the chain's minimum resources
        chainedMinResources.put(currentNodeId, createChainedMinResources(currentNodeId, chainableOutputs));
        // Set the chain's preferred resources
        chainedPreferredResources.put(currentNodeId, createChainedPreferredResources(currentNodeId, chainableOutputs));

        if (currentNode.getInputFormat() != null) {
            getOrCreateFormatContainer(startNodeId).addInputFormat(currentOperatorId, currentNode.getInputFormat());
        }

        if (currentNode.getOutputFormat() != null) {
            getOrCreateFormatContainer(startNodeId).addOutputFormat(currentOperatorId, currentNode.getOutputFormat());
        }

        // If currentNodeId equals startNodeId, a new chain starts here and a JobVertex is created for it
        StreamConfig config = currentNodeId.equals(startNodeId)
                ? createJobVertex(startNodeId, hashes, legacyHashes, chainedOperatorHashes)
                : new StreamConfig(new Configuration());

        // Write the vertex properties into config
        setVertexConfig(currentNodeId, config, chainableOutputs, nonChainableOutputs);

        if (currentNodeId.equals(startNodeId)) {

            // This node is the start of a new chain
            config.setChainStart();
            config.setChainIndex(0);
            config.setOperatorName(streamGraph.getStreamNode(currentNodeId).getOperatorName());
            config.setOutEdgesInOrder(transitiveOutEdges);
            config.setOutEdges(streamGraph.getStreamNode(currentNodeId).getOutEdges());

            // Connect the chain to each of its out-edges that lead to a downstream chain
            for (StreamEdge edge : transitiveOutEdges) {
                connect(startNodeId, edge);
            }

            config.setTransitiveChainedTaskConfigs(chainedConfigs.get(startNodeId));

        } else {
            chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<Integer, StreamConfig>());

            config.setChainIndex(chainIndex);
            // Get the node that is being chained
            StreamNode node = streamGraph.getStreamNode(currentNodeId);
            config.setOperatorName(node.getOperatorName());
            // Register this node's config under the chain's start node
            chainedConfigs.get(startNodeId).put(currentNodeId, config);
        }

        config.setOperatorID(currentOperatorId);

        if (chainableOutputs.isEmpty()) {
            config.setChainEnd();
        }
        return transitiveOutEdges;

    } else {
        return new ArrayList<>();
    }
}

The isChainable method is important: it decides whether the two StreamNodes connected by an edge can be placed in the same operator chain. The method is shown below:

public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
    StreamNode upStreamVertex = streamGraph.getSourceVertex(edge);
    StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);

    StreamOperator<?> headOperator = upStreamVertex.getOperator();
    StreamOperator<?> outOperator = downStreamVertex.getOperator();

    return downStreamVertex.getInEdges().size() == 1
            && outOperator != null
            && headOperator != null
            && upStreamVertex.isSameSlotSharingGroup(downStreamVertex)
            && outOperator.getChainingStrategy() == ChainingStrategy.ALWAYS
            && (headOperator.getChainingStrategy() == ChainingStrategy.HEAD ||
                headOperator.getChainingStrategy() == ChainingStrategy.ALWAYS)
            && (edge.getPartitioner() instanceof ForwardPartitioner)
            && upStreamVertex.getParallelism() == downStreamVertex.getParallelism()
            && streamGraph.isChainingEnabled();
}

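To summarize, an edge can be chained only if all of the following conditions hold:

The downstream node has exactly one input edge.
The operators of both the upstream node and the downstream node are non-null.
The upstream and downstream nodes are in the same SlotSharingGroup.
The chaining strategy of the downstream operator is ChainingStrategy.ALWAYS.
The chaining strategy of the upstream operator is ChainingStrategy.ALWAYS or ChainingStrategy.HEAD.
The edge uses ForwardPartitioner or one of its subclasses.
The upstream and downstream nodes have the same parallelism.
Chaining is enabled.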
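
To see how these conditions interact, here is a minimal sketch that evaluates them on plain values. It is not Flink code: ChainableSketch, its Node class, and its own ChainingStrategy enum are hypothetical stand-ins for the properties that the real method reads from StreamNode, StreamEdge, and StreamGraph, and the null checks on the operators are left out.

```java
// Hypothetical stand-in for the isChainable check; not Flink code.
class ChainableSketch {
    // Local stand-in for the strategies that the check compares against.
    enum ChainingStrategy { ALWAYS, NEVER, HEAD }

    // Only the node properties that the check reads.
    static final class Node {
        final int inEdgeCount;
        final String slotSharingGroup;
        final ChainingStrategy strategy;
        final int parallelism;

        Node(int inEdgeCount, String slotSharingGroup, ChainingStrategy strategy, int parallelism) {
            this.inEdgeCount = inEdgeCount;
            this.slotSharingGroup = slotSharingGroup;
            this.strategy = strategy;
            this.parallelism = parallelism;
        }
    }

    // Mirrors the listed conditions in order; every one of them must hold.
    static boolean isChainable(Node up, Node down, boolean forwardPartitioner, boolean chainingEnabled) {
        return down.inEdgeCount == 1
                && up.slotSharingGroup.equals(down.slotSharingGroup)
                && down.strategy == ChainingStrategy.ALWAYS
                && (up.strategy == ChainingStrategy.HEAD || up.strategy == ChainingStrategy.ALWAYS)
                && forwardPartitioner
                && up.parallelism == down.parallelism
                && chainingEnabled;
    }

    public static void main(String[] args) {
        Node source = new Node(0, "default", ChainingStrategy.HEAD, 4);
        Node map = new Node(1, "default", ChainingStrategy.ALWAYS, 4);
        Node sink = new Node(1, "default", ChainingStrategy.ALWAYS, 2);

        System.out.println(isChainable(source, map, true, true));   // prints true
        System.out.println(isChainable(source, map, false, true));  // prints false: not a forward edge
        System.out.println(isChainable(map, sink, true, true));     // prints false: parallelism differs
    }
}
```

A single failing condition is enough to end the chain at that edge, which is why a repartitioning step or a parallelism change between two operators always produces a separate job vertex.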

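Next is the setPhysicalEdges method, which sets the physical in-edges of the job vertices. Its steps are:

Iterate over physicalEdgesInOrder, which contains every out-edge that could not be chained (an edge is added to this collection when connect is called).
Group these edges into physicalInEdgesInOrder, whose structure is Map<ID of the downstream node that a non-chainable edge points to, List<non-chainable edges>>.
For each of these downstream nodes, store its list as the node's physical in-edges.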

private void setPhysicalEdges() {
    Map<Integer, List<StreamEdge>> physicalInEdgesInOrder = new HashMap<Integer, List<StreamEdge>>();

    for (StreamEdge edge : physicalEdgesInOrder) {
        int target = edge.getTargetId();

        List<StreamEdge> inEdges = physicalInEdgesInOrder.computeIfAbsent(target, k -> new ArrayList<>());

        inEdges.add(edge);
    }

    for (Map.Entry<Integer, List<StreamEdge>> inEdges : physicalInEdgesInOrder.entrySet()) {
        int vertex = inEdges.getKey();
        List<StreamEdge> edgeList = inEdges.getValue();

        vertexConfigs.get(vertex).setInPhysicalEdges(edgeList);
    }
}
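
The grouping step is easier to follow with concrete values. The sketch below is an illustration only: PhysicalEdgesSketch and groupByTarget are hypothetical names, edges are reduced to {source, target} ID pairs, and a LinkedHashMap replaces the HashMap of the original so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of grouping non-chainable edges by their target node.
class PhysicalEdgesSketch {
    // Each edge is a {sourceId, targetId} pair; the result maps targetId -> its in-edges in order.
    static Map<Integer, List<String>> groupByTarget(int[][] edges) {
        Map<Integer, List<String>> inEdgesByTarget = new LinkedHashMap<>();
        for (int[] edge : edges) {
            inEdgesByTarget
                    .computeIfAbsent(edge[1], k -> new ArrayList<>())
                    .add(edge[0] + "->" + edge[1]);
        }
        return inEdgesByTarget;
    }

    public static void main(String[] args) {
        // Nodes 2 and 5 both feed node 3 over non-chainable edges; node 3 feeds node 6.
        int[][] physicalEdgesInOrder = {{2, 3}, {5, 3}, {3, 6}};
        System.out.println(groupByTarget(physicalEdgesInOrder));  // prints {3=[2->3, 5->3], 6=[3->6]}
    }
}
```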

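The remaining methods matter less for understanding how the JobGraph is generated. They are not analyzed here and will be added later.

Example diagram

JobGraph diagram

Note that the window and sink nodes of the StreamGraph have been chained together.

Last edited: 2021.04.23 15:20:56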
