【Pipeline】 Recovery

分享一波大数据&Java的学习视频和书籍:
### Java与大数据资源分享

本文来学习Pipeline Recovery的相关知识。
当client写文件时,数据是以block的形式相继顺序写入的。HDFS把block又拆分成很多packet把这些packet传播到write pipeline中的Datanodes。如下图所示:

write pipeline有三个过程:
1)Pipeline setup. The client sends a Write_Block request along the pipeline and the last DataNode sends an acknowledgement back. After receiving the acknowledgement, the pipeline is ready for writing.

2)Data streaming. The data is sent through the pipeline in packets. The client buffers the data until a packet is filled up, and then sends the packet to the pipeline. If the client calls hflush(), then even if a packet is not full, it will nevertheless be sent to the pipeline and the next packet will not be sent until the acknowledgement of the previous hflush’ed packet is received by the client.

3)Close (finalize the replica and shutdown the pipeline). The client waits until all packets have been acknowledged and then sends a close request. All DataNodes in the pipeline change the corresponding replica into the FINALIZED state and report back to the NameNode. The NameNode then changes the block’s state to COMPLETE if at least the configured minimum replication number of DataNodes reported a FINALIZED state of their corresponding replicas.

下面来看pipeline recovery。什么时候发生呢?当一个block is being written时,在上面三个阶段的任意一个阶段如果一个或者多个在pipeline中的datanode发生了错误,那么就会发起pipeline recovery。

Pipeline Recovery:
Pipeline recovery is initiated when one or more DataNodes in the pipeline encounter an error in any of the three stages while a block is being written.

与上面三个阶段相对应的,pipeline recovery也针对这三个不同阶段有不同的恢复行为。

1)Recovery from Pipeline Setup Failure

  • If the pipeline was created for a new block, the client abandons the block and asks the NameNode for a new block and a new list of DataNodes. The pipeline is reinitialized for the new block.

如果pipeline是因为create a block而创建的话,client abandon了block,并向Namenode请求一个new block和 new list of Datanodes,那么pipeline会为这个new block 重新初始化。

  • If the pipeline was created to append to a block, the client rebuilds the pipeline with the remaining DataNodes and increments the block’s generation stamp.

如果pipeline是因为append a block而创建的,那么client 利用剩余的Datanodes重新build pipeline并且增加block的GS

2)Recovery from Data Streaming Failure

  • When a DataNode in the pipeline detects an error (for example, a checksum error or a failure to write to disk), that DataNode takes itself out of the pipeline by closing up all TCP/IP connections. If the data is deemed not corrupted, it also writes buffered data to the relevant block and checksum (METADATA) files.

当一个在pipeline中的datanode检测到error时,这个datanode会把它自己从pipeline中移出去通过关闭所有TCP/IP连接的方式。如果数据看起来没有损坏,它仍会写buffered data到相关的block和metadata文件中。

  • When the client detects the failure, it stops sending data to the pipeline, and reconstructs a new pipeline using the remaining good DataNodes. As a result, all replicas of the block are bumped up to a new GS.

当client检测到failure时,client停止发送数据到pipeline同时使用剩余的good datanodes 来重新构建一个new pipeline。所有的块副本会提升至一个新的GS

  • The client resumes sending data packets with this new GS. If the data sent has already been received by some of the DataNodes, they just ignore the packet and pass it downstream in the pipeline.

client继续恢复发送data packets使用这个new GS。如果这些数据已经被某些Datanodes receive过了,那么这些已经receive到数据的datanode只需要忽略这些packet并直接把packet传到pipeline的下游节点。

3)Recovery from Close Failure

  • When the client detects a failure in the close state, it rebuilds the pipeline with the remaining DataNodes. Each DataNode bumps up the block’s GS and finalizes the replica if it’s not finalized yet.

当client在close 状态下检测到failure时,它会使用剩余datanodes节点 rebuild pipeline。每个datanode会提升block的GS并且会把没有finalize的副本给finalize 掉。

总结:
当一个Datanode is bad时,它会把自己从pipeline中remove掉。在pipeline recovery的过程中, client可能会利用剩下的Datanodes(client可能会使用新的Datanode来替换bad Datanode,也可能不会替换。这取决于DataNode replacement policy)来rebuild一个新的pipeline。 replication monitor会关注与复制block来满足我们配置的replication factor。

DataNode Replacement Policy upon Failure

There are four configurable policies regarding whether to add additional DataNodes to replace the bad ones when setting up a pipeline for recovery with the remaining DataNodes:

  1. DISABLE: Disables DataNode replacement and throws an error (at the server); this acts like NEVER at the client.
  2. NEVER: Never replace a DataNode when a pipeline fails (generally not a desirable action).
  3. DEFAULT: Replace based on the following conditions:
    a. Let r be the configured replication number.
    b. Let n be the number of existing replica datanodes.
    c. Add a new DataNode only if r >= 3 and EITHER
    • floor(r/2) >= n; OR
    • r > n and the block is hflushed/appended.
  4. ALWAYS: Always add a new DataNode when an existing DataNode failed. This fails if a DataNode can’t be replaced.

参考文章:

https://blog.cloudera.com/

©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 159,015评论 4 362
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 67,262评论 1 292
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 108,727评论 0 243
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 43,986评论 0 205
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 52,363评论 3 287
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 40,610评论 1 219
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 31,871评论 2 312
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 30,582评论 0 198
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 34,297评论 1 242
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 30,551评论 2 246
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 32,053评论 1 260
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 28,385评论 2 253
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 33,035评论 3 236
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 26,079评论 0 8
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 26,841评论 0 195
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 35,648评论 2 274
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 35,550评论 2 270

推荐阅读更多精彩内容