Big Data Series (2): HDFS (Hadoop Distributed File System), Part 1

HDFS Design

HDFS is a filesystem designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.

Where HDFS Is Not a Good Fit

Low-latency data access
HDFS is optimized for delivering a high throughput of data, and this may come at the expense of latency.
Lots of small files
Because the namenode holds filesystem metadata in memory, the number of files a filesystem can hold is limited by the amount of memory on the namenode. As a rule of thumb, each file, directory, and block takes about 150 bytes. Storing millions of files is feasible, but billions is beyond the capability of current hardware.
Multiple writers, arbitrary file modifications
A file in HDFS may be written to by only a single writer; multiple concurrent writers and modifications at arbitrary positions in a file are not supported.
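
The 150-byte rule of thumb in the small-files point can be turned into a rough estimate. The sketch below is only back-of-the-envelope arithmetic: the function name and its parameters are made up for illustration, and real namenode memory use depends on the Hadoop version and on path lengths.

```python
BYTES_PER_OBJECT = 150  # rule of thumb: one file, directory, or block entry


def estimate_namenode_memory(num_files: int, blocks_per_file: int = 1,
                             num_dirs: int = 0) -> int:
    """Return a rough namenode memory estimate in bytes.

    Counts one object per file, one per block, and one per directory.
    """
    objects = num_files + num_files * blocks_per_file + num_dirs
    return objects * BYTES_PER_OBJECT


# One million single-block files versus one billion single-block files.
for n in (1_000_000, 1_000_000_000):
    gb = estimate_namenode_memory(n) / 1e9
    print(f"{n:>13,} files -> about {gb:.1f} GB")
```

Under these assumptions a million single-block files need about 0.3 GB, while a billion need about 300 GB, which is why billions of small files are a problem.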

HDFS Concepts

Blocks

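A disk has a block size, which is the minimum amount of data that it can read or write. HDFS has the concept of a block too, but it is a much larger unit: 64 MB by default in the Hadoop version these notes are based on (Hadoop 2.x and later default to 128 MB). Files in HDFS are broken into block-sized chunks, which are stored as independent units.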
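
To make the chunking concrete, here is a minimal sketch of how a file's length maps to block-sized chunks. `split_into_blocks` is a hypothetical helper, not an HDFS API. It also assumes that the last chunk only takes as much space as the data it holds, so a file smaller than one block does not occupy a full block of underlying storage.

```python
MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int = 64 * MB) -> list[int]:
    """Return the sizes in bytes of the chunks a file is split into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # The final chunk holds only the leftover bytes, not a whole block.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


# A 200 MB file with 64 MB blocks: three full blocks and one 8 MB chunk.
print([size // MB for size in split_into_blocks(200 * MB)])  # [64, 64, 64, 8]
```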

Why Is a Block in HDFS So Large?

HDFS blocks are large compared to disk blocks, and the reason is to minimize the cost of seeks.
By making a block large enough, the time to transfer the data from the disk can be made significantly longer than the time to seek to the start of the block. Transferring a large file made of multiple blocks then operates at the disk transfer rate.
For example, if the seek time is around 10 ms and the transfer rate is 100 MB/s, then to make the seek time 1% of the transfer time, the block size needs to be around 100 MB.
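
The arithmetic behind that figure can be checked in a few lines. This is just the calculation from the example, with hypothetical function names. It treats a block read as one seek followed by one sequential transfer, which is a simplification.

```python
def block_size_for_seek_fraction(seek_time_s: float, transfer_rate_mb_s: float,
                                 seek_fraction: float) -> float:
    """Block size in MB at which seek time is `seek_fraction` of transfer time."""
    transfer_time_s = seek_time_s / seek_fraction
    return transfer_rate_mb_s * transfer_time_s


def seek_fraction_for_block(block_mb: float, seek_time_s: float,
                            transfer_rate_mb_s: float) -> float:
    """Seek time as a fraction of transfer time for a given block size."""
    return seek_time_s / (block_mb / transfer_rate_mb_s)


# 10 ms seek, 100 MB/s transfer, seek time kept to 1% of transfer time.
print(block_size_for_seek_fraction(0.010, 100.0, 0.01), "MB")

# The same disk with smaller blocks spends relatively more time seeking.
for block_mb in (1, 64, 128):
    frac = seek_fraction_for_block(block_mb, 0.010, 100.0)
    print(f"{block_mb:>4} MB block: seek is {frac:.2%} of transfer time")
```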

Advantages

Nothing requires the blocks of a file to be stored on the same disk, so they can take advantage of any of the disks in the cluster. A file can therefore be larger than any single disk.
Making the unit of abstraction a block rather than a file simplifies the storage subsystem:
- Storage management is simpler: since blocks are a fixed size, it is easy to calculate how many can be stored on a given disk.
- Metadata concerns go away: blocks are just chunks of data to be stored, so file metadata such as permissions information does not need to be stored with the blocks, and another system can handle metadata separately.
Blocks fit well with replication for providing fault tolerance and availability. Each block is typically replicated three times.
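
The replication point can be illustrated with a toy model. The placement rule below is a simple round-robin invented for this sketch. It is not HDFS's actual placement policy, and the datanode names are made up. It only shows why three replicas on distinct machines let every block survive the loss of up to two machines.

```python
def place_replicas(num_blocks: int, datanodes: list[str],
                   replication: int = 3) -> dict[int, list[str]]:
    """Toy placement: block b goes to `replication` consecutive, distinct nodes."""
    if replication > len(datanodes):
        raise ValueError("need at least as many datanodes as replicas")
    n = len(datanodes)
    return {b: [datanodes[(b + r) % n] for r in range(replication)]
            for b in range(num_blocks)}


def all_blocks_readable(placement: dict[int, list[str]], failed: set[str]) -> bool:
    """True if every block still has at least one replica on a live node."""
    return all(any(dn not in failed for dn in replicas)
               for replicas in placement.values())


nodes = ["dn0", "dn1", "dn2", "dn3"]
placement = place_replicas(num_blocks=8, datanodes=nodes)
print(all_blocks_readable(placement, failed={"dn1", "dn2"}))         # True
print(all_blocks_readable(placement, failed={"dn0", "dn1", "dn2"}))  # False
```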

Namenode and datanode

An HDFS cluster has two types of node operating in a master-worker pattern: a namenode (the master) and a number of datanodes (workers).

Namenode

The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata for all the files and directories in the tree. This information is stored persistently on the local disk in the form of two files: the namespace image and the edit log.
The namenode also knows the datanodes on which all the blocks for a given file are located; however, it does not store block locations persistently, because this information is reconstructed from datanodes when the system starts.

Datanodes

Datanodes store and retrieve blocks when they are told to (by clients or the namenode), and they report back to the namenode periodically with lists of the blocks that they are storing.
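
The split between persistent namespace metadata and block locations rebuilt from datanode reports can be sketched with a toy model. Everything here (`ToyNamenode`, its methods, the block IDs) is hypothetical and greatly simplified. It only mirrors the idea that the file-to-block mapping survives a restart while the block-to-datanode mapping is repopulated from block reports.

```python
from collections import defaultdict


class ToyNamenode:
    def __init__(self):
        # Stands in for the persisted namespace: path -> ordered block IDs.
        self.file_blocks = {}
        # Held in memory only: block ID -> names of datanodes holding it.
        self.block_locations = defaultdict(set)

    def add_file(self, path, block_ids):
        self.file_blocks[path] = list(block_ids)

    def restart(self):
        # The namespace is kept; block locations are lost until reports arrive.
        self.block_locations = defaultdict(set)

    def receive_block_report(self, datanode, block_ids):
        # Replace whatever was previously known about this datanode.
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode)

    def locate(self, path):
        return [(b, sorted(self.block_locations.get(b, ())))
                for b in self.file_blocks[path]]


nn = ToyNamenode()
nn.add_file("/logs/a", ["blk_1", "blk_2"])
nn.receive_block_report("dn1", ["blk_1", "blk_2"])
nn.receive_block_report("dn2", ["blk_2"])
print(nn.locate("/logs/a"))  # [('blk_1', ['dn1']), ('blk_2', ['dn1', 'dn2'])]

nn.restart()
print(nn.locate("/logs/a"))  # [('blk_1', []), ('blk_2', [])]
```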

HDFS Federation

The namenode keeps a reference to every file and block in the filesystem in memory, which means that on very large clusters with many files, memory becomes the limiting factor for scaling. HDFS Federation allows a cluster to scale by adding namenodes, each of which manages a portion of the filesystem namespace.

Under federation, each namenode manages a namespace volume, which is made up of the metadata for the namespace, and a block pool containing all the blocks for the files in the namespace.

Namespace volumes are independent of each other, which means namenodes do not communicate with one another.
Block pool storage is not partitioned, however, so datanodes register with each namenode in the cluster and store blocks from multiple block pools.
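
A toy model may help picture how namespace volumes and block pools relate. All names below are assumptions made for illustration: the `/user` and `/share` prefixes, the `BP-...` pool IDs, and the classes themselves. The sketch only shows two things: each path belongs to exactly one namespace volume, and a single datanode holds blocks from several block pools.

```python
from dataclasses import dataclass, field


@dataclass
class NamespaceVolume:
    prefix: str         # hypothetical part of the namespace this namenode manages
    block_pool_id: str  # hypothetical ID of the volume's block pool
    files: dict = field(default_factory=dict)  # path -> list of block IDs


@dataclass
class Datanode:
    name: str
    pools: dict = field(default_factory=dict)  # block pool ID -> set of block IDs

    def register(self, volume: NamespaceVolume) -> None:
        # A datanode registers with every namenode in the cluster.
        self.pools.setdefault(volume.block_pool_id, set())

    def store(self, volume: NamespaceVolume, block_id: str) -> None:
        self.pools[volume.block_pool_id].add(block_id)


def volume_for(path: str, volumes: list) -> NamespaceVolume:
    """Return the volume whose prefix contains the path (longest prefix wins)."""
    matches = [v for v in volumes
               if path == v.prefix or path.startswith(v.prefix.rstrip("/") + "/")]
    if not matches:
        raise KeyError(path)
    return max(matches, key=lambda v: len(v.prefix))


user = NamespaceVolume("/user", "BP-user")
share = NamespaceVolume("/share", "BP-share")
dn = Datanode("dn1")
for vol in (user, share):
    dn.register(vol)

for path, block in (("/user/alice/data.csv", "blk_1"), ("/share/ref.txt", "blk_9")):
    vol = volume_for(path, [user, share])
    vol.files[path] = [block]
    dn.store(vol, block)

print(dn.pools)  # one datanode, blocks from two independent block pools
```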

HDFS High-Availability

The combination of replicating namenode metadata on multiple filesystems and using the secondary namenode to create checkpoints protects against data loss, but it does not provide high availability of the filesystem. The namenode is still a single point of failure (SPOF).

To support HDFS High-Availability, a pair of namenodes runs in an active-standby configuration. If the active namenode fails, the standby takes over its duties and continues servicing client requests without a significant interruption.

Last edited: 2017.11.27 04:52:35

