es Snapshot and Restore

Overview

These notes cover the es snapshot feature in two parts: one uses local disk storage, the other uses remote HDFS as storage. The outline is as follows:

0. Overview
1. Version
2. Install plugin
3. Disk
   - create repo
   - create snapshot
   - restore
   - step
4. HDFS
   - create hdfs repo
   - insert data
   - create hdfs snapshot
   - restore from hdfs
5. Restoring to a different cluster
   - registering repository
   - list snapshot
   - starting restore from a snapshot 
6. benchmark
   - snapshotting speed
   - restoring speed
7. plugin auto route
8. other
9. Reference

Version

elasticsearch-5.4.3.zip
repository-hdfs-5.4.3.zip

Install plugin

# the plugin zip must be specified as an absolute file:// path
bin/elasticsearch-plugin install file:///data/mapleleaf/es_snapshot/repository-hdfs-5.4.3.zip

# check hdfs master namenode ip and port using webhdfs
curl -i "http://localhost:8081/webhdfs/v1/?op=LISTSTATUS"

# start es as a daemon
sh bin/elasticsearch -d
# stop es (force-kill every elasticsearch process)
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9
# restart es in one line, then follow the log
ps aux | grep elasticsearch | grep -v "grep" | awk '{print $2}' | xargs kill -9 ; sleep 3 && sh bin/elasticsearch -d && ps aux | grep elasticsearch | grep -v "grep" && tail -f logs/es_snap.log
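
After `bin/elasticsearch -d` returns, the node still needs some time before it answers HTTP requests. A rough sketch of polling the HTTP port before running any of the curl commands that follow; `wait_for_http` is a made-up helper name, and `localhost:9200` is assumed from the rest of these notes.

```shell
# Hypothetical helper: poll a URL until it answers, up to N attempts (default 30).
# Returns 0 as soon as curl succeeds, 1 if every attempt fails.
wait_for_http() {
    url="$1"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        curl -s -o /dev/null --max-time 2 "$url" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage (assumed local node):
#   sh bin/elasticsearch -d && wait_for_http "http://localhost:9200" 60
```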

Disk

create repo

# add the line below to elasticsearch.yml
path.repo: ["/data/mapleleaf/es_snapshot/my_backup"]

# create repo, named: my_backup
curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -H 'Content-Type: application/json' -d '{
    "type": "fs",
    "settings": {
        "location": "/data/mapleleaf/es_snapshot/my_backup",
        "compress": true
    }
}'

curl -X GET "localhost:9200/_snapshot/my_backup?pretty"
curl -X DELETE "localhost:9200/_snapshot/my_backup"

create snapshot

# create snapshot
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true&pretty"
curl -X GET "localhost:9200/_snapshot/my_backup/*?pretty"
curl -X GET "localhost:9200/_snapshot/my_backup/snapshot_1/_status?pretty"
curl -X DELETE "localhost:9200/_snapshot/my_backup/snapshot_1?pretty"
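
Snapshot names have to be unique within a repository, so running the PUT above a second time with `snapshot_1` fails until that snapshot is deleted. One possible workaround, sketched here, is a timestamped name. `snapshot_name` is a made-up helper and the `nightly` prefix is only an example.

```shell
# Hypothetical helper: print "<prefix>_<YYYYMMDD>_<HHMMSS>" (lowercase, no spaces).
snapshot_name() {
    prefix="${1:-snapshot}"
    printf '%s_%s\n' "$prefix" "$(date +%Y%m%d_%H%M%S)"
}

# Usage (same repo as above):
#   curl -X PUT "localhost:9200/_snapshot/my_backup/$(snapshot_name nightly)?wait_for_completion=true&pretty"
```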

restore

# restore
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?pretty"
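
The restore call above writes back into the original index names, so an existing index has to be closed first. As a hedged alternative, the restore API also accepts `indices`, `rename_pattern` and `rename_replacement`, which lets a snapshot be restored next to the live index. A minimal sketch; the helper name `restore_body` and the `restored_` prefix are invented for illustration.

```shell
# Hypothetical helper: build a restore request body that restores one index
# under a new name (prefix + original name) instead of overwriting it.
restore_body() {
    index="$1"
    prefix="$2"
    # The $1 in the format string sits inside single quotes, so it stays literal:
    # it is the regex capture-group reference for rename_replacement.
    printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"%s$1"}\n' \
        "$index" "$prefix"
}

# Usage (assumed index name "customer"):
#   curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?pretty" \
#        -H 'Content-Type: application/json' -d "$(restore_body customer restored_)"
```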

step

check index

curl -X PUT "localhost:9200/customer" -H 'Content-Type: application/json' -d'
{
    "settings" : {
        "index" : {
            "number_of_shards" : 5, 
            "number_of_replicas" : 0 
        }
    }
}
'

curl -X GET "localhost:9200/_cat/indices?v"
curl -X DELETE "localhost:9200/customer?pretty"

insert data

for i in {1..10000};
do
    curl -s -X POST "localhost:9200/customer/external/?pretty" -H 'Content-Type: application/json' -d"
    {
      \"id\": ${i},
      \"num\": ${i},
      \"name\": \"John Doe\"
    }" > /dev/null
done
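
The loop above issues one HTTP request per document. A rough sketch of sending the same documents through the `_bulk` API in a single request instead; `gen_bulk` and the file `/tmp/customer_bulk.ndjson` are made-up names.

```shell
# Hypothetical helper: print N documents in _bulk NDJSON format
# (one action line followed by one source line per document).
gen_bulk() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        printf '{"index":{}}\n'
        printf '{"id":%d,"num":%d,"name":"John Doe"}\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Usage (same index and type as the loop above):
#   gen_bulk 10000 > /tmp/customer_bulk.ndjson
#   curl -s -X POST "localhost:9200/customer/external/_bulk?pretty" \
#        -H 'Content-Type: application/x-ndjson' \
#        --data-binary @/tmp/customer_bulk.ndjson > /dev/null
```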

insert docs

close index

curl -X POST "localhost:9200/customer/_close?pretty"

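restore
I had already stored one backup earlier, when the index held only 1 doc. After inserting 10,000 docs, closing the index and restoring, the index comes back in the state of the snapshot taken at that time.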
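
One way to confirm this behaviour without screenshots is to compare the document count before and after the restore. A minimal sketch that pulls the number out of a `_count` response; `parse_count` is a made-up helper, and the index name `customer` is the one used in this walkthrough.

```shell
# Hypothetical helper: read a _count response (compact or ?pretty) on stdin
# and print the value of its "count" field.
parse_count() {
    sed -n 's/.*"count" *: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (run once before closing the index and once after the restore):
#   curl -s "localhost:9200/customer/_count" | parse_count
```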

after restore

reinsert

curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
    "query": {
        "match_all": {}
    }
}'

reinsert

create snapshot_2

before

after

7 close & restore

HDFS

create hdfs repo

curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://xxxxx:xxxx",
    "path": "elasticsearch/respositories/my_hdfs_repository",
    "compress": true
  }
}'

If an exception occurs at this step, see here.

create repo succeeded

insert data

doc 10000

create hdfs snapshot

curl -X PUT "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1?wait_for_completion=true&pretty"

access_control_exception

Add the plugin's security configuration to jvm.options.
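
The exact line that was added is only visible in the screenshot, so the following is a guess rather than the author's setting. A commonly reported workaround is to point the JVM at the policy file shipped with the plugin via the standard `java.security.policy` system property. `add_jvm_option` is a made-up helper, and the policy path in the usage comment is an assumption that depends on the install directory.

```shell
# Hypothetical helper: append a JVM option to a jvm.options file
# unless that exact line is already present.
add_jvm_option() {
    file="$1"
    option="$2"
    grep -qxF -e "$option" "$file" 2>/dev/null || printf '%s\n' "$option" >> "$file"
}

# Usage (assumed path; adjust to the real install directory, then restart es):
#   add_jvm_option config/jvm.options \
#     "-Djava.security.policy=file:///path/to/elasticsearch/plugins/repository-hdfs/plugin-security.policy"
```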

fix access_control_exception

create snapshot succeeded

hdfs ls snapshot files

restore from hdfs

Add some arbitrary docs so that the index differs from its state at snapshot time, which makes the effect of the restore easy to observe.

doc 10000+

close index

doc index close

restore
curl -X POST "localhost:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"

restore succeeded

doc 10000

Restoring to a different cluster

All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.

curl -X GET "localhost:9201/_cat/indices?v"

clusterB initial

registering repository

curl -X PUT "localhost:9201/_snapshot/my_hdfs_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://xxxxx:xxxx",
    "path": "elasticsearch/respositories/my_hdfs_repository",
    "compress": true
  }
}'

registered using the same HDFS path as clusterA

list snapshot

curl -X GET "localhost:9201/_snapshot/my_hdfs_repository/*?pretty"

lists working snapshots

starting restore

curl -X POST "localhost:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_1/_restore?pretty"

restore succeeded

benchmark

The data is written with esrally.

before

snapshotting speed

hdfs before snapshot

# runs in the background (no wait_for_completion)
curl -X PUT "XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1" -H 'Content-Type: application/json' -d'
{
  "indices": "591_etl_fuhaochen_test_2018062500",
  "ignore_unavailable": true,
  "include_global_state": false
}'

# check running status
curl -X GET "XXX:9200/_snapshot/my_hdfs_repository/*?pretty"
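
Since the snapshot runs in the background, the status call has to be repeated until the state leaves `IN_PROGRESS`. A rough sketch of automating that; `snapshot_state` and `wait_for_snapshot` are made-up helpers, and the 10-second default interval is arbitrary.

```shell
# Hypothetical helper: read a snapshot-info response on stdin and
# print the first "state" value (e.g. IN_PROGRESS, SUCCESS, FAILED).
snapshot_state() {
    grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Hypothetical helper: poll one snapshot URL until it is no longer IN_PROGRESS.
# Returns 0 only if the final state is SUCCESS.
wait_for_snapshot() {
    url="$1"
    interval="${2:-10}"
    while :; do
        state="$(curl -s "$url" | snapshot_state)"
        echo "$(date '+%F %T') state=${state:-unknown}"
        [ "$state" = "IN_PROGRESS" ] || break
        sleep "$interval"
    done
    [ "$state" = "SUCCESS" ]
}

# Usage:
#   wait_for_snapshot "XXX:9200/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1"
```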

in_progress

success

hdfs after snapshot

restoring speed

date
curl -X POST "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true&pretty"
date
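
Wrapping the call in two `date` commands means the difference has to be worked out by hand. A small sketch that prints the elapsed seconds directly; `elapsed_seconds` is a made-up helper.

```shell
# Hypothetical helper: run a command and print its wall-clock duration in seconds.
# The command's own stdout is discarded so that only the number is printed.
elapsed_seconds() {
    start="$(date +%s)"
    "$@" > /dev/null
    rc=$?
    end="$(date +%s)"
    echo $((end - start))
    return "$rc"
}

# Usage:
#   elapsed_seconds curl -s -X POST \
#     "XXX:9201/_snapshot/my_hdfs_repository/snapshot_hdfs_long_1/_restore?wait_for_completion=true"
```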

after

Snapshotting takes far longer than restoring.

plugin auto route

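Test whether the plugin is routed automatically: does it have to be installed on every node (data nodes, master nodes, and so on), or is it enough to install it on one node of the es cluster, after which that node automatically distributes the plugin to the other nodes?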

health

nodes

plugins

Automatic routing is not available, so the plugin has to be installed on every node.
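
A rough sketch of checking this from the command line instead of reading the screenshots: compare the node list from `_cat/nodes` with the per-node plugin list from `_cat/plugins`. `nodes_missing_plugin` and the `/tmp` file names are made up, and the sketch assumes node names contain no spaces.

```shell
# Hypothetical helper: given a file of node names and a file of "node plugin"
# pairs, print every node that does not list the given plugin.
nodes_missing_plugin() {
    nodes_file="$1"
    plugins_file="$2"
    plugin="$3"
    while read -r node; do
        [ -n "$node" ] || continue
        awk -v n="$node" -v p="$plugin" \
            '$1 == n && $2 == p { found = 1 } END { exit !found }' \
            "$plugins_file" || echo "$node"
    done < "$nodes_file"
}

# Usage (assumed local node):
#   curl -s "localhost:9200/_cat/nodes?h=name" > /tmp/nodes.txt
#   curl -s "localhost:9200/_cat/plugins?h=name,component" > /tmp/plugins.txt
#   nodes_missing_plugin /tmp/nodes.txt /tmp/plugins.txt repository-hdfs
```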

other

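Tried to snapshot a larger index, but it failed with an error. The configuration should be fine (since the small index snapshotted successfully).

large index snapshot failed

small index snapshot succeeded

The "Self-suppression not permitted" error is probably caused by insufficient free space on the Hadoop DataNodes.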
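
If free space really is the cause, it can be checked before starting a large snapshot. A minimal sketch, assuming the usual column order of `hdfs dfs -df` (Filesystem, Size, Used, Available, Use%); `hdfs_available_bytes` and `has_room` are made-up helpers. Note that HDFS replication multiplies the space a snapshot actually consumes, so the comparison below is only a lower bound.

```shell
# Hypothetical helper: read `hdfs dfs -df <path>` output on stdin and
# print the "Available" column (bytes) of the first data row.
hdfs_available_bytes() {
    awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed only if available >= needed (both in bytes).
has_room() {
    available="$1"
    needed="$2"
    [ "$available" -ge "$needed" ]
}

# Usage (snapshots copy primary shards only, hence pri.store.size):
#   avail="$(hdfs dfs -df / | hdfs_available_bytes)"
#   size="$(curl -s "XXX:9200/_cat/indices/<index>?h=pri.store.size&bytes=b" | tr -d ' \n')"
#   has_room "$avail" "$size" || echo "not enough HDFS space for this snapshot"
```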

Reference

Last edited: 2019.02.01 11:53:46
