Syncing MySQL data to Elasticsearch


Environment

  • CentOS 7.2
  • PHP 5.5.7
  • MySQL 5.7

Install Elasticsearch

Download page: https://www.elastic.co/downloads/elasticsearch

Install version 2.4.6

wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/rpm/elasticsearch/2.4.6/elasticsearch-2.4.6.rpm

yum -y install elasticsearch-2.4.6.rpm

Environment:

  • Ubuntu 18.04
  • PHP 7.2.13
  • MySQL 5.7
  • Java 1.8+
  • Go 1.9+

Install Elasticsearch

Download page: https://www.elastic.co/downloads/elasticsearch

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.deb

gdebi elasticsearch-6.5.0.deb

After installation, configure Elasticsearch

vim /etc/elasticsearch/elasticsearch.yml

# configuration
.
.
.
cluster.name: zpdx-search    # cluster name
.
.
.
node.name: zpdx-1      # node name
.
.
.
path.data: /var/lib/elasticsearch    # data path
.
.
.
path.logs: /var/log/elasticsearch      # log path
.
.
.
network.host: 127.0.0.1      # host address (default)
.
.
.
http.port: 9200      # HTTP port (default)

Start Elasticsearch

systemctl start elasticsearch.service

Enable Elasticsearch at boot

sudo /bin/systemctl daemon-reload

sudo /bin/systemctl enable elasticsearch.service

Install Maven

sudo apt update

sudo apt install maven 

Install the IK analysis plugin

GitHub repository: elasticsearch-analysis-ik

git clone https://github.com/medcl/elasticsearch-analysis-ik.git

cd elasticsearch-analysis-ik

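Check out the tag that matches the installed Elasticsearch version. IK 1.10.6 targets Elasticsearch 2.4.6; for Elasticsearch 6.5.0, use the IK 6.5.0 release instead (tag names in the repository may carry a v prefix).

git checkout tags/1.10.6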

mvn clean

mvn compile

mvn package

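The build writes the plugin package elasticsearch-analysis-ik-1.10.6.zip to the target/releases directory of the repository.

Copy the package into the Elasticsearch plugins directory and unpack it into its own ik subdirectory

cp target/releases/elasticsearch-analysis-ik-1.10.6.zip /usr/share/elasticsearch/plugins/

cd /usr/share/elasticsearch/plugins/

mkdir ik

unzip elasticsearch-analysis-ik-1.10.6.zip -d ik

rm elasticsearch-analysis-ik-1.10.6.zip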

Restart Elasticsearch

systemctl restart elasticsearch.service

To test, run curl 'http://127.0.0.1:9200' or open http://127.0.0.1:9200 in a browser. The response looks like this:

{
  "name" : "zpdx-1",
  "cluster_name" : "zpdxshop-search",
  "cluster_uuid" : "g1wO4CR8TGOwfrDHY-vcGw",
  "version" : {
    "number" : "6.5.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "816e6f6",
    "build_date" : "2018-11-09T18:58:36.352602Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

Elasticsearch is now installed and running.
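The same check can be scripted. The following is a minimal Python sketch using only the standard library; it is not part of the original setup. The helper names summarize_root_info and fetch_root_info are hypothetical, and the URL assumes the default address configured above.

```python
import json
import urllib.request


def summarize_root_info(body):
    """Pick the fields worth checking out of the root endpoint's JSON body."""
    info = json.loads(body)
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": info["version"]["number"],
    }


def fetch_root_info(url="http://127.0.0.1:9200"):
    """Query a running node; this call needs a live Elasticsearch instance."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return summarize_root_info(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Offline demonstration with a trimmed copy of the response shown above.
    sample = (
        '{"name": "zpdx-1", "cluster_name": "zpdxshop-search", '
        '"version": {"number": "6.5.0"}}'
    )
    print(summarize_root_info(sample))
```

Calling fetch_root_info() against the running node and comparing the reported version with the installed package is a quick way to confirm that the expected instance answered.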

Install Go

sudo apt update

sudo apt install golang

Check the Go version

go version

Check the Go environment

go env

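Set up the Go environment: vim /etc/profile

# Point GOPATH at /usr/local/go to keep things easy to manage
# Append the following at the end of the file
.
.
.
# Go environment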
export GOPATH=/usr/local/go

Save and exit, then reload /etc/profile so the change takes effect

source /etc/profile

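Install go-mysql-elasticsearch

How go-mysql-elasticsearch works: on its first start, it uses mysqldump to take a full dump of the source MySQL database and writes that data to ES through an Elasticsearch client. It then acts as a MySQL client that connects to the source MySQL as a slave. The source, acting as master, sends every data change to the slave as binlog events; the program parses those events to obtain the changed rows and writes them to ES.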
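To make the second phase concrete, here is a rough Python sketch of how a parsed row change could be turned into Elasticsearch bulk API lines. It is an illustration only, not the tool's actual Go implementation; the function names and the event tuple layout are assumptions made for this example.

```python
import json


def row_event_to_bulk(action, index, doc_type, pk, row=None):
    """Translate one row change into bulk API lines (newline-delimited JSON)."""
    # The primary key becomes the document id, which is why tables
    # without a primary key cannot be kept in sync.
    meta = {"_index": index, "_type": doc_type, "_id": str(pk)}
    if action == "insert":
        return [json.dumps({"index": meta}), json.dumps(row)]
    if action == "update":
        return [json.dumps({"update": meta}), json.dumps({"doc": row})]
    if action == "delete":
        return [json.dumps({"delete": meta})]
    raise ValueError("unsupported action: %s" % action)


def build_bulk_body(events):
    """Join several row changes into one request body for POST /_bulk."""
    lines = []
    for action, index, doc_type, pk, row in events:
        lines.extend(row_event_to_bulk(action, index, doc_type, pk, row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    events = [
        ("insert", "test", "content", 1, {"title": "hello", "content": "world"}),
        ("update", "test", "content", 1, {"title": "hello again"}),
        ("delete", "test", "content", 1, None),
    ]
    print(build_bulk_body(events), end="")
```

Inserts carry the whole row, updates carry only the changed document fields, and deletes need nothing beyond the id.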

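Limitations

1. The MySQL binlog must use the ROW format.

2. Every table to be synced must have a primary key; tables without one are skipped. Without a primary key, UPDATE and DELETE operations cannot find the corresponding document in ES, so they cannot be synced.

3. Changing a table's structure while the program is running is not supported.

4. The MySQL account used for the connection needs the RELOAD, REPLICATION and SUPER privileges: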
       GRANT REPLICATION SLAVE ON *.* TO 'admin'@'127.0.0.1';
       GRANT RELOAD ON *.* TO 'admin'@'127.0.0.1';
       UPDATE mysql.user SET Super_Priv='Y' WHERE user='admin' AND host='127.0.0.1';
       FLUSH PRIVILEGES;

Check MySQL's binlog format and server_id; both must be set

// log in to MySQL first

mysql> show global variables like '%binlog_format%'; 
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.02 sec)

mysql> show variables like '%server_id%';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| server_id      | 100   |
| server_id_bits | 32    |
+----------------+-------+
2 rows in set (0.01 sec)

mysql> 
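The two checks above can be folded into a small helper. This is a minimal Python sketch under stated assumptions: the rows are the (Variable_name, Value) pairs printed by the SHOW VARIABLES queries, obtained by whatever MySQL client you use, and check_binlog_settings is a hypothetical name.

```python
def check_binlog_settings(rows):
    """Return a list of problems found in SHOW VARIABLES output rows."""
    variables = {name.lower(): str(value) for name, value in rows}
    problems = []
    # go-mysql-elasticsearch only understands row-based binlog events.
    if variables.get("binlog_format", "").upper() != "ROW":
        problems.append("binlog_format must be ROW")
    # A missing or zero server_id means replication is not configured.
    if variables.get("server_id", "0") == "0":
        problems.append("server_id must be set to a non-zero value")
    return problems


if __name__ == "__main__":
    rows = [("binlog_format", "ROW"), ("server_id", "100")]
    print(check_binlog_settings(rows) or "binlog settings look fine")
```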

Configure MySQL by editing /etc/mysql/mysql.conf.d/mysqld.cnf

vim /etc/mysql/mysql.conf.d/mysqld.cnf

.
.
.
[mysqld]
log-bin=mysql-bin
binlog_format="ROW"
server-id=100

Restart MySQL

systemctl restart mysql

Download and build go-mysql-elasticsearch

GitHub repository: go-mysql-elasticsearch

go get github.com/siddontang/go-mysql-elasticsearch

cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch

make

go-mysql-elasticsearch does not create the index automatically, so create it by hand

vim create_index.json

// add the following content
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "content": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart",     
          "search_analyzer": "ik_smart"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}

Create the index with curl

chase@chase-MACH-WX9:~$ curl -XPUT 'localhost:9200/test?pretty' -H 'Content-Type:application/json' -d'@create_index.json'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test"
}
chase@chase-MACH-WX9:~$ 

The index was created. Inspect it

chase@chase-MACH-WX9:~$ curl localhost:9200/test?pretty
{
  "test" : {
    "aliases" : { },
    "mappings" : {
      "content" : {
        "properties" : {
          "content" : {
            "type" : "text",
            "analyzer" : "ik_smart"
          },
          "title" : {
            "type" : "text",
            "analyzer" : "ik_smart"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1548855659726",
        "number_of_shards" : "1",
        "number_of_replicas" : "0",
        "uuid" : "F3ze_A2gQo6mE88aEQzKNA",
        "version" : {
          "created" : "6050099"
        },
        "provided_name" : "test"
      }
    }
  }
}
chase@chase-MACH-WX9:~$ 

Configure go-mysql-elasticsearch by editing $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch/etc/river.toml

# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "admin"
my_pass = "123456"
my_charset = "utf8"

# Set true when elasticsearch use https
#es_https = false
# Elasticsearch address
es_addr = "127.0.0.1:9200"
# Elasticsearch user and password, maybe set by shield, nginx, or x-pack
es_user = ""
es_pass = ""

# Path to store data, like master.info, if not set or empty,
# we must use this to support breakpoint resume syncing. 
# TODO: support other storage, like etcd. 
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

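# pseudo server id like a slave; it must be unique in the replication topology,
# so it should differ from the MySQL server's own server_id (100 in this setup)
server_id = 1001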

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
mysqldump = "mysqldump"

# if we have no privilege to use mysqldump with --master-data,
# we must skip it.
#skip_master_data = false

# minimal items to be inserted in one bulk
bulk_size = 128

# force flush the pending requests if we don't have enough items >= bulk_size
flush_bulk_time = "200ms"

# Ignore table without primary key
skip_no_pk_table = false

# MySQL data source
[[source]]
schema = "test"

# Only below tables will be synced into Elasticsearch.
# "t_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
tables = ["node"]

# Below is for special rule mapping

# Very simple example
# 
# desc t;
# +-------+--------------+------+-----+---------+-------+
# | Field | Type         | Null | Key | Default | Extra |
# +-------+--------------+------+-----+---------+-------+
# | id    | int(11)      | NO   | PRI | NULL    |       |
# | name  | varchar(256) | YES  |     | NULL    |       |
# +-------+--------------+------+-----+---------+-------+
# 
# The table `t` will be synced to ES index `test` and type `t`.
[[rule]]
schema = "test"
table = "node"
index = "test"
type = "content"
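The [[rule]] block decides where rows from a table end up. The Python sketch below illustrates that lookup for the rule above. It rests on an assumption about the tool's defaults (index falls back to the table name, type falls back to the index), it ignores wildcard tables, and resolve_target is a hypothetical name rather than anything from the project.

```python
def resolve_target(rules, schema, table):
    """Return the (index, type) pair that rows of schema.table are written to."""
    for rule in rules:
        if rule["schema"] == schema and rule["table"] == table:
            # Assumed defaults: index falls back to the table name,
            # type falls back to the index name.
            index = rule.get("index") or table
            doc_type = rule.get("type") or index
            return index, doc_type
    # No explicit rule: assume the table name is used for both.
    return table, table


if __name__ == "__main__":
    rules = [{"schema": "test", "table": "node", "index": "test", "type": "content"}]
    print(resolve_target(rules, "test", "node"))
```

With the configuration above, rows from test.node land in index test with type content, which matches the mapping created earlier.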

Start go-mysql-elasticsearch

cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch

./bin/go-mysql-elasticsearch -config=./etc/river.toml

Sync log output

[2019/01/30 21:48:31] [info] binlogsyncer.go:111 create BinlogSyncer with config {100 mysql 127.0.0.1 3306 admin   utf8 false false <nil> false false 0 0s 0s 0}
[2019/01/30 21:48:31] [info] dump.go:164 skip dump, use last binlog replication pos (mysql-bin.000001, 790) or GTID set %!s(<nil>)
[2019/01/30 21:48:31] [info] status.go:53 run status http server 127.0.0.1:12800
[2019/01/30 21:48:31] [info] binlogsyncer.go:323 begin to sync binlog from position (mysql-bin.000001, 790)
[2019/01/30 21:48:31] [info] binlogsyncer.go:172 register slave for master server 127.0.0.1:3306
[2019/01/30 21:48:31] [info] sync.go:31 start sync binlog at binlog file (mysql-bin.000001, 790)
[2019/01/30 21:48:31] [info] binlogsyncer.go:692 rotate to (mysql-bin.000001, 790)
[2019/01/30 21:48:31] [info] binlogsyncer.go:692 rotate to (mysql-bin.000002, 4)
[2019/01/30 21:48:31] [info] sync.go:73 rotate binlog to (mysql-bin.000001, 790)
[2019/01/30 21:48:31] [info] master.go:54 save position (mysql-bin.000001, 790)
[2019/01/30 21:48:31] [info] sync.go:73 rotate binlog to (mysql-bin.000002, 4)
[2019/01/30 21:48:31] [info] master.go:54 save position (mysql-bin.000002, 4)

At this point, the existing data has been fully imported into Elasticsearch.
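The "save position" lines in the log show how far the sync has progressed. As a small illustration, the Python sketch below pulls the most recent saved binlog position out of such log lines. The regular expression is written against the log format shown above and may need adjusting for other versions; last_saved_position is a hypothetical helper name.

```python
import re

# Matches lines such as: master.go:54 save position (mysql-bin.000002, 4)
SAVE_POSITION = re.compile(r"save position \((?P<file>[^,]+), (?P<pos>\d+)\)")


def last_saved_position(log_lines):
    """Return (binlog_file, offset) from the last 'save position' line, or None."""
    last = None
    for line in log_lines:
        match = SAVE_POSITION.search(line)
        if match:
            last = (match.group("file"), int(match.group("pos")))
    return last


if __name__ == "__main__":
    lines = [
        "[2019/01/30 21:48:31] [info] master.go:54 save position (mysql-bin.000001, 790)",
        "[2019/01/30 21:48:31] [info] sync.go:73 rotate binlog to (mysql-bin.000002, 4)",
        "[2019/01/30 21:48:31] [info] master.go:54 save position (mysql-bin.000002, 4)",
    ]
    print(last_saved_position(lines))
```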

Start go-mysql-elasticsearch at boot

On Ubuntu, install sysv-rc-conf

sudo apt update

sudo apt install sysv-rc-conf

If you get this error:

sudo apt-get install sysv-rc-conf
Reading package lists... Done
Building dependency tree       
Reading state information... Done       
E: Unable to locate package sysv-rc-conf

Fix: add a package source that provides it

vim /etc/apt/sources.list

// add the following line
deb http://archive.ubuntu.com/ubuntu/ trusty main universe restricted multiverse

sudo apt update

sudo apt install sysv-rc-conf

Create the go-mysql-elasticsearch init script

vim /etc/init.d/go-mysql-elasticsearch

// add the following content:
#!/bin/bash
#go-mysql-elasticsearch start
$GOPATH/src/github.com/siddontang/go-mysql-elasticsearch/bin/go-mysql-elasticsearch -config=$GOPATH/src/github.com/siddontang/go-mysql-elasticsearch/etc/river.toml

Run

sysv-rc-conf go-mysql-elasticsearch on

service go-mysql-elasticsearch start

On CentOS, install chkconfig

yum -y install chkconfig

Install elasticsearch-php

composer require elasticsearch/elasticsearch
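Once data is flowing, a search against the synced index only needs a JSON query body, whichever client sends it. The Python sketch below builds such a body for the title and content fields mapped earlier and wraps it in an HTTP request using the standard library. It is an illustration under stated assumptions: the function names are hypothetical, and the host and index are the values used throughout this walkthrough.

```python
import json
import urllib.request


def build_search_body(keyword, fields=("title", "content"), size=10):
    """Build a multi_match query over the fields analyzed with ik_smart."""
    return {
        "query": {"multi_match": {"query": keyword, "fields": list(fields)}},
        "size": size,
    }


def build_search_request(keyword, host="http://127.0.0.1:9200", index="test"):
    """Prepare (but do not send) a POST request to the index's _search endpoint."""
    data = json.dumps(build_search_body(keyword)).encode("utf-8")
    return urllib.request.Request(
        "%s/%s/_search" % (host, index),
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request("elasticsearch")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # With a live node: urllib.request.urlopen(request).read()
```

The same query body can be handed to another client library as the body of its search call.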

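Elasticsearch official documentation

elasticsearch-php official documentation

Real-time data sync between MySQL and Elasticsearch

Elasticsearch by example

Ruan Yifeng: Getting started with the full-text search engine Elasticsearch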
