Setting Up a Redis Cluster

爱喝咖啡的土拨鼠
2017.04.21 13:03

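My original plan was one master with two slaves, but that layout is hard to scale out, and under high concurrency it will generally not keep up with a cluster because every write goes through a single master. Redis has shipped built-in cluster support since 3.0, and the one-master-two-slaves-plus-Sentinel setup feels to me like it is on its way out. So I went straight to a cluster.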

  1. First, look at the architecture and principles of Redis Cluster
  2. Set up a Redis cluster
  3. Operations (migration, scaling out, scaling in, etc.)

This post is fairly long, so feel free to jump straight to the part you care about. Corrections are welcome.

1. Redis Cluster architecture and principles

  • [x] In brief:

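Before Redis released its official clustering solution, many large companies built their own. The core idea of those solutions was to shard the data across multiple Redis instances, with each shard being one Redis instance, but they all required extra middleware such as ZooKeeper.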

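Redis 3.0 Cluster uses a peer-to-peer model and is fully decentralized. Redis divides the whole key space into 16384 hash slots, and each master instance is responsible for a subset of them. All cluster metadata (nodes, ports, slot ownership, and so on) is kept up to date through periodic message exchange between the nodes. A client can send a request to any Redis instance; if that instance does not own the slot for the key, it replies with a redirect (a MOVED error) that points the client to the instance that does.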
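As a rough illustration of how a key ends up in one of the 16384 slots: the Redis Cluster specification defines the slot as CRC16(key) mod 16384, using the XMODEM variant of CRC16, and hashes only the text between the first "{" and the next "}" when that hash tag is non-empty. The Python sketch below is not from the original post, and the function names are made up for illustration.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honoring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag such as {user1000} narrows what gets hashed.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    for key in ("k1", "k2", "k4"):
        print(key, key_slot(key))
```

The slots printed for k1, k2 and k4 should line up with the slot numbers that redis-cli reports in the redirect messages of the session in this section.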

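In short, several instances form a cluster, the data is spread across them, and a cluster-aware client can connect to any instance and still reach the data. To ensure high availability, each master should also be given a slave.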

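  • [x] My questions:

I now have six instances configured as one cluster: three masters and three slaves. That raises some questions.

1. The cluster has several instances and no single shared port (the cluster I configure below uses 7000-7005, six instances on six ports), so which port do I write to? There are three masters, after all.

Answer:
You can send the write to any node, even a slave. The node does not execute the command on your behalf; it redirects the client to the master that owns the key's slot.
Below I connect to a slave (7002) and write, and the client is redirected to masters 7000, 7001 and 7003 as needed.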

[root@node1 src]# ./redis-cli -h master -c -p 7002
master:7002> set k1 v1
-> Redirected to slot [12706] located at 192.168.0.66:7001
OK
192.168.0.66:7001> set k2 v2
-> Redirected to slot [449] located at 192.168.0.66:7000
OK
192.168.0.66:7000> set k3 v3
OK
192.168.0.66:7000> set k4 v4
-> Redirected to slot [8455] located at 192.168.0.67:7003
OK
192.168.0.67:7003> set k5 v5
-> Redirected to slot [12582] located at 192.168.0.66:7001
OK
192.168.0.66:7001> set k6 v6
-> Redirected to slot [325] located at 192.168.0.66:7000
OK
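redis-cli -c follows these redirects for you. Under the hood, the node answers with an error reply of the form "MOVED <slot> <host>:<port>" (or "ASK ..." while a slot is being migrated), and the client reconnects to the address it was given. A minimal Python sketch of parsing such a reply follows; the Redirect type and parse_redirect function are hypothetical names, not part of any client library.

```python
from typing import NamedTuple


class Redirect(NamedTuple):
    kind: str   # "MOVED" or "ASK"
    slot: int   # hash slot the key belongs to
    host: str   # node that currently serves the slot
    port: int


def parse_redirect(error: str) -> Redirect:
    """Parse a cluster redirect such as 'MOVED 12706 192.168.0.66:7001'."""
    kind, slot, address = error.split(" ")
    if kind not in ("MOVED", "ASK"):
        raise ValueError(f"not a cluster redirect: {error!r}")
    # rpartition keeps working if the host part itself contains colons.
    host, _, port = address.rpartition(":")
    return Redirect(kind, int(slot), host, int(port))


if __name__ == "__main__":
    print(parse_redirect("MOVED 12706 192.168.0.66:7001"))
```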

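On the client side, here are two examples: Jedis for Java and acl for C++.

Java (full example)

import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import redis.clients.jedis.JedisPoolConfig;

public class JedisClusterDemo {
    public static void main(String[] args) {
        JedisPoolConfig poolConfig = new JedisPoolConfig();
        // Maximum number of connections per node pool
        poolConfig.setMaxTotal(1);
        // Maximum number of idle connections
        poolConfig.setMaxIdle(1);
        // Maximum time to wait for a connection (ms); if none is obtained in time,
        // a JedisException is thrown: Could not get a resource from the pool
        poolConfig.setMaxWaitMillis(1000);
        // Seed nodes (this sample uses its own addresses; replace them with your cluster's)
        Set<HostAndPort> nodes = new LinkedHashSet<HostAndPort>();
        nodes.add(new HostAndPort("192.168.83.128", 6379));
        nodes.add(new HostAndPort("192.168.83.128", 6380));
        nodes.add(new HostAndPort("192.168.83.128", 6381));
        nodes.add(new HostAndPort("192.168.83.128", 6382));
        nodes.add(new HostAndPort("192.168.83.128", 6383));
        nodes.add(new HostAndPort("192.168.83.128", 6384));
        JedisCluster cluster = new JedisCluster(nodes, poolConfig);
        String name = cluster.get("name");
        System.out.println(name);
        cluster.set("age", "18");
        System.out.println(cluster.get("age"));
        try {
            // close() declares IOException in Jedis 2.x, the version current when this was written
            cluster.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}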

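C++ (example; the two test helpers are not shown)

#include "acl_cpp/lib_acl.hpp"

// Helpers from the original acl sample; their bodies are not shown here,
// and these signatures are assumed from how they are called below.
static void test_redis_string(acl::redis_string& cmd, const char* key);
static void test_redis_key(acl::redis_key& cmd, const char* key);

int main(void)
{
    const char* redis_addr = "127.0.0.1:6379";
    int conn_timeout = 10, rw_timeout = 10, max_conns = 100;

    // Cluster management object for the redis client
    acl::redis_client_cluster cluster;
    // Add one redis server node. This can be called several times to add
    // more nodes, but because the acl redis module discovers and adds
    // cluster nodes automatically, a single node is enough.
    cluster.set(redis_addr, max_conns);

    // Command object for redis STRING operations
    acl::redis_string cmd_string;

    // Command object for redis KEY operations
    acl::redis_key cmd_key;

    // Bind the command objects to the cluster object
    cmd_string.set_cluster(&cluster, max_conns);
    cmd_key.set_cluster(&cluster, max_conns);

    const char* key = "test_key";

    // Exercise the redis cluster commands

    test_redis_string(cmd_string, key);
    test_redis_key(cmd_key, key);

    return 0;
}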

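Looking at these examples, there is no need to agonize over which instance to use. You define a cluster object and register the instances with it (in the Jedis example they are listed by hand; acl is nicer here because it discovers and registers nodes automatically), then do all reads and writes through that object. Which instance actually serves a request is the cluster object's internal business; read the client source if you are curious.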
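To make "the cluster object's internal business" a little more concrete: a cluster-aware client typically keeps a table of slot ranges and their owning masters, and looks up the owner before sending a command. The Python sketch below hard-codes the slot layout of the six-node cluster built later in this post. It is only an illustration of the idea, not how Jedis or acl are actually implemented, and node_for_slot is a made-up name.

```python
import bisect

# (first slot, last slot, master) for the cluster built later in this post.
SLOT_RANGES = [
    (0, 5460, "192.168.0.66:7000"),
    (5461, 10922, "192.168.0.67:7003"),
    (10923, 16383, "192.168.0.66:7001"),
]
_STARTS = [first for first, _, _ in SLOT_RANGES]


def node_for_slot(slot: int) -> str:
    """Return the master that owns a slot according to the local table."""
    if not 0 <= slot <= 16383:
        raise ValueError(f"slot out of range: {slot}")
    # Find the last range whose first slot is <= the requested slot.
    index = bisect.bisect_right(_STARTS, slot) - 1
    first, last, node = SLOT_RANGES[index]
    if slot > last:
        raise LookupError(f"slot {slot} is not covered by any master")
    return node


if __name__ == "__main__":
    for slot in (12706, 449, 8455):
        print(slot, "->", node_for_slot(slot))
```

A real client refreshes this table from the cluster (for example after a MOVED reply), which is why it keeps working after a failover changes the owner of a range.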

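2. Each master now has one slave. If a master dies, will its slave take over and become the master, as in Sentinel mode? And what about the port being different?

Answer:

The port question is already answered by question 1: all instances are registered with the cluster object, so if one port becomes unavailable the cluster object uses the others. As for whether the slave takes over when the master dies, let's run an experiment.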

[root@master ~]# redis-trib.rb check  master:7001
  >>> Performing Cluster Check (using node master:7001)
M: 4aa20bd4e759c302484c3e39dae9f875dfae1f35 master:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots: (0 slots) slave
   replicates 4aa20bd4e759c302484c3e39dae9f875dfae1f35
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 192.168.0.66:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96

The slave of master 7001 is 7005. Now kill 7001 and see whether 7005 becomes a master.
Along the way, verify the keys as well.

1. Look at the keys on 7001

[root@master src]#  ./redis-cli -h master -c -p 7001
master:7001> keys *
1) "k1"
2) "a2"
3) "a3"
4) "k5"
5) "k9"
master:7001> exit

2. Kill 7001

[root@master src]# lsof -i tcp:7001
COMMAND      PID USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
redis-ser 101142 root    4u  IPv4 6227467      0t0  TCP master:afs3-callback (LISTEN)
redis-ser 101142 root    8u  IPv4 6226915      0t0  TCP master:afs3-callback->node1:48580 (ESTABLISHED)
[root@master src]# kill -9 101142

3. Check the status. The check reports errors and 7005 has not become a master yet. Promotion takes some time, so wait a few seconds and try again.
[root@master src]# redis-trib.rb check  master:7001
[ERR] Sorry, can't connect to node master:7001
[root@master src]# redis-trib.rb check  master:7002
[ERR] Sorry, can't connect to node 192.168.0.66:7001
*** WARNING: 192.168.0.67:7005 claims to be slave of unknown node ID 4aa20bd4e759c302484c3e39dae9f875dfae1f35.
>>> Performing Cluster Check (using node master:7002)
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 master:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots: (0 slots) slave
   replicates 4aa20bd4e759c302484c3e39dae9f875dfae1f35
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.

4. Try again: 7005 has now become a master, so there are three masters and two slaves.
[root@master src]# redis-trib.rb check  master:7002
>>> Performing Cluster Check (using node master:7002)
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 master:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

5. Verify the keys on 7005: they are the same as on 7001.
[root@master src]#  ./redis-cli -h node1 -c -p 7005
node1:7005> keys *
1) "k9"
2) "a2"
3) "a3"
4) "k5"
5) "k1"


6. Now start 7001 again and see whether it becomes a slave.
[root@master src]# ./redis-server ../redis_cluster/7001/redis.conf 
[root@master src]# redis-trib.rb check  master:7002
>>> Performing Cluster Check (using node master:7002)
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 master:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 4aa20bd4e759c302484c3e39dae9f875dfae1f35 192.168.0.66:7001
   slots: (0 slots) slave
   replicates b0668184fd15a8bb891a0d98b7d6c2c7293b22c6
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
   
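As expected, 7001 is now a slave of 7005.

To sum up:

When a master in the cluster goes down, its slave is promoted to master. When the failed master is started again, it rejoins as a slave of the new master.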
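Reading the check output by eye works for six nodes; for a scripted check, the same text can be summarized. The Python sketch below parses output in the format shown above (the sample is the post-failover state from step 4) and reports the number of masters, slaves and covered slots. The parsing rules are inferred from the transcripts in this post, not from a documented format, and summarize is a hypothetical helper.

```python
import re

# Post-failover output from step 4 (status lines omitted).
CHECK_OUTPUT = """\
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 master:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
"""

NODE_LINE = re.compile(r"^([MS]): ([0-9a-f]+) (\S+)$")


def summarize(output: str):
    """Return (masters, slaves, covered_slots) for redis-trib check output."""
    masters = slaves = 0
    covered = set()
    for line in output.splitlines():
        line = line.strip()
        match = NODE_LINE.match(line)
        if match:
            if match.group(1) == "M":
                masters += 1
            else:
                slaves += 1
        elif line.startswith("slots:"):
            # "slots:0-5460 (5461 slots) master" -> "0-5460"
            spec = line[len("slots:"):].split("(")[0]
            for part in spec.split(","):
                part = part.strip()
                if not part:
                    continue
                first, _, last = part.partition("-")
                covered.update(range(int(first), int(last or first) + 1))
    return masters, slaves, len(covered)


if __name__ == "__main__":
    print(summarize(CHECK_OUTPUT))
```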


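2. Setting up the Redis cluster

A Redis cluster needs at least three masters.
To give every master a slave as well, the smallest practical setup is six instances: three masters and three slaves.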
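The reason for "at least three masters" is majority voting: marking a master as failed and promoting its slave both require agreement from a majority of the masters. The small Python sketch below works through that arithmetic; the function names are made up for illustration.

```python
def majority(masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    return masters // 2 + 1


def survives_one_failure(masters: int) -> bool:
    """True if the remaining masters still form a majority after one fails."""
    return masters - 1 >= majority(masters)


if __name__ == "__main__":
    for count in (1, 2, 3, 4, 5):
        print(count, "masters: majority =", majority(count),
              "survives one failure =", survives_one_failure(count))
```

With two masters the majority is two, so losing either one leaves the survivor unable to authorize a failover; three is the smallest count that tolerates a single master failure.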

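Two machines:
master 192.168.0.66
node1 192.168.0.67

(master and node1 are just hostnames I picked; they do not mean that the instances on the machine called master are master nodes.)

Step 1: Install GCC

Otherwise make will fail later with errors such as "cc: Command not found" and "make: *** [adlist.o] Error 127". Skip this step if GCC is already installed.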

yum  install  gcc

Step 2: Compile and install Redis
cd redis-3.2.8
make && make install

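(If you hit the error "zmalloc.h:50:31: error: jemalloc/jemalloc.h: No such file or directory", it is about the memory allocator. If the MALLOC variable is set, Redis is built with the allocator it names.

On Linux the default allocator is jemalloc rather than libc, because jemalloc has proven to have fewer fragmentation problems than libc malloc.

If jemalloc is not available and you only have libc, make fails, so pass the allocator explicitly.

Fix:
make MALLOC=libc)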

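Step 3: Copy redis-trib.rb to /usr/local/bin

redis-trib.rb is the official tool for managing a Redis cluster. It ships in the src directory of the Redis source tree and wraps the cluster commands provided by Redis into a simple, convenient utility. It was written in Ruby by the author of Redis.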

cd src
cp redis-trib.rb /usr/local/bin/

Step 4: Create the Redis nodes

Create the Redis nodes on the master machine (192.168.0.66).

1. Create the redis_cluster directory
cd /usr/local/software/redis-3.2.8
mkdir redis_cluster

2. Under redis_cluster, create directories named 7000, 7001 and 7002, and copy redis.conf into each of them

cd redis_cluster
mkdir 7000 7001 7002
cp /usr/local/software/redis-3.2.8/redis.conf 7000/redis.conf
cp /usr/local/software/redis-3.2.8/redis.conf 7001/redis.conf
cp /usr/local/software/redis-3.2.8/redis.conf 7002/redis.conf

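3. Edit each configuration file (the text after // is annotation only; do not copy it into redis.conf)

port  7000                              //port: 7000, 7001, 7002 respectively
bind <this machine's IP>                //the default is 127.0.0.1; change it to an IP the other machines can reach, otherwise the ports are unreachable and the cluster cannot be created
daemonize    yes                        //run Redis in the background
pidfile  /var/run/redis_7000.pid        //pidfile matching the port: 7000, 7001, 7002
cluster-enabled  yes                    //enable cluster mode; remove the leading # to uncomment
cluster-config-file  nodes_7000.conf    //cluster state file, generated automatically on first start: 7000, 7001, 7002
cluster-node-timeout  15000             //node timeout in milliseconds; 15 seconds by default, adjust as needed
appendonly  yes                         //enable the AOF log if you need it; every write operation is appended to the log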
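Editing six files by hand invites slips, so the per-port settings above can also be generated. The Python sketch below only renders the directives listed in this step for a given port and bind address; the render_overrides name is hypothetical, and where the text is written is left to the reader.

```python
def render_overrides(port: int, bind_ip: str) -> str:
    """Render the cluster-related redis.conf directives for one instance."""
    settings = {
        "port": port,
        "bind": bind_ip,
        "daemonize": "yes",
        "pidfile": f"/var/run/redis_{port}.pid",
        "cluster-enabled": "yes",
        "cluster-config-file": f"nodes_{port}.conf",
        "cluster-node-timeout": 15000,  # milliseconds
        "appendonly": "yes",
    }
    return "\n".join(f"{name} {value}" for name, value in settings.items()) + "\n"


if __name__ == "__main__":
    for port in (7000, 7001, 7002):
        print(f"# {port}/redis.conf")
        print(render_overrides(port, "192.168.0.66"))
```

One possible use, as an assumption rather than something the original setup did, is to append the rendered text to each copied redis.conf, relying on Redis using the last occurrence of a directive.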

Then repeat the three steps above on the other machine, node1 (192.168.0.67), using directories 7003, 7004 and 7005 and adjusting the configuration files accordingly.

Step 5: Start Redis
[root@master src]# ./redis-server ../redis_cluster/7000/redis.conf 
[root@master src]# ./redis-server ../redis_cluster/7001/redis.conf 
[root@master src]# ./redis-server ../redis_cluster/7002/redis.conf 

[root@node1 src]# ./redis-server ../redis_cluster/7003/redis.conf
[root@node1 src]# ./redis-server ../redis_cluster/7004/redis.conf
[root@node1 src]# ./redis-server ../redis_cluster/7005/redis.conf

Step 6: Check the status
[root@node1 src]# ps -ef | grep redis
root      22641      1  0 15:52 ?        00:00:00 ./redis-server node1:7003 [cluster]
root      22648      1  0 15:52 ?        00:00:00 ./redis-server node1:7004 [cluster]
root      22653      1  0 15:52 ?        00:00:00 ./redis-server node1:7005 [cluster]
root      22733  20035  0 15:54 pts/0    00:00:00 grep --color=auto redis

[root@node1 src]# netstat -tlnp | grep redis
tcp        0      0 192.168.0.67:17003      0.0.0.0:*               LISTEN      22641/./redis-serve 
tcp        0      0 192.168.0.67:17004      0.0.0.0:*               LISTEN      22648/./redis-serve 
tcp        0      0 192.168.0.67:17005      0.0.0.0:*               LISTEN      22653/./redis-serve 
tcp        0      0 192.168.0.67:7003       0.0.0.0:*               LISTEN      22641/./redis-serve 
tcp        0      0 192.168.0.67:7004       0.0.0.0:*               LISTEN      22648/./redis-serve 
tcp        0      0 192.168.0.67:7005       0.0.0.0:*               LISTEN      22653/./redis-serve 
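The extra listeners on 17003-17005 are the cluster bus: each cluster node also listens on its client port plus 10000 for node-to-node traffic, so both ports must be reachable between the machines. The Python sketch below computes the bus port and probes whether a TCP port accepts connections; both function names are made up for illustration.

```python
import socket

CLUSTER_BUS_OFFSET = 10000  # default offset between client port and bus port


def cluster_bus_port(client_port: int) -> int:
    """Return the cluster bus port that belongs to a client port."""
    return client_port + CLUSTER_BUS_OFFSET


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling is_port_open("192.168.0.67", cluster_bus_port(7003)) from the other machine is a quick way to spot a firewall that lets 7003 through but blocks 17003.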

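The other machine looks similar.

At this point there is still no cluster, only six Redis instances. The next step creates the cluster.

Step 7: Create the cluster

First install Ruby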

yum -y install ruby ruby-devel rubygems rpm-build

gem install redis

Create the cluster:

This form is wrong:
redis-trib.rb  create  --replicas  1  master:7000 master:7001 master:7002 node1:7003 node1:7004 node1:7005
It fails with: ERR Invalid node address specified: master:7000 (Redis::CommandError)
redis-trib.rb does not handle domain names or hostnames well, so use the ip:port form when creating the cluster.


The correct form:
redis-trib.rb  create  --replicas  1  192.168.0.66:7000 192.168.0.66:7001 192.168.0.66:7002 192.168.0.67:7003 192.168.0.67:7004 192.168.0.67:7005
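If the node list is kept as hostnames, it can be resolved to ip:port form before it is handed to redis-trib.rb. The Python sketch below does this with the standard resolver and builds the argument list for the create command used above; to_ip_port and create_args are hypothetical helper names, and the script does not run the command.

```python
import socket


def to_ip_port(address: str) -> str:
    """Resolve 'host:port' to 'ip:port' using the system resolver."""
    host, _, port = address.rpartition(":")
    return f"{socket.gethostbyname(host)}:{port}"


def create_args(addresses, replicas: int = 1):
    """Build the argv for 'redis-trib.rb create --replicas N ip:port ...'."""
    nodes = [to_ip_port(address) for address in addresses]
    return ["redis-trib.rb", "create", "--replicas", str(replicas)] + nodes


# Example (needs 'master' and 'node1' to resolve on the machine running it):
#   create_args(["master:7000", "master:7001", "master:7002",
#                "node1:7003", "node1:7004", "node1:7005"])
```

The resulting list can be passed to subprocess.run as is; addresses that are already IPs pass through unchanged.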

However, because of the failed attempt above, the cluster cannot simply be created again. First delete the cluster-config-file files and restart Redis.

cd /usr/local/software/redis-3.2.8/src/
[root@node1 src]# rm nodes_*
rm: remove regular file ‘nodes_7003.conf’? y
rm: remove regular file ‘nodes_7004.conf’? y
rm: remove regular file ‘nodes_7005.conf’? y
[root@node1 src]# lsof -i tcp:7003
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
redis-ser 76282 root    4u  IPv4 761222      0t0  TCP node1:afs3-vlserver (LISTEN)
[root@node1 src]# lsof -i tcp:7004
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
redis-ser 76275 root    4u  IPv4 761196      0t0  TCP node1:afs3-kaserver (LISTEN)
[root@node1 src]# lsof -i tcp:7005
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
redis-ser 76255 root    4u  IPv4 761166      0t0  TCP node1:afs3-volser (LISTEN)
[root@node1 src]# kill -9 76282
[root@node1 src]# kill -9 76275
[root@node1 src]# kill -9 76255
[root@node1 src]# ./redis-server ../redis_cluster/7003/redis.conf 
[root@node1 src]# ./redis-server ../redis_cluster/7004/redis.conf 
[root@node1 src]# ./redis-server ../redis_cluster/7005/redis.conf 


Do the same for 7000, 7001 and 7002.

[root@master src]# redis-trib.rb  create  --replicas  1  192.168.0.66:7000 192.168.0.66:7001 192.168.0.66:7002 192.168.0.67:7003 192.168.0.67:7004 192.168.0.67:7005
>>> Creating cluster
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.0.66:7000
192.168.0.67:7003
192.168.0.66:7001
Adding replica 192.168.0.67:7004 to 192.168.0.66:7000
Adding replica 192.168.0.66:7002 to 192.168.0.67:7003
Adding replica 192.168.0.67:7005 to 192.168.0.66:7001
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
M: 4aa20bd4e759c302484c3e39dae9f875dfae1f35 192.168.0.66:7001
   slots:10923-16383 (5461 slots) master
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 192.168.0.66:7002
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
S: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   replicates 4aa20bd4e759c302484c3e39dae9f875dfae1f35
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join....
>>> Performing Cluster Check (using node 192.168.0.66:7000)
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
S: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots: (0 slots) slave
   replicates 4aa20bd4e759c302484c3e39dae9f875dfae1f35
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 192.168.0.66:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 4aa20bd4e759c302484c3e39dae9f875dfae1f35 192.168.0.66:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
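The slot ranges in this output (5461, 5462 and 5461 slots) come from dividing 16384 slots as evenly as possible across three masters. The Python sketch below reproduces those ranges with a simple cumulative-rounding split. It reflects my understanding of how the tool allocates slots, not a copy of its code, and split_slots is a made-up name.

```python
import math

TOTAL_SLOTS = 16384


def split_slots(masters: int):
    """Split slots 0..16383 into contiguous (first, last) ranges, one per master."""
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(masters):
        # Round the running boundary half up so the sizes differ by at most one.
        last = math.floor(cursor + per_node - 1 + 0.5)
        if index == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


if __name__ == "__main__":
    for first, last in split_slots(3):
        print(f"slots:{first}-{last} ({last - first + 1} slots)")
```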



Step 8: Verify the cluster
redis-trib.rb check  master:7001 (any node will do)

[root@master src]# redis-trib.rb check  master:7001
>>> Performing Cluster Check (using node master:7001)
M: 4aa20bd4e759c302484c3e39dae9f875dfae1f35 master:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: b0668184fd15a8bb891a0d98b7d6c2c7293b22c6 192.168.0.67:7005
   slots: (0 slots) slave
   replicates 4aa20bd4e759c302484c3e39dae9f875dfae1f35
M: 1783142fb99cda09ebd103fe169134871106bcdc 192.168.0.67:7003
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 92a5d77c8abd7fbf8de6d66eed92e324d1cbbf08 192.168.0.66:7002
   slots: (0 slots) slave
   replicates 1783142fb99cda09ebd103fe169134871106bcdc
M: aae8c65a7595586befa7e2071bd385723f3eee96 192.168.0.66:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 77dd396290fb2bf1bdb63a2ae23f7bd5ab28c392 192.168.0.67:7004
   slots: (0 slots) slave
   replicates aae8c65a7595586befa7e2071bd385723f3eee96
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Now write on 7001 and verify that the value can be read from 7005.

Write:
[root@master src]#  ./redis-cli -h master -c -p 7001
master:7001> set k1 v1

Read:
[root@node1 src]# ./redis-cli -h node1 -c -p 7005
node1:7005> get k1
-> Redirected to slot [12706] located at 192.168.0.66:7001
"v1"

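The read was redirected to 7001, which confirms what part 1 said: a Redis cluster spreads the data across nodes, you can read from any node, and when the key is not there the client is redirected to the node that owns it.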

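3. Operations (migration, scaling out, scaling in, etc.)

redis-trib.rb, already used above to create the cluster, is the official cluster management tool from Redis. It provides the following commands:

1. create: create a cluster
2. check: check a cluster
3. info: show cluster information
4. fix: repair a cluster
5. reshard: migrate slots online
6. rebalance: balance the number of slots across cluster nodes
7. add-node: add a new node to the cluster
8. del-node: remove a node from the cluster
9. set-timeout: set the heartbeat timeout between cluster nodes
10. call: run a command on every node in the cluster
11. import: import data from an external Redis instance into the cluster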
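When these commands are driven from a script, it helps to reject typos in the subcommand name before anything is executed. The Python sketch below builds a redis-trib.rb command line from the subcommands listed above. trib_command is a hypothetical helper; it only formats the command and does not validate the arguments each subcommand expects.

```python
import shlex

# Subcommands listed in this section.
SUBCOMMANDS = {
    "create", "check", "info", "fix", "reshard", "rebalance",
    "add-node", "del-node", "set-timeout", "call", "import",
}


def trib_command(subcommand: str, *args: str) -> str:
    """Return a shell-safe redis-trib.rb command line for a known subcommand."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown redis-trib.rb subcommand: {subcommand}")
    return shlex.join(["redis-trib.rb", subcommand, *args])


if __name__ == "__main__":
    print(trib_command("check", "192.168.0.66:7000"))
    print(trib_command("info", "192.168.0.66:7000"))
```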