EMQ High Availability + Load Balancing Architecture

1. EMQ High Availability + Load Balancing Deployment

1.1. Overall Architecture Plan

  • Architecture diagram
emq架构图.jpg
  • Server plan

Server IP      Deployed services      Role
172.16.40.20   emqttd                 EMQ cluster node
172.16.40.21   emqttd                 EMQ cluster node
172.16.40.22   haproxy, keepalived    HA and LB
172.16.40.23   haproxy, keepalived    HA and LB

1.2. EMQ Cluster Deployment

172.16.40.20 and 172.16.40.21 serve as the EMQ cluster servers; both run the emqttd service.

1.2.1. EMQ Deployment on 172.16.40.20

  • Install EMQ via rpm
yum install -y wget
[root@dz home]# wget http://emqtt.com/static/brokers/emqttd-centos7-v2.3.11-1.el7.centos.x86_64.rpm
--2018-08-07 14:01:22--  http://emqtt.com/static/brokers/emqttd-centos7-v2.3.11-1.el7.centos.x86_64.rpm
Resolving emqtt.com (emqtt.com)... 106.185.34.253
Connecting to emqtt.com (emqtt.com)|106.185.34.253|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17140188 (16M) [application/x-redhat-package-manager]
Saving to: 'emqttd-centos7-v2.3.11-1.el7.centos.x86_64.rpm'

100%[===================================================================================================================>] 17,140,188  11.8MB/s  in 1.4s

2018-08-07 14:01:24 (11.8 MB/s) - 'emqttd-centos7-v2.3.11-1.el7.centos.x86_64.rpm' saved [17140188/17140188]

[root@dz home]# rpm -ivh emqttd-centos7-v2.3.11-1.el7.centos.x86_64.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:emqttd-2.3-1.el7.centos          ################################# [100%]
Created symlink from /etc/systemd/system/multi-user.target.wants/emqttd.service to /usr/lib/systemd/system/emqttd.service.
[root@dz home]# yum install -y lksctp-tools

  • Configuration files
    EMQ main config: /etc/emqttd/emq.conf; plugin configs: /etc/emqttd/plugins/*.conf.

  • Log files
    Log directory: /var/log/emqttd

  • Data files
    Data directory: /var/lib/emqttd/

  • Start / stop
    systemctl start|stop|restart emqttd.service
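After installation it is worth confirming the broker actually came up and is listening. A quick health-check sketch, assuming EMQ's stock listeners (1883 MQTT, 8883 MQTT/SSL, 18083 dashboard):

```shell
# Start the broker and confirm the service is healthy
systemctl start emqttd.service
systemctl status emqttd.service --no-pager

# Broker status via the management CLI
emqttd_ctl status

# Default listeners: 1883 (MQTT), 8883 (MQTT/SSL), 18083 (dashboard)
ss -lnt | grep -E ':(1883|8883|18083)'
```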

1.2.2. EMQ Concurrency Testing

Tool: https://github.com/emqtt/emqtt_benchmark

yum install -y openssl-devel automake autoconf ncurses-devel gcc gcc-c++
# Erlang downloads are very slow from here; fetch the source over a VPN and upload it to /root/.kerl/archives
# Download page: http://erlang.org/download/?N=D
# otp_src_17.5.tar.gz
# yum list | grep ODBC
# yum install unixODBC-devel
# ./configure  --without-javac
# make
# make install
(cd "/usr/local/lib/erlang" \
 && ./Install  -minimal "/usr/local/lib/erlang")

/usr/bin/install -c -m 644 "/home/otp_src_17.5/OTP_VERSION" "/usr/local/lib/erlang/releases/17"
cd /usr/local/bin
rm -f erl
rm -f erlc
rm -f epmd
rm -f run_erl
rm -f to_erl
rm -f dialyzer
rm -f typer
rm -f escript
rm -f ct_run
ln -s ../lib/erlang/bin/erl erl
ln -s ../lib/erlang/bin/erlc erlc
ln -s ../lib/erlang/bin/epmd epmd
ln -s ../lib/erlang/bin/run_erl run_erl
ln -s ../lib/erlang/bin/to_erl to_erl
ln -s ../lib/erlang/bin/dialyzer dialyzer
ln -s ../lib/erlang/bin/typer typer
ln -s ../lib/erlang/bin/escript escript
ln -s ../lib/erlang/bin/ct_run ct_run
[root@dz otp_src_17.5]# erl --version
Erlang/OTP 17 [erts-6.4] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V6.4  (abort with ^G)
1> 

  • Install emqtt_benchmark
yum install -y git
git clone https://github.com/emqtt/emqtt_benchmark.git
cd /home/emqtt_benchmark
make

  • Test commands (-c concurrent connections, -I publish interval in ms, -t topic template where %i is the client index, -s payload size in bytes)
# With username and password
./emqtt_bench_pub  -h 172.18.40.43 -u lpmqtt -P password -p 1883 -c 35000 -I 10 -t bench21/%i -s 256
# Without username and password
./emqtt_bench_pub  -h 172.16.40.20 -p 1883 -c 35000 -I 10 -t bench21/%i -s 256

This run attempted 35,000 concurrent connections, but the emqttd server could only establish about 1,011 of them. Reaching high concurrency requires tuning kernel parameters on both the server and the client.

Server side:
[root@dz home]# ulimit -n
1024
[root@dz home]# netstat -nat|grep -i "1883"|wc -l
1013
[root@dz home]# 

1.2.3. Tuning EMQ Server Kernel Parameters

Official reference: http://www.emqtt.com/docs/v2/tune.html

  • Tune kernel parameters on both the server and the client

Server-side changes:

cat << EOF >> /etc/sysctl.conf
fs.file-max=2097152 
fs.nr_open=2097152
net.core.somaxconn=32768
net.ipv4.tcp_max_syn_backlog=16384
net.core.netdev_max_backlog=16384
net.ipv4.ip_local_port_range=1000 65535
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_rmem=1024 4096 16777216
net.ipv4.tcp_wmem=1024 4096 16777216
net.nf_conntrack_max=1000000
net.netfilter.nf_conntrack_max=1000000
net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
net.ipv4.tcp_max_tw_buckets=1048576
net.ipv4.tcp_fin_timeout = 15
EOF

cat << EOF >>/etc/security/limits.conf
*      soft   nofile      1048576
*      hard   nofile      1048576
EOF

echo DefaultLimitNOFILE=1048576 >>/etc/systemd/system.conf 

cd /etc/emqttd/
cp emq.conf emq.confback
sed -i 's%^node\.process_limit = .*%node.process_limit = 2097152%g' /etc/emqttd/emq.conf
sed -i 's%^node\.max_ports = .*%node.max_ports = 1048576%g' /etc/emqttd/emq.conf
sed -i 's%^listener\.tcp.external\.acceptors = .*%listener.tcp.external.acceptors = 64%g' /etc/emqttd/emq.conf
sed -i 's%^listener\.tcp\.external\.max_clients = .*%listener.tcp.external.max_clients = 1000000%g' /etc/emqttd/emq.conf
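The sed substitutions above fail silently if a pattern is mistyped (it simply matches nothing). A cheap sanity check is to run the same substitutions on a scratch file first; a minimal sketch with hypothetical starting values:

```shell
# Dry-run the emq.conf substitutions on a scratch file (values are hypothetical)
tmp=$(mktemp)
printf 'node.process_limit = 256000\nnode.max_ports = 65536\n' > "$tmp"

sed -i 's%^node\.process_limit = .*%node.process_limit = 2097152%' "$tmp"
sed -i 's%^node\.max_ports = .*%node.max_ports = 1048576%' "$tmp"

# Both keys should now show the tuned values
grep -E '^node\.(process_limit|max_ports)' "$tmp"
rm -f "$tmp"
```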
 

Client side:

# Takes effect immediately (not persistent)
sysctl -w net.ipv4.ip_local_port_range="500 65535"
echo 1000000 > /proc/sys/fs/nr_open
ulimit -n 100000
# Persist across reboots
cat << EOF >>/etc/security/limits.conf
*      soft   nofile      100000
*      hard   nofile      100000
EOF

cat << EOF >> /etc/sysctl.conf
fs.nr_open=1000000
net.ipv4.ip_local_port_range=500 65535
EOF
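Appending to /etc/sysctl.conf does not apply anything by itself; a reload plus a quick readback confirms the new values took effect:

```shell
# Reload sysctl settings from /etc/sysctl.conf and verify
sysctl -p
sysctl net.ipv4.ip_local_port_range fs.nr_open

# The nofile limit from limits.conf only applies to new login sessions
ulimit -n
```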
  • Re-run the test after the kernel parameters are in place
connected: 34995
connected: 34996
connected: 34997
connected: 34998
connected: 34999
connected: 35000
sent(563167): total=11901013, rate=40083(msg/sec)
sent(564711): total=11938169, rate=37156(msg/sec)
sent(566279): total=11975367, rate=37198(msg/sec)
sent(567771): total=12012594, rate=37227(msg/sec)
sent(569335): total=12049838, rate=37244(msg/sec)
# Client
[root@dz ~]# ulimit -n
100000
[root@dz ~]# netstat -nat|grep -i "1883"|wc -l
32518
[root@dz ~]# 

# Server
[root@dz ~]# ulimit -n
1048576
[root@dz ~]# netstat -nat|grep -i "1883"|wc -l
32520
[root@dz ~]# 

Issue: the client initiated 35,000 connections in total, but only about 32,520 were established. This is puzzling, since the client's port range is 500-65535.
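Port exhaustion alone does not explain the shortfall: the configured range leaves roughly 65,000 ephemeral ports toward a single (destination IP, port) pair, well above 35,000. A quick arithmetic check, plus the counters most likely to be the real bottleneck (conntrack table, file-descriptor limits, or limits inside the benchmark tool itself):

```shell
# Theoretical per-destination connection ceiling for range 500-65535
low=500
high=65535
echo $((high - low + 1))    # 65036 usable source ports

# Likely suspects to inspect while the benchmark runs:
# cat /proc/sys/net/netfilter/nf_conntrack_count   # compare against nf_conntrack_max
# ss -s                                            # socket summary incl. timewait
# dmesg | tail                                     # conntrack / fd exhaustion messages
```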

1.2.4. EMQ Deployment on 172.16.40.21

Deployment on 172.16.40.21 is identical to that on 172.16.40.20 and is not repeated here.

1.2.5. Building the EMQ Cluster

Cluster configuration for 172.16.40.20 and 172.16.40.21:

1.2.5.1. EMQ Configuration on 172.16.40.21

sed -i 's%^node\.name = .*%node\.name = emqttd@172.16.40.21%g' /etc/emqttd/emq.conf
sed -i 's%^cluster\.name = .*%cluster.name = dz_mqtt%g' /etc/emqttd/emq.conf
[root@dz ~]# systemctl restart emqttd

1.2.5.2. EMQ Configuration on 172.16.40.20

sed -i 's%^node\.name = .*%node\.name = emqttd@172.16.40.20%g' /etc/emqttd/emq.conf
sed -i 's%^cluster\.name = .*%cluster.name = dz_mqtt%g' /etc/emqttd/emq.conf
[root@dz ~]# systemctl restart emqttd

Note: if the cluster name or node name is changed, restarting the service alone may not pick up the change; reboot the server in that case.
Then run the cluster join command:

# On the 172.16.40.20 server
# Query the EMQ node status
[root@dz emqttd]# emqttd_ctl status
Node 'emqttd@172.16.40.20' is started
emqttd 2.3.11 is running
[root@dz emqttd]# 

[root@dz emqttd]# 
# On the 172.16.40.21 server
[root@dz ~]# emqttd_ctl cluster join emqttd@172.16.40.20
Join the cluster successfully.
Cluster status: [{running_nodes,['emqttd@172.16.40.20',
                                 'emqttd@172.16.40.21']}]
[root@dz ~]# 
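Once joined, cluster membership can be confirmed (and later undone) from the EMQ 2.x CLI:

```shell
# Show the current cluster membership (run on any node)
emqttd_ctl cluster status

# To take a node out of the cluster later:
# emqttd_ctl cluster leave                          # run on the departing node
# emqttd_ctl cluster remove emqttd@172.16.40.21     # run on a remaining node
```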

1.3. haproxy Deployment

Deployment and configuration are identical on 172.16.40.22 and 172.16.40.23.

1.3.1. Installing haproxy

[root@dz home]# yum install -y pcre-devel  bzip2-devel  gcc gcc-c++ make
[root@dz home]# tar -zxvf haproxy-1.8.13.tar.gz 
[root@dz home]# cd haproxy-1.8.13
[root@dz haproxy-1.8.13]# make TARGET=linux2628 PREFIX=/usr/local/haproxy
[root@dz haproxy-1.8.13]# make install PREFIX=/usr/local/haproxy
install -d "/usr/local/haproxy/sbin"
install haproxy  "/usr/local/haproxy/sbin"
install -d "/usr/local/haproxy/share/man"/man1
install -m 644 doc/haproxy.1 "/usr/local/haproxy/share/man"/man1
install -d "/usr/local/haproxy/doc/haproxy"
for x in configuration management architecture peers-v2.0 cookie-options lua WURFL-device-detection proxy-protocol linux-syn-cookies network-namespaces DeviceAtlas-device-detection 51Degrees-device-detection netscaler-client-ip-insertion-protocol peers close-options SPOE intro; do \
    install -m 644 doc/$x.txt "/usr/local/haproxy/doc/haproxy" ; \
done
[root@dz haproxy-1.8.13]# 
[root@dz haproxy-1.8.13]# /usr/local/haproxy/sbin/haproxy -v
HA-Proxy version 1.8.13 2018/07/30
Copyright 2000-2018 Willy Tarreau <willy@haproxy.org>

[root@dz haproxy-1.8.13]# 
[root@dz haproxy-1.8.13]# mkdir /etc/haproxy
[root@dz haproxy-1.8.13]# groupadd haproxy
[root@dz haproxy-1.8.13]# useradd -s /sbin/nologin -M -g haproxy haproxy    # create a non-login haproxy account (user and group) to run haproxy
[root@dz haproxy-1.8.13]# cp examples/haproxy.init /etc/init.d/haproxy
[root@dz haproxy-1.8.13]# chmod 755 /etc/init.d/haproxy
[root@dz haproxy-1.8.13]# chkconfig --add haproxy
[root@dz haproxy-1.8.13]# cp /usr/local/haproxy/sbin/haproxy /usr/sbin/
[root@dz haproxy-1.8.13]# 


1.3.2. Adding the Configuration File

cat <<EOF>>/etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    # local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    # chroot      /usr/sbin/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     1000000
    user        root
    group       root
    daemon

    # turn on stats unix socket
    # stats socket /var/lib/haproxy/stats
    # stats socket /usr/sbin/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    log                     global
    option                  dontlognull
    option http-server-close
    # option forwardfor
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         60s
    timeout client          2m
    timeout server          2m
    timeout http-keep-alive 10s
    timeout check           10s

frontend emqtt-front
    bind *:1883
    maxconn     1000000
    mode tcp
    default_backend emqtt-backend

backend emqtt-backend
    balance roundrobin
    # balance source
    server emq1 172.16.40.20:1883 check inter 100000 fall 2 rise 5 weight 1
    server emq2 172.16.40.21:1883 check inter 100000 fall 2 rise 5 weight 1
    # source 0.0.0.0 usesrc clientip

frontend emqtt-admin-front
    bind *:18083
    mode http
    default_backend emqtt-admin-backend

backend emqtt-admin-backend
    mode http
    balance roundrobin
    server emq1 172.16.40.20:18083 check
    server emq2 172.16.40.21:18083 check
listen admin_stats
        stats   enable
        bind    *:8080 
        mode    http 
        option  httplog
        log     global
        maxconn 10
        stats   refresh 30s
        stats   uri /admin
        stats   realm haproxy
        stats   auth admin:admin
        stats   hide-version
        stats   admin if TRUE

EOF
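Before starting the proxy it is cheap to have haproxy parse the file itself; the -c flag checks the configuration and exits without starting the daemon:

```shell
# Syntax-check the configuration; exits 0 if the config is valid
/usr/local/haproxy/sbin/haproxy -c -f /etc/haproxy/haproxy.cfg
```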

1.3.3. Starting haproxy

systemctl start haproxy

Aug 08 09:14:34 dz haproxy[3223]: /etc/rc.d/init.d/haproxy: line 26: [: =: unary operator expected
Fix /etc/rc.d/init.d/haproxy by changing

[ ${NETWORKING} = "no" ] && exit 0

to

[ "${NETWORKING}" = "no" ] && exit 0
systemctl daemon-reload

1.3.4. Enabling Start on Boot

chkconfig haproxy on

1.3.5. Tuning the haproxy Server

cat << EOF >> /etc/sysctl.conf
fs.file-max=2097152 
fs.nr_open=2097152
net.core.somaxconn=32768
net.ipv4.tcp_max_syn_backlog=16384
net.core.netdev_max_backlog=16384
net.ipv4.ip_local_port_range=500 65535
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.optmem_max=16777216
net.ipv4.tcp_rmem=1024 4096 16777216
net.ipv4.tcp_wmem=1024 4096 16777216
net.nf_conntrack_max=1000000
net.netfilter.nf_conntrack_max=1000000
net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
net.ipv4.tcp_max_tw_buckets=1048576
net.ipv4.tcp_fin_timeout = 15
EOF


cat << EOF >>/etc/security/limits.conf
*      soft   nofile      1048576
*      hard   nofile      1048576
EOF

echo DefaultLimitNOFILE=1048576 >>/etc/systemd/system.conf 

echo session required /usr/lib64/security/pam_limits.so >>/etc/pam.d/login

cat << EOF >> /etc/sysctl.conf
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_syncookies = 1
EOF
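Apply and verify as before. One caveat on the block above: net.ipv4.tcp_tw_recycle is known to break connections from clients behind NAT and was removed entirely in Linux 4.12, so on newer kernels (or with NATed clients) it is safer to leave it at 0:

```shell
# Reload sysctl and read back the TIME_WAIT-related settings
sysctl -p
sysctl net.ipv4.tcp_tw_reuse net.ipv4.tcp_fin_timeout

# tcp_tw_recycle: risky with NAT, removed in kernel 4.12+; consider disabling
# sysctl -w net.ipv4.tcp_tw_recycle=0
```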

1.4. keepalived Deployment

Deployment on 172.16.40.23 is the same as on 172.16.40.22; only the configuration differs slightly.

1.4.1. Installing via yum

yum install keepalived

1.4.2. Configuration File /etc/keepalived/keepalived.conf on 172.16.40.22


! Configuration File for keepalived

global_defs {
   notification_email {
     huangmeng@dyjs.com
    #  failover@firewall.loc
    #  sysadmin@firewall.loc
   }
   notification_email_from huangmeng4520@163.com
   smtp_server smtp.163.com
   smtp_connect_timeout 30
   router_id mqtt40
   vrrp_skip_check_adv_addr
#    vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    mcast_src_ip 172.16.40.22
    # unicast_peer {
    # 172.18.40.41 ## peer IP address; must not be omitted when using unicast VRRP
    # }
    virtual_ipaddress {
        172.16.40.24/24
        # 192.168.200.16
        # 192.168.200.17
        # 192.168.200.18
    }
}

1.4.3. Configuration File /etc/keepalived/keepalived.conf on 172.16.40.23


! Configuration File for keepalived

global_defs {
   notification_email {
     huangmeng@dyjs.com
    #  failover@firewall.loc
    #  sysadmin@firewall.loc
   }
   notification_email_from huangmeng4520@163.com
   smtp_server smtp.163.com
   smtp_connect_timeout 30
   router_id mqtt40
   vrrp_skip_check_adv_addr
#    vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    # Backup node: use BACKUP state and a lower priority than the master on .22,
    # otherwise both nodes claim MASTER and contend for the VIP
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    mcast_src_ip 172.16.40.23
    # unicast_peer {
    # 172.18.40.41 ## peer IP address; must not be omitted when using unicast VRRP
    # }
    virtual_ipaddress {
        172.16.40.24/24
        # 192.168.200.16
        # 192.168.200.17
        # 192.168.200.18
    }
}

1.4.4. Enabling Start on Boot

systemctl enable keepalived
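After enabling and starting keepalived on both nodes, the VIP should appear on exactly one of them (eth0, as in the configuration above):

```shell
systemctl start keepalived

# The VIP 172.16.40.24 should be present on the master node only
ip addr show eth0 | grep 172.16.40.24
```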

1.5. Architecture Testing

In this architecture keepalived provides high availability and haproxy provides load balancing.

1.5.1. keepalived Failover Test

Shut down either 172.16.40.22 or 172.16.40.23 and check whether the VIP migrates to the other server.

1.5.2. haproxy Load Balancing

Publish from a client tool and check whether connections are distributed round-robin across the two backend EMQ servers.
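One way to verify the distribution (a sketch; the counter name is from the EMQ 2.x CLI and may differ by version): compare client counts on each broker node, or watch per-server session counts on the HAProxy stats page configured above (http://172.16.40.24:8080/admin, auth admin:admin):

```shell
# Run on each of 172.16.40.20 and 172.16.40.21; counts should be roughly equal
emqttd_ctl broker stats | grep 'clients/count'

# Full connected-client list (can be very large under load)
# emqttd_ctl clients list
```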

1.5.3. Results

  • Client test with 50,000 connections
./emqtt_bench_pub  -h 172.16.40.24 -p 1883 -c 50000 -i 20  -t bench21/%i -s 256
  • haproxy monitoring (screenshot: image.png)
  • EMQ cluster dashboard (screenshot: image.png)
