MLXY 2019-12-23
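The project's servers are spread across Chongqing, Shanghai, Taipei, and Houston, so we need off-site disaster recovery. Right now MySQL, Redis Cluster, and Elasticsearch all run in Chongqing, so if Chongqing loses power the whole application goes down.
The first step is disaster recovery between Chongqing and Shanghai. A rough test showed about 13 MB/s between the Chongqing servers, which corresponds to roughly 100 Mbit/s of LAN bandwidth. Chongqing to Shanghai reached only about 1.2 MB/s, roughly a 10 Mbit/s link.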
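As a quick sanity check on those figures, the conversion from measured transfer rate (megabytes per second) to link bandwidth (megabits per second) is a factor of 8. This is only a sketch; the function name is made up for illustration, and it ignores protocol overhead.

```python
def mbytes_to_mbits(mb_per_s):
    """Convert a transfer rate in MB/s to Mbit/s (1 byte = 8 bits)."""
    return mb_per_s * 8


# Rates measured in this post.
for label, rate in [("Chongqing internal", 13), ("Chongqing to Shanghai", 1.2)]:
    print(f"{label}: {rate} MB/s is about {mbytes_to_mbits(rate):.1f} Mbit/s")
```

So the measured rates sit close to nominal 100 Mbit/s and 10 Mbit/s links.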
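The first plan keeps things simple. MySQL uses master-master replication between Chongqing and Shanghai. The Redis Cluster master nodes all default to Chongqing servers, with the slaves on the Shanghai server. The ES primary shards are also placed in Chongqing, with all replica shards in Shanghai.
Below is how to scale out Redis and migrate its data.
The trialrun environment has 3 servers: 15.99.72.164 and 15.99.72.165 in Chongqing, and 15.15.181.147 in Shanghai.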
[ 7005]# bin/redis-cli -c -h 15.15.181.147 -p 7006
15.15.181.147:7006> cluster nodes
c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164: slave 421123bf7fb3a4061e34cab830530d87b21148ee 0 1577089232000 7 connected
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147: myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577089230000 6 connected
31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164: slave 763a88d5328ab0ce07a312e726d78bb2141b5813 0 1577089234988 5 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165: master - 0 1577089235796 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165: master - 0 1577089234000 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147: master - 0 1577089232733 5 connected 10923-16383
[ src]# /root/tools/redis-4.0.11/src/redis-trib.rb info 15.99.72.165:7003
15.99.72.165:7003 (f452a661...) -> 53254 keys | 5462 slots | 1 slaves.
15.15.181.147:7005 (763a88d5...) -> 53174 keys | 5461 slots | 1 slaves.
15.99.72.165:7004 (421123bf...) -> 53050 keys | 5461 slots | 1 slaves.
[OK] 159478 keys in 3 masters.
9.73 keys per slot on average.
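The output above shows that one master (763a88d5...) still lives on the Shanghai server, which is what the rest of this post fixes. A check like this could be scripted; the following is a minimal sketch, not something I ran as part of the migration. The function name and the SITE_BY_HOST map are hypothetical, and the sample lines use the node addresses with the ports taken from the redis-trib output and the redis-cli session above.

```python
# Hypothetical site map, based on the server list in this post.
SITE_BY_HOST = {
    "15.99.72.164": "Chongqing",
    "15.99.72.165": "Chongqing",
    "15.15.181.147": "Shanghai",
}


def masters_by_site(cluster_nodes_text):
    """Group master node IDs (first 8 chars) by site from `cluster nodes` output."""
    result = {}
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete node line
        node_id, addr, flags = fields[0], fields[1], fields[2].split(",")
        if "master" not in flags:
            continue
        # The address may look like host:port or host:port@cport; keep the host.
        host = addr.split(":")[0]
        site = SITE_BY_HOST.get(host, "unknown")
        result.setdefault(site, []).append(node_id[:8])
    return result


SAMPLE = """\
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003 master - 0 1577089235796 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004 master - 0 1577089234000 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005 master - 0 1577089232733 5 connected 10923-16383
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006 myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577089230000 6 connected
"""

print(masters_by_site(SAMPLE))
```

Any master listed under Shanghai is a candidate for having its slots moved to a Chongqing node.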
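The earlier installation was three masters and three slaves. Now I need to install a new master node, 7007, on 165, join it to the existing cluster, and then migrate all slots of the 15.15.181.147:7005 master to the 7007 node on 165.
1. First, on 165: mkdir -p /usr/local/redis-cluster/7007
Other nodes were already installed on 165, so just cd /usr/local/redis-ii/
cp -r bin /usr/local/redis-cluster/7007
Then go into the previously installed 7004 node: cd /usr/local/redis-cluster/7004
cp redis.conf ../7007/
Then edit the relevant settings for 7007: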
bind 15.99.72.165
protected-mode no
port 7007
daemonize yes
cluster-enabled yes
cluster-node-timeout 15000
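Editing the copied file by hand is fine for one node. If more nodes are added later, the same change could be scripted. The sketch below is an assumption-heavy illustration: derive_node_conf is a made-up helper, and the template text assumes the 7004 config differs from the one above only in its port (the real 7004 file is not shown in this post and may contain other port-specific settings that also need changing).

```python
def derive_node_conf(template_text, overrides):
    """Return config text with the given directives replaced, or appended if absent."""
    remaining = dict(overrides)
    out = []
    for line in template_text.splitlines():
        stripped = line.strip()
        # Directive name is the first word of a non-empty, non-comment line.
        key = None
        if stripped and not stripped.startswith("#"):
            key = stripped.split(None, 1)[0]
        if key in remaining:
            out.append(f"{key} {remaining.pop(key)}")
        else:
            out.append(line)
    # Directives that were not present in the template are appended at the end.
    for key, value in remaining.items():
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


# Assumed contents of the 7004 config, for illustration only.
TEMPLATE_7004 = """\
bind 15.99.72.165
protected-mode no
port 7004
daemonize yes
cluster-enabled yes
cluster-node-timeout 15000
"""

CONF_7007 = derive_node_conf(TEMPLATE_7004, {"port": "7007"})
print(CONF_7007)
```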
After saving the configuration, start the 7007 node: bin/redis-server ./redis.conf
Then add the 165:7007 node to the existing cluster:
[ tools]# /root/tools/redis-4.0.11/src/redis-trib.rb add-node 15.99.72.165:7007 15.99.72.165:7003
>>> Adding node 15.99.72.165:7007 to cluster 15.99.72.165:7003
>>> Performing Cluster Check (using node 15.99.72.165:7003)
M: f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003
slots:5461-10922 (5462 slots) master
1 additional replica(s)
M: 763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005
slots:10923-16383 (5461 slots) master
1 additional replica(s)
M: 421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004
slots:0-5460 (5461 slots) master
1 additional replica(s)
S: 733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147:7006
slots: (0 slots) slave
replicates f452a66121e1e9c02b0ed28cafe03aaddb327c36
S: 31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164:7002
slots: (0 slots) slave
replicates 763a88d5328ab0ce07a312e726d78bb2141b5813
S: c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164:7001
slots: (0 slots) slave
replicates 421123bf7fb3a4061e34cab830530d87b21148ee
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
>>> Send CLUSTER MEET to node 15.99.72.165:7007 to make it join the cluster.
[OK] New node added correctly.
Run cluster nodes again to check the current nodes. 7007 has joined the Redis cluster, but it holds 0 slots:
15.15.181.147:7006> cluster nodes
8e134e67e4e83a613b90f67cc6e6b8d71c208886 15.99.72.165: master - 0 1577095695760 0 connected
c08e8c7faeede2220e621b2409061210e0b107ad 15.99.72.164: slave 421123bf7fb3a4061e34cab830530d87b21148ee 0 1577095693561 7 connected
733609c2fbecdd41f454363698514e2f72ee0208 15.15.181.147: myself,slave f452a66121e1e9c02b0ed28cafe03aaddb327c36 0 1577095691000 6 connected
31670db07d1bc7620a8f8254b26f2af00b04d1fd 15.99.72.164: slave 763a88d5328ab0ce07a312e726d78bb2141b5813 0 1577095695000 5 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165: master - 0 1577095694000 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165: master - 0 1577095694763 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147: master - 0 1577095691699 5 connected 10923-16383
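To read the slot counts off this output without counting by hand, the slot ranges at the end of each master line can be summed. This is a sketch only; slot_counts is a made-up helper, and the sample lines use the node addresses with the ports known from the redis-trib output earlier in this post.

```python
def slot_counts(cluster_nodes_text):
    """Map each master's short node ID to the number of slots it owns."""
    counts = {}
    for line in cluster_nodes_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8 or "master" not in fields[2].split(","):
            continue
        total = 0
        # Fields after the first eight are slots: "n", "a-b", or "[...]" markers.
        for token in fields[8:]:
            if token.startswith("["):
                continue  # migrating/importing marker, not an owned range
            start, _, end = token.partition("-")
            total += int(end or start) - int(start) + 1
        counts[fields[0][:8]] = total
    return counts


SAMPLE_AFTER_ADD = """\
8e134e67e4e83a613b90f67cc6e6b8d71c208886 15.99.72.165:7007 master - 0 1577095695760 0 connected
f452a66121e1e9c02b0ed28cafe03aaddb327c36 15.99.72.165:7003 master - 0 1577095694000 3 connected 5461-10922
421123bf7fb3a4061e34cab830530d87b21148ee 15.99.72.165:7004 master - 0 1577095694763 7 connected 0-5460
763a88d5328ab0ce07a312e726d78bb2141b5813 15.15.181.147:7005 master - 0 1577095691699 5 connected 10923-16383
"""

print(slot_counts(SAMPLE_AFTER_ADD))
```

The new node 8e134e67 owns 0 slots, and the Shanghai master 763a88d5 owns 5461, which is the number of slots that has to move.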
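Next, all slots of the 15.15.181.147:7005 master need to be migrated to the 15.99.72.165:7007 master.
The migration follows the example below. My own run printed too much output to paste here; apart from the IPs, ports, and so on, it was essentially the same.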
[ redis-cluster]# ./redis-4.0.6/src/redis-trib.rb reshard 192.168.1.117:7000
How many slots do you want to move (from 1 to 16384)? 5461    # how many slots to move
What is the receiving node ID? a6d7dacd679a96fd79b7de552428a63610d620e6    # the node that receives those slots; here, the ID of 192.168.1.117:7000
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1:0607089e5bb3192563bd8082ff230b0eb27fbfeb    # the node to take the slots from; here, the ID of 192.168.1.116:7000. Entering all takes the requested number of slots from all existing master nodes.
Source node #2:done
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Moving slot 0 from 192.168.1.116:7000 to 192.168.1.117:7000:
[ERR] Calling MIGRATE: ERR Syntax error, try CLIENT (LIST | KILL | GETNAME | SETNAME | PAUSE | REPLY)
[root@localhost redis-cluster]# cp redis-4.0.6/src/redis-trib.rb redis-4.0.6/src/redis-trib.rb.bak
In redis-trib.rb, change the original lines
source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:keys,*keys])
source.r.client.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
to
source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
[root@localhost redis-cluster]# cat redis-4.0.6/src/redis-trib.rb |grep source.r.call
source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,"replace",:keys,*keys])
source.r.call(["migrate",target.info[:host],target.info[:port],"",0,@timeout,:replace,:keys,*keys])
# After the change, reshard still reports an error:
[root@localhost redis-cluster]# ./redis-4.0.6/src/redis-trib.rb reshard 192.168.1.117:7000
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
[WARNING] Node 192.168.1.117:7000 has slots in importing state (0).
[WARNING] Node 192.168.1.116:7000 has slots in migrating state (0).
[WARNING] The following slots are open: 0
>>> Check slots coverage...
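For the cluster in this post, the same reshard can also be expressed without the interactive prompts, since redis-trib.rb reshard accepts --from, --to, --slots, and --yes before the host:port argument. The sketch below only builds and prints the command line; I ran the interactive form, not this one, and reshard_command is a made-up helper. The node IDs, slot count, and script path are the ones shown earlier in this post. It does nothing about the MIGRATE error or the open-slot warnings in the example above.

```python
import shlex


def reshard_command(trib_path, entry_node, source_id, target_id, slots):
    """Build a non-interactive redis-trib.rb reshard command as an argv list."""
    return [
        trib_path, "reshard",
        "--from", source_id,
        "--to", target_id,
        "--slots", str(slots),
        "--yes",
        entry_node,  # any reachable node of the cluster, as host:port
    ]


cmd = reshard_command(
    trib_path="/root/tools/redis-4.0.11/src/redis-trib.rb",
    entry_node="15.99.72.165:7003",
    source_id="763a88d5328ab0ce07a312e726d78bb2141b5813",  # 15.15.181.147:7005
    target_id="8e134e67e4e83a613b90f67cc6e6b8d71c208886",  # 15.99.72.165:7007
    slots=5461,
)

# Print a copy-pasteable shell line; nothing is executed here.
print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command first and running it by hand keeps a human in the loop, which seems prudent for an operation that moves every slot off a master.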