Installing the Distributed Storage System Ceph on CentOS 7.1

zyshappy 2015-08-03

There are plenty of introductions to Ceph online, so I won't repeat them here. Sage Weil developed this impressive distributed storage system during his PhD. It was originally aimed at being a high-performance distributed file system, but once the cloud computing wave arrived, Ceph's focus shifted to distributed block storage (Block Storage) and distributed object storage (Object Storage); the distributed file system CephFS is still in beta. Ceph is now the hottest open-source storage solution for cloud computing and virtual machine deployments; reportedly about 20% of OpenStack deployments use Ceph's block storage.

Ceph offers three kinds of storage: object storage, block storage, and a file system. We mainly care about block storage, and over the second half of the year we will gradually migrate our virtual machine backend storage from SAN to Ceph. Although it is only at version 0.94, Ceph is already fairly mature; a colleague has been running Ceph in production for more than two years and, although he ran into many problems, they were all eventually solved, which suggests Ceph is quite stable and reliable.


 

Hardware Environment Preparation

Six machines are prepared: three physical servers as monitor nodes (mon: ceph-mon1, ceph-mon2, ceph-mon3), two physical servers as storage nodes (osd: ceph-osd1, ceph-osd2), and one virtual machine as the admin node (adm: ceph-adm).

Ceph expects an odd number of monitor nodes, and at least 3 of them (for a toy setup, 1 is fine). ceph-adm is optional; you could place it on a monitor node, but keeping ceph-adm separate makes the architecture clearer. You could also co-locate the mon daemons on the osd nodes, but that is not recommended in production.

  • The ADM server's hardware can be modest; a single low-spec virtual machine is enough, since it is only used to operate and manage Ceph;
  • Each MON server uses 2 disks in RAID1 for the operating system;
  • Each OSD server uses 10 x 4TB disks for Ceph storage, one osd per disk. Every osd needs a journal, so 10 disks need 10 journals; we use 2 large SSDs for the journals, each split into 5 equal partitions, so that each partition serves as the journal for one osd disk. The remaining 2 small SSDs hold the operating system in RAID1 (the layout is sketched right after this list).
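
To make the journal mapping concrete, the disk layout assumed on each OSD host looks roughly like this (the device letters match the ones used later in this post; adjust them to your own hardware):

    /dev/sda,b,d,e,g,h,i,j,k,l    10 x 4TB SAS    one xfs partition each, one OSD per disk
    /dev/sdc (400GB SSD)          sdc1 ... sdc5   journal partitions for the first 5 data disks
    /dev/sdf (400GB SSD)          sdf1 ... sdf5   journal partitions for the last 5 data disks
    2 x 80GB SSD                  RAID1           operating system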

The configuration is listed below:

    | Hostname  | IP Address    | Role | Hardware Info                                           |
    |-----------+---------------+------+---------------------------------------------------------|
    | ceph-adm  | 192.168.2.100 | adm  | 2 Cores, 4GB RAM, 20GB DISK                             |
    | ceph-mon1 | 192.168.2.101 | mon  | 24 Cores, 64GB RAM, 2x750GB SAS                         |
    | ceph-mon2 | 192.168.2.102 | mon  | 24 Cores, 64GB RAM, 2x750GB SAS                         |
    | ceph-mon3 | 192.168.2.103 | mon  | 24 Cores, 64GB RAM, 2x750GB SAS                         |
    | ceph-osd1 | 192.168.2.121 | osd  | 12 Cores, 64GB RAM, 10x4TB SAS, 2x400GB SSD, 2x80GB SSD |
    | ceph-osd2 | 192.168.2.122 | osd  | 12 Cores, 64GB RAM, 10x4TB SAS, 2x400GB SSD, 2x80GB SSD |

 

Software Environment Preparation

All Ceph cluster nodes run CentOS 7.1 (CentOS-7-x86_64-Minimal-1503-01.iso). All file systems use xfs, as officially recommended by Ceph. The operating system on every node is installed on RAID1; all other disks are used individually, without any RAID.

After installing CentOS we need to do some basic configuration on every node (yes, including ceph-adm): disable SELINUX, open the firewall ports, synchronize time, and so on:

Disable SELINUX:

    # sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
    # setenforce 0

Open the ports Ceph needs:

    # firewall-cmd --zone=public --add-port=6789/tcp --permanent
    # firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent
    # firewall-cmd --reload

Install the EPEL repository:

    # rpm -Uvh https://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
    # yum -y update
    # yum -y upgrade

Install ntp and synchronize the time:

    # yum -y install ntp ntpdate ntp-doc
    # ntpdate 0.us.pool.ntp.org
    # hwclock --systohc
    # systemctl enable ntpd.service
    # systemctl start ntpd.service
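
Since the same steps have to be repeated on every node, it can be convenient to push them out in one loop once passwordless ssh (set up further down in this post) is in place. A minimal sketch for the firewall part, assuming the host names listed above; the other commands can be appended to the same remote command string:

    for h in ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2; do
        ssh root@${h} "setenforce 0; \
            firewall-cmd --zone=public --add-port=6789/tcp --permanent; \
            firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent; \
            firewall-cmd --reload"
    done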

On each osd server we need to partition the 10 SAS disks and create an xfs file system on each; the 2 SSDs used for journals are each split into 5 partitions, one partition per data disk. No file system is created on the journal partitions; that is left for Ceph to handle itself.

    # parted /dev/sda
    GNU Parted 3.1
    Using /dev/sda
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) mklabel gpt
    (parted) mkpart primary xfs 0% 100%
    (parted) quit

    # mkfs.xfs /dev/sda1
    meta-data=/dev/sda1              isize=256    agcount=4, agsize=244188544 blks
             =                       sectsz=4096  attr=2, projid32bit=1
             =                       crc=0        finobt=0
    data     =                       bsize=4096   blocks=976754176, imaxpct=5
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
    log      =internal log           bsize=4096   blocks=476930, version=2
             =                       sectsz=4096  sunit=1 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    ...

The commands above have to be repeated for 10 disks, which is too much manual work, and we will keep adding servers later, so we put them into a script, parted.sh, to make life easier. Here /dev/sda|b|d|e|g|h|i|j|k|l are the 10 data disks, and /dev/sdc and /dev/sdf are the SSDs used for journals:

    # vi parted.sh
    #!/bin/bash

    set -e
    if [ ! -x "/sbin/parted" ]; then
        echo "This script requires /sbin/parted to run!" >&2
        exit 1
    fi

    DISKS="a b d e g h i j k l"
    for i in ${DISKS}; do
        echo "Creating partitions on /dev/sd${i} ..."
        parted -a optimal --script /dev/sd${i} -- mktable gpt
        parted -a optimal --script /dev/sd${i} -- mkpart primary xfs 0% 100%
        sleep 1
        #echo "Formatting /dev/sd${i}1 ..."
        mkfs.xfs -f /dev/sd${i}1 &
    done

    SSDS="c f"
    for i in ${SSDS}; do
        parted -s /dev/sd${i} mklabel gpt
        parted -s /dev/sd${i} mkpart primary 0% 20%
        parted -s /dev/sd${i} mkpart primary 21% 40%
        parted -s /dev/sd${i} mkpart primary 41% 60%
        parted -s /dev/sd${i} mkpart primary 61% 80%
        parted -s /dev/sd${i} mkpart primary 81% 100%
    done

    # sh parted.sh

On ceph-adm, run ssh-keygen to generate an ssh key (note: leave the passphrase empty) and copy the key to every Ceph node:

    # ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:

    # ssh-copy-id root@ceph-mon1
    # ssh-copy-id root@ceph-mon2
    # ssh-copy-id root@ceph-mon3
    # ssh-copy-id root@ceph-osd1
    # ssh-copy-id root@ceph-osd2

From ceph-adm, log in to every node to confirm that passwordless ssh works, and make sure that annoying host-key confirmation prompt will not appear again:

    # ssh root@ceph-mon1
    The authenticity of host 'ceph-mon1 (192.168.2.101)' can't be established.
    ECDSA key fingerprint is d7:db:d6:70:ef:2e:56:7c:0d:9c:62:75:b2:47:34:df.
    Are you sure you want to continue connecting (yes/no)? yes

    # ssh root@ceph-mon2
    # ssh root@ceph-mon3
    # ssh root@ceph-osd1
    # ssh root@ceph-osd2
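
Optionally, the host-key prompt can be avoided altogether by collecting the host keys in advance with ssh-keyscan (a small shortcut; note that accepting keys this way skips manual fingerprint verification):

    for h in ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2; do
        ssh-keyscan ${h} >> ~/.ssh/known_hosts
    done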

 

Ceph Deployment

Rather than installing Ceph by hand on every node, it is much more convenient to install it everywhere with the ceph-deploy tool:

    # rpm -Uvh http://ceph.com/rpm-hammer/el7/noarch/ceph-release-1-1.el7.noarch.rpm
    # yum update -y
    # yum install ceph-deploy -y

Create a ceph working directory; all subsequent operations are performed inside it:

    # mkdir ~/ceph-cluster
    # cd ~/ceph-cluster

Initialize the cluster and tell ceph-deploy which nodes are the monitor nodes. Once the command completes successfully, ceph.conf, ceph.log, ceph.mon.keyring and other related files are generated in the ceph-cluster directory:

    # ceph-deploy new ceph-mon1 ceph-mon2 ceph-mon3

Install Ceph on every Ceph node:

    # ceph-deploy install ceph-adm ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2
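
As a quick optional check that every node ended up with the same Ceph version, something like the following can be run from ceph-adm (it relies on the passwordless ssh configured earlier):

    for h in ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2; do
        echo -n "${h}: "
        ssh root@${h} ceph --version
    done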

Initialize the monitor nodes:

    # ceph-deploy mon create-initial
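
To confirm that the three monitors actually formed a quorum, one optional check is to query a monitor's admin socket (this assumes the monitor id matches the host name, which is what ceph-deploy uses by default):

    # ssh ceph-mon1 ceph daemon mon.ceph-mon1 mon_status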

Take a look at the disks on the Ceph storage nodes:

    # ceph-deploy disk list ceph-osd1
    # ceph-deploy disk list ceph-osd2

Initialize (zap) the Ceph disks, then create the osd daemons. Each argument has the form storage-node:data-disk:journal-partition, one entry per disk:

Create the ceph-osd1 storage node:

    # ceph-deploy disk zap ceph-osd1:sda ceph-osd1:sdb ceph-osd1:sdd ceph-osd1:sde ceph-osd1:sdg ceph-osd1:sdh ceph-osd1:sdi ceph-osd1:sdj ceph-osd1:sdk ceph-osd1:sdl
    # ceph-deploy osd create ceph-osd1:sda:/dev/sdc1 ceph-osd1:sdb:/dev/sdc2 ceph-osd1:sdd:/dev/sdc3 ceph-osd1:sde:/dev/sdc4 ceph-osd1:sdg:/dev/sdc5 ceph-osd1:sdh:/dev/sdf1 ceph-osd1:sdi:/dev/sdf2 ceph-osd1:sdj:/dev/sdf3 ceph-osd1:sdk:/dev/sdf4 ceph-osd1:sdl:/dev/sdf5

Create the ceph-osd2 storage node:

    # ceph-deploy disk zap ceph-osd2:sda ceph-osd2:sdb ceph-osd2:sdd ceph-osd2:sde ceph-osd2:sdg ceph-osd2:sdh ceph-osd2:sdi ceph-osd2:sdj ceph-osd2:sdk ceph-osd2:sdl
    # ceph-deploy osd create ceph-osd2:sda:/dev/sdc1 ceph-osd2:sdb:/dev/sdc2 ceph-osd2:sdd:/dev/sdc3 ceph-osd2:sde:/dev/sdc4 ceph-osd2:sdg:/dev/sdc5 ceph-osd2:sdh:/dev/sdf1 ceph-osd2:sdi:/dev/sdf2 ceph-osd2:sdj:/dev/sdf3 ceph-osd2:sdk:/dev/sdf4 ceph-osd2:sdl:/dev/sdf5
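
Because the disk-to-journal mapping follows a fixed pattern (five data disks per journal SSD), the long argument lists above do not have to be typed by hand. Here is a small helper sketch that prints the ceph-deploy osd create line for one host; the device letters are the ones assumed in this post, so adjust them to your own layout:

    #!/bin/bash
    # Print the ceph-deploy osd create arguments for one OSD host (hypothetical helper).
    HOST=${1:-ceph-osd1}
    DISKS=(a b d e g h i j k l)                                    # 10 data disks
    JOURNALS=(sdc1 sdc2 sdc3 sdc4 sdc5 sdf1 sdf2 sdf3 sdf4 sdf5)   # 5 journal partitions per SSD
    ARGS=""
    for n in $(seq 0 9); do
        ARGS="${ARGS} ${HOST}:sd${DISKS[$n]}:/dev/${JOURNALS[$n]}"
    done
    echo "ceph-deploy osd create${ARGS}"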

Finally, push the generated configuration files from ceph-adm to the other nodes so that every node has an identical ceph configuration:

    # ceph-deploy --overwrite-conf admin ceph-adm ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2

 

Testing

Let's see whether the configuration succeeded:

    # ceph health
    HEALTH_WARN too few PGs per OSD (10 < min 30)

Increase the number of PGs. Use the formula Total PGs = (#OSDs * 100) / pool size to decide pg_num (pgp_num should be set to the same value as pg_num): 20 * 100 / 2 = 1000, and Ceph officially recommends rounding to the nearest power of 2, so we choose 1024. If everything goes well, we should now see HEALTH_OK:

    # ceph osd pool set rbd size 2
    set pool 0 size to 2
    # ceph osd pool set rbd min_size 2
    set pool 0 min_size to 2
    # ceph osd pool set rbd pg_num 1024
    set pool 0 pg_num to 1024
    # ceph osd pool set rbd pgp_num 1024
    set pool 0 pgp_num to 1024
    # ceph health
    HEALTH_OK
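
As an optional sanity check, the pool settings can be read back after the change (rbd is the default pool modified above):

    # ceph osd pool get rbd size
    # ceph osd pool get rbd pg_num
    # ceph osd pool get rbd pgp_num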

For a bit more detail:

    # ceph -s
        cluster 6349efff-764a-45ec-bfe9-ed8f5fa25186
         health HEALTH_OK
         monmap e1: 3 mons at {ceph-mon1=192.168.2.101:6789/0,ceph-mon2=192.168.2.102:6789/0,ceph-mon3=192.168.2.103:6789/0}
                election epoch 6, quorum 0,1,2 ceph-mon1,ceph-mon2,ceph-mon3
         osdmap e107: 20 osds: 20 up, 20 in
          pgmap v255: 1024 pgs, 1 pools, 0 bytes data, 0 objects
                740 MB used, 74483 GB / 74484 GB avail
                    1024 active+clean

If everything above worked, remember to write these settings into the ceph.conf file and push it to all the deployed nodes:

    # vi ceph.conf
    [global]
    fsid = 6349efff-764a-45ec-bfe9-ed8f5fa25186
    mon_initial_members = ceph-mon1, ceph-mon2, ceph-mon3
    mon_host = 192.168.2.101,192.168.2.102,192.168.2.103
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    filestore_xattr_use_omap = true
    osd pool default size = 2
    osd pool default min size = 2
    osd pool default pg num = 1024
    osd pool default pgp num = 1024

    # ceph-deploy admin ceph-adm ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2

 

Starting Over from Scratch

If any strange, unsolvable problem comes up during deployment, you can simply wipe everything and start again from scratch:

    # ceph-deploy purge ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2
    # ceph-deploy purgedata ceph-mon1 ceph-mon2 ceph-mon3 ceph-osd1 ceph-osd2
    # ceph-deploy forgetkeys
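
When starting over it may also help to clear out the files ceph-deploy left behind in the working directory before re-running the deployment (a small optional cleanup, assuming the ~/ceph-cluster directory created above):

    # cd ~/ceph-cluster
    # rm -f ceph.conf ceph.log ceph.mon.keyring *.keyring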

 

Troubleshooting

If any network problem shows up, first confirm that the nodes can ssh to each other without a password and that the firewall on each node is either disabled or has the required rules added:

    # ceph health
    2015-07-31 14:31:10.545138 7fce64377700  0 -- :/1024052 >> 192.168.2.101:6789/0 pipe(0x7fce60027050 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fce60023e00).fault
    HEALTH_OK

    # ssh ceph-mon1
    # firewall-cmd --zone=public --add-port=6789/tcp --permanent
    # firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent
    # firewall-cmd --reload
    # ceph health
    HEALTH_OK

A first Ceph installation runs into all sorts of problems, but on the whole troubleshooting went fairly smoothly. As we gain experience, we will gradually move Ceph into production in the second half of this year.
