Installing a Ceph cluster on CentOS 7 with ceph-deploy 2019-05-14

Ceph is a petabyte-scale distributed file system for Linux.
① Ceph scales easily to multiple petabytes of capacity.
② It delivers high performance across a variety of workloads.
③ It provides high reliability.

Building a Ceph cluster with the ceph-deploy tool:

Environment: two CentOS 7 hosts, ceph-deploy 1.5.31 (the latest release is 2.0.1).
Preparation:
Disable the firewall and SELinux, and configure hostname mapping on both nodes:

[root@node1 ceph]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.10.49.183 node1
10.10.49.184 node2
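
The firewall and SELinux steps mentioned above can be sketched as the following commands (a sketch using standard CentOS 7 tooling; run on both nodes):

```shell
# Stop the firewall now and keep it off across reboots, so Ceph
# daemons (mon on 6789, OSDs on 6800+) can communicate freely.
systemctl stop firewalld && systemctl disable firewalld

# Put SELinux in permissive mode immediately...
setenforce 0
# ...and disable it persistently for future boots.
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
```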

Installing Ceph:

Configure passwordless SSH access:

[root@node1 ceph]# ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
9b:df:96:83:1a:b2:4b:cf:94:96:79:db:3d:24:20:c4 root@node1
The key's randomart image is:
+--[ RSA 2048]----+
|       .         |
|        E        |
|       .         |
|        . .      |
|        S. .     |
|         *  . .  |
|      o X .. +   |
|     . B +.++..  |
|      o.+.o.o... |
+-----------------+
[root@node1 ceph]# ssh-copy-id root@node2
The authenticity of host 'node2 (10.10.49.184)' can't be established.
ECDSA key fingerprint is c6:3a:07:f8:3d:e3:00:ce:f7:d1:1e:8e:d0:a4:60:b2.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node2's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@node2'"
and check to make sure that only the key(s) you wanted were added

Download the ceph-deploy source:

[root@node1 ~]# wget https://codeload.github.com/ceph/ceph-deploy/zip/v1.5.31
--2019-05-14 13:40:58--  https://codeload.github.com/ceph/ceph-deploy/zip/v1.5.31
Resolving codeload.github.com (codeload.github.com)... 13.229.189.0
Connecting to codeload.github.com (codeload.github.com)|13.229.189.0|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘v1.5.31’

    [                                                            <=>                  ] 650,833     18.8KB/s   in 23s    

2019-05-14 13:41:22 (27.7 KB/s) - ‘v1.5.31’ saved [650833]

FINISHED --2019-05-14 13:41:22--
Total wall clock time: 24s
Downloaded: 1 files, 636K in 23s (27.7 KB/s)

Unpack it:

[root@node1 ~]# unzip v1.5.31 
Archive:  v1.5.31
adefce420a8a59b68513aa1e4974393a10b60c82
[root@node1 ~]# ll
total 644
-rw-------. 1 root root    958 May 14 13:35 anaconda-ks.cfg
drwxr-xr-x. 6 root root   4096 Jan  4  2016 ceph-deploy-1.5.31
-rw-r--r--. 1 root root 650833 May 14 13:41 v1.5.31

Install it with pip:

[root@node1 ~]# yum -y install python-pip
[root@node1 ~]# pip install ceph-deploy-1.5.31/
Processing ./ceph-deploy-1.5.31
    Complete output from command python setup.py egg_info:
    [vendoring] Running command: git clone git://git.ceph.com/remoto
    ********************************************************************************
    
    This library depends on sources fetched when packaging that failed to be
    retrieved.
    
    This means that it will *not* work as expected. Errors encountered:
    
    Traceback (most recent call last):
      File "vendor.py", line 23, in run
        stdout=subprocess.PIPE
      File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
        errread, errwrite)
      File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
        raise child_exception
    OSError: [Errno 2] No such file or directory
    
    ********************************************************************************
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-XNdvpT-build/

Something went wrong; follow the hint in the error output:
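
An `OSError: [Errno 2] No such file or directory` from `subprocess` while spawning `git clone` usually means the `git` binary itself is not installed, so installing git first is likely the underlying fix (an assumption; the transcript does not show this step):

```shell
# setup.py vendors the remoto library via `git clone`,
# so the git client must be present on the build host.
yum -y install git
```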

[root@node1 ~]# git clone git://git.ceph.com/remoto
Cloning into 'remoto'...
remote: Counting objects: 1130, done.
remote: Compressing objects: 100% (621/621), done.
remote: Total 1130 (delta 668), reused 838 (delta 494)
Receiving objects: 100% (1130/1130), 211.30 KiB | 28.00 KiB/s, done.
Resolving deltas: 100% (668/668), done.

Retry the installation:

[root@node1 ~]# pip install ceph-deploy-1.5.31/                    
Processing ./ceph-deploy-1.5.31
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/lib/python2.7/site-packages (from ceph-deploy==1.5.31)
Installing collected packages: ceph-deploy
  Running setup.py install for ceph-deploy ... done
Successfully installed ceph-deploy-1.5.31
You are using pip version 8.1.2, however version 19.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Installation succeeded.

Setting up the Ceph cluster:

Create a cluster:

[root@node1 ~]# mkdir /etc/ceph
[root@node1 ~]# cd /etc/ceph
[root@node1 ceph]# ceph-deploy new node1    (deploy a new cluster)
(syntax: ceph-deploy new <monitor-node-hostname>)
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.31): /usr/bin/ceph-deploy new node1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  func                          : <function new at 0x26f0320>
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x274fea8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey                   : True
[ceph_deploy.cli][INFO  ]  mon                           : ['node1']
[ceph_deploy.cli][INFO  ]  public_network                : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster_network               : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  fsid                          : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[node1][DEBUG ] connected to host: node1 
[node1][DEBUG ] detect platform information from remote host
[node1][DEBUG ] detect machine type
[node1][DEBUG ] find the location of an executable
[node1][INFO  ] Running command: /usr/sbin/ip link show
[node1][INFO  ] Running command: /usr/sbin/ip addr show
[node1][DEBUG ] IP addresses found: ['10.10.49.183']
[ceph_deploy.new][DEBUG ] Resolving host node1
[ceph_deploy.new][DEBUG ] Monitor node1 at 10.10.49.183
[ceph_deploy.new][DEBUG ] Monitor initial members are ['node1']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['10.10.49.183']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
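
Since this cluster will have only two OSDs while Ceph's default pool replica size is 3, it is worth appending the following to the freshly written ceph.conf before installing (a suggested tweak not shown in the original transcript; `ceph-deploy new` writes ceph.conf into the current directory):

```shell
# With only two OSDs, the default replica size of 3 can never be
# satisfied, which would leave every PG active+undersized+degraded.
cat >> ceph.conf <<'EOF'
osd pool default size = 2
osd pool default min size = 1
EOF
grep 'osd pool default' ceph.conf
```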

Install Ceph on all nodes:

[root@node1 ceph]# ceph-deploy install node1 node2
......
[node2][DEBUG ] Complete!
[node2][INFO  ] Running command: ceph --version
[node2][DEBUG ] ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
[root@node1 ceph]# ceph -v
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)

Create the Ceph monitor:

[root@node1 ceph]# ceph-deploy --overwrite-conf mon create-initial
Create and initialize the mon (monitor) daemon.
(--overwrite-conf: overwrite any existing remote configuration files)
......
[node1][DEBUG ] connected to host: node1 
[node1][DEBUG ] detect platform information from remote host
[node1][DEBUG ] detect machine type
[node1][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][DEBUG ] Got ceph.bootstrap-rgw.keyring key from node1.

Check the cluster status:

[root@node1 ceph]# ceph -s 
    cluster 83060304-453a-4073-848e-a0aae859bd15
     health HEALTH_ERR
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
     monmap e1: 1 mons at {node1=10.10.49.183:6789/0}
            election epoch 2, quorum 0 node1
     osdmap e1: 0 osds: 0 up, 0 in
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

Create the OSD storage disks:

Format and mount the disks:

Create the mount points: /opt/osd1 on node1 and /opt/osd2 on node2.

[root@node1 ~]# parted /dev/sdb
GNU Parted 3.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel
New disk label type? gpt
(parted) mkpart
Partition name?  []?
File system type?  [ext2]?
Start? 0%
End? 100%
(parted) p
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name  Flags
 1      1049kB  21.5GB  21.5GB

(parted) q
Information: You may need to update /etc/fstab.
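
The interactive parted session above can also be done non-interactively in one command (a sketch; same GPT label and single full-disk partition as in the transcript):

```shell
# Script mode (-s): create a GPT label, then one partition
# spanning the whole disk ("primary" is the partition name on GPT).
parted -s /dev/sdb mklabel gpt mkpart primary xfs 0% 100%
```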

[root@node1 ~]# mkfs.xfs /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=1310592 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=5242368, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@node1 ~]# mkdir /opt/osd1
[root@node1 ~]# mount /dev/sdb1 /opt/osd1/
[root@node1 ~]# df -hT
Filesystem              Type      Size  Used Avail Use% Mounted on
/dev/mapper/centos-root xfs        18G  928M   17G   6% /
devtmpfs                devtmpfs  903M     0  903M   0% /dev
tmpfs                   tmpfs     913M     0  913M   0% /dev/shm
tmpfs                   tmpfs     913M  8.5M  904M   1% /run
tmpfs                   tmpfs     913M     0  913M   0% /sys/fs/cgroup
/dev/sda1               xfs       497M  125M  373M  25% /boot
tmpfs                   tmpfs     183M     0  183M   0% /run/user/0
/dev/sdb1               xfs        20G   33M   20G   1% /opt/osd1
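
The transcript shows node1 only; node2 needs the equivalent preparation before the next step references node2:/opt/osd2 (a sketch, assuming node2 also has an empty /dev/sdb):

```shell
# On node2: partition, format, and mount the OSD backing directory.
parted -s /dev/sdb mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdb1
mkdir /opt/osd2
mount /dev/sdb1 /opt/osd2/
```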

Create the OSDs:

[root@node1 ceph]# ceph-deploy osd create node1:/opt/osd1/ node2:/opt/osd2/
......
[node2][INFO  ] checking OSD status...
[node2][INFO  ] Running command: ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node2 is now ready for osd use.

Activate the OSDs:

[root@node1 ceph]# ceph-deploy osd activate node1:/opt/osd1/ node2:/opt/osd2/
......
[node2][WARNIN] 1) A unit may be statically enabled by being symlinked from another unit's
[node2][WARNIN]    .wants/ or .requires/ directory.
[node2][WARNIN] 2) A unit's purpose may be to act as a helper for some other unit which has
[node2][WARNIN]    a requirement dependency on it.
[node2][WARNIN] 3) A unit may be started when needed via activation (socket, path, timer,
[node2][WARNIN]    D-Bus, udev, scripted systemctl call, ...)

Testing and usage:

[root@node1 ceph]# ceph -s
    cluster 83060304-453a-4073-848e-a0aae859bd15
     health HEALTH_WARN
            64 pgs degraded
            64 pgs stuck unclean
            64 pgs undersized
     monmap e1: 1 mons at {node1=10.10.49.183:6789/0}
            election epoch 2, quorum 0 node1
     osdmap e9: 2 osds: 2 up, 2 in
      pgmap v14: 64 pgs, 1 pools, 0 bytes data, 0 objects
            10305 MB used, 10150 MB / 20456 MB avail
                  64 active+undersized+degraded
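
The `active+undersized+degraded` state is expected here: the default pool replica size is 3, but only 2 OSDs exist. Shrinking the replica count of the default `rbd` pool (the single pool in this Hammer 0.94 cluster) should bring the PGs to `active+clean`; a sketch:

```shell
# Match the replication factor to the two available OSDs.
ceph osd pool set rbd size 2
ceph osd pool set rbd min_size 1
# Re-check; PGs should settle to active+clean after a short rebalance.
ceph -s
```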

Grant admin access to the other node:

[root@node1 ceph]# ceph-deploy admin node2
(push the admin keyring to node2)

Create a block device:

[root@node2 ~]# rbd create test --size 1024 -m 10.10.49.183 -k /etc/ceph/ceph.client.admin.keyring 
(create an empty 1 GB image)
[root@node2 ~]# rbd map test --name client.admin -m 10.10.49.183 -k /etc/ceph/ceph.client.admin.keyring  
(map the image to a block device via the kernel RBD driver)
/dev/rbd0
[root@node2 ~]# lsblk | grep rbd
rbd0   252:0    0   1G  0 disk 
[root@node2 ~]# mkfs.xfs /dev/rbd0 
log stripe unit (4194304 bytes) is too large (maximum is 256KiB)
log stripe unit adjusted to 32KiB
meta-data=/dev/rbd0              isize=256    agcount=9, agsize=31744 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@node2 ~]# mount /dev/rbd0 /opt/
[root@node2 ~]# lsblk | grep rbd
rbd0   252:0    0   1G  0 disk /opt
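
When testing is finished, the image can be torn down again by reversing the steps above (a hypothetical cleanup, not shown in the transcript):

```shell
# Unmount the filesystem, detach the kernel block device,
# then delete the test image from the cluster.
umount /opt
rbd unmap /dev/rbd0
rbd rm test -m 10.10.49.183 -k /etc/ceph/ceph.client.admin.keyring
```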
