Ceph
Hardware Recommendations
Hardware Recommendations — Ceph Documentation: https://docs.ceph.com/en/quincy/start/hardware-recommendations/
Status
ceph status # OR: ceph -s
Example:
# ceph status
  cluster:
    id:     ff74f760-84b2-4dc4-b518-8408e3f10779
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum vm-05,vm-06,vm-07 (age 12m)
    mgr: vm-07(active, since 47m), standbys: vm-06, vm-05
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 4m), 3 in (since 4m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 3.68k objects, 13 GiB
    usage:   38 GiB used, 3.7 TiB / 3.7 TiB avail
    pgs:     97 active+clean

  io:
    client:   107 KiB/s rd, 4.0 KiB/s wr, 0 op/s rd, 0 op/s wr
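If the output will be consumed by a script, the same data can be requested in machine-readable form via the ceph CLI's global --format option (json-pretty shown here as an example):
ceph status --format json-pretty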
Health
Health summary:
ceph health
# good health: HEALTH_OK
# bad health: HEALTH_WARN Reduced data availability: 47 pgs inactive, 47 pgs peering; 47 pgs not deep-scrubbed in time; 47 pgs not scrubbed in time; 54 slow ops, oldest one blocked for 212 sec, daemons [osd.0,osd.1,osd.2,osd.5,osd.9,mon.lmt-vm-05] have slow ops.
Health details:
ceph health detail
# good health: HEALTH_OK
# bad health:
HEALTH_WARN 1 osds down; 1 host (1 osds) down; Reduced data availability: 47 pgs inactive, 47 pgs peering; 47 pgs not deep-scrubbed in time; 47 pgs not scrubbed in time; 49 slow ops, oldest one blocked for 306 sec, daemons [osd.0,osd.1,osd.2,osd.5,osd.9,mon.prox-05] have slow ops.
[WRN] OSD_DOWN: 1 osds down
    osd.5 (root=default,host=prox-06) is down
[WRN] OSD_HOST_DOWN: 1 host (1 osds) down
    host prox-06 (root=default) (1 osds) is down
[WRN] PG_AVAILABILITY: Reduced data availability: 47 pgs inactive, 47 pgs peering
    pg 3.0 is stuck peering for 6m, current state peering, last acting [3,5,4]
    pg 3.3 is stuck peering for 7w, current state peering, last acting [5,1,0]
    ...
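When a warning is expected (for example during planned maintenance) it can be temporarily silenced rather than fixed right away; health mutes exist on Nautilus and newer, and the code and duration below are only example values:
ceph health mute OSD_DOWN 1h    # silence the OSD_DOWN warning for one hour
ceph health unmute OSD_DOWN     # clear the mute again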
Watch
Watch live changes:
ceph -w
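If the live stream scrolls by too quickly, the recent cluster log can also be fetched on demand (the line count is arbitrary):
ceph log last 50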
OSD
List OSDs
volume lvm list
Note: this only shows the OSDs local to the host it is run on. [1]
ceph-volume lvm list
Example:
====== osd.0 =======

  [block]       /dev/ceph-64fda9eb-2342-43e3-bc3e-78e5c1bcda31/osd-block-ff991dbd-7698-44ab-ad90-102340ec05c7

      block device              /dev/ceph-64fda9eb-2342-43e3-bc3e-78e5c1bcda31/osd-block-ff991dbd-7698-44ab-ad90-102340ec05c7
      block uuid                uvsm7p-c9KU-iaVe-GJGv-NBRM-xGrr-XPf3eB
      cephx lockbox secret
      cluster fsid              ff74f760-84b2-4dc4-b518-8408e3f10779
      cluster name              ceph
      crush device class
      encrypted                 0
      osd fsid                  ff991dbd-7698-44ab-ad90-102340ec05c7
      osd id                    0
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/fioa
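For a cluster-wide view (not just the local host), the metadata the monitors keep for each OSD can be queried instead; for example:
ceph osd metadata {osd-num}   # host, devices and more for one OSD
ceph device ls                # device-to-OSD mapping across the cluster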
osd tree
ceph osd tree
Example:
ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         3.69246  root default
-3         1.09589      host vm-05
 0    ssd  1.09589          osd.0       up   1.00000  1.00000
-7         1.09589      host vm-06
 2    ssd  1.09589          osd.2     down         0  1.00000
-5         1.50069      host vm-07
 1    ssd  1.50069          osd.1       up   1.00000  1.00000
List only the OSD tree entries that are down: [2]
ceph osd tree down
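To locate a specific OSD (host, IP, CRUSH location) before logging in to its node:
ceph osd find {osd-num}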
osd stat
ceph osd stat
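Per-OSD utilisation and PG counts (handy for spotting imbalance) come from osd df; the tree variant groups the output by host:
ceph osd df
ceph osd df tree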
osd dump
ceph osd dump
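The dump is long; a common trick is to filter it down to the pool definitions (size, flags, pg_num):
ceph osd dump | grep pool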
Mark OSD In (include in data placement)
ceph osd in [OSD-NUM]
Mark OSD Out (exclude from data placement)
ceph osd out [OSD-NUM]
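For short maintenance (e.g. a host reboot) it is usually better to stop Ceph from automatically marking down OSDs out and rebalancing; the noout flag does that, just remember to clear it afterwards:
ceph osd set noout
# ...do the maintenance...
ceph osd unset noout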
Delete OSD
First mark it out:
ceph osd out osd.{osd-num}
Mark it down:
ceph osd down osd.{osd-num}
Remove it:
ceph osd rm osd.{osd-num}
Check tree for removal:
ceph osd tree
---
If you get an error that the OSD is busy: [3]
Go to host that has the OSD and stop the service:
systemctl stop ceph-osd@{osd-num}
Remove it again:
ceph osd rm osd.{osd-num}
Check tree for removal:
ceph osd tree
If 'ceph osd tree' still lists the OSD but reports 'DNE' (does not exist), finish the cleanup as follows...
Remove from the CRUSH:
ceph osd crush rm osd.{osd-num}
Clear auth:
ceph auth del osd.{osd-num}
ref: [4]
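On Luminous and later, the rm / crush rm / auth del steps above can be collapsed into a single purge once the OSD is down and out; treat this as an alternative to the manual sequence:
ceph osd purge {osd-num} --yes-i-really-mean-it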
Create OSD
Create OSD: [5]
pveceph osd create /dev/sd[X]
If the disk was in use before (for example, for ZFS or as an OSD), you first need to zap all traces of that usage:
ceph-volume lvm zap /dev/sd[X] --destroy
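To check that the device is now seen as clean and available before creating the OSD on it, ceph-volume can report on it (the device path is an example):
ceph-volume inventory /dev/sd[X]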
Create OSD ID:
ceph osd create # will generate the next ID in sequence
Create directory:
mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/ceph-{osd-number}
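The mount point has to exist before mounting; on a fresh host create the default data directory first (path assumes the default cluster name 'ceph'):
mkdir -p /var/lib/ceph/osd/ceph-{osd-number}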
Init data directory:
ceph-osd -i {osd-num} --mkfs --mkkey
Register:
ceph auth add osd.{osd-num} osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring
Add to CRUSH map:
ceph osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]
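A filled-in example against the tree shown earlier might look like this; the ID, weight and host are illustrative only (the weight is conventionally the disk capacity in TiB):
ceph osd crush add osd.3 1.09589 host=vm-06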
POOL
Pool Stats
ceph osd pool stats
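Other useful pool-level views are the pool definitions and overall space usage:
ceph osd pool ls detail
ceph df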
References
- [1] ceph-volume lvm list — Ceph Documentation: https://docs.ceph.com/en/quincy/ceph-volume/lvm/list/
- [2] Troubleshooting OSDs — Ceph Documentation: https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-osd/
- [3] How to remove OSD from Ceph cluster (George Shuklin, Medium): https://medium.com/@george.shuklin/how-to-remove-osd-from-ceph-cluster-b4c37cc0ec87
- [4] Adding/Removing OSDs — Ceph Documentation: https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/
- [5] Proxmox VE documentation, pveceph (Ceph OSD creation): https://pve.proxmox.com/pve-docs/chapter-pveceph.html#pve_ceph_osd_create