Proxmox/Ceph: Difference between revisions
Latest revision as of 23:52, 29 December 2024
Health
Show high level health:
ceph health
ceph -s  # more details
Show OSD (Object Storage Daemon) health:
ceph osd df tree
ceph osd df
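For use in scripts, the status word printed by `ceph health` (HEALTH_OK, HEALTH_WARN or HEALTH_ERR) can be turned into an exit code. This is a minimal sketch: the helper name `health_exit_code` and the 0/1/2 mapping are illustrative assumptions, not something Ceph or Proxmox provides.

```shell
# Map the status word printed by `ceph health` to an exit code.
# health_exit_code is a hypothetical helper; the 0/1/2 mapping is an assumption.
health_exit_code() {
    case "$1" in
        HEALTH_OK*)   return 0 ;;
        HEALTH_WARN*) return 1 ;;
        *)            return 2 ;;   # HEALTH_ERR or anything unexpected
    esac
}
```

Typical use would be `health_exit_code "$(ceph health)"`, for example at the top of a maintenance script.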
---
ceph auth get client.bootstrap-osd
ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
ceph-volume lvm create --bluestore --data /dev/fioa
---
The basic installation and configuration is complete. Depending on your setup, some of the following steps are required to start using Ceph:
- Install Ceph on other nodes
- Create additional Ceph Monitors
- Create Ceph OSDs
- Create Ceph Pools
To learn more, click on the help button below.
https://proxmox1.example.com/pve-docs/chapter-pveceph.html#pve_ceph_install
---
Welcome to Ceph — Ceph Documentation https://docs.ceph.com/en/latest/
Manual Install from Command Line
Deploy Hyper-Converged Ceph Cluster - Proxmox VE https://pve.proxmox.com/wiki/Deploy_Hyper-Converged_Ceph_Cluster
--
echo "deb http://download.proxmox.com/debian/ceph-reef bookworm no-subscription" > /etc/apt/sources.list.d/ceph.list
apt update
apt install ceph
OR
pveceph install
# installs the default Ceph release for the installed Proxmox VE version; here that was 18.2.2-pve1, which is Reef (Quincy is 17.2.x)
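The repository line above hard-codes the Ceph release (reef) and the Debian suite (bookworm). A small sketch that builds the same line from two arguments, so that the pair is changed in one place; `ceph_repo_line` is a hypothetical helper name, and only the reef/bookworm combination comes from these notes.

```shell
# Print the no-subscription Ceph repository line for a release and Debian suite.
# ceph_repo_line is a hypothetical helper; other release/suite pairs are untested here.
ceph_repo_line() {
    release="$1"
    suite="$2"
    printf 'deb http://download.proxmox.com/debian/ceph-%s %s no-subscription\n' \
        "$release" "$suite"
}
```

Usage would be `ceph_repo_line reef bookworm > /etc/apt/sources.list.d/ceph.list`, followed by `apt update`.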
Install Ceph on dedicated network:
# note: /etc/ceph/ceph.conf is linked to /etc/pve/ceph.conf
mv /etc/pve/ceph.conf /etc/pve/ceph.conf.bak  # just in case...
pveceph init --network 10.10.10.0/24
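Before initializing Ceph on a dedicated network, it can help to confirm that the node's address on that interface actually falls inside the subnet passed to `--network`. The following is a sketch in plain shell arithmetic for IPv4 only; `ip_to_int` and `in_cidr` are hypothetical helper names, and octets with leading zeros are not handled.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 (CIDR notation, e.g. 10.10.10.0/24).
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For example, `in_cidr 10.10.10.5 10.10.10.0/24 && pveceph init --network 10.10.10.0/24`, where 10.10.10.5 stands in for the node's own address.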
Create Monitors (MON): (at least 3 for HA)
pveceph mon create
Create Managers (MGR): (at least 1; only one is active, the rest are standby)
pveceph mgr create
Create LVM OSDs:
- See #Ceph on LVM
Create an OSD on a whole disk (BlueStore is the default since the Ceph Luminous release):
pveceph osd create /dev/sd[X]
note: If the disk was in use before (for example, for ZFS or as an OSD), you first need to zap all traces of that usage:
ceph-volume lvm zap /dev/sd[X] --destroy
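Because zapping with `--destroy` wipes the device, a confirmation step that requires retyping the device path can guard against a wrong `/dev/sd[X]`. This is only a sketch of that pattern; `confirm_zap` is a hypothetical helper, not part of ceph-volume or pveceph.

```shell
# Ask the operator to retype the device path; succeed only on an exact match.
# confirm_zap is a hypothetical helper name.
confirm_zap() {
    dev="$1"
    printf 'Type %s to confirm wiping it: ' "$dev" >&2
    IFS= read -r answer || return 1
    [ "$answer" = "$dev" ]
}
```

It could be used as `confirm_zap /dev/sdb && ceph-volume lvm zap /dev/sdb --destroy`, with `/dev/sdb` replaced by the real device.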
Create Pool:
# pveceph pool create NAME
pveceph pool create CEPH
pveceph pool create CEPH -pg_autoscale_mode on
pveceph pool create CEPH -pg_autoscale_mode on --add_storages
# -pg_autoscale_mode <off|on|warn>  (warn is the default)
pool CEPH: applying size = 3
pool CEPH: applying application = rbd
pool CEPH: applying min_size = 2
pool CEPH: applying pg_autoscale_mode = warn
pool CEPH: applying pg_num = 128
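The `size = 3` and `min_size = 2` values applied above have two practical consequences: the pool keeps accepting I/O with one replica missing, and usable space is roughly a third of raw space. A sketch of that arithmetic; the helper names are hypothetical, and the capacity figure is a rough estimate that ignores Ceph overhead and full-ratio limits.

```shell
# Replica losses a replicated pool can absorb while still accepting I/O.
# args: size min_size
tolerated_failures() {
    echo $(( $1 - $2 ))
}

# Rough usable capacity of a replicated pool: raw capacity divided by size.
# args: raw_capacity size   (same unit in and out, integer division)
usable_capacity() {
    echo $(( $1 / $2 ))
}
```

With the defaults shown, `tolerated_failures 3 2` gives 1, and 12000 GiB of raw OSD space gives about 4000 GiB via `usable_capacity 12000 3`.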
Edit Pool:
pveceph pool set CEPH -pg_autoscale_mode on
Create Storage on top of pool: [1]
pvesm add rbd CEPH --pool CEPH --content images
  --monhost "10.1.1.20 10.1.1.21 10.1.1.22" ???
  --keyring /root/rbd.keyring ???  /etc/pve/priv/ceph/<STORAGE_ID>.conf ???
note: maybe best to do this one from the GUI?
Status
pveceph status
Sync config to other nodes
NOTE: this step should not be needed. /etc/pve is the Proxmox cluster file system (pmxcfs), so /etc/pve/ceph.conf is replicated to all cluster nodes automatically.
# note: /etc/ceph/ceph.conf is linked to /etc/pve/ceph.conf
scp /etc/pve/ceph.conf root@[node2_ip]:/etc/pve/ceph.conf
scp /etc/pve/ceph.conf root@[node3_ip]:/etc/pve/ceph.conf
file store - optional
Defaults to half Ceph storage
Create Metadata Server (MDS): (need at least 1)
pveceph mds create
Create CephFS
pveceph fs create -name CEPHFS -add-storage
- destroy -
Stop MDS first:
# default name is the nodename
pveceph mds destroy [NAME]
pveceph mds destroy proxmox1
pvesm status
Remove Storage:
pvesm remove [NAME]
pvesm remove CEPHFS
Remove FS:
pveceph fs destroy [NAME]
pveceph fs destroy CEPHFS
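The teardown order described above (MDS first, then the storage entry, then the file system) can be captured in a dry-run helper that only prints the commands, so the sequence can be reviewed before anything is destroyed. `cephfs_teardown_plan` is a hypothetical name; the commands it prints are the ones from these notes, and using the same name for the storage and the file system is an assumption that matches the CEPHFS example.

```shell
# Print, without executing, the CephFS teardown commands in order.
# args: mds_name fs_name   (storage ID assumed equal to fs_name, as in the notes)
cephfs_teardown_plan() {
    mds="$1"
    fs="$2"
    echo "pveceph mds destroy $mds"
    echo "pvesm remove $fs"
    echo "pveceph fs destroy $fs"
}
```

For the example in these notes that would be `cephfs_teardown_plan proxmox1 CEPHFS`; the printed lines can then be run one at a time.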
misc
MDS:
# pveceph mds destroy proxmox1
disabling service 'ceph-mds@proxmox1.service'
Removed "/etc/systemd/system/ceph-mds.target.wants/ceph-mds@proxmox1.service".
stopping service 'ceph-mds@proxmox1.service'
removing ceph-mds directory '/var/lib/ceph/mds/ceph-proxmox1'
removing ceph auth for 'mds.proxmox1'
ceph packages:
# dpkg -l | grep -i ceph
ii  ceph                                 18.2.2-pve1                         amd64        distributed storage and file system
ii  ceph-base                            18.2.2-pve1                         amd64        common ceph daemon libraries and management tools
ii  ceph-common                          18.2.2-pve1                         amd64        common utilities to mount and interact with a ceph storage cluster
ii  ceph-fuse                            18.2.2-pve1                         amd64        FUSE-based client for the Ceph distributed file system
ii  ceph-mds                             18.2.2-pve1                         amd64        metadata server for the ceph distributed file system
ii  ceph-mgr                             18.2.2-pve1                         amd64        manager for the ceph distributed storage system
ii  ceph-mgr-modules-core                18.2.2-pve1                         all          ceph manager modules which are always enabled
ii  ceph-mon                             18.2.2-pve1                         amd64        monitor server for the ceph storage system
ii  ceph-osd                             18.2.2-pve1                         amd64        OSD server for the ceph storage system
ii  ceph-volume                          18.2.2-pve1                         all          tool to facilidate OSD deployment
ii  libcephfs2                           18.2.2-pve1                         amd64        Ceph distributed file system client library
ii  libsqlite3-mod-ceph                  18.2.2-pve1                         amd64        SQLite3 VFS for Ceph
ii  python3-ceph-argparse                18.2.2-pve1                         all          Python 3 utility libraries for Ceph CLI
ii  python3-ceph-common                  18.2.2-pve1                         all          Python 3 utility libraries for Ceph
ii  python3-cephfs                       18.2.2-pve1                         amd64        Python 3 libraries for the Ceph libcephfs library
ii  python3-rados                        18.2.2-pve1                         amd64        Python 3 libraries for the Ceph librados library
ii  python3-rbd                          18.2.2-pve1                         amd64        Python 3 libraries for the Ceph librbd library
ii  python3-rgw                          18.2.2-pve1                         amd64        Python 3 libraries for the Ceph librgw library
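All packages in the listing above share one version. A quick way to spot a node with mixed Ceph versions is to reduce the `dpkg -l` output to its distinct version strings. A minimal sketch, assuming the usual `dpkg -l` column order (status, name, version); `ceph_pkg_versions` is a hypothetical helper name.

```shell
# Read `dpkg -l` output on stdin and print the distinct versions of
# installed ("ii") packages whose name contains "ceph".
ceph_pkg_versions() {
    awk '$1 == "ii" && $2 ~ /ceph/ { print $3 }' | sort -u
}
```

Run as `dpkg -l | ceph_pkg_versions`; more than one output line suggests a partially upgraded node.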
ceph config example:
# cat /etc/ceph/ceph.conf
[global]
        auth_client_required = cephx
        auth_cluster_required = cephx
        auth_service_required = cephx
        cluster_network = 10.10.108.0/24
        fsid = a33b5284-6139-4a1c-88b5-0xxxxxxxxxx
        mon_allow_pool_delete = true
        mon_host = 10.10.108.31 10.10.108.32 10.10.108.33
        ms_bind_ipv4 = true
        ms_bind_ipv6 = false
        osd_pool_default_min_size = 2
        osd_pool_default_size = 3
        public_network = 10.10.108.0/24
[client]
        keyring = /etc/pve/priv/$cluster.$name.keyring
[client.crash]
        keyring = /etc/pve/ceph/$cluster.$name.keyring
[mon.proxmox1]
        public_addr = 10.10.108.31
[mon.proxmox2]
        public_addr = 10.10.108.32
[mon.proxmox3]
        public_addr = 10.10.108.33
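To check a single setting in a file like the one above, for instance whether `public_network` and `cluster_network` are the same subnet, the value can be pulled out with a small awk filter. This is a sketch only: `conf_get` is a hypothetical helper, it expects exact section and key spelling, and it does not handle comments or the alternative key spellings that Ceph itself accepts.

```shell
# Read a ceph.conf-style file on stdin and print the value of KEY in [SECTION].
# usage: conf_get SECTION KEY < /etc/ceph/ceph.conf
conf_get() {
    section="$1"
    key="$2"
    awk -v s="[$section]" -v k="$key" '
        /^\[/ { in_s = ($0 == s); next }      # track which section we are in
        in_s {
            split($0, kv, "=")
            name = kv[1]
            gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim the key
            if (name == k) {
                val = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
                exit
            }
        }'
}
```

In the example config both `conf_get global public_network` and `conf_get global cluster_network` would print 10.10.108.0/24, meaning client and replication traffic share one network.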
pveceph init options:
USAGE: pveceph init  [OPTIONS]
  Create initial ceph default configuration and setup symlinks.
  -cluster-network <string>
             Declare a separate cluster network, OSDs will routeheartbeat,
             object replication and recovery traffic over it
             Requires option(s): network
  -disable_cephx <boolean>   (default=0)
             Disable cephx authentication.
             WARNING: cephx is a security feature protecting against
             man-in-the-middle attacks. Only consider disabling cephx if
             your network is private!
  -min_size  <integer> (1 - 7)   (default=2)
             Minimum number of available replicas per object to allow I/O
  -network   <string>
             Use specific network for all ceph related traffic
  -pg_bits   <integer> (6 - 14)   (default=6)
             Placement group bits, used to specify the default number of
             placement groups.
             Depreacted. This setting was deprecated in recent Ceph
             versions.
  -size      <integer> (1 - 7)   (default=3)
             Targeted number of replicas per object
Ceph on LVM
List:
pvesm status  # should show "local-lvm"
pvesm list local-lvm  # list usage
Move everything off, so it is free to remove.
Remove local-lvm:
pvesm remove local-lvm
lvremove /dev/pve/data
bootstrap auth:
ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
Create new logical volume with the remaining free space:
lvcreate -l 100%FREE -n pve/ceph
Create (= prepare and activate) the logical volume for OSD:
ceph-volume lvm create --data pve/ceph
Use GUI to create Metadata servers, create CephFS, etc
Ref: https://forum.proxmox.com/threads/ceph-osd-on-lvm-logical-volume.68618/
Undo local-lvm
to undo the removal of local-lvm: [2]
# lvcreate -l99%FREE -n newLvName newVgName
# lvconvert --type thin newVgName/newLvName
pvesm status
pvesm add lvmthin local-lvm --thinpool data --vgname pve
OSD Service Stopped
In "ceph osd df tree" (or the web UI), the "up"/"down" state is controlled by the systemd service, while "in"/"out" is controlled by Ceph.
To start or stop the service (this can also be done from the Proxmox web interface):
$ sudo systemctl status ceph-osd.target
- ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service instances at once
    Loaded: loaded (/lib/systemd/system/ceph-osd.target; enabled; preset: enabled)
    Active: active since Sat 2024-09-07 21:50:39 MDT; 1min 16s ago
$ sudo systemctl status ceph-osd.target
$ sudo systemctl stop ceph-osd.target
$ sudo systemctl start ceph-osd.target
$ sudo systemctl restart ceph-osd.target
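The target above acts on every OSD on the node at once. The status output mentions `ceph-osd@.service` instances, so a single OSD can be addressed by its numeric id instead. A small sketch that builds the unit name; `osd_unit` is a hypothetical helper, and the id 3 in the usage note is only an example.

```shell
# Build the systemd unit name for one OSD from its numeric id.
# osd_unit is a hypothetical helper name.
osd_unit() {
    case "$1" in
        ''|*[!0-9]*)
            echo "OSD id must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo "ceph-osd@$1.service"
}
```

For example, `sudo systemctl restart "$(osd_unit 3)"` restarts only OSD 3 and leaves the other OSDs on the node running.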
Verify Auth
ceph auth ls
Issues
ceph auth get client.bootstrap-osd - Error initializing cluster client
Error:
# ceph auth get client.bootstrap-osd
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')
Fix with:
ln -s /etc/pve/ceph.conf /etc/ceph/ceph.conf
ref: [SOLVED] - Ceph Pacific Issue | Proxmox Support Forum - https://forum.proxmox.com/threads/ceph-pacific-issue.127987/
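The symlink fix can be made safe to re-run by creating the link only when nothing exists at the destination. A sketch under that assumption; `ensure_conf_link` is a hypothetical helper, and its defaults are the two paths used in the fix above.

```shell
# Create the ceph.conf symlink only if the link path does not exist yet.
# args (optional): target link   defaults: /etc/pve/ceph.conf /etc/ceph/ceph.conf
ensure_conf_link() {
    target="${1:-/etc/pve/ceph.conf}"
    link="${2:-/etc/ceph/ceph.conf}"
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
        return 0
    fi
    mkdir -p "$(dirname "$link")" &&
        ln -s "$target" "$link" &&
        echo "linked: $link -> $target"
}
```

Called without arguments as root, it performs the same `ln -s` as the fix, and on later runs it only reports that the link is already there.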