Setting up a GlusterFS Cluster
Introduction to GlusterFS
GlusterFS is the core of the scale-out storage solution Gluster. It is an open-source distributed file system with strong horizontal scaling capabilities: it can grow to several petabytes of capacity and serve thousands of clients. GlusterFS aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA networks and manages the data under a single global namespace.
GlusterFS may sound unfamiliar; NFS, GFS, and HDFS are probably better known, and of these NFS is the most widely used because it is simple to manage. However, NFS (and MooseFS, mentioned later) has a single point of failure, which is usually worked around by pairing it with DRBD for block-level replication. GlusterFS avoids this problem entirely, because it is a fully decentralized system with no central metadata server.
GlusterFS official website
GlusterFS documentation
GlusterFS Features
- Scalability and high performance
GlusterFS combines two properties to deliver highly scalable storage, from a few terabytes to several petabytes. Its scale-out architecture grows capacity and performance simply by adding resources: disk, compute, and I/O can each be added independently, and high-speed interconnects such as 10GbE and InfiniBand are supported. Gluster's elastic hashing removes the need for a metadata server, eliminating that single point of failure and performance bottleneck and enabling truly parallel data access.
- High availability
GlusterFS can replicate files automatically (mirroring or n-way replication), so data remains accessible even after hardware failures. Its self-heal feature restores data to a consistent state, and healing runs incrementally in the background with almost no performance overhead. GlusterFS does not define its own private on-disk format; it stores files on standard disk file systems (such as EXT3 or ZFS), so the data can be copied and accessed with ordinary tools.
- Elastic volume management
Data is stored in logical volumes, which are carved out of a virtualized pool of physical storage. Storage servers can be added and removed online without interrupting applications. Logical volumes can grow and shrink across the configured servers, data can be migrated between servers to rebalance capacity, and systems can be added or removed, all online. File system configuration changes also take effect online, allowing adaptation to changing workloads and live performance tuning.
System Overview
GlusterFS has a server side and a client side. The server side runs two kinds of processes: glusterd, which manages the GlusterFS system itself and listens on port 24007, and glusterfsd, which serves a storage brick (one glusterfsd process per brick, listening on ports from 49152 upward).
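As a quick sanity check, the process/port layout described above can be verified on a running server node. This is an illustrative sketch: the `brick_port` helper simply encodes the 49152-and-up numbering from the text, and the live-inspection commands are shown as comments because they only work on a node where glusterd is running.

```shell
# glusterd manages the cluster on TCP 24007; each brick gets its own
# glusterfsd process on 49152, 49153, ... in the order bricks were created.
brick_port() {   # usage: brick_port N -> listening port of the Nth local brick (1-based)
    echo $(( 49151 + $1 ))
}

brick_port 1    # first brick -> 49152

# On a live server node you could confirm the layout with:
#   ss -tlnp | grep -E 'glusterd|glusterfsd'
#   pgrep -a glusterfsd        # one process per local brick
```

Remember to open 24007 and the brick port range on any firewall between nodes if you do not disable it outright as this walkthrough does.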
The client side supports several access methods: NFS, CIFS, FTP, libgfapi, and a native FUSE-based client. In production the FUSE client is the usual choice (feel free to experiment with the others).
GlusterFS keeps its configuration under /var/lib/glusterd and its logs under /var/log. For production, six or more server nodes are recommended, using a distributed replicated volume with a replica count of 3 on an XFS backing file system. (When the brick count is a multiple of the replica count, a replicated volume automatically becomes a distributed replicated volume.)
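Those two directories are the first place to look when debugging. The sketch below is illustrative: the `client_log` helper encodes the FUSE client's log-naming convention (mount path with `/` replaced by `-`), which is handy when hunting for the right log file; the `ls`/`tail` commands are comments since they assume a live server node.

```shell
# Server-side state lives under /var/lib/glusterd, logs under /var/log/glusterfs.
# The FUSE client writes its log to a file named after the mount point.
client_log() {   # usage: client_log /mount/point -> path of that mount's client log
    echo "/var/log/glusterfs/$(echo "${1#/}" | tr / -).log"
}

client_log /mount_data    # -> /var/log/glusterfs/mount_data.log

# On a server node:
#   ls /var/lib/glusterd/vols              # one subdirectory per volume
#   tail -f /var/log/glusterfs/glusterd.log
```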
Installation
Environment
| IP | Hostname | Brick disk | Filesystem |
| --- | --- | --- | --- |
| 192.168.2.10 | server1 | /dev/sdb | ext4 |
| 192.168.2.11 | server2 | /dev/sdb | ext4 |
| 192.168.2.12 | server3 | /dev/sdb | ext4 |
| 192.168.2.13 | server4 | /dev/sdb | ext4 |
Set hostnames and passwordless SSH login (shown on server1; repeat on every server):
```
[root@server1 ~]# vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.2.10 server1
192.168.2.11 server2
192.168.2.12 server3
192.168.2.13 server4
[root@server1 ~]# ssh-keygen
[root@server1 ~]# for i in {10..13}
> do
>   scp /etc/hosts root@192.168.2.$i:/etc/hosts
>   ssh-copy-id root@192.168.2.$i
> done
[root@server1 ~]# ssh root@server2
[root@server2 ~]# logout
Connection to server2 closed.
```

Disable the firewall
```
[root@server1 ~]# systemctl stop firewalld && systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
```

Configure SELinux
```
[root@server1 ~]# vim /etc/selinux/config
SELINUX=disabled
[root@server1 ~]# setenforce 0
[root@server1 ~]# getenforce
Permissive
[root@server1 ~]# yum install -y flex bison openssl-devel libacl-devel sqlite-devel libxml2-devel libtool automake autoconf gcc gcc-c++ attr libuuid-devel

# liburcu-bp must be built from source; it is not in the yum repos
[root@server1 ~]# wget https://github.com/urcu/userspace-rcu/archive/v0.7.16.tar.gz -O userspace-rcu-0.7.16.tar.gz
[root@server1 ~]# tar -xf userspace-rcu-0.7.16.tar.gz
[root@server1 ~]# cd /root/userspace-rcu-0.7.16

# Run the usual build steps inside the source directory
[root@server1 userspace-rcu-0.7.16]# ./bootstrap
[root@server1 userspace-rcu-0.7.16]# ./configure && make && make install

# After installing, the two commands below let the system find urcu
[root@server1 userspace-rcu-0.7.16]# ldconfig    # refresh the dynamic-linker cache; do NOT skip this, or starting glusterd later fails with a baffling error
[root@server1 userspace-rcu-0.7.16]# pkg-config --libs --cflags liburcu-bp.pc liburcu.pc
-I/usr/local/include -L/usr/local/lib -lurcu-bp -lurcu

# If you want geo-replication, additionally install these packages and enable the SSH service:
[root@server1 ~]# yum -y install passwd openssh-client openssh-server
```

With these dependencies installed, download the GlusterFS source from the official site and compile it. The build uses the usual commands; pass --enable-debug to configure if you want a debug build with debugging information.

Download the GlusterFS source package from the official site
```
[root@server1 ~]# wget https://download.gluster.org/pub/gluster/glusterfs/8/8.2/glusterfs-8.2.tar.gz
[root@server1 ~]# tar -xf glusterfs-8.2.tar.gz
[root@server1 ~]# cd glusterfs-8.2/
[root@server1 glusterfs-8.2]# ./autogen.sh
... GlusterFS autogen ...
Running aclocal...
Running autoheader...
Running libtoolize...
Running autoconf...
Running automake...
Please proceed with configuring, compiling, and installing.
[root@server1 glusterfs-8.2]# ./configure --prefix=/usr/local
GlusterFS configure summary
===========================
FUSE client          : yes
epoll IO multiplex   : yes
fusermount           : yes
readline             : no
georeplication       : yes
Linux-AIO            : no
Enable Debug         : no
Enable ASAN          : no
Enable TSAN          : no
Use syslog           : yes
XML output           : yes
Unit Tests           : no
Track priv ports     : yes
POSIX ACLs           : yes
SELinux features     : yes
firewalld-config     : no
Events               : yes
EC dynamic support   : x64 sse avx
Use memory pools     : yes
Nanosecond m/atimes  : yes
Server components    : yes
Legacy gNFS server   : no
IPV6 default         : no
Use TIRPC            : missing
With Python          : 2.7
Cloudsync            : yes
Link with TCMALLOC   : no
[root@server1 glusterfs-8.2]# make && make install
```

Note: when using GlusterFS, hostnames within the LAN must be unique, and every hostname must be resolvable.
```
# Start glusterd
[root@server1 ~]# systemctl start glusterd.service
[root@server1 ~]# systemctl enable glusterd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/glusterd.service to /usr/local/lib/systemd/system/glusterd.service.
[root@server1 ~]# systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/local/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-11-11 11:16:54 CST; 34s ago
     Docs: man:glusterd(8)
 Main PID: 80492 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─80492 /usr/local/sbin/glusterd -p /usr/local/var/run/glusterd.pid --log-level INFO
[root@server1 ~]# ps -ef | grep gluster
root  80492     1  0 11:16 ?      00:00:00 /usr/local/sbin/glusterd -p /usr/local/var/run/glusterd.pid --log-level INFO
root  80571  2060  0 11:17 pts/0  00:00:00 grep --color=auto gluster
```

GlusterFS Cluster Planning and Configuration
Overall workflow: partition → format → mount
Partitioning
```
[root@server1 ~]# fdisk -l   # inspect the current disks and partitions
Disk /dev/sda: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000c1619

   Device Boot    Start       End    Blocks  Id System
/dev/sda1   *      2048   2099199   1048576  83 Linux
/dev/sda2       2099200  41943039  19921920  8e Linux LVM

Disk /dev/mapper/centos-root: 18.2 GB, 18249416704 bytes, 35643392 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/centos-swap: 2147 MB, 2147483648 bytes, 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

[root@server1 ~]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
sr0              11:0    1 9.6G  0 rom  /etc/gz

# For this exercise I add a new disk; on VMware, just power off and attach one
[root@server1 ~]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
sdb               8:16   0  20G  0 disk
sr0              11:0    1 9.6G  0 rom  /etc/gz

# Next, partition, format, and mount /dev/sdb
```
```
[root@server1 ~]# fdisk /dev/sdb   # partition the new disk
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x218b7141.

Command (m for help): n   # new partition
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p):   # accept the default partition type by pressing Enter
Using default response p
Partition number (1-4, default 1):   # Enter
First sector (2048-41943039, default 2048):   # Enter
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039): +5G   # choose a size, suffixed with M, G, etc.
Partition 1 of type Linux and of size 5 GiB is set

Command (m for help): p   # print the current partition table
Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x218b7141

   Device Boot Start       End   Blocks  Id System
/dev/sdb1       2048  10487807  5242880  83 Linux

Command (m for help): w   # write and exit
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

# Format
[root@server1 ~]# mkfs.ext4 /dev/sdb1
[root@server1 ~]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
sdb               8:16   0  20G  0 disk
└─sdb1            8:17   0   5G  0 part
sr0              11:0    1 9.6G  0 rom  /etc/gz

# Mount
[root@server1 ~]# mkdir /node
[root@server1 ~]# mount /dev/sdb1 /node/
[root@server1 ~]# df -h /node
Filesystem     Size  Used Avail Use% Mounted on
/dev/sdb1      4.8G   20M  4.6G   1% /node

# Mount at boot
[root@server1 ~]# vim /etc/fstab
/dev/sdb1 /node ext4 defaults 0 0
[root@server1 ~]# mount -a
```

Configuring the GlusterFS Cluster
```
[root@server1 ~]# gluster peer status   # check cluster state; no other hosts in the trusted pool yet
Number of Peers: 0
[root@server1 ~]# gluster peer probe server2   # build the trusted pool (probing from one side is enough)
peer probe: success.
[root@server1 ~]# gluster peer probe server3
peer probe: success
[root@server1 ~]# gluster peer probe server4
peer probe: success

[root@server1 ~]# gluster peer status
Number of Peers: 3

Hostname: server2
Uuid: 687334ba-bec2-41eb-a51d-36779607bf59
State: Peer in Cluster (Connected)

Hostname: server3
Uuid: 11cfe205-643e-4237-b171-7569f0cf1b57
State: Peer in Cluster (Connected)

Hostname: server4
Uuid: 2282466a-14c8-4356-b645-b287c7929abd
State: Peer in Cluster (Connected)

[root@server1 ~]# gluster pool list   # list the storage pool
UUID                                  Hostname   State
687334ba-bec2-41eb-a51d-36779607bf59  server2    Connected
11cfe205-643e-4237-b171-7569f0cf1b57  server3    Connected
2282466a-14c8-4356-b645-b287c7929abd  server4    Connected
37dc4314-61a3-4e37-be76-a4075ca59a71  localhost  Connected

# Create a volume
[root@server1 ~]# gluster volume list   # no volumes in the cluster yet
No volumes present in cluster

[root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node   # create a volume — but this fails!
```

The error is:

```
volume create: data: failed: The brick server1:/data/glusterfs is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.
```

This happens because the brick is being created on the system disk, which Gluster disallows by default. In production, keep bricks off the system disk whenever possible; if you must use it, append `force` — by default the cluster refuses to create a volume under root.
```
[root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node force   # append force as the error message suggests
volume create: data: success: please start the volume to access data
```

Alternatively, create a dedicated user at build time and start GlusterFS as that user, giving it full permissions over the cluster.

```
# Single disk — recommended for debugging environments
[root@server1 ~]# gluster vol create test server1:/test force
volume create: test: success: please start the volume to access data

# Multiple disks, no RAID — recommended for lab/test environments
[root@server1 ~]# gluster vol create testdata server1:/testdata server2:/testdata server3:/testdata server4:/testdata force
volume create: testdata: success: please start the volume to access data

# Multiple disks with RAID 1 — recommended for high-concurrency production environments
[root@server1 ~]# gluster volume create data replica 4 server1:/node server2:/node server3:/node server4:/node
volume create: data: success: please start the volume to access data
```

Note: in the commands above, the number of disks must be an integer multiple of the replica count. RAID 0/10/5/6 arrangements also exist, but they are not recommended for small-file workloads in production.

```
[root@server1 ~]# gluster volume list   # the newly created volumes appear
data
test
testdata

[root@server1 ~]# gluster volume info   # detailed volume information

Volume Name: data
Type: Replicate
Volume ID: 3c711bfd-599d-463b-bec4-51ef49a5be21
Status: Created
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: server1:/node
Brick2: server2:/node
Brick3: server3:/node
Brick4: server4:/node
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Volume Name: test
Type: Distribute
Volume ID: e0161069-8913-43f6-abb6-f172441bfe35
Status: Created
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: server1:/test
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

Volume Name: testdata
Type: Distribute
Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
Status: Created
```
```
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/testdata
Brick2: server2:/testdata
Brick3: server3:/testdata
Brick4: server4:/testdata
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
```

Start the volumes:

```
[root@server1 ~]# gluster volume start data   # start the volume
volume start: data: success
[root@server1 ~]# gluster vol start test
volume start: test: success
[root@server1 ~]# gluster vol start testdata
volume start: testdata: success
```

Mount Test
```
[root@server1 ~]# mkdir /mount_data
[root@server1 ~]# mount -t glusterfs -o acl server1:/data /mount_data/   # mount the volume on server1
[root@server1 ~]# mkdir /mount_test
[root@server1 ~]# mount -t glusterfs -o acl server1:/test /mount_test/
[root@server1 ~]# mkdir /mount_testdata
[root@server1 ~]# mount -t glusterfs -o acl server1:/testdata/ /mount_testdata/

[root@server1 ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 475M     0  475M   0% /dev
tmpfs                    487M     0  487M   0% /dev/shm
tmpfs                    487M  7.7M  479M   2% /run
tmpfs                    487M     0  487M   0% /sys/fs/cgroup
/dev/mapper/centos-root   37G  1.9G   36G   5% /
/dev/sr0                 9.6G  9.6G     0 100% /mnt/gz
/dev/sda1               1014M  137M  878M  14% /boot
tmpfs                     98M     0   98M   0% /run/user/0
/dev/sdc1                4.8G   22M  4.6G   1% /node
server1:/data            4.8G   71M  4.6G   2% /mount_data
server1:/test             37G  2.2G   35G   6% /mount_test
server1:/testdata        148G  8.8G  140G   6% /mount_testdata

[root@server2 ~]# mount -t glusterfs -o acl server1:/data /mount_data/   # mount on server2; this warns that the attr package is missing
WARNING: getfattr not found, certain checks will be skipped..
[root@server2 ~]# yum -y install attr   # install the missing package
[root@server2 ~]# touch /mount_data/{1..10}test.txt   # create files on server2

# Once created, the new files are visible on every host in the cluster, including the local one
[root@server1 ~]# ls /mount_data/
10test.txt  2test.txt  4test.txt  6test.txt  8test.txt  lost+found
1test.txt   3test.txt  5test.txt  7test.txt  9test.txt
[root@server4 ~]# ls /mount_data/
10test.txt  2test.txt  4test.txt  6test.txt  8test.txt  lost+found
1test.txt   3test.txt  5test.txt  7test.txt  9test.txt
```

Online Expansion
As the business grows and the cluster runs out of capacity, more machines and disks need to be added to the cluster.
a. In the common case you only need to widen the distribution: the number of disks added must be an integer multiple of the minimum expansion unit, i.e. replica × stripe, or of the disperse count.
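The original does not show the expansion commands themselves, so here is a hedged sketch. The new hosts (server5 through server8) and the brick paths are assumptions, and the `bricks_ok` helper merely checks the multiple-of-the-unit rule stated above, which gluster also enforces on its own.

```shell
# Rule from the text: the number of added bricks must be a multiple of the
# minimum expansion unit (replica x stripe, or the disperse count).
bricks_ok() {   # usage: bricks_ok NEW_BRICK_COUNT UNIT
    [ $(( $1 % $2 )) -eq 0 ]
}

bricks_ok 4 4 && echo "adding 4 bricks is valid for a replica-4 volume"

# On a real cluster (hypothetical new hosts, volume 'data' from this article):
#   gluster peer probe server5             # first join each new node to the pool
#   gluster volume add-brick data server5:/node server6:/node server7:/node server8:/node
#   gluster volume rebalance data start    # spread existing data onto the new bricks
```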
Online Shrinking
If the original allocation turns out to be unreasonable and you want to repurpose some storage machines, there are, as with expansion, two cases.
a. Reducing the distribution width. The removed disks must form one or more whole storage units, which appear as consecutive bricks in the `volume info` listing. This command rebalances the data automatically.

```
[root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode start
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: 943f61e1-02da-4b79-a08a-55f06b7c468a
```

After starting, watch the removal status (really the auto-rebalance status) until it changes from "in progress" to "completed".

```
[root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode status
   Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
-------  ----------------  ------  -------  --------  -------  ---------  -----------------
server3                 0  0Bytes        0         0        0  completed            0:00:00
server4                 0  0Bytes        0         0        0  completed            0:00:00
```

Once the status shows completed, commit the removal.

```
[root@server1 ~]# gluster vol remove-brick test server3:/datanode server4:/datanode commit
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated. If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
```

b. Reducing the replica count. The removed disks must satisfy the layout requirements; in the `volume info` listing they usually appear as scattered bricks (though the IPs may be consecutive). This command does not need to rebalance data.

```
[root@server1 ~]# gluster vol remove-brick test server1:/datanode server2:/datanode force   # remove
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume. Do you want to continue?
```
```
(y/n) y
volume remove-brick commit force: success

[root@server1 ~]# gluster vol info test

Volume Name: test
Type: Distribute
Volume ID: e0161069-8913-43f6-abb6-f172441bfe35
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: server1:/test
Options Reconfigured:
performance.client-io-threads: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
```

When reducing the replica count, the bricks are simply dropped, and since the command ends in `force`, any data that was not fully replicated beforehand is lost. Treat this operation with extreme caution: first make sure the data is intact by running `gluster volume heal vol_name full` to repair it, then check with `gluster volume heal vol_name info` and `gluster volume status`, and proceed only when everything is healthy.

Configuring Rebalancing
```
[root@server1 ~]# gluster vol reblance status   # typo: 'reblance' is rejected
unrecognized word: reblance (position 1)
[root@server1 ~]# gluster vol reblance testdata status
unrecognized word: reblance (position 1)
[root@server1 ~]# gluster vol rebalance testdata status   # rebalancing has not been started yet
volume rebalance: testdata: failed: Rebalance not started for volume testdata.
[root@server1 ~]# gluster vol rebalance testdata start   # start rebalancing
volume rebalance: testdata: success: Rebalance on testdata has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 7c7dd7d7-1515-4637-805d-dc5dc43f471b

[root@server1 ~]# gluster vol rebalance testdata status   # status after starting
     Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
---------  ----------------  ------  -------  --------  -------  ---------  -----------------
  server2                 0  0Bytes        0         0        0  completed            0:00:00
  server3                 0  0Bytes        0         0        0  completed            0:00:00
  server4                 0  0Bytes        0         0        0  completed            0:00:00
localhost                 0  0Bytes        0         0        0  completed            0:00:00
volume rebalance: testdata: success
```

Setting Volume Options
```
[root@server1 ~]# gluster vol set testdata performance.cache-size 256MB   # set the cache size (size it to your hardware; too large a value can make client mounts fail later)
volume set: success
[root@server1 ~]# gluster vol info

Volume Name: testdata
Type: Distribute
Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/testdata
Brick2: server2:/testdata
Brick3: server3:/testdata
Brick4: server4:/testdata
Options Reconfigured:
performance.cache-size: 256MB
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
```

Configuring the Client
Requirement: the GFS client node must be able to reach the GFS server nodes.
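That requirement can be checked before mounting with a minimal reachability sketch. The server IP below is the one used in this article's environment; substitute your own. The helper uses bash's built-in `/dev/tcp` redirection, so no extra tools are needed.

```shell
# Prints "open" or "closed" for HOST:PORT using bash's /dev/tcp pseudo-device.
check_port() {   # usage: check_port HOST PORT
    if timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

check_port 192.168.2.10 24007   # glusterd management port on server1
# The brick ports (49152 and up) must also be reachable for actual file I/O.
```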
```
[root@client ~]# yum install -y glusterfs glusterfs-fuse

# On the GFS client node, create a local directory:
[root@client ~]# mkdir -p /test/gluster-test

# Mount the GFS volume onto the local directory:
[root@client ~]# mount.glusterfs 192.168.2.10:/data /test/gluster-test/

# Check the mount:
[root@client ~]# df -h /test/gluster-test/
Filesystem          Size  Used Avail Use% Mounted on
192.168.2.10:/data   34G  5.1G   29G  15% /test/gluster-test
```

Testing
Single-file test
Test method: create a 1 GB file from the client.

- DHT (distribute) mode, the default, also called a distribute volume: each file is placed on a single server node chosen by hash.

```
[root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
1+0 records in
1+0 records out
1048576000 bytes (1.0 GB) copied, 9.7207 s, 108 MB/s

real    0m9.858s
user    0m0.002s
sys     0m7.171s
```

- AFR (replicate) mode, created with `replica x`: each file is copied to x nodes.

```
[root@client ~]# time dd if=/dev/zero of=hello.txt bs=1024M count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 5.06884 s, 212 MB/s

real    0m5.206s
user    0m0.001s
sys     0m3.194s
```

- Striped mode, created with `stripe x`: files are split into chunks spread across x nodes (similar to RAID 0).

```
[root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
1+0 records in
1+0 records out
1048576000 bytes (1.0 GB) copied, 4.92539 s, 213 MB/s

real    0m5.047s
user    0m0.001s
sys     0m3.036s
```

- Distributed striped mode (Number of Bricks: 1 x 2 x 2 = 4), a combined type needing at least 4 servers, created with `stripe 2` over 4 nodes: DHT combined with Striped.

```
[root@client ~]# time dd if=/dev/zero of=hello bs=1000M count=1
1+0 records in
1+0 records out
1048576000 bytes (1.0 GB) copied, 5.0472 s, 208 MB/s

real    0m5.173s
user    0m0.000s
sys     0m3.098s
```

- Distributed replicated mode (Number of Bricks: 2 x 2 = 4), a combined type needing at least 4 servers, created with `replica 2` over 4 nodes: DHT combined with AFR.

```
[root@client ~]# time dd if=/dev/zero of=haha bs=100M count=10
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 1.00275 s, 1.0 GB/s

real    0m1.018s
user    0m0.001s
sys     0m0.697s
```

The distributed replicated volume was additionally tested with fio.

4K random tests:

```
# Install fio
[root@client ~]# yum -y install libaio-devel.x86_64
[root@client ~]# yum -y install fio
# Write test
[root@client ~]# fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=10G -filename=1.txt -name="EBS 4KB randwrite test" -iodepth=32 -runtime=60
  write: IOPS=4111, BW=16.1MiB/s (16.8MB/s)(964MiB/60001msec)
  WRITE: bw=16.1MiB/s (16.8MB/s), 16.1MiB/s-16.1MiB/s (16.8MB/s-16.8MB/s), io=964MiB (1010MB), run=60001-60001msec
# Read test
[root@client ~]# fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=10G -filename=1.txt -name="EBS 4KB randread test" -iodepth=8 -runtime=60
  read: IOPS=77.5k, BW=303MiB/s
```
```
(318MB/s)(10.0GiB/33805msec)
  READ: bw=303MiB/s (318MB/s), 303MiB/s-303MiB/s (318MB/s-318MB/s), io=10.0GiB (10.7GB), run=33805-33805msec

# 512K sequential write test
[root@client ~]# fio -ioengine=libaio -bs=512k -direct=1 -thread -rw=write -size=10G -filename=512.txt -name="EBS 512KB seqwrite test" -iodepth=64 -runtime=60
  write: IOPS=1075, BW=531MiB/s (556MB/s)(2389MiB/4501msec)
  WRITE: bw=531MiB/s (556MB/s), 531MiB/s-531MiB/s (556MB/s-556MB/s), io=2389MiB (2505MB), run=4501-4501msec
```

Other Maintenance Commands
Unmounting a volume. Mount and unmount are a pair of operations. A volume can be stopped without unmounting it first, but doing so causes problems; in a large cluster it can make the volume fail to start later.

```
[root@server1 ~]# umount /mount_test
```

Stopping a volume. Stop and start are a pair of operations; unmount all clients before stopping.

```
[root@server1 ~]# gluster vol stop test
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: test: success
```

Deleting a volume.

```
[root@server1 ~]# gluster vol delete test
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: test: success
```

Note: after deleting a volume, you must also delete the `.glusterfs/` and `.trashcan/` directories from the brick directory (e.g. /opt/gluster/data). Otherwise, creating a new volume on the same disk will lead to files not being distributed, or to volume-type confusion.

Detaching a node's GlusterFS disk:

```
[root@server1 ~]# gluster peer detach server4   # the error shows that a node hosting bricks needs remove-brick before it can be detached
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: failed: Peer server4 hosts one or more bricks.
```
```
If the peer is in not recoverable state then use either replace-brick or remove-brick command with force to remove all bricks from the peer and attempt the peer detach again.

[root@server1 ~]# gluster vol info testdata

Volume Name: testdata
Type: Distribute
Volume ID: 15874f41-0fab-4f28-885b-747536d8ba22
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/testdata
Brick2: server2:/testdata
Brick3: server3:/testdata
Brick4: server4:/testdata
Options Reconfigured:
performance.cache-size: 256MB
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

[root@server1 ~]# gluster vol remove-brick testdata server4:/testdata start   # note: remove-brick must not break a replica set, or the commit will fail
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: 24084420-9676-4f10-ac43-72ddc35caccf
[root@server1 ~]# gluster vol remove-brick testdata server4:/testdata status
   Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
-------  ----------------  ------  -------  --------  -------  ---------  -----------------
server4                 0  0Bytes        0         0        0  completed            0:00:00
[root@server1 ~]# gluster vol remove-brick testdata server4:/testdata commit
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated. If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
```
```
[root@server1 ~]# gluster peer detach server4   # now the node can be detached
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
```

Summary
This walkthrough covered building GlusterFS 8.2 from source, forming a trusted storage pool, creating and starting volumes, mounting them via FUSE, online expansion and shrinking, rebalancing, volume options, client configuration, performance testing, and routine maintenance commands.