[Gluster-users] Healing issue
Miloš Kozák
milos.kozak@lejmr.com
Sun Aug 16 10:52:05 UTC 2015
Hi, I have been running a GlusterFS volume for a while, and everything worked
just fine even after a one-node failure. However, I went for a brick
replacement because my bricks were not thin-provisioned and I wanted to
use snapshots. During that replacement the whole volume went down because the
self-heal daemon took all the I/O, and all VMs running on top of that volume
became unresponsive.
So I am now rebuilding the volume from scratch. I created new thin-provisioned
bricks (a rough sketch of the creation commands follows the lvs/vgs/df listings below):
lvs:
brick_s3-sata-10k vg_s3-sata-10k Vwi-aotz 931,25g s3-sata-10k_pool 2,95
s3-sata-10k_pool vg_s3-sata-10k twi-a-tz 931,25g
vgs:
vg_s3-sata-10k 1 3 0 wz--n- 931,51g 148,00m
df:
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 28383480 947626120 3% /gfs/s3-sata-10k
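For completeness, the bricks were set up roughly along these lines (the VG and LV
names and the mount point are taken from the listings above, but the sizes and exact
options are from memory, so treat this as a sketch rather than the verbatim commands):

# create the thin pool and a thin LV inside it (sizes approximate)
lvcreate -L 931.25G --thinpool s3-sata-10k_pool vg_s3-sata-10k
lvcreate -V 931.25G -T vg_s3-sata-10k/s3-sata-10k_pool -n brick_s3-sata-10k
# XFS with 512-byte inodes (as commonly recommended for Gluster bricks), then mount
mkfs.xfs -i size=512 /dev/vg_s3-sata-10k/brick_s3-sata-10k
mount /dev/vg_s3-sata-10k/brick_s3-sata-10k /gfs/s3-sata-10k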
The bricks are formatted and mounted as shown above. When I uploaded two images
onto the volume I found there might be a problem. For the time being I run the
volume in replica 2 mode across two servers. The files were copied in through
node1, and I think they are complete on node1 only: on node2 the du figures
below are smaller than the file sizes, as if the copies there were still sparse.
However, volume heal reports that everything is OK.
My symptoms are as follows:
df output from both servers (nodef01i first, then nodef02i, matching the du figures below):
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 30754296 945255304 4% /gfs/s3-sata-10k
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 28383480 947626120 3% /gfs/s3-sata-10k
[root@nodef01i ~]# du /gfs/s3-sata-10k/
0 /gfs/s3-sata-10k/fs/.glusterfs/indices/xattrop
0 /gfs/s3-sata-10k/fs/.glusterfs/indices
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/htime
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/csnap
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs
0 /gfs/s3-sata-10k/fs/.glusterfs/00/00
0 /gfs/s3-sata-10k/fs/.glusterfs/00
0 /gfs/s3-sata-10k/fs/.glusterfs/landfill
20480004 /gfs/s3-sata-10k/fs/.glusterfs/84/26
20480004 /gfs/s3-sata-10k/fs/.glusterfs/84
10240000 /gfs/s3-sata-10k/fs/.glusterfs/d0/ff
10240000 /gfs/s3-sata-10k/fs/.glusterfs/d0
30720008 /gfs/s3-sata-10k/fs/.glusterfs
30720008 /gfs/s3-sata-10k/fs
30720008 /gfs/s3-sata-10k/
[root@nodef02i ~]# du /gfs/s3-sata-10k/
0 /gfs/s3-sata-10k/fs/.glusterfs/indices/xattrop
0 /gfs/s3-sata-10k/fs/.glusterfs/indices
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/htime
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/csnap
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs
0 /gfs/s3-sata-10k/fs/.glusterfs/00/00
0 /gfs/s3-sata-10k/fs/.glusterfs/00
0 /gfs/s3-sata-10k/fs/.glusterfs/landfill
18727172 /gfs/s3-sata-10k/fs/.glusterfs/84/26
18727172 /gfs/s3-sata-10k/fs/.glusterfs/84
9622016 /gfs/s3-sata-10k/fs/.glusterfs/d0/ff
9622016 /gfs/s3-sata-10k/fs/.glusterfs/d0
28349192 /gfs/s3-sata-10k/fs/.glusterfs
28349192 /gfs/s3-sata-10k/fs
28349192 /gfs/s3-sata-10k/
[root@nodef01i ~]# du /gfs/s3-sata-10k/fs/*
20480004 /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
10240000 /gfs/s3-sata-10k/fs/3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef02i ~]# du /gfs/s3-sata-10k/fs/*
18727172 /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
9622016 /gfs/s3-sata-10k/fs/3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef01i ~]# ll /gfs/s3-sata-10k/fs/
total 30720004
-rw-r----- 2 oneadmin oneadmin 20971520512 Aug  3 23:53 f1607f25aa52f4fb6f98f20ef0f3f9d7
-rw-r----- 2 oneadmin oneadmin 10485760000 Aug 16 11:23 3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef02i ~]# ll /gfs/s3-sata-10k/fs/
total 28349188
-rw-r----- 2 oneadmin oneadmin 20971520512 Aug  3 23:53 f1607f25aa52f4fb6f98f20ef0f3f9d7
-rw-r----- 2 oneadmin oneadmin 10485760000 Aug 16 11:22 3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef01i ~]# gluster volume heal ph-fs-0 info split-brain
Gathering list of split brain entries on volume ph-fs-0 has been successful
Brick 10.11.100.1:/gfs/s3-sata-10k/fs
Number of entries: 0
Brick 10.11.100.2:/gfs/s3-sata-10k/fs
Number of entries: 0
[root@nodef01i ~]# gluster volume heal ph-fs-0 info
Brick nodef01i.czprg:/gfs/s3-sata-10k/fs/
Number of entries: 0
Brick nodef02i.czprg:/gfs/s3-sata-10k/fs/
Number of entries: 0
[root@nodef01i ~]# gluster volume status
Status of volume: ph-fs-0
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.11.100.1:/gfs/s3-sata-10k/fs 49152 Y 3733
Brick 10.11.100.2:/gfs/s3-sata-10k/fs 49152 Y 64711
NFS Server on localhost 2049 Y 3747
Self-heal Daemon on localhost N/A Y 3752
NFS Server on 10.11.100.2 2049 Y 64725
Self-heal Daemon on 10.11.100.2 N/A Y 64730
Task Status of Volume ph-fs-0
------------------------------------------------------------------------------
There are no active volume tasks
[root@nodef02i ~]# gluster volume status
Status of volume: ph-fs-0
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.11.100.1:/gfs/s3-sata-10k/fs 49152 Y 3733
Brick 10.11.100.2:/gfs/s3-sata-10k/fs 49152 Y 64711
NFS Server on localhost 2049 Y 64725
Self-heal Daemon on localhost N/A Y 64730
NFS Server on 10.11.100.1 2049 Y 3747
Self-heal Daemon on 10.11.100.1 N/A Y 3752
Task Status of Volume ph-fs-0
------------------------------------------------------------------------------
There are no active volume tasks
[root@nodef02i ~]# rpm -qa | grep gluster
glusterfs-server-3.6.2-1.el6.x86_64
glusterfs-3.6.2-1.el6.x86_64
glusterfs-api-3.6.2-1.el6.x86_64
glusterfs-libs-3.6.2-1.el6.x86_64
glusterfs-cli-3.6.2-1.el6.x86_64
glusterfs-fuse-3.6.2-1.el6.x86_64
What other information should I provide?
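If the replication metadata would help, I can also dump the AFR extended attributes
of the two files from both bricks, for example for the larger one (path taken from
the listing above; getfattr comes from the attr package):

getfattr -d -m . -e hex /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7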
Thanks, Milos
-------------- next part --------------
Attachment: glustershd.log (text/x-log, 22577 bytes)
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150816/db94bca8/attachment.bin>