[Gluster-users] [ovirt-users] Recovering from a multi-node failure
Sahina Bose
sabose at redhat.com
Wed Aug 16 12:22:46 UTC 2017
On Sun, Aug 6, 2017 at 4:42 AM, Jim Kusznir <jim at palousetech.com> wrote:
> Well, after a very stressful weekend, I think I have things largely
> working. It turns out that most of the above issues were caused by the Linux
> permissions of the exports for all three volumes (they had been reset to
> 600; setting them to 774 or 770 fixed many of the issues). Of course, I
> didn't find that until after a much more harrowing outage and hours and
> hours of work, including starting to look at rebuilding my cluster...
>
> So, now my cluster is operating again, and everything looks good EXCEPT
> for one major Gluster issue/question that I haven't found any references or
> info on.
>
> My host ovirt2, one of the replica gluster servers, is the one that lost
> its storage and had to reinitialize it from the cluster. The iso volume is
> perfectly fine and complete, but the engine and data volumes are smaller on
> disk on this node than on the other node (and than on this node before the
> crash). For the engine volume, the entire cluster reports the smaller
> utilization on the mounted gluster filesystems; for the data volume, it
> reports the larger size (that of the rest of the cluster). Here are some df
> statements to help clarify:
>
> (brick1 = engine; brick2=data, brick4=iso):
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/gluster-engine 25G 12G 14G 47% /gluster/brick1
> /dev/mapper/gluster-data 136G 125G 12G 92% /gluster/brick2
> /dev/mapper/gluster-iso 25G 7.3G 18G 29% /gluster/brick4
> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
> 192.168.8.11:/data          136G  125G   12G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>
> View from ovirt2:
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/gluster-engine 15G 9.7G 5.4G 65% /gluster/brick1
> /dev/mapper/gluster-data 174G 119G 56G 69% /gluster/brick2
> /dev/mapper/gluster-iso 13G 7.3G 5.8G 56% /gluster/brick4
> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
> 192.168.8.11:/data          136G  125G   12G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>
> As you can see, in the process of rebuilding the hard drive for ovirt2, I
> did resize some things to give more space to data, where I desperately need
> it. If this goes well and the storage is given a clean bill of health, I
> will then take ovirt1 down and resize it to match ovirt2, and thus score a
> decent increase in storage for data. I fully realize that right now the
> gluster-mounted volumes report the total size of the smallest brick in the
> replica.
>
> So, is this size reduction appropriate? A big part of me thinks data is
> missing, but I even went through and shut down ovirt2's gluster daemons,
> wiped all the gluster data, and restarted gluster to allow it a fresh heal
> attempt, and it again came back to the exact same size. This cluster was
> originally built about the time ovirt 4.0 came out, and has been upgraded
> to 'current', so perhaps some new gluster features are making more
> efficient use of space (dedupe or something)?
>
The used capacity should be consistent on all nodes - I see you have a
discrepancy with the data volume brick. What does "gluster vol heal data
info" tell you? Are there entries to be healed?
Can you provide the glustershd logs?
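For reference, a minimal way to gather that information, using only the
standard gluster CLI and the volume names already shown in this thread, would
be something like:

    # pending heal entries per brick (run on any node in the cluster)
    gluster volume heal data info
    gluster volume heal engine info

    # files flagged as split-brain, if any
    gluster volume heal data info split-brain

    # self-heal daemon log, one copy per node
    less /var/log/glusterfs/glustershd.log

For the brick-size discrepancy itself, one possible (but unconfirmed)
explanation is a difference in sparse allocation between the original files
and the healed copies; comparing apparent size with on-disk usage on each
brick would show whether that is the case:

    du -sh --apparent-size /gluster/brick2
    du -sh /gluster/brick2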
>
> Thank you for your assistance!
> --Jim
>
> On Fri, Aug 4, 2017 at 7:49 PM, Jim Kusznir <jim at palousetech.com> wrote:
>
>> Hi all:
>>
>> Today has been rough: two of my three nodes went down, and self-heal has
>> not been healing well. Four hours later, the VMs are running, but the
>> engine is not happy. It claims the storage domain is down (even though it
>> is up on all hosts and the VMs are running). I'm getting a ton of these
>> messages logged:
>>
>> VDSM engine3 command HSMGetAllTasksStatusesVDS failed: Not SPM
>>
>> Aug 4, 2017 7:23:00 PM
>>
>> VDSM engine3 command SpmStatusVDS failed: Error validating master storage
>> domain: ('MD read error',)
>>
>> Aug 4, 2017 7:22:49 PM
>>
>> VDSM engine3 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:47 PM
>>
>> VDSM engine1 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:46 PM
>>
>> VDSM engine2 command SpmStatusVDS failed: Error validating master storage
>> domain: ('MD read error',)
>>
>> Aug 4, 2017 7:22:44 PM
>>
>> VDSM engine2 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:42 PM
>>
>> VDSM engine1 command HSMGetAllTasksStatusesVDS failed: Not SPM: ()
>>
>>
>> ------------
>> I cannot select an SPM because it claims the storage domain is down, and I
>> cannot bring the storage domain back up.
>>
>> Also in the storage realm, one of my exports shows substantially less
>> data than is actually there.
>>
>> Here's what happened, as best as I understand it:
>> I went to do maintenance on ovirt2 (I needed to replace a faulty RAM stick
>> and rework the disk). I put it in maintenance mode, then shut it down and
>> did my work. In the process, much of the disk contents were lost (all the
>> gluster data). I figured it was no big deal: the gluster data is redundant
>> on the network, so it would heal when the node came back up.
>>
>> While I was doing maintenance, all but one of the VMs were running on
>> engine1. When I turned on engine2, all of a sudden all VMs, including the
>> hosted engine, stopped and went non-responsive. As far as I can tell, this
>> should not have happened, as I had turned ON one host. Nonetheless, I
>> waited for recovery to occur (while customers started calling, asking why
>> everything had stopped working). As I waited, I kept checking, and gluster
>> volume status only showed ovirt1 and ovirt2. Apparently gluster had
>> stopped or failed at some point on ovirt3. I assume that was the cause of
>> the outage; still, if everything was working fine with ovirt1's gluster,
>> and ovirt2 powered on with a very broken gluster (the volume status was
>> showing N/A in the port fields for the gluster volumes), I would not
>> expect a working gluster to fall over like that.
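One plausible explanation for the sudden pause, though not confirmed by any
logs in this thread, is quorum enforcement: with glusterd already down on
ovirt3 and ovirt2 coming up with broken bricks, only one of the three bricks
was healthy, and a replica-3 volume with quorum enabled goes read-only in that
situation, which pauses the VMs. A minimal sketch of the checks that expose
this state, using only standard gluster CLI calls (volume name "data" taken
from this thread):

    # which peers the local glusterd can see
    gluster peer status

    # brick process status and ports; "N/A" in the port column means the
    # brick process is not running on that node
    gluster volume status data

    # quorum-related volume options (client- and server-side quorum)
    gluster volume get data cluster.quorum-type
    gluster volume get data cluster.server-quorum-type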
>>
>> After starting glusterd on ovirt3 and checking the status, all three nodes
>> showed ovirt1 and ovirt3 as operational, and ovirt2 as N/A. Unfortunately,
>> recovery was still not happening, so I did some googling and found the
>> commands for inquiring about the hosted-engine status. The engine VM
>> appeared to be stuck "paused" and I couldn't find a way to unpause it, so I
>> powered it off, then started it manually on engine1, and the cluster came
>> back up. It showed all VMs paused; I was able to unpause them and they
>> worked again.
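For reference, the hosted-engine inspection and restart described above maps
to commands like the following (standard ovirt-hosted-engine tooling; exact
output and flags can vary by version):

    # HA agent/broker view of the engine VM, run on any HA host
    hosted-engine --vm-status

    # power off a stuck/paused engine VM, then start it on the local host
    hosted-engine --vm-poweroff
    hosted-engine --vm-start

    # if global maintenance was enabled during recovery, disable it again
    hosted-engine --set-maintenance --mode=none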
>>
>> So now I began to work on the ovirt2 gluster healing problem. It didn't
>> appear to be self-healing, but eventually I found this document:
>> https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-glusterfs-array/
>> and from it found the magic xattr commands (a sketch of that procedure is
>> included further below). After setting them, the gluster volumes on ovirt2
>> came online. I told iso to heal, and it did, but it only came up with about
>> half as much data as it should have. I told it to do a full heal, and that
>> did finish off the remaining data and brought it up to full. I then told
>> engine to do a full heal (gluster volume heal engine full), and it
>> transferred its data from the other gluster hosts too. However, it said it
>> was done when it hit 9.7GB while there was 15GB on disk! It is still stuck
>> that way; the oVirt GUI and gluster volume heal engine info both show the
>> volume fully healed, but it is not:
>> [root at ovirt1 ~]# df -h
>> Filesystem Size Used Avail Use% Mounted on
>> /dev/mapper/centos_ovirt-root 20G 4.2G 16G 21% /
>> devtmpfs 16G 0 16G 0% /dev
>> tmpfs 16G 16K 16G 1% /dev/shm
>> tmpfs 16G 26M 16G 1% /run
>> tmpfs 16G 0 16G 0% /sys/fs/cgroup
>> /dev/mapper/gluster-engine 25G 12G 14G 47% /gluster/brick1
>> /dev/sda1 497M 315M 183M 64% /boot
>> /dev/mapper/gluster-data 136G 124G 13G 92% /gluster/brick2
>> /dev/mapper/gluster-iso 25G 7.3G 18G 29% /gluster/brick4
>> tmpfs 3.2G 0 3.2G 0% /run/user/0
>> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
>> 192.168.8.11:/data          136G  124G   13G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
>> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>>
>> This is from ovirt1, and before the work, ovirt1's and ovirt2's bricks
>> had the same usage. ovirt2's bricks and the gluster mountpoints agree on
>> iso and engine, but as you can see, not here. If I do a du -sh on
>> /rhev/data-center/mnt/glusterSD/..../_engine, it comes back with the
>> 12GB number (brick1 is engine, brick2 is data, and brick4 is iso).
>> However, gluster still says it's only 9.7G. I haven't figured out how to
>> get it to finish "healing".
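For anyone else recovering a rebuilt brick this way: the "magic xattr
commands" referenced above boil down to copying the volume-id extended
attribute from a healthy brick onto the re-created brick directory and then
forcing a full heal. A minimal sketch, using the engine volume as an example;
the actual brick directory is whatever "gluster volume info engine" lists (it
may be a subdirectory of the /gluster/brick1 mount shown above), so verify
paths before running anything:

    # on a healthy node: read the volume-id xattr from the good brick
    getfattr -n trusted.glusterfs.volume-id -e hex /path/to/engine/brick

    # on the rebuilt node: stamp the same volume-id onto the empty brick dir
    setfattr -n trusted.glusterfs.volume-id -v 0x<hex-value-from-above> /path/to/engine/brick

    # restart glusterd so the brick process starts, then force a full heal
    systemctl restart glusterd
    gluster volume heal engine full
    gluster volume heal engine info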
>>
>> data is in the process of healing currently.
>>
>> So, I think I have two main things to solve right now:
>>
>> 1) How do I get oVirt to see the data center/storage domain as online
>> again?
>> 2) How do I get the engine volume to finish healing onto ovirt2?
>>
>> Thanks all for reading this very long message!
>> --Jim
>>
>>
>