[Gluster-users] [ovirt-users] Recovering from a multi-node failure
Sahina Bose
sabose at redhat.com
Wed Aug 16 12:22:46 UTC 2017
On Sun, Aug 6, 2017 at 4:42 AM, Jim Kusznir <jim at palousetech.com> wrote:
> Well, after a very stressful weekend, I think I have things largely
> working. It turns out that most of the above issues were caused by the Linux
> permissions of the exports for all three volumes (they had been reset to
> 600; setting them to 774 or 770 fixed many of the issues). Of course, I
> didn't find that until after a much more harrowing outage and hours and
> hours of work, including starting to look at rebuilding my cluster...
>
> So, now my cluster is operating again, and everything looks good EXCEPT
> for one major Gluster issue/question that I haven't found any references or
> info on.
>
> My host ovirt2, one of the replica gluster servers, is the one that lost
> its storage and had to reinitialize it from the cluster. The iso volume is
> perfectly fine and complete, but the engine and data volumes are smaller on
> disk on this node than on the other node (and than on this node before the
> crash). For the engine volume, the entire cluster reports the smaller
> utilization on the mounted gluster filesystems; for the data volume, it
> reports the larger size (that of the rest of the cluster). Here are some df
> statements to help clarify:
>
> (brick1 = engine; brick2=data, brick4=iso):
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/gluster-engine 25G 12G 14G 47% /gluster/brick1
> /dev/mapper/gluster-data 136G 125G 12G 92% /gluster/brick2
> /dev/mapper/gluster-iso 25G 7.3G 18G 29% /gluster/brick4
> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
> 192.168.8.11:/data          136G  125G   12G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>
> View from ovirt2:
> Filesystem Size Used Avail Use% Mounted on
> /dev/mapper/gluster-engine 15G 9.7G 5.4G 65% /gluster/brick1
> /dev/mapper/gluster-data 174G 119G 56G 69% /gluster/brick2
> /dev/mapper/gluster-iso 13G 7.3G 5.8G 56% /gluster/brick4
> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
> 192.168.8.11:/data          136G  125G   12G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>
> As you can see, in the process of rebuilding the hard drive for ovirt2, I
> did resize some things to give more space to data, where I desperately need
> it. If this goes well and the storage is given a clean bill of health, I
> will then take ovirt1 down and resize it to match ovirt2, and thus score a
> decent increase in storage for data. I fully realize that right now the
> gluster-mounted volumes report the total size of the smallest brick in the
> replica.
>
> So, is this size reduction appropriate? A big part of me thinks data is
> missing, but I even went through and shut down ovirt2's gluster daemons,
> wiped all the gluster data, and restarted gluster to allow it a fresh heal
> attempt, and it again came back to the exact same size. This cluster was
> originally built about the time ovirt 4.0 came out, and has been upgraded
> to 'current', so perhaps some new gluster features are making more
> efficient use of space (dedupe or something)?
>
The used capacity should be consistent on all nodes - I see you have a
discrepancy with the data volume brick. What does "gluster vol heal data
info" tell you? Are there entries to be healed?
Can you provide the glustershd logs?
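For reference, a minimal way to gather that information, using only the
standard gluster CLI and the volume names already shown in this thread, would
be something like:

    # pending heal entries per brick (run on any node in the cluster)
    gluster volume heal data info
    gluster volume heal engine info

    # files flagged as split-brain, if any
    gluster volume heal data info split-brain

    # self-heal daemon log, one copy per node
    less /var/log/glusterfs/glustershd.log

For the brick-size discrepancy itself, one possible (but unconfirmed)
explanation is a difference in sparse allocation between the original files
and the healed copies; comparing apparent size with on-disk usage on each
brick would show whether that is the case:

    du -sh --apparent-size /gluster/brick2
    du -sh /gluster/brick2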
>
> Thank you for your assistance!
> --Jim
>
> On Fri, Aug 4, 2017 at 7:49 PM, Jim Kusznir <jim at palousetech.com> wrote:
>
>> Hi all:
>>
>> Today has been rough: two of my three nodes went down, and self-heal has
>> not been healing well. Four hours later, the VMs are running, but the
>> engine is not happy. It claims the storage domain is down (even though it
>> is up on all hosts and the VMs are running). I'm getting a ton of these
>> messages logged:
>>
>> VDSM engine3 command HSMGetAllTasksStatusesVDS failed: Not SPM
>>
>> Aug 4, 2017 7:23:00 PM
>>
>> VDSM engine3 command SpmStatusVDS failed: Error validating master storage
>> domain: ('MD read error',)
>>
>> Aug 4, 2017 7:22:49 PM
>>
>> VDSM engine3 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:47 PM
>>
>> VDSM engine1 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:46 PM
>>
>> VDSM engine2 command SpmStatusVDS failed: Error validating master storage
>> domain: ('MD read error',)
>>
>> Aug 4, 2017 7:22:44 PM
>>
>> VDSM engine2 command ConnectStoragePoolVDS failed: Cannot find master
>> domain: u'spUUID=5868392a-0148-02cf-014d-000000000121,
>> msdUUID=cdaf180c-fde6-4cb3-b6e5-b6bd869c8770'
>>
>> Aug 4, 2017 7:22:42 PM
>>
>> VDSM engine1 command HSMGetAllTasksStatusesVDS failed: Not SPM: ()
>>
>>
>> ------------
>> I cannot select an SPM because it claims the storage domain is down, and I
>> cannot bring the storage domain back up.
>>
>> Also in the storage realm, one of my exports shows substantially less
>> data than is actually there.
>>
>> Here's what happened, as best as I understand it:
>> I went to do maintenance on ovirt2 (I needed to replace a faulty RAM stick
>> and rework the disk). I put it in maintenance mode, then shut it down and
>> did my work. In the process, much of the disk contents were lost (all the
>> gluster data). I figured it was no big deal: the gluster data is redundant
>> on the network, so it would heal when the node came back up.
>>
>> While I was doing maintenance, all but one of the VMs were running on
>> engine1. When I turned on engine2, all of a sudden all VMs, including the
>> hosted engine, stopped and went non-responsive. As far as I can tell, this
>> should not have happened, as I had turned ON one host. Nonetheless, I
>> waited for recovery to occur (while customers started calling, asking why
>> everything had stopped working). As I waited, I kept checking, and gluster
>> volume status only showed ovirt1 and ovirt2. Apparently gluster had
>> stopped or failed at some point on ovirt3. I assume that was the cause of
>> the outage; still, if everything was working fine with ovirt1's gluster,
>> and ovirt2 powered on with a very broken gluster (the volume status was
>> showing N/A in the port fields for the gluster volumes), I would not
>> expect a working gluster to fall over like that.
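One plausible explanation for the sudden pause, though not confirmed by any
logs in this thread, is quorum enforcement: with glusterd already down on
ovirt3 and ovirt2 coming up with broken bricks, only one of the three bricks
was healthy, and a replica-3 volume with quorum enabled goes read-only in that
situation, which pauses the VMs. A minimal sketch of the checks that expose
this state, using only standard gluster CLI calls (volume name "data" taken
from this thread):

    # which peers the local glusterd can see
    gluster peer status

    # brick process status and ports; "N/A" in the port column means the
    # brick process is not running on that node
    gluster volume status data

    # quorum-related volume options (client- and server-side quorum)
    gluster volume get data cluster.quorum-type
    gluster volume get data cluster.server-quorum-type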
>>
>> After starting glusterd on ovirt3 and checking the status, all three nodes
>> showed ovirt1 and ovirt3 as operational, and ovirt2 as N/A. Unfortunately,
>> recovery was still not happening, so I did some googling and found the
>> commands for inquiring about the hosted-engine status. The engine VM
>> appeared to be stuck "paused" and I couldn't find a way to unpause it, so I
>> powered it off, then started it manually on engine1, and the cluster came
>> back up. It showed all VMs paused; I was able to unpause them and they
>> worked again.
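For reference, the hosted-engine inspection and restart described above maps
to commands like the following (standard ovirt-hosted-engine tooling; exact
output and flags can vary by version):

    # HA agent/broker view of the engine VM, run on any HA host
    hosted-engine --vm-status

    # power off a stuck/paused engine VM, then start it on the local host
    hosted-engine --vm-poweroff
    hosted-engine --vm-start

    # if global maintenance was enabled during recovery, disable it again
    hosted-engine --set-maintenance --mode=none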
>>
>> So now I began to work on the ovirt2 gluster healing problem. It didn't
>> appear to be self-healing, but eventually I found this document:
>> https://support.rackspace.com/how-to/recover-from-a-failed-server-in-a-glusterfs-array/
>> and from it found the magic xattr commands (a sketch of that procedure is
>> included further below). After setting them, the gluster volumes on ovirt2
>> came online. I told iso to heal, and it did, but it only came up with about
>> half as much data as it should have. I told it to do a full heal, and that
>> did finish off the remaining data and brought it up to full. I then told
>> engine to do a full heal (gluster volume heal engine full), and it
>> transferred its data from the other gluster hosts too. However, it said it
>> was done when it hit 9.7GB while there was 15GB on disk! It is still stuck
>> that way; the oVirt GUI and gluster volume heal engine info both show the
>> volume fully healed, but it is not:
>> [root at ovirt1 ~]# df -h
>> Filesystem Size Used Avail Use% Mounted on
>> /dev/mapper/centos_ovirt-root 20G 4.2G 16G 21% /
>> devtmpfs 16G 0 16G 0% /dev
>> tmpfs 16G 16K 16G 1% /dev/shm
>> tmpfs 16G 26M 16G 1% /run
>> tmpfs 16G 0 16G 0% /sys/fs/cgroup
>> /dev/mapper/gluster-engine 25G 12G 14G 47% /gluster/brick1
>> /dev/sda1 497M 315M 183M 64% /boot
>> /dev/mapper/gluster-data 136G 124G 13G 92% /gluster/brick2
>> /dev/mapper/gluster-iso 25G 7.3G 18G 29% /gluster/brick4
>> tmpfs 3.2G 0 3.2G 0% /run/user/0
>> 192.168.8.11:/engine         15G  9.7G  5.4G  65% /rhev/data-center/mnt/glusterSD/192.168.8.11:_engine
>> 192.168.8.11:/data          136G  124G   13G  92% /rhev/data-center/mnt/glusterSD/192.168.8.11:_data
>> 192.168.8.11:/iso            13G  7.3G  5.8G  56% /rhev/data-center/mnt/glusterSD/192.168.8.11:_iso
>>
>> This is from ovirt1, and before the work, ovirt1's and ovirt2's bricks
>> had the same usage. ovirt2's bricks and the gluster mountpoints agree on
>> iso and engine, but as you can see, not here. If I do a du -sh on
>> /rhev/data-center/mnt/glusterSD/..../_engine, it comes back with the
>> 12GB number (brick1 is engine, brick2 is data, and brick4 is iso).
>> However, gluster still says it's only 9.7G. I haven't figured out how to
>> get it to finish "healing".
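For anyone else recovering a rebuilt brick this way: the "magic xattr
commands" referenced above boil down to copying the volume-id extended
attribute from a healthy brick onto the re-created brick directory and then
forcing a full heal. A minimal sketch, using the engine volume as an example;
the actual brick directory is whatever "gluster volume info engine" lists (it
may be a subdirectory of the /gluster/brick1 mount shown above), so verify
paths before running anything:

    # on a healthy node: read the volume-id xattr from the good brick
    getfattr -n trusted.glusterfs.volume-id -e hex /path/to/engine/brick

    # on the rebuilt node: stamp the same volume-id onto the empty brick dir
    setfattr -n trusted.glusterfs.volume-id -v 0x<hex-value-from-above> /path/to/engine/brick

    # restart glusterd so the brick process starts, then force a full heal
    systemctl restart glusterd
    gluster volume heal engine full
    gluster volume heal engine info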
>>
>> data is in the process of healing currently.
>>
>> So, I think I have two main things to solve right now:
>>
>> 1) How do I get oVirt to see the data center/storage domain as online
>> again?
>> 2) How do I get the engine volume to finish healing onto ovirt2?
>>
>> Thanks all for reading this very long message!
>> --Jim
>>
>>
>