[Bugs] [Bug 1565623] glusterfs disperse volume input output error

bugzilla at redhat.com bugzilla at redhat.com
Wed Apr 11 09:46:40 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1565623

Xavi Hernandez <jahernan at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|bugs at gluster.org            |jahernan at redhat.com



--- Comment #10 from Xavi Hernandez <jahernan at redhat.com> ---
Everything seems to indicate that a heal was in progress on nodes
glfs-node13.avp.ru and glfs-node25.avp.ru when node glfs-node19.avp.ru was
restarted. Unfortunately, this coincided with a modification that caused 3
simultaneous failures on the file.

We need to manually repair the file or recover it from a backup.

To recover the file manually we have two options:

1. Guess which of the 3 bad fragments is the least bad. The best candidate is
probably the fragment on node glfs-node19.avp.ru, but we need to check it. It
would be useful to see the modification times of all fragments on all bricks.
To get them we can execute 'stat
/data1/bricks/brick1/vmfs/slake-test-bck-m1-d1.qcow2'. This will help us
decide, but it's not a 100% reliable way to determine the best option.

2. Check the integrity of the fragments. To do this we'll need to develop a
small tool able to perform the check. It will take some time, but it will tell
us whether the file is good or, if something is bad, in which block the
problem is. The advantage of this method is that, unless the 3 bad fragments
are damaged in the same block, we may be able to recover the whole file.
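
For option 1, the modification times could be collected with something like
the following rough Python sketch. It is only an illustration, not a tested
procedure: the function names (fragment_stat, sort_by_mtime, format_record)
are made up for this example, and it assumes it is given paths to fragment
files that are readable locally, so it would have to be run on each node
against that node's brick path. As noted above, a newer modification time is
only a hint and does not prove that a fragment is the healthiest one.

```python
import os
from datetime import datetime, timezone


def fragment_stat(path):
    """Return path, size and modification time of one fragment file."""
    st = os.stat(path)
    return {"path": path, "size": st.st_size, "mtime": st.st_mtime}


def sort_by_mtime(paths):
    """Stat every path and return the records, most recently modified first."""
    records = [fragment_stat(p) for p in paths]
    return sorted(records, key=lambda r: r["mtime"], reverse=True)


def format_record(rec):
    """Render one record as 'UTC timestamp  size  path' for easy comparison."""
    ts = datetime.fromtimestamp(rec["mtime"], tz=timezone.utc).isoformat()
    return f"{ts}  {rec['size']:>14}  {rec['path']}"
```

Calling sort_by_mtime() with the fragment path on a brick and printing each
record with format_record() gives one line per fragment, which can then be
compared across nodes by hand.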


