[Gluster-users] Repair after accident

Strahil Nikolov hunter86_bg at yahoo.com
Fri Aug 7 12:32:46 UTC 2020


Have you tried running a gluster heal and checking whether the files are back in place?
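
A minimal sketch of what that could look like, assuming the volume name gvol from the status output quoted below. The helper name heal_steps and the RUN switch are made up for illustration; the gluster subcommands themselves (heal, heal info, heal info split-brain) are standard CLI calls. By default the script only prints the commands, so it is safe to try anywhere:

```shell
#!/bin/sh
# Sketch: heal-related commands for one volume.
# Dry run by default (commands are only printed); set RUN=1 on a
# gluster node to actually execute them.
# heal_steps and RUN are hypothetical names, not part of gluster.
heal_steps() {
    vol="$1"
    for args in "heal $vol" "heal $vol info" "heal $vol info split-brain"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Word splitting of $args is intentional here.
            gluster volume $args
        else
            echo "gluster volume $args"
        fi
    done
}

# Volume name defaults to gvol (taken from the quoted status output).
heal_steps "${1:-gvol}"
```

The first command triggers a heal, the second lists entries still pending heal, and the third lists entries in split-brain, which may be worth checking for the files that return Input/output errors.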

I always thought that those hard links are used by the healing mechanism. If that is true, gluster should restore the files to their original location, and deleting the right files through the FUSE mount will then be easy.
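
To see which gfid entries have lost their named hard link, something along these lines might help. It is only a sketch under assumptions: it relies on the usual brick layout .glusterfs/XX/YY/<gfid>, the function name list_orphan_gfids is made up, and -mindepth/-maxdepth are GNU/BSD find extensions. It only lists files and deletes nothing:

```shell
#!/bin/sh
# Sketch: list regular files under <brick>/.glusterfs/XX/YY/ whose link
# count is 1, i.e. gfid entries with no remaining named hard link on
# the brick. Read-only: nothing is removed.
# list_orphan_gfids is a hypothetical helper name.
list_orphan_gfids() {
    brick="$1"
    # Depth 3 below .glusterfs is where the gfid files live; this skips
    # housekeeping entries such as indices or health_check. -type f
    # also skips the symlinks used for directories.
    find "$brick/.glusterfs" -mindepth 3 -maxdepth 3 -type f -links 1 \
        -path "$brick/.glusterfs/[0-9a-f][0-9a-f]/[0-9a-f][0-9a-f]/*"
}

# Run only when a brick path is given, e.g.: sh script.sh /zbrick
if [ "$#" -gt 0 ]; then
    list_orphan_gfids "$1"
fi
```

Comparing the output on both bricks before and after a heal should show whether the entries get a name back or stay orphaned.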

Best Regards,
Strahil Nikolov

On 7 August 2020 10:24:38 GMT+03:00, Mathias Waack <mathias.waack at seim-partner.de> wrote:
>Hi all,
>
>maybe I should add some more information:
>
>The container which filled up the space was running on node x, which 
>still shows a nearly filled fs:
>
>192.168.1.x:/gvol  2.6T  2.5T  149G  95% /gluster
>
>nearly the same situation on the underlying brick partition on node x:
>
>zdata/brick     2.6T  2.4T  176G  94% /zbrick
>
>On node y the network card crashed, glusterfs shows the same values:
>
>192.168.1.y:/gvol  2.6T  2.5T  149G  95% /gluster
>
>but different values on the brick:
>
>zdata/brick     2.9T  1.6T  1.4T  54% /zbrick
>
>I think this happened because glusterfs still has hardlinks to the 
>deleted files on node x? So I can find these files with:
>
>find /zbrick/.glusterfs -links 1 -ls | grep -v ' -> '
>
>But now I am lost. How can I verify that these files really belong to the 
>right container? Or can I just delete these files, since there is no way 
>to access them? Or does glusterfs offer a way to resolve this situation?
>
>Mathias
>
>On 05.08.20 15:48, Mathias Waack wrote:
>> Hi all,
>>
>> we are running a gluster setup with two nodes:
>>
>> Status of volume: gvol
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick 192.168.1.x:/zbrick                   49152     0          Y       13350
>> Brick 192.168.1.y:/zbrick                   49152     0          Y       5965
>> Self-heal Daemon on localhost               N/A       N/A        Y       14188
>> Self-heal Daemon on 192.168.1.93            N/A       N/A        Y       6003
>>
>> Task Status of Volume gvol
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> The glusterfs volume hosts a bunch of containers with their data 
>> volumes. The underlying fs is zfs. A few days ago one of the containers 
>> created a lot of files in one of its data volumes and eventually filled 
>> up the glusterfs volume completely. But this happened only on one host; 
>> on the other host there was still enough space. We finally were able to 
>> identify this container and found that the sizes of its data on /zbrick 
>> differed between the two hosts. We then made the big mistake of deleting 
>> these files on both hosts directly in the /zbrick volume, not on the 
>> mounted glusterfs volume.
>>
>> Later we found the reason for this behavior: the network driver on the 
>> second node partially crashed (we were still able to log in on the node, 
>> so we assumed the network was running, but the card was already dropping 
>> packets at that time) at the same time as the failed container started 
>> to fill up the gluster volume. After rebooting the second node the 
>> gluster volume became available again.
>>
>> Now the glusterfs volume is running again, but it is still (nearly) 
>> full: the files created by the container are not visible, but they 
>> still count against the free space. How can we fix this?
>>
>> In addition, there are some files which are no longer accessible since 
>> this accident:
>>
>> tail access.log.old
>> tail: cannot open 'access.log.old' for reading: Input/output error
>>
>> It looks like the files affected by this error are those which were 
>> changed during the accident. Is there a way to fix this too?
>>
>> Thanks
>>     Mathias
>>
>>
>> ________
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users