[Gluster-users] issue with self-heal

Fri Jul 13 15:50:28 UTC 2018

Your message means something (usually glusterfsd) is not running quite 
right, or at all, on one of the servers.
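
One way to see which brick is down is to look at the Online column of
"gluster volume status". The helper below is only a sketch: the function
name offline_bricks is made up for illustration, and it assumes the usual
table layout where each brick line starts with "Brick" and Online (Y/N) is
the second-to-last column. Long brick paths that wrap onto a second line
would not be caught by it.

```shell
#!/bin/sh
# Hypothetical helper: read "gluster volume status <vol>" output on stdin
# and print the host:/path of every brick whose Online column is "N".
# Assumption: columns are "... TCP Port  RDMA Port  Online  Pid", so the
# Online flag is field NF-1 on lines whose first field is "Brick".
offline_bricks() {
    awk '$1 == "Brick" && $(NF-1) == "N" { print $2 }'
}

# Usage (on any server in the pool, volume name taken from the question):
#   gluster volume status gv1 | offline_bricks
```

If this prints nothing while heal info still complains, check the status
output by hand rather than trusting the parser.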

If you can tell which server it is, you need to stop and restart glusterd 
and glusterfsd on it. Note: sometimes just stopping them doesn't really 
stop them. You may need to do a killall -9 for glusterd, glusterfsd and 
anything else with "gluster" in its name.

Then just start glusterd and glusterfsd. Once they are up you should be 
able to do the heal.
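
The sequence above could be sketched roughly as follows. The assumptions
here are mine, not from the original question: that glusterd is managed by
systemd under the unit name "glusterd", that the volume is gv1 as in the
quoted output, and that starting glusterd brings the brick processes back
up. The run wrapper and the DRY_RUN variable are made up for illustration;
by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the stop / kill / start / heal sequence for ONE brick server.
# Assumptions: systemd unit "glusterd", volume "gv1".
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run.
VOLUME="${VOLUME:-gv1}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_and_heal() {
    run systemctl stop glusterd
    # A plain stop can leave processes behind, so kill whatever is left.
    # Caution: "glusterfs" also matches local FUSE client mounts.
    run killall -9 glusterd glusterfsd glusterfs
    run systemctl start glusterd
    # Confirm the bricks show as online before asking for a heal.
    run gluster volume status "$VOLUME"
    run gluster volume heal "$VOLUME"
    run gluster volume heal "$VOLUME" info
}

restart_and_heal
```

If a brick still shows as offline after glusterd is back, "gluster volume
start gv1 force" is one way to ask glusterd to respawn it.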

If you can't tell which it is and are able to take gluster offline for 
users for a moment, do that process on all your brick servers.

Brian Andrus

On 7/13/2018 10:55 AM, hsafe wrote:
>
> Hello Gluster community,
>
> After several hundred GB of data writes (small images, 100k < size < 1M) 
> into a 2x replicated glusterfs setup, I am facing an issue with the 
> healing process. Earlier, heal info returned the bricks and nodes 
> and showed no failed heals; but now it ends up in the 
> state with the message below:
>
> *# gluster volume heal gv1 info healed*
>
> *Gathering list of heal failed entries on volume gv1 has been 
> unsuccessful on bricks that are down. Please check if all brick 
> processes are running.*
>
> Issuing the heal info command gives a long list of gfid info that takes 
> about an hour to complete. The file data, being images, does not change 
> and is primarily served through native glusterfs mounts on 8 servers.
>
> Here is some insight into the status of the gluster volume. How can I 
> effectively run a successful heal on the storage? The last few times, 
> trying to do that sent the servers south and left them unresponsive.
>
> *# gluster volume info
>
> Volume Name: gv1
> Type: Replicate
> Volume ID: f1c955a1-7a92-4b1b-acb5-8b72b41aaace
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: IMG-01:/images/storage/brick1
> Brick2: IMG-02:/images/storage/brick1
> Options Reconfigured:
> performance.md-cache-timeout: 128
> cluster.background-self-heal-count: 32
> server.statedump-path: /tmp
> performance.readdir-ahead: on
> nfs.disable: true
> network.inode-lru-limit: 50000
> features.bitrot: off
> features.scrub: Inactive
> performance.cache-max-file-size: 16MB
> client.event-threads: 8
> cluster.eager-lock: on*
>
> Appreciate your help. Thanks
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
