[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

Anatoliy Dmytriyev tolid at tolid.eu.org
Wed Mar 14 13:22:44 UTC 2018


Thanks 

On 2018-03-14 13:50, Karthik Subrahmanya wrote:

> On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubrahm at redhat.com> wrote:
> 
> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> wrote:
> 
> Hi Karthik, 
> 
> Thanks a lot for the explanation. 
> 
> Does it mean that the health of a distributed volume can only be checked with the "gluster volume status" command? 
> Yes. I am not aware of any other command that gives the status of a plain distribute volume in the same way the heal info command does for replicate/disperse volumes. 
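> If you want more detail than the plain status output, something like "gluster volume status gv0 detail" should also work on a distribute volume; in addition to the online/PID columns it shows the free and total disk space and inode usage of each brick, which is usually what a "health" check on a pure distribute setup comes down to.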
> 
> And one more question: cluster.min-free-disk is 10% by default. What kind of "side effects" could we face if this option is reduced to, for example, 5%? Could you point me to any best-practice document(s)? 
> Yes, you can decrease it to any value. There won't be any side effects.

Small correction here: min-free-disk should ideally be set larger
than the largest file size likely to be written. Decreasing it beyond a
point raises the likelihood of a brick getting full, which is a very
bad state to be in.
I will update you if I find a document that explains this. Sorry
for the previous statement. 
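If you still decide to lower it, it is set like any other volume option, for example something along the lines of:

# gluster volume set gv0 cluster.min-free-disk 5%

(gv0 taken from your earlier output; as far as I know the value can also be given as an absolute size instead of a percentage, which can be an easier way to express "larger than the biggest file we expect to write").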

> Regards, 
> Karthik 
> 
> Regards, 
> 
> Anatoliy
> 
> On 2018-03-13 16:46, Karthik Subrahmanya wrote: 
> 
> Hi Anatoliy,
> 
> The heal command is used to heal any mismatching contents between the replica copies of files. For the command "gluster volume heal <volname>" to succeed, the self-heal daemon must be running,
> which is the case only if your volume is of type replicate/disperse. In your case you have a plain distribute volume, which does not store a replica of any file, so the heal command returns this error.
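> A quick way to see this for yourself (using your volume name gv0): on a replicate/disperse volume the output of
> # gluster volume status gv0
> also contains "Self-heal Daemon on <hostname>" rows, and a glustershd process runs on the nodes. On a plain distribute volume like yours there is no self-heal daemon at all, which is why the heal command has nothing to talk to and reports that error.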
> 
> Regards, Karthik 
> 
> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org> wrote:
> Hi,
> 
> Could someone point me to documentation or explain this? I can't find it myself.
> Do we have any other useful resources besides doc.gluster.org [1]? As far as I can see, many gluster options are either not described there or have no explanation of what they do... 
> 
> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
> Hello,
> 
> We have a very fresh gluster 3.10.10 installation.
> Our volume is created as a distributed volume, 9 bricks, 96 TB in total
> (87 TB after the 10% gluster disk space reservation).
> 
> For some reason I can't "heal" the volume:
> # gluster volume heal gv0
> Launching heal operation to perform index self heal on volume gv0 has
> been unsuccessful on bricks that are down. Please check if all brick
> processes are running.
> 
> Which processes should be running on every brick for the heal operation to work?
> 
> # gluster volume status
> Status of volume: gv0
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
> Brick cn02-ib:/gfs/gv0/brick1/brick         0         49152      Y       102951
> Brick cn03-ib:/gfs/gv0/brick1/brick         0         49152      Y       57535
> Brick cn04-ib:/gfs/gv0/brick1/brick         0         49152      Y       56676
> Brick cn05-ib:/gfs/gv0/brick1/brick         0         49152      Y       56880
> Brick cn06-ib:/gfs/gv0/brick1/brick         0         49152      Y       56889
> Brick cn07-ib:/gfs/gv0/brick1/brick         0         49152      Y       56902
> Brick cn08-ib:/gfs/gv0/brick1/brick         0         49152      Y       94920
> Brick cn09-ib:/gfs/gv0/brick1/brick         0         49152      Y       56542
> 
> Task Status of Volume gv0
> ------------------------------------------------------------------------------
> There are no active volume tasks
> 
> # gluster volume info gv0
> Volume Name: gv0
> Type: Distribute
> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 9
> Transport-type: rdma
> Bricks:
> Brick1: cn01-ib:/gfs/gv0/brick1/brick
> Brick2: cn02-ib:/gfs/gv0/brick1/brick
> Brick3: cn03-ib:/gfs/gv0/brick1/brick
> Brick4: cn04-ib:/gfs/gv0/brick1/brick
> Brick5: cn05-ib:/gfs/gv0/brick1/brick
> Brick6: cn06-ib:/gfs/gv0/brick1/brick
> Brick7: cn07-ib:/gfs/gv0/brick1/brick
> Brick8: cn08-ib:/gfs/gv0/brick1/brick
> Brick9: cn09-ib:/gfs/gv0/brick1/brick
> Options Reconfigured:
> client.event-threads: 8
> performance.parallel-readdir: on
> performance.readdir-ahead: on
> cluster.nufa: on
> nfs.disable: on 
> -- 
> Best regards,
> Anatoliy
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users [2]

-- 
Best regards,
Anatoliy 


Links:
------
[1] http://doc.gluster.org
[2] http://lists.gluster.org/mailman/listinfo/gluster-users

