[Gluster-users] Can't heal a volume: "Please check if all brick processes are running."

Karthik Subrahmanya ksubrahm at redhat.com
Wed Mar 14 12:50:56 UTC 2018


On Wed, Mar 14, 2018 at 5:42 PM, Karthik Subrahmanya <ksubrahm at redhat.com>
wrote:

>
>
> On Wed, Mar 14, 2018 at 3:36 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org>
> wrote:
>
>> Hi Karthik,
>>
>>
>> Thanks a lot for the explanation.
>>
>> Does it mean that the health of a distributed volume can be checked only
>> with the "gluster volume status" command?
>>
> Yes. I am not aware of any other command that gives the status of a
> plain distribute volume in the way the heal info command does for
> replicate/disperse volumes.
>
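> For example, with the gv0 volume from this thread, the following should
> show whether each brick process is online, plus per-brick free disk and
> inode counts (a sketch of the usual approach, not verified against your
> setup):
>
> # gluster volume status gv0
> # gluster volume status gv0 detail
>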
>> And one more question: cluster.min-free-disk is 10% by default. What kind
>> of "side effects" could we face if this option is reduced to, for example,
>> 5%? Could you point to any best practice document(s)?
>>
> Yes, you can decrease it to any value. There won't be any side effects.
>
A small correction here: min-free-disk should ideally be set to a value
larger than the largest file size likely to be written. Decreasing it
beyond a point raises the likelihood of a brick getting full, which is a
very bad state to be in.
I will update you if I find a document which explains this. Sorry for
the previous statement.
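
If you do decide to tune it, the option is set per volume; as far as I
know the value can be given either as a percentage or as an absolute
size, so (using your gv0 volume) something like one of these:

# gluster volume set gv0 cluster.min-free-disk 5%
# gluster volume set gv0 cluster.min-free-disk 500GB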

>
> Regards,
> Karthik
>
>>
>> Regards,
>>
>> Anatoliy
>>
>>
>>
>>
>>
>> On 2018-03-13 16:46, Karthik Subrahmanya wrote:
>>
>> Hi Anatoliy,
>>
>> The heal command is basically used to heal any mismatching contents
>> between replica copies of files.
>> For the command "gluster volume heal <volname>" to succeed, you need the
>> self-heal daemon running,
>> which is the case only if your volume is of type replicate/disperse.
>> In your case you have a plain distribute volume, which does not store
>> replicas of any files.
>> So volume heal returns this error.
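>>
>> A quick way to see this (a sketch based on the commands you already ran):
>>
>> # gluster volume info gv0 | grep Type
>> # gluster volume status gv0
>>
>> The first reports "Type: Distribute", and for a replicate/disperse volume
>> the second would normally also list "Self-heal Daemon" entries next to
>> the bricks, which are absent in your output, as expected.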
>>
>> Regards,
>> Karthik
>>
>> On Tue, Mar 13, 2018 at 7:53 PM, Anatoliy Dmytriyev <tolid at tolid.eu.org>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> Maybe someone can point me to documentation or explain this? I can't
>>> find it myself.
>>> Do we have any other useful resources besides doc.gluster.org? As far as
>>> I can see, many gluster options are not described there, or there is no
>>> explanation of what they do...
>>>
>>>
>>>
>>> On 2018-03-12 15:58, Anatoliy Dmytriyev wrote:
>>>
>>>> Hello,
>>>>
>>>> We have a very fresh gluster 3.10.10 installation.
>>>> Our volume is created as a distributed volume, 9 bricks, 96TB in total
>>>> (87TB after the 10% gluster disk space reservation).
>>>>
>>>> For some reason I can't "heal" the volume:
>>>> # gluster volume heal gv0
>>>> Launching heal operation to perform index self heal on volume gv0 has
>>>> been unsuccessful on bricks that are down. Please check if all brick
>>>> processes are running.
>>>>
>>>> Which processes should be running on every brick for the heal operation?
>>>>
>>>> # gluster volume status
>>>> Status of volume: gv0
>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick cn01-ib:/gfs/gv0/brick1/brick         0         49152      Y       70850
>>>> Brick cn02-ib:/gfs/gv0/brick1/brick         0         49152      Y       102951
>>>> Brick cn03-ib:/gfs/gv0/brick1/brick         0         49152      Y       57535
>>>> Brick cn04-ib:/gfs/gv0/brick1/brick         0         49152      Y       56676
>>>> Brick cn05-ib:/gfs/gv0/brick1/brick         0         49152      Y       56880
>>>> Brick cn06-ib:/gfs/gv0/brick1/brick         0         49152      Y       56889
>>>> Brick cn07-ib:/gfs/gv0/brick1/brick         0         49152      Y       56902
>>>> Brick cn08-ib:/gfs/gv0/brick1/brick         0         49152      Y       94920
>>>> Brick cn09-ib:/gfs/gv0/brick1/brick         0         49152      Y       56542
>>>>
>>>> Task Status of Volume gv0
>>>> ------------------------------------------------------------------------------
>>>> There are no active volume tasks
>>>>
>>>>
>>>> # gluster volume info gv0
>>>> Volume Name: gv0
>>>> Type: Distribute
>>>> Volume ID: 8becaf78-cf2d-4991-93bf-f2446688154f
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 9
>>>> Transport-type: rdma
>>>> Bricks:
>>>> Brick1: cn01-ib:/gfs/gv0/brick1/brick
>>>> Brick2: cn02-ib:/gfs/gv0/brick1/brick
>>>> Brick3: cn03-ib:/gfs/gv0/brick1/brick
>>>> Brick4: cn04-ib:/gfs/gv0/brick1/brick
>>>> Brick5: cn05-ib:/gfs/gv0/brick1/brick
>>>> Brick6: cn06-ib:/gfs/gv0/brick1/brick
>>>> Brick7: cn07-ib:/gfs/gv0/brick1/brick
>>>> Brick8: cn08-ib:/gfs/gv0/brick1/brick
>>>> Brick9: cn09-ib:/gfs/gv0/brick1/brick
>>>> Options Reconfigured:
>>>> client.event-threads: 8
>>>> performance.parallel-readdir: on
>>>> performance.readdir-ahead: on
>>>> cluster.nufa: on
>>>> nfs.disable: on
>>>
>>>
>>> --
>>> Best regards,
>>> Anatoliy
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>> --
>> Best regards,
>> Anatoliy
>>
>
>