[Gluster-users] [External] Re: Self Heal Confusion
Brett Holcomb
biholcomb at l1049h.com
Mon Dec 31 09:34:00 UTC 2018
That is probably the case as a lot of files were deleted some time ago.
I'm on version 5.2 but was on 3.12 until about a week ago.
Here is the quorum info. I'm running a distributed replicated volume,
2 x 3 = 6.
cluster.quorum-type          auto
cluster.quorum-count         (null)
cluster.server-quorum-type   off
cluster.server-quorum-ratio  0
cluster.quorum-reads         no
Where exactly do I remove the gfid entries from - the .glusterfs
directory? Do I just delete all the directories and files under this
directory?
Where do I put the cluster.heal-timeout option - which file?
I think you've hit on the cause of the issue. Thinking back, we've had
some extended power outages, and due to a misconfiguration in the swap
file device name a couple of the nodes did not come up. I didn't
catch it for a while, so maybe the deletes occurred then.
Thank you.
On 12/31/18 2:58 AM, Davide Obbi wrote:
> If the long GFID does not correspond to any file, it could mean the
> file has been deleted by the client mounting the volume. I think this
> happens when the delete was issued while the number of active bricks
> did not reach quorum majority, or when a second brick was taken down
> while another was down or had not finished the self-heal, the latter
> being more likely.
> It would be interesting to see:
> - what version of glusterfs you are running; it happened to me with 3.12
> - volume quorum rules: "gluster volume get vol all | grep quorum"
>
> To clean it up, if I remember correctly, it should be possible to delete
> the gfid entries from the brick mounts on the glusterfs server nodes
> reporting the files to heal.
>
> As a side note, you might want to consider changing the self-heal
> timeout to a more aggressive schedule with the cluster.heal-timeout option.
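
For reference, a minimal sketch of the two steps discussed above, under some assumptions. On a brick, each GFID normally has an entry at <brick>/.glusterfs/<first two hex chars>/<next two hex chars>/<full gfid>, so a stale entry would be removed one at a time from there rather than by wiping the whole .glusterfs directory. cluster.heal-timeout is a volume option set through the gluster CLI (gluster volume set), not a line in a config file. The brick path /bricks/b1, the volume name myvol, the GFID and the value 300 below are all placeholders, and both helper functions are hypothetical: they only print a path or a command and do not touch the cluster.

```shell
#!/bin/sh
# Sketch only: /bricks/b1, myvol, the GFID and 300 are placeholders,
# and both helpers are hypothetical. Nothing here modifies a brick.

# Print where the .glusterfs entry for a GFID is expected on one brick.
# Layout: <brick>/.glusterfs/<chars 1-2>/<chars 3-4>/<full gfid>
gfid_path() {
    brick=$1
    gfid=$2
    d1=$(printf '%s' "$gfid" | cut -c1-2)
    d2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$gfid"
}

# Print (do not run) the CLI command that sets the self-heal interval,
# given a volume name and a number of seconds.
heal_timeout_cmd() {
    printf 'gluster volume set %s cluster.heal-timeout %s\n' "$1" "$2"
}

gfid_path /bricks/b1 0a1b2c3d-1111-2222-3333-444455556666
heal_timeout_cmd myvol 300
```

Before removing anything, it is probably worth confirming with "gluster volume heal <volname> info" that the GFID is still listed, and checking on each brick that no real file still points to it.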