[Gluster-users] not healing one file

Karthik Subrahmanya ksubrahm at redhat.com
Thu Oct 26 05:41:53 UTC 2017


Hey Richard,

Could you share the following information, please? (A quick sketch of the commands is below the list.)
1. gluster volume info <volname>
2. getfattr output of that file from all the bricks
    getfattr -d -e hex -m . <brickpath>/<filepath>
3. glustershd & glfsheal logs
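
For reference, a minimal sketch of what I mean, assuming the default log
locations under /var/log/glusterfs (adjust if your setup logs elsewhere):

# gluster volume info <volname>
# getfattr -d -e hex -m . <brickpath>/<filepath>        (run on every brick node)
# tail -n 200 /var/log/glusterfs/glustershd.log
# tail -n 200 /var/log/glusterfs/glfsheal-<volname>.log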

Regards,
Karthik

On Thu, Oct 26, 2017 at 10:21 AM, Amar Tumballi <atumball at redhat.com> wrote:

> On a side note, try the recently released health report tool and see if it
> diagnoses any issues in your setup. Currently you may have to run it on all
> three machines.
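>
> (A minimal sketch of the invocation, assuming the tool from the announcement
> is installed on each node; the package and command names here are taken from
> memory, so please double-check the release mail:
>
> # pip install gluster-health-report
> # gluster-health-report
> )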
>
>
>
> On 26-Oct-2017 6:50 AM, "Amar Tumballi" <atumball at redhat.com> wrote:
>
>> Thanks for this report. Many of the developers are at Gluster Summit in
>> Prague this week; we will check this and respond next week. Hope that's
>> fine.
>>
>> Thanks,
>> Amar
>>
>>
>> On 25-Oct-2017 3:07 PM, "Richard Neuboeck" <hawk at tbi.univie.ac.at> wrote:
>>
>>> Hi Gluster Gurus,
>>>
>>> I'm using a gluster volume as home for our users. The volume is
>>> replica 3, running on CentOS 7, gluster version 3.10
>>> (3.10.6-1.el7.x86_64). Clients are running Fedora 26 and also
>>> gluster 3.10 (3.10.6-3.fc26.x86_64).
>>>
>>> During the data backup I got an I/O error on one file. Manually checking
>>> this file on a client confirms it:
>>>
>>> ls -l romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/
>>> ls: cannot access 'romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4': Input/output error
>>> total 2015
>>> -rw-------. 1 romanoch tbi 998211 Sep 15 18:44 previous.js
>>> -rw-------. 1 romanoch tbi  65222 Oct 17 17:57 previous.jsonlz4
>>> -rw-------. 1 romanoch tbi 149161 Oct  1 13:46 recovery.bak
>>> -?????????? ? ?        ?        ?            ? recovery.baklz4
>>>
>>> Out of curiosity I checked all the bricks for this file. It's present on
>>> all of them, but a checksum shows that the file differs on one of the
>>> three replica servers.
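>>>
>>> (For reference, the checksum comparison was roughly this, run on each of
>>> the three brick nodes; any hash tool will do:
>>>
>>> # sha256sum /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>>> )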
>>>
>>> Querying healing information shows that the file should be healed:
>>> # gluster volume heal home info
>>> Brick sphere-six:/srv/gluster_home/brick
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>>>
>>> Status: Connected
>>> Number of entries: 1
>>>
>>> Brick sphere-five:/srv/gluster_home/brick
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>>>
>>> Status: Connected
>>> Number of entries: 1
>>>
>>> Brick sphere-four:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries: 0
>>>
>>> Manually triggering a heal doesn't report an error, but it also doesn't
>>> heal the file:
>>> # gluster volume heal home
>>> Launching heal operation to perform index self heal on volume home
>>> has been successful
>>>
>>> Same with a full heal:
>>> # gluster volume heal home full
>>> Launching heal operation to perform full self heal on volume home
>>> has been successful
>>>
>>> According to the split-brain query, that's not the problem:
>>> # gluster volume heal home info split-brain
>>> Brick sphere-six:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick sphere-five:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> Brick sphere-four:/srv/gluster_home/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>>
>>> I have no idea why this situation arose in the first place and no idea
>>> how to solve it. I would highly appreciate any helpful feedback.
>>>
>>> The only mention in the logs matching this file is a rename operation:
>>> /var/log/glusterfs/bricks/srv-gluster_home-brick.log:[2017-10-23
>>> 09:19:11.561661] I [MSGID: 115061]
>>> [server-rpc-fops.c:1022:server_rename_cbk] 0-home-server: 5266153: RENAME
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.jsonlz4
>>> (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.jsonlz4) ->
>>> /romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>>> (48e9eea6-cda6-4e53-bb4a-72059debf4c2/recovery.baklz4), client:
>>> romulus.tbi.univie.ac.at-11894-2017/10/18-07:06:07:206366-home-client-3-0-0,
>>> error-xlator: home-posix [No data available]
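>>>
>>> (If it helps, I can also compare the file's gfid across the bricks, e.g.
>>>
>>> # getfattr -n trusted.gfid -e hex /srv/gluster_home/brick/romanoch/.mozilla/firefox/vzzqqxrm.default-1396429081309/sessionstore-backups/recovery.baklz4
>>>
>>> and check whether the matching .glusterfs/<xx>/<yy>/<gfid> hard link is
>>> present on the brick that holds the differing copy.)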
>>>
>>> I enabled directory quotas the same day this problem showed up, but I'm
>>> not sure how quotas could have an effect like this (maybe if a limit were
>>> reached, but that's not the case either).
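>>>
>>> (For completeness, the quota limits were checked with something along the
>>> lines of
>>>
>>> # gluster volume quota home list
>>>
>>> and no directory is at its limit.)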
>>>
>>> Thanks again if anyone has an idea.
>>> Cheers
>>> Richard
>>> --
>>> /dev/null
>>>
>>>