[Gluster-users] [Stale file handle] in shard volume

Olaf Buitelaar olaf.buitelaar at gmail.com
Mon Jan 14 08:11:19 UTC 2019


Hi Krutika,

I think the main problem is that the shard file exists in 2 sub-volumes,
one copy being valid and one being stale.
Example:
sub-volume-1:
node-1: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[stale]
node-2: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[stale]
node-3: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[stale]
sub-volume-2:
node-4: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[good]
node-5: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[good]
node-6: a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487[good]

This is more or less exactly what you described here:
https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
The VMs getting paused is, I think, purely a side-effect.

The issue seems to surface only on volumes with an arbiter brick and
sharding enabled, so I suspect something goes wrong, or went wrong, at the
shard translator layer.

I think the log you're interested in is this:
[2019-01-02 02:33:44.433169] I [MSGID: 113030] [posix.c:2171:posix_unlink] 0-ovirt-kube-posix: open-fd-key-status: 0 for /data/gfs/bricks/bricka/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487
[2019-01-02 02:33:44.433188] I [MSGID: 113031] [posix.c:2084:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /data/gfs/bricks/bricka/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101487
[2019-01-02 02:33:44.475027] I [MSGID: 113030] [posix.c:2171:posix_unlink] 0-ovirt-kube-posix: open-fd-key-status: 0 for /data/gfs/bricks/bricka/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101488
[2019-01-02 02:33:44.475059] I [MSGID: 113031] [posix.c:2084:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /data/gfs/bricks/bricka/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.101488
[2019-01-02 02:35:36.394536] I [MSGID: 115036] [server.c:535:server_rpc_notify] 0-ovirt-kube-server: disconnecting connection from lease-10.dc01.adsolutions-22506-2018/12/24-04:03:32:698336-ovirt-kube-client-2-0-0
[2019-01-02 02:35:36.394800] I [MSGID: 101055] [client_t.c:443:gf_client_unref] 0-ovirt-kube-server: Shutting down connection lease-10.dc01.adsolutions-22506-2018/12/24-04:03:32:698336-ovirt-kube-client-2-0-0
This is from the time the aforementioned machine paused. I've also attached
the other logs; unfortunately I cannot access the logs of 1 machine, but
if you need those I can gather them later.
If you need more samples or info, please let me know.
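
If it helps to correlate the brick-side unlinks with the shard/fuse errors,
I'm pulling the related entries per node together roughly like this (assuming
the default log locations under /var/log/glusterfs; the exact file names
differ per node):

 grep -h 'a38d64bc-a28b-4ee1-a0bb-f919e7a1022c' \
   /var/log/glusterfs/bricks/*ovirt-kube*.log \
   /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*ovirt-kube*.log | sort

(sorting works here because every log entry starts with its timestamp)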

Thanks Olaf


On Mon, 14 Jan 2019 at 08:16, Krutika Dhananjay <kdhananj at redhat.com> wrote:

> Hi,
>
> So the main issue is that certain vms seem to be pausing? Did I understand
> that right?
> Could you share the gluster-mount logs around the time the pause was seen?
> And the brick logs too please?
>
> As for ESTALE errors, the real cause of pauses can be determined from
> errors/warnings logged by fuse. The mere occurrence of ESTALE errors against
> the shard functions in the logs doesn't necessarily indicate that they are
> the reason for the pause. Also, in this instance, the ESTALE errors seem to
> be propagated by the lower translators (DHT? protocol/client? Or even the
> bricks?) and shard is merely logging the same.
>
> -Krutika
>
>
> On Sun, Jan 13, 2019 at 10:11 PM Olaf Buitelaar <olaf.buitelaar at gmail.com>
> wrote:
>
>> @Krutika if you need any further information, please let me know.
>>
>> Thanks Olaf
>>
>> On Fri, 4 Jan 2019 at 07:51, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>
>>> Adding Krutika.
>>>
>>> On Wed, 2 Jan 2019 at 20:56, Olaf Buitelaar <olaf.buitelaar at gmail.com>
>>> wrote:
>>>
>>>> Hi Nithya,
>>>>
>>>> Thank you for your reply.
>>>>
>>>> The VMs using the gluster volumes keep getting paused/stopped on
>>>> errors like these:
>>>> [2019-01-02 02:33:44.469132] E [MSGID: 133010]
>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
>>>> shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
>>>> [Stale file handle]
>>>> [2019-01-02 02:33:44.563288] E [MSGID: 133010]
>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
>>>> shard 101488 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
>>>> [Stale file handle]
>>>>
>>>> Krutika, can you take a look at this?
>>>
>>>
>>>>
>>>> What I'm trying to find out is whether I can purge all gluster volumes of
>>>> all possible stale file handles (and hopefully find a method to prevent
>>>> this in the future), so the VMs can run stably again.
>>>> For this I need to know when the "shard_common_lookup_shards_cbk"
>>>> function considers a file stale.
>>>> The statement "Stale file handle errors show up when a file with a
>>>> specified gfid is not found." doesn't seem to cover it all, as I've shown
>>>> in earlier mails that the shard file and the glusterfs/xx/xx/uuid file
>>>> both exist and have the same inode.
>>>> If the criteria I'm using aren't correct, could you please tell me which
>>>> criteria I should use to determine whether a file is stale or not?
>>>> These criteria are just based on observations I made while moving the
>>>> stale files manually. After removing them I was able to start the VM
>>>> again... until some time later it hung on another stale shard file,
>>>> unfortunately.
>>>>
>>>> Thanks Olaf
>>>>
>>>> On Wed, 2 Jan 2019 at 14:20, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> Until now a select group of VMs still seems to produce new stale
>>>>>> files and keeps getting paused because of this.
>>>>>> I've not updated gluster recently, however I did change the op
>>>>>> version from 31200 to 31202 about a week before this issue arose.
>>>>>> Looking at the .shard directory, I have 100,000+ files sharing the same
>>>>>> characteristics as a stale file. All of those found so far have the
>>>>>> sticky bit set (i.e. file permissions ---------T), are 0 KB in size, and
>>>>>> have the trusted.glusterfs.dht.linkto attribute.
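>>>>>>
>>>>>> For reference, candidates with exactly these characteristics can be
>>>>>> listed (or counted) on a brick with something like the following; just a
>>>>>> sketch, using one of the brick paths from this thread as an example:
>>>>>>
>>>>>> cd /data/gfs/bricks/brick1/ovirt-backbone-2/.shard
>>>>>> find . -maxdepth 1 -type f -size 0 -perm -1000 \
>>>>>>   -exec getfattr -n trusted.glusterfs.dht.linkto {} + 2>/dev/null \
>>>>>>   | grep -c '^# file:'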
>>>>>>
>>>>>
>>>>> These are internal files used by gluster and do not necessarily mean
>>>>> they are stale. They "point" to data files which may be on different bricks
>>>>> (same name, gfid etc but no linkto xattr and no ----T permissions).
>>>>>
>>>>>
>>>>>> These files range from long ago (the beginning of the year) until now,
>>>>>> which makes me suspect this was lying dormant for some time and has
>>>>>> somehow surfaced recently.
>>>>>> The other sub-volumes also contain 0 KB files in the .shard directory,
>>>>>> but those don't have the sticky bit or the linkto attribute.
>>>>>>
>>>>>> Does anybody else experience this issue? Could this be a bug or an
>>>>>> environmental issue?
>>>>>>
>>>>> These are most likely valid files - please do not delete them without
>>>>> double-checking.
>>>>>
>>>>> Stale file handle errors show up when a file with a specified gfid is
>>>>> not found. You will need to debug the files for which you see this error by
>>>>> checking the bricks to see if they actually exist.
>>>>>
>>>>>>
>>>>>> Also, I wonder if there is any tool or gluster command to clean up all
>>>>>> stale file handles?
>>>>>> Otherwise I'm planning to make a simple bash script which iterates
>>>>>> over the .shard dir, checks each file for the above-mentioned criteria,
>>>>>> and (re)moves the file and the corresponding .glusterfs file, as sketched
>>>>>> below.
>>>>>> If there are other criteria needed to identify a stale file handle, I
>>>>>> would like to hear them.
>>>>>> That is, if this is a viable and safe operation to do at all, of course.
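>>>>>>
>>>>>> Roughly what I have in mind (only a sketch; the brick path and the
>>>>>> quarantine directory are placeholders, and for now it just reports the
>>>>>> candidates and leaves the mv commented out):
>>>>>>
>>>>>> #!/bin/bash
>>>>>> # run on every brick of the sub-volume that holds the 0-byte linkto copies
>>>>>> BRICK=/data/gfs/bricks/brick1/ovirt-backbone-2   # adjust per node
>>>>>> QUAR=/root/stale-shard-quarantine
>>>>>> mkdir -p "$QUAR"
>>>>>> cd "$BRICK/.shard" || exit 1
>>>>>> find . -maxdepth 1 -type f -size 0 -perm -1000 | while read -r f; do
>>>>>>   # skip files without the dht.linkto xattr
>>>>>>   getfattr -n trusted.glusterfs.dht.linkto "$f" >/dev/null 2>&1 || continue
>>>>>>   # resolve the matching .glusterfs hard link from the trusted.gfid xattr
>>>>>>   gfid=$(getfattr -n trusted.gfid -e hex "$f" 2>/dev/null \
>>>>>>          | awk -F= '/^trusted.gfid=/{print substr($2,3)}')
>>>>>>   gl="$BRICK/.glusterfs/${gfid:0:2}/${gfid:2:2}/${gfid:0:8}-${gfid:8:4}-${gfid:12:4}-${gfid:16:4}-${gfid:20:12}"
>>>>>>   echo "candidate: $BRICK/.shard/${f#./} and $gl"
>>>>>>   # mv "$f" "$gl" "$QUAR"/   # disabled until the criteria are confirmed
>>>>>> done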
>>>>>>
>>>>>> Thanks Olaf
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, 20 Dec 2018 at 13:43, Olaf Buitelaar <olaf.buitelaar at gmail.com> wrote:
>>>>>>
>>>>>>> Dear All,
>>>>>>>
>>>>>>> I figured it out; it appears to be the exact same issue as
>>>>>>> described here:
>>>>>>> https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
>>>>>>> Another subvolume also had the shard file, only there the copies were
>>>>>>> all 0 bytes and had the dht.linkto attribute.
>>>>>>>
>>>>>>> For reference:
>>>>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>
>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>>>>>>
>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>
>>>>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>>>>>
>>>>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>>>>> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>>>>>
>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>>>>>>
>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>
>>>>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>>>>>
>>>>>>> [root at lease-04 ovirt-backbone-2]# stat
>>>>>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>>>>>   File: ‘.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d’
>>>>>>>   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>>>>>>> Device: fd01h/64769d    Inode: 1918631406  Links: 2
>>>>>>> Access: (1000/---------T)  Uid: (    0/    root)   Gid: (    0/    root)
>>>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>>>> Access: 2018-12-17 21:43:36.405735296 +0000
>>>>>>> Modify: 2018-12-17 21:43:36.405735296 +0000
>>>>>>> Change: 2018-12-17 21:43:36.405735296 +0000
>>>>>>>  Birth: -
>>>>>>>
>>>>>>> Removing the shard file and the .glusterfs file from each node resolved
>>>>>>> the issue.
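>>>>>>>
>>>>>>> For reference, on each brick of the sub-volume holding the 0-byte linkto
>>>>>>> copy, the removal came down to something like this (just a sketch; the
>>>>>>> backup directory is an example, and moving instead of rm keeps a way back):
>>>>>>>
>>>>>>> mkdir -p /root/stale-shard-backup
>>>>>>> cd /path/to/brick/ovirt-backbone-2   # the brick root the getfattr output above came from
>>>>>>> mv .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 /root/stale-shard-backup/
>>>>>>> mv .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d /root/stale-shard-backup/
>>>>>>>
>>>>>>> Both paths are hard links to the same inode (Links: 2 in the stat above),
>>>>>>> so this removes the stale linkto entry from the brick while keeping a copy
>>>>>>> around to restore if needed.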
>>>>>>>
>>>>>>> I also found this thread:
>>>>>>> https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html
>>>>>>> Maybe he suffers from the same issue.
>>>>>>>
>>>>>>> Best Olaf
>>>>>>>
>>>>>>>
>>>>>>> On Wed, 19 Dec 2018 at 21:56, Olaf Buitelaar <olaf.buitelaar at gmail.com> wrote:
>>>>>>>
>>>>>>>> Dear All,
>>>>>>>>
>>>>>>>> It appears I have stale file handles in one of the volumes, on 2 files.
>>>>>>>> These files are qemu images (1 raw and 1 qcow2).
>>>>>>>> I'll just focus on 1 file since the situation on the other seems
>>>>>>>> the same.
>>>>>>>>
>>>>>>>> The VM gets paused more or less directly after being booted, with this
>>>>>>>> error:
>>>>>>>> [2018-12-18 14:05:05.275713] E [MSGID: 133010]
>>>>>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard:
>>>>>>>> Lookup on shard 51500 failed. Base file gfid =
>>>>>>>> f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle]
>>>>>>>>
>>>>>>>> Investigating the shard:
>>>>>>>>
>>>>>>>> #on the arbiter node:
>>>>>>>>
>>>>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -n
>>>>>>>> glusterfs.gfid.string
>>>>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> getfattr: Removing leading '/' from absolute path names
>>>>>>>> # file:
>>>>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>>>>
>>>>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-05 ovirt-backbone-2]# stat
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>   File: ‘.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0’
>>>>>>>>   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>>>>>>>> Device: fd01h/64769d    Inode: 537277306   Links: 2
>>>>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>>>>> Access: 2018-12-17 21:43:36.361984810 +0000
>>>>>>>> Modify: 2018-12-17 21:43:36.361984810 +0000
>>>>>>>> Change: 2018-12-18 20:55:29.908647417 +0000
>>>>>>>>  Birth: -
>>>>>>>>
>>>>>>>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306
>>>>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> #on the data nodes:
>>>>>>>>
>>>>>>>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string
>>>>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> getfattr: Removing leading '/' from absolute path names
>>>>>>>> # file:
>>>>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>>>>
>>>>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-08 ovirt-backbone-2]# stat
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>   File: ‘.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0’
>>>>>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096   regular file
>>>>>>>> Device: fd03h/64771d    Inode: 12893624759  Links: 3
>>>>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>>>>> Access: 2018-12-18 18:52:38.070776585 +0000
>>>>>>>> Modify: 2018-12-17 21:43:36.388054443 +0000
>>>>>>>> Change: 2018-12-18 21:01:47.810506528 +0000
>>>>>>>>  Birth: -
>>>>>>>>
>>>>>>>> [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759
>>>>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> ========================
>>>>>>>>
>>>>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -n
>>>>>>>> glusterfs.gfid.string
>>>>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> getfattr: Removing leading '/' from absolute path names
>>>>>>>> # file:
>>>>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>>>>
>>>>>>>> [root at lease-11 ovirt-backbone-2]#  getfattr -d -m . -e hex
>>>>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>
>>>>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>>>>
>>>>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>>>>
>>>>>>>> [root at lease-11 ovirt-backbone-2]# stat
>>>>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>>   File: ‘.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0’
>>>>>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096   regular file
>>>>>>>> Device: fd03h/64771d    Inode: 12956094809  Links: 3
>>>>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>>>>> Access: 2018-12-18 20:11:53.595208449 +0000
>>>>>>>> Modify: 2018-12-17 21:43:36.391580259 +0000
>>>>>>>> Change: 2018-12-18 19:19:25.888055392 +0000
>>>>>>>>  Birth: -
>>>>>>>>
>>>>>>>> [root at lease-11 ovirt-backbone-2]# find . -inum 12956094809
>>>>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>>>>
>>>>>>>> ================
>>>>>>>>
>>>>>>>> I don't really see any inconsistencies, except the dates in the
>>>>>>>> stat output. However, that is only after I tried moving the file out of
>>>>>>>> the volume to force a heal, which does happen on the data nodes, but not
>>>>>>>> on the arbiter node. Before that they were also the same.
>>>>>>>> I've also compared the file
>>>>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 nodes and they
>>>>>>>> are exactly the same.
>>>>>>>>
>>>>>>>> Things I've further tried:
>>>>>>>> - gluster v heal ovirt-backbone-2 full => gluster v heal
>>>>>>>> ovirt-backbone-2 info reports 0 entries on all nodes
>>>>>>>>
>>>>>>>> - stop each glusterd and glusterfsd, pause around 40sec and start
>>>>>>>> them again on each node, 1 at a time, waiting for the heal to recover
>>>>>>>> before moving to the next node
>>>>>>>>
>>>>>>>> - force a heal by stopping glusterd on a node and performing these
>>>>>>>> steps:
>>>>>>>> mkdir /mnt/ovirt-backbone-2/trigger
>>>>>>>> rmdir /mnt/ovirt-backbone-2/trigger
>>>>>>>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/
>>>>>>>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/
>>>>>>>> start glusterd
>>>>>>>>
>>>>>>>> - gluster volume rebalance ovirt-backbone-2 start => success
>>>>>>>>
>>>>>>>> What's further interesting is that according to the mount log, the
>>>>>>>> volume is in split-brain:
>>>>>>>> [2018-12-18 10:06:04.606870] E [MSGID: 108008]
>>>>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
>>>>>>>> error]
>>>>>>>> [2018-12-18 10:06:04.606908] E [MSGID: 133014]
>>>>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>>>>> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk]
>>>>>>>> 0-glusterfs-fuse: 428090: FSTAT()
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>>>>>> [2018-12-18 10:06:05.107729] E [MSGID: 108008]
>>>>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
>>>>>>>> error]
>>>>>>>> [2018-12-18 10:06:05.107770] E [MSGID: 133014]
>>>>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>>>>> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk]
>>>>>>>> 0-glusterfs-fuse: 428091: FSTAT()
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>>>>>> [2018-12-18 10:06:05.537244] I [MSGID: 108006]
>>>>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
>>>>>>>> subvolumes up
>>>>>>>> [2018-12-18 10:06:05.538523] E [MSGID: 108008]
>>>>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>>>>> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid
>>>>>>>> 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output
>>>>>>>> error]
>>>>>>>> [2018-12-18 10:06:05.538685] I [MSGID: 108006]
>>>>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
>>>>>>>> subvolumes up
>>>>>>>> [2018-12-18 10:06:05.538794] I [MSGID: 108006]
>>>>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
>>>>>>>> subvolumes up
>>>>>>>> [2018-12-18 10:06:05.539342] I [MSGID: 109063]
>>>>>>>> [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found
>>>>>>>> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid =
>>>>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0
>>>>>>>> [2018-12-18 10:06:05.539372] W [MSGID: 109005]
>>>>>>>> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht:
>>>>>>>> Directory selfheal failed: 2 subvolumes down.Not fixing. path =
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732, gfid =
>>>>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8
>>>>>>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006]
>>>>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
>>>>>>>> subvolumes up
>>>>>>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006]
>>>>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
>>>>>>>> subvolumes up
>>>>>>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008]
>>>>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
>>>>>>>> error]
>>>>>>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014]
>>>>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>>>>> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk]
>>>>>>>> 0-glusterfs-fuse: 428096: FSTAT()
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>>>>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008]
>>>>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
>>>>>>>> error]
>>>>>>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014]
>>>>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
>>>>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>>>>> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk]
>>>>>>>> 0-glusterfs-fuse: 428097: FSTAT()
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>>>>>>
>>>>>>>> #note I'm able to see
>>>>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids:
>>>>>>>> [root at lease-11 ovirt-backbone-2]# stat
>>>>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>>>>>>>>   File: ‘/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids’
>>>>>>>>   Size: 1048576         Blocks: 2048       IO Block: 131072 regular file
>>>>>>>> Device: 41h/65d Inode: 10492258721813610344  Links: 1
>>>>>>>> Access: (0660/-rw-rw----)  Uid: (   36/    vdsm)   Gid: (   36/    kvm)
>>>>>>>> Context: system_u:object_r:fusefs_t:s0
>>>>>>>> Access: 2018-12-19 20:07:39.917573869 +0000
>>>>>>>> Modify: 2018-12-19 20:07:39.928573917 +0000
>>>>>>>> Change: 2018-12-19 20:07:39.929573921 +0000
>>>>>>>>  Birth: -
>>>>>>>>
>>>>>>>> However, checking "gluster v heal ovirt-backbone-2 info split-brain"
>>>>>>>> reports no entries.
>>>>>>>>
>>>>>>>> I've also tried mounting the qemu image, and this works fine; I'm
>>>>>>>> able to see all its contents:
>>>>>>>>  losetup /dev/loop0
>>>>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>>>>  kpartx -a /dev/loop0
>>>>>>>>  vgscan
>>>>>>>>  vgchange -ay slave-data
>>>>>>>>  mkdir /mnt/slv01
>>>>>>>>  mount /dev/mapper/slave--data-lvol0 /mnt/slv01/
>>>>>>>>
>>>>>>>> Possible causes for this issue:
>>>>>>>> 1. The machine "lease-11" suffered from a faulty RAM module (ECC),
>>>>>>>> which halted the machine and caused an invalid state. (This machine also
>>>>>>>> hosts other volumes, with similar configurations, which report no issue.)
>>>>>>>> 2. After the RAM module was replaced, the VM using the backing qemu
>>>>>>>> image was restored from a backup (the backup was file-based, within the VM,
>>>>>>>> on a different directory), because some files were corrupted. The
>>>>>>>> backup/recovery obviously causes extra IO, possibly introducing race
>>>>>>>> conditions? The machine did run for about 12h without issues, and in total
>>>>>>>> for about 36h.
>>>>>>>> 3. Since only the client (maybe only gfapi?) reports errors,
>>>>>>>> something is broken there?
>>>>>>>>
>>>>>>>> The volume info:
>>>>>>>> root at lease-06 ~# gluster v info ovirt-backbone-2
>>>>>>>>
>>>>>>>> Volume Name: ovirt-backbone-2
>>>>>>>> Type: Distributed-Replicate
>>>>>>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>>>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>>>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>>>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>>>>> Options Reconfigured:
>>>>>>>> nfs.disable: on
>>>>>>>> transport.address-family: inet
>>>>>>>> performance.quick-read: off
>>>>>>>> performance.read-ahead: off
>>>>>>>> performance.io-cache: off
>>>>>>>> performance.low-prio-threads: 32
>>>>>>>> network.remote-dio: enable
>>>>>>>> cluster.eager-lock: enable
>>>>>>>> cluster.quorum-type: auto
>>>>>>>> cluster.server-quorum-type: server
>>>>>>>> cluster.data-self-heal-algorithm: full
>>>>>>>> cluster.locking-scheme: granular
>>>>>>>> cluster.shd-max-threads: 8
>>>>>>>> cluster.shd-wait-qlength: 10000
>>>>>>>> features.shard: on
>>>>>>>> user.cifs: off
>>>>>>>> storage.owner-uid: 36
>>>>>>>> storage.owner-gid: 36
>>>>>>>> features.shard-block-size: 64MB
>>>>>>>> performance.write-behind-window-size: 512MB
>>>>>>>> performance.cache-size: 384MB
>>>>>>>> cluster.brick-multiplex: on
>>>>>>>>
>>>>>>>> The volume status:
>>>>>>>> root at lease-06 ~# gluster v status ovirt-backbone-2
>>>>>>>> Status of volume: ovirt-backbone-2
>>>>>>>> Gluster process                                             TCP Port  RDMA Port  Online  Pid
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2    49152     0          Y       7727
>>>>>>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2    49152     0          Y       12620
>>>>>>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2    49152     0          Y       8794
>>>>>>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2   49161     0          Y       22333
>>>>>>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2  49152     0          Y       15030
>>>>>>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2    49166     0          Y       24592
>>>>>>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2   49153     0          Y       20148
>>>>>>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2  49154     0          Y       15413
>>>>>>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2    49152     0          Y       43120
>>>>>>>> Self-heal Daemon on localhost                               N/A       N/A        Y       44587
>>>>>>>> Self-heal Daemon on 10.201.0.2                              N/A       N/A        Y       8401
>>>>>>>> Self-heal Daemon on 10.201.0.5                              N/A       N/A        Y       11038
>>>>>>>> Self-heal Daemon on 10.201.0.8                              N/A       N/A        Y       9513
>>>>>>>> Self-heal Daemon on 10.32.9.4                               N/A       N/A        Y       23736
>>>>>>>> Self-heal Daemon on 10.32.9.20                              N/A       N/A        Y       2738
>>>>>>>> Self-heal Daemon on 10.32.9.3                               N/A       N/A        Y       25598
>>>>>>>> Self-heal Daemon on 10.32.9.5                               N/A       N/A        Y       511
>>>>>>>> Self-heal Daemon on 10.32.9.9                               N/A       N/A        Y       23357
>>>>>>>> Self-heal Daemon on 10.32.9.8                               N/A       N/A        Y       15225
>>>>>>>> Self-heal Daemon on 10.32.9.7                               N/A       N/A        Y       25781
>>>>>>>> Self-heal Daemon on 10.32.9.21                              N/A       N/A        Y       5034
>>>>>>>>
>>>>>>>> Task Status of Volume ovirt-backbone-2
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Task                 : Rebalance
>>>>>>>> ID                   : 6dfbac43-0125-4568-9ac3-a2c453faaa3d
>>>>>>>> Status               : completed
>>>>>>>>
>>>>>>>> The gluster version is 3.12.15 and cluster.op-version is 31202.
>>>>>>>>
>>>>>>>> ========================
>>>>>>>>
>>>>>>>> It would be nice to know whether it's possible to mark the files as not
>>>>>>>> stale, or whether I should investigate other things.
>>>>>>>> Or should we consider this volume lost?
>>>>>>>> Also, checking the code at
>>>>>>>> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
>>>>>>>> it seems the functions have shifted quite a bit (line 1724 vs. 2243), so
>>>>>>>> maybe it's already fixed in a newer version?
>>>>>>>> Any thoughts are welcome.
>>>>>>>>
>>>>>>>> Thanks Olaf
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0001.html>
-------------- next part --------------
Non-text attachments (application/x-gzip) were scrubbed:
- data-gfs-bricks-bricka-ovirt-kube.log-20190106.gz (144647 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0008.gz>
- l7-data-gfs-bricks-brick1-ovirt-kube.log-20190106.gz (146483 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0009.gz>
- l10-data-gfs-bricks-brick1-ovirt-kube.log-20190106.gz (146605 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0010.gz>
- l5-data-gfs-bricks-bricka-ovirt-kube.log-20190106.gz (144387 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0011.gz>
- l8-data-gfs-bricks-brick1-ovirt-kube.log-20190106.gz (302224 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0012.gz>
- rhev-data-center-mnt-glusterSD-10.32.9.20?_ovirt-kube.log-20190106.gz (646 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0013.gz>
- l11-data-gfs-bricks-brick1-ovirt-kube.log-20190106.gz (146474 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0014.gz>
- l11-data-gfs-bricks-bricka-ovirt-kube.log-20190106.gz (303508 bytes)
  <http://lists.gluster.org/pipermail/gluster-users/attachments/20190114/67036455/attachment-0015.gz>

