[Gluster-users] No healing, errno 22

Wed Mar 17 13:27:56 UTC 2021

On 16/03/21 11:45 pm, Zenon Panoussis wrote:
>> Yes, if the dataset is small, you can try rm -rf of the dir
>> from the mount (assuming no other application is accessing
>> them on the volume), launch heal once so that the heal info
>> becomes zero, and then copy it over again.
> I did approximately so; the rm -rf took its sweet time and the
> number of entries to be healed kept diminishing as the deletion
> progressed. At the end I was left with
>
> Mon Mar 15 22:57:09 CET 2021
> Gathering count of entries to be healed on volume gv0 has been successful
>
> Brick node01:/gfs/gv0
> Number of entries: 3
>
> Brick mikrivouli:/gfs/gv0
> Number of entries: 2
>
> Brick nanosaurus:/gfs/gv0
> Number of entries: 3
> --------------
>
> and that's where I've been ever since, for the past 20 hours.
> SHD has kept trying to heal them all along and the log brings
> us back to square one:
>
> [2021-03-16 14:51:35.059593 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-gv0-replicate-0: performing entry selfheal on 94aefa13-9828-49e5-9bac-6f70453c100f
Does this gfid correspond to the same directory path as last time?
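
If it helps to check, a gfid can be mapped to its entry under the brick's
.glusterfs directory; for a directory that entry is normally a symlink
pointing at the parent gfid and the directory name. A minimal Python sketch,
assuming the usual .glusterfs/<first two hex chars>/<next two>/<gfid> layout.
The function names are made up for illustration, and the brick path is the
one from your heal-count output:

```python
import os
import posixpath


def gfid_backend_path(brick_root, gfid):
    """Build the .glusterfs entry path for a gfid on a brick.

    Assumes the layout <brick>/.glusterfs/<aa>/<bb>/<gfid>, where aa and bb
    are the first and second pairs of hex characters of the gfid.
    """
    gfid = gfid.lower()
    return posixpath.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def resolve_gfid(brick_root, gfid):
    """Return the symlink target of a directory gfid, or None.

    None means the entry is missing or is not a symlink (for example a
    regular file, which is hard-linked rather than symlinked).
    """
    entry = gfid_backend_path(brick_root, gfid)
    if os.path.islink(entry):
        return os.readlink(entry)
    return None


if __name__ == "__main__":
    gfid = "94aefa13-9828-49e5-9bac-6f70453c100f"
    print(gfid_backend_path("/gfs/gv0", gfid))
    print(resolve_gfid("/gfs/gv0", gfid))
```

Run it on each brick node; comparing the symlink target across the three
bricks should show whether this is the same directory as before.
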
> [2021-03-16 15:39:43.680380 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-0: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2021-03-16 15:39:43.769604 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2021-03-16 15:39:43.908425 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-1: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [...]
I wonder if you can attach gdb to the glustershd process, set a
breakpoint at the function client4_0_mkdir(), and print args->loc->path
to see which path the mkdir is being attempted on.
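
A rough sketch of how that could be scripted, in Python, which only builds
the gdb command line for a given pid. It assumes gdb and the glusterfs
debug symbols are installed. The helper name and the "steps" parameter are
hypothetical: args may not be assigned yet when the breakpoint is hit, so
a few "next" steps might be needed before the print works.

```python
import shlex


def gdb_mkdir_probe(pid, expr="args->loc->path", steps=0):
    """Build a gdb batch command that attaches to pid, stops at
    client4_0_mkdir(), optionally steps, and prints expr.

    Hypothetical helper: 'steps' is a guess at how many 'next' commands
    are needed before 'args' has been assigned inside the function.
    """
    cmd = [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "break client4_0_mkdir",
        "-ex", "continue",  # blocks until the next mkdir is sent
    ]
    for _ in range(steps):
        cmd += ["-ex", "next"]
    cmd += ["-ex", "print " + expr]
    return cmd


if __name__ == "__main__":
    # 12345 is a placeholder; substitute the real glustershd pid.
    print(shlex.join(gdb_mkdir_probe(12345)))
```

Note that the process is paused while gdb is stopped at the breakpoint, so
it is best done on one node at a time.
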
> In other words, deleting and recreating the unhealable files
> and directories was a workaround, but the underlying problem
> persists and I can't even begin to look for it when I have no
> clue what errno 22 means in plain English.
>
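
As for the plain-English meaning: errno 22 is EINVAL, "Invalid argument",
which is the same text the log prints in {error=...}. It only says that
some argument of the call was rejected, not which one. A small Python
sketch for looking up any errno value on the local system (describe_errno
is just an illustrative name):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this system."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    print(describe_errno(22))
```
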
> In any case, glusterd.log is full of messages like
The server-quorum messages in the glusterd log are unrelated; you can raise
a separate GitHub issue for that. (And you can leave it at 'off'.)
>
> [2021-03-16 15:37:03.398619 +0000] I [MSGID: 106533] [glusterd-volume-ops.c:717:__glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume gv0
> [2021-03-16 15:37:03.791452 +0000] E [MSGID: 106061] [glusterd-server-quorum.c:260:glusterd_is_volume_in_server_quorum] 0-management: Dict get failed [{Key=cluster.server-quorum-type}]
>