[Gluster-users] One error/warning message after upgrade 5.11 -> 6.8

Strahil Nikolov hunter86_bg at yahoo.com
Sun Jun 21 12:31:27 UTC 2020


Another oVirt user is experiencing ACL issues - obviously something is going on ...

On June 8, 2020 16:54:27 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
>Hi,
>
>the "process", if we want to call it that, has finished. Maybe there
>was a process running that accessed/deleted/... files that hadn't been
>accessed for a while, resulting in ctime mdata fixes. In any case, the
>heal count is down to 0 on all bricks. Very strange; I see ~34K such
>log entries for each brick.
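[Editorial note: counting those entries directly is one way to confirm the ~34K figure. A minimal sketch; the sample log lines below are copied from later in this thread, and the real brick log location (somewhere under /var/log/glusterfs/bricks/) is an assumption that differs per setup:]

```shell
# Count the ctime-mdata errors (MSGID 113001 and 113114) in a brick log.
# LOG here is a temporary file with sample lines; on a real node, point
# it at the brick log instead.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
[2020-06-08 09:40:03.948269] E [MSGID: 113001] [posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix: file: ...
[2020-06-08 09:40:03.948333] E [MSGID: 113114] [posix-metadata.c:433:posix_set_mdata_xattr_legacy_files] 0-persistent-posix: ...
[2020-06-08 09:40:03.948422] I [MSGID: 115060] [server-rpc-fops.c:938:_gf_server_log_setxattr_failure] ...
EOF
# Match only the two error MSGIDs, not the informational 115060 line.
COUNT=$(grep -cE 'MSGID: 113(001|114)' "$LOG")
echo "$COUNT"
rm -f "$LOG"
```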
>
>Let's think positive: gluster is running properly and doing what it
>should do. Great! :D
>
>
>Hubert
>
On Mon, June 8, 2020 at 15:36, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>>
>> Hm... That's something I didn't expect.
>>
>>
>> By the way, have you checked if all clients are connected to all
>> bricks (if using FUSE)?
>>
>> Maybe you have some clients that cannot reach a brick.
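[Editorial note: this connectivity check can be scripted. A sketch that parses output in roughly the shape that 'gluster volume status VOL clients' prints on a server node; the sample text and client counts below are made up for illustration:]

```shell
# Sample output shaped like 'gluster volume status persistent clients';
# on a live server node replace OUT with:
#   OUT=$(gluster volume status persistent clients)
OUT='Brick : gluster1:/gluster/md3/persistent
Clients connected : 9
Brick : gluster2:/gluster/md3/persistent
Clients connected : 8
Brick : gluster3:/gluster/md3/persistent
Clients connected : 9'
# A brick reporting fewer connections than its peers points at a client
# that cannot reach that brick.
COUNTS=$(printf '%s\n' "$OUT" | awk -F' : ' '/Clients connected/ {print $2}')
echo "$COUNTS"
```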
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On June 8, 2020 12:48:22 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
>> >Hi Strahil,
>> >
>> >Thanks for your answer, but I assume your approach won't help. It
>> >seems this behaviour is permanent; e.g. a log entry like
>> >this:
>> >
>> >[2020-06-08 09:40:03.948269] E [MSGID: 113001]
>> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
>> >file: /gluster/md3/persistent/.glusterfs/38/30/38306ef8-6588-40cf-8be3-c0a022714612:
>> >gfid: 38306ef8-6588-40cf-8be3-c0a022714612 key:trusted.glusterfs.mdata
>> > [No such file or directory]
>> >[2020-06-08 09:40:03.948333] E [MSGID: 113114]
>> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
>> >0-persistent-posix: gfid: 38306ef8-6588-40cf-8be3-c0a022714612
>> >key:trusted.glusterfs.mdata  [No such file or directory]
>> >[2020-06-08 09:40:03.948422] I [MSGID: 115060]
>> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
>> >0-persistent-server: 14193413: SETXATTR
>> >/images/generated/207/039/2070391/484x425r.jpg
>> >(38306ef8-6588-40cf-8be3-c0a022714612) ==> set-ctime-mdata, client:
>> >CTX_ID:b738017c-20a3-4547-afba-5b8933d8e6e5-GRAPH_ID:0-PID:1078-HOST:pepe-PC_NAME:persistent-client-2-RECON_NO:-1,
>> >error-xlator: persistent-posix
>> >
>> >tells me that an error (ctime-mdata) is found and fixed. And this is
>> >happening over and over again. A couple of minutes ago I wanted to
>> >begin with what you suggested and ran 'gluster volume heal
>> >persistent info', and suddenly saw:
>> >
>> >Brick gluster1:/gluster/md3/persistent
>> >Status: Connected
>> >Number of entries: 0
>> >
>> >Brick gluster2:/gluster/md3/persistent
>> >Status: Connected
>> >Number of entries: 0
>> >
>> >Brick gluster3:/gluster/md3/persistent
>> >Status: Connected
>> >Number of entries: 0
>> >
>> >I thought 'wtf...'; the heal-count was 0 as well; but the next call
>> >~15s later showed this again:
>> >
>> >Brick gluster1:/gluster/md3/persistent
>> >Number of entries: 31
>> >
>> >Brick gluster2:/gluster/md3/persistent
>> >Number of entries: 27
>> >
>> >Brick gluster3:/gluster/md3/persistent
>> >Number of entries: 4
>> >
>> >For me it looks like the 'error found -> heal it' process works as
>> >it should, but due to the permanent errors (log file entries) a heal
>> >count of zero is almost impossible to reach.
>> >
>> >Well, one could deactivate features.ctime, as this seems to be the
>> >cause (as the log entries suggest), but I don't know if that is
>> >reasonable, i.e. whether this feature is needed.
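[Editorial note: for reference, inspecting and toggling the option would look like the following. This sketch uses a dry-run wrapper so nothing is executed here; whether disabling features.ctime is advisable is exactly the open question above, since consistent ctime across replicas is lost while it is off:]

```shell
# Dry-run wrapper: prints each command instead of executing it.
# On a real gluster server node, change it to: run() { "$@"; }
run() { echo "+ $*"; }

run gluster volume get persistent features.ctime     # show current value
run gluster volume set persistent features.ctime off # disable the feature
```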
>> >
>> >
>> >Best regards,
>> >Hubert
>> >
>> >On Mon, June 8, 2020 at 11:22, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>> >>
>> >> Hi Hubert,
>> >>
>> >> Here is one idea:
>> >> Using 'gluster volume heal VOL info' can provide the gfids of
>> >> files pending heal.
>> >> Once you have them, you can find the inode of each file via 'ls -li
>> >> /gluster/brick/.glusterfs/<first_two_characters_of_gfid>/<next_two_characters>/<gfid>'.
>> >>
>> >> Then you can search the brick with find for that inode number
>> >> (don't forget the 'ionice' to reduce the pressure).
>> >>
>> >> Once you have the list of files, stat them via the FUSE client
>> >> and check if they got healed.
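[Editorial note: the steps above can be sketched in POSIX shell. The gfid and brick path are the ones appearing in this thread's logs; the commented commands at the end are meant to run on a brick host and are shown here without executing:]

```shell
# A file's gfid hard link lives on the brick at:
#   <brick>/.glusterfs/<first 2 hex chars of gfid>/<next 2>/<full gfid>
GFID="38306ef8-6588-40cf-8be3-c0a022714612"
BRICK="/gluster/md3/persistent"
P1=$(printf '%s' "$GFID" | cut -c1-2)
P2=$(printf '%s' "$GFID" | cut -c3-4)
GFID_PATH="$BRICK/.glusterfs/$P1/$P2/$GFID"
echo "$GFID_PATH"
# On the brick host you would then continue with:
#   INODE=$(stat -c %i "$GFID_PATH")
#   ionice -c3 find "$BRICK" -inum "$INODE" -not -path '*/.glusterfs/*'
#   # ...and finally stat the resulting path through the FUSE mount.
```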
>> >>
>> >> I fully agree that you need to first heal the volumes before
>> >> proceeding further or you might get into a nasty situation.
>> >>
>> >> Best Regards,
>> >> Strahil Nikolov
>> >>
>> >>
>> >> On June 8, 2020 8:30:57 GMT+03:00, Hu Bert <revirii at googlemail.com> wrote:
>> >> >Good morning,
>> >> >
>> >> >I just wanted to update the version from 6.8 to 6.9 on our
>> >> >replica 3 system (formerly on version 5.11), and I see tons of
>> >> >these messages:
>> >> >
>> >> >[2020-06-08 05:25:55.192301] E [MSGID: 113001]
>> >> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
>> >> >file: /gluster/md3/persistent/.glusterfs/43/31/43312aba-75c6-42c2-855c-e0db66d7748f:
>> >> >gfid: 43312aba-75c6-42c2-855c-e0db66d7748f key:trusted.glusterfs.mdata
>> >> > [No such file or directory]
>> >> >[2020-06-08 05:25:55.192375] E [MSGID: 113114]
>> >> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
>> >> >0-persistent-posix: gfid: 43312aba-75c6-42c2-855c-e0db66d7748f
>> >> >key:trusted.glusterfs.mdata  [No such file or directory]
>> >> >[2020-06-08 05:25:55.192426] I [MSGID: 115060]
>> >> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
>> >> >0-persistent-server: 13382741: SETXATTR
>> >> ><gfid:43312aba-75c6-42c2-855c-e0db66d7748f>
>> >> >(43312aba-75c6-42c2-855c-e0db66d7748f) ==> set-ctime-mdata, client:
>> >> >CTX_ID:e223ca30-6c30-4a40-ae98-a418143ce548-GRAPH_ID:0-PID:1006-HOST:sam-PC_NAME:persistent-client-2-RECON_NO:-1,
>> >> >error-xlator: persistent-posix
>> >> >
>> >> >Still the ctime-message. And a lot of these messages:
>> >> >
>> >> >[2020-06-08 05:25:53.016606] W [MSGID: 101159]
>> >> >[inode.c:1330:__inode_unlink] 0-inode:
>> >> >7043eed7-dbd7-4277-976f-d467349c1361/21194684.jpg: dentry not
>> >> >found in 839512f0-75de-414f-993d-1c35892f8560
>> >> >
>> >> >Well... the problem is: the volume seems to be in a permanent
>> >> >heal status:
>> >> >
>> >> >Gathering count of entries to be healed on volume persistent has
>> >> >been successful
>> >> >Brick gluster1:/gluster/md3/persistent
>> >> >Number of entries: 31
>> >> >Brick gluster2:/gluster/md3/persistent
>> >> >Number of entries: 6
>> >> >Brick gluster3:/gluster/md3/persistent
>> >> >Number of entries: 5
>> >> >
>> >> >a bit later:
>> >> >Gathering count of entries to be healed on volume persistent has
>> >> >been successful
>> >> >Brick gluster1:/gluster/md3/persistent
>> >> >Number of entries: 100
>> >> >Brick gluster2:/gluster/md3/persistent
>> >> >Number of entries: 74
>> >> >Brick gluster3:/gluster/md3/persistent
>> >> >Number of entries: 1
>> >> >
>> >> >The number of entries never reaches 0-0-0; I already updated one
>> >> >of the systems from 6.8 to 6.9, but updating the other 2 while the
>> >> >heal count isn't zero doesn't seem to be a good idea. Well... any
>> >> >idea?
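[Editorial note: a quick way to watch for the counts reaching 0-0-0 is to sum the per-brick entries from 'gluster volume heal persistent statistics heal-count'. A sketch; OUT below is sample text with the numbers from the message above rather than live command output:]

```shell
# Sample heal-count output; on a live node replace OUT with:
#   OUT=$(gluster volume heal persistent statistics heal-count)
OUT='Brick gluster1:/gluster/md3/persistent
Number of entries: 100
Brick gluster2:/gluster/md3/persistent
Number of entries: 74
Brick gluster3:/gluster/md3/persistent
Number of entries: 1'
# Sum the per-brick pending-heal counts; repeat (e.g. every 15s) and
# only proceed with the upgrade once the total stays at 0.
TOTAL=$(printf '%s\n' "$OUT" | awk -F': ' '/Number of entries/ {s += $2} END {print s}')
echo "$TOTAL"
```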
>> >> >
>> >> >
>> >> >Best regards,
>> >> >Hubert
>> >> >
>> >> >On Fri, May 8, 2020 at 21:47, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>> >> >>
>> >> >> On April 21, 2020 8:00:32 PM GMT+03:00, Amar Tumballi <amar at kadalu.io> wrote:
>> >> >> >There seems to be a burst of issues when people upgraded to
>> >> >> >5.x or 6.x from 3.12 (thanks to you and Strahil, who have
>> >> >> >reported most of them).
>> >> >> >
>> >> >> >Latest update from Strahil is that if files are copied fresh
>> >> >> >on the 7.5 series, there are no issues.
>> >> >> >
>> >> >> >We are in the process of identifying the patch, and will also
>> >> >> >provide an option to disable 'acl' for testing. Will update
>> >> >> >once we identify the issue.
>> >> >> >
>> >> >> >Regards,
>> >> >> >Amar
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >On Sat, Apr 11, 2020 at 11:10 AM Hu Bert <revirii at googlemail.com> wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> no one has seen such messages?
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Hubert
>> >> >> >>
>> >> >> >> On Mon, Apr 6, 2020 at 06:13, Hu Bert <revirii at googlemail.com> wrote:
>> >> >> >> >
>> >> >> >> > Hello,
>> >> >> >> >
>> >> >> >> > I just upgraded my servers and clients from 5.11 to 6.8;
>> >> >> >> > besides one connection problem to the gluster download
>> >> >> >> > server, everything went fine.
>> >> >> >> >
>> >> >> >> > On the 3 gluster servers I mount the 2 volumes as well, and
>> >> >> >> > only there (and not on any of the other clients) are there
>> >> >> >> > some messages in both mount log files:
>> >> >> >> >
>> >> >> >> > [2020-04-06 04:10:53.552561] W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]
>> >> >> >> > [2020-04-06 04:10:53.552635] W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]
>> >> >> >> > [2020-04-06 04:10:53.552639] W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]
>> >> >> >> > [2020-04-06 04:10:53.553226] E [MSGID: 148002]
>> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
>> >> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]
>> >> >> >> > The message "W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-2: remote operation failed [Permission denied]"
>> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552561] and
>> >> >> >> > [2020-04-06 04:10:53.745542]
>> >> >> >> > The message "W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-1: remote operation failed [Permission denied]"
>> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552635] and
>> >> >> >> > [2020-04-06 04:10:53.745610]
>> >> >> >> > The message "W [MSGID: 114031]
>> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
>> >> >> >> > 0-persistent-client-0: remote operation failed [Permission denied]"
>> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552639] and
>> >> >> >> > [2020-04-06 04:10:53.745632]
>> >> >> >> > The message "E [MSGID: 148002]
>> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk] 0-persistent-utime:
>> >> >> >> > dict set of key for set-ctime-mdata failed [Permission denied]"
>> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.553226] and
>> >> >> >> > [2020-04-06 04:10:53.746080]
>> >> >> >> >
>> >> >> >> > Anything to worry about?
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Regards,
>> >> >> >> > Hubert
>> >> >> >> ________
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Community Meeting Calendar:
>> >> >> >>
>> >> >> >> Schedule -
>> >> >> >> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> >> >> >> Bridge: https://bluejeans.com/441850968
>> >> >> >>
>> >> >> >> Gluster-users mailing list
>> >> >> >> Gluster-users at gluster.org
>> >> >> >> https://lists.gluster.org/mailman/listinfo/gluster-users
>> >> >> >>
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> Can you provide the xfs_info for the bricks from the volume?
>> >> >>
>> >> >> I have a theory that I want to confirm or reject.
>> >> >>
>> >> >> Best Regards,
>> >> >> Strahil Nikolov


More information about the Gluster-users mailing list