[Gluster-users] One error/warning message after upgrade 5.11 -> 6.8

Hu Bert revirii at googlemail.com
Mon Jun 8 13:54:27 UTC 2020


Hi,

the "process", if we want to call it that, has finished. Maybe there was
a process running that accessed/deleted/... files which hadn't been
accessed for a while, resulting in ctime mdata fixes. In any case, the
heal count is down to 0 on all bricks. Strangely, I see ~34K such log
entries for each brick.
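
Those entries each name a gfid hard link under the brick's .glusterfs
tree. For reference, the backend path is derived from the gfid itself:
the first two hex characters and the next two become two directory
levels. A small illustrative sketch (the helper name is mine, not part
of Gluster):

```python
import os

def gfid_backend_path(brick: str, gfid: str) -> str:
    """Backend hard-link path for a gfid: .glusterfs/<aa>/<bb>/<gfid>."""
    return os.path.join(brick, ".glusterfs", gfid[:2], gfid[2:4], gfid)

# gfid taken from the log excerpts later in this thread:
print(gfid_backend_path("/gluster/md3/persistent",
                        "38306ef8-6588-40cf-8be3-c0a022714612"))
# -> /gluster/md3/persistent/.glusterfs/38/30/38306ef8-6588-40cf-8be3-c0a022714612
```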

Let's think positive: gluster is running properly and doing what it
should do. Great! :D


Hubert

Am Mo., 8. Juni 2020 um 15:36 Uhr schrieb Strahil Nikolov
<hunter86_bg at yahoo.com>:
>
> Hm... That's something I didn't expect.
>
>
> By the way, have you checked if all clients are connected to all bricks (if using FUSE)?
>
> Maybe you have some clients that cannot reach a brick.
>
> Best Regards,
> Strahil Nikolov
>
> На 8 юни 2020 г. 12:48:22 GMT+03:00, Hu Bert <revirii at googlemail.com> написа:
> >Hi Strahil,
> >
> >Thanks for your answer, but I assume your approach won't help. It
> >seems that this behaviour is permanent; e.g. a log entry like
> >this:
> >
> >[2020-06-08 09:40:03.948269] E [MSGID: 113001]
> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
> >file:
> >/gluster/md3/persistent/.glusterfs/38/30/38306ef8-6588-40cf-8be3-c0a022714612:
> >gfid: 38306ef8-6588-40cf-8be3-c0a022714612 key:trusted.glusterfs.mdata
> > [No such file or directory]
> >[2020-06-08 09:40:03.948333] E [MSGID: 113114]
> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
> >0-persistent-posix: gfid: 38306ef8-6588-40cf-8be3-c0a022714612
> >key:trusted.glusterfs.mdata  [No such file or directory]
> >[2020-06-08 09:40:03.948422] I [MSGID: 115060]
> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
> >0-persistent-server: 14193413: SETXATTR
> >/images/generated/207/039/2070391/484x425r.jpg
> >(38306ef8-6588-40cf-8be3-c0a022714612) ==> set-ctime-mdata, client:
> >CTX_ID:b738017c-20a3-4547-afba-5b8933d8e6e5-GRAPH_ID:0-PID:1078-HOST:pepe-PC_NAME:persistent-client-2-RECON_NO:-1,
> >error-xlator: persistent-posix
> >
> >tells me that an error (ctime-mdata) was found and fixed. And this is
> >happening over and over again. A couple of minutes ago I wanted to
> >start with what you suggested, ran 'gluster volume heal
> >persistent info', and suddenly saw:
> >
> >Brick gluster1:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >Brick gluster2:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >Brick gluster3:/gluster/md3/persistent
> >Status: Connected
> >Number of entries: 0
> >
> >I thought 'wtf...'; the heal-count was 0 as well; but the next call
> >~15s later showed this again:
> >
> >Brick gluster1:/gluster/md3/persistent
> >Number of entries: 31
> >
> >Brick gluster2:/gluster/md3/persistent
> >Number of entries: 27
> >
> >Brick gluster3:/gluster/md3/persistent
> >Number of entries: 4
> >
> >To me it looks like the 'error found -> heal it' process works as it
> >should, but due to the permanent errors (log file entries) a heal
> >count of zero is almost never observed.
> >
> >Well, one could deactivate features.ctime, as the log entries suggest
> >it is the cause, but I don't know whether that is reasonable, i.e.
> >whether this feature is needed.
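
(For reference: ctime storage is controlled by a per-volume option, so
checking it, and, if it ever came to that, disabling it, would presumably
look like the commands below; whether disabling it is advisable is
exactly the open question here.)

```shell
# Inspect and (hypothetically) disable the ctime feature on the volume.
# These are configuration commands; they need a live gluster cluster.
gluster volume get persistent features.ctime    # show the current setting
# gluster volume set persistent features.ctime off
```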
> >
> >
> >Best regards,
> >Hubert
> >
> >Am Mo., 8. Juni 2020 um 11:22 Uhr schrieb Strahil Nikolov
> ><hunter86_bg at yahoo.com>:
> >>
> >> Hi Hubert,
> >>
> >> Here is one idea:
> >> 'gluster volume heal VOL info' provides the gfids of the
> >files pending heal.
> >> Once you have them, you can find the inode of each file via 'ls -li
> >/gluster/brick/.glusterfs/<first_two_characters_of_gfid>/<next_two_characters>/<gfid>'.
> >>
> >> Then you can search the brick with find for that inode number (don't
> >forget the 'ionice' to reduce the pressure).
> >>
> >> Once you have the list of files, stat them via the FUSE client and
> >check if they got healed.
> >>
> >> I fully agree that you need to heal the volumes first before
> >proceeding further, or you might get into a nasty situation.
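
Put together, the steps above can be sketched as a small script. The
brick path and gfid below are placeholders taken from this thread; the
gluster, find and stat steps are left as comments because they need a
live cluster:

```shell
#!/bin/sh
# Sketch of the lookup procedure above; adjust paths for your setup.
BRICK=/gluster/md3/persistent
GFID=38306ef8-6588-40cf-8be3-c0a022714612

# 1. On a live cluster, list the gfids pending heal:
#      gluster volume heal persistent info

# 2. Each gfid has a hard link at .glusterfs/<aa>/<bb>/<gfid> on the brick:
PFX1=$(printf '%s' "$GFID" | cut -c1-2)
PFX2=$(printf '%s' "$GFID" | cut -c3-4)
GFID_PATH="$BRICK/.glusterfs/$PFX1/$PFX2/$GFID"
echo "$GFID_PATH"

# 3. Read the inode number with 'ls -li "$GFID_PATH"', then search the
#    brick for the real file, throttled with ionice:
#      ionice -c3 find "$BRICK" -inum "$INODE" -not -path '*/.glusterfs/*'

# 4. stat the resulting path via the FUSE mount, then re-check
#    'gluster volume heal persistent info' to see whether it healed.
```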
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >>
> >> На 8 юни 2020 г. 8:30:57 GMT+03:00, Hu Bert <revirii at googlemail.com>
> >написа:
> >> >Good morning,
> >> >
> >> >I just wanted to update from version 6.8 to 6.9 on our replica 3
> >> >system (formerly version 5.11), and I see tons of these messages:
> >> >
> >> >[2020-06-08 05:25:55.192301] E [MSGID: 113001]
> >> >[posix-metadata.c:234:posix_store_mdata_xattr] 0-persistent-posix:
> >> >file:
> >>
> >>/gluster/md3/persistent/.glusterfs/43/31/43312aba-75c6-42c2-855c-e0db66d7748f:
> >> >gfid: 43312aba-75c6-42c2-855c-e0db66d7748f
> >key:trusted.glusterfs.mdata
> >> > [No such file or directory]
> >> >[2020-06-08 05:25:55.192375] E [MSGID: 113114]
> >> >[posix-metadata.c:433:posix_set_mdata_xattr_legacy_files]
> >> >0-persistent-posix: gfid: 43312aba-75c6-42c2-855c-e0db66d7748f
> >> >key:trusted.glusterfs.mdata  [No such file or directory]
> >> >[2020-06-08 05:25:55.192426] I [MSGID: 115060]
> >> >[server-rpc-fops.c:938:_gf_server_log_setxattr_failure]
> >> >0-persistent-server: 13382741: SETXATTR
> >> ><gfid:43312aba-75c6-42c2-855c-e0db66d7748f>
> >> >(43312aba-75c6-42c2-855c-e0db66d7748f) ==> set-ctime-mdata, client:
> >>
> >>CTX_ID:e223ca30-6c30-4a40-ae98-a418143ce548-GRAPH_ID:0-PID:1006-HOST:sam-PC_NAME:persistent-client-2-RECON_NO:-1,
> >> >error-xlator: persistent-posix
> >> >
> >> >Still the ctime-message. And a lot of these messages:
> >> >
> >> >[2020-06-08 05:25:53.016606] W [MSGID: 101159]
> >> >[inode.c:1330:__inode_unlink] 0-inode:
> >> >7043eed7-dbd7-4277-976f-d467349c1361/21194684.jpg: dentry not found
> >in
> >> >839512f0-75de-414f-993d-1c35892f8560
> >> >
> >> >Well... the problem is: the volume seems to be in a permanent heal
> >> >status:
> >> >
> >> >Gathering count of entries to be healed on volume persistent has
> >been
> >> >successful
> >> >Brick gluster1:/gluster/md3/persistent
> >> >Number of entries: 31
> >> >Brick gluster2:/gluster/md3/persistent
> >> >Number of entries: 6
> >> >Brick gluster3:/gluster/md3/persistent
> >> >Number of entries: 5
> >> >
> >> >a bit later:
> >> >Gathering count of entries to be healed on volume persistent has
> >been
> >> >successful
> >> >Brick gluster1:/gluster/md3/persistent
> >> >Number of entries: 100
> >> >Brick gluster2:/gluster/md3/persistent
> >> >Number of entries: 74
> >> >Brick gluster3:/gluster/md3/persistent
> >> >Number of entries: 1
> >> >
> >> >The number of entries never reaches 0-0-0; I already updated one of
> >> >the systems from 6.8 to 6.9, but updating the other 2 while the heal
> >> >count isn't zero doesn't seem like a good idea. Well... any ideas?
> >> >
> >> >
> >> >Best regards,
> >> >Hubert
> >> >
> >> >Am Fr., 8. Mai 2020 um 21:47 Uhr schrieb Strahil Nikolov
> >> ><hunter86_bg at yahoo.com>:
> >> >>
> >> >> On April 21, 2020 8:00:32 PM GMT+03:00, Amar Tumballi
> >> ><amar at kadalu.io> wrote:
> >> >> >There seems to be a burst of issues when people upgraded to 5.x
> >> >> >or 6.x from 3.12 (thanks to you and Strahil, who have reported
> >> >> >most of them).
> >> >> >
> >> >> >The latest update from Strahil is that if files are copied fresh
> >> >> >on the 7.5 series, there are no issues.
> >> >> >
> >> >> >We are in the process of identifying the patch, and will also
> >> >> >provide an option to disable 'acl' for testing. We will update
> >> >> >once we identify the issue.
> >> >> >
> >> >> >Regards,
> >> >> >Amar
> >> >> >
> >> >> >
> >> >> >
> >> >> >On Sat, Apr 11, 2020 at 11:10 AM Hu Bert <revirii at googlemail.com>
> >> >> >wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >>
> >> >> >> no one has seen such messages?
> >> >> >>
> >> >> >> Regards,
> >> >> >> Hubert
> >> >> >>
> >> >> >> Am Mo., 6. Apr. 2020 um 06:13 Uhr schrieb Hu Bert
> >> >> ><revirii at googlemail.com
> >> >> >> >:
> >> >> >> >
> >> >> >> > Hello,
> >> >> >> >
> >> >> >> > I just upgraded my servers and clients from 5.11 to 6.8;
> >> >> >> > besides one connection problem to the gluster download
> >> >> >> > server everything went fine.
> >> >> >> >
> >> >> >> > On the 3 gluster servers I mount the 2 volumes as well, and
> >> >> >> > only there (not on any of the other clients) do some messages
> >> >> >> > appear in the log files of both mounts:
> >> >> >> > [2020-04-06 04:10:53.552561] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-2: remote operation failed [Permission
> >> >denied]
> >> >> >> > [2020-04-06 04:10:53.552635] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-1: remote operation failed [Permission
> >> >denied]
> >> >> >> > [2020-04-06 04:10:53.552639] W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-0: remote operation failed [Permission
> >> >denied]
> >> >> >> > [2020-04-06 04:10:53.553226] E [MSGID: 148002]
> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk]
> >> >0-persistent-utime:
> >> >> >dict
> >> >> >> > set of key for set-ctime-mdata failed [Permission denied]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-2: remote operation failed [Permission
> >> >denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552561] and
> >> >> >[2020-04-06
> >> >> >> > 04:10:53.745542]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-1: remote operation failed [Permission
> >> >denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552635] and
> >> >> >[2020-04-06
> >> >> >> > 04:10:53.745610]
> >> >> >> > The message "W [MSGID: 114031]
> >> >> >> > [client-rpc-fops_v2.c:851:client4_0_setxattr_cbk]
> >> >> >> > 0-persistent-client-0: remote operation failed [Permission
> >> >denied]"
> >> >> >> > repeated 4 times between [2020-04-06 04:10:53.552639] and
> >> >> >[2020-04-06
> >> >> >> > 04:10:53.745632]
> >> >> >> > The message "E [MSGID: 148002]
> >> >> >> > [utime.c:146:gf_utime_set_mdata_setxattr_cbk]
> >> >0-persistent-utime:
> >> >> >dict
> >> >> >> > set of key for set-ctime-mdata failed [Permission denied]"
> >> >repeated
> >> >> >4
> >> >> >> > times between [2020-04-06 04:10:53.553226] and [2020-04-06
> >> >> >> > 04:10:53.746080]
> >> >> >> >
> >> >> >> > Anything to worry about?
> >> >> >> >
> >> >> >> >
> >> >> >> > Regards,
> >> >> >> > Hubert
> >> >>
> >> >> Hi,
> >> >>
> >> >> Can you provide the xfs_info for the bricks from the volume ?
> >> >>
> >> >> I have a theory that I want to confirm or reject.
> >> >>
> >> >> Best Regards,
> >> >> Strahil Nikolov

