[Gluster-users] glusterd keeps resyncing shards over and over again

Mon Dec 10 12:18:10 UTC 2018

Hello Ravi
Thanks for your quick reply!
How can I check that?
We have about 60 shards continuously being synced. It seems those
shards are always the same.
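
One way to confirm that the same shards are healed repeatedly is to count the "Completed ... selfheal on <gfid>" messages per GFID in glustershd.log. The following is only a minimal sketch: the function names `count_heals` and `repeated_heals` are made up for illustration, and the pattern assumes the message format seen in the glustershd.log excerpt quoted further down in this thread, with each log entry on a single line.

```python
import re
from collections import Counter

# Assumption: heal completions are logged as
# "Completed <type> selfheal on <gfid>", as in the excerpt in this thread.
HEAL_RE = re.compile(
    r"Completed (?P<kind>\w+) selfheal on "
    r"(?P<gfid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def count_heals(lines):
    """Return a Counter mapping GFID -> number of completed self-heals."""
    counts = Counter()
    for line in lines:
        match = HEAL_RE.search(line)
        if match:
            counts[match.group("gfid")] += 1
    return counts


def repeated_heals(lines, threshold=2):
    """Return (gfid, count) pairs healed at least `threshold` times,
    most frequently healed first."""
    return [
        (gfid, n)
        for gfid, n in count_heals(lines).most_common()
        if n >= threshold
    ]
```

Passing an open file object for /var/log/glusterfs/glustershd.log to `repeated_heals` would list the GFIDs that were healed more than once. If the same GFIDs keep showing up with growing counts, that supports the impression that a fixed set of shards is being resynced.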
Kind regards,
Chris
On Mon, 2018-12-10 at 17:33 +0530, Ravishankar N wrote:
>     On 12/10/2018 05:06 PM, Atin Mukherjee wrote:
>
> >       Even though the subject says the issue is with
> >         glusterd, I think the question is more applicable to
> >         heal/shards. Added the relevant folks to help out.
> > 
> >       
> >       
> > 
> >       
> >         On Mon, Dec 10, 2018 at 3:43 PM Chris Drescher
> >           <info at linuxfabrik.ch> wrote:
> > 
> >         
> >         
> > >           
> > >             Let me provide more information.
> > >             
> > > 
> > >             
> > >             We have 3 gluster nodes running with sharding
> > >               activated.
> > >             
> > > 
> > >             
> > >             Node1: CentOS 7.5 - Glusterfs 3.12.6
> > >             Node2: CentOS 7.5 - Glusterfs 3.12.6
> > >             Node3: CentOS 7.5 - Glusterfs 3.12.6
> > >             
> > > 
> > >             
> > >             We have now updated Node3 from CentOS 7.5 to 7.6, which
> > >               caused a reboot.
> > >             The GlusterFS version changed from 3.12.6 to 3.12.15.
> > >             
> > > 
> > >             
> > >             Node1: CentOS 7.5 - Glusterfs 3.12.6
> > >             Node2: CentOS 7.5 - Glusterfs 3.12.6
> > >             Node3: CentOS 7.6 - Glusterfs 3.12.15
> > >             
> > > 
> > >             
> > >             Afterwards the gluster heal daemon keeps resyncing specific
> > >               shards on bricks on Node1 and Node2. Always the same
> > >               shards.
> > >           
> > >         
> > 
> >       
> >     
> 
>     
> 
>     Your clients (mounts) might be experiencing disconnects from the
>     brick process(es) while the same set of shards is being written to,
>     possibly to the second brick, going by the "sinks=1" log below.
>     Check if that is the case.
> 
>     -Ravi
> 
>     
> >       
> >         
> > >           
> > >             
> > > 
> > >             
> > >             LOGS:
> > >             
> > > 
> > >             
> > >             On upgraded NODE3:
> > >             
> > > 
> > >             
> > >             /var/log/glusterfs/glusterd.log
> > >             [2018-12-10 09:24:42.314624] E [MSGID: 106062]
> > >               [glusterd-utils.c:10112:glusterd_max_opversion_use_rsp_dict]
> > >               0-management: Maximum supported op-version not set in
> > >               destination dictionary
> > >             
> > > 
> > >             
> > >             tail -f /var/log/glusterfs/glustershd.log
> > >             [2018-12-09 04:28:05.687127] I [MSGID: 108026]
> > >               [afr-self-heal-common.c:1726:afr_log_selfheal]
> > >               0-data-replicate-0: Completed data selfheal on
> > >               3f1711c2-de8c-4e8e-be10-a252f5b1b4ad. sources=[0]
> > >               2  sinks=1 
> > >             
> > > 
> > >             
> > >             
> > > 
> > >             
> > >             On NODE1:
> > >             
> > > 
> > >             
> > >             tail -f /var/log/glusterfs/glfsheal-data.log
> > >             [2018-12-10 10:00:01.898139] I [MSGID: 114035]
> > >               [client-handshake.c:202:client_set_lk_version_cbk]
> > >               0-data-client-16: Server lk version = 1
> > >             [2018-12-10 10:00:01.898487] I [MSGID: 114057]
> > >               [client-handshake.c:1478:select_server_supported_programs]
> > >               0-data-client-17: Using Program GlusterFS 3.3, Num
> > >               (1298437), Version (330)
> > >             [2018-12-10 10:00:01.898892] I [MSGID: 114046]
> > >               [client-handshake.c:1231:client_setvolume_cbk]
> > >               0-data-client-17: Connected to data-client-17, attached to
> > >               remote volume '/gluster/arb2/data'.
> > >             [2018-12-10 10:00:01.898900] I [MSGID: 114047]
> > >               [client-handshake.c:1242:client_setvolume_cbk]
> > >               0-data-client-17: Server and Client lk-version numbers are
> > >               not same, reopening the fds
> > >             [2018-12-10 10:00:01.899007] I [MSGID: 114035]
> > >               [client-handshake.c:202:client_set_lk_version_cbk]
> > >               0-data-client-17: Server lk version = 1
> > >             [2018-12-10 10:00:01.901528] I [MSGID: 108031]
> > >               [afr-common.c:2376:afr_local_discovery_cbk]
> > >               0-data-replicate-3: selecting local read_child
> > >               data-client-9
> > >             [2018-12-10 10:00:01.901876] I [MSGID: 108031]
> > >               [afr-common.c:2376:afr_local_discovery_cbk]
> > >               0-data-replicate-5: selecting local read_child
> > >               data-client-15
> > >             [2018-12-10 10:00:01.901978] I [MSGID: 108031]
> > >               [afr-common.c:2376:afr_local_discovery_cbk]
> > >               0-data-replicate-4: selecting local read_child
> > >               data-client-12
> > >             [2018-12-10 10:00:01.902708] I [MSGID: 108031]
> > >               [afr-common.c:2376:afr_local_discovery_cbk]
> > >               0-data-replicate-2: selecting local read_child
> > >               data-client-6
> > >             [2018-12-10 10:00:01.902750] I [MSGID: 104041]
> > >               [glfs-resolve.c:971:__glfs_active_subvol] 0-data: switched
> > >               to graph 70312d70-6f64-3031-2e6c-696e75786661 (0)
> > >             
> > > 
> > >             
> > >             Hope that helps!
> > >             
> > > 
> > >             
> > >             On Mon, 2018-12-10 at 09:22 +0100, Chris Drescher
> > >               wrote:
> > >             
> > > >               Hello everybody
> > > >               
> > > > 
> > > >               
> > > >               We are experiencing an urgent issue with glusterd!
> > > >               After an upgrade from CentOS 7.5 to 7.6 our GlusterFS
> > > >                 keeps resyncing specific shards over and over again!
> > > >               
> > > > 
> > > >               
> > > >               Is this a known problem?
> > > >               
> > > > 
> > > >               
> > > >               This is very urgent! Please help!
> > > >               
> > > > 
> > > >               
> > > >               Thanks in advance!
> > > >               
> > > > 
> > > >               
> > > >               Kind regards.
> > > >               
> > > > 
> > > >               
> > > >               Chris
> > > >               _______________________________________________
> > > >               Gluster-users mailing list
> > > >               Gluster-users at gluster.org
> > > >               
> > > > https://lists.gluster.org/mailman/listinfo/gluster-users
> > > >             
> > > 
> > 
> >       
> >     
> 
>     
> 
>   
> 