[Gluster-users] geo-replication sync issue

Strahil Nikolov hunter86_bg at yahoo.com
Thu Mar 12 04:55:55 UTC 2020


On March 11, 2020 10:17:05 PM GMT+02:00, "Etem Bayoğlu" <etembayoglu at gmail.com> wrote:
>Hi Strahil,
>
>Thank you for your response. When I tail the logs on both master and
>slave, I get this:
>
>On the slave, from the /var/log/glusterfs/geo-replication-slaves/<geo-session>/mnt-XXX.log file:
>
>[2020-03-11 19:53:32.721509] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>[2020-03-11 19:53:32.723758] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>
>On the master, from the /var/log/glusterfs/geo-replication/<geo-session>/mnt-XXX.log file:
>
>[2020-03-11 19:40:55.872002] E [fuse-bridge.c:4188:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
>[2020-03-11 19:40:58.389748] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998] (--> /lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (--> /lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>[2020-03-11 19:41:08.214591] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998] (--> /lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (--> /lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>[2020-03-11 19:53:59.275469] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f1f4b9108ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f1f42cc2221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f1f42cc3998] (--> /lib64/libpthread.so.0(+0x7e25)[0x7f1f4a751e25] (--> /lib64/libc.so.6(clone+0x6d)[0x7f1f4a01abad] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
>
>####################gsyncd.log outputs:######################
>
>From the slave:
>[2020-03-11 08:55:16.384085] I [repce(slave master-node/srv/media-storage):96:service_loop] RepceServer: terminating on reaching EOF.
>[2020-03-11 08:57:55.87364] I [resource(slave master-node/srv/media-storage):1105:connect] GLUSTER: Mounting gluster volume locally...
>[2020-03-11 08:57:56.171372] I [resource(slave master-node/srv/media-storage):1128:connect] GLUSTER: Mounted gluster volume duration=1.0837
>[2020-03-11 08:57:56.173346] I [resource(slave master-node/srv/media-storage):1155:service_loop] GLUSTER: slave listening
>
>From the master:
>[2020-03-11 20:08:55.145453] I [master(worker /srv/media-storage):1991:syncjob] Syncer: Sync Time Taken duration=134.9987 num_files=4661 job=2 return_code=0
>[2020-03-11 20:08:55.285871] I [master(worker /srv/media-storage):1421:process] _GMaster: Entry Time Taken MKD=83 MKN=8109 LIN=0 SYM=0 REN=0 RMD=0 CRE=0 duration=17.0358 UNL=0
>[2020-03-11 20:08:55.286082] I [master(worker /srv/media-storage):1431:process] _GMaster: Data/Metadata Time Taken SETA=83 SETX=0 meta_duration=0.9334 data_duration=135.2497 DATA=8109 XATT=0
>[2020-03-11 20:08:55.286410] I [master(worker /srv/media-storage):1441:process] _GMaster: Batch Completed changelog_end=1583917610 entry_stime=None changelog_start=1583917610 stime=None duration=153.5185 num_changelogs=1 mode=xsync
>[2020-03-11 20:08:55.315442] I [master(worker /srv/media-storage):1681:crawl] _GMaster: processing xsync changelog path=/var/lib/misc/gluster/gsyncd/media-storage_daredevil01.zingat.com_dr-media/srv-media-storage/xsync/XSYNC-CHANGELOG.1583917613
>
>
>Thank you..
>
>On Wed, 11 Mar 2020 at 12:28, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>
>> On March 11, 2020 10:09:27 AM GMT+02:00, "Etem Bayoğlu" <etembayoglu at gmail.com> wrote:
>> >Hello community,
>> >
>> >I've set up a glusterfs geo-replication node for disaster recovery. I
>> >manage about 10TB of media data on a gluster volume and I want to sync
>> >all of it to a remote location over WAN. So I created a slave volume at
>> >the disaster recovery center in the remote location and started a
>> >geo-rep session. It transferred data fine up to about 800GB, but syncing
>> >has been stopped for three days, even though the geo-rep status shows
>> >Active and Hybrid Crawl. No data is being sent. I've recreated the
>> >session and restarted it, but it's still the same.
>> >
>> ># gluster volume geo-replication status
>> >
>> >MASTER NODE    MASTER VOL       MASTER BRICK          SLAVE USER    SLAVE                         SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
>> >-----------------------------------------------------------------------------------------------------------------------------------------------------
>> >master-node    media-storage    /srv/media-storage    root          ssh://slave-node::dr-media    slave-node    Active    Hybrid Crawl    N/A
>> >
>> >Any ideas, please? Thank you.
>>
>> Hi Etem,
>>
>> Have you checked the logs on both source and destination? Maybe they
>> can hint at what the issue is.
>>
>> Best Regards,
>> Strahil Nikolov
>>

Hi Etem,

Nothing obvious....
I don't like this one:

>[2020-03-11 19:53:32.721509] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13a)[0x7f78e10488ea] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x8221)[0x7f78d83f6221] (--> /usr/lib64/glusterfs/7.3/xlator/mount/fuse.so(+0x9998)[0x7f78d83f7998] (--> /lib64/libpthread.so.0(+0x7e65)[0x7f78dfe89e65] (--> /lib64/libc.so.6(clone+0x6d)[0x7f78df74f88d] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory

Can you check the health of the slave volume (split-brains, brick status, etc.)?
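For example, something like this on the slave node ('dr-media' is the slave volume name from your status output; the heal commands apply only if the slave volume is a replicated one):

# gluster peer status
# gluster volume status dr-media
# gluster volume heal dr-media info
# gluster volume heal dr-media info split-brain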

Maybe you can check the logs and find out exactly when the master stopped replicating, and then check the logs of the slave at that exact time.
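For example (these are the same log files you already quoted; adjust <geo-session> to your session's directory name):

On the master, list the last errors/warnings and note their timestamps:
# grep -E '] (E|W) \[' /var/log/glusterfs/geo-replication/<geo-session>/gsyncd.log | tail -n 20

Then, on the slave, look around that same timestamp:
# grep -E '] (E|W) \[' /var/log/glusterfs/geo-replication-slaves/<geo-session>/gsyncd.log | tail -n 20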

Also, you can increase the log level on the slave and then recreate the geo-rep session.
For details, check: 

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/configuring_the_log_level
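
For example, something like this (volume and host names are taken from your status output; I'm writing the option names from memory, so double-check them against the documentation for your version):

Raise the geo-rep session log level (on the master):
# gluster volume geo-replication media-storage slave-node::dr-media config log-level DEBUG

Raise the brick and client log levels of the slave volume (on the slave):
# gluster volume set dr-media diagnostics.brick-log-level DEBUG
# gluster volume set dr-media diagnostics.client-log-level DEBUG

When you are done, put them back:
# gluster volume geo-replication media-storage slave-node::dr-media config log-level INFO
# gluster volume reset dr-media diagnostics.brick-log-level
# gluster volume reset dr-media diagnostics.client-log-level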

P.S.: Trace/debug can fill up your /var/log, so enable them for a short period of time.

Best Regards,
Strahil Nikolov

