[Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.
Viktor Nosov
vnosov at stonefly.com
Wed Dec 7 17:18:52 UTC 2016
Hi Kotresh,
Thanks for looking into this issue!
I'm attaching the log files from the slave node, taken from /var/log/glusterfs/geo-replication-slaves/:
[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/84501a83-b07c-4768-bfaa-418b038e1a9e\:gluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.gluster.log /home/vnosov/
[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/slave.log /home/vnosov/
[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/mbr/84501a83-b07c-4768-bfaa-418b038e1a9e\:gluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log /home/vnosov/
Best regards,
Viktor Nosov
-----Original Message-----
From: Kotresh Hiremath Ravishankar [mailto:khiremat at redhat.com]
Sent: Tuesday, December 06, 2016 9:25 PM
To: Viktor Nosov
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.
Hi Viktor,
Please share geo-replication-slave mount logs from slave nodes.
Thanks and Regards,
Kotresh H R
----- Original Message -----
> From: "Viktor Nosov" <vnosov at stonefly.com>
> To: gluster-users at gluster.org
> Cc: vnosov at stonefly.com
> Sent: Tuesday, December 6, 2016 7:13:22 AM
> Subject: [Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.
>
> Hi,
>
> I hit a problem while testing geo-replication. Does anybody know how to
> fix it other than by deleting and recreating the geo-replication session?
>
> Geo-replication failed to delete from the slave a file that was partially
> written to the master volume.
>
> I have geo-replication between two nodes that are running glusterfs 3.7.16,
> with this master volume:
>
> [root at SC-182 log]# gluster volume info master-for-183-0003
>
> Volume Name: master-for-183-0003
> Type: Distribute
> Volume ID: 84501a83-b07c-4768-bfaa-418b038e1a9e
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.10.60.182:/exports/nas-segment-0012/master-for-183-0003
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> server.allow-insecure: on
> performance.quick-read: off
> performance.stat-prefetch: off
> nfs.disable: on
> nfs.addr-namelookup: off
> performance.readdir-ahead: on
> cluster.enable-shared-storage: enable
> snap-activate-on-create: enable
>
> and slave volume:
>
> [root at SC-183 log]# gluster volume info rem-volume-0001
>
> Volume Name: rem-volume-0001
> Type: Distribute
> Volume ID: 7680de7a-d0e2-42f2-96a9-4da29adba73c
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.10.60.183:/exports/nas183-segment-0001/rem-volume-0001
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.addr-namelookup: off
> nfs.disable: on
> performance.stat-prefetch: off
> performance.quick-read: off
> server.allow-insecure: on
> snap-activate-on-create: enable
>
> The master volume is mounted on the master node:
>
> [root at SC-182 log]# mount
> 127.0.0.1:/master-for-183-0003 on /samba/master-for-183-0003 type fuse.glusterfs (rw,allow_other,max_read=131072)
>
> Let's fill up the space on the master volume:
>
> [root at SC-182 log]# mkdir /samba/master-for-183-0003/cifs_share/dir3
> [root at SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/
> [root at SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/big.file.1
> cp: writing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device
> cp: closing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device
>
> The file "big.file.1" contains only part of the original file:
> [root at SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
> -rwx------ 1 root root 78930370 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file
> -rwx------ 1 root root 22155264 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file.1
>
> Both new files are geo-replicated to the slave volume successfully:
>
> [root at SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
> total 98720
> -rwx------ 2 root root 78930370 Dec 5 16:49 big.file
> -rwx------ 2 root root 22155264 Dec 5 16:49 big.file.1
>
> [root at SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 status detail
>
> MASTER NODE:                 10.10.60.182
> MASTER VOL:                  master-for-183-0003
> MASTER BRICK:                /exports/nas-segment-0012/master-for-183-0003
> SLAVE USER:                  nasgorep
> SLAVE:                       nasgorep at 10.10.60.183::rem-volume-0001
> SLAVE NODE:                  10.10.60.183
> STATUS:                      Active
> CRAWL STATUS:                Changelog Crawl
> LAST_SYNCED:                 2016-12-05 16:49:48
> ENTRY:                       0
> DATA:                        0
> META:                        0
> FAILURES:                    0
> CHECKPOINT TIME:             N/A
> CHECKPOINT COMPLETED:        N/A
> CHECKPOINT COMPLETION TIME:  N/A
>
> Let's delete the partially written file from the master mount:
>
> [root at SC-182 log]# rm /samba/master-for-183-0003/cifs_share/dir3/big.file.1
> rm: remove regular file `/samba/master-for-183-0003/cifs_share/dir3/big.file.1'? y
>
> [root at SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
> -rwx------ 1 root root 78930370 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file
>
> Set checkpoint:
>
> 32643 12/05/2016 16:57:46.540390536 1480985866 command: /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
> 32643 12/05/2016 16:57:48.770820909 1480985868 status=0 /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
>
> Check geo-replication status:
>
> [root at SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 status detail
>
> MASTER NODE:                 10.10.60.182
> MASTER VOL:                  master-for-183-0003
> MASTER BRICK:                /exports/nas-segment-0012/master-for-183-0003
> SLAVE USER:                  nasgorep
> SLAVE:                       nasgorep at 10.10.60.183::rem-volume-0001
> SLAVE NODE:                  10.10.60.183
> STATUS:                      Active
> CRAWL STATUS:                Changelog Crawl
> LAST_SYNCED:                 2016-12-05 16:57:48
> ENTRY:                       0
> DATA:                        0
> META:                        0
> FAILURES:                    0
> CHECKPOINT TIME:             2016-12-05 16:57:46
> CHECKPOINT COMPLETED:        Yes
> CHECKPOINT COMPLETION TIME:  2016-12-05 16:57:50
>
> But the partially written file "big.file.1" is still present on the
> slave volume:
>
> [root at SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
> total 98720
> -rwx------ 2 root root 78930370 Dec 5 16:49 big.file
> -rwx------ 2 root root 22155264 Dec 5 16:49 big.file.1
>
> The Gluster logs for geo-replication do not give any indication of a
> failure to delete the file:
>
> [root at SC-182 log]# view /var/log/glusterfs/geo-replication/master-for-183-0003/ssh%3A%2F%2Fnasgorep%4010.10.60.183%3Agluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log
>
> [2016-12-06 00:49:40.267956] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 17 crawls, 1 turns
> [2016-12-06 00:49:52.348413] I [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster: slave's time: (1480985358, 0)
> [2016-12-06 00:49:53.296811] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:53.901186] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:54.760957] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:55.384705] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:55.987873] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:56.848361] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:57.471925] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:58.76416] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:58.935801] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:59.560571] E [resource(/exports/nas-segment-0012/master-for-183-0003):1021:rsync] SSH: SYNC Error(Rsync): rsync: rsync_xal_set: lsetxattr(".gfid/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e","trusted.glusterfs.84501a83-b07c-4768-bfaa-418b038e1a9e.xtime") failed: Operation not permitted (1)
> [2016-12-06 00:49:59.560972] E [master(/exports/nas-segment-0012/master-for-183-0003):1037:process] _GMaster: changelogs CHANGELOG.1480985389 could not be processed completely - moving on...
> [2016-12-06 00:50:41.839792] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 18 crawls, 1 turns
> [2016-12-06 00:51:42.203411] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:52:42.600800] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:53:42.983913] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:54:43.381218] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:55:43.749927] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:56:44.113914] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:57:44.494354] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:57:48.528424] I [gsyncd(conf):671:main_i] <top>: checkpoint 1480985866 set
> [2016-12-06 00:57:48.528704] I [syncdutils(conf):220:finalize] <top>: exiting.
> [2016-12-06 00:57:50.530714] I [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster: slave's time: (1480985388, 0)
> [2016-12-06 00:58:44.802122] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 1 turns
> [2016-12-06 00:59:45.181669] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns
>
> Best regards,
>
> Viktor Nosov
>
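The master-side log quoted above does contain two error-level (E) entries: the rsync lsetxattr failure and the message that CHANGELOG.1480985389 "could not be processed completely". As a rough aid for spotting such entries in a longer log, here is a minimal sketch. The function names are made up for illustration, and it assumes each log entry sits on a single line, as it would in the file on disk rather than in this wrapped mail:

```shell
# Print every changelog name that the worker reported as skipped
# ("could not be processed completely") in the given log file.
skipped_changelogs() {
    sed -n 's/.*changelogs \(.*\) could not be processed.*/\1/p' "$1" |
        tr ' ' '\n' | sort -u
}

# Print only the error-level entries, i.e. lines of the form
# "[timestamp] E [...] ...", from the given log file.
error_entries() {
    grep '^\[[^]]*] E \[' "$1"
}
```

Calling skipped_changelogs with the path of the master-side geo-replication log would list the changelogs worth inspecting by hand. It only reads the log and does not query or change the geo-replication session.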
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 84LPUG~L.LOG
Type: application/octet-stream
Size: 200 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slave.log
Type: application/octet-stream
Size: 225810 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8RSDWG~T.LOG
Type: application/octet-stream
Size: 9392 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment-0002.obj>