[Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.

Viktor Nosov vnosov at stonefly.com
Wed Dec 7 17:18:52 UTC 2016


Hi Kotresh,

Thanks for looking into this issue!
I'm attaching the log files from the slave node, taken from /var/log/glusterfs/geo-replication-slaves/:

[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/84501a83-b07c-4768-bfaa-418b038e1a9e\:gluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.gluster.log /home/vnosov/
[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/slave.log /home/vnosov/
[root at SC-183 log]# cp /var/log/glusterfs/geo-replication-slaves/mbr/84501a83-b07c-4768-bfaa-418b038e1a9e\:gluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log /home/vnosov/

Best regards,

Viktor Nosov


-----Original Message-----
From: Kotresh Hiremath Ravishankar [mailto:khiremat at redhat.com] 
Sent: Tuesday, December 06, 2016 9:25 PM
To: Viktor Nosov
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.

Hi Viktor,

Please share geo-replication-slave mount logs from slave nodes.

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "Viktor Nosov" <vnosov at stonefly.com>
> To: gluster-users at gluster.org
> Cc: vnosov at stonefly.com
> Sent: Tuesday, December 6, 2016 7:13:22 AM
> Subject: [Gluster-users] Geo-replication failed to delete from slave file partially written to master volume.
> 
> Hi,
> 
> I hit a problem while testing geo-replication. Does anybody know how to
> fix it, other than deleting and recreating the geo-replication session?
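> 
> (For reference, "recreating geo-replication" here means tearing the session
> down and setting it up again, roughly along the lines below. This is only a
> generic sketch; the exact steps for a non-root slave user like nasgorep may
> differ.)
> 
> gluster volume geo-replication master-for-183-0003 \
>     nasgorep@10.10.60.183::rem-volume-0001 stop
> gluster volume geo-replication master-for-183-0003 \
>     nasgorep@10.10.60.183::rem-volume-0001 delete
> gluster volume geo-replication master-for-183-0003 \
>     nasgorep@10.10.60.183::rem-volume-0001 create push-pem force
> gluster volume geo-replication master-for-183-0003 \
>     nasgorep@10.10.60.183::rem-volume-0001 start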
> 
> Geo-replication failed to delete from the slave a file that was only
> partially written to the master volume.
> 
> I have geo-replication set up between two nodes, both running glusterfs
> 3.7.16,
> 
> with the master volume:
> 
> [root at SC-182 log]# gluster volume info master-for-183-0003
> 
> Volume Name: master-for-183-0003
> Type: Distribute
> Volume ID: 84501a83-b07c-4768-bfaa-418b038e1a9e
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.10.60.182:/exports/nas-segment-0012/master-for-183-0003
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> server.allow-insecure: on
> performance.quick-read: off
> performance.stat-prefetch: off
> nfs.disable: on
> nfs.addr-namelookup: off
> performance.readdir-ahead: on
> cluster.enable-shared-storage: enable
> snap-activate-on-create: enable
> 
> and slave volume:
> 
> [root at SC-183 log]# gluster volume info rem-volume-0001
> 
> Volume Name: rem-volume-0001
> Type: Distribute
> Volume ID: 7680de7a-d0e2-42f2-96a9-4da29adba73c
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.10.60.183:/exports/nas183-segment-0001/rem-volume-0001
> Options Reconfigured:
> performance.readdir-ahead: on
> nfs.addr-namelookup: off
> nfs.disable: on
> performance.stat-prefetch: off
> performance.quick-read: off
> server.allow-insecure: on
> snap-activate-on-create: enable
> 
> The master volume is mounted on the node:
> 
> [root at SC-182 log]# mount
> 127.0.0.1:/master-for-183-0003 on /samba/master-for-183-0003 type 
> fuse.glusterfs (rw,allow_other,max_read=131072)
> 
> Let's fill up space on master volume:
> 
> [root at SC-182 log]# mkdir /samba/master-for-183-0003/cifs_share/dir3
> [root at SC-182 log]# cp big.file 
> /samba/master-for-183-0003/cifs_share/dir3/
> [root at SC-182 log]# cp big.file
> /samba/master-for-183-0003/cifs_share/dir3/big.file.1
> cp: writing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': 
> No space left on device
> cp: closing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': 
> No space left on device
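> 
> (big.file is just an arbitrary payload somewhat larger than the free space
> left on the volume; if you want to reproduce this, something like the
> following would do. The size here is an assumption, not the exact file I
> used.)
> 
> # hypothetical way to generate a comparable ~76 MiB test payload
> dd if=/dev/urandom of=big.file bs=1M count=76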
> 
> File "big.file.1" contains only part of the original file:
> [root at SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
> -rwx------ 1 root root 78930370 Dec  5 16:49 
> /samba/master-for-183-0003/cifs_share/dir3/big.file
> -rwx------ 1 root root 22155264 Dec  5 16:49
> /samba/master-for-183-0003/cifs_share/dir3/big.file.1
> 
> Both new files are geo-replicated to the Slave volume successfully:
> 
> [root at SC-183 log]# ls -l
> /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
> total 98720
> -rwx------ 2 root root 78930370 Dec  5 16:49 big.file
> -rwx------ 2 root root 22155264 Dec  5 16:49 big.file.1
> 
> [root at SC-182 log]# /usr/sbin/gluster volume geo-replication
> master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 status 
> detail
> 
> MASTER NODE:                    10.10.60.182
> MASTER VOL:                     master-for-183-0003
> MASTER BRICK:                   /exports/nas-segment-0012/master-for-183-0003
> SLAVE USER:                     nasgorep
> SLAVE:                          nasgorep at 10.10.60.183::rem-volume-0001
> SLAVE NODE:                     10.10.60.183
> STATUS:                         Active
> CRAWL STATUS:                   Changelog Crawl
> LAST_SYNCED:                    2016-12-05 16:49:48
> ENTRY / DATA / META / FAILURES: 0 / 0 / 0 / 0
> CHECKPOINT TIME:                N/A
> CHECKPOINT COMPLETED:           N/A
> CHECKPOINT COMPLETION TIME:     N/A
> 
> Let's delete the partially written file from the master mount:
> 
> [root at SC-182 log]# rm 
> /samba/master-for-183-0003/cifs_share/dir3/big.file.1
> rm: remove regular file
> `/samba/master-for-183-0003/cifs_share/dir3/big.file.1'? y
> 
> [root at SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
> -rwx------ 1 root root 78930370 Dec  5 16:49 
> /samba/master-for-183-0003/cifs_share/dir3/big.file
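> 
> (To convince myself the unlink was recorded on the master brick one could,
> I assume, peek at the brick's changelog directory; this is only a sketch I
> did not actually run, and CHANGELOG.<timestamp> is a placeholder:)
> 
> # changelogs are rolled over periodically and live under the brick
> ls /exports/nas-segment-0012/master-for-183-0003/.glusterfs/changelogs/
> # records are NUL-separated; make them readable and look for big.file.1
> tr '\0' '\n' < /exports/nas-segment-0012/master-for-183-0003/.glusterfs/changelogs/CHANGELOG.<timestamp> | less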
> 
> Set checkpoint:
> 
> 32643 12/05/2016 16:57:46.540390536 1480985866 command: 
> /usr/sbin/gluster volume geo-replication master-for-183-0003
> nasgorep at 10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
> 32643 12/05/2016 16:57:48.770820909 1480985868 status=0 
> /usr/sbin/gluster volume geo-replication master-for-183-0003
> nasgorep at 10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
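> 
> (Stripped of the command-journal prefixes, the two entries above are simply
> this CLI call, which returned status 0:)
> 
> /usr/sbin/gluster volume geo-replication master-for-183-0003 \
>     nasgorep@10.10.60.183::rem-volume-0001 config checkpoint now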
> 
> Check geo-replication status:
> 
> [root at SC-182 log]# /usr/sbin/gluster volume geo-replication
> master-for-183-0003 nasgorep at 10.10.60.183::rem-volume-0001 status 
> detail
> 
> MASTER NODE:                    10.10.60.182
> MASTER VOL:                     master-for-183-0003
> MASTER BRICK:                   /exports/nas-segment-0012/master-for-183-0003
> SLAVE USER:                     nasgorep
> SLAVE:                          nasgorep at 10.10.60.183::rem-volume-0001
> SLAVE NODE:                     10.10.60.183
> STATUS:                         Active
> CRAWL STATUS:                   Changelog Crawl
> LAST_SYNCED:                    2016-12-05 16:57:48
> ENTRY / DATA / META / FAILURES: 0 / 0 / 0 / 0
> CHECKPOINT TIME:                2016-12-05 16:57:46
> CHECKPOINT COMPLETED:           Yes
> CHECKPOINT COMPLETION TIME:     2016-12-05 16:57:50
> 
> But the partially written file "big.file.1" is still present on the 
> slave
> volume:
> 
> [root at SC-183 log]# ls -l
> /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
> total 98720
> -rwx------ 2 root root 78930370 Dec  5 16:49 big.file
> -rwx------ 2 root root 22155264 Dec  5 16:49 big.file.1
> 
> The Gluster geo-replication logs do not give any indication that deleting
> the file failed:
> 
> [root at SC-182 log]# view
> /var/log/glusterfs/geo-replication/master-for-183-0003/ssh%3A%2F%2Fnasgorep%4010.10.60.183%3Agluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log
> 
> [2016-12-06 00:49:40.267956] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 17 crawls, 1 turns
> [2016-12-06 00:49:52.348413] I
> [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster:
> slave's time: (1480985358, 0)
> [2016-12-06 00:49:53.296811] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:53.901186] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:54.760957] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:55.384705] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:55.987873] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:56.848361] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:57.471925] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:58.76416] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:58.935801] W
> [master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
> _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
> [2016-12-06 00:49:59.560571] E
> [resource(/exports/nas-segment-0012/master-for-183-0003):1021:rsync] SSH:
> SYNC Error(Rsync): rsync: rsync_xal_set:
> lsetxattr(".gfid/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e","trusted.glusterfs.84501a83-b07c-4768-bfaa-418b038e1a9e.xtime")
> failed: Operation not permitted (1)
> [2016-12-06 00:49:59.560972] E
> [master(/exports/nas-segment-0012/master-for-183-0003):1037:process]
> _GMaster: changelogs CHANGELOG.1480985389 could not be processed 
> completely - moving on...
> [2016-12-06 00:50:41.839792] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 18 crawls, 1 turns
> [2016-12-06 00:51:42.203411] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:52:42.600800] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:53:42.983913] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:54:43.381218] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:55:43.749927] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:56:44.113914] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:57:44.494354] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
> [2016-12-06 00:57:48.528424] I [gsyncd(conf):671:main_i] <top>: 
> checkpoint
> 1480985866 set
> [2016-12-06 00:57:48.528704] I [syncdutils(conf):220:finalize] <top>:
> exiting.
> [2016-12-06 00:57:50.530714] I
> [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster:
> slave's time: (1480985388, 0)
> [2016-12-06 00:58:44.802122] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 1 turns
> [2016-12-06 00:59:45.181669] I
> [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
> _GMaster: 20 crawls, 0 turns
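> 
> (The lsetxattr error above names gfid 103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e.
> As a cross-check, and this is only an untested sketch on my part, one could
> verify on the slave brick whether that gfid actually belongs to the stale
> big.file.1:)
> 
> # the gfid hardlink lives under .glusterfs/<first 2 hex>/<next 2 hex>/ on the brick
> ls -li /exports/nas183-segment-0001/rem-volume-0001/.glusterfs/10/3b/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e
> # compare against the gfid xattr stored on the suspect file
> getfattr -n trusted.gfid -e hex \
>     /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/big.file.1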
> 
> Best regards,
> 
> Viktor Nosov
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 84LPUG~L.LOG
Type: application/octet-stream
Size: 200 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: slave.log
Type: application/octet-stream
Size: 225810 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 8RSDWG~T.LOG
Type: application/octet-stream
Size: 9392 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161207/3c74a181/attachment-0002.obj>

