[Gluster-users] trashcan on dist. repl. volume with geo-replication
Dietmar Putz
dietmar.putz at 3qsdn.com
Tue Mar 13 09:13:20 UTC 2018
Hi Kotresh,
thanks for your response...
answers inside...
best regards
Dietmar
On 13.03.2018 at 06:38, Kotresh Hiremath Ravishankar wrote:
> Hi Dietmar,
>
> I am trying to understand the problem and have a few questions.
>
> 1. Is trashcan enabled only on master volume?
no, trashcan is also enabled on the slave. the settings are the same as on
the master, but the trashcan on the slave is completely empty.
root at gl-node5:~# gluster volume get mvol1 all | grep -i trash
features.trash                          on
features.trash-dir                      .trashcan
features.trash-eliminate-path           (null)
features.trash-max-filesize             2GB
features.trash-internal-op              off
root at gl-node5:~#
> 2. Did the 'rm -rf' done on the master volume get synced to the slave?
yes, the entire content of ~/test1/b1/* on the slave has been removed.
> 3. If trashcan is disabled, does the issue go away?
after disabling features.trash on master and slave the issue
remains... stopping and restarting the master/slave volumes and the
geo-replication has no effect.
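for reference, the sequence used was roughly the following (just a sketch of the steps described above, not the literal command history):

root at gl-node1:~# gluster volume set mvol1 features.trash off                          (same on the slave volume)
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 stop
root at gl-node1:~# gluster volume stop mvol1 && gluster volume start mvol1              (same on the slave volume)
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 start

afterwards the session still looks like this: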
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 status

MASTER NODE     MASTER VOL    MASTER BRICK     SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------
gl-node1-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A
gl-node3-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node7-int    Passive    N/A                N/A
gl-node2-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A
gl-node4-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node8-int    Active     Changelog Crawl    2018-03-12 13:56:28
root at gl-node1:~#
>
> The geo-rep error just says that it failed to create the directory
> "Oracle_VM_VirtualBox_Extension" on the slave.
> Usually this would be because of a gfid mismatch, but I don't see that in
> your case. So I am a little more interested
> in the present state of the geo-rep. Is it still throwing the same errors
> and the same failure to sync the same directory? If
> so, does the parent 'test1/b1' exist on the slave?
it is still throwing the same error as shown below.
the directory 'test1/b1' is empty as expected and exists on both master and slave.
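for completeness, this was checked via the fuse mounts on both sides, roughly like this (the mount path and prompt on the slave side are only assumed to mirror the master):

tron at gl-node1:/myvol-1/test1$ ls -la b1/          (master mount)
tron at gl-node5:/myvol-1/test1$ ls -la b1/          (slave mount, path assumed)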
>
> And doing ls on the trashcan should not affect geo-rep. Is there an easy
> reproducer for this?
i have made several tests on 3.10.11 and 3.12.6 and i'm pretty sure
there was one without activation of the trashcan feature on the slave...
with the same / similar problems.
i will come back with a more comprehensive and reproducible description
of that issue...
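in the meantime, the rough sequence that triggers it here is the one from my original mail quoted below (just a sketch, volume/session names as in this setup):

tron at gl-node1:/myvol-1/test1/b1$ rm -rf *                                             (on a fuse mount of the master volume)
tron at gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 status      (two master nodes then show Faulty)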
>
>
> Thanks,
> Kotresh HR
>
> On Mon, Mar 12, 2018 at 10:13 PM, Dietmar Putz <dietmar.putz at 3qsdn.com> wrote:
>
> Hello,
>
> in regard to
> https://bugzilla.redhat.com/show_bug.cgi?id=1434066
> i have been faced with another issue when using the trashcan feature
> on a dist. repl. volume running a geo-replication (gfs 3.12.6 on
> ubuntu 16.04.4),
> e.g. when removing an entire directory with subfolders:
> tron at gl-node1:/myvol-1/test1/b1$ rm -rf *
>
> afterwards listing the files in the trashcan:
> tron at gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
>
> leads to an outage of the geo-replication.
> error on master-01 and master-02:
>
> [2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl] _GMaster: slave's time stime=(1520861818, 0)
> [2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures] _GMaster: ENTRY FAILED data=({'uid': 0, 'gfid': 'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 'entry': '.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension', 'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
> [2018-03-12 13:37:14.835911] E [syncdutils(/brick1/mvol1):299:log_raise_exception] <top>: The above directory failed to sync. Please fix it to proceed further.
>
>
> both gfids of the directories as shown in the log:
> brick1/mvol1/.trashcan/test1/b1                                    0x5531bd64ac50462b943ec0bf1c52f52c
> brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension     0xc38f75e3194a4d22909450ac8f8756e7
>
> the shown directory contains just one file, which is stored on
> gl-node3 and gl-node4, while node1 and node2 are in geo-replication error.
> since the filesize limitation of the trashcan is obsolete i'm
> really interested in using the trashcan feature, but i'm concerned it
> will interrupt the geo-replication entirely.
> has anybody else been faced with this situation... any hints,
> workarounds?
>
> best regards
> Dietmar Putz
>
>
> root at gl-node1:~/tmp# gluster volume info mvol1
>
> Volume Name: mvol1
> Type: Distributed-Replicate
> Volume ID: a1c74931-568c-4f40-8573-dd344553e557
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gl-node1-int:/brick1/mvol1
> Brick2: gl-node2-int:/brick1/mvol1
> Brick3: gl-node3-int:/brick1/mvol1
> Brick4: gl-node4-int:/brick1/mvol1
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> features.trash-max-filesize: 2GB
> features.trash: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> root at gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 gl-node5-int::mvol1 config
> special_sync_mode: partial
> gluster_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
> ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> change_detector: changelog
> use_meta_volume: true
> session_owner: a1c74931-568c-4f40-8573-dd344553e557
> state_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status
> gluster_params: aux-gfid-mount acl
> remote_gsyncd: /nonexistent/gsyncd
> working_dir: /var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
> state_detail_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status
> gluster_command_dir: /usr/sbin/
> pid_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
> georep_session_working_dir: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
> ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
> master.stime_xattr_name: trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
> changelog_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log
> socketdir: /var/run/gluster
> volume_id: a1c74931-568c-4f40-8573-dd344553e557
> ignore_deletes: false
> state_socket_unencoded: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
> log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log
> access_mount: true
> root at gl-node1:/myvol-1/test1#
>
> --
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Thanks and Regards,
> Kotresh H R
--
Dietmar Putz
3Q GmbH
Kurfürstendamm 102
D-10711 Berlin
Mobile: +49 171 / 90 160 39
Mail: dietmar.putz at 3qsdn.com