[Gluster-users] trashcan on dist. repl. volume with geo-replication
Dietmar Putz
dietmar.putz at 3qsdn.com
Tue Mar 13 09:13:20 UTC 2018
Hi Kotresh,
thanks for your response...
answers inside...
best regards
Dietmar
On 13.03.2018 at 06:38, Kotresh Hiremath Ravishankar wrote:
> Hi Dietmar,
>
> I am trying to understand the problem and have a few questions.
>
> 1. Is trashcan enabled only on master volume?
no, trashcan is also enabled on the slave. the settings are the same as on
the master, but the trashcan on the slave is completely empty.
root at gl-node5:~# gluster volume get mvol1 all | grep -i trash
features.trash                          on
features.trash-dir                      .trashcan
features.trash-eliminate-path           (null)
features.trash-max-filesize             2GB
features.trash-internal-op              off
root at gl-node5:~#
> 2. Did the 'rm -rf' done on the master volume get synced to the slave?
yes, the entire content of ~/test1/b1/* on the slave has been removed.
> 3. If trashcan is disabled, does the issue go away?
after disabling features.trash on master and slave the issue
remains... stopping and restarting the master/slave volumes and the
geo-replication has no effect.
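for reference, the sequence used was roughly the following (just a sketch of the steps described above, not the literal command history):

root at gl-node1:~# gluster volume set mvol1 features.trash off                          (same on the slave volume)
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 stop
root at gl-node1:~# gluster volume stop mvol1 && gluster volume start mvol1              (same on the slave volume)
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 start

afterwards the session still looks like this: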
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 status

MASTER NODE     MASTER VOL    MASTER BRICK     SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------
gl-node1-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A
gl-node3-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node7-int    Passive    N/A                N/A
gl-node2-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    N/A             Faulty     N/A                N/A
gl-node4-int    mvol1         /brick1/mvol1    root          gl-node5-int::mvol1    gl-node8-int    Active     Changelog Crawl    2018-03-12 13:56:28
root at gl-node1:~#
>
> The geo-rep error just says that it failed to create the directory
> "Oracle_VM_VirtualBox_Extension" on the slave.
> Usually this would be because of a gfid mismatch, but I don't see that in
> your case. So I am a little more interested
> in the present state of the geo-rep. Is it still throwing the same errors
> and the same failure to sync the same directory? If
> so, does the parent 'test1/b1' exist on the slave?
it is still throwing the same error as shown below.
the directory 'test1/b1' is empty as expected and exists on both master and slave.
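for completeness, this was checked via the fuse mounts on both sides, roughly like this (the mount path and prompt on the slave side are only assumed to mirror the master):

tron at gl-node1:/myvol-1/test1$ ls -la b1/          (master mount)
tron at gl-node5:/myvol-1/test1$ ls -la b1/          (slave mount, path assumed)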
>
> And doing ls on the trashcan should not affect geo-rep. Is there an easy
> reproducer for this?
i have made several tests on 3.10.11 and 3.12.6 and i'm pretty sure
there was one without activation of the trashcan feature on the slave...
with the same / similar problems.
i will come back with a more comprehensive and reproducible description
of that issue...
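in the meantime, the rough sequence that triggers it here is the one from my original mail quoted below (just a sketch, volume/session names as in this setup):

tron at gl-node1:/myvol-1/test1/b1$ rm -rf *                                             (on a fuse mount of the master volume)
tron at gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
root at gl-node1:~# gluster volume geo-replication mvol1 gl-node5-int::mvol1 status      (two master nodes then show Faulty)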
>
>
> Thanks,
> Kotresh HR
>
> On Mon, Mar 12, 2018 at 10:13 PM, Dietmar Putz <dietmar.putz at 3qsdn.com> wrote:
>
> Hello,
>
> in regard to
> https://bugzilla.redhat.com/show_bug.cgi?id=1434066
> i have been faced with another issue when using the trashcan feature
> on a dist. repl. volume running a geo-replication (gfs 3.12.6 on
> ubuntu 16.04.4),
> e.g. when removing an entire directory with subfolders:
> tron at gl-node1:/myvol-1/test1/b1$ rm -rf *
>
> afterwards listing the files in the trashcan:
> tron at gl-node1:/myvol-1/test1$ ls -la /myvol-1/.trashcan/test1/b1/
>
> leads to an outage of the geo-replication.
> error on master-01 and master-02:
>
> [2018-03-12 13:37:14.827204] I [master(/brick1/mvol1):1385:crawl] _GMaster: slave's time stime=(1520861818, 0)
> [2018-03-12 13:37:14.835535] E [master(/brick1/mvol1):784:log_failures] _GMaster: ENTRY FAILED data=({'uid': 0, 'gfid': 'c38f75e3-194a-4d22-9094-50ac8f8756e7', 'gid': 0, 'mode': 16877, 'entry': '.gfid/5531bd64-ac50-462b-943e-c0bf1c52f52c/Oracle_VM_VirtualBox_Extension', 'op': 'MKDIR'}, 2, {'gfid_mismatch': False, 'dst': False})
> [2018-03-12 13:37:14.835911] E [syncdutils(/brick1/mvol1):299:log_raise_exception] <top>: The above directory failed to sync. Please fix it to proceed further.
>
>
> both gfids of the directories as shown in the log:
> brick1/mvol1/.trashcan/test1/b1                                    0x5531bd64ac50462b943ec0bf1c52f52c
> brick1/mvol1/.trashcan/test1/b1/Oracle_VM_VirtualBox_Extension     0xc38f75e3194a4d22909450ac8f8756e7
>
> the shown directory contains just one file, which is stored on
> gl-node3 and gl-node4, while node1 and node2 are in geo-replication error.
> since the filesize limitation of the trashcan is obsolete i'm
> really interested in using the trashcan feature, but i'm concerned it
> will interrupt the geo-replication entirely.
> has anybody else been faced with this situation... any hints,
> workarounds?
>
> best regards
> Dietmar Putz
>
>
> root at gl-node1:~/tmp# gluster volume info mvol1
>
> Volume Name: mvol1
> Type: Distributed-Replicate
> Volume ID: a1c74931-568c-4f40-8573-dd344553e557
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: gl-node1-int:/brick1/mvol1
> Brick2: gl-node2-int:/brick1/mvol1
> Brick3: gl-node3-int:/brick1/mvol1
> Brick4: gl-node4-int:/brick1/mvol1
> Options Reconfigured:
> changelog.changelog: on
> geo-replication.ignore-pid-check: on
> geo-replication.indexing: on
> features.trash-max-filesize: 2GB
> features.trash: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> root at gl-node1:/myvol-1/test1# gluster volume geo-replication mvol1 gl-node5-int::mvol1 config
> special_sync_mode: partial
> gluster_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.gluster.log
> ssh_command: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> change_detector: changelog
> use_meta_volume: true
> session_owner: a1c74931-568c-4f40-8573-dd344553e557
> state_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.status
> gluster_params: aux-gfid-mount acl
> remote_gsyncd: /nonexistent/gsyncd
> working_dir: /var/lib/misc/glusterfsd/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1
> state_detail_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-detail.status
> gluster_command_dir: /usr/sbin/
> pid_file: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/monitor.pid
> georep_session_working_dir: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/
> ssh_command_tar: ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem
> master.stime_xattr_name: trusted.glusterfs.a1c74931-568c-4f40-8573-dd344553e557.d62bda3a-1396-492a-ad99-7c6238d93c6a.stime
> changelog_log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1-changes.log
> socketdir: /var/run/gluster
> volume_id: a1c74931-568c-4f40-8573-dd344553e557
> ignore_deletes: false
> state_socket_unencoded: /var/lib/glusterd/geo-replication/mvol1_gl-node5-int_mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.socket
> log_file: /var/log/glusterfs/geo-replication/mvol1/ssh%3A%2F%2Froot%40192.168.178.65%3Agluster%3A%2F%2F127.0.0.1%3Amvol1.log
> access_mount: true
> root at gl-node1:/myvol-1/test1#
>
> --
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> --
> Thanks and Regards,
> Kotresh H R
--
Dietmar Putz
3Q GmbH
Kurfürstendamm 102
D-10711 Berlin
Mobile: +49 171 / 90 160 39
Mail: dietmar.putz at 3qsdn.com