[Bugs] [Bug 1600941] [geo-rep]: geo-replication scheduler is failing due to unsuccessful umount
bugzilla at redhat.com
Fri Jul 13 12:44:47 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1600941
Kotresh HR <khiremat at redhat.com> changed:
What      |Removed             |Added
----------------------------------------------------------------------------
Status    |NEW                 |ASSIGNED
Assignee  |bugs at gluster.org    |khiremat at redhat.com
--- Comment #1 from Kotresh HR <khiremat at redhat.com> ---
Description of problem:
=======================
What's broken:
--------------
schedule_georep.py fails to complete the transition.
What this tool does:
--------------------
schedule_georep.py is a tool to run Geo-replication on demand. It can be
used to schedule Geo-replication to run once a day with a cron entry such as:
# Run daily at 08:30pm
30 20 * * * root python /usr/share/glusterfs/scripts/schedule_georep.py \
--no-color gv1 fvm1 gv2 >> /var/log/glusterfs/schedule_georep.log 2>&1
This tool does the following (sketched in code after the list):
1. Stop Geo-replication if Started
2. Start Geo-replication
3. Set Checkpoint
4. Check the Status and wait until the Checkpoint is Complete (loop)
5. Once the Checkpoint is Complete, Stop Geo-replication
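For orientation, here is a rough Python sketch of that sequence, following the
order shown in the console output below (stop, set checkpoint, start, watch,
stop). It is not the actual schedule_georep.py code: it assumes the standard
"gluster volume geo-replication <master> <slavehost>::<slavevol> ..." CLI
syntax, and the checkpoint-completion check is a placeholder for the tool's
real status parsing.

import subprocess
import time

def georep(mastervol, slavehost, slavevol, *action):
    # Build "gluster volume geo-replication <master> <slavehost>::<slavevol> <action...>"
    return ["gluster", "volume", "geo-replication", mastervol,
            "%s::%s" % (slavehost, slavevol)] + list(action)

def run_once(mastervol, slavehost, slavevol):
    subprocess.call(georep(mastervol, slavehost, slavevol, "stop"))       # Stop if Started
    subprocess.call(georep(mastervol, slavehost, slavevol,
                           "config", "checkpoint", "now"))                # Set Checkpoint
    subprocess.call(georep(mastervol, slavehost, slavevol, "start"))      # Start
    while True:                                                           # Watch Status
        out = subprocess.check_output(
            georep(mastervol, slavehost, slavevol, "status", "detail"))
        if b"Yes" in out:   # placeholder for the checkpoint-completed column
            break
        time.sleep(60)
    subprocess.call(georep(mastervol, slavehost, slavevol, "stop"))       # Stop once complete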
Actual error:
-------------
[root at dhcp41-227 ~]# python /usr/share/glusterfs/scripts/schedule_georep.py
vol0 10.70.42.9 vol1
[ OK] Stopped Geo-replication
[ OK] Set Checkpoint
[ OK] Started Geo-replication and watching Status for Checkpoint completion
[NOT OK] Unable to Remove temp directory /tmp/georepsetup_xO3cms
rmdir: failed to remove ‘/tmp/georepsetup_xO3cms’: Device or resource busy
[root at dhcp41-227 ~]#
df doesn't show the mount
-------------------------
[root at dhcp41-227 ~]# df
Filesystem                            1K-blocks    Used Available Use% Mounted on
/dev/mapper/rhgs-root                  17811456 2572524  15238932  15% /
devtmpfs                                3992712       0   3992712   0% /dev
tmpfs                                   4004780       0   4004780   0% /dev/shm
tmpfs                                   4004780   17124   3987656   1% /run
tmpfs                                   4004780       0   4004780   0% /sys/fs/cgroup
/dev/sda1                               1038336  163040    875296  16% /boot
tmpfs                                    800956       0    800956   0% /run/user/0
/dev/mapper/RHS_vg1-RHS_lv1             8330240   33524   8296716   1% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2             8330240   33524   8296716   1% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3             8330240   33524   8296716   1% /rhs/brick3
10.70.41.227:/gluster_shared_storage   17811456 2756096  15055360  16% /run/gluster/shared_storage
[root at dhcp41-227 ~]#
glusterfs process still lists it as mounted
--------------------------------------------
[root at dhcp41-227 ~]# ps -eaf | grep glusterfs | grep tmp
root 21976 1 0 11:23 ? 00:00:00 /usr/sbin/glusterfs --volfile-server localhost --volfile-id vol0 -l /var/log/glusterfs/geo-replication/schedule_georep.mount.log /tmp/georepsetup_xO3cms
root 22096 1 0 11:23 ? 00:00:00 /usr/sbin/glusterfs --aux-gfid-mount --acl --log-file=/var/log/glusterfs/geo-replication/vol0/ssh%3A%2F%2Froot%4010.70.42.9%3Agluster%3A%2F%2F127.0.0.1%3Avol1.%2Frhs%2Fbrick3%2Fb8.gluster.log --volfile-server=localhost --volfile-id=vol0 --client-pid=-1 /tmp/gsyncd-aux-mount-gnFTXn
root 22098 1 0 11:23 ? 00:00:00 /usr/sbin/glusterfs --aux-gfid-mount --acl --log-file=/var/log/glusterfs/geo-replication/vol0/ssh%3A%2F%2Froot%4010.70.42.9%3Agluster%3A%2F%2F127.0.0.1%3Avol1.%2Frhs%2Fbrick2%2Fb5.gluster.log --volfile-server=localhost --volfile-id=vol0 --client-pid=-1 /tmp/gsyncd-aux-mount-vce8hM
root 22112 1 0 11:23 ? 00:00:00 /usr/sbin/glusterfs --aux-gfid-mount --acl --log-file=/var/log/glusterfs/geo-replication/vol0/ssh%3A%2F%2Froot%4010.70.42.9%3Agluster%3A%2F%2F127.0.0.1%3Avol1.%2Frhs%2Fbrick1%2Fb2.gluster.log --volfile-server=localhost --volfile-id=vol0 --client-pid=-1 /tmp/gsyncd-aux-mount-CX9Ct9
[root at dhcp41-227 ~]#
Manual umount also fails
------------------------
[root at dhcp41-227 ~]# umount /tmp/georepsetup_xO3cms
umount: /tmp/georepsetup_xO3cms: not mounted
[root at dhcp41-227 ~]# rmdir /tmp/georepsetup_xO3cms
rmdir: failed to remove ‘/tmp/georepsetup_xO3cms’: Device or resource busy
[root at dhcp41-227 ~]# umount /tmp/georepsetup_xO3cms
umount: /tmp/georepsetup_xO3cms: not mounted
[root at dhcp41-227 ~]#
[root at dhcp41-227 ~]# echo $?
32
[root at dhcp41-227 ~]#
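As a side note, one quick way to cross-check what umount and rmdir are each
seeing is to compare os.path.ismount() with the kernel's view in
/proc/self/mounts. This is only a diagnostic sketch, not part of the tool; the
path is the temp directory from the failed run above.

import os

path = "/tmp/georepsetup_xO3cms"   # temp mount directory from the run above

# umount claims "not mounted"; see whether Python agrees ...
print("os.path.ismount():", os.path.ismount(path))

# ... and whether any mount table entry still references the path.
with open("/proc/self/mounts") as mounts:
    hits = [line.strip() for line in mounts if path in line]
print("/proc/self/mounts entries:", hits or "none")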
Additional information:
-----------------------
1. The script checks the umount for failure before calling rmdir; that check
passes, yet the subsequent rmdir still fails with "Device or resource busy".
2. A manual umount of the directory also fails. However, if the script is
re-executed, the earlier directory is removed successfully, but the same
failure recurs for the new mount directory (see the cleanup sketch below).
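For illustration only, a more defensive cleanup along the following lines
(retry the rmdir and fall back to a lazy unmount) would tolerate a window
where the glusterfs client has not yet released the mount. This is a sketch
under those assumptions, not the actual fix for this bug.

import errno
import os
import subprocess
import time

def cleanup_mount(mnt, attempts=5, delay=1):
    # Unmount, then try to remove the temp directory; if rmdir still reports
    # "Device or resource busy" (EBUSY), fall back to a lazy unmount and retry.
    subprocess.call(["umount", mnt])
    for _ in range(attempts):
        try:
            os.rmdir(mnt)
            return True
        except OSError as err:
            if err.errno != errno.EBUSY:
                raise
            subprocess.call(["umount", "-l", mnt])   # lazy unmount
            time.sleep(delay)
    return False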
Version-Release number of selected component (if applicable):
==============================================================
mainline
How reproducible:
=================
Always on CentOS 7
Steps to Reproduce:
===================
1. Set up geo-replication between the master and slave volumes
2. Run the tool with the master volume, slave host and slave volume as parameters
Actual results:
===============
The tool doesn't complete the transition from "touch mount" to "status complete".