[Bugs] [Bug 1175735] [USS]: snapd process is not killed once the glusterd comes back
bugzilla at redhat.com
Tue Jan 6 10:39:06 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1175735
Raghavendra Bhat <rabhat at redhat.com> changed:
What               |Removed            |Added
----------------------------------------------------------------------------
Status             |POST               |MODIFIED
CC                 |                   |rabhat at redhat.com
--- Comment #5 from Raghavendra Bhat <rabhat at redhat.com> ---
Description of problem:
=======================
When USS is enabled, glusterd starts snapd on all the machines in the cluster. But
in a scenario where the user disables USS while glusterd is down on some of the
nodes, USS gets disabled but the snapd process stays alive on the nodes where
glusterd went down. That much is expected. However, when glusterd comes back up,
snapd is still running even though USS is disabled.
For example:
============
USS is disabled and no snapd process is running on any machine:
===============================================================
[root at inception ~]# gluster v i vol3 | grep uss
features.uss: off
[root at inception ~]# ps -eaf | grep snapd
root 2299 26954 0 18:05 pts/0 00:00:00 grep snapd
[root at inception ~]#
Enable USS; the snapd process should run on all the machines:
==============================================================
[root at inception ~]# gluster v set vol3 uss on
volume set: success
[root at inception ~]# gluster v i vol3 | grep uss
features.uss: on
[root at inception ~]#
[root at inception ~]# gluster v status vol3 | grep -i "snapshot daemon"
Snapshot Daemon on localhost    49158   Y   2322
Snapshot Daemon on hostname1    49157   Y   3868
Snapshot Daemon on hostname2    49157   Y   3731
Snapshot Daemon on hostname3    49157   Y   3265
[root at inception ~]#
Now, disable USS and at the same time stop glusterd on multiple machines (the
glusterd stop on the other nodes is sketched after the status output below):
==========================================================================
[root at inception ~]# gluster v set vol3 uss off
volume set: success
[root at inception ~]# gluster v status vol3 | grep -i "snapshot daemon"
[root at inception ~]# gluster v status vol3
Status of volume: vol3
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick hostname1:/rhs/brick4/b4                  49155   Y       32406
NFS Server on localhost                         2049    Y       2431
Self-heal Daemon on localhost                   N/A     Y       2202
Task Status of Volume vol3
------------------------------------------------------------------------------
There are no active volume tasks
[root at inception ~]#
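The parallel glusterd stop on the other nodes is not captured in the transcript
above; it would look roughly like the following (a sketch, assuming sysvinit-style
service scripts, run on hostname1, hostname2 and hostname3 while the "uss off" is
issued on the first node):
service glusterd stop        # stop the management daemon; snapd is left behind
ps -eaf | grep snapd         # snapd from the earlier "uss on" is still alive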
snapd should not be running on the machine where glusterd is up, but should still
be running on the machines where glusterd is down:
==================================================================================
Node1:
======
[root at inception ~]# ps -eaf | grep snapd
root 2501 26954 0 18:11 pts/0 00:00:00 grep snapd
[root at inception ~]#
Node2:
======
[root at rhs-arch-srv2 ~]# ps -eaf | grep snapd
root 3868 1 0 12:36 ? 00:00:00 /usr/sbin/glusterfsd -s
localhost --volfile-id snapd/vol3 -p
/var/lib/glusterd/vols/vol3/run/vol3-snapd.pid -l
/var/log/glusterfs/vol3-snapd.log --brick-name snapd-vol3 -S
/var/run/c01a04ffff6172926bfc0364bd457af3.socket --brick-port 49157
--xlator-option vol3-server.listen-port=49157
root 4163 5023 0 12:41 pts/0 00:00:00 grep snapd
[root at rhs-arch-srv2 ~]#
Node3:
======
[root at rhs-arch-srv3 ~]# ps -eaf | grep snapd
root 3731 1 0 12:35 ? 00:00:00 /usr/sbin/glusterfsd -s
localhost --volfile-id snapd/vol3 -p
/var/lib/glusterd/vols/vol3/run/vol3-snapd.pid -l
/var/log/glusterfs/vol3-snapd.log --brick-name snapd-vol3 -S
/var/run/79af174d6c9c86897e0ff72f002994f2.socket --brick-port 49157
--xlator-option vol3-server.listen-port=49157
root 4028 5029 0 12:40 pts/0 00:00:00 grep snapd
[root at rhs-arch-srv3 ~]#
Node4:
=======
[root at rhs-arch-srv4 ~]# ps -eaf | grep snapd
root 3265 1 0 12:36 ? 00:00:00 /usr/sbin/glusterfsd -s
localhost --volfile-id snapd/vol3 -p
/var/lib/glusterd/vols/vol3/run/vol3-snapd.pid -l
/var/log/glusterfs/vol3-snapd.log --brick-name snapd-vol3 -S
/var/run/4bd0ff786ad2fc2b7e504182d985b723.socket --brick-port 49157
--xlator-option vol3-server.listen-port=49157
root 3587 4733 0 12:41 pts/0 00:00:00 grep snapd
[root at rhs-arch-srv4 ~]#
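For convenience, the same per-node check can be scripted from one node; a minimal
sketch, assuming passwordless ssh to the peers and the hostnames used above:
# the [s]napd pattern keeps grep from matching itself on the remote side
for h in hostname1 hostname2 hostname3; do
    echo "== $h =="
    ssh "$h" 'ps -eaf | grep "[s]napd"'
done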
Start glusterd on the machines where it was stopped and look for the snapd
process: it is still running.
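On one of the affected nodes, the restart and re-check are roughly as follows
(a sketch, again assuming a sysvinit-style service script for glusterd):
service glusterd start
ps -eaf | grep snapd    # the stale snapd (e.g. pid 3868 on Node2) is still listed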
Ran the same case with a different scenario: stopping the volume while bringing
down glusterd at the same time. In that case, when glusterd comes back online,
the brick process does get killed.
Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.6.1
How reproducible:
=================
always
Actual results:
===============
The snapd process is still online even though, from the user's point of view, USS is off.
Expected results:
=================
The snapd process should be killed once glusterd comes back up.
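Until glusterd reconciles this on startup, the stale snapd can be cleaned up by
hand on the affected nodes; a sketch using the pid file visible in the ps output
above:
kill "$(cat /var/lib/glusterd/vols/vol3/run/vol3-snapd.pid)"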
--- Additional comment from Rahul Hinduja on 2014-10-30 08:51:20 EDT ---
Additional info:
================
Let's say you now enable USS again on the same volume; the ports are then shown
as N/A for all the servers that were brought back online:
[root at inception ~]# gluster v status vol3 | grep -i "snapshot daemon"
Snapshot Daemon on localhost    49159   Y   2716
Snapshot Daemon on hostname1    N/A     Y   3265
Snapshot Daemon on hostname2    N/A     Y   3868
Snapshot Daemon on hostname3    N/A     Y   3731
[root at inception ~]#
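The N/A entries presumably line up with the stale snapd still holding its old
brick port (it was started with --brick-port 49157 per the ps output above);
something like the following on one of those nodes should confirm which pid is
bound to 49157 (a sketch; netstat assumed to be available):
netstat -tlnp | grep 49157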