[Bugs] [Bug 1230399] New: [Snapshot] Scheduled job is not processed when one of the node of shared storage volume is down
bugzilla at redhat.com
Wed Jun 10 19:51:57 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1230399
Bug ID: 1230399
Summary: [Snapshot] Scheduled job is not processed when one of
the node of shared storage volume is down
Product: GlusterFS
Version: 3.7.1
Component: snapshot
Keywords: Triaged
Severity: urgent
Assignee: bugs at gluster.org
Reporter: asengupt at redhat.com
CC: asengupt at redhat.com, ashah at redhat.com,
bugs at gluster.org, gluster-bugs at redhat.com,
rjoseph at redhat.com, senaik at redhat.com
Depends On: 1218573
Blocks: 1223205
+++ This bug was initially created as a clone of Bug #1218573 +++
Description of problem:
The scheduler does not pick up scheduled jobs when one of the storage nodes of
the shared storage volume is down.
Version-Release number of selected component (if applicable):
[root at localhost glusterfs]# rpm -qa | grep glusterfs
glusterfs-debuginfo-3.7.0alpha0-0.9.git989bea3.el7.centos.x86_64
glusterfs-libs-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-fuse-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-extra-xlators-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-geo-replication-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-cli-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-api-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-server-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
glusterfs-devel-3.7.0beta1-0.14.git09bbd5c.el7.centos.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Create a 2 x 2 distributed-replicate volume.
2. Create a replicated shared storage volume on storage nodes that are not part
of the volume whose snapshot is scheduled, and mount it on each storage node at
/var/run/gluster/shared_storage
3. Initialize the scheduler on each storage node, e.g. run the
snap_scheduler.py init command.
4. Enable the scheduler on the storage nodes, e.g. run snap_scheduler.py enable
5. Add a job to create a snapshot of the volume at a 5-minute interval, e.g.
snap_scheduler.py add job1 "*/5 * * * *" testvol
6. Bring down both shared storage nodes.
7. Bring up any one of the shared storage nodes.
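For reference, the "*/5 * * * *" schedule used in step 5 should fire every five minutes. The sketch below is illustrative only: the helper expand_minute_field is hypothetical and not part of snap_scheduler.py, and it handles just the '*', '*/N' and single-integer forms of the cron minute field.

```python
def expand_minute_field(field):
    """Return the sorted minutes (0-59) matched by a cron minute field.

    Hypothetical helper for illustration; supports only '*', '*/N'
    and a single integer, not the full cron syntax (lists, ranges).
    """
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError("step must be positive")
        return list(range(0, 60, step))
    minute = int(field)
    if not 0 <= minute <= 59:
        raise ValueError("minute out of range")
    return [minute]


# Schedule string taken from step 5; the first field is the minute field.
schedule = "*/5 * * * *"
minutes = expand_minute_field(schedule.split()[0])
print(minutes)
```

With a healthy scheduler, job1 would therefore be expected to create a snapshot of testvol at each of those minutes of every hour.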
Actual results:
The scheduled job is not picked up by the scheduler.
Expected results:
The scheduler should pick up the scheduled jobs.
Additional info:
[root at localhost glusterfs]# gluster v info testvol
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: f5eed851-6f24-4cde-903e-7669f5437bc9
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.143:/rhs/brick1/b1
Brick2: 10.70.47.145:/rhs/brick1/b2
Brick3: 10.70.47.150:/rhs/brick1/b3
Brick4: 10.70.47.151:/rhs/brick1/b4
Options Reconfigured:
features.quota: on
features.quota-deem-statfs: on
features.uss: enable
features.barrier: disable
====================================
Shared storage volume
[root at localhost ~]# gluster v info meta
Volume Name: meta
Type: Replicate
Volume ID: b07daf4e-891d-4022-972a-af181250dc07
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.46.248:/rhs/brick1/b1
Brick2: 10.70.46.251:/rhs/brick1/b2
--- Additional comment from on 2015-05-08 05:45:30 EDT ---
Version : glusterfs 3.7.0beta1 built on May 7 2015
=======
Another scenario where jobs are not picked up:
1) Create a dist-rep volume and mount it.
2) Create a shared storage volume and mount it.
Enable the scheduler and schedule jobs on the volumes:
snap_scheduler.py add "A1" "*/5 * * * * " "vol1"
snap_scheduler: Successfully added snapshot schedule
snap_scheduler.py add "A2" "*/10 * * * * " "vol2"
snap_scheduler: Successfully added snapshot schedule
3) Take a snapshot of the shared storage
gluster snapshot create MV_Snap gluster_shared_storage
snapshot create: success: Snap MV_Snap_GMT-2015.05.08-09.20.26 created
successfully
4) Add some more jobs, A3 and A4.
5) Stop the volume and observe that no job is picked up at the next scheduled
time.
6) Restore the shared storage to the snapshot taken and start the volume.
7) After the restore, the scheduler lists jobs A1 and A2, but neither of them
is picked up.
--- Additional comment from Anand Avati on 2015-06-09 09:29:35 EDT ---
REVIEW: http://review.gluster.org/11139 (snapshot/scheduler: Reload
/etc/cron.d/glusterfs_snap_cron_tasks when shared storage is available) posted
(#1) for review on master by Avra Sengupta (asengupt at redhat.com)
--- Additional comment from Anand Avati on 2015-06-09 11:02:27 EDT ---
REVIEW: http://review.gluster.org/11139 (snapshot/scheduler: Reload
/etc/cron.d/glusterfs_snap_cron_tasks when shared storage is available) posted
(#2) for review on master by Avra Sengupta (asengupt at redhat.com)
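The idea named in the review title (reload the cron tasks file once shared storage is available) could look roughly like the sketch below. This is not the posted patch: the function reload_cron_tasks and its parameters are hypothetical, and it assumes that the cron daemon notices changes by file modification time and that the cron file may be a symlink into the shared storage mount.

```python
import os


def reload_cron_tasks(cron_file, shared_storage, is_mounted=os.path.ismount):
    """Bump the mtime of cron_file if shared_storage is mounted.

    Hypothetical helper, not the code under review. Returns True if
    the file was touched, False otherwise.
    """
    if not is_mounted(shared_storage):
        # Shared storage is still unavailable; nothing to reload yet.
        return False
    if not os.path.lexists(cron_file):
        return False
    # Touch the link itself where the platform allows it (assumption:
    # the cron file is a symlink into the shared storage mount).
    if os.utime in os.supports_follow_symlinks:
        os.utime(cron_file, None, follow_symlinks=False)
    else:
        os.utime(cron_file, None)
    return True
```

On a storage node this would be called with /etc/cron.d/glusterfs_snap_cron_tasks and /var/run/gluster/shared_storage, the paths mentioned in this report. The is_mounted parameter exists only so the check can be replaced in tests.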
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1218573
[Bug 1218573] [Snapshot] Scheduled job is not processed when one of the
node of shared storage volume is down
https://bugzilla.redhat.com/show_bug.cgi?id=1223205
[Bug 1223205] [Snapshot] Scheduled job is not processed when one of the
node of shared storage volume is down