[Bugs] [Bug 1222409] nfs-ganesha: HA failover happens but I/O does not move ahead when volume has two mounts and I/O going on both mounts

bugzilla at redhat.com bugzilla at redhat.com
Tue May 19 06:45:23 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1222409



--- Comment #6 from Meghana <mmadhusu at redhat.com> ---
I saw cache_invalidation-related errors in gfapi.log. Here is what I did on my
4-node setup.
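A rough way to spot those errors, assuming the gfapi log sits at
/var/log/ganesha-gfapi.log (the exact path varies between builds):

grep -i cache_invalidation /var/log/ganesha-gfapi.log | tail -20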

1. Cloned gluster master (no fix has gone in except for Disable_ACL) and
installed it on all the machines.
2. Cloned NFS-Ganesha-2.3-dev-3, applied Soumya's patch that is waiting to be
merged upstream in NFS-Ganesha,
https://review.gerrithub.io/#/c/228614/
and installed NFS-Ganesha on all the nodes (a rough sketch of these steps is
given below).
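Roughly, the build and install steps looked like the following. The repository
URLs, the tag name, the GerritHub project path and the patchset number are
assumptions, not the exact commands I ran:

# glusterfs: build master and install (repeated on every node)
git clone https://github.com/gluster/glusterfs.git
cd glusterfs && ./autogen.sh && ./configure && make && make install

# nfs-ganesha: check out 2.3-dev-3 and pull in Soumya's change from GerritHub
git clone https://github.com/nfs-ganesha/nfs-ganesha.git
cd nfs-ganesha && git submodule update --init
git checkout V2.3-dev-3                                   # tag name assumed
git fetch https://review.gerrithub.io/ffilz/nfs-ganesha refs/changes/14/228614/1   # patchset assumed
git cherry-pick FETCH_HEAD
mkdir build && cd build && cmake ../src && make && make install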

My four-node setup has a volume, testvol, that looks like this:
Volume Name: testvol
Type: Distribute
Volume ID: b021c59b-16ad-4eaf-9fc3-d0eb563a9ab0
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: rhs1:/brick/brick1
Brick2: rhs2:/brick/brick1
Brick3: rhs3:/brick/brick1
Brick4: rhs4:/brick/brick1
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: on
nfs.disable: on
performance.readdir-ahead: on
nfs-ganesha: enable
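For reference, a volume configured like the one above would have been set up
with commands along these lines. The brick paths are taken from the output
above; treat this as a sketch rather than the exact command history:

# create and start the 4-brick distribute volume
gluster volume create testvol rhs1:/brick/brick1 rhs2:/brick/brick1 \
        rhs3:/brick/brick1 rhs4:/brick/brick1
gluster volume start testvol

# bring up the NFS-Ganesha HA cluster; this is what flips nfs-ganesha: enable
# and nfs.disable: on (ordering assumed)
gluster nfs-ganesha enable

# options reflected in "Options Reconfigured" above
gluster volume set testvol features.cache-invalidation on
gluster volume set testvol ganesha.enable on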


VIP of rhs1 : 10.70.40.173
VIP of rhs2 : 10.70.40.174
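These VIPs come from the HA configuration. A minimal /etc/ganesha/ganesha-ha.conf
along the following lines is assumed; the cluster name and the VIPs of rhs3 and
rhs4 are placeholders, only the rhs1/rhs2 VIPs are from this report:

# /etc/ganesha/ganesha-ha.conf (sketch)
HA_NAME="ganesha-ha-testvol"
HA_VOL_SERVER="rhs1"
HA_CLUSTER_NODES="rhs1,rhs2,rhs3,rhs4"
VIP_rhs1="10.70.40.173"
VIP_rhs2="10.70.40.174"
VIP_rhs3="<rhs3 VIP>"
VIP_rhs4="<rhs4 VIP>"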

I mounted the volume "testvol" on two different clients.

Client 1 :

10.70.40.173:/testvol on /mnt type nfs
(rw,vers=4,addr=10.70.40.173,clientaddr=10.70.43.78)

Client 2:
10.70.40.174:/testvol on /mnt type nfs
(rw,vers=4,addr=10.70.40.174,clientaddr=10.70.42.117)
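The mounts above would have been created roughly like this; any mount options
beyond vers=4 are assumed to be defaults:

# on client 1 (10.70.43.78)
mount -t nfs -o vers=4 10.70.40.173:/testvol /mnt

# on client 2 (10.70.42.117)
mount -t nfs -o vers=4 10.70.40.174:/testvol /mnt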


I created two directories, dir and dir1, on the mount point.

On client1, I ran iozone by executing the following commands:
cd /mnt/dir
time iozone -a -f rhs1.ioz

On client2, I ran iozone by executing the following command:
time iozone -a -f rhs3.ioz

After a few seconds, I killed ganesha on rhs1 (VIP 10.70.40.173). I saw a delay
of about a minute, and I/O resumed after that.
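I simulated the failure by killing the ganesha process on rhs1; something along
these lines is assumed, since the exact signal/method was not recorded:

# on rhs1: kill the NFS-Ganesha daemon to trigger failover
pkill -9 ganesha.nfsd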

pcs status looked like this:

Online: [ rhs1 rhs2 rhs3 rhs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 rhs1-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs4 
 rhs1-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs4 
 rhs2-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs2 
 rhs2-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs2 
 rhs3-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs3 
 rhs3-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs3 
 rhs4-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs4 
 rhs4-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs4 
 rhs1-dead_ip-1    (ocf::heartbeat:Dummy):    Started rhs1 


rhs1's VIP had failed over to rhs4.

After around 9 minutes, iozone completed successfully on both clients.

Client 1 :
iozone test complete.

real    9m46.894s
user    0m3.679s
sys    2m3.828s


Client 2 :
iozone test complete.

real    7m9.126s
user    0m2.482s
sys    0m56.361s


After the tests finished, I checked pcs status again:
Online: [ rhs1 rhs2 rhs3 rhs4 ]

Full list of resources:

 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ rhs1 rhs2 rhs3 rhs4 ]
 rhs1-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs4 
 rhs1-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs4 
 rhs2-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs2 
 rhs2-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs2 
 rhs3-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs3 
 rhs3-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs3 
 rhs4-cluster_ip-1    (ocf::heartbeat:IPaddr):    Started rhs4 
 rhs4-trigger_ip-1    (ocf::heartbeat:Dummy):    Started rhs4 
 rhs1-dead_ip-1    (ocf::heartbeat:Dummy):    Started rhs1 

NFS-Ganesha was still running on the other three nodes.
