[Gluster-devel] Spurious failures because of nfs and snapshots

Pranith Kumar Karampuri pkarampu at redhat.com
Mon May 19 04:09:20 UTC 2014


hi Vijai, Joseph,
    In 2 of the last 3 build failures (http://build.gluster.org/job/regression/4479/console and http://build.gluster.org/job/regression/4478/console), the test tests/bugs/bug-1090042.t failed. Do you think it is better to revert this test until the fix is available? If so, please send a patch to revert the test case. You can re-submit it along with the fix for the bug mentioned by Joseph.

Pranith.

----- Original Message -----
> From: "Joseph Fernandes" <josferna at redhat.com>
> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, 16 May, 2014 5:13:57 PM
> Subject: Re: Spurious failures because of nfs and snapshots
> 
> 
> Hi All,
> 
> tests/bugs/bug-1090042.t :
> 
> I was able to reproduce the issue by running this test in a loop:
> 
> for i in {1..135} ; do ./bugs/bug-1090042.t ; done
> 
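A loop like the one quoted above keeps running after a failure, so it can be handy to stop at the first failing iteration instead. The following is only a sketch: `run_until_failure` is a hypothetical helper name, not something from the Gluster test framework, and it assumes the test script signals failure through a non-zero exit status.

```shell
# Hypothetical helper (not part of the Gluster test framework):
# run a command up to max_runs times and stop at the first failure.
run_until_failure() {
    local max_runs="$1"
    shift
    local i=1
    while [ "$i" -le "$max_runs" ]; do
        # "$@" is the command under test; a non-zero exit status
        # is treated as a failed run.
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "passed $max_runs iterations"
    return 0
}
```

For the case above this would be invoked as: run_until_failure 135 ./bugs/bug-1090042.t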
> When I checked the logs, I found:
> [2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> 0-management: defaulting ping-timeout to 30secs
> [2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> 0-management: defaulting ping-timeout to 30secs
> 
> The issue is with ping-timeout and is tracked in this bug:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
> 
> 
> The workaround is mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8
> 
> 
> Regards,
> Joe
> 
> ----- Original Message -----
> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> To: "Gluster Devel" <gluster-devel at gluster.org>
> Cc: "Joseph Fernandes" <josferna at redhat.com>
> Sent: Friday, May 16, 2014 6:19:54 AM
> Subject: Spurious failures because of nfs and snapshots
> 
> hi,
>     The latest build I fired for review.gluster.com/7766
>     (http://build.gluster.org/job/regression/4443/console) failed because of
>     a spurious failure. The script doesn't wait for the nfs export to be
>     available. I fixed that, but interestingly I found quite a few scripts
>     with the same problem. Some of the scripts rely on 'sleep 5', which
>     could also lead to spurious failures if the export is not available
>     within 5 seconds. We found that waiting for 20 seconds is better, but
>     'sleep 20' would unnecessarily delay the build execution. So if you are
>     going to write any scripts which have to do nfs mounts, please do it the
>     following way:
> 
> EXPECT_WITHIN 20 "1" is_nfs_export_available;
> TEST mount -t nfs -o vers=3 $H0:/$V0 $N0;
> 
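To illustrate why polling with a timeout beats a fixed sleep, here is a minimal sketch of the idea. It is not the test framework's actual EXPECT_WITHIN implementation; `wait_for_output` is a hypothetical name, and the one-second polling interval is an assumption. The helper returns as soon as the check prints the expected value, and only waits the full timeout in the failure case.

```shell
# Hypothetical helper, NOT the Gluster test framework's EXPECT_WITHIN.
# Usage: wait_for_output TIMEOUT_SECONDS EXPECTED COMMAND [ARGS...]
# Re-runs COMMAND about once per second until its output equals
# EXPECTED (returns 0) or TIMEOUT_SECONDS have elapsed (returns 1).
wait_for_output() {
    local timeout="$1" expected="$2"
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    while true; do
        # Compare the command's stdout with the expected string.
        if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
            return 0    # condition met, stop waiting immediately
        fi
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1    # gave up after roughly $timeout seconds
        fi
        sleep 1
    done
}
```

With a check function that prints 1 once the export is visible, a call such as "wait_for_output 20 1 is_nfs_export_available" would return quickly in the common case and only spend the full 20 seconds when the export never appears.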
> Please review http://review.gluster.com/7773 :-)
> 
> I saw one more spurious failure in a snapshot-related script,
> tests/bugs/bug-1090042.t, on the next build fired by Niels.
> Joseph (CCed) is debugging it. He agreed to reply with what he finds and
> share it with us so that we won't introduce similar bugs in the future.
> 
> I encourage you all to share what you fix, to prevent spurious failures in
> the future.
> 
> Thanks
> Pranith
> 


