[Bugs] [Bug 1226789] New: quota: ENOTCONN periodically seen in logs when setting hard/soft timeout during I/O.

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 1 06:20:00 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1226789

            Bug ID: 1226789
           Summary: quota: ENOTCONN periodically seen in logs when setting
                    hard/soft timeout during I/O.
           Product: GlusterFS
           Version: 3.7.0
         Component: quota
          Severity: high
          Priority: medium
          Assignee: bugs at gluster.org
          Reporter: vmallika at redhat.com
                CC: bturner at redhat.com, bugs at gluster.org,
                    gluster-bugs at redhat.com, vagarwal at redhat.com,
                    vbellur at redhat.com, vmallika at redhat.com
        Depends On: 1211220
            Blocks: 1186580 (qe_tracker_everglades), 1219955
                    (glusterfs-3.7.1)



+++ This bug was initially created as a clone of Bug #1211220 +++

+++ This bug was initially created as a clone of Bug #1039674 +++

Description of problem:

When running quota automation I occasionally (1 in 10 runs?) see the following
test case fail:

1. create a 6x2 volume, start it.
2. gluster volume quota <vol-name> enable
3. gluster volume quota <vol-name> limit-usage / 5GB
4. gluster volume quota <vol-name> list
5. mount -t <nfs|glusterfs> <server-ip>:<vol-name> <mount-point>
   (or mount using SMB)
6. start creating data (files of size 2MB) inside the mount-point, until the
   limit is reached. Meanwhile, with data creation still in progress:
7. gluster volume quota <vol-name> soft-timeout 30s
8. gluster volume quota <vol-name> hard-timeout 60s
After data creation is completed:
10. gluster volume quota <vol-name> list

Client side I see:

dd: opening `/quota-mount/tcms_285026/test.file': Transport endpoint is not
connected

And in the brick logs I see:

/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06
17:59:02.743336] W [quota-enforcer-client.c:187:quota_enforcer_lookup_cbk]
0-quota-test-setup-quota: remote operation failed: Transport endpoint is not
connected. Path: /tcms_285026 (d892ce24-7e59-4eeb-b86f-7c7d34c71317)
/var/log/glusterfs/bricks/bricks-quota-test-setup_brick2.log:[2013-12-06
17:59:02.743377] I [server-rpc-fops.c:1618:server_create_cbk]
0-quota-test-setup-server: 26: CREATE /tcms_285026/test.file
(d892ce24-7e59-4eeb-b86f-7c7d34c71317/test.file) ==> (Transport endpoint is not
connected)
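
When triaging a run, it can help to tally which functions logged the error. The following is a minimal Python sketch, not part of the original report. It assumes the entry layout visible in the excerpt above ([timestamp] LEVEL [file:line:function] message) and that each entry sits on a single line in the real log file (the excerpt above is wrapped by the mail client). The names ENTRY and count_enotconn are hypothetical.

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the excerpt in this report:
#   [timestamp] LEVEL [file.c:line:function] domain: message
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:.]+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<msg>.*)"
)
ENOTCONN_TEXT = "Transport endpoint is not connected"


def count_enotconn(lines):
    """Tally ENOTCONN log entries by the function that logged them."""
    counts = Counter()
    for line in lines:
        # search() rather than match() so a "path/to/file.log:" grep prefix is tolerated
        match = ENTRY.search(line)
        if match and ENOTCONN_TEXT in match.group("msg"):
            counts[match.group("func")] += 1
    return counts
```

It could be fed an open brick log, for example count_enotconn(open(path)), where path is whichever brick log is being inspected.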

Version-Release number of selected component (if applicable):

glusterfs-server-3.4.0.44.1u2rhs-1.el6rhs.x86_64

How reproducible:

So far this looks to be about 1 in 10 runs.

Steps to Reproduce:
1. create a 6x2 volume, start it.
2. gluster volume quota <vol-name> enable
3. gluster volume quota <vol-name> limit-usage / 5GB
4. gluster volume quota <vol-name> list
5. mount -t <nfs|glusterfs> <server-ip>:<vol-name> <mount-point>
   (or mount using SMB)
6. start creating data (files of size 2MB) inside the mount-point, until the
   limit is reached. Meanwhile, with data creation still in progress:
7. gluster volume quota <vol-name> soft-timeout 30s
8. gluster volume quota <vol-name> hard-timeout 60s
After data creation is completed:
10. gluster volume quota <vol-name> list
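
The data-creation step (step 6) can be sketched as a small Python writer that reports which errno stopped it. This is an illustration only, not the automation used in the report: the function name, file naming and defaults are hypothetical. The point is to separate an expected quota failure (typically reported as EDQUOT) from the unexpected ENOTCONN described here.

```python
import errno
import os


def fill_until_error(directory, file_size=2 * 1024 * 1024, max_files=10000):
    """Write fixed-size files until a write fails or max_files is reached.

    Returns (files_written, errno_name); errno_name is None when no
    error occurred.
    """
    payload = b"\0" * file_size
    for index in range(max_files):
        path = os.path.join(directory, f"test.file.{index}")
        try:
            # open, write and the implicit close can each raise OSError
            with open(path, "wb") as handle:
                handle.write(payload)
        except OSError as exc:
            # Translate the numeric errno into its symbolic name, e.g. "ENOTCONN"
            return index, errno.errorcode.get(exc.errno, str(exc.errno))
    return max_files, None
```

Run against the mount point while the soft-timeout and hard-timeout commands are issued from another shell, a result ending in "ENOTCONN" would correspond to the failure reported in this bug.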

Actual results:

I/O errors are occasionally hit when the hard/soft timeout is modified with
data in flight.

Expected results:

I/Os complete successfully when timeouts are modified.

Additional info:

I'll try to provide a more concrete reproducer.

--- Additional comment from Vijaikumar Mallikarjuna on 2015-03-03 03:59:22 EST
---

Hi Ben,

I am not able to re-create this issue with the 3.6 release.

--- Additional comment from Vijaikumar Mallikarjuna on 2015-04-13 06:41:06 EDT
---

Whenever a new volume is created, quotad gets restarted. This can cause
ENOTCONN in the I/O path of other volumes.

--- Additional comment from Anand Avati on 2015-04-14 04:42:13 EDT ---

REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on
ENOTCONN error) posted (#1) for review on master by Vijaikumar Mallikarjuna
(vmallika at redhat.com)

--- Additional comment from Anand Avati on 2015-04-24 02:20:08 EDT ---

REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on
ENOTCONN error) posted (#2) for review on master by Vijaikumar Mallikarjuna
(vmallika at redhat.com)

--- Additional comment from Anand Avati on 2015-05-28 00:53:18 EDT ---

REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on
ENOTCONN error) posted (#3) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Anand Avati on 2015-05-29 03:28:48 EDT ---

REVIEW: http://review.gluster.org/10230 (quota: retry connecting to quotad on
ENOTCONN error) posted (#4) for review on master by Vijaikumar Mallikarjuna
(vmallika at redhat.com)
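
The general idea named in the patch title (retry on ENOTCONN instead of failing the operation) can be illustrated with a short Python sketch. This is not the change under review at 10230, whose contents are not shown in this mail and which lives in the GlusterFS C code. The function name, the reconnect callback, the retry count and the delay are all hypothetical.

```python
import errno
import time


def call_with_reconnect(operation, reconnect, retries=3, delay=0.5):
    """Run operation(); on ENOTCONN, call reconnect() and try again.

    Any other OSError, or ENOTCONN on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except OSError as exc:
            if exc.errno != errno.ENOTCONN or attempt == retries:
                raise
            reconnect()
            # Back off a little longer after each failed attempt
            time.sleep(delay * (attempt + 1))
```

Bounding the number of retries keeps a permanently unavailable peer from stalling the caller indefinitely, while a short restart window (such as the quotad restart described above) is ridden out.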


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1211220
[Bug 1211220] quota: ENOTCONN periodically seen in logs when setting
hard/soft timeout during I/O.
https://bugzilla.redhat.com/show_bug.cgi?id=1219955
[Bug 1219955] GlusterFS 3.7.1 tracker
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

