[Gluster-users] Horrendously slow directory access
Lalatendu Mohanty
lmohanty at redhat.com
Thu Apr 10 12:26:17 UTC 2014
On 04/10/2014 04:34 PM, John Mark Walker wrote:
> Hi James,
>
> This definitely looks worthy of investigation. Could you file a bug? We need to get our guys on this.
>
> Thanks for doing your homework. Send us the BZ #, and we'll start poking around.
>
> -JM
+1, this seems to be a serious issue.
>
>
> ----- Original Message -----
>> Hey Joe!
>>
>> Yeah we are all XFS all the time round here - none of that nasty ext4
>> combo that we know causes raised levels of mercury :-)
>>
>> As for brick errors: we have not seen any, and we have been busy grepping
>> and alerting on anything suspect in our logs. Mind you, there are hundreds
>> of brick logs to search through, so I won't claim we couldn't have missed
>> one, but after asking the boys in chat just now they are pretty convinced
>> that was not the smoking gun. I'm sure they will chip in on this thread if
>> there is anything.
>>
>>
>> j.
>>
>> --
>> dr. james cuff, assistant dean for research computing, harvard
>> university | division of science | thirty eight oxford street,
>> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
>>
>>
>> On Wed, Apr 9, 2014 at 10:36 AM, Joe Julian <joe at julianfamily.org> wrote:
>>> What's the backend filesystem?
>>> Were there any brick errors, probably around 2014-03-31 22:44:04 (half an
>>> hour before the frame timeout)?
>>>
>>>
>>> On April 9, 2014 7:10:58 AM PDT, James Cuff <james_cuff at harvard.edu> wrote:
>>>> Hi team,
>>>>
>>>> I hate "me too" emails, which are sometimes not at all constructive, but I
>>>> feel I really ought to chip in from the real-world systems we use in anger
>>>> and at massive scale here.
>>>>
>>>> So we also use NFS to "mask" this and other performance issues. The
>>>> cluster.readdir-optimize option unfortunately gave us similar results.
>>>>
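>>>> For reference, the workaround and the option we tried amount to something
>>>> like the following rough sketch (volume and server names are placeholders):
>>>>
>>>> $ gluster volume set myvol cluster.readdir-optimize on
>>>> $ mount -t nfs -o vers=3,acl server1:/myvol /mnt/myvol
>>>>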
>>>> We reported our other challenge back last summer, but it stalled:
>>>>
>>>> http://www.gluster.org/pipermail/gluster-users/2013-June/036252.html
>>>>
>>>> Unfortunately we also now see a new NFS phenotype, pasted below, which is
>>>> again causing real heartburn.
>>>>
>>>> Small files are always difficult for any FS; it might be worth doing some
>>>> regression testing with small-file directory scenarios, as it's an easy
>>>> reproducer on even moderately sized gluster clusters. I hope some good
>>>> progress can be made, and I understand that performance hangs and issues
>>>> are tough to track down. I just wanted to say that we really do see them,
>>>> and have tried many things to avoid them.
>>>>
>>>> Here's the note from my team:
>>>>
>>>> We were hitting 30-minute timeouts on getxattr/system.posix_acl_access
>>>> calls on directories in an NFS v3 mount (w/ the acl option) of a 10-node,
>>>> 40-brick gluster 3.4.0 volume. Strace shows where the client hangs:
>>>>
>>>> $ strace -tt -T getfacl d6h_take1
>>>> ...
>>>> 18:43:57.929225 lstat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024, ...}) = 0 <0.257107>
>>>> 18:43:58.186461 getxattr("d6h_take1", "system.posix_acl_access", 0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <1806.296893>
>>>> 19:14:04.483556 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024, ...}) = 0 <0.642362>
>>>> 19:14:05.126025 getxattr("d6h_take1", "system.posix_acl_default", 0x7fffdf2b9f50, 132) = -1 ENODATA (No data available) <0.000024>
>>>> 19:14:05.126114 stat("d6h_take1", {st_mode=S_IFDIR|0755, st_size=7024, ...}) = 0 <0.000010>
>>>> ...
>>>>
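>>>> (As a rough way to isolate the slow call, not part of the original trace:
>>>> getfattr issues the same getxattr() requests that getfacl does, so the two
>>>> ACL lookups can be timed individually.)
>>>>
>>>> $ time getfattr -n system.posix_acl_access d6h_take1
>>>> $ time getfattr -n system.posix_acl_default d6h_take1
>>>>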
>>>> Load on the servers was moderate. While the above was hanging, getfacl
>>>> worked nearly instantaneously on that directory on all bricks. When it
>>>> finally hit the 30-minute timeout, gluster logged it in nfs.log:
>>>>
>>>> [2014-03-31 23:14:04.481154] E [rpc-clnt.c:207:call_bail] 0-holyscratch-client-36: bailing out frame type(GlusterFS 3.3) op(GETXATTR(18)) xid = 0x8168809x sent = 2014-03-31 22:43:58.442411. timeout = 1800
>>>> [2014-03-31 23:14:04.481233] W [client-rpc-fops.c:1112:client3_3_getxattr_cbk] 0-holyscratch-client-36: remote operation failed: Transport endpoint is not connected. Path: <gfid:b116fb01-b13d-448a-90d0-a8693a98698b> (b116fb01-b13d-448a-90d0-a8693a98698b). Key: (null)
>>>>
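>>>> Side note: the 1800-second figure in the call_bail message is GlusterFS's
>>>> default RPC frame timeout, i.e. the network.frame-timeout volume option.
>>>> Lowering it only makes a hung call bail out sooner, it does not fix the
>>>> underlying hang; as a sketch, with a placeholder volume name:
>>>>
>>>> $ gluster volume set myvol network.frame-timeout 600
>>>>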
>>>> Other than that, we didn't see anything directly related in the nfs or
>>>> brick logs, or anything out of sorts with the gluster services. A couple
>>>> of other errors raise eyebrows, but these are for different directories
>>>> (neighbors of the example above) and at different times:
>>>>
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:30:47.794454] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:31:47.794447] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:33:47.802135] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:34:47.802182] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:36:47.764329] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:37:47.773164] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:39:47.774285] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:40:47.780338] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>> holyscratch07: /var/log/glusterfs/nfs.log:[2014-03-31 19:42:47.730345] I [dht-layout.c:630:dht_layout_normalize] 0-holyscratch-dht: found anomalies in /ramanathan_lab/dhuh/d9_take2_BGI/Diffreg. holes=1 overlaps=0
>>>>
>>>> holyscratch08: /var/log/glusterfs/bricks/holyscratch08_03-brick.log:[2014-03-31 00:57:51.973565] E [posix-helpers.c:696:posix_handle_pair] 0-holyscratch-posix: /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt: key:system.posix_acl_access error:Invalid argument
>>>> holyscratch08: /var/log/glusterfs/bricks/holyscratch08_03-brick.log:[2014-03-31 01:18:12.345818] E [posix-helpers.c:696:posix_handle_pair] 0-holyscratch-posix: /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/cuffdiffRN.txt: key:system.posix_acl_access error:Invalid argument
>>>> holyscratch05: /var/log/glusterfs/bricks/holyscratch05_04-brick.log:[2014-03-31 21:16:37.057674] E [posix-helpers.c:696:posix_handle_pair] 0-holyscratch-posix: /holyscratch05_04/brick/ramanathan_lab/dhuh/d9_take2_BGI/Diffreg/cuffdiffRN.txt: key:system.posix_acl_access error:Invalid argument
>>>>
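>>>> The "found anomalies ... holes=1" messages usually mean that the aggregate
>>>> DHT layout for that directory has an unassigned hash range, typically
>>>> because one brick is missing its layout xattr. A rough way to check (run on
>>>> each server; the brick path here is only illustrative and the volume name
>>>> is a placeholder) is to dump the layout xattr and, if a brick really is
>>>> missing it, recalculate layouts:
>>>>
>>>> $ getfattr -n trusted.glusterfs.dht -e hex \
>>>>     /holyscratch08_03/brick/ramanathan_lab/dhuh/d9_take2_BGI/Diffreg
>>>> $ gluster volume rebalance myvol fix-layout start
>>>>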
>>>> --
>>>> dr. james cuff, assistant dean for research computing, harvard
>>>> university | division of science | thirty eight oxford street,
>>>> cambridge. ma. 02138 | +1 617 384 7647 | http://rc.fas.harvard.edu
>>>>
>>>>
>>>> On Wed, Apr 9, 2014 at 9:52 AM, <james.bellinger at icecube.wisc.edu> wrote:
>>>>> I am seeing something perhaps similar: gluster 3.4.2-1, 2 servers, each
>>>>> with 1 brick, replicated. A du of a local (ZFS) directory tree of 297834
>>>>> files and 525GB takes about 17 minutes. A du of the gluster copy is still
>>>>> not finished after 22 hours. Network activity has been about 5-6KB/sec
>>>>> until (I gather) du hit a directory with 22450 files, when activity jumped
>>>>> to 300KB/sec (200 packets/sec) for about 15-20 minutes. If I assume that
>>>>> the spike came from scanning the two largest directories, that works out
>>>>> to about 8K of traffic per file, and about 5 packets.
>>>>>
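>>>>> (For what it's worth, a back-of-the-envelope check of those figures,
>>>>> assuming the spike lasted about 18 minutes and covered two directories of
>>>>> roughly 22450 files each:
>>>>>
>>>>> $ echo $(( 300 * 1024 * 18 * 60 / (2 * 22450) ))  # ~7389 bytes/file, i.e. about 8K
>>>>> $ echo $(( 200 * 18 * 60 / (2 * 22450) ))         # 4 (integer), ~4.8 packets/file
>>>>>
>>>>> which is consistent with the 8K / 5-packet estimate.)
>>>>>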
>>>>> A 3.3.2 gluster installation that we are trying to retire is not afflicted
>>>>> this way.
>>>>>
>>>>> James Bellinger
>>>>>
>>>>>> Am I the only person using Gluster suffering from very slow directory
>>>>>> access? It's so seriously bad that it almost makes Gluster unusable.
>>>>>>
>>>>>> Using NFS instead of the Fuse client masks the problem as long as the
>>>>>> directories are cached, but it's still hellishly slow when you first
>>>>>> access them.
>>>>>>
>>>>>> Has there been any progress at all on fixing this bug?
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1067256
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users