[Gluster-users] not sure how to troubleshoot SMB / CIFS overload when using GlusterFS

Ken Randall rushian85 at gmail.com
Mon Jul 18 16:08:00 UTC 2011


Joseph,

Thank you for your response.  Yours, combined with Whit's, led me to come
up with a pretty solid repro case and to pinpoint what I think is going on.

I tried your additional SMB configuration settings and was hopeful, but
they didn't alleviate the issue.  Your interpretation of the logs was
helpful, though: it makes sense now that Samba was pounding on GlusterFS
with its string of getdents operations.
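
For anyone who wants to reproduce the captures quoted below, this is
roughly what I ran (a sketch, not my exact invocations; pidof can return
several PIDs, so this assumes a single smbd worker and a single glusterfs
client process):

        # watch Samba's directory enumeration
        strace -e trace=getdents -p $(pidof smbd)

        # watch the GlusterFS client's syscalls
        strace -p $(pidof glusterfs)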

I also took your advice last night on stat-cache (I assume that was on the
Gluster side, which is where I enabled it), but I wasn't sure where the
fast-lookups option lives.  It didn't seem to make a noticeable difference
either.
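
For reference, the Gluster-side toggles were along these lines ("myvol"
is a stand-in for our actual volume name):

        # stat-prefetch is the stat cache translator on the Gluster side
        gluster volume set myvol performance.stat-prefetch on

        # the io-cache bump discussed below
        gluster volume set myvol performance.cache-size 4096MB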

I think the lockups are happening because everything is crippled by
GlusterFS's relatively slow directory listing (5x-10x slower at generating
a dir listing than a raw SMB share), combined with FUSE's blocking
readdir().  I'm not positive on that last point, since I found only one
mention of it on the internet.  I am praying that somebody will see this
and say, "oh yeah, well sure, just change this one thing in FUSE and
you're good to go!"  Somehow I don't think that's going to happen.  :)
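
For what it's worth, the 5x-10x figure came from a crude comparison along
these lines (both paths are placeholders; ls -f skips sorting and
per-file stat calls, so it mostly measures the readdir stream):

        # through the GlusterFS FUSE mount
        time ls -f /mnt/gluster/Storage > /dev/null

        # the same directory straight off a brick's backend filesystem
        time ls -f /data/brick/Storage > /dev/null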

Ken

On Sun, Jul 17, 2011 at 10:35 PM, Joe Landman <
landman at scalableinformatics.com> wrote:

> On 07/17/2011 11:19 PM, Ken Randall wrote:
>
>> Joe,
>>
>> Thank you for your response.  After seeing what you wrote, I bumped
>> performance.cache-size up to 4096MB, the max allowed, and ran into the
>> same wall.
>>
>
> Hmmm ...
>
>
>
>> I wouldn't think that any SMB caching would help in this case, since the
>> same Samba server on top of the raw Gluster data wasn't exhibiting any
>> trouble, or am I deceived?
>>
>
> Samba could cache better so it didn't have to hit Gluster so hard.
>
>
>> I haven't used strace before, but I ran it on the glusterfs process, and
>> saw a lot of:
>> epoll_wait(3, {{EPOLLIN, {u32=9, u64=9}}}, 257, 4294967295) = 1
>> readv(9, [{"\200\0\16,", 4}], 1)        = 4
>> readv(9, [{"\0\n;\227\0\0\0\1", 8}], 1) = 8
>> readv(9, [{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\31\0\0\0\0\0\0\0\1\0\0\0\0"..., 3620}], 1) = 1436
>> readv(9, 0xa90b1b8, 1)                  = -1 EAGAIN (Resource temporarily unavailable)
>>
>
> Interesting ... I am not sure why it's reporting an EAGAIN for readv,
> other than that the descriptor is non-blocking and there was simply no
> more data ready to fill the vector, which is expected in an epoll-driven
> read loop.
>
>
>> And when I ran it on smbd, I saw a constant stream of this kind of
>> activity:
>> getdents(29, /* 25 entries */, 32768)   = 840
>> getdents(29, /* 25 entries */, 32768)   = 856
>> getdents(29, /* 25 entries */, 32768)   = 848
>> getdents(29, /* 24 entries */, 32768)   = 856
>> getdents(29, /* 25 entries */, 32768)   = 864
>> getdents(29, /* 24 entries */, 32768)   = 832
>> getdents(29, /* 25 entries */, 32768)   = 832
>> getdents(29, /* 24 entries */, 32768)   = 856
>> getdents(29, /* 25 entries */, 32768)   = 840
>> getdents(29, /* 24 entries */, 32768)   = 832
>> getdents(29, /* 25 entries */, 32768)   = 784
>> getdents(29, /* 25 entries */, 32768)   = 824
>> getdents(29, /* 25 entries */, 32768)   = 808
>> getdents(29, /* 25 entries */, 32768)   = 840
>> getdents(29, /* 25 entries */, 32768)   = 864
>> getdents(29, /* 25 entries */, 32768)   = 872
>> getdents(29, /* 25 entries */, 32768)   = 832
>> getdents(29, /* 24 entries */, 32768)   = 832
>> getdents(29, /* 25 entries */, 32768)   = 840
>> getdents(29, /* 25 entries */, 32768)   = 824
>> getdents(29, /* 25 entries */, 32768)   = 824
>> getdents(29, /* 24 entries */, 32768)   = 864
>> getdents(29, /* 25 entries */, 32768)   = 848
>> getdents(29, /* 24 entries */, 32768)   = 840
>>
>
> Get directory entries.  This is the stuff that NTFS is caching for its web
> server, and it appears Samba is not.
>
> Try
>
>        aio read size = 32768
>        csc policy = documents
>        dfree cache time = 60
>        directory name cache size = 100000
>        fake oplocks = yes
>        getwd cache = yes
>        level2 oplocks = yes
>        max stat cache size = 16384
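>
> Something like this should verify the settings parse and apply them
> without a restart (assuming a stock Samba install):
>
>        testparm -s
>        smbcontrol all reload-config
>
> One caution: fake oplocks = yes is only safe when clients never write
> to the same files concurrently.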
>
>
>> That chunk would get repeated over and over again as fast as the
>> screen could scroll; only occasionally (every 5-10 seconds or so)
>> would you see anything that you'd normally expect to see, such as:
>> close(29)                               = 0
>> stat("Storage/01", 0x7fff07dae870) = -1 ENOENT (No such file or directory)
>> write(23, "\0\0\0#\377SMB24\0\0\300\210A\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0d\233"..., 39) = 39
>> select(38, [5 20 23 27 30 31 35 36 37], [], NULL, {60, 0}) = 1 (in [23],
>> left {60, 0})
>> read(23, "\0\0\0x", 4)                  = 4
>> read(23, "\377SMB2\0\0\0\0\30\7\310\0\0\0\0\0\0\0\0\0\0\0\0\1\0\250P\273\0[8"..., 120) = 120
>> stat("Storage", {st_mode=S_IFDIR|0755, st_size=1581056, ...}) = 0
>> stat("Storage/011235", 0x7fff07dad470) = -1 ENOENT (No such file or
>> directory)
>> stat("Storage/011235", 0x7fff07dad470) = -1 ENOENT (No such file or
>> directory)
>> open("Storage", O_RDONLY|O_NONBLOCK|O_**DIRECTORY) = 29
>> fcntl(29, F_SETFD, FD_CLOEXEC)          = 0
>>
>> (The no such file or directory part is expected since some of the image
>> references don't exist.)
>>
>>
> Ok.  It looks like Samba is pounding on GlusterFS metadata (getdents).
> GlusterFS doesn't really do a great job in this case ... you have to give it
> help and cache pretty aggressively here.  Samba can do this caching to some
> extent.  You might want to enable stat-cache and fast lookups.  These have
> been problematic for us in the past though, and I'd recommend caution.
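>
> On the Samba side the explicit knobs would look something like this in
> [global] (stat cache defaults to on in modern Samba; the size line just
> restates the earlier suggestion):
>
>        stat cache = yes
>        max stat cache size = 16384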
>
>> Ken
>>
>>
>>
>
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>       http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>