[Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts

Jiffin Tony Thottan jthottan at redhat.com
Fri Aug 12 05:35:48 UTC 2016



On 12/08/16 07:23, Deepak Naidu wrote:
> I tried a few more things to narrow down the issue, such as upgrading NFS-Ganesha to the latest version (the earlier version had a crash bug), which helped a bit.
>
> However, ls -ls and rm -rf still hang, though not as much as before. So upgrading NFS-Ganesha to the stable version did help a bit.
>
> I ran strace again, and it looks like it's pausing/hanging at "lstat". I had to press [Ctrl+C] to capture the exact line where it hangs.
>
> lgetxattr("/mnt/gluster/rand.26.0", "security.selinux", 0x1990a00, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.25.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.25.0", "security.selinux", 0x1990a20, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.24.0", ^C
>
>
> NOTE: I am running fio to generate write load, and the hangs are seen when issuing ls while the writes are in progress.
>
> The next thing I might try is an NFS mount rather than the GlusterFS FUSE mount, to see if the issue is related to the FUSE client.
>
> ====strace of ls -l /mnt/gluster/======
>
> munmap(0x7efebec71000, 4096)            = 0
> openat(AT_FDCWD, "/mnt/gluster/", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
> getdents(3, /* 14 entries */, 32768)    = 464
> lstat("/mnt/gluster/9e50d562-5846-4a60-ad75-e95dcbe0e38a.vhd", {st_mode=S_IFREG|0644, st_size=19474461184, ...}) = 0
> lgetxattr("/mnt/gluster/9e50d562-5846-4a60-ad75-e95dcbe0e38a.vhd", "security.selinux", 0x1990900, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/file1", {st_mode=S_IFREG|0644, st_size=19474461184, ...}) = 0
> lgetxattr("/mnt/gluster/file1", "security.selinux", 0x1990940, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.0.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.0.0", "security.selinux", 0x1990940, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.31.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.31.0", "security.selinux", 0x1990960, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.30.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.30.0", "security.selinux", 0x1990980, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.29.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.29.0", "security.selinux", 0x19909a0, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.28.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.28.0", "security.selinux", 0x19909c0, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.27.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.27.0", "security.selinux", 0x19909e0, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.26.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.26.0", "security.selinux", 0x1990a00, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.25.0", {st_mode=S_IFREG|0644, st_size=2147483648, ...}) = 0
> lgetxattr("/mnt/gluster/rand.25.0", "security.selinux", 0x1990a20, 255) = -1 ENODATA (No data available)
> lstat("/mnt/gluster/rand.24.0", ^C
>
> ====end of strace - ls -l /mnt/gluster/======


I am wondering why it is sending a getxattr call for "security.selinux".
Also, can you please mention which version of Ganesha you are running,
along with the details of your ganesha.conf? The latest stable release of
Ganesha (2.3.3) is pretty close.

Check /var/log/ganesha.log and /var/log/ganesha-gfapi.log for more clues.
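
One way to narrow this down from the client side is to time each lstat()
call on its own, since that is the call the strace above stalls on. The
following is only a minimal sketch under stated assumptions: it assumes
Python 3 is available on the client, and the function names time_lstat and
slowest are made up for illustration. It does not issue the lgetxattr calls
that ls -l makes.

```python
import os
import time


def time_lstat(directory):
    """Return (name, seconds) pairs: the time one lstat() took per entry."""
    timings = []
    # os.listdir() only reads the entry names; it does not stat them.
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        start = time.monotonic()
        try:
            os.lstat(path)
        except OSError:
            # Entry vanished or is unreachable; the elapsed time is still kept.
            pass
        timings.append((name, time.monotonic() - start))
    return timings


def slowest(timings, count=5):
    """Return the `count` slowest entries, slowest first."""
    return sorted(timings, key=lambda item: item[1], reverse=True)[:count]


# Hypothetical usage on the client (path taken from the strace above):
#   for name, seconds in slowest(time_lstat("/mnt/gluster")):
#       print(f"{seconds:8.3f}s  {name}")
```

Running it once while fio is writing and once while the volume is idle
should show whether the delay is tied to particular files or spread across
all of them.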

--
Jiffin


> -----Original Message-----
> From: Deepak Naidu
> Sent: Wednesday, August 10, 2016 2:25 PM
> To: Vijay Bellur
> Cc: gluster-users at gluster.org
> Subject: RE: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts
>
> To be more precise, the hang is clearly seen when there is some IO (write) to the mount point. Even rm -rf takes time to clear the files.
>
> Below is the output of the time command showing the delay. Typically it should take less than a second, but GlusterFS takes more than 5 seconds just to list 32x 2GB files.
>
> [root at client-host ~]# time ls -l /mnt/gluster/
> total 34575680
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.0.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.1.0
> -rw-r--r--. 1 root root 2147454976 Aug 10 12:23 rand.10.0
> -rw-r--r--. 1 root root 2147463168 Aug 10 12:23 rand.11.0
> -rw-r--r--. 1 root root 2147467264 Aug 10 12:23 rand.12.0
> -rw-r--r--. 1 root root 2147475456 Aug 10 12:23 rand.13.0
> -rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.14.0
> -rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.15.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.16.0
> -rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.17.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.18.0
> -rw-r--r--. 1 root root 2147467264 Aug 10 12:23 rand.19.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.2.0
> -rw-r--r--. 1 root root 2147475456 Aug 10 12:23 rand.20.0
> -rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.21.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.22.0
> -rw-r--r--. 1 root root 2147459072 Aug 10 12:23 rand.23.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.24.0
> -rw-r--r--. 1 root root 2147471360 Aug 10 12:23 rand.25.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.26.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.27.0
> -rw-r--r--. 1 root root 2147479552 Aug 10 12:23 rand.28.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.29.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.3.0
> -rw-r--r--. 1 root root 2147442688 Aug 10 12:23 rand.30.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.31.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.4.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.5.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.6.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.7.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.8.0
> -rw-r--r--. 1 root root 2147483648 Aug 10 12:23 rand.9.0
>
> real    0m7.478s
> user    0m0.001s
> sys     0m0.005s
>   [root at client-host ~]#
>
> --
> Deepak
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Deepak Naidu
> Sent: Wednesday, August 10, 2016 2:18 PM
> To: Vijay Bellur
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts
>
> I ran strace, and it's waiting on IO.
>
> --
> Deepak
>
> -----Original Message-----
> From: Vijay Bellur [mailto:vbellur at redhat.com]
> Sent: Wednesday, August 10, 2016 2:17 PM
> To: Deepak Naidu
> Cc: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Linux (ls -l) command pauses/slow on GlusterFS mounts
>
> On 08/10/2016 05:12 PM, Deepak Naidu wrote:
>> Before trying a physical setup, we wanted to do a POC on VMs.
>>
>> Just a note: the VMs are decently powerful, with 18 CPUs, a 10-gig NIC, 45GB RAM, and 1TB SSD drives. This is the per-node spec.
>>
>> I don't see the ls -l command hanging when I list the files from the gluster-node VMs themselves, hence the question.
> The reason I alluded to a physical setup was to remove the variables that can affect performance in a virtual setup. The behavior is not usual for the scale of deployment that you mention. You could use strace in conjunction with gluster volume profile to determine where the latency is stemming from.
>
> Regards,
> Vijay
>
>> --
>> Deepak
>>
>>> On Aug 10, 2016, at 2:01 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>>>
>>>> On 08/10/2016 04:54 PM, Deepak Naidu wrote:
>>>> Has anyone seen this issue in their environment?
>>>
>>>> --
>>>> Deepak
>>>>
>>>> -----Original Message-----
>>>> From: gluster-users-bounces at gluster.org
>>>> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Deepak Naidu
>>>> Sent: Tuesday, August 09, 2016 9:14 PM
>>>> To: gluster-users at gluster.org
>>>> Subject: [Gluster-users] Linux (ls -l) command pauses/slow on
>>>> GlusterFS mounts
>>>>
>>>> Greetings,
>>>>
>>>> I have a 3-node GlusterFS cluster on VMs for a POC; each node has 2x bricks of 200GB. Regardless of what type of volume I create, when listing files in a directory using the ls command, the GlusterFS mount pauses for a few seconds. This is the same whether there are 2-5 files of 19GB each or of 2GB each. There are fewer than 10 files under the GlusterFS mount.
>>>>
>>>> I am using NFS-Ganesha as the NFS server with GlusterFS, and the Linux client mounts the volume using the GlusterFS FUSE mount with direct-io enabled.
>>>>
>>>> GlusterFS version 3.8 (latest)
>>>>
>>>>
>>>> Any insight is appreciated.
>>> This does not seem usual for the deployment that you describe. Can you try on a physical setup to see if the same behavior is observed?
>>>
>>> -Vijay
>>>
>>>
>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
