[Gluster-users] glusterfsd process spinning
Pranith Kumar Karampuri
pkarampu at redhat.com
Wed Jun 4 04:49:47 UTC 2014
On 06/04/2014 08:07 AM, Susant Palai wrote:
> Pranith can you send the client and bricks logs.
I have the logs. But for this issue of the directory not listing its
entries, I believe it would help more if we had the contents of that
directory on all the bricks, plus the hash values in its xattrs.
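
Something like the following, run on each brick node, should capture both
(the path here follows the listing Franco posted below; adjust it to the
real brick paths):

    # directory entries as each brick sees them
    ls -la /data*/gvol/franco/dir1226/dir25
    # directory xattrs, including the DHT layout (hash ranges), dumped in hex
    getfattr -d -m . -e hex /data*/gvol/franco/dir1226/dir25
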
Pranith
>
> Thanks,
> Susant~
>
> ----- Original Message -----
> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> To: "Franco Broi" <franco.broi at iongeo.com>
> Cc: gluster-users at gluster.org, "Raghavendra Gowdappa" <rgowdapp at redhat.com>, spalai at redhat.com, kdhananj at redhat.com, vsomyaju at redhat.com, nbalacha at redhat.com
> Sent: Wednesday, 4 June, 2014 7:53:41 AM
> Subject: Re: [Gluster-users] glusterfsd process spinning
>
> Hi Franco,
>       CC'ing the devs who work on DHT to comment.
>
> Pranith
>
> On 06/04/2014 07:39 AM, Franco Broi wrote:
>> On Wed, 2014-06-04 at 07:28 +0530, Pranith Kumar Karampuri wrote:
>>> Franco,
>>> Thanks for providing the logs. I just copied over the logs to my
>>> machine. Most of the logs I see are related to "No such file or
>>> directory". I wonder what led to this. Do you have any idea?
>> No, but I'm just looking at my 3.5 Gluster volume and it has a directory
>> that looks empty but can't be deleted. When I look at the directories on
>> the servers there are definitely files in there.
>>
>> [franco at charlie1 franco]$ rmdir /data2/franco/dir1226/dir25
>> rmdir: failed to remove `/data2/franco/dir1226/dir25': Directory not empty
>> [franco at charlie1 franco]$ ls -la /data2/franco/dir1226/dir25
>> total 8
>> drwxrwxr-x 2 franco support 60 May 21 03:58 .
>> drwxrwxr-x 3 franco support 24 Jun 4 09:37 ..
>>
>> [root at nas6 ~]# ls -la /data*/gvol/franco/dir1226/dir25
>> /data21/gvol/franco/dir1226/dir25:
>> total 2081
>> drwxrwxr-x 13 1348 200 13 May 21 03:58 .
>> drwxrwxr-x 3 1348 200 3 May 21 03:58 ..
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13017
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13018
>> drwxrwxr-x 2 1348 200 3 May 16 12:05 dir13020
>> drwxrwxr-x 2 1348 200 3 May 16 12:05 dir13021
>> drwxrwxr-x 2 1348 200 3 May 16 12:05 dir13022
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13024
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13027
>> drwxrwxr-x 2 1348 200 3 May 16 12:05 dir13028
>> drwxrwxr-x 2 1348 200 2 May 16 12:06 dir13029
>> drwxrwxr-x 2 1348 200 2 May 16 12:06 dir13031
>> drwxrwxr-x 2 1348 200 3 May 16 12:06 dir13032
>>
>> /data22/gvol/franco/dir1226/dir25:
>> total 2084
>> drwxrwxr-x 13 1348 200 13 May 21 03:58 .
>> drwxrwxr-x 3 1348 200 3 May 21 03:58 ..
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13017
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13018
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13020
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13021
>> drwxrwxr-x 2 1348 200 2 May 16 12:05 dir13022
>> .....
>>
>> Maybe Gluster is losing track of the files??
>>
>>> Pranith
>>>
>>> On 06/02/2014 02:48 PM, Franco Broi wrote:
>>>> Hi Pranith
>>>>
>>>> Here's a listing of the brick logs; it looks very odd, especially the
>>>> size of the log for data10.
>>>>
>>>> [root at nas3 bricks]# ls -ltrh
>>>> total 2.6G
>>>> -rw------- 1 root root 381K May 13 12:15 data12-gvol.log-20140511
>>>> -rw------- 1 root root 430M May 13 12:15 data11-gvol.log-20140511
>>>> -rw------- 1 root root 328K May 13 12:15 data9-gvol.log-20140511
>>>> -rw------- 1 root root 2.0M May 13 12:15 data10-gvol.log-20140511
>>>> -rw------- 1 root root 0 May 18 03:43 data10-gvol.log-20140525
>>>> -rw------- 1 root root 0 May 18 03:43 data11-gvol.log-20140525
>>>> -rw------- 1 root root 0 May 18 03:43 data12-gvol.log-20140525
>>>> -rw------- 1 root root 0 May 18 03:43 data9-gvol.log-20140525
>>>> -rw------- 1 root root 0 May 25 03:19 data10-gvol.log-20140601
>>>> -rw------- 1 root root 0 May 25 03:19 data11-gvol.log-20140601
>>>> -rw------- 1 root root 0 May 25 03:19 data9-gvol.log-20140601
>>>> -rw------- 1 root root 98M May 26 03:04 data12-gvol.log-20140518
>>>> -rw------- 1 root root 0 Jun 1 03:37 data10-gvol.log
>>>> -rw------- 1 root root 0 Jun 1 03:37 data11-gvol.log
>>>> -rw------- 1 root root 0 Jun 1 03:37 data12-gvol.log
>>>> -rw------- 1 root root 0 Jun 1 03:37 data9-gvol.log
>>>> -rw------- 1 root root 1.8G Jun 2 16:35 data10-gvol.log-20140518
>>>> -rw------- 1 root root 279M Jun 2 16:35 data9-gvol.log-20140518
>>>> -rw------- 1 root root 328K Jun 2 16:35 data12-gvol.log-20140601
>>>> -rw------- 1 root root 8.3M Jun 2 16:35 data11-gvol.log-20140518
>>>>
>>>> Too big to post everything.
>>>>
>>>> Cheers,
>>>>
>>>> On Sun, 2014-06-01 at 22:00 -0400, Pranith Kumar Karampuri wrote:
>>>>> ----- Original Message -----
>>>>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>> To: "Franco Broi" <franco.broi at iongeo.com>
>>>>>> Cc: gluster-users at gluster.org
>>>>>> Sent: Monday, June 2, 2014 7:01:34 AM
>>>>>> Subject: Re: [Gluster-users] glusterfsd process spinning
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "Franco Broi" <franco.broi at iongeo.com>
>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>>> Cc: gluster-users at gluster.org
>>>>>>> Sent: Sunday, June 1, 2014 10:53:51 AM
>>>>>>> Subject: Re: [Gluster-users] glusterfsd process spinning
>>>>>>>
>>>>>>>
>>>>>>> The volume is almost completely idle now and the CPU for the brick
>>>>>>> process has returned to normal. I've included the profile and I think it
>>>>>>> shows the latency for the bad brick (data12) is unusually high, probably
>>>>>>> indicating the filesystem is at fault after all??
>>>>>> I am not sure if we can believe the outputs now that you say the brick
>>>>>> returned to normal. Next time it is acting up, do the same procedure and
>>>>>> post the result.
>>>>> On second thought, maybe it's not a bad idea to inspect the log files of the bricks on nas3. Could you post them?
>>>>>
>>>>> Pranith
>>>>>
>>>>>> Pranith
>>>>>>> On Sun, 2014-06-01 at 01:01 -0400, Pranith Kumar Karampuri wrote:
>>>>>>>> Franco,
>>>>>>>> Could you do the following to get more information:
>>>>>>>>
>>>>>>>> "gluster volume profile <volname> start"
>>>>>>>>
>>>>>>>> Wait for some time; this will start gathering what operations are
>>>>>>>> coming to all the bricks.
>>>>>>>> Now execute "gluster volume profile <volname> info" >
>>>>>>>> /file/you/should/reply/to/this/mail/with
>>>>>>>>
>>>>>>>> Then execute:
>>>>>>>> gluster volume profile <volname> stop
>>>>>>>>
>>>>>>>> Let's see if this throws any light on the problem at hand.
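>>>>>>>>
>>>>>>>> Roughly, as one sequence (the wait time and output file name here are
>>>>>>>> just examples):
>>>>>>>>
>>>>>>>> gluster volume profile <volname> start
>>>>>>>> sleep 600    # let it gather per-brick stats while the problem is visible
>>>>>>>> gluster volume profile <volname> info > /tmp/profile-info.txt
>>>>>>>> gluster volume profile <volname> stop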
>>>>>>>>
>>>>>>>> Pranith
>>>>>>>> ----- Original Message -----
>>>>>>>>> From: "Franco Broi" <franco.broi at iongeo.com>
>>>>>>>>> To: gluster-users at gluster.org
>>>>>>>>> Sent: Sunday, June 1, 2014 9:02:48 AM
>>>>>>>>> Subject: [Gluster-users] glusterfsd process spinning
>>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I've been suffering from continual problems with my gluster filesystem
>>>>>>>>> slowing down. I thought it was congestion on a single brick caused by
>>>>>>>>> the underlying filesystem running slow, but I've just noticed that the
>>>>>>>>> glusterfsd process for that particular brick is running at 100%+ CPU,
>>>>>>>>> even when the filesystem is almost idle.
>>>>>>>>>
>>>>>>>>> I've done a couple of straces, one of the brick and another on the same
>>>>>>>>> server; does the high number of futex errors give any clues as to what
>>>>>>>>> might be wrong?
>>>>>>>>>
>>>>>>>>> % time     seconds  usecs/call     calls    errors syscall
>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>>>>>>  45.58    0.027554           0    191665     20772 futex
>>>>>>>>>  28.26    0.017084           0    137133           readv
>>>>>>>>>  26.04    0.015743           0     66259           epoll_wait
>>>>>>>>>   0.13    0.000077           3        23           writev
>>>>>>>>>   0.00    0.000000           0         1           epoll_ctl
>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>>>>>> 100.00    0.060458                395081     20772 total
>>>>>>>>>
>>>>>>>>> % time     seconds  usecs/call     calls    errors syscall
>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>>>>>>  99.25    0.334020         133      2516           epoll_wait
>>>>>>>>>   0.40    0.001347           0      4090        26 futex
>>>>>>>>>   0.35    0.001192           0      5064           readv
>>>>>>>>>   0.00    0.000000           0        20           writev
>>>>>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>>>>>> 100.00    0.336559                 11690        26 total
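>>>>>>>>>
>>>>>>>>> For reference, a per-syscall summary like the above can be captured with
>>>>>>>>> something along these lines (the pid is the brick's glusterfsd process;
>>>>>>>>> leave it attached for ~30 seconds, then interrupt it to print the totals):
>>>>>>>>>
>>>>>>>>> strace -c -f -p <glusterfsd-pid>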
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>