[Gluster-users] [Gluster-devel] heal hanging
Pranith Kumar Karampuri
pkarampu at redhat.com
Thu Jan 28 15:00:37 UTC 2016
On 01/28/2016 07:48 PM, David Robinson wrote:
> > Something really bad related to locks is happening. Did you guys patch
> > the recent memory corruption bug which only affects workloads with more
> > than 128 clients? http://review.gluster.org/13241
> We have not applied that patch. Will this be included in the 3.6.7
> release? If so, do you know when that version will be released?
+ Raghavendra Bhat
Could you please let David know the next release date?
> David
> ------ Original Message ------
> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> To: "David Robinson" <drobinson at corvidtec.com>; "Glomski, Patrick"
> <patrick.glomski at corvidtec.com>
> Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>;
> "Gluster Devel" <gluster-devel at gluster.org>
> Sent: 1/28/2016 5:10:07 AM
> Subject: Re: [Gluster-users] [Gluster-devel] heal hanging
>>
>>
>> On 01/25/2016 11:10 PM, David Robinson wrote:
>>> It is doing it again... statedump from gfs02a is attached...
>>
>> David,
>> I see a lot of traffic from [f]inodelks:
>> 15:09:00 :) ⚡ grep wind_from data-brick02a-homegfs.4066.dump.1453742225 | sort | uniq -c
>> 11 unwind_from=default_finodelk_cbk
>> 11 unwind_from=io_stats_finodelk_cbk
>> 11 unwind_from=pl_common_inodelk
>> 1133 wind_from=default_finodelk_resume
>> 1 wind_from=default_inodelk_resume
>> 75 wind_from=index_getxattr
>> 6 wind_from=io_stats_entrylk
>> 12776 wind_from=io_stats_finodelk
>> 15 wind_from=io_stats_flush
>> 75 wind_from=io_stats_getxattr
>> 4 wind_from=io_stats_inodelk
>> 4 wind_from=io_stats_lk
>> 4 wind_from=io_stats_setattr
>> 75 wind_from=marker_getxattr
>> 4 wind_from=marker_setattr
>> 75 wind_from=quota_getxattr
>> 6 wind_from=server_entrylk_resume
>> 12776 wind_from=server_finodelk_resume <<--------------
>> 15 wind_from=server_flush_resume
>> 75 wind_from=server_getxattr_resume
>> 4 wind_from=server_inodelk_resume
>> 4 wind_from=server_lk_resume
>> 4 wind_from=server_setattr_resume
>>
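A minimal sketch of the same tally in Python, for anyone who wants to script it across several statedumps. It assumes each frame field sits on its own line in the dump, as the grep output above suggests; the sample lines are a made-up excerpt, not taken from the attached dump.

```python
from collections import Counter

def tally_frames(dump_lines):
    """Count identical wind_from=/unwind_from= lines, like grep | sort | uniq -c."""
    counts = Counter()
    for line in dump_lines:
        line = line.strip()
        # "wind_from=" also matches "unwind_from=", just as the grep pattern does.
        if "wind_from=" in line:
            counts[line] += 1
    return counts

# Hypothetical excerpt; a real statedump has many more fields per frame.
sample = [
    "wind_from=io_stats_finodelk",
    "wind_from=server_finodelk_resume",
    "wind_from=io_stats_finodelk",
    "unwind_from=pl_common_inodelk",
    "ref_count=1",
]

for key, n in sorted(tally_frames(sample).items()):
    print(f"{n:7d} {key}")
```

To run it on a real dump, pass an open file object instead of the sample list, since the function only iterates over lines.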
>> But there are very few active locks:
>> pk1 at localhost - ~/Downloads
>> 15:09:07 :) ⚡ grep ACTIVE data-brick02a-homegfs.4066.dump.1453742225
>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 11678, owner=b42fff03ce7f0000, client=0x13d2cd0, connection-id=corvidpost3.corvidtec.com-52656-2016/01/22-16:40:31:459920-homegfs-client-6-0-1, granted at 2016-01-25 17:16:06
>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 15759, owner=b8ca8c0100000000, client=0x189e470, connection-id=corvidpost4.corvidtec.com-17718-2016/01/22-16:40:31:221380-homegfs-client-6-0-1, granted at 2016-01-25 17:12:52
>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 7103, owner=0cf31a98f87f0000, client=0x2201d60, connection-id=zlv-bangell-4812-2016/01/25-13:45:52:170157-homegfs-client-6-0-0, granted at 2016-01-25 17:09:56
>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 55764, owner=882dbea1417f0000, client=0x17fc940, connection-id=corvidpost.corvidtec.com-35961-2016/01/22-16:40:31:88946-homegfs-client-6-0-1, granted at 2016-01-25 17:06:12
>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=9223372036854775806, len=0, pid = 21129, owner=3cc068a1e07f0000, client=0x1495040, connection-id=corvidpost2.corvidtec.com-43400-2016/01/22-16:40:31:248771-homegfs-client-6-0-1, granted at 2016-01-25 17:15:53
>>
>> One more odd thing I found is the following:
>>
>> [2016-01-15 14:03:06.910687] C
>> [rpc-clnt-ping.c:109:rpc_clnt_ping_timer_expired] 0-homegfs-client-2:
>> server 10.200.70.1:49153 has not responded in the last 10 seconds,
>> disconnecting.
>> [2016-01-15 14:03:06.910886] E [rpc-clnt.c:362:saved_frames_unwind]
>> (-->
>> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x2b74c289a580]
>> (-->
>> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x2b74c2b27787]
>> (-->
>> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x2b74c2b2789e]
>> (-->
>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x2b74c2b27951]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x2b74c2b27f1f]
>> ))))) 0-homegfs-client-2: forced unwinding frame type(GlusterFS 3.3)
>> op(FINODELK(30)) called at 2016-01-15 10:30:09.487422 (xid=0x11ed3f)
>>
>> FINODELK was called at 2016-01-15 10:30:09.487422, but the response
>> still had not come by 14:03:06. That is about 3.5 hours!!
>>
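For reference, the gap between the two timestamps in that log entry can be checked with a few lines of Python. This is only arithmetic on the "called at" time and the time of the forced-unwind message, both copied from the log above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Both timestamps are copied from the saved_frames_unwind log entry above.
called = datetime.strptime("2016-01-15 10:30:09.487422", FMT)
unwound = datetime.strptime("2016-01-15 14:03:06.910886", FMT)

elapsed = unwound - called
print(elapsed)                                  # h:mm:ss.micro
print(round(elapsed.total_seconds() / 3600, 2), "hours")
```

The frame sat unanswered for a little over three and a half hours before the ping timer expired and forced the unwind.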
>> Something really bad related to locks is happening. Did you guys
>> patch the recent memory corruption bug which only affects workloads
>> with more than 128 clients? http://review.gluster.org/13241
>>
>> Pranith
>>> ------ Original Message ------
>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>> To: "Glomski, Patrick" <patrick.glomski at corvidtec.com>
>>> Cc: "David Robinson" <drobinson at corvidtec.com>;
>>> "gluster-users at gluster.org" <gluster-users at gluster.org>;
>>> "Gluster Devel" <gluster-devel at gluster.org>
>>> Sent: 1/24/2016 9:27:02 PM
>>> Subject: Re: [Gluster-users] [Gluster-devel] heal hanging
>>>> It seems like there is a lot of finodelk/inodelk traffic. I wonder
>>>> why that is. I think the next step is to collect a statedump of the
>>>> brick that is taking a lot of CPU, using "gluster volume statedump
>>>> <volname>".
>>>>
>>>> Pranith
>>>> On 01/22/2016 08:36 AM, Glomski, Patrick wrote:
>>>>> Pranith, attached are stack traces collected every second for 20
>>>>> seconds from the high-%cpu glusterfsd process.
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Thu, Jan 21, 2016 at 9:46 PM, Glomski, Patrick
>>>>> <patrick.glomski at corvidtec.com
>>>>> <mailto:patrick.glomski at corvidtec.com>> wrote:
>>>>>
>>>>> Last entry for get_real_filename on any of the bricks was when
>>>>> we turned off the samba gfapi vfs plugin earlier today:
>>>>>
>>>>> /var/log/glusterfs/bricks/data-brick01a-homegfs.log:[2016-01-21 15:13:00.008239]
>>>>> E [server-rpc-fops.c:768:server_getxattr_cbk]
>>>>> 0-homegfs-server: 105: GETXATTR /wks_backup
>>>>> (40e582d6-b0c7-4099-ba88-9168a3c32ca6)
>>>>> (glusterfs.get_real_filename:desktop.ini) ==> (Permission denied)
>>>>>
>>>>> We'll get back to you with those traces when %cpu spikes
>>>>> again. As with most sporadic problems, as soon as you want
>>>>> something out of it, the issue becomes harder to reproduce.
>>>>>
>>>>>
>>>>> On Thu, Jan 21, 2016 at 9:21 PM, Pranith Kumar Karampuri
>>>>> <pkarampu at redhat.com <mailto:pkarampu at redhat.com>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 01/22/2016 07:25 AM, Glomski, Patrick wrote:
>>>>>> Unfortunately, all samba mounts to the gluster volume
>>>>>> through the gfapi vfs plugin have been disabled for the
>>>>>> last 6 hours or so and frequency of %cpu spikes is
>>>>>> increased. We had switched to sharing a fuse mount
>>>>>> through samba, but I just disabled that as well. There
>>>>>> are no samba shares of this volume now. The spikes now
>>>>>> happen every thirty minutes or so. We've resorted to just
>>>>>> rebooting the machine with high load for the present.
>>>>>
>>>>> Could you check whether logs of the following type have stopped
>>>>> coming entirely?
>>>>> [2016-01-21 15:13:00.005736] E
>>>>> [server-rpc-fops.c:768:server_getxattr_cbk]
>>>>> 0-homegfs-server: 110: GETXATTR /wks_backup
>>>>> (40e582d6-b0c7-4099-ba88-9168a3c32ca6)
>>>>> (glusterfs.get_real_filename:desktop.ini) ==> (Permission denied)
>>>>>
>>>>> These are operations that failed. Operations that succeed
>>>>> are the ones that will scan the directory. But I don't
>>>>> have a way to find them other than using tcpdumps.
>>>>>
>>>>> At the moment I have 2 theories:
>>>>> 1) these get_real_filename calls
>>>>> 2) [2016-01-21 16:10:38.017828] E
>>>>> [server-helpers.c:46:gid_resolve] 0-gid-cache:
>>>>> getpwuid_r(494) failed
>>>>> "
>>>>>
>>>>> Yessir they are. Normally, sssd would look to the local
>>>>> cache file in /var/lib/sss/db/ first, to get any group or
>>>>> userid information, then go out to the domain controller.
>>>>> I put the options that we are using on our GFS volumes
>>>>> below… Thanks for your help.
>>>>>
>>>>> We had been running sssd with sssd_nss and sssd_be
>>>>> sub-processes on these systems for a long time, under the
>>>>> GFS 3.5.2 code, and not run into the problem that David
>>>>> described with the high cpu usage on sssd_nss.
>>>>>
>>>>> "
>>>>> That was Tom Young's email from 1.5 years back, when we
>>>>> debugged it. But the process that was consuming a lot of
>>>>> CPU then was sssd_nss, so I am not sure if it is the same issue.
>>>>> Let us debug to make sure '1)' is not happening. The gstack
>>>>> traces I asked for should also help.
>>>>>
>>>>>
>>>>> Pranith
>>>>>>
>>>>>> On Thu, Jan 21, 2016 at 8:49 PM, Pranith Kumar Karampuri
>>>>>> <pkarampu at redhat.com <mailto:pkarampu at redhat.com>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 01/22/2016 07:13 AM, Glomski, Patrick wrote:
>>>>>>> We use the samba glusterfs virtual filesystem (the
>>>>>>> current version provided on download.gluster.org
>>>>>>> <http://download.gluster.org/>), but no windows
>>>>>>> clients connecting directly.
>>>>>>
>>>>>> Hmm.. Is there a way to disable using this and check
>>>>>> if the CPU% still increases? What getxattr of
>>>>>> "glusterfs.get_real_filename <filename>" does is
>>>>>> scan the entire directory, comparing each entry with
>>>>>> strcasecmp(<filename>, <scanned-filename>). If
>>>>>> anything matches, it returns the
>>>>>> <scanned-filename>. But the problem is that the scan is
>>>>>> costly. So I wonder if this is the reason for the CPU
>>>>>> spikes.
>>>>>>
>>>>>> Pranith
>>>>>>
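To illustrate why that lookup is expensive, here is a rough Python sketch of a case-insensitive directory scan. It is not the brick-side implementation; find_real_filename is a hypothetical helper, and lower() stands in for strcasecmp, which is only equivalent for ASCII names. The point is that every lookup may walk the whole directory listing.

```python
import os
import tempfile

def find_real_filename(directory, wanted):
    """Return the on-disk name matching `wanted` ignoring case, else None.

    Each call lists the directory and compares entries one by one, so the
    cost grows with the number of entries in the directory.
    """
    wanted_folded = wanted.lower()
    for entry in os.listdir(directory):
        if entry.lower() == wanted_folded:
            return entry
    return None

# Demo on a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "Desktop.INI"), "w").close()
    print(find_real_filename(d, "desktop.ini"))
    print(find_real_filename(d, "missing.txt"))
```

A lookup that finds nothing is the worst case, because the whole listing has to be read before the function can give up.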
>>>>>>>
>>>>>>> On Thu, Jan 21, 2016 at 8:37 PM, Pranith Kumar
>>>>>>> Karampuri <pkarampu at redhat.com
>>>>>>> <mailto:pkarampu at redhat.com>> wrote:
>>>>>>>
>>>>>>> Do you have any windows clients? I see a lot of
>>>>>>> getxattr calls for "glusterfs.get_real_filename"
>>>>>>> which lead to full readdirs of the directories
>>>>>>> on the brick.
>>>>>>>
>>>>>>> Pranith
>>>>>>>
>>>>>>> On 01/22/2016 12:51 AM, Glomski, Patrick wrote:
>>>>>>>> Pranith, could this kind of behavior be
>>>>>>>> self-inflicted by us deleting files directly
>>>>>>>> from the bricks? We have done that in the past
>>>>>>>> to clean up issues where gluster wouldn't
>>>>>>>> allow us to delete from the mount.
>>>>>>>>
>>>>>>>> If so, is it feasible to clean them up by
>>>>>>>> running a search on the .glusterfs directories
>>>>>>>> directly and removing files with a reference
>>>>>>>> count of 1 that are non-zero size (or directly
>>>>>>>> checking the xattrs to be sure that it's not a
>>>>>>>> DHT link)?
>>>>>>>>
>>>>>>>> find /data/brick01a/homegfs/.glusterfs -type f
>>>>>>>> -not -empty -links -2 -exec rm -f "{}" \;
>>>>>>>>
>>>>>>>> Is there anything I'm inherently missing with
>>>>>>>> that approach that will further corrupt the system?
>>>>>>>>
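Before deleting anything, the same selection can be run as a listing only. The sketch below is a Python rendering of the find filter above (regular file, non-empty, fewer than 2 hard links) with the rm step left out. unlinked_candidates is a hypothetical helper and the demo uses a temporary directory, not a brick; whether such files are actually safe to remove is exactly the open question in this thread.

```python
import os
import stat
import tempfile

def unlinked_candidates(root):
    """Return regular, non-empty files under root with fewer than 2 hard links."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks, like find -type f
            if stat.S_ISREG(st.st_mode) and st.st_size > 0 and st.st_nlink < 2:
                found.append(path)
    return sorted(found)

# Demo on a throwaway directory (hypothetical names, not a real brick).
with tempfile.TemporaryDirectory() as root:
    for name, data in (("orphan", b"x"), ("linked", b"x"), ("empty", b"")):
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
    # Give "linked" a second hard link so its link count becomes 2.
    os.link(os.path.join(root, "linked"), os.path.join(root, "linked2"))
    print([os.path.basename(p) for p in unlinked_candidates(root)])
```

Reviewing the printed list first keeps the destructive step separate from the selection step.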
>>>>>>>>
>>>>>>>> On Thu, Jan 21, 2016 at 1:02 PM, Glomski,
>>>>>>>> Patrick <patrick.glomski at corvidtec.com
>>>>>>>> <mailto:patrick.glomski at corvidtec.com>> wrote:
>>>>>>>>
>>>>>>>> Load spiked again: ~1200%cpu on gfs02a for
>>>>>>>> glusterfsd. Crawl has been running on one
>>>>>>>> of the bricks on gfs02b for 25 min or so
>>>>>>>> and users cannot access the volume.
>>>>>>>>
>>>>>>>> I re-listed the xattrop directories as well
>>>>>>>> as a 'top' entry and heal statistics. Then
>>>>>>>> I restarted the gluster services on gfs02a.
>>>>>>>>
>>>>>>>> =================== top ===================
>>>>>>>> PID  USER PR NI VIRT  RES  SHR  S %CPU   %MEM TIME+     COMMAND
>>>>>>>> 8969 root 20 0  2815m 204m 3588 S 1181.0 0.6  591:06.93 glusterfsd
>>>>>>>>
>>>>>>>> =================== xattrop ===================
>>>>>>>> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-41f19453-91e4-437c-afa9-3b25614de210 xattrop-9b815879-2f4d-402b-867c-a6d65087788c
>>>>>>>>
>>>>>>>> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-70131855-3cfb-49af-abce-9d23f57fb393 xattrop-dfb77848-a39d-4417-a725-9beca75d78c6
>>>>>>>>
>>>>>>>> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> e6e47ed9-309b-42a7-8c44-28c29b9a20f8
>>>>>>>> xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125
>>>>>>>> xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934 xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0
>>>>>>>>
>>>>>>>> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413
>>>>>>>>
>>>>>>>> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531
>>>>>>>>
>>>>>>>> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-7e20fdb1-5224-4b9a-be06-568708526d70
>>>>>>>>
>>>>>>>> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> 8034bc06-92cd-4fa5-8aaf-09039e79d2c8
>>>>>>>> c9ce22ed-6d8b-471b-a111-b39e57f0b512
>>>>>>>> 94fa1d60-45ad-4341-b69c-315936b51e8d
>>>>>>>> xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7
>>>>>>>>
>>>>>>>> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
>>>>>>>> xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d
>>>>>>>>
>>>>>>>>
>>>>>>>> =================== heal stats ===================
>>>>>>>>
>>>>>>>> homegfs [b0-gfsib01a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:45 2016
>>>>>>>> homegfs [b0-gfsib01a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:45 2016
>>>>>>>> homegfs [b0-gfsib01a] : Type of crawl: INDEX
>>>>>>>> homegfs [b0-gfsib01a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b0-gfsib01a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b0-gfsib01a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b1-gfsib01b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:19 2016
>>>>>>>> homegfs [b1-gfsib01b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:19 2016
>>>>>>>> homegfs [b1-gfsib01b] : Type of crawl: INDEX
>>>>>>>> homegfs [b1-gfsib01b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b1-gfsib01b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b1-gfsib01b] : No. of heal failed
>>>>>>>> entries : 1
>>>>>>>>
>>>>>>>> homegfs [b2-gfsib01a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:48 2016
>>>>>>>> homegfs [b2-gfsib01a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:48 2016
>>>>>>>> homegfs [b2-gfsib01a] : Type of crawl: INDEX
>>>>>>>> homegfs [b2-gfsib01a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b2-gfsib01a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b2-gfsib01a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b3-gfsib01b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:47 2016
>>>>>>>> homegfs [b3-gfsib01b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:47 2016
>>>>>>>> homegfs [b3-gfsib01b] : Type of crawl: INDEX
>>>>>>>> homegfs [b3-gfsib01b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b3-gfsib01b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b3-gfsib01b] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b4-gfsib02a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:06 2016
>>>>>>>> homegfs [b4-gfsib02a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:06 2016
>>>>>>>> homegfs [b4-gfsib02a] : Type of crawl: INDEX
>>>>>>>> homegfs [b4-gfsib02a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b4-gfsib02a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b4-gfsib02a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b5-gfsib02b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:13:40 2016
>>>>>>>> homegfs [b5-gfsib02b] : *** Crawl is in
>>>>>>>> progress ***
>>>>>>>> homegfs [b5-gfsib02b] : Type of crawl: INDEX
>>>>>>>> homegfs [b5-gfsib02b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b5-gfsib02b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b5-gfsib02b] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b6-gfsib02a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:58 2016
>>>>>>>> homegfs [b6-gfsib02a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:58 2016
>>>>>>>> homegfs [b6-gfsib02a] : Type of crawl: INDEX
>>>>>>>> homegfs [b6-gfsib02a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b6-gfsib02a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b6-gfsib02a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b7-gfsib02b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:36:50 2016
>>>>>>>> homegfs [b7-gfsib02b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:36:50 2016
>>>>>>>> homegfs [b7-gfsib02b] : Type of crawl: INDEX
>>>>>>>> homegfs [b7-gfsib02b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b7-gfsib02b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b7-gfsib02b] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>>
>>>>>>>> ========================================================================================
>>>>>>>> I waited a few minutes for the heals to
>>>>>>>> finish and ran the heal statistics and info
>>>>>>>> again. One file is in split-brain. Aside
>>>>>>>> from the split-brain, the load on all
>>>>>>>> systems is down now and they are behaving
>>>>>>>> normally. glustershd.log is attached. What
>>>>>>>> is going on???
>>>>>>>>
>>>>>>>> Thu Jan 21 12:53:50 EST 2016
>>>>>>>>
>>>>>>>> =================== homegfs ===================
>>>>>>>>
>>>>>>>> homegfs [b0-gfsib01a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:02 2016
>>>>>>>> homegfs [b0-gfsib01a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:02 2016
>>>>>>>> homegfs [b0-gfsib01a] : Type of crawl: INDEX
>>>>>>>> homegfs [b0-gfsib01a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b0-gfsib01a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b0-gfsib01a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b1-gfsib01b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:38 2016
>>>>>>>> homegfs [b1-gfsib01b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:38 2016
>>>>>>>> homegfs [b1-gfsib01b] : Type of crawl: INDEX
>>>>>>>> homegfs [b1-gfsib01b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b1-gfsib01b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b1-gfsib01b] : No. of heal failed
>>>>>>>> entries : 1
>>>>>>>>
>>>>>>>> homegfs [b2-gfsib01a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b2-gfsib01a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b2-gfsib01a] : Type of crawl: INDEX
>>>>>>>> homegfs [b2-gfsib01a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b2-gfsib01a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b2-gfsib01a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b3-gfsib01b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b3-gfsib01b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b3-gfsib01b] : Type of crawl: INDEX
>>>>>>>> homegfs [b3-gfsib01b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b3-gfsib01b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b3-gfsib01b] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b4-gfsib02a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:33 2016
>>>>>>>> homegfs [b4-gfsib02a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:33 2016
>>>>>>>> homegfs [b4-gfsib02a] : Type of crawl: INDEX
>>>>>>>> homegfs [b4-gfsib02a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b4-gfsib02a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b4-gfsib02a] : No. of heal failed
>>>>>>>> entries : 1
>>>>>>>>
>>>>>>>> homegfs [b5-gfsib02b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:14 2016
>>>>>>>> homegfs [b5-gfsib02b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:15 2016
>>>>>>>> homegfs [b5-gfsib02b] : Type of crawl: INDEX
>>>>>>>> homegfs [b5-gfsib02b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b5-gfsib02b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b5-gfsib02b] : No. of heal failed
>>>>>>>> entries : 3
>>>>>>>>
>>>>>>>> homegfs [b6-gfsib02a] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b6-gfsib02a] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:04 2016
>>>>>>>> homegfs [b6-gfsib02a] : Type of crawl: INDEX
>>>>>>>> homegfs [b6-gfsib02a] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b6-gfsib02a] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b6-gfsib02a] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> homegfs [b7-gfsib02b] : Starting time of
>>>>>>>> crawl : Thu Jan 21 12:53:09 2016
>>>>>>>> homegfs [b7-gfsib02b] : Ending time of
>>>>>>>> crawl : Thu Jan 21 12:53:09 2016
>>>>>>>> homegfs [b7-gfsib02b] : Type of crawl: INDEX
>>>>>>>> homegfs [b7-gfsib02b] : No. of entries
>>>>>>>> healed : 0
>>>>>>>> homegfs [b7-gfsib02b] : No. of entries in
>>>>>>>> split-brain: 0
>>>>>>>> homegfs [b7-gfsib02b] : No. of heal failed
>>>>>>>> entries : 0
>>>>>>>>
>>>>>>>> *** gluster bug in 'gluster volume heal
>>>>>>>> homegfs statistics' ***
>>>>>>>> *** Use 'gluster volume heal homegfs info'
>>>>>>>> until bug is fixed ***
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs01a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs01b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs01a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs01b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs02a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>> /users/bangell/.gconfd - Is in split-brain
>>>>>>>>
>>>>>>>> Number of entries: 1
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs02b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>> /users/bangell/.gconfd - Is in split-brain
>>>>>>>>
>>>>>>>> /users/bangell/.gconfd/saved_state
>>>>>>>> Number of entries: 2
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs02a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>> Brick
>>>>>>>> gfs02b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>> Number of entries: 0
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jan 21, 2016 at 11:10 AM, Pranith
>>>>>>>> Kumar Karampuri <pkarampu at redhat.com
>>>>>>>> <mailto:pkarampu at redhat.com>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 01/21/2016 09:26 PM, Glomski,
>>>>>>>> Patrick wrote:
>>>>>>>>> I should mention that the problem is
>>>>>>>>> not currently occurring and there are
>>>>>>>>> no heals (output appended). By
>>>>>>>>> restarting the gluster services, we
>>>>>>>>> can stop the crawl, which lowers the
>>>>>>>>> load for a while. Subsequent crawls
>>>>>>>>> seem to finish properly. For what it's
>>>>>>>>> worth, files/folders that show up in
>>>>>>>>> the 'volume info' output during a hung
>>>>>>>>> crawl don't seem to be anything out of
>>>>>>>>> the ordinary.
>>>>>>>>>
>>>>>>>>> Over the past four days, the typical
>>>>>>>>> time before the problem recurs after
>>>>>>>>> suppressing it in this manner is an
>>>>>>>>> hour. Last night when we reached out
>>>>>>>>> to you was the last time it happened
>>>>>>>>> and the load has been low since (a
>>>>>>>>> relief). David believes that
>>>>>>>>> recursively listing the files (ls -alR
>>>>>>>>> or similar) from a client mount can
>>>>>>>>> force the issue to happen, but
>>>>>>>>> obviously I'd rather not unless we
>>>>>>>>> have some precise thing we're looking
>>>>>>>>> for. Let me know if you'd like me to
>>>>>>>>> attempt to drive the system unstable
>>>>>>>>> like that and what I should look for.
>>>>>>>>> As it's a production system, I'd
>>>>>>>>> rather not leave it in this state for
>>>>>>>>> long.
>>>>>>>>
>>>>>>>> Would it be possible to send the glustershd and
>>>>>>>> mount logs of the past 4 days? I would
>>>>>>>> like to see if this is because of
>>>>>>>> directory self-heal going wild (Ravi is
>>>>>>>> working on a throttling feature for 3.8,
>>>>>>>> which will allow us to put the brakes on
>>>>>>>> self-heal traffic).
>>>>>>>>
>>>>>>>> Pranith
>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root at gfs01a xattrop]# gluster volume
>>>>>>>>> heal homegfs info
>>>>>>>>> Brick
>>>>>>>>> gfs01a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs01b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs01a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs01b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs02a.corvidtec.com:/data/brick01a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs02b.corvidtec.com:/data/brick01b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs02a.corvidtec.com:/data/brick02a/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>> Brick
>>>>>>>>> gfs02b.corvidtec.com:/data/brick02b/homegfs/
>>>>>>>>> Number of entries: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jan 21, 2016 at 10:40 AM,
>>>>>>>>> Pranith Kumar Karampuri
>>>>>>>>> <pkarampu at redhat.com
>>>>>>>>> <mailto:pkarampu at redhat.com>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 01/21/2016 08:25 PM, Glomski,
>>>>>>>>> Patrick wrote:
>>>>>>>>>> Hello, Pranith. The typical
>>>>>>>>>> behavior is that the %cpu on a
>>>>>>>>>> glusterfsd process jumps to
>>>>>>>>>> number of processor cores
>>>>>>>>>> available (800% or 1200%,
>>>>>>>>>> depending on the pair of nodes
>>>>>>>>>> involved) and the load average on
>>>>>>>>>> the machine goes very high (~20).
>>>>>>>>>> The volume's heal statistics
>>>>>>>>>> output shows that it is crawling
>>>>>>>>>> one of the bricks and trying to
>>>>>>>>>> heal, but this crawl hangs and
>>>>>>>>>> never seems to finish.
>>>>>>>>>>
>>>>>>>>>> The number of files in the
>>>>>>>>>> xattrop directory varies over
>>>>>>>>>> time, so I ran a wc -l as you
>>>>>>>>>> requested periodically for some
>>>>>>>>>> time and then started including a
>>>>>>>>>> datestamped list of the files
>>>>>>>>>> that were in the xattrops
>>>>>>>>>> directory on each brick to see
>>>>>>>>>> which were persistent. All bricks
>>>>>>>>>> had files in the xattrop folder,
>>>>>>>>>> so all results are attached.
>>>>>>>>> Thanks, this info is helpful. I
>>>>>>>>> don't see a lot of files. Could
>>>>>>>>> you give the output of "gluster volume
>>>>>>>>> heal <volname> info"? Is there any
>>>>>>>>> directory in there which is LARGE?
>>>>>>>>>
>>>>>>>>> Pranith
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please let me know if there is
>>>>>>>>>> anything else I can provide.
>>>>>>>>>>
>>>>>>>>>> Patrick
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 21, 2016 at 12:01 AM,
>>>>>>>>>> Pranith Kumar Karampuri
>>>>>>>>>> <pkarampu at redhat.com
>>>>>>>>>> <mailto:pkarampu at redhat.com>> wrote:
>>>>>>>>>>
>>>>>>>>>> hey,
>>>>>>>>>> Which process is
>>>>>>>>>> consuming so much cpu? I went
>>>>>>>>>> through the logs you gave me.
>>>>>>>>>> I see that the following
>>>>>>>>>> files are in gfid mismatch state:
>>>>>>>>>>
>>>>>>>>>> <066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
>>>>>>>>>> <1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>,
>>>>>>>>>> <ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg>,
>>>>>>>>>>
>>>>>>>>>> Could you give me the output
>>>>>>>>>> of "ls
>>>>>>>>>> <brick-path>/indices/xattrop
>>>>>>>>>> | wc -l" on all the
>>>>>>>>>> bricks which are acting this
>>>>>>>>>> way? This will tell us the
>>>>>>>>>> number of pending self-heals
>>>>>>>>>> on the system.
>>>>>>>>>>
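A small Python sketch of that count, for collecting it from several bricks in one script. It returns the raw entry count (what ls | wc -l reports) and, separately, the count without names starting with "xattrop-". Treating those "xattrop-<uuid>" names as index base files rather than pending heals is an assumption on my part, not something stated in this thread; count_index_entries and the demo names are hypothetical.

```python
import os
import tempfile

def count_index_entries(xattrop_dir):
    """Return (all entries, entries not named xattrop-*) for an index directory."""
    names = os.listdir(xattrop_dir)
    # Assumption: "xattrop-<uuid>" names are base files of the index,
    # so only the remaining entries are counted as pending heals.
    pending = [n for n in names if not n.startswith("xattrop-")]
    return len(names), len(pending)

# Demo on a throwaway directory with made-up entry names.
with tempfile.TemporaryDirectory() as d:
    for name in ("xattrop-0000", "gfid-entry-1", "gfid-entry-2"):
        open(os.path.join(d, name), "w").close()
    print(count_index_entries(d))
```

Sampling both numbers periodically with a timestamp makes it easier to see which entries persist between crawls.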
>>>>>>>>>> Pranith
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 01/20/2016 09:26 PM, David
>>>>>>>>>> Robinson wrote:
>>>>>>>>>>> resending with parsed logs...
>>>>>>>>>>>>> I am having issues with
>>>>>>>>>>>>> 3.6.6 where the load will
>>>>>>>>>>>>> spike up to 800% for one
>>>>>>>>>>>>> of the glusterfsd
>>>>>>>>>>>>> processes and the users
>>>>>>>>>>>>> can no longer access the
>>>>>>>>>>>>> system. If I reboot the
>>>>>>>>>>>>> node, the heal will finish
>>>>>>>>>>>>> normally after a few
>>>>>>>>>>>>> minutes and the system
>>>>>>>>>>>>> will be responsive, but a
>>>>>>>>>>>>> few hours later the issue
>>>>>>>>>>>>> will start again. It looks
>>>>>>>>>>>>> like it is hanging in a
>>>>>>>>>>>>> heal and spinning up the
>>>>>>>>>>>>> load on one of the
>>>>>>>>>>>>> bricks. The heal gets
>>>>>>>>>>>>> stuck and says it is
>>>>>>>>>>>>> crawling and never
>>>>>>>>>>>>> returns. After a few
>>>>>>>>>>>>> minutes of the heal saying
>>>>>>>>>>>>> it is crawling, the load
>>>>>>>>>>>>> spikes up and the mounts
>>>>>>>>>>>>> become unresponsive.
>>>>>>>>>>>>> Any suggestions on how to
>>>>>>>>>>>>> fix this? It has us
>>>>>>>>>>>>> stopped cold as the user
>>>>>>>>>>>>> can no longer access the
>>>>>>>>>>>>> systems when the load
>>>>>>>>>>>>> spikes... Logs attached.
>>>>>>>>>>>>> System setup info is:
>>>>>>>>>>>>> [root at gfs01a ~]# gluster
>>>>>>>>>>>>> volume info homegfs
>>>>>>>>>>>>>
>>>>>>>>>>>>> Volume Name: homegfs
>>>>>>>>>>>>> Type: Distributed-Replicate
>>>>>>>>>>>>> Volume ID:
>>>>>>>>>>>>> 1e32672a-f1b7-4b58-ba94-58c085e59071
>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>> Number of Bricks: 4 x 2 = 8
>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>> Brick1:
>>>>>>>>>>>>> gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>> Brick2:
>>>>>>>>>>>>> gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>> Brick3:
>>>>>>>>>>>>> gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>> Brick4:
>>>>>>>>>>>>> gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>> Brick5:
>>>>>>>>>>>>> gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>>>>>>>>>>>> Brick6:
>>>>>>>>>>>>> gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>>>>>>>>>>>> Brick7:
>>>>>>>>>>>>> gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>>>>>>>>>>>> Brick8:
>>>>>>>>>>>>> gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>> performance.io-thread-count:
>>>>>>>>>>>>> 32
>>>>>>>>>>>>> performance.cache-size: 128MB
>>>>>>>>>>>>> performance.write-behind-window-size:
>>>>>>>>>>>>> 128MB
>>>>>>>>>>>>> server.allow-insecure: on
>>>>>>>>>>>>> network.ping-timeout: 42
>>>>>>>>>>>>> storage.owner-gid: 100
>>>>>>>>>>>>> geo-replication.indexing: off
>>>>>>>>>>>>> geo-replication.ignore-pid-check:
>>>>>>>>>>>>> on
>>>>>>>>>>>>> changelog.changelog: off
>>>>>>>>>>>>> changelog.fsync-interval: 3
>>>>>>>>>>>>> changelog.rollover-time: 15
>>>>>>>>>>>>> server.manage-gids: on
>>>>>>>>>>>>> diagnostics.client-log-level:
>>>>>>>>>>>>> WARNING
>>>>>>>>>>>>> [root at gfs01a ~]# rpm -qa |
>>>>>>>>>>>>> grep gluster
>>>>>>>>>>>>> gluster-nagios-common-0.1.1-0.el6.noarch
>>>>>>>>>>>>> glusterfs-fuse-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-debuginfo-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-libs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-geo-replication-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-api-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-api-devel-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-cli-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-rdma-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> samba-vfs-glusterfs-4.1.11-2.el6.x86_64
>>>>>>>>>>>>> glusterfs-server-3.6.6-1.el6.x86_64
>>>>>>>>>>>>> glusterfs-extra-xlators-3.6.6-1.el6.x86_64
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-devel mailing list
>>>>>>>>>>> Gluster-devel at gluster.org <mailto:Gluster-devel at gluster.org>
>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>> <mailto:Gluster-users at gluster.org>
>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users