[Gluster-users] [Gluster-devel] heal hanging

Thu Jan 21 18:02:10 UTC 2016

The load spiked again: glusterfsd on gfs02a is at ~1200% CPU. A crawl has
been running on one of the bricks on gfs02b for about 25 minutes, and users
cannot access the volume.

I captured the xattrop directory listings again, along with a 'top' entry
and the heal statistics. Then I restarted the gluster services on gfs02a.

=================== top ===================
  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM     TIME+ COMMAND
 8969 root      20   0 2815m 204m 3588 S 1181.0  0.6 591:06.93 glusterfsd

=================== xattrop ===================
/data/brick01a/homegfs/.glusterfs/indices/xattrop:
xattrop-41f19453-91e4-437c-afa9-3b25614de210
xattrop-9b815879-2f4d-402b-867c-a6d65087788c

/data/brick02a/homegfs/.glusterfs/indices/xattrop:
xattrop-70131855-3cfb-49af-abce-9d23f57fb393
xattrop-dfb77848-a39d-4417-a725-9beca75d78c6

/data/brick01b/homegfs/.glusterfs/indices/xattrop:
e6e47ed9-309b-42a7-8c44-28c29b9a20f8
xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125
xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934
xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0

/data/brick02b/homegfs/.glusterfs/indices/xattrop:
xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc
xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413

/data/brick01a/homegfs/.glusterfs/indices/xattrop:
xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531

/data/brick02a/homegfs/.glusterfs/indices/xattrop:
xattrop-7e20fdb1-5224-4b9a-be06-568708526d70

/data/brick01b/homegfs/.glusterfs/indices/xattrop:
8034bc06-92cd-4fa5-8aaf-09039e79d2c8
c9ce22ed-6d8b-471b-a111-b39e57f0b512
94fa1d60-45ad-4341-b69c-315936b51e8d
xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7

/data/brick02b/homegfs/.glusterfs/indices/xattrop:
xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d
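
For anyone repeating the check, here is a minimal sketch of counting the
pending entries per brick from listings like the ones above. It assumes that
entries named xattrop-<uuid> are the index's own base files and that every
other entry stands for one pending heal. The function name count_pending is
made up for illustration.

```shell
#!/bin/sh
# count_pending DIR...
# For each index directory, print "<count> <dir>", where <count> is the
# number of entries whose name does not start with "xattrop-".
count_pending() {
    for dir in "$@"; do
        n=0
        for entry in "$dir"/*; do
            [ -e "$entry" ] || continue   # empty directory: the glob did not match
            case "${entry##*/}" in
                xattrop-*) ;;             # assumed to be the base file, not a pending entry
                *) n=$((n + 1)) ;;
            esac
        done
        printf '%s %s\n' "$n" "$dir"
    done
}
```

On a storage node this could be called as
count_pending /data/brick0*/homegfs/.glusterfs/indices/xattrop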

=================== heal stats ===================

homegfs [b0-gfsib01a] : Starting time of crawl       : Thu Jan 21 12:36:45 2016
homegfs [b0-gfsib01a] : Ending time of crawl         : Thu Jan 21 12:36:45 2016
homegfs [b0-gfsib01a] : Type of crawl: INDEX
homegfs [b0-gfsib01a] : No. of entries healed        : 0
homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
homegfs [b0-gfsib01a] : No. of heal failed entries   : 0

homegfs [b1-gfsib01b] : Starting time of crawl       : Thu Jan 21 12:36:19 2016
homegfs [b1-gfsib01b] : Ending time of crawl         : Thu Jan 21 12:36:19 2016
homegfs [b1-gfsib01b] : Type of crawl: INDEX
homegfs [b1-gfsib01b] : No. of entries healed        : 0
homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
homegfs [b1-gfsib01b] : No. of heal failed entries   : 1

homegfs [b2-gfsib01a] : Starting time of crawl       : Thu Jan 21 12:36:48 2016
homegfs [b2-gfsib01a] : Ending time of crawl         : Thu Jan 21 12:36:48 2016
homegfs [b2-gfsib01a] : Type of crawl: INDEX
homegfs [b2-gfsib01a] : No. of entries healed        : 0
homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
homegfs [b2-gfsib01a] : No. of heal failed entries   : 0

homegfs [b3-gfsib01b] : Starting time of crawl       : Thu Jan 21 12:36:47 2016
homegfs [b3-gfsib01b] : Ending time of crawl         : Thu Jan 21 12:36:47 2016
homegfs [b3-gfsib01b] : Type of crawl: INDEX
homegfs [b3-gfsib01b] : No. of entries healed        : 0
homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
homegfs [b3-gfsib01b] : No. of heal failed entries   : 0

homegfs [b4-gfsib02a] : Starting time of crawl       : Thu Jan 21 12:36:06 2016
homegfs [b4-gfsib02a] : Ending time of crawl         : Thu Jan 21 12:36:06 2016
homegfs [b4-gfsib02a] : Type of crawl: INDEX
homegfs [b4-gfsib02a] : No. of entries healed        : 0
homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
homegfs [b4-gfsib02a] : No. of heal failed entries   : 0

homegfs [b5-gfsib02b] : Starting time of crawl       : Thu Jan 21 12:13:40 2016
homegfs [b5-gfsib02b] :                                *** Crawl is in progress ***
homegfs [b5-gfsib02b] : Type of crawl: INDEX
homegfs [b5-gfsib02b] : No. of entries healed        : 0
homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
homegfs [b5-gfsib02b] : No. of heal failed entries   : 0

homegfs [b6-gfsib02a] : Starting time of crawl       : Thu Jan 21 12:36:58 2016
homegfs [b6-gfsib02a] : Ending time of crawl         : Thu Jan 21 12:36:58 2016
homegfs [b6-gfsib02a] : Type of crawl: INDEX
homegfs [b6-gfsib02a] : No. of entries healed        : 0
homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
homegfs [b6-gfsib02a] : No. of heal failed entries   : 0

homegfs [b7-gfsib02b] : Starting time of crawl       : Thu Jan 21 12:36:50 2016
homegfs [b7-gfsib02b] : Ending time of crawl         : Thu Jan 21 12:36:50 2016
homegfs [b7-gfsib02b] : Type of crawl: INDEX
homegfs [b7-gfsib02b] : No. of entries healed        : 0
homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
homegfs [b7-gfsib02b] : No. of heal failed entries   : 0
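
To pick the hung crawl out of output like this without reading every block,
a small filter along these lines might help. It is only a sketch: it assumes
the statistics output keeps the "volume [brick-label] : ..." line shape shown
above, and the function name crawls_in_progress is hypothetical.

```shell
#!/bin/sh
# crawls_in_progress
# Reads heal statistics text on stdin and prints the bracketed brick label
# of every line that reports a crawl still in progress. Matching only on
# "Crawl is in" keeps it working when the mail client wraps the line.
crawls_in_progress() {
    sed -n 's/^[^[]*\[\([^]]*\)\].*Crawl is in.*/\1/p'
}
```

Typical use would be
gluster volume heal homegfs statistics | crawls_in_progress
which, for the snapshot above, should name only b5-gfsib02b.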

========================================================================================
I waited a few minutes for the heals to finish and ran the heal statistics
and info again. One file is in split-brain. Aside from the split-brain, the
load on all systems is down now and they are behaving normally.
glustershd.log is attached. What is going on?

Thu Jan 21 12:53:50 EST 2016

=================== homegfs ===================

homegfs [b0-gfsib01a] : Starting time of crawl       : Thu Jan 21 12:53:02 2016
homegfs [b0-gfsib01a] : Ending time of crawl         : Thu Jan 21 12:53:02 2016
homegfs [b0-gfsib01a] : Type of crawl: INDEX
homegfs [b0-gfsib01a] : No. of entries healed        : 0
homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
homegfs [b0-gfsib01a] : No. of heal failed entries   : 0

homegfs [b1-gfsib01b] : Starting time of crawl       : Thu Jan 21 12:53:38 2016
homegfs [b1-gfsib01b] : Ending time of crawl         : Thu Jan 21 12:53:38 2016
homegfs [b1-gfsib01b] : Type of crawl: INDEX
homegfs [b1-gfsib01b] : No. of entries healed        : 0
homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
homegfs [b1-gfsib01b] : No. of heal failed entries   : 1

homegfs [b2-gfsib01a] : Starting time of crawl       : Thu Jan 21 12:53:04 2016
homegfs [b2-gfsib01a] : Ending time of crawl         : Thu Jan 21 12:53:04 2016
homegfs [b2-gfsib01a] : Type of crawl: INDEX
homegfs [b2-gfsib01a] : No. of entries healed        : 0
homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
homegfs [b2-gfsib01a] : No. of heal failed entries   : 0

homegfs [b3-gfsib01b] : Starting time of crawl       : Thu Jan 21 12:53:04 2016
homegfs [b3-gfsib01b] : Ending time of crawl         : Thu Jan 21 12:53:04 2016
homegfs [b3-gfsib01b] : Type of crawl: INDEX
homegfs [b3-gfsib01b] : No. of entries healed        : 0
homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
homegfs [b3-gfsib01b] : No. of heal failed entries   : 0

homegfs [b4-gfsib02a] : Starting time of crawl       : Thu Jan 21 12:53:33 2016
homegfs [b4-gfsib02a] : Ending time of crawl         : Thu Jan 21 12:53:33 2016
homegfs [b4-gfsib02a] : Type of crawl: INDEX
homegfs [b4-gfsib02a] : No. of entries healed        : 0
homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
homegfs [b4-gfsib02a] : No. of heal failed entries   : 1

homegfs [b5-gfsib02b] : Starting time of crawl       : Thu Jan 21 12:53:14 2016
homegfs [b5-gfsib02b] : Ending time of crawl         : Thu Jan 21 12:53:15 2016
homegfs [b5-gfsib02b] : Type of crawl: INDEX
homegfs [b5-gfsib02b] : No. of entries healed        : 0
homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
homegfs [b5-gfsib02b] : No. of heal failed entries   : 3

homegfs [b6-gfsib02a] : Starting time of crawl       : Thu Jan 21 12:53:04 2016
homegfs [b6-gfsib02a] : Ending time of crawl         : Thu Jan 21 12:53:04 2016
homegfs [b6-gfsib02a] : Type of crawl: INDEX
homegfs [b6-gfsib02a] : No. of entries healed        : 0
homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
homegfs [b6-gfsib02a] : No. of heal failed entries   : 0

homegfs [b7-gfsib02b] : Starting time of crawl       : Thu Jan 21 12:53:09 2016
homegfs [b7-gfsib02b] : Ending time of crawl         : Thu Jan 21 12:53:09 2016
homegfs [b7-gfsib02b] : Type of crawl: INDEX
homegfs [b7-gfsib02b] : No. of entries healed        : 0
homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
homegfs [b7-gfsib02b] : No. of heal failed entries   : 0

*** gluster bug in 'gluster volume heal homegfs statistics'   ***
*** Use 'gluster volume heal homegfs info' until bug is fixed ***

Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
Number of entries: 0

Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
Number of entries: 0

Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
Number of entries: 0

Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
Number of entries: 0

Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
/users/bangell/.gconfd - Is in split-brain

Number of entries: 1

Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
/users/bangell/.gconfd - Is in split-brain

/users/bangell/.gconfd/saved_state
Number of entries: 2

Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
Number of entries: 0

Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
Number of entries: 0
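
Since both bricks of a replica pair report the same path, a de-duplicated
list of split-brain entries is easier to read than the raw output. A minimal
sketch, assuming the " - Is in split-brain" suffix shown above is how the
info output marks such entries (the function name split_brain_entries is
made up):

```shell
#!/bin/sh
# split_brain_entries
# Reads heal info text on stdin and prints each path flagged as being in
# split-brain exactly once, sorted.
split_brain_entries() {
    sed -n 's/ - Is in split-brain[[:space:]]*$//p' | sort -u
}
```

For example:
gluster volume heal homegfs info | split_brain_entries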

On Thu, Jan 21, 2016 at 11:10 AM, Pranith Kumar Karampuri <
pkarampu at redhat.com> wrote:

>
>
> On 01/21/2016 09:26 PM, Glomski, Patrick wrote:
>
> I should mention that the problem is not currently occurring and there are
> no heals (output appended). By restarting the gluster services, we can stop
> the crawl, which lowers the load for a while. Subsequent crawls seem to
> finish properly. For what it's worth, files/folders that show up in the
> 'volume heal info' output during a hung crawl don't seem to be anything
> out of the ordinary.
>
> Over the past four days, the typical time before the problem recurs after
> suppressing it in this manner is an hour. Last night when we reached out to
> you was the last time it happened and the load has been low since (a
> relief).  David believes that recursively listing the files (ls -alR or
> similar) from a client mount can force the issue to happen, but obviously
> I'd rather not unless we have some precise thing we're looking for. Let me
> know if you'd like me to attempt to drive the system unstable like that and
> what I should look for. As it's a production system, I'd rather not leave
> it in this state for long.
>
>
> Would it be possible to send the glustershd and mount logs from the past
> 4 days? I would like to see if this is because of directory self-heal
> going wild (Ravi is working on a throttling feature for 3.8, which will
> allow putting the brakes on self-heal traffic).
>
> Pranith
>
>
> [root at gfs01a xattrop]# gluster volume heal homegfs info
> Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
> Number of entries: 0
>
> Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
> Number of entries: 0
>
> Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
> Number of entries: 0
>
> Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
> Number of entries: 0
>
> Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
> Number of entries: 0
>
> Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
> Number of entries: 0
>
> Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
> Number of entries: 0
>
> Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
> Number of entries: 0
>
>
>
>
> On Thu, Jan 21, 2016 at 10:40 AM, Pranith Kumar Karampuri <
> pkarampu at redhat.com> wrote:
>
>>
>>
>> On 01/21/2016 08:25 PM, Glomski, Patrick wrote:
>>
>> Hello, Pranith. The typical behavior is that the %cpu on a glusterfsd
>> process jumps to 100% times the number of processor cores available (800%
>> or 1200%, depending on the pair of nodes involved) and the load average
>> on the machine goes very high (~20). The volume's heal statistics output
>> shows that it is crawling one of the bricks and trying to heal, but this
>> crawl hangs and never seems to finish.
>>
>>
>> The number of files in the xattrop directory varies over time, so I ran
>> the wc -l you requested periodically for some time and then started
>> including a datestamped list of the files that were in the xattrop
>> directory on each brick to see which were persistent. All bricks had files
>> in the xattrop folder, so all results are attached.
>>
>> Thanks, this info is helpful. I don't see a lot of files. Could you give
>> the output of "gluster volume heal <volname> info"? Is there any directory
>> in there which is LARGE?
>>
>> Pranith
>>
>>
>> Please let me know if there is anything else I can provide.
>>
>> Patrick
>>
>>
>> On Thu, Jan 21, 2016 at 12:01 AM, Pranith Kumar Karampuri <
>> pkarampu at redhat.com> wrote:
>>
>>> hey,
>>>        Which process is consuming so much cpu? I went through the logs
>>> you gave me. I see that the following files are in gfid mismatch state:
>>>
>>> <066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
>>> <1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>,
>>> <ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg>,
>>>
>>> Could you give me the output of
>>> "ls <brick-path>/.glusterfs/indices/xattrop | wc -l" on all the bricks
>>> which are acting this way? This will tell us the number of pending
>>> self-heals on the system.
>>>
>>> Pranith
>>>
>>>
>>> On 01/20/2016 09:26 PM, David Robinson wrote:
>>>
>>> resending with parsed logs...
>>>
>>>
>>>
>>>
>>>
>>> I am having issues with 3.6.6 where the load will spike up to 800% for
>>> one of the glusterfsd processes and the users can no longer access the
>>> system.  If I reboot the node, the heal will finish normally after a few
>>> minutes and the system will be responsive, but a few hours later the issue
>>> will start again.  It looks like it is hanging in a heal and spinning up the
>>> load on one of the bricks.  The heal gets stuck and says it is crawling and
>>> never returns.  After a few minutes of the heal saying it is crawling, the
>>> load spikes up and the mounts become unresponsive.
>>>
>>> Any suggestions on how to fix this?  It has us stopped cold, as users
>>> can no longer access the systems when the load spikes. Logs attached.
>>>
>>> System setup info is:
>>>
>>> [root at gfs01a ~]# gluster volume info homegfs
>>>
>>> Volume Name: homegfs
>>> Type: Distributed-Replicate
>>> Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
>>> Status: Started
>>> Number of Bricks: 4 x 2 = 8
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>> Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>> Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>> Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>> Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>> Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>> Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>> Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>> Options Reconfigured:
>>> performance.io-thread-count: 32
>>> performance.cache-size: 128MB
>>> performance.write-behind-window-size: 128MB
>>> server.allow-insecure: on
>>> network.ping-timeout: 42
>>> storage.owner-gid: 100
>>> geo-replication.indexing: off
>>> geo-replication.ignore-pid-check: on
>>> changelog.changelog: off
>>> changelog.fsync-interval: 3
>>> changelog.rollover-time: 15
>>> server.manage-gids: on
>>> diagnostics.client-log-level: WARNING
>>>
>>> [root at gfs01a ~]# rpm -qa | grep gluster
>>> gluster-nagios-common-0.1.1-0.el6.noarch
>>> glusterfs-fuse-3.6.6-1.el6.x86_64
>>> glusterfs-debuginfo-3.6.6-1.el6.x86_64
>>> glusterfs-libs-3.6.6-1.el6.x86_64
>>> glusterfs-geo-replication-3.6.6-1.el6.x86_64
>>> glusterfs-api-3.6.6-1.el6.x86_64
>>> glusterfs-devel-3.6.6-1.el6.x86_64
>>> glusterfs-api-devel-3.6.6-1.el6.x86_64
>>> glusterfs-3.6.6-1.el6.x86_64
>>> glusterfs-cli-3.6.6-1.el6.x86_64
>>> glusterfs-rdma-3.6.6-1.el6.x86_64
>>> samba-vfs-glusterfs-4.1.11-2.el6.x86_64
>>> glusterfs-server-3.6.6-1.el6.x86_64
>>> glusterfs-extra-xlators-3.6.6-1.el6.x86_64
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-devel mailing list
>>> Gluster-devel at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glustershd.log.xz
Type: application/x-xz
Size: 556848 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160121/1a85aba2/attachment-0001.xz>