[Gluster-devel] [Gluster-users] heal hanging
Pranith Kumar Karampuri
pkarampu at redhat.com
Fri Jan 22 01:37:14 UTC 2016
Do you have any Windows clients? I see a lot of getxattr calls for
"glusterfs.get_real_filename", which lead to full readdirs of the
directories on the brick.
Pranith
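
As a rough illustration of why such a lookup is expensive: resolving a name case-insensitively on a case-sensitive filesystem means comparing the wanted name against every entry in the directory. The sketch below is not Gluster code; real_filename is a hypothetical helper that only mimics the idea.

```shell
# Hypothetical helper (not part of Gluster): find the on-disk spelling of a
# name, ignoring case, by walking every entry in the directory.
real_filename() {
    dir=$1
    want=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue          # empty or missing directory
        name=${entry##*/}
        lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$lower" = "$want" ]; then
            printf '%s\n' "$name"            # first match wins
            return 0
        fi
    done
    return 1                                 # whole directory scanned, no match
}
```

A miss is the worst case: the whole directory has to be read before the lookup can fail, so large directories make every such call costly.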
On 01/22/2016 12:51 AM, Glomski, Patrick wrote:
> Pranith, could this kind of behavior be self-inflicted by us deleting
> files directly from the bricks? We have done that in the past to clean
> up an issue where gluster wouldn't allow us to delete from the mount.
>
> If so, is it feasible to clean them up by running a search on the
> .glusterfs directories directly and removing files with a reference
> count of 1 that are non-zero size (or directly checking the xattrs to
> be sure that it's not a DHT link).
>
> find /data/brick01a/homegfs/.glusterfs -type f -not -empty -links -2
> -exec rm -f "{}" \;
>
> Is there anything I'm inherently missing with that approach that will
> further corrupt the system?
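
Before deleting anything, a dry run of the same search may be safer. The sketch below prints candidates instead of removing them and limits the search to the two-level hex directories (xx/yy/<gfid>) where gfid hardlinks live, so housekeeping files elsewhere under .glusterfs are not matched. The function name list_orphan_candidates is made up for illustration, and that other non-empty single-link files exist under .glusterfs is an assumption.

```shell
# Dry run: print candidate files instead of deleting them.
# Usage: list_orphan_candidates /data/brick01a/homegfs/.glusterfs
list_orphan_candidates() {
    root=${1%/}
    # Same tests as the original command (regular file, non-empty, link
    # count below 2), restricted to <root>/xx/yy/... paths.
    find "$root" -type f -not -empty -links -2 \
        -path "$root/[0-9a-f][0-9a-f]/[0-9a-f][0-9a-f]/*" -print
}
```

The output can be reviewed, and the xattrs of each file checked with getfattr, before anything is removed.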
>
>
> On Thu, Jan 21, 2016 at 1:02 PM, Glomski, Patrick
> <patrick.glomski at corvidtec.com <mailto:patrick.glomski at corvidtec.com>>
> wrote:
>
> Load spiked again: ~1200%cpu on gfs02a for glusterfsd. Crawl has
> been running on one of the bricks on gfs02b for 25 min or so and
> users cannot access the volume.
>
> I re-listed the xattrop directories as well as a 'top' entry and
> heal statistics. Then I restarted the gluster services on gfs02a.
>
> =================== top ===================
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 8969 root 20 0 2815m 204m 3588 S 1181.0 0.6 591:06.93
> glusterfsd
>
> =================== xattrop ===================
> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
> xattrop-41f19453-91e4-437c-afa9-3b25614de210
> xattrop-9b815879-2f4d-402b-867c-a6d65087788c
>
> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
> xattrop-70131855-3cfb-49af-abce-9d23f57fb393
> xattrop-dfb77848-a39d-4417-a725-9beca75d78c6
>
> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
> e6e47ed9-309b-42a7-8c44-28c29b9a20f8
> xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125
> xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934
> xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0
>
> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
> xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc
> xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413
>
> /data/brick01a/homegfs/.glusterfs/indices/xattrop:
> xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531
>
> /data/brick02a/homegfs/.glusterfs/indices/xattrop:
> xattrop-7e20fdb1-5224-4b9a-be06-568708526d70
>
> /data/brick01b/homegfs/.glusterfs/indices/xattrop:
> 8034bc06-92cd-4fa5-8aaf-09039e79d2c8
> c9ce22ed-6d8b-471b-a111-b39e57f0b512
> 94fa1d60-45ad-4341-b69c-315936b51e8d
> xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7
>
> /data/brick02b/homegfs/.glusterfs/indices/xattrop:
> xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d
>
>
> =================== heal stats ===================
>
> homegfs [b0-gfsib01a] : Starting time of crawl : Thu Jan 21
> 12:36:45 2016
> homegfs [b0-gfsib01a] : Ending time of crawl : Thu Jan 21
> 12:36:45 2016
> homegfs [b0-gfsib01a] : Type of crawl: INDEX
> homegfs [b0-gfsib01a] : No. of entries healed : 0
> homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
> homegfs [b0-gfsib01a] : No. of heal failed entries : 0
>
> homegfs [b1-gfsib01b] : Starting time of crawl : Thu Jan 21
> 12:36:19 2016
> homegfs [b1-gfsib01b] : Ending time of crawl : Thu Jan 21
> 12:36:19 2016
> homegfs [b1-gfsib01b] : Type of crawl: INDEX
> homegfs [b1-gfsib01b] : No. of entries healed : 0
> homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
> homegfs [b1-gfsib01b] : No. of heal failed entries : 1
>
> homegfs [b2-gfsib01a] : Starting time of crawl : Thu Jan 21
> 12:36:48 2016
> homegfs [b2-gfsib01a] : Ending time of crawl : Thu Jan 21
> 12:36:48 2016
> homegfs [b2-gfsib01a] : Type of crawl: INDEX
> homegfs [b2-gfsib01a] : No. of entries healed : 0
> homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
> homegfs [b2-gfsib01a] : No. of heal failed entries : 0
>
> homegfs [b3-gfsib01b] : Starting time of crawl : Thu Jan 21
> 12:36:47 2016
> homegfs [b3-gfsib01b] : Ending time of crawl : Thu Jan 21
> 12:36:47 2016
> homegfs [b3-gfsib01b] : Type of crawl: INDEX
> homegfs [b3-gfsib01b] : No. of entries healed : 0
> homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
> homegfs [b3-gfsib01b] : No. of heal failed entries : 0
>
> homegfs [b4-gfsib02a] : Starting time of crawl : Thu Jan 21
> 12:36:06 2016
> homegfs [b4-gfsib02a] : Ending time of crawl : Thu Jan 21
> 12:36:06 2016
> homegfs [b4-gfsib02a] : Type of crawl: INDEX
> homegfs [b4-gfsib02a] : No. of entries healed : 0
> homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
> homegfs [b4-gfsib02a] : No. of heal failed entries : 0
>
> homegfs [b5-gfsib02b] : Starting time of crawl : Thu Jan 21
> 12:13:40 2016
> homegfs [b5-gfsib02b] : *** Crawl is in progress ***
> homegfs [b5-gfsib02b] : Type of crawl: INDEX
> homegfs [b5-gfsib02b] : No. of entries healed : 0
> homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
> homegfs [b5-gfsib02b] : No. of heal failed entries : 0
>
> homegfs [b6-gfsib02a] : Starting time of crawl : Thu Jan 21
> 12:36:58 2016
> homegfs [b6-gfsib02a] : Ending time of crawl : Thu Jan 21
> 12:36:58 2016
> homegfs [b6-gfsib02a] : Type of crawl: INDEX
> homegfs [b6-gfsib02a] : No. of entries healed : 0
> homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
> homegfs [b6-gfsib02a] : No. of heal failed entries : 0
>
> homegfs [b7-gfsib02b] : Starting time of crawl : Thu Jan 21
> 12:36:50 2016
> homegfs [b7-gfsib02b] : Ending time of crawl : Thu Jan 21
> 12:36:50 2016
> homegfs [b7-gfsib02b] : Type of crawl: INDEX
> homegfs [b7-gfsib02b] : No. of entries healed : 0
> homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
> homegfs [b7-gfsib02b] : No. of heal failed entries : 0
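
With this many bricks the one unfinished crawl is easy to miss. A small filter along these lines could pick it out. It is only a sketch: it assumes the statistics arrive on stdin one record per line (not wrapped as in this mail) in the "volume [brick-tag] : field" layout shown above, and stuck_crawls is a made-up name.

```shell
# Print the brick tag and start time of every crawl still in progress.
# Reads heal statistics in the layout shown above from stdin.
stuck_crawls() {
    awk '
        /Starting time of crawl/ {
            tag = $2                                   # e.g. [b5-gfsib02b]
            sub(/^.*Starting time of crawl *: */, "")  # keep only the timestamp
            start[tag] = $0
        }
        /Crawl is in progress/ { print $2, "started", start[$2] }
    '
}
```

For the statistics above this would report only the b5-gfsib02b crawl that began at 12:13:40.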
>
>
> ========================================================================================
> I waited a few minutes for the heals to finish and ran the heal
> statistics and info again. One file is in split-brain. Aside from
> the split-brain, the load on all systems is down now and they are
> behaving normally. glustershd.log is attached. What is going on???
>
> Thu Jan 21 12:53:50 EST 2016
>
> =================== homegfs ===================
>
> homegfs [b0-gfsib01a] : Starting time of crawl : Thu Jan 21
> 12:53:02 2016
> homegfs [b0-gfsib01a] : Ending time of crawl : Thu Jan 21
> 12:53:02 2016
> homegfs [b0-gfsib01a] : Type of crawl: INDEX
> homegfs [b0-gfsib01a] : No. of entries healed : 0
> homegfs [b0-gfsib01a] : No. of entries in split-brain: 0
> homegfs [b0-gfsib01a] : No. of heal failed entries : 0
>
> homegfs [b1-gfsib01b] : Starting time of crawl : Thu Jan 21
> 12:53:38 2016
> homegfs [b1-gfsib01b] : Ending time of crawl : Thu Jan 21
> 12:53:38 2016
> homegfs [b1-gfsib01b] : Type of crawl: INDEX
> homegfs [b1-gfsib01b] : No. of entries healed : 0
> homegfs [b1-gfsib01b] : No. of entries in split-brain: 0
> homegfs [b1-gfsib01b] : No. of heal failed entries : 1
>
> homegfs [b2-gfsib01a] : Starting time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b2-gfsib01a] : Ending time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b2-gfsib01a] : Type of crawl: INDEX
> homegfs [b2-gfsib01a] : No. of entries healed : 0
> homegfs [b2-gfsib01a] : No. of entries in split-brain: 0
> homegfs [b2-gfsib01a] : No. of heal failed entries : 0
>
> homegfs [b3-gfsib01b] : Starting time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b3-gfsib01b] : Ending time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b3-gfsib01b] : Type of crawl: INDEX
> homegfs [b3-gfsib01b] : No. of entries healed : 0
> homegfs [b3-gfsib01b] : No. of entries in split-brain: 0
> homegfs [b3-gfsib01b] : No. of heal failed entries : 0
>
> homegfs [b4-gfsib02a] : Starting time of crawl : Thu Jan 21
> 12:53:33 2016
> homegfs [b4-gfsib02a] : Ending time of crawl : Thu Jan 21
> 12:53:33 2016
> homegfs [b4-gfsib02a] : Type of crawl: INDEX
> homegfs [b4-gfsib02a] : No. of entries healed : 0
> homegfs [b4-gfsib02a] : No. of entries in split-brain: 0
> homegfs [b4-gfsib02a] : No. of heal failed entries : 1
>
> homegfs [b5-gfsib02b] : Starting time of crawl : Thu Jan 21
> 12:53:14 2016
> homegfs [b5-gfsib02b] : Ending time of crawl : Thu Jan 21
> 12:53:15 2016
> homegfs [b5-gfsib02b] : Type of crawl: INDEX
> homegfs [b5-gfsib02b] : No. of entries healed : 0
> homegfs [b5-gfsib02b] : No. of entries in split-brain: 0
> homegfs [b5-gfsib02b] : No. of heal failed entries : 3
>
> homegfs [b6-gfsib02a] : Starting time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b6-gfsib02a] : Ending time of crawl : Thu Jan 21
> 12:53:04 2016
> homegfs [b6-gfsib02a] : Type of crawl: INDEX
> homegfs [b6-gfsib02a] : No. of entries healed : 0
> homegfs [b6-gfsib02a] : No. of entries in split-brain: 0
> homegfs [b6-gfsib02a] : No. of heal failed entries : 0
>
> homegfs [b7-gfsib02b] : Starting time of crawl : Thu Jan 21
> 12:53:09 2016
> homegfs [b7-gfsib02b] : Ending time of crawl : Thu Jan 21
> 12:53:09 2016
> homegfs [b7-gfsib02b] : Type of crawl: INDEX
> homegfs [b7-gfsib02b] : No. of entries healed : 0
> homegfs [b7-gfsib02b] : No. of entries in split-brain: 0
> homegfs [b7-gfsib02b] : No. of heal failed entries : 0
>
> *** gluster bug in 'gluster volume heal homegfs statistics' ***
> *** Use 'gluster volume heal homegfs info' until bug is fixed ***
>
> Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
> Number of entries: 0
>
> Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
> Number of entries: 0
>
> Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
> Number of entries: 0
>
> Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
> Number of entries: 0
>
> Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
> /users/bangell/.gconfd - Is in split-brain
>
> Number of entries: 1
>
> Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
> /users/bangell/.gconfd - Is in split-brain
>
> /users/bangell/.gconfd/saved_state
> Number of entries: 2
>
> Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
> Number of entries: 0
>
> Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
> Number of entries: 0
>
>
>
>
> On Thu, Jan 21, 2016 at 11:10 AM, Pranith Kumar Karampuri
> <pkarampu at redhat.com <mailto:pkarampu at redhat.com>> wrote:
>
>
>
> On 01/21/2016 09:26 PM, Glomski, Patrick wrote:
>> I should mention that the problem is not currently occurring
>> and there are no heals (output appended). By restarting the
>> gluster services, we can stop the crawl, which lowers the
>> load for a while. Subsequent crawls seem to finish properly.
>> For what it's worth, files/folders that show up in the
>> 'volume info' output during a hung crawl don't seem to be
>> anything out of the ordinary.
>>
>> Over the past four days, the typical time before the problem
>> recurs after suppressing it in this manner is an hour. Last
>> night when we reached out to you was the last time it
>> happened and the load has been low since (a relief). David
>> believes that recursively listing the files (ls -alR or
>> similar) from a client mount can force the issue to happen,
>> but obviously I'd rather not unless we have some precise
>> thing we're looking for. Let me know if you'd like me to
>> attempt to drive the system unstable like that and what I
>> should look for. As it's a production system, I'd rather not
>> leave it in this state for long.
>
> Would it be possible to send the glustershd and mount logs from the
> past 4 days? I would like to see if this is because of directory
> self-heal going wild (Ravi is working on a throttling feature
> for 3.8, which will allow putting the brakes on self-heal traffic).
>
> Pranith
>
>>
>> [root at gfs01a xattrop]# gluster volume heal homegfs info
>> Brick gfs01a.corvidtec.com:/data/brick01a/homegfs/
>> Number of entries: 0
>>
>> Brick gfs01b.corvidtec.com:/data/brick01b/homegfs/
>> Number of entries: 0
>>
>> Brick gfs01a.corvidtec.com:/data/brick02a/homegfs/
>> Number of entries: 0
>>
>> Brick gfs01b.corvidtec.com:/data/brick02b/homegfs/
>> Number of entries: 0
>>
>> Brick gfs02a.corvidtec.com:/data/brick01a/homegfs/
>> Number of entries: 0
>>
>> Brick gfs02b.corvidtec.com:/data/brick01b/homegfs/
>> Number of entries: 0
>>
>> Brick gfs02a.corvidtec.com:/data/brick02a/homegfs/
>> Number of entries: 0
>>
>> Brick gfs02b.corvidtec.com:/data/brick02b/homegfs/
>> Number of entries: 0
>>
>>
>>
>>
>> On Thu, Jan 21, 2016 at 10:40 AM, Pranith Kumar Karampuri
>> <pkarampu at redhat.com <mailto:pkarampu at redhat.com>> wrote:
>>
>>
>>
>> On 01/21/2016 08:25 PM, Glomski, Patrick wrote:
>>> Hello, Pranith. The typical behavior is that the %cpu on
>>> a glusterfsd process jumps to number of processor cores
>>> available (800% or 1200%, depending on the pair of nodes
>>> involved) and the load average on the machine goes very
>>> high (~20). The volume's heal statistics output shows
>>> that it is crawling one of the bricks and trying to
>>> heal, but this crawl hangs and never seems to finish.
>>>
>>> The number of files in the xattrop directory varies over
>>> time, so I ran a wc -l as you requested periodically for
>>> some time and then started including a datestamped list
>>> of the files that were in the xattrops directory on each
>>> brick to see which were persistent. All bricks had files
>>> in the xattrop folder, so all results are attached.
>> Thanks, this info is helpful. I don't see a lot of files.
>> Could you give the output of "gluster volume heal <volname>
>> info"? Is there any directory in there which is LARGE?
>>
>> Pranith
>>
>>>
>>> Please let me know if there is anything else I can provide.
>>>
>>> Patrick
>>>
>>>
>>> On Thu, Jan 21, 2016 at 12:01 AM, Pranith Kumar
>>> Karampuri <pkarampu at redhat.com
>>> <mailto:pkarampu at redhat.com>> wrote:
>>>
>>> hey,
>>> Which process is consuming so much cpu? I
>>> went through the logs you gave me. I see that the
>>> following files are in gfid mismatch state:
>>>
>>> <066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
>>> <1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>,
>>> <ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg>,
>>>
>>> Could you give me the output of "ls
>>> <brick-path>/.glusterfs/indices/xattrop | wc -l" on all
>>> the bricks which are acting this way? This will tell
>>> us the number of pending self-heals on the system.
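
To collect that count for several bricks at once, something like the following could be run on each server. It is a sketch under assumptions: the index directory is taken to be <brick>/.glusterfs/indices/xattrop, as in the listings elsewhere in this thread, entries named xattrop-<uuid> are assumed to be the index's base file rather than pending heals, and count_xattrop_entries is a made-up name.

```shell
# Count index entries per brick.
# Usage: count_xattrop_entries /data/brick01a/homegfs /data/brick02a/homegfs
count_xattrop_entries() {
    for brick in "$@"; do
        dir="$brick/.glusterfs/indices/xattrop"
        total=0
        gfids=0
        for entry in "$dir"/*; do
            [ -e "$entry" ] || continue      # empty or missing directory
            total=$((total + 1))
            case "${entry##*/}" in
                xattrop-*) ;;                # assumed base file, not counted
                *) gfids=$((gfids + 1)) ;;   # gfid-named entry
            esac
        done
        printf '%s total=%s gfid_named=%s\n' "$dir" "$total" "$gfids"
    done
}
```

Running it periodically and keeping the output makes it easier to see which entries persist between crawls.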
>>>
>>> Pranith
>>>
>>>
>>> On 01/20/2016 09:26 PM, David Robinson wrote:
>>>> resending with parsed logs...
>>>>>> I am having issues with 3.6.6 where the load will
>>>>>> spike up to 800% for one of the glusterfsd
>>>>>> processes and the users can no longer access the
>>>>>> system. If I reboot the node, the heal will
>>>>>> finish normally after a few minutes and the
>>>>>> system will be responsive, but a few hours later
>>>>>> the issue will start again. It looks like it is
>>>>>> hanging in a heal and spinning up the load on one
>>>>>> of the bricks. The heal gets stuck and says it
>>>>>> is crawling and never returns. After a few
>>>>>> minutes of the heal saying it is crawling, the
>>>>>> load spikes up and the mounts become unresponsive.
>>>>>> Any suggestions on how to fix this? It has us
>>>>>> stopped cold as the user can no longer access the
>>>>>> systems when the load spikes... Logs attached.
>>>>>> System setup info is:
>>>>>> [root at gfs01a ~]# gluster volume info homegfs
>>>>>>
>>>>>> Volume Name: homegfs
>>>>>> Type: Distributed-Replicate
>>>>>> Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
>>>>>> Status: Started
>>>>>> Number of Bricks: 4 x 2 = 8
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
>>>>>> Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
>>>>>> Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
>>>>>> Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
>>>>>> Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
>>>>>> Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
>>>>>> Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
>>>>>> Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
>>>>>> Options Reconfigured:
>>>>>> performance.io-thread-count: 32
>>>>>> performance.cache-size: 128MB
>>>>>> performance.write-behind-window-size: 128MB
>>>>>> server.allow-insecure: on
>>>>>> network.ping-timeout: 42
>>>>>> storage.owner-gid: 100
>>>>>> geo-replication.indexing: off
>>>>>> geo-replication.ignore-pid-check: on
>>>>>> changelog.changelog: off
>>>>>> changelog.fsync-interval: 3
>>>>>> changelog.rollover-time: 15
>>>>>> server.manage-gids: on
>>>>>> diagnostics.client-log-level: WARNING
>>>>>> [root at gfs01a ~]# rpm -qa | grep gluster
>>>>>> gluster-nagios-common-0.1.1-0.el6.noarch
>>>>>> glusterfs-fuse-3.6.6-1.el6.x86_64
>>>>>> glusterfs-debuginfo-3.6.6-1.el6.x86_64
>>>>>> glusterfs-libs-3.6.6-1.el6.x86_64
>>>>>> glusterfs-geo-replication-3.6.6-1.el6.x86_64
>>>>>> glusterfs-api-3.6.6-1.el6.x86_64
>>>>>> glusterfs-devel-3.6.6-1.el6.x86_64
>>>>>> glusterfs-api-devel-3.6.6-1.el6.x86_64
>>>>>> glusterfs-3.6.6-1.el6.x86_64
>>>>>> glusterfs-cli-3.6.6-1.el6.x86_64
>>>>>> glusterfs-rdma-3.6.6-1.el6.x86_64
>>>>>> samba-vfs-glusterfs-4.1.11-2.el6.x86_64
>>>>>> glusterfs-server-3.6.6-1.el6.x86_64
>>>>>> glusterfs-extra-xlators-3.6.6-1.el6.x86_64
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-devel mailing list
>>>> Gluster-devel at gluster.org <mailto:Gluster-devel at gluster.org>
>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> <mailto:Gluster-users at gluster.org>
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>
>>
>
>
>