[Gluster-users] Self Heal Confusion

Brett Holcomb biholcomb at l1049h.com
Thu Dec 27 22:19:50 UTC 2018


Resending as I did not reply to the list earlier; Thunderbird responded to the
poster and not to the list.

On 12/27/18 11:46 AM, Brett Holcomb wrote:
>
> Thank you, I appreciate the help.  Here is the information.  Let me 
> know if you need anything else.  I'm fairly new to Gluster.
>
> Gluster version is 5.2
>
> 1. gluster v info
>
> Volume Name: projects
> Type: Distributed-Replicate
> Volume ID: 5aac71aa-feaa-44e9-a4f9-cb4dd6e0fdc3
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x 3 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gfssrv1:/srv/gfs01/Projects
> Brick2: gfssrv2:/srv/gfs01/Projects
> Brick3: gfssrv3:/srv/gfs01/Projects
> Brick4: gfssrv4:/srv/gfs01/Projects
> Brick5: gfssrv5:/srv/gfs01/Projects
> Brick6: gfssrv6:/srv/gfs01/Projects
> Options Reconfigured:
> cluster.self-heal-daemon: enable
> performance.quick-read: off
> performance.parallel-readdir: off
> performance.readdir-ahead: off
> performance.write-behind: off
> performance.read-ahead: off
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> server.allow-insecure: on
> storage.build-pgfid: on
> changelog.changelog: on
> changelog.capture-del-path: on
>
> 2.  gluster v status
>
> Status of volume: projects
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick gfssrv1:/srv/gfs01/Projects           49154     0 Y       7213
> Brick gfssrv2:/srv/gfs01/Projects           49154     0 Y       6932
> Brick gfssrv3:/srv/gfs01/Projects           49154     0 Y       6920
> Brick gfssrv4:/srv/gfs01/Projects           49154     0 Y       6732
> Brick gfssrv5:/srv/gfs01/Projects           49154     0 Y       6950
> Brick gfssrv6:/srv/gfs01/Projects           49154     0 Y       6879
> Self-heal Daemon on localhost               N/A       N/A Y       11484
> Self-heal Daemon on gfssrv2                 N/A       N/A Y       10366
> Self-heal Daemon on gfssrv4                 N/A       N/A Y       9872
> Self-heal Daemon on srv-1-gfs3.corp.l1049h.net  N/A      N/A Y       9892
> Self-heal Daemon on gfssrv6                 N/A       N/A Y       10372
> Self-heal Daemon on gfssrv5                 N/A       N/A Y       10761
>
> Task Status of Volume projects
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> 3. I've given the summary since the actual list for two volumes is 
> around 5900 entries.
>
> Brick gfssrv1:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 85
> Number of entries in heal pending: 85
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick gfssrv2:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick gfssrv3:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick gfssrv4:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick gfssrv5:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 58854
> Number of entries in heal pending: 58854
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick gfssrv6:/srv/gfs01/Projects
> Status: Connected
> Total Number of entries: 58854
> Number of entries in heal pending: 58854
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
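>
> (For completeness: the per-brick summary above was produced with the 
> heal-info summary command - roughly, using the volume name from the 
> info output above:
>
> gluster volume heal projects info summary
>
> The full per-entry list comes from the same command without the 
> summary keyword.)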
>
> On 12/27/18 3:09 AM, Ashish Pandey wrote:
>> Hi Brett,
>>
>> Could you please tell us more about the setup?
>>
>> 1 - gluster v info
>> 2 - gluster v status
>> 3 - gluster v heal <volname> info
>>
>> This is the basic information needed to start debugging or to 
>> suggest any workaround.  It should always be included when asking 
>> such questions on the mailing list so that people can reply sooner.
>>
>>
>> Note: Please hide IP addresses, hostnames, or any other information 
>> you don't want the world to see.
>>
>> ---
>> Ashish
>>
>> ------------------------------------------------------------------------
>> *From: *"Brett Holcomb" <biholcomb at l1049h.com>
>> *To: *gluster-users at gluster.org
>> *Sent: *Thursday, December 27, 2018 12:19:15 AM
>> *Subject: *Re: [Gluster-users] Self Heal Confusion
>>
>> Still no change in the pending heals.  I found this reference, 
>> https://archive.fosdem.org/2017/schedule/event/glusterselinux/attachments/slides/1876/export/events/attachments/glusterselinux/slides/1876/fosdem.pdf, 
>> which mentions the default SELinux context for a brick and says that 
>> internal operations such as self-heal and rebalance should be ignored, 
>> but it does not elaborate on what "ignored" means - is it just 
>> skipping self-heal, or something else?
>>
>> I did set SELinux to permissive and nothing changed.  I'll try setting 
>> the bricks to the context mentioned in that PDF and see what happens.
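>>
>> For reference, applying that context to a brick would look roughly 
>> like the following (a sketch, assuming the glusterd_brick_t type 
>> mentioned in the slides and the /srv/gfs01 brick path above; adjust 
>> for your own layout):
>>
>>     # add a permanent SELinux file-context rule for the brick path
>>     semanage fcontext -a -t glusterd_brick_t "/srv/gfs01(/.*)?"
>>     # relabel the existing files under the brick
>>     restorecon -Rv /srv/gfs01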
>>
>>
>> On 12/20/18 8:26 PM, John Strunk wrote:
>>
>>     Assuming your bricks are up... yes, the heal count should be
>>     decreasing.
>>
>>     There is/was a bug wherein self-heal would stop healing but would
>>     still be running. I don't know whether your version is affected,
>>     but the remedy is to just restart the self-heal daemon.
>>     Force start one of the volumes that has heals pending. The bricks
>>     are already running, but it will cause shd to restart and,
>>     assuming this is the problem, healing should begin...
>>
>>     $ gluster vol start my-pending-heal-vol force
>>
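>>     To confirm that shd actually restarted and that healing is 
>>     progressing, something along these lines should work (the volume 
>>     name is just the placeholder from the command above):
>>
>>     $ gluster vol status my-pending-heal-vol
>>     $ gluster vol heal my-pending-heal-vol info summary
>>
>>     The status output should show a new PID for the Self-heal Daemon, 
>>     and the pending-heal counts should start dropping.
>>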
>>     Others could better comment on the status of the bug.
>>
>>     -John
>>
>>
>>     On Thu, Dec 20, 2018 at 5:45 PM Brett Holcomb
>>     <biholcomb at l1049h.com <mailto:biholcomb at l1049h.com>> wrote:
>>
>>         I have one volume with 85 entries pending heal and two more
>>         volumes with 58,854 entries pending heal.  These numbers are
>>         from the volume heal info summary command and have stayed
>>         constant for two days now.  I've read the Gluster docs and
>>         many others; the Gluster docs just give some commands, and
>>         the non-Gluster docs basically repeat them.  Given that no
>>         self-healing appears to be happening on my volumes, I am
>>         confused as to why.
>>
>>         1.  If a self-heal daemon is listed on a host (all of mine
>>         show one in the volume status output), can I assume it is
>>         enabled and running?
>>
>>         2.  I assume the volume that has all the self-heals pending
>>         has some serious issues, even though I can access the files
>>         and directories on it.  If self-heal is running, shouldn't
>>         the numbers be decreasing?
>>
>>         It appears to me that self-heal is not working properly, so
>>         how do I get it to start working, or should I delete the
>>         volume and start over?
>>
>>         I'm running Gluster 5.2 on CentOS 7, latest and fully updated.
>>
>>         Thank you.
>>
>>
>>         _______________________________________________
>>         Gluster-users mailing list
>>         Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>         https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>