[Gluster-users] 3.8.3 Shards Healing Glacier Slow

David Gossage dgossage at carouselchecks.com
Wed Aug 31 15:51:51 UTC 2016


test server
[root at ccengine2 ~]# ls -l /gluster2/brick?/1/.glusterfs/indices/xattrop/
/gluster2/brick1/1/.glusterfs/indices/xattrop/:
total 1
----------. 1 root root 0 Aug 31 08:48
xattrop-542c3fdb-add6-4efa-8cde-288991935eee

/gluster2/brick2/1/.glusterfs/indices/xattrop/:
total 1
----------. 2 root root 0 Aug 30 09:33 00000000-0000-0000-0000-000000000001
----------. 2 root root 0 Aug 30 09:33
xattrop-58f5b39b-0935-4153-b85b-4f4b2724906f

/gluster2/brick3/1/.glusterfs/indices/xattrop/:
total 1
----------. 2 root root 0 Aug 30 09:40 00000000-0000-0000-0000-000000000001
----------. 2 root root 0 Aug 30 09:40
xattrop-6a58e1ac-dfdb-4f6e-93d3-5f02d49bf94b


Do I need to do something about these entries?  Two are from yesterday,
probably during some of the testing, and one is from this morning.
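(The names in indices/xattrop other than the xattrop-<uuid> base file are the GFIDs of
entries with pending heals; 00000000-0000-0000-0000-000000000001 is the GFID of the volume
root. A rough sketch for mapping a pending GFID back to a path on a brick through the
.glusterfs hardlink tree, assuming the GFID refers to a regular file -- directories are
symlinks under .glusterfs instead:

GFID=<pending-gfid-from-xattrop>
BRICK=/gluster2/brick2/1
find "$BRICK" -samefile "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID" -not -path '*/.glusterfs/*'
)
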
gluster volume heal glustershard statistics heal-count
Gathering count of entries to be healed on volume glustershard has been
successful

Brick 192.168.71.10:/gluster2/brick1/1
Number of entries: 0

Brick 192.168.71.11:/gluster2/brick2/1
Number of entries: 1

Brick 192.168.71.12:/gluster2/brick3/1
Number of entries: 1

getfattr -d -m . -e hex /gluster2/brick?/1/
getfattr: Removing leading '/' from absolute path names
# file: gluster2/brick1/1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
user.some-name=0x736f6d652d76616c7565

# file: gluster2/brick2/1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.glustershard-client-0=0x000000010000000000000000
trusted.afr.glustershard-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
user.some-name=0x736f6d652d76616c7565

# file: gluster2/brick3/1/
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.glustershard-client-0=0x000000010000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
user.some-name=0x736f6d652d76616c7565
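
(For reference on reading those values: the 24 hex digits of a trusted.afr.<volume>-client-N
attribute are three 32-bit counters of operations still pending against brick N, counting
from 0 -- data, then metadata, then entry. So 0x000000010000000000000000 on bricks 2 and 3
above records one pending data operation against brick1, and an all-zero value means nothing
pending. A small decoding sketch, assuming the value is pasted in from getfattr:

v=000000010000000000000000
echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
)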




*David Gossage*
*Carousel Checks Inc. | System Administrator*
*Office* 708.613.2284

On Wed, Aug 31, 2016 at 9:43 AM, David Gossage <dgossage at carouselchecks.com>
wrote:

> Just as a test I did not shut down the one VM on the cluster, since finding a
> window before the weekend where I can shut down all VMs and fit in a full heal
> is unlikely, so I wanted to see what occurs.
>
>
> kill -15 brick pid
> rm -Rf /gluster2/brick1/1
> mkdir /gluster2/brick1/1
> mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake3
> setfattr -n "user.some-name" -v "some-value" /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard
>
> getfattr -d -m . -e hex /gluster2/brick2/1
> # file: gluster2/brick2/1
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000001
> trusted.afr.glustershard-client-0=0x000000000000000200000000
> trusted.afr.glustershard-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> getfattr -d -m . -e hex /gluster2/brick3/1
> # file: gluster2/brick3/1
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000001
> trusted.afr.glustershard-client-0=0x000000000000000200000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> setfattr -n trusted.afr.glustershard-client-0 -v
> 0x000000010000000200000000 /gluster2/brick2/1
> setfattr -n trusted.afr.glustershard-client-0 -v
> 0x000000010000000200000000 /gluster2/brick3/1
>
> getfattr -d -m . -e hex /gluster2/brick3/1/
> getfattr: Removing leading '/' from absolute path names
> # file: gluster2/brick3/1/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000200000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> getfattr -d -m . -e hex /gluster2/brick2/1/
> getfattr: Removing leading '/' from absolute path names
> # file: gluster2/brick2/1/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000200000000
> trusted.afr.glustershard-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> gluster v start glustershard force
>
> gluster heal counts climbed up and down a little as it healed everything
> visible in the gluster mount and the .glusterfs entries for those files, then
> stalled with around 15 shards and the fake3 directory still in the list
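>
> (For monitoring, the count can be polled periodically; same volume name assumed:
> watch -n 30 'gluster volume heal glustershard statistics heal-count' )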
>
> getfattr -d -m . -e hex /gluster2/brick2/1/
> getfattr: Removing leading '/' from absolute path names
> # file: gluster2/brick2/1/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000000000000
> trusted.afr.glustershard-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> getfattr -d -m . -e hex /gluster2/brick3/1/
> getfattr: Removing leading '/' from absolute path names
> # file: gluster2/brick3/1/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> getfattr -d -m . -e hex /gluster2/brick1/1/
> getfattr: Removing leading '/' from absolute path names
> # file: gluster2/brick1/1/
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
> 23a756e6c6162656c65645f743a733000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> heal count stayed the same for a while, then I ran
>
> gluster v heal glustershard full
>
> heals jump up to 700 as shards actually get read in as needing heals.
> glustershd shows 3 sweeps started, one per brick
>
> It heals the shards and things look OK; heal <> info shows 0 files, but statistics
> heal-count shows 1 left for bricks 2 and 3. Perhaps because I didn't stop the VM
> that was running?
>
> # file: gluster2/brick1/1/
> security.selinux=0x756e636f6e66696e65645f753a6f
> 626a6563745f723a756e6c6162656c65645f743a733000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> # file: gluster2/brick2/1/
> security.selinux=0x756e636f6e66696e65645f753a6f
> 626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000000000000
> trusted.afr.glustershard-client-2=0x000000000000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> # file: gluster2/brick3/1/
> security.selinux=0x756e636f6e66696e65645f753a6f
> 626a6563745f723a756e6c6162656c65645f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.glustershard-client-0=0x000000010000000000000000
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0x5889332e50ba441e8fa5cce3ae6f3a15
> user.some-name=0x736f6d652d76616c7565
>
> Metadata split-brain?  heal <> info split-brain shows no files or
> entries.  If I had thought ahead I would have checked the values returned
> by getfattr beforehand, although I do know heal-count was returning 0 at the
> time.
>
>
> I'm assuming I need to shut down the VMs and put the volume in maintenance from oVirt
> to prevent any I/O.  Does that need to last for the whole heal, or can I
> re-activate at some point to bring the VMs back up?
>
>
>
>
> *David Gossage*
> *Carousel Checks Inc. | System Administrator*
> *Office* 708.613.2284
>
> On Wed, Aug 31, 2016 at 3:50 AM, Krutika Dhananjay <kdhananj at redhat.com>
> wrote:
>
>> No, sorry, it's working fine. I may have missed some step, which is why
>> I saw that problem. /.shard is also healing fine now.
>>
>> Let me know if it works for you.
>>
>> -Krutika
>>
>> On Wed, Aug 31, 2016 at 12:49 PM, Krutika Dhananjay <kdhananj at redhat.com>
>> wrote:
>>
>>> OK I just hit the other issue too, where .shard doesn't get healed. :)
>>>
>>> Investigating as to why that is the case. Give me some time.
>>>
>>> -Krutika
>>>
>>> On Wed, Aug 31, 2016 at 12:39 PM, Krutika Dhananjay <kdhananj at redhat.com
>>> > wrote:
>>>
>>>> Just figured out that the steps Anuradha provided won't work if granular
>>>> entry heal is on.
>>>> So when you bring down a brick and create fake2 under / of the volume, the
>>>> granular entry heal feature causes the self-heal daemon to remember only the
>>>> fact that 'fake2' needs to be recreated on the
>>>> offline brick (because changelogs are granular).
>>>>
>>>> In this case, we would be required to indicate to self-heal-daemon that
>>>> the entire directory tree from '/' needs to be repaired on the brick that
>>>> contains no data.
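>>>>
>>>> (Whether granular entry heal is in effect can be checked from the volume options,
>>>> for example:
>>>>
>>>> gluster volume info <VOLNAME> | grep granular-entry-heal
>>>>
>>>> which should show "cluster.granular-entry-heal: on" for the volumes in this thread.)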
>>>>
>>>> To fix this, I did the following (for users who use granular entry
>>>> self-healing):
>>>>
>>>> 1. Kill the last brick process in the replica (/bricks/3)
>>>>
>>>> 2. [root at server-3 ~]# rm -rf /bricks/3
>>>>
>>>> 3. [root at server-3 ~]# mkdir /bricks/3
>>>>
>>>> 4. Create a new dir on the mount point:
>>>>     [root at client-1 ~]# mkdir /mnt/fake
>>>>
>>>> 5. Set some fake xattr on the root of the volume, and not the 'fake'
>>>> directory itself.
>>>>     [root at client-1 ~]# setfattr -n "user.some-name" -v "some-value"
>>>> /mnt
>>>>
>>>> 6. Make sure there's no io happening on your volume.
>>>>
>>>> 7. Check the pending xattrs on the brick directories of the two good
>>>> copies (on bricks 1 and 2); you should see the same value as the
>>>> trusted.afr.*-client-* line highlighted below on both bricks.
>>>> (Note that the client-<num> xattr key will have the same last digit as
>>>> the index of the brick that is down, counting from 0. So if the first
>>>> brick is the one that is down, it would read trusted.afr.*-client-0; if the
>>>> second brick is the one that is empty and down, it would read
>>>> trusted.afr.*-client-1, and so on.)
>>>>
>>>> [root at server-1 ~]# getfattr -d -m . -e hex /bricks/1
>>>> # file: 1
>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
>>>> 23a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> *trusted.afr.rep-client-2=0x000000000000000100000001*
>>>> trusted.gfid=0x00000000000000000000000000000001
>>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>>>>
>>>> [root at server-2 ~]# getfattr -d -m . -e hex /bricks/2
>>>> # file: 2
>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
>>>> 23a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> *trusted.afr.rep-client-2=0x000000000000000100000001*
>>>> trusted.gfid=0x00000000000000000000000000000001
>>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>>>>
>>>> 8. Flip the 8th digit in the trusted.afr.<VOLNAME>-client-2 to a 1.
>>>>
>>>> [root at server-1 ~]# setfattr -n trusted.afr.rep-client-2 -v
>>>> 0x000000010000000100000001 /bricks/1
>>>> [root at server-2 ~]# setfattr -n trusted.afr.rep-client-2 -v
>>>> 0x000000010000000100000001 /bricks/2
>>>>
>>>> 9. Get the xattrs again and check the xattrs are set properly now
>>>>
>>>> [root at server-1 ~]# getfattr -d -m . -e hex /bricks/1
>>>> # file: 1
>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
>>>> 23a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> *trusted.afr.rep-client-2=0x000000010000000100000001*
>>>> trusted.gfid=0x00000000000000000000000000000001
>>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>>>>
>>>> [root at server-2 ~]# getfattr -d -m . -e hex /bricks/2
>>>> # file: 2
>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f7
>>>> 23a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> *trusted.afr.rep-client-2=0x000000010000000100000001*
>>>> trusted.gfid=0x00000000000000000000000000000001
>>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>>>>
>>>> 10. Force-start the volume.
>>>>
>>>> [root at server-1 ~]# gluster volume start rep force
>>>> volume start: rep: success
>>>>
>>>> 11. Monitor heal-info command to ensure the number of entries keeps
>>>> growing.
>>>>
>>>> 12. Keep monitoring as in step 11, and eventually the number of entries
>>>> needing heal should come down to 0.
>>>> Also the checksums of the files on the previously empty brick should
>>>> now match with the copies on the other two bricks.
>>>>
>>>> Could you check if the above steps work for you, in your test
>>>> environment?
>>>>
>>>> You caught a nice bug in the manual steps to follow when granular
>>>> entry-heal is enabled and an empty brick needs heal. Thanks for reporting
>>>> it. :) We will fix the documentation appropriately.
>>>>
>>>> -Krutika
>>>>
>>>>
>>>> On Wed, Aug 31, 2016 at 11:29 AM, Krutika Dhananjay <
>>>> kdhananj at redhat.com> wrote:
>>>>
>>>>> Tried this.
>>>>>
>>>>> With me, only 'fake2' gets healed after I bring the 'empty' brick back
>>>>> up, and it stops there unless I do a 'heal-full'.
>>>>>
>>>>> Is that what you're seeing as well?
>>>>>
>>>>> -Krutika
>>>>>
>>>>> On Wed, Aug 31, 2016 at 4:43 AM, David Gossage <
>>>>> dgossage at carouselchecks.com> wrote:
>>>>>
>>>>>> Same issue: brought up glusterd on the problem node and the heal count is still
>>>>>> stuck at 6330.
>>>>>>
>>>>>> Ran gluster v heal GLUSTER1 full
>>>>>>
>>>>>> glustershd on the problem node shows a sweep starting and finishing in
>>>>>> seconds.  The other 2 nodes show no activity in their logs.  They should start a
>>>>>> sweep too, shouldn't they?
>>>>>>
>>>>>> Tried starting from scratch
>>>>>>
>>>>>> kill -15 brickpid
>>>>>> rm -Rf /brick
>>>>>> mkdir -p /brick
>>>>>> mkdir /gsmount/fake2
>>>>>> setfattr -n "user.some-name" -v "some-value" /gsmount/fake2
>>>>>>
>>>>>> Heals visible dirs instantly then stops.
>>>>>>
>>>>>> gluster v heal GLUSTER1 full
>>>>>>
>>>>>> see sweep start on problem node and end almost instantly.  No files
>>>>>> added to heal list, no files healed, no more logging
>>>>>>
>>>>>> [2016-08-30 23:11:31.544331] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> starting full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-30 23:11:33.776235] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>>>
>>>>>> same results no matter which node you run the command on.  Still stuck
>>>>>> with 6330 files showing as needing heal out of 19k.  Logs still show
>>>>>> no heals are occurring.
>>>>>>
>>>>>> Is there a way to forcibly reset any prior heal data?  Could it be
>>>>>> stuck on some past failed heal start?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *David Gossage*
>>>>>> *Carousel Checks Inc. | System Administrator*
>>>>>> *Office* 708.613.2284
>>>>>>
>>>>>> On Tue, Aug 30, 2016 at 10:03 AM, David Gossage <
>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>
>>>>>>> On Tue, Aug 30, 2016 at 10:02 AM, David Gossage <
>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>
>>>>>>>> updated test server to 3.8.3
>>>>>>>>
>>>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>>>> Options Reconfigured:
>>>>>>>> cluster.granular-entry-heal: on
>>>>>>>> performance.readdir-ahead: on
>>>>>>>> performance.read-ahead: off
>>>>>>>> nfs.disable: on
>>>>>>>> nfs.addr-namelookup: off
>>>>>>>> nfs.enable-ino32: off
>>>>>>>> cluster.background-self-heal-count: 16
>>>>>>>> cluster.self-heal-window-size: 1024
>>>>>>>> performance.quick-read: off
>>>>>>>> performance.io-cache: off
>>>>>>>> performance.stat-prefetch: off
>>>>>>>> cluster.eager-lock: enable
>>>>>>>> network.remote-dio: on
>>>>>>>> cluster.quorum-type: auto
>>>>>>>> cluster.server-quorum-type: server
>>>>>>>> storage.owner-gid: 36
>>>>>>>> storage.owner-uid: 36
>>>>>>>> server.allow-insecure: on
>>>>>>>> features.shard: on
>>>>>>>> features.shard-block-size: 64MB
>>>>>>>> performance.strict-o-direct: off
>>>>>>>> cluster.locking-scheme: granular
>>>>>>>>
>>>>>>>> kill -15 brickpid
>>>>>>>> rm -Rf /gluster2/brick3
>>>>>>>> mkdir -p /gluster2/brick3/1
>>>>>>>> mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>>>>>>> setfattr -n "user.some-name" -v "some-value"
>>>>>>>> /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>>>>>>> gluster v start glustershard force
>>>>>>>>
>>>>>>>> at this point the brick process starts and all visible files, including the
>>>>>>>> new dir, are created on the brick.
>>>>>>>> A handful of shards are still in the heal statistics, but no .shard
>>>>>>>> directory is created and there is no increase in shard count.
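>>>>>>>>
>>>>>>>> (This can be confirmed on the rebuilt brick directly -- brick path assumed from above:
>>>>>>>>
>>>>>>>> ls -ld /gluster2/brick3/1/.shard
>>>>>>>> ls /gluster2/brick3/1/.shard 2>/dev/null | wc -l
>>>>>>>>
>>>>>>>> the first should fail and the second print 0 until shard heal actually begins.)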
>>>>>>>>
>>>>>>>> gluster v heal glustershard
>>>>>>>>
>>>>>>>> At this point there is still no increase in count, no dir created, and no
>>>>>>>> additional healing activity generated in the logs.  Waited a few minutes tailing
>>>>>>>> logs to check if anything kicked in.
>>>>>>>>
>>>>>>>> gluster v heal glustershard full
>>>>>>>>
>>>>>>>> gluster shards are added to the list and heal commences.  Logs show a full
>>>>>>>> sweep starting on all 3 nodes, though this time it only shows as finishing
>>>>>>>> on one, which looks to be the one that had its brick deleted.
>>>>>>>>
>>>>>>>> [2016-08-30 14:45:33.098589] I [MSGID: 108026]
>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>> glustershard-client-0
>>>>>>>> [2016-08-30 14:45:33.099492] I [MSGID: 108026]
>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>> glustershard-client-1
>>>>>>>> [2016-08-30 14:45:33.100093] I [MSGID: 108026]
>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>> glustershard-client-2
>>>>>>>> [2016-08-30 14:52:29.760213] I [MSGID: 108026]
>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>>>> glustershard-client-2
>>>>>>>>
>>>>>>>
>>>>>>> Just realized it's still healing, so that may be why the sweeps on the 2 other
>>>>>>> bricks haven't reported as finished.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> My hope is that later tonight a full heal will work on production.
>>>>>>>> Is it possible the self-heal daemon can get stale or stop listening but still
>>>>>>>> show as active?  Would stopping and starting the self-heal daemon from the
>>>>>>>> gluster CLI before doing these heals be helpful?
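>>>>>>>>
>>>>>>>> (If it comes to that, the daemon can be bounced per volume from the CLI -- a
>>>>>>>> sketch using the cluster.self-heal-daemon volume option:
>>>>>>>>
>>>>>>>> gluster volume set <VOLNAME> cluster.self-heal-daemon off
>>>>>>>> gluster volume set <VOLNAME> cluster.self-heal-daemon on
>>>>>>>> gluster volume heal <VOLNAME> full
>>>>>>>> )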
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2016 at 9:29 AM, David Gossage <
>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2016 at 8:52 AM, David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <
>>>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <
>>>>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <
>>>>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <
>>>>>>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you also share the glustershd logs?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll get them when I get to work sure
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I tried the same steps that you mentioned multiple times, but
>>>>>>>>>>>>>> heal is running to completion without any issues.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It must be said that 'heal full' traverses the files and
>>>>>>>>>>>>>> directories in a depth-first order and does heals also in the same order.
>>>>>>>>>>>>>> But if it gets interrupted in the middle (say because self-heal-daemon was
>>>>>>>>>>>>>> either intentionally or unintentionally brought offline and then brought
>>>>>>>>>>>>>> back up), self-heal will only pick up the entries that are so far marked as
>>>>>>>>>>>>>> new-entries that need heal which it will find in indices/xattrop directory.
>>>>>>>>>>>>>> What this means is that those files and directories that were not visited
>>>>>>>>>>>>>> during the crawl, will remain untouched and unhealed in this second
>>>>>>>>>>>>>> iteration of heal, unless you execute a 'heal-full' again.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So should it start healing shards as it crawls, or not until
>>>>>>>>>>>>> after it crawls the entire .shard directory?  At the pace it was going, that
>>>>>>>>>>>>> could be a week with one node appearing in the cluster but having no shard
>>>>>>>>>>>>> files if anything tries to access a file on that node.  From my experience the
>>>>>>>>>>>>> other day, telling it to heal full again did nothing regardless of which node I used.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Crawl is started from '/' of the volume. Whenever self-heal
>>>>>>>>>>> detects during the crawl that a file or directory is present in some
>>>>>>>>>>> brick(s) and absent in others, it creates the file on the bricks where it
>>>>>>>>>>> is absent and marks the fact that the file or directory might need
>>>>>>>>>>> data/entry and metadata heal too (this also means that an index is created
>>>>>>>>>>> under .glusterfs/indices/xattrop of the src bricks). And the data/entry and
>>>>>>>>>>> metadata heal are picked up and done in
>>>>>>>>>>>
>>>>>>>>>> the background with the help of these indices.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looking at my 3rd node as an example, I find nearly the same
>>>>>>>>>> number of files in the xattrop dir as reported by heal count at the time I
>>>>>>>>>> brought down node2 to try and alleviate read I/O errors, which I was guessing
>>>>>>>>>> came from attempts to use the node with no shards for reads.
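>>>>>>>>>>
>>>>>>>>>> (The comparison can be made roughly by counting everything in xattrop except the
>>>>>>>>>> xattrop-<uuid> base file -- brick path assumed:
>>>>>>>>>>
>>>>>>>>>> ls /gluster1/BRICK1/1/.glusterfs/indices/xattrop | grep -v '^xattrop-' | wc -l
>>>>>>>>>> )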
>>>>>>>>>>
>>>>>>>>>> Also attached are the glustershd logs from the 3 nodes, along
>>>>>>>>>> with the test node i tried yesterday with same results.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Looking at my own logs I notice that a full sweep was only ever
>>>>>>>>> recorded in glustershd.log on the 2nd node, the one with the missing directory.
>>>>>>>>> I believe I should have found a sweep begun on every node, correct?
>>>>>>>>>
>>>>>>>>> On my test dev when it did work I do see that
>>>>>>>>>
>>>>>>>>> [2016-08-30 13:56:25.223333] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>>> glustershard-client-0
>>>>>>>>> [2016-08-30 13:56:25.223522] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>>> glustershard-client-1
>>>>>>>>> [2016-08-30 13:56:25.224616] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>>>>> glustershard-client-2
>>>>>>>>> [2016-08-30 14:18:48.333740] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>>>>> glustershard-client-2
>>>>>>>>> [2016-08-30 14:18:48.356008] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>>>>> glustershard-client-1
>>>>>>>>> [2016-08-30 14:18:49.637811] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>>>>> glustershard-client-0
>>>>>>>>>
>>>>>>>>> Whereas looking at the past few days on the 3 prod nodes, I only
>>>>>>>>> found these on my 2nd node:
>>>>>>>>> [2016-08-27 01:26:42.638772] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 11:37:01.732366] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 12:58:34.597228] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 12:59:28.041173] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 20:03:42.560188] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 20:03:44.278274] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 21:00:42.603315] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: starting full sweep on subvol GLUSTER1-client-1
>>>>>>>>> [2016-08-27 21:00:46.148674] I [MSGID: 108026]
>>>>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>>>>> 0-GLUSTER1-replicate-0: finished full sweep on subvol GLUSTER1-client-1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> My suspicion is that this is what happened on your setup.
>>>>>>>>>>>>>> Could you confirm if that was the case?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The brick was brought online with force start, then a full heal was
>>>>>>>>>>>>> launched.  Hours later, after it became evident that it was not adding new
>>>>>>>>>>>>> files to heal, I did try restarting the self-heal daemon and relaunching the
>>>>>>>>>>>>> full heal again. But this was after the heal had basically already failed to
>>>>>>>>>>>>> work as intended.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> OK. How did you figure it was not adding any new files? I need
>>>>>>>>>>>> to know what places you were monitoring to come to this conclusion.
>>>>>>>>>>>>
>>>>>>>>>>>> -Krutika
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> As for those logs, I did manage to do something that caused
>>>>>>>>>>>>>> these warning messages you shared earlier to appear in my client and server
>>>>>>>>>>>>>> logs.
>>>>>>>>>>>>>> Although these logs are annoying and a bit scary too, they
>>>>>>>>>>>>>> didn't do any harm to the data in my volume. Why they appear just after a
>>>>>>>>>>>>>> brick is replaced and under no other circumstances is something I'm still
>>>>>>>>>>>>>> investigating.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But for future, it would be good to follow the steps Anuradha
>>>>>>>>>>>>>> gave as that would allow self-heal to at least detect that it has some
>>>>>>>>>>>>>> repairing to do whenever it is restarted whether intentionally or otherwise.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I followed those steps as described on my test box and ended
>>>>>>>>>>>>> up with the exact same outcome: shards added at an agonizingly slow pace and
>>>>>>>>>>>>> no creation of the .shard directory or heals on the shard directory.  Directories
>>>>>>>>>>>>> visible from the mount healed quickly.  This was with one VM, so it has only 800
>>>>>>>>>>>>> shards as well.  After hours at work it had added a total of 33 shards to
>>>>>>>>>>>>> be healed.  I sent those logs yesterday as well, though not the glustershd ones.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does the replace-brick command copy files in the same manner?  For
>>>>>>>>>>>>> these purposes I am contemplating just skipping the heal route.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Krutika
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <
>>>>>>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> attached brick and client logs from test machine where same
>>>>>>>>>>>>>>> behavior occurred not sure if anything new is there.  its still on 3.8.2
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>>>>>>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>>>>>>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>>> cluster.locking-scheme: granular
>>>>>>>>>>>>>>> performance.strict-o-direct: off
>>>>>>>>>>>>>>> features.shard-block-size: 64MB
>>>>>>>>>>>>>>> features.shard: on
>>>>>>>>>>>>>>> server.allow-insecure: on
>>>>>>>>>>>>>>> storage.owner-uid: 36
>>>>>>>>>>>>>>> storage.owner-gid: 36
>>>>>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>>>> cluster.self-heal-window-size: 1024
>>>>>>>>>>>>>>> cluster.background-self-heal-count: 16
>>>>>>>>>>>>>>> nfs.enable-ino32: off
>>>>>>>>>>>>>>> nfs.addr-namelookup: off
>>>>>>>>>>>>>>> nfs.disable: on
>>>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>>>>> cluster.granular-entry-heal: on
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <
>>>>>>>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <
>>>>>>>>>>>>>>>> atalur at redhat.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>>>>> > From: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>>>>>>> > To: "Anuradha Talur" <atalur at redhat.com>
>>>>>>>>>>>>>>>>> > Cc: "gluster-users at gluster.org List" <
>>>>>>>>>>>>>>>>> Gluster-users at gluster.org>, "Krutika Dhananjay" <
>>>>>>>>>>>>>>>>> kdhananj at redhat.com>
>>>>>>>>>>>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>>>>>>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing
>>>>>>>>>>>>>>>>> Glacier Slow
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <
>>>>>>>>>>>>>>>>> atalur at redhat.com> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > > Response inline.
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > ----- Original Message -----
>>>>>>>>>>>>>>>>> > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
>>>>>>>>>>>>>>>>> > > > To: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>>>>>>> > > > Cc: "gluster-users at gluster.org List" <
>>>>>>>>>>>>>>>>> Gluster-users at gluster.org>
>>>>>>>>>>>>>>>>> > > > Sent: Monday, August 29, 2016 3:55:04 PM
>>>>>>>>>>>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing
>>>>>>>>>>>>>>>>> Glacier Slow
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > Could you attach both client and brick logs?
>>>>>>>>>>>>>>>>> Meanwhile I will try these
>>>>>>>>>>>>>>>>> > > steps
>>>>>>>>>>>>>>>>> > > > out on my machines and see if it is easily
>>>>>>>>>>>>>>>>> recreatable.
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > -Krutika
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <
>>>>>>>>>>>>>>>>> > > dgossage at carouselchecks.com
>>>>>>>>>>>>>>>>> > > > > wrote:
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > Centos 7 Gluster 3.8.3
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>>>>> > > > Options Reconfigured:
>>>>>>>>>>>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>>>>>>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>>>>>>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>>>>>>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>>>>>>>>>>>> > > > features.shard: on
>>>>>>>>>>>>>>>>> > > > performance.readdir-ahead: on
>>>>>>>>>>>>>>>>> > > > storage.owner-uid: 36
>>>>>>>>>>>>>>>>> > > > storage.owner-gid: 36
>>>>>>>>>>>>>>>>> > > > performance.quick-read: off
>>>>>>>>>>>>>>>>> > > > performance.read-ahead: off
>>>>>>>>>>>>>>>>> > > > performance.io-cache: off
>>>>>>>>>>>>>>>>> > > > performance.stat-prefetch: on
>>>>>>>>>>>>>>>>> > > > cluster.eager-lock: enable
>>>>>>>>>>>>>>>>> > > > network.remote-dio: enable
>>>>>>>>>>>>>>>>> > > > cluster.quorum-type: auto
>>>>>>>>>>>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>>>>>>>>>>>> > > > server.allow-insecure: on
>>>>>>>>>>>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>>>>>>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>>>>>>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>>>>>>>>>>>> > > > nfs.disable: on
>>>>>>>>>>>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>>>>>>>>>>>> > > > nfs.enable-ino32: off
>>>>>>>>>>>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > Friday did rolling upgrade from 3.8.3->3.8.3 no
>>>>>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>>>>> > > > Following the steps detailed in previous recommendations, I
>>>>>>>>>>>>>>>>> began the process of
>>>>>>>>>>>>>>>>> > > > replacing and healing bricks one node at a time.
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > 1) kill pid of brick
>>>>>>>>>>>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>>>>>>>>>>>> > > > 3) recreate directory of brick
>>>>>>>>>>>>>>>>> > > > 4) gluster volume start <> force
>>>>>>>>>>>>>>>>> > > > 5) gluster volume heal <> full
>>>>>>>>>>>>>>>>> > > Hi,
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > I'd suggest that full heal is not used. There are a
>>>>>>>>>>>>>>>>> few bugs in full heal.
>>>>>>>>>>>>>>>>> > > Better safe than sorry ;)
>>>>>>>>>>>>>>>>> > > Instead I'd suggest the following steps:
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > Currently I brought the node down by systemctl stop
>>>>>>>>>>>>>>>>> glusterd as I was
>>>>>>>>>>>>>>>>> > getting sporadic io issues and a few VM's paused so
>>>>>>>>>>>>>>>>> hoping that will help.
>>>>>>>>>>>>>>>>> > I may wait to do this till around 4PM when most work is
>>>>>>>>>>>>>>>>> done in case it
>>>>>>>>>>>>>>>>> > shoots load up.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > > 1) kill pid of brick
>>>>>>>>>>>>>>>>> > > 2) do the reconfiguring of the brick that you need
>>>>>>>>>>>>>>>>> > > 3) recreate brick dir
>>>>>>>>>>>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>>>>>>>>>>>> > >    a) create a dummy non existent dir under / of mount.
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > so if node 2 is the down brick, pick a node, for example 3, and
>>>>>>>>>>>>>>>>> make a test dir
>>>>>>>>>>>>>>>>> > under its brick directory that doesn't exist on 2, or
>>>>>>>>>>>>>>>>> should I be doing this
>>>>>>>>>>>>>>>>> > over a gluster mount?
>>>>>>>>>>>>>>>>> You should be doing this over gluster mount.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > >    b) set a non existent extended attribute on / of
>>>>>>>>>>>>>>>>> mount.
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Could you give me an example of an attribute to set?
>>>>>>>>>>>>>>>>>  I've read a tad on
>>>>>>>>>>>>>>>>> > this, and looked up attributes but haven't set any yet
>>>>>>>>>>>>>>>>> myself.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> Sure. setfattr -n "user.some-name" -v "some-value"
>>>>>>>>>>>>>>>>> <path-to-mount>
>>>>>>>>>>>>>>>>> > Doing these steps will ensure that heal happens only
>>>>>>>>>>>>>>>>> from updated brick to
>>>>>>>>>>>>>>>>> > > down brick.
>>>>>>>>>>>>>>>>> > > 5) gluster v start <> force
>>>>>>>>>>>>>>>>> > > 6) gluster v heal <>
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Will it matter if somewhere in gluster the full heal
>>>>>>>>>>>>>>>>> command was run the other
>>>>>>>>>>>>>>>>> > day?  Not sure if it eventually stops or times out.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> full heal will stop once the crawl is done. So if you want
>>>>>>>>>>>>>>>>> to trigger heal again,
>>>>>>>>>>>>>>>>> run gluster v heal <>. Actually even brick up or volume
>>>>>>>>>>>>>>>>> start force should
>>>>>>>>>>>>>>>>> trigger the heal.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Did this on the test bed today.  It's one server with 3 bricks
>>>>>>>>>>>>>>>> on the same machine, so take that for what it's worth.  Also, it still runs
>>>>>>>>>>>>>>>> 3.8.2.  Maybe I'll update and re-run the test.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> killed brick
>>>>>>>>>>>>>>>> deleted brick dir
>>>>>>>>>>>>>>>> recreated brick dir
>>>>>>>>>>>>>>>> created fake dir on gluster mount
>>>>>>>>>>>>>>>> set suggested fake attribute on it
>>>>>>>>>>>>>>>> ran volume start <> force
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> looked at the files it said needed healing and it was just the 8
>>>>>>>>>>>>>>>> shards that were modified during the few minutes I ran through the steps
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> gave it a few minutes and it stayed the same
>>>>>>>>>>>>>>>> ran gluster volume <> heal
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> it healed all the directories and files you can see over the
>>>>>>>>>>>>>>>> mount, including fakedir.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Same issue for shards though.  It adds more shards to heal
>>>>>>>>>>>>>>>> at a glacial pace.  There's a slight jump in speed if I stat every file and dir
>>>>>>>>>>>>>>>> in the running VM, but not all shards.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It started with 8 shards to heal and is now only at 33 out
>>>>>>>>>>>>>>>> of 800, and probably won't finish adding for a few days at the rate it's going.
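>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (One way to nudge that along, assuming that reading an image end-to-end from
>>>>>>>>>>>>>>>> the FUSE mount forces a lookup on each of its shards -- mount path hypothetical:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> find /gsmount -type f -exec dd if={} of=/dev/null bs=1M \;
>>>>>>>>>>>>>>>> )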
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > > 1st node worked as expected took 12 hours to heal
>>>>>>>>>>>>>>>>> 1TB data. Load was
>>>>>>>>>>>>>>>>> > > little
>>>>>>>>>>>>>>>>> > > > heavy but nothing shocking.
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > About an hour after node 1 finished I began same
>>>>>>>>>>>>>>>>> process on node2. Heal
>>>>>>>>>>>>>>>>> > > > process kicked in as before and the files in
>>>>>>>>>>>>>>>>> directories visible from
>>>>>>>>>>>>>>>>> > > mount
>>>>>>>>>>>>>>>>> > > > and .glusterfs healed in short time. Then it began
>>>>>>>>>>>>>>>>> crawl of .shard adding
>>>>>>>>>>>>>>>>> > > > those files to heal count at which point the entire
>>>>>>>>>>>>>>>>> process ground to a
>>>>>>>>>>>>>>>>> > > halt
>>>>>>>>>>>>>>>>> > > > basically. After 48 hours out of 19k shards it has
>>>>>>>>>>>>>>>>> added 5900 to heal
>>>>>>>>>>>>>>>>> > > list.
>>>>>>>>>>>>>>>>> > > > Load on all 3 machines is negligible. It was
>>>>>>>>>>>>>>>>> suggested to change this
>>>>>>>>>>>>>>>>> > > value
>>>>>>>>>>>>>>>>> > > > to full cluster.data-self-heal-algorithm and
>>>>>>>>>>>>>>>>> restart volume which I
>>>>>>>>>>>>>>>>> > > did. No
>>>>>>>>>>>>>>>>> > > > effect. Tried relaunching heal, no effect, despite
>>>>>>>>>>>>>>>>> any node picked. I
>>>>>>>>>>>>>>>>> > > > started each VM and performed a stat of all files
>>>>>>>>>>>>>>>>> from within it, or a
>>>>>>>>>>>>>>>>> > > full
>>>>>>>>>>>>>>>>> > > > virus scan and that seemed to cause short small
>>>>>>>>>>>>>>>>> spikes in shards added,
>>>>>>>>>>>>>>>>> > > but
>>>>>>>>>>>>>>>>> > > > not by much. Logs are showing no real messages
>>>>>>>>>>>>>>>>> indicating anything is
>>>>>>>>>>>>>>>>> > > going
>>>>>>>>>>>>>>>>> > > > on. I get hits to brick log on occasion of null
>>>>>>>>>>>>>>>>> lookups making me think
>>>>>>>>>>>>>>>>> > > its
>>>>>>>>>>>>>>>>> > > > not really crawling shards directory but waiting for
>>>>>>>>>>>>>>>>> a shard lookup to
>>>>>>>>>>>>>>>>> > > add
>>>>>>>>>>>>>>>>> > > > it. I'll get following in brick log but not constant
>>>>>>>>>>>>>>>>> and sometime
>>>>>>>>>>>>>>>>> > > multiple
>>>>>>>>>>>>>>>>> > > > for same shard.
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009]
>>>>>>>>>>>>>>>>> > > > [server-resolve.c:569:server_resolve]
>>>>>>>>>>>>>>>>> 0-GLUSTER1-server: no resolution
>>>>>>>>>>>>>>>>> > > type
>>>>>>>>>>>>>>>>> > > > for (null) (LOOKUP)
>>>>>>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050]
>>>>>>>>>>>>>>>>> > > > [server-rpc-fops.c:156:server_lookup_cbk]
>>>>>>>>>>>>>>>>> 0-GLUSTER1-server: 12591783:
>>>>>>>>>>>>>>>>> > > > LOOKUP (null) (00000000-0000-0000-00
>>>>>>>>>>>>>>>>> > > > 00-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221)
>>>>>>>>>>>>>>>>> ==> (Invalid
>>>>>>>>>>>>>>>>> > > > argument) [Invalid argument]
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > This one repeated about 30 times in row then nothing
>>>>>>>>>>>>>>>>> for 10 minutes then
>>>>>>>>>>>>>>>>> > > one
>>>>>>>>>>>>>>>>> > > > hit for one different shard by itself.
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > How can I determine if Heal is actually running? How
>>>>>>>>>>>>>>>>> can I kill it or
>>>>>>>>>>>>>>>>> > > force
>>>>>>>>>>>>>>>>> > > > restart? Does node I start it from determine which
>>>>>>>>>>>>>>>>> directory gets
>>>>>>>>>>>>>>>>> > > crawled to
>>>>>>>>>>>>>>>>> > > > determine heals?
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > David Gossage
>>>>>>>>>>>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>>>>>>>>>>>> > > > Office 708.613.2284
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > _______________________________________________
>>>>>>>>>>>>>>>>> > > > Gluster-users mailing list
>>>>>>>>>>>>>>>>> > > > Gluster-users at gluster.org
>>>>>>>>>>>>>>>>> > > > http://www.gluster.org/mailman
>>>>>>>>>>>>>>>>> /listinfo/gluster-users
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>>>>> > > > _______________________________________________
>>>>>>>>>>>>>>>>> > > > Gluster-users mailing list
>>>>>>>>>>>>>>>>> > > > Gluster-users at gluster.org
>>>>>>>>>>>>>>>>> > > > http://www.gluster.org/mailman
>>>>>>>>>>>>>>>>> /listinfo/gluster-users
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> > > --
>>>>>>>>>>>>>>>>> > > Thanks,
>>>>>>>>>>>>>>>>> > > Anuradha.
>>>>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Anuradha.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>