[Gluster-users] folder not being healed

Mon Jan 4 16:20:41 UTC 2016


On 01/04/2016 09:14 PM, Andreas Tsaridas wrote:
> Hello,
>
> Unfortunately I get :
>
> -bash: /usr/bin/getfattr: Argument list too long
>
> There are a lot of files in these directories, and even ls takes a long
> time to show results.
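That error comes from the shell rather than from getfattr: the "*" expands to every name in the directory, and the resulting command line exceeds the kernel's argument-size limit. A minimal sketch of one way around it is to let find hand the names to getfattr in batches. The helper name dump_xattrs is hypothetical, and -mindepth/-maxdepth assume GNU find:

```shell
# Sketch only: dump the xattrs of every entry directly under a directory
# without expanding a shell glob. "dump_xattrs" is a hypothetical name.
dump_xattrs() {
    dir=$1
    # "-exec ... {} +" makes find pass the names to getfattr in batches
    # sized to fit the argument limit, so no single command line is too long.
    find "$dir" -mindepth 1 -maxdepth 1 -exec getfattr -d -m . -e hex {} +
}

# Example, run as root from the brick root on each node:
#   dump_xattrs media/ga/live/a > /tmp/a-xattrs.txt
```

With this many files the output will be large, so redirecting it to a file as in the example is probably more practical than reading it on the terminal.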
Krutika pointed out something important to me on IRC: why does the
volume have two sets of trusted.afr.* xattrs, i.e. trusted.afr.remote1/2
and trusted.afr.share-client-0/1?

Pranith
>
> How would I be able to keep the copy from web01 and discard the other one?
>
> Thanks
>
> On Mon, Jan 4, 2016 at 3:59 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>     Hi Andreas,
>     The directory is in split-brain. Are any of the files or
>     directories inside 'media/ga/live/a' also in split-brain?
>
>     Could you give the output of
>     "getfattr -d -m . -e hex media/ga/live/a/*" on both bricks?
>
>     Pranith
>
>
>     On 01/04/2016 05:21 PM, Andreas Tsaridas wrote:
>>     Hello,
>>
>>     Please see below :
>>     -----
>>
>>     web01 # getfattr -d -m . -e hex media/ga/live/a
>>     # file: media/ga/live/a
>>     trusted.afr.dirty=0x000000000000000000000000
>>     trusted.afr.remote1=0x000000000000000000000000
>>     trusted.afr.remote2=0x000000000000000000000005
>>     trusted.afr.share-client-0=0x000000000000000000000000
>>     trusted.afr.share-client-1=0x0000000000000000000000ee
>>     trusted.gfid=0xb13199a1464c44918464444b3f7eeee3
>>     trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>>
>>     ------
>>
>>     web02 # getfattr -d -m . -e hex media/ga/live/a
>>     # file: media/ga/live/a
>>     trusted.afr.dirty=0x000000000000000000000000
>>     trusted.afr.remote1=0x000000000000000000000008
>>     trusted.afr.remote2=0x000000000000000000000000
>>     trusted.afr.share-client-0=0x000000000000000000000000
>>     trusted.afr.share-client-1=0x000000000000000000000000
>>     trusted.gfid=0xb13199a1464c44918464444b3f7eeee3
>>     trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>
>>     ------
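A note on reading the values above: each trusted.afr.* value is 12 bytes, which AFR treats as three big-endian 32-bit pending counters for data, metadata and entry operations, in that order. The sketch below decodes one value into those three counters. The helper name decode_afr is hypothetical and is not a gluster tool:

```shell
# Sketch only: split a trusted.afr.* hex value into its three counters.
# "decode_afr" is a hypothetical helper, not part of GlusterFS.
decode_afr() {
    hex=${1#0x}                          # strip the leading 0x
    if [ "${#hex}" -ne 24 ]; then        # 12 bytes = 24 hex digits
        echo "expected 24 hex digits, got ${#hex}" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)      # pending data operations
    meta=$(printf '%s' "$hex" | cut -c9-16)     # pending metadata operations
    entry=$(printf '%s' "$hex" | cut -c17-24)   # pending entry operations
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$meta" "0x$entry"
}

decode_afr 0x000000000000000000000005   # prints: data=0 metadata=0 entry=5
decode_afr 0x000000000000000000000008   # prints: data=0 metadata=0 entry=8
```

Read this way, web01 holds a non-zero entry counter under remote2 and web02 holds one under remote1. Assuming those two names refer to the two bricks (the thread does not confirm this), each side is recording pending entry operations against the other.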
>>
>>     Regards,
>>     AT
>>
>>     On Mon, Jan 4, 2016 at 12:44 PM, Krutika Dhananjay
>>     <kdhananj at redhat.com> wrote:
>>
>>         Hi,
>>
>>         Could you share the output of
>>         # getfattr -d -m . -e hex <abs-path-to-media/ga/live/a>
>>
>>         from both the bricks?
>>
>>         -Krutika
>>         ------------------------------------------------------------------------
>>
>>             From: "Andreas Tsaridas" <andreas.tsaridas at gmail.com>
>>             To: gluster-users at gluster.org
>>             Sent: Monday, January 4, 2016 5:10:58 PM
>>             Subject: [Gluster-users] folder not being healed
>>
>>
>>             Hello,
>>
>>             I have a two-node replicated cluster running GlusterFS
>>             3.6.3 on Red Hat 6.6. The problem is that one specific
>>             folder is constantly being picked up for healing, but the
>>             heal never completes. This has been going on for 2 weeks
>>             now.
>>
>>             -----
>>
>>             # gluster volume status
>>             Status of volume: share
>>             Gluster process                             Port    Online  Pid
>>             ------------------------------------------------------------------------------
>>             Brick 172.16.4.1:/srv/share/glusterfs       49152   Y       10416
>>             Brick 172.16.4.2:/srv/share/glusterfs       49152   Y       19907
>>             NFS Server on localhost                     2049    Y       22664
>>             Self-heal Daemon on localhost               N/A     Y       22676
>>             NFS Server on 172.16.4.2                    2049    Y       19923
>>             Self-heal Daemon on 172.16.4.2              N/A     Y       19937
>>
>>             Task Status of Volume share
>>             ------------------------------------------------------------------------------
>>             There are no active volume tasks
>>
>>             ------
>>
>>             # gluster volume info
>>
>>             Volume Name: share
>>             Type: Replicate
>>             Volume ID: 17224664-645c-48b7-bc3a-b8fc84c6ab30
>>             Status: Started
>>             Number of Bricks: 1 x 2 = 2
>>             Transport-type: tcp
>>             Bricks:
>>             Brick1: 172.16.4.1:/srv/share/glusterfs
>>             Brick2: 172.16.4.2:/srv/share/glusterfs
>>             Options Reconfigured:
>>             cluster.background-self-heal-count: 20
>>             cluster.heal-timeout: 2
>>             performance.normal-prio-threads: 64
>>             performance.high-prio-threads: 64
>>             performance.least-prio-threads: 64
>>             performance.low-prio-threads: 64
>>             performance.flush-behind: off
>>             performance.io-thread-count: 64
>>
>>             ------
>>
>>             # gluster volume heal share info
>>             Brick web01.rsdc:/srv/share/glusterfs/
>>             /media/ga/live/a - Possibly undergoing heal
>>
>>             Number of entries: 1
>>
>>             Brick web02.rsdc:/srv/share/glusterfs/
>>             Number of entries: 0
>>
>>             -------
>>
>>             # gluster volume heal share info split-brain
>>             Gathering list of split brain entries on volume share has
>>             been successful
>>
>>             Brick 172.16.4.1:/srv/share/glusterfs
>>             Number of entries: 0
>>
>>             Brick 172.16.4.2:/srv/share/glusterfs
>>             Number of entries: 0
>>
>>             -------
>>
>>             ==> /var/log/glusterfs/glustershd.log <==
>>             [2016-01-04 11:35:33.004831] I
>>             [afr-self-heal-entry.c:554:afr_selfheal_entry_do]
>>             0-share-replicate-0: performing entry selfheal on
>>             b13199a1-464c-4491-8464-444b3f7eeee3
>>             [2016-01-04 11:36:07.449192] W
>>             [client-rpc-fops.c:2772:client3_3_lookup_cbk]
>>             0-share-client-1: remote operation failed: No data
>>             available. Path: (null)
>>             (00000000-0000-0000-0000-000000000000)
>>             [2016-01-04 11:36:07.449706] W
>>             [client-rpc-fops.c:240:client3_3_mknod_cbk]
>>             0-share-client-1: remote operation failed: File exists.
>>             Path: (null)
>>
>>             Could you please advise?
>>
>>             Kind regards,
>>
>>             AT
>>
>>             _______________________________________________
>>             Gluster-users mailing list
>>             Gluster-users at gluster.org
>>             http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
