[Gluster-users] [Gluster-devel] Query on healing process

ABHISHEK PALIWAL abhishpaliwal at gmail.com
Fri Mar 4 12:53:56 UTC 2016


On Fri, Mar 4, 2016 at 5:31 PM, Ravishankar N <ravishankar at redhat.com>
wrote:

> On 03/04/2016 12:10 PM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile' when ssl is enabled:
> <https://bugzilla.redhat.com/show_bug.cgi?id=1258931>
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> ->>>>> I have checked; ssl is disabled, but I am still getting these errors:
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
> Ok, just to confirm, glusterd  and other brick processes are running after
> this node rebooted?
> When you run the above command, you need to check
> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
> client-log-level to DEBUG would give you a more verbose message
>
Yes, glusterd and the other brick processes are running fine. I have checked
the /var/log/glusterfs/glfsheal-volname.log file without log-level=DEBUG.
Here are the logs from that file:

[2016-03-02 13:51:39.059440] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
with index 1
[2016-03-02 13:51:39.072172] W [MSGID: 101012]
[common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
info [No such file or directory]
[2016-03-02 13:51:39.072228] W [MSGID: 101081]
[common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
get reserved ports, hence there is a possibility that glusterfs may consume
reserved port
[2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
[2016-03-02 13:51:39.072663] E [MSGID: 104024]
[glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
remote-host: localhost (Transport endpoint is not connected) [Transport
endpoint is not connected]
[2016-03-02 13:51:39.072700] I [MSGID: 104025]
[glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
servers [Transport endpoint is not connected]
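
The "connection to 127.0.0.1:24007 failed (Connection refused)" line above makes
me think glfsheal could not reach glusterd at that moment. I will retry with a
more verbose client log level, roughly like this (please correct me if this is
not the option you meant):

# gluster volume set c_glusterfs diagnostics.client-log-level DEBUG
# gluster volume heal c_glusterfs info
# gluster volume set c_glusterfs diagnostics.client-log-level INFO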

> # gluster volume heal c_glusterfs info split-brain
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
>
>
>
> And based on your observation I understood that this is not a split-brain
> problem, but is there any way to find the file which is not in split-brain
> yet is also not in sync?
>
>
> `gluster volume heal c_glusterfs info split-brain` should give you the files
> that need healing.
>

I have run "gluster volume heal c_glusterfs info split-brain" command but
it is not showing that file which is out of sync that is the issue file is
not in sync on both of the brick and split-brain is not showing that
command in output for heal required.

Thats is why I am asking that is there any command other than this split
brain command so that I can find out the files those are required the heal
operation but not displayed in the output of "gluster volume heal
c_glusterfs info split-brain" command.
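
For example, going by the CLI usage text quoted further down in this thread,
should either of these be expected to list such a file? I may be reading the
usage wrong:

# gluster volume heal c_glusterfs info
# gluster volume heal c_glusterfs statistics heal-count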

>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-8=0x000000060000000000000000   // because
> client-8 is the latest client in our case, and the leading 8 digits
> (00000006) indicate there is something pending in the changelog data.
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000   // and here
> we can see that there is no split-brain, yet the file is out of sync
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000001156d86c290005735c
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
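
(My reading of the trusted.afr.* values above, which may well be wrong: the
12-byte value looks like three 4-byte counters for the data, metadata and
entry changelogs, so on the first brick the leading 0x00000006 on client-8
would mean 6 pending data operations, while on the second brick everything is
zero. Converting just that leading field:

# printf 'pending data operations: %d\n' 0x00000006
pending data operations: 6
)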
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
>
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
>
> # gluster --version
> glusterfs 3.7.8 built on Feb 17 2016 07:49:49
> Repository revision: git://git.gluster.com/glusterfs.git
> Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
> GlusterFS comes with ABSOLUTELY NO WARRANTY.
> You may redistribute copies of GlusterFS under the terms of the GNU
> General Public License.
> # gluster volume heal info heal-failed
> Usage: volume heal <VOLNAME> [enable | disable | full |statistics
> [heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed |
> split-brain] |split-brain {bigger-file <FILE> |source-brick
> <HOSTNAME:BRICKNAME> [<FILE>]}]
> # gluster volume heal c_glusterfs info heal-failed
> Command not supported. Please use "gluster volume heal c_glusterfs info"
> and logs to find the heal information.
> # lhsh 002500
>  _______  _____   _____              _____ __   _ _     _ _     _
>  |       |_____] |_____]      |        |   | \  | |     |  \___/
>  |_____  |       |            |_____ __|__ |  \_| |_____| _/   \_
>
> 002500> gluster --version
> glusterfs 3.7.8 built on Feb 17 2016 07:49:49
> Repository revision: git://git.gluster.com/glusterfs.git
> Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
> GlusterFS comes with ABSOLUTELY NO WARRANTY.
> You may redistribute copies of GlusterFS under the terms of the GNU
> General Public License.
> 002500>
>
> Regards,
> Abhishek
>
> On Thu, Mar 3, 2016 at 4:54 PM, ABHISHEK PALIWAL <
> abhishpaliwal at gmail.com> wrote:
>
>>
>> On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N <
>> ravishankar at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>>>
>>> Hi Ravi,
>>>
>>> As I discussed earlier, I investigated this issue and found that healing
>>> is not triggered because the "gluster volume heal c_glusterfs info
>>> split-brain" command shows no entries in its output, even though the file
>>> is in a split-brain situation.
>>>
>>>
>>> Couple of observations from the 'commands_output' file.
>>>
>>> getfattr -d -m . -e hex
>>> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>>> The afr xattrs do not indicate that the file is in split brain:
>>> # file:
>>> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>>> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
>>> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>>>
>>>
>>>
>>> getfattr -d -m . -e hex
>>> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>>> trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
>>> trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
>>> trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
>>> trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
>>> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>>>
>>> 1. There doesn't seem to be a split-brain going by the trusted.afr*
>>> xattrs.
>>>
>>
>> If it is not a split-brain problem, then how can I resolve this?
>>
>>
>>> 2. You seem to have re-used the bricks from another volume/setup. For
>>> replica 2, only trusted.afr.c_glusterfs-client-0 and
>>> trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs -
>>> client-0,2,4 and 6
>>>
>>
>> Could you please suggest why these entries are there? I am not able to
>> figure out the scenario. I am rebooting one board multiple times to
>> reproduce the issue, and after every reboot I do a remove-brick and
>> add-brick on the same volume for the second board.
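
To make that concrete, this is roughly what I run for the second board after
each reboot; the brick path is the one from the volume info in this thread,
and the exact options on my setup may differ slightly:

# gluster volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick force
# gluster volume add-brick c_glusterfs replica 2 10.32.1.144:/opt/lvmdir/c2/brick force

If each add-brick creates a new client-<N> index, that might explain why
client-0, 2, 4 and 6 all show up in the xattrs, but I am not certain about
that.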
>>
>>
>>> 3. On the rebooted node, do you have ssl enabled by any chance? There is
>>> a bug for "Not able to fetch volfile' when ssl is enabled:
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1258931>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>>>
>>> Btw, for data and metadata split-brains you can use the gluster CLI
>>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>>> instead of modifying the file from the back end.
>>>
>>
>> But you are saying it is not a split-brain problem, and even the split-brain
>> command is not showing any file, so how can I find which file is bigger in
>> size? Also, in my case the file size is fixed at 2 MB; it is overwritten
>> every time.
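
(If it ever does turn out to be a real split-brain, I assume the source-brick
variant from the CLI usage would be the one to use, since bigger-file does not
apply to a fixed-size file that keeps being overwritten. Something like the
following, if I understand the syntax correctly, with the file path given
relative to the volume root:

# gluster volume heal c_glusterfs split-brain source-brick 10.32.0.48:/opt/lvmdir/c2/brick /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
)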
>>
>>>
>>> -Ravi
>>>
>>>
>>> So, what I have done is manually delete the gfid entry of that file from
>>> the .glusterfs directory and follow the instructions in the following
>>> link to trigger the heal:
>>>
>>> https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
>>>
>>> and this works fine for me.
>>>
>>> But my question is: why does the split-brain command not show any file in
>>> its output?
>>>
>>> Here I am attaching all the logs I collected from the node for you, and
>>> also the output of the commands from both of the boards.
>>>
>>> In this tar file two directories are present
>>>
>>> 000300 - log for the board which is running continuously
>>> 002500 - log for the board which was rebooted
>>>
>>> I am waiting for your reply; please help me out with this issue.
>>>
>>> Thanks in advance.
>>>
>>> Regards,
>>> Abhishek
>>>
>>> On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL <
>>> abhishpaliwal at gmail.com> wrote:
>>>
>>>> On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N <
>>>> ravishankar at redhat.com> wrote:
>>>>
>>>>> On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
>>>>>
>>>>> Yes correct
>>>>>
>>>>>
>>>>> Okay, so when you say the files are not in sync until some time, are
>>>>> you getting stale data when accessing from the mount?
>>>>> I'm not able to figure out why heal info shows zero when the files are
>>>>> not in sync, despite all IO happening from the mounts. Could you provide
>>>>> the output of getfattr -d -m . -e hex /brick/file-name from both bricks
>>>>> when you hit this issue?
>>>>>
>>>>> I'll provide the logs once I get them. Here, 'delay' means we are powering
>>>>> on the second board after 10 minutes.
>>>>>
>>>>>
>>>>> On Feb 26, 2016 9:57 AM, "Ravishankar N" <
>>>>> ravishankar at redhat.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:
>>>>>>
>>>>>> Hi Ravi,
>>>>>>
>>>>>> Thanks for the response.
>>>>>>
>>>>>> We are using GlusterFS 3.7.8
>>>>>>
>>>>>> Here is the use case:
>>>>>>
>>>>>> We have a logging file which saves logs of the events for every board
>>>>>> of a node, and these files are kept in sync using glusterfs. The system
>>>>>> runs in replica 2 mode, meaning that when one brick in a replicated volume
>>>>>> goes offline, the glusterd daemons on the other nodes keep track of all the
>>>>>> files that are not replicated to the offline brick. When the offline brick
>>>>>> becomes available again, the cluster initiates a healing process,
>>>>>> replicating the updated files to that brick. But in our case, we see that
>>>>>> the log file of one board is not in sync and its format is corrupted, i.e.
>>>>>> the files are not in sync.
>>>>>>
>>>>>>
>>>>>> Just to understand you correctly: you have mounted the 2-node
>>>>>> replica-2 volume on both these nodes and are writing to a logging file
>>>>>> from the mounts, right?
>>>>>>
>>>>>>
>>>>>> Even the outcome of #gluster volume heal c_glusterfs info shows that
>>>>>> there are no pending heals.
>>>>>>
>>>>>> Also, the logging file which is updated is of fixed size, and the new
>>>>>> entries wrap around, overwriting the old entries.
>>>>>>
>>>>>> This way we have seen that, after a few restarts, the contents of the
>>>>>> same file on the two bricks are different, but the volume heal info shows
>>>>>> zero entries.
>>>>>>
>>>>>> Solution:
>>>>>>
>>>>>> But when we tried to add a delay of more than 5 minutes before the
>>>>>> healing, everything works fine.
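
(For reference, this is roughly how the divergence between the two bricks can
be confirmed directly on the brick directories; the path is the one from the
volume info in this thread, and may need adjusting:

# on board 000300
md5sum /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
# on board 002500, via lhsh as used elsewhere in this thread
lhsh 002500 md5sum /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
)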
>>>>>>
>>>>>> Regards,
>>>>>> Abhishek
>>>>>>
>>>>>> On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N <
>>>>>> ravishankar at redhat.com> wrote:
>>>>>>
>>>>>>> On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Here, I have one query regarding the time taken by the healing
>>>>>>> process.
>>>>>>> In the current two-node setup, when we rebooted one node, the
>>>>>>> self-healing process started within less than 5 minutes on that board,
>>>>>>> which resulted in the corruption of some files' data.
>>>>>>>
>>>>>>>
>>>>>>> Heal should start immediately after the brick process comes up. What
>>>>>>> version of gluster are you using? What do you mean by corruption of data?
>>>>>>> Also, how did you observe that the heal started after 5 minutes?
>>>>>>> -Ravi
>>>>>>>
>>>>>>>
>>>>>>> And to resolve it I have search on google and found the following
>>>>>>> link:
>>>>>>> https://support.rackspace.com/how-to/glusterfs-troubleshooting/
>>>>>>>
>>>>>>> It mentions that the healing process can take up to 10 minutes to
>>>>>>> start.
>>>>>>>
>>>>>>> Here is the statement from the link:
>>>>>>>
>>>>>>> "Healing replicated volumes
>>>>>>>
>>>>>>> When any brick in a replicated volume goes offline, the glusterd
>>>>>>> daemons on the remaining nodes keep track of all the files that are not
>>>>>>> replicated to the offline brick. When the offline brick becomes available
>>>>>>> again, the cluster initiates a healing process, replicating the updated
>>>>>>> files to that brick. *The start of this process can take up to 10
>>>>>>> minutes, based on observation.*"
>>>>>>>
>>>>>>> After allowing more than 5 minutes, the file corruption problem is
>>>>>>> resolved.
>>>>>>>
>>>>>>> So, here my question is: is there any way in which we can reduce the
>>>>>>> time taken by the healing process to start?
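
(Is something along these lines the right way to reduce that delay, i.e.
triggering a heal by hand or lowering the self-heal daemon's crawl interval? I
am not sure cluster.heal-timeout is the right knob here, so please correct me:

# gluster volume heal c_glusterfs full
# gluster volume set c_glusterfs cluster.heal-timeout 60
)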
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Abhishek Paliwal
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Gluster-devel mailing list
>>>>>>> Gluster-devel at gluster.org
>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Regards
>>>>>> Abhishek Paliwal
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>>
>>>>
>>>> Regards
>>>> Abhishek Paliwal
>>>>
>>>
>>>
>>>
>>> --
>>>
>>>
>>>
>>>
>>> Regards
>>> Abhishek Paliwal
>>>
>>>
>>>
>>>
>>
>>
>> --
>>
>>
>>
>>
>> Regards
>> Abhishek Paliwal
>>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
>
>
>


-- 




Regards
Abhishek Paliwal