[Gluster-devel] Query on healing process

ABHISHEK PALIWAL abhishpaliwal at gmail.com
Fri Mar 4 13:30:03 UTC 2016


On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N <ravishankar at redhat.com>
wrote:

> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>
>
>> Ok, just to confirm, glusterd and the brick processes are running
>> after this node rebooted?
>> When you run the above command, you need to check
>> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
>> client-log-level to DEBUG would give you more verbose messages.
>>
>> Yes, glusterd and the brick processes are running fine. I have checked the
> /var/log/glusterfs/glfsheal-volname.log file without setting the log level
> to DEBUG. Here are the logs from that file:
>
> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open the
> file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports
> info [No such file or directory]
> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
> get reserved ports, hence there is a possibility that glusterfs may consume
> reserved port
> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>
>
> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
> restarting glusterd, and running the heal info command again.
>

No hint from the logs? I'll try your suggestion.
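
Before flushing iptables, it may be worth checking whether anything is
accepting connections on 127.0.0.1:24007 at all, since that is the address
the log reports as refused. A minimal Python sketch, assuming only the
address from the log above (the function name port_accepts_connections is
made up for illustration):

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # or a timeout) when nothing is reachable on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address and port taken from the "Connection refused" log line.
    print(port_accepts_connections("127.0.0.1", 24007))
```

If this prints False while glusterd is running, the problem is likely
between the client and the management port (firewall rules or the address
glusterd listens on) rather than in the heal command itself.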

>
> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
> remote-host: localhost (Transport endpoint is not connected) [Transport
> endpoint is not connected]
> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
> servers [Transport endpoint is not connected]
>
>> # gluster volume heal c_glusterfs info split-brain
>> c_glusterfs: Not able to fetch volfile from glusterd
>> Volume heal failed.
>>
>>
>>
>>
>> Based on your observation, I understand that this is not a split-brain
>> problem. But *is there any way to find files that are not in
>> split-brain and yet are not in sync?*
>>
>>
>> `gluster volume heal c_glusterfs info split-brain`  should give you files
>> that need heal.
>>
>
> Sorry, I meant 'gluster volume heal c_glusterfs info' should give you the
> files that need heal and 'gluster volume heal c_glusterfs info
> split-brain' the list of files in split-brain.
> The commands are detailed in
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>

Yes, I have tried this as well. It also reports "Number of entries: 0",
which means no healing is required, but the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
not in sync: the two bricks show different versions of this file.

You can see it in the getfattr output as well.


# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000
    // client-8 is the latest client in our case, and the leading 8 digits
    // (00000006) indicate that there is something in the changelog data
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
    // and here we can say that there is no split-brain, but the file is
    // out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
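
For anyone reading the values above: as far as I know, each trusted.afr.*
value is 12 bytes holding three 32-bit big-endian counters, for pending
data, metadata and entry operations, in that order. A minimal Python sketch
of decoding them under that assumption (the function name decode_afr_xattr
is made up for illustration):

```python
import struct


def decode_afr_xattr(hex_value: str) -> dict:
    """Split a 12-byte trusted.afr.* hex value into its three counters.

    Assumed layout: three unsigned 32-bit big-endian integers for
    pending data, metadata and entry operations.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output on the two bricks.
    print(decode_afr_xattr("0x000000060000000000000000"))
    print(decode_afr_xattr("0x000000000000000000000000"))
```

With the values above, client-8 on the first brick decodes to data=6,
metadata=0, entry=0, while client-1 on the second brick is all zeros.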


> Regards,
>
   Abhishek

>