[Gluster-devel] Query on healing process
Ravishankar N
ravishankar at redhat.com
Mon Mar 14 08:07:43 UTC 2016
On 03/14/2016 10:36 AM, ABHISHEK PALIWAL wrote:
> Hi Ravishankar,
>
> I just want to point out that this file has different properties
> from other files: it has a fixed size, and when there is no space
> left in the file the next data starts wrapping from
> the top of the file.
>
> That means we are also wrapping the data in this file.
>
> So I just want to know: will this property of the file affect how gluster
> identifies split-brain, or the xattr attributes?
Hi,
No, it shouldn't matter at what offset the writes happen. The xattrs only
track that a write was missed (and that a heal is therefore pending),
irrespective of (offset, length).
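
To make that concrete, here is a minimal sketch of how a trusted.afr.*
value from `getfattr -e hex` can be read. It assumes the usual AFR
changelog layout of three 32-bit big-endian counters (data, metadata,
entry); the function name decode_afr_pending is only illustrative and is
not part of gluster.

```python
import struct


def decode_afr_pending(hex_value: str) -> dict:
    """Split a trusted.afr.* value (as printed by `getfattr -e hex`)
    into its pending data / metadata / entry counters.

    Assumes the 12-byte layout: three 32-bit big-endian integers.
    """
    # Drop the leading "0x" that getfattr prints in hex mode.
    if hex_value.lower().startswith("0x"):
        hex_value = hex_value[2:]
    raw = bytes.fromhex(hex_value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Value of trusted.afr.c_glusterfs-client-8 quoted further down this thread.
print(decode_afr_pending("0x000000060000000000000000"))
```

The counters only say how many operations are pending against the other
brick. Nothing in the value records where in the file the writes landed,
so a file that wraps around is accounted for like any other.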
Ravi
>
> Regards,
> Abhishek
>
> On Fri, Mar 4, 2016 at 7:00 PM, ABHISHEK PALIWAL
> <abhishpaliwal at gmail.com> wrote:
>
>
>
> On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N
> <ravishankar at redhat.com> wrote:
>
> On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:
>>
>>
>> Ok, just to confirm, glusterd and other brick processes
>> are running after this node rebooted?
>> When you run the above command, you need to check the
>> /var/log/glusterfs/glfsheal-volname.log logs for errors.
>> Setting client-log-level to DEBUG would give you more
>> verbose messages.
>>
>> Yes, glusterd and the other brick processes are running fine. I have
>> checked the /var/log/glusterfs/glfsheal-volname.log file
>> without log-level=DEBUG. Here are the logs from that file:
>>
>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll:
>> Started thread with index 1
>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs:
>> could not open the file
>> /proc/sys/net/ipv4/ip_local_reserved_ports for getting
>> reserved ports info [No such file or directory]
>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs:
>> Not able to get reserved ports, hence there is a possibility
>> that glusterfs may consume reserved port
>> [2016-03-02 13:51:39.072583] E
>> [socket.c:2278:socket_connect_finish] 0-gfapi: connection to
>> 127.0.0.1:24007 failed (Connection refused)
>
> Not sure why ^^ occurs. You could try flushing iptables
> (iptables -F), restarting glusterd and running the heal info command
> again.
>
>
> No hint from the logs? I'll try your suggestion.
>
>
>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to
>> connect with remote-host: localhost (Transport endpoint is
>> not connected) [Transport endpoint is not connected]
>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all
>> volfile servers [Transport endpoint is not connected]
>>
>>> # gluster volume heal c_glusterfs info split-brain
>>> c_glusterfs: Not able to fetch volfile from glusterd
>>> Volume heal failed.
>>
>>>
>>>
>>> Based on your observation I understand that this
>>> is not a split-brain problem, but *is there any way
>>> to find a file which is not in
>>> split-brain but is also not in sync?*
>>
>> `gluster volume heal c_glusterfs info split-brain`
>> should give you files that need heal.
>>
>
> Sorry I meant 'gluster volume heal c_glusterfs info' should
> give you the files that need heal and 'gluster volume heal
> c_glusterfs info split-brain' the list of files in split-brain.
> The commands are detailed in
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>
>
> Yes, I have tried this as well. It also reports "Number of entries: 0",
> which means no healing is required, but the file
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> is not in sync: the two bricks show different versions of this
> file.
>
> You can see it in the getfattr output as well.
>
>
> # getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
> trusted.afr.c_glusterfs-client-8=0x000000060000000000000000
> // because client-8 is the latest client in our case, and the leading
> // 8 digits (00000006) say that there is something in the changelog data.
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> # lhsh 002500 getfattr -m . -d -e hex
> /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>
> getfattr: Removing leading '/' from absolute path names
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
> // and here we can say that there is no split-brain but the file is
> // out of sync
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000001156d86c290005735c
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> Regards,
>
> Abhishek