[Gluster-users] Problem with self-heal

Milos Kozak milos.kozak at lejmr.com
Tue Jul 15 13:09:32 UTC 2014


I read your answer, but I don't know how to create my own RPM files,
because I don't want to install it directly onto the system. Is there a manual?

On 7/15/2014 8:34 AM, Ravishankar N wrote:
> On 07/15/2014 05:47 PM, Milos Kozak wrote:
>> Hi,
>> Yesterday I tried to replicate the error, but I didn't manage to,
>> so I started to wonder whether it was a bad call on my part.
>>
>> I read the following links, so I would like to ask: does it mean
>> that this bug is caused by a very fast recovery of the connection? Or
>> do other things come into play? I am running 3.5.1 on production
>> servers for less important stuff, and one server there went down
>> this weekend. Afterwards the heal process was totally fine. The real
>> server takes nearly 5 minutes to boot. Does that mean this was the
>> reason why I didn't experience this bug?
>>
>
> Yes, it happened when the client quickly reconnected before the server
> had a chance to discard the stale inode and fd tables. Hope you got a
> chance to look at my comment in the BZ [1].
> Thanks,
> Ravi
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1115748#c16
>
>
>>
>> When can we expect Gluster 3.5.2 to be released?
>>
>> Thanks Milos
>>
>>
>>
>>
>> On 7/13/2014 10:23 PM, Ravishankar N wrote:
>>> On 07/13/2014 09:05 PM, Miloš Kozák wrote:
>>>> Hi, I would like to ask about the progress. Nothing new has been
>>>> added to the ticket.
>>>>
>>>
>>>
>>> I haven't had a chance to look at the logs/ reproduce the bug. Will get
>>> to it in a couple of days.
>>> Thanks,
>>> Ravi
>>>
>>>
>>>> Thanks, Milos
>>>>
>>>>
>>>>
>>>> On 14-07-02 11:37 PM, Miloš Kozák wrote:
>>>>> Submitted: 1115748
>>>>>
>>>>> Milos
>>>>>
>>>>> On 14-07-02 11:40 AM, Vijay Bellur wrote:
>>>>>> On 07/02/2014 06:15 PM, Milos Kozak wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am going to replicate the problem on a clean Gluster
>>>>>>> configuration later today. So far my answers are below.
>>>>>>>
>>>>>>> On 7/2/2014 1:38 AM, Ravishankar N wrote:
>>>>>>>> On 07/02/2014 02:28 AM, Miloš Kozák wrote:
>>>>>>>>> Hi,
>>>>>>>>> I am running some tests on top of v3.5.1 in my 2-node
>>>>>>>>> configuration, with one disk each, in replica 2 mode.
>>>>>>>>>
>>>>>>>>> I have two servers connected by a cable, and glusterd
>>>>>>>>> communicates over this cable. I start dd to create a relatively
>>>>>>>>> large file. In the middle of the write I disconnect the cable,
>>>>>>>>> so when writing has finished I can see all the data on one
>>>>>>>>> server (node1) and just part of the file on the other one
>>>>>>>>> (node2)
>>>>>>>>
>>>>>>>> Does this mean your client (mount point) is also on node 1?
>>>>>>>
>>>>>>> Yes, I mounted the volume on both servers as follows:
>>>>>>> localhost:vg0    /mnt
>>>>>>>
>>>>>>>>> ... no surprise so far.
>>>>>>>>>
>>>>>>>>> Then I put the cable back. After a while peers are discovered,
>>>>>>>>> self-healing daemons start to communicate, so I can see:
>>>>>>>>>
>>>>>>>>> gluster volume heal vg0 info
>>>>>>>>> Brick node1:/dist1/brick/fs/
>>>>>>>>> /node-middle - Possibly undergoing heal
>>>>>>>>> Number of entries: 1
>>>>>>>>>
>>>>>>>>> Brick node2:/dist1/brick/fs/
>>>>>>>>> /node-middle - Possibly undergoing heal
>>>>>>>>> Number of entries: 1
>>>>>>>>>
>>>>>>>>> But no data is moving over the network, which I verified
>>>>>>>>> with df.
>>>>>>>>>
>>>>>>>> When you get "Possibly undergoing heal" and no I/O is going on
>>>>>>>> from the client, it means the self-heal daemon is healing the
>>>>>>>> file. Can you check if there are messages in glustershd.log of
>>>>>>>> node1 about self-heal completion?
>>>>>>>
>>>>>>> There are no such lines in the log, which is why I ended up
>>>>>>> writing this email.
>>>>>>>
>>>>>>>>> Any help? In my opinion the nodes should be synchronized after
>>>>>>>>> a while, but after 20 minutes of waiting still nothing (the
>>>>>>>>> file was 2 GB)
>>>>>>>> Does gluster volume status show all processes being online?
>>>>>>>
>>>>>>> All processes are running.
>>>>>>>
>>>>>>
>>>>>> Output of strace -f -p <self-heal-daemon pid> from both nodes might
>>>>>> also help.
>>>>>>
>>>>>> Thanks,
>>>>>> Vijay
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
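
The "gluster volume heal vg0 info" output quoted earlier in this thread can
be checked mechanically rather than by eye. The following is only a minimal
sketch, assuming the output keeps the layout shown in the thread ("Brick ..."
header lines, one entry per line with an optional " - status" suffix, and a
"Number of entries:" footer). The function names parse_heal_info and
pending_paths are hypothetical helpers, not part of Gluster, and the sample
text is copied from the thread rather than captured from a live system.

```python
def parse_heal_info(text):
    """Map each brick to a list of (path, status) tuples reported under it."""
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            # Start of a new per-brick section.
            current = line[len("Brick "):]
            bricks[current] = []
        elif line.startswith("Number of entries:"):
            # Footer line; the count is implied by the collected entries.
            continue
        elif current is not None:
            # Assumption: the status note, if present, follows the last " - ".
            if " - " in line:
                path, _, status = line.rpartition(" - ")
            else:
                path, status = line, ""
            bricks[current].append((path, status))
    return bricks


def pending_paths(text):
    """Return the sorted set of paths listed under any brick."""
    parsed = parse_heal_info(text)
    return sorted({path for entries in parsed.values() for path, _ in entries})


# Sample taken from the output quoted in this thread.
SAMPLE = """\
Brick node1:/dist1/brick/fs/
/node-middle - Possibly undergoing heal
Number of entries: 1

Brick node2:/dist1/brick/fs/
/node-middle - Possibly undergoing heal
Number of entries: 1
"""

if __name__ == "__main__":
    for brick, entries in parse_heal_info(SAMPLE).items():
        print(brick, entries)
```

On a live node the text could instead be obtained by running the same
command through Python's subprocess module and passing its standard output
to parse_heal_info. Polling pending_paths periodically would show whether
the list ever drains, which is the symptom discussed in this thread.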


