[Gluster-users] Copy operation freezes. Lots of locks in state BLOCKED (3-node setup with 1 arbiter)
Pranith Kumar Karampuri
pkarampu at redhat.com
Wed Nov 4 23:18:46 UTC 2015
On 11/04/2015 09:10 PM, Adrian Gruntkowski wrote:
> Hello,
>
> I have applied Pranith's patch myself on the current 3.7.5 release and
> rebuilt the packages. Unfortunately, the issue is still there :( It
> behaves exactly the same.
Could you get the statedumps of the bricks again? I will take a look.
Maybe the hang I observed is different from the one you are observing, and
I only fixed the one I observed.
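
(For reference, the brick statedumps can be regenerated on each node with
something along these lines; this sketch assumes the dumps land in the
default /var/run/gluster directory, i.e. server.statedump-path has not been
changed on your bricks:)

# gluster volume statedump system_www1
# ls /var/run/gluster/*.dump.*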
Pranith
>
> Regards,
> Adrian
>
> 2015-10-28 12:02 GMT+01:00 Pranith Kumar Karampuri
> <pkarampu at redhat.com>:
>
>
>
> On 10/28/2015 04:27 PM, Adrian Gruntkowski wrote:
>> Hello Pranith,
>>
>> Thank you for the prompt reaction. I didn't get back to this until
>> now because I had other problems to deal with.
>>
>> Is there a chance it will be released this or next month?
>> If not, I will probably have to resort to compiling it on my own.
> I am planning to get this in for 3.7.6, which is to be released by the
> end of this month, i.e. in 4-5 days :-). I will update you.
>
> Pranith
>
>>
>> Regards,
>> Adrian
>>
>>
>> 2015-10-26 12:37 GMT+01:00 Pranith Kumar Karampuri
>> <pkarampu at redhat.com>:
>>
>>
>>
>> On 10/23/2015 10:10 AM, Ravishankar N wrote:
>>>
>>>
>>> On 10/21/2015 05:55 PM, Adrian Gruntkowski wrote:
>>>> Hello,
>>>>
>>>> I'm trying to track down a problem with my setup (version
>>>> 3.7.3 on Debian stable).
>>>>
>>>> I have a couple of volumes set up in a 3-node configuration
>>>> with 1 brick as an arbiter for each.
>>>>
>>>> There are 4 volumes set up criss-cross across the 3 physical
>>>> servers, like this:
>>>>
>>>>
>>>>
>>>>                             [ GigabitEthernet switch ]
>>>>             ^                           ^                           ^
>>>>             |                           |                           |
>>>>             v                           v                           v
>>>> /------------------------\  /------------------------\  /------------------------\
>>>> | web-rep                |  | cluster-rep            |  | mail-rep               |
>>>> |                        |  |                        |  |                        |
>>>> | vols:                  |  | vols:                  |  | vols:                  |
>>>> | system_www1            |  | system_www1            |  | system_www1(arbiter)   |
>>>> | data_www1              |  | data_www1              |  | data_www1(arbiter)     |
>>>> | system_mail1(arbiter)  |  | system_mail1           |  | system_mail1           |
>>>> | data_mail1(arbiter)    |  | data_mail1             |  | data_mail1             |
>>>> \------------------------/  \------------------------/  \------------------------/
>>>>
>>>>
>>>> Now, after a fresh boot-up, everything seems to be running
>>>> fine.
>>>> Then I start copying big files (KVM disk images) from the local
>>>> disk to the gluster mounts.
>>>> In the beginning the copy seems to run fine (although
>>>> iowait goes so high that it clogs up I/O operations
>>>> at some moments, but that's an issue for later). After some
>>>> time the transfer freezes, then
>>>> after some (long) time it advances in a short burst, only to
>>>> freeze again. Another interesting thing is that
>>>> I see a constant flow of network traffic on the interfaces
>>>> dedicated to gluster, even when there's a "freeze".
>>>>
>>>> I have done "gluster volume statedump" at that time of
>>>> transfer (file is copied from local disk on cluster-rep
>>>> onto local mount of "system_www1" volume). I've observer a
>>>> following section in the dump for cluster-rep node:
>>>>
>>>> [xlator.features.locks.system_www1-locks.inode]
>>>> path=/images/101/vm-101-disk-1.qcow2
>>>> mandatory=0
>>>> inodelk-count=12
>>>> lock-dump.domain.domain=system_www1-replicate-0:self-heal
>>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 18446744073709551610, owner=c811600cd67f0000,
>>>> client=0x7fbe100df280,
>>>> connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,
>>>> granted at 2015-10-21 11:36:22
>>>> lock-dump.domain.domain=system_www1-replicate-0
>>>> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
>>>> start=2195849216, len=131072, pid = 18446744073709551610,
>>>> owner=c811600cd67f0000, client=0x7fbe100df280,
>>>> connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,
>>>> granted at 2015-10-21 11:37:45
>>>> inodelk.inodelk[1](ACTIVE)=type=WRITE, whence=0,
>>>> start=9223372036854775805, len=1, pid =
>>>> 18446744073709551610, owner=c811600cd67f0000,
>>>> client=0x7fbe100df280,
>>>> connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,
>>>> granted at 2015-10-21 11:36:22
>>>
>>> From the statedump, it looks like the self-heal daemon had taken
>>> locks to heal the file, due to which the locks attempted by
>>> the client (mount) are in a blocked state.
>>> In arbiter volumes the client (mount) takes full locks
>>> (start=0, len=0) for every write(), as opposed to normal
>>> replica volumes, which take range locks (i.e. appropriate
>>> start,len values) for that write(). This is done to avoid
>>> network split-brains.
>>> So in normal replica volumes, clients can still write to a
>>> file while a heal is going on, as long as the offsets don't
>>> overlap. This is not the case with arbiter volumes.
>>> You can look at the client or glustershd logs to see if
>>> there are messages that indicate healing of a file,
>>> something along the lines of "Completed data selfheal on xxx".
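>>>
>>> (For example, assuming the default log location on Debian and that the
>>> self-heal daemon is the one doing the heal, something like this should
>>> show it, and heal info will tell you whether entries are still pending:)
>>>
>>> # grep "Completed data selfheal" /var/log/glusterfs/glustershd.log
>>> # gluster volume heal system_www1 info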
>> Hi Adrian,
>> Thanks for taking the time to send this mail. I raised
>> this as a bug at
>> https://bugzilla.redhat.com/show_bug.cgi?id=1275247; the fix is
>> posted for review at http://review.gluster.com/#/c/12426/
>>
>> Pranith
>>
>>>
>>>> inodelk.inodelk[2](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=c4fd2d78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[3](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=dc752e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[4](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=34832e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[5](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=d44d2e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[6](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=306f2e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[7](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=8c902e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[8](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=782c2e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[9](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=1c0b2e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>> inodelk.inodelk[10](BLOCKED)=type=WRITE, whence=0, start=0,
>>>> len=0, pid = 0, owner=24332e78487f0000,
>>>> client=0x7fbe100e1380,
>>>> connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,
>>>> blocked at 2015-10-21 11:37:45
>>>>
>>>> There seem to be multiple locks in the BLOCKED state, which
>>>> doesn't look normal to me. The other 2 nodes have
>>>> only 2 ACTIVE locks at the same time.
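>>>>
>>>> (As a rough comparison between the nodes, assuming the dumps end up in
>>>> the default /var/run/gluster directory, something like this counts the
>>>> blocked and active locks per statedump file:)
>>>>
>>>> # grep -c BLOCKED /var/run/gluster/*.dump.*
>>>> # grep -c ACTIVE /var/run/gluster/*.dump.*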
>>>>
>>>> Below is "gluster volume info" output.
>>>>
>>>> # gluster volume info
>>>>
>>>> Volume Name: data_mail1
>>>> Type: Replicate
>>>> Volume ID: fc3259a1-ddcf-46e9-ae77-299aaad93b7c
>>>> Status: Started
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: cluster-rep:/GFS/data/mail1
>>>> Brick2: mail-rep:/GFS/data/mail1
>>>> Brick3: web-rep:/GFS/data/mail1
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> cluster.quorum-count: 2
>>>> cluster.quorum-type: fixed
>>>> cluster.server-quorum-ratio: 51%
>>>>
>>>> Volume Name: data_www1
>>>> Type: Replicate
>>>> Volume ID: 0c37a337-dbe5-4e75-8010-94e068c02026
>>>> Status: Started
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: cluster-rep:/GFS/data/www1
>>>> Brick2: web-rep:/GFS/data/www1
>>>> Brick3: mail-rep:/GFS/data/www1
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> cluster.quorum-type: fixed
>>>> cluster.quorum-count: 2
>>>> cluster.server-quorum-ratio: 51%
>>>>
>>>> Volume Name: system_mail1
>>>> Type: Replicate
>>>> Volume ID: 0568d985-9fa7-40a7-bead-298310622cb5
>>>> Status: Started
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: cluster-rep:/GFS/system/mail1
>>>> Brick2: mail-rep:/GFS/system/mail1
>>>> Brick3: web-rep:/GFS/system/mail1
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> cluster.quorum-type: none
>>>> cluster.quorum-count: 2
>>>> cluster.server-quorum-ratio: 51%
>>>>
>>>> Volume Name: system_www1
>>>> Type: Replicate
>>>> Volume ID: 147636a2-5c15-4d9a-93c8-44d51252b124
>>>> Status: Started
>>>> Number of Bricks: 1 x 3 = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: cluster-rep:/GFS/system/www1
>>>> Brick2: web-rep:/GFS/system/www1
>>>> Brick3: mail-rep:/GFS/system/www1
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> cluster.quorum-type: none
>>>> cluster.quorum-count: 2
>>>> cluster.server-quorum-ratio: 51%
>>>>
>>>> The issue does not occur when I get rid of the 3rd arbiter brick.
>>>
>>> What do you mean by 'getting rid of'? Killing the 3rd brick
>>> process of the volume?
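>>> (i.e., something like looking up the brick PID with volume status and
>>> killing just that brick process, where <pid-of-the-arbiter-brick> is a
>>> placeholder for the PID shown in the status output:)
>>>
>>> # gluster volume status system_www1
>>> # kill <pid-of-the-arbiter-brick>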
>>>
>>> Regards,
>>> Ravi
>>>>
>>>> If there's any additional information that I could
>>>> provide, please let me know.
>>>>
>>>> Greetings,
>>>> Adrian
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>