<div dir="auto">Hi,<div dir="auto">     Did you find the command from strace?</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 25 Jan 2018 1:52 pm, &quot;Pranith Kumar Karampuri&quot; &lt;<a href="mailto:pkarampu@redhat.com">pkarampu@redhat.com</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jan 25, 2018 at 1:49 PM, Samuli Heinonen <span dir="ltr">&lt;<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="m_-746125927662200449gmail-">Pranith Kumar Karampuri kirjoitti 25.01.2018 07:09:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On Thu, Jan 25, 2018 at 2:27 AM, Samuli Heinonen<br>
&lt;<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt; wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Hi!<br>
<br>
Thank you very much for your help so far. Could you please give an<br>
example command showing how to use aux-gfid-mount to remove locks?<br>
&quot;gluster vol clear-locks&quot; seems to mount the volume by itself.<br>
</blockquote>
<br>
You are correct, sorry; this was implemented around 7 years back and I<br>
forgot that bit about it :-(. Essentially it becomes a getxattr<br>
syscall on the file.<br>
Could you give me the clear-locks command you were trying to execute,<br>
so that I can probably convert it to the getfattr command?<br>
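Roughly, the sequence would look like the following. This is only a sketch: the exact virtual xattr key name (the `glusterfs.clrlk.t<type>.k<kind>` form below) is from memory and is an assumption, so confirm it first by watching what glusterd actually issues.

```shell
# Sketch only: the xattr key format below is an assumption; confirm it
# with strace before relying on it. First mount with the gfid aux-mount:
mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol

# Watch which getxattr key glusterd issues while clear-locks runs
# (look for a key containing "glusterfs.clrlk"):
strace -f -e trace=getxattr,lgetxattr -p "$(pidof glusterd)" &
gluster volume clear-locks test /.gfid/<gfid-string> kind all inode

# The equivalent manual call would then look something like:
getfattr -n glusterfs.clrlk.tinode.kall /mnt/testvol/.gfid/<gfid-string>
```

The strace step is the reliable part: whatever key shows up there is the one to pass to getfattr on the aux-gfid mount.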
</blockquote>
<br></span>
I have been testing this in a test environment with the command:<br>
gluster vol clear-locks g1 /.gfid/14341ccb-df7b-4f92-90d5-7814431c5a1c kind all inode<br></blockquote><div><br></div><div>Could you do an strace of glusterd when this happens? It will show a getxattr with &quot;glusterfs.clrlk&quot; in the key. You need to execute that on the gfid aux-mount.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5">
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Best regards,<br>
Samuli Heinonen<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Pranith Kumar Karampuri &lt;mailto:<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;<br>
23 January 2018 at 10.30<br>
<br>
On Tue, Jan 23, 2018 at 1:38 PM, Samuli Heinonen<br>
&lt;<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a> &lt;mailto:<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt;<wbr>&gt; wrote:<br>
<br>
Pranith Kumar Karampuri kirjoitti 23.01.2018 09:34:<br>
<br>
On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen<br>
<br>
&lt;<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a> &lt;mailto:<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt;<wbr>&gt;<br>
wrote:<br>
<br>
Hi again,<br>
<br>
here is more information regarding issue described<br>
earlier<br>
<br>
It looks like self-healing is stuck. According to &quot;heal<br>
statistics&quot;, the crawl began at Sat Jan 20 12:56:19 2018 and it&#39;s<br>
still going on (it&#39;s around Sun Jan 21 20:30 as I write this).<br>
However, glustershd.log says that the last heal was completed at<br>
&quot;2018-01-20 11:00:13.090697&quot; (which is 13:00 UTC+2). Also, &quot;heal<br>
info&quot; has been running for over 16 hours now without printing any<br>
information. In the statedump I can see that the storage nodes hold<br>
locks on files, and some of those are blocked. I.e., here again it<br>
says that ovirt8z2 is holding an active lock even though ovirt8z2<br>
crashed after the lock was granted:<br>
<br>
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]<br>
path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27<br>
mandatory=0<br>
inodelk-count=3<br>
<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal<br>
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,<br>
</blockquote>
<br>
</blockquote>
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,<br>
</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5">
<br>
granted at 2018-01-20 10:59:52<br>
<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0<br>
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,<br>
<br></div></div>
connection-id=<a href="http://ovirt8z2.xxx.com" rel="noreferrer" target="_blank">ovirt8z2.xxx.com</a> [1]<br>
<br>
<br>
</blockquote>
<br>
</blockquote><div><div class="m_-746125927662200449gmail-h5">
&lt;<a href="http://ovirt8z2.xxx.com" rel="noreferrer" target="_blank">http://ovirt8z2.xxx.com</a>&gt;-5652<wbr>-2017/12/27-09:49:02:946825-zo<wbr>ne2-ssd1-vmstor1-client-0-7-0,<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
granted at 2018-01-20 08:57:23<br>
inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0,<br>
</blockquote>
<br>
</blockquote>
connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0,<br>
</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5">
<br>
blocked at 2018-01-20 10:59:52<br>
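As an aside, the inodelk records in dumps like the one above are regular enough to pick apart mechanically. The helper below is hypothetical (it is not part of Gluster) and simply splits out the state, pid, owner and connection-id, tolerating the line wrapping that mail clients introduce:

```python
import re

# Hypothetical helper, not part of Gluster: parse one inodelk record
# from a statedump, tolerating mail-client line wrapping.
INODELK_RE = re.compile(
    r"inodelk\.inodelk\[(?P<idx>\d+)\]\((?P<state>ACTIVE|BLOCKED)\)="
    r"type=(?P<type>\w+), whence=(?P<whence>\d+), start=(?P<start>\d+), "
    r"len=(?P<len>\d+), pid = (?P<pid>\d+), owner=(?P<owner>[0-9a-f]+), "
    r"client=(?P<client>0x[0-9a-f]+), connection-id=(?P<conn>[^,]+), "
    r"(?P<when>granted|blocked) at (?P<ts>[\d:\- ]+)"
)

def parse_inodelk(text):
    """Return the record's fields as a dict, or None if nothing matches."""
    # Rejoin wrapped lines before matching.
    m = INODELK_RE.search(" ".join(text.split()))
    return m.groupdict() if m else None
```

A record whose connection-id names a client that is no longer connected (like ovirt8z2 above) is then easy to flag as a candidate for clearing.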
<br>
I&#39;d also like to add that the volume had an arbiter brick before the<br>
crash happened. We decided to remove it because we thought it was<br>
causing issues; however, now I think that this was unnecessary. After<br>
the crash the arbiter logs had lots of messages like this:<br>
[2018-01-20 10:19:36.515717] I [MSGID: 115072]<br>
[server-rpc-fops.c:1640:server_setattr_cbk]<br>
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR<br>
&lt;gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe&gt;<br>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==&gt; (Operation not permitted)<br>
[Operation not permitted]<br>
<br>
Is there any way to force self-heal to stop? Any help would be very<br>
much appreciated :)<br>
<br>
Exposing .shard to a normal mount is opening a can of worms. You<br>
should probably look at mounting the volume with the gfid aux-mount,<br>
where you can access a file as<br>
&lt;path-to-mount&gt;/.gfid/&lt;gfid-string&gt; to clear<br>
locks on it.<br>
<br>
Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol<br>
<br>
A gfid string will have some hyphens, like:<br>
11118443-1894-4273-9340-4b212fa1c0e4<br>
<br>
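For example, with that option in place a file becomes addressable directly by its gfid (volume name and mountpoint below are the illustrative ones from this thread):

```shell
# Mount with the gfid aux-mount enabled (vm1:test and /mnt/testvol are
# the illustrative names used above):
mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol

# Any file is then reachable by its gfid under the virtual .gfid/ tree:
stat /mnt/testvol/.gfid/11118443-1894-4273-9340-4b212fa1c0e4
```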
That said, the next disconnect on the brick where you successfully<br>
did the clear-locks will crash the brick. There was a bug in the 3.8.x<br>
series with clear-locks which was fixed in 3.9.0 as part of a feature.<br>
The self-heal deadlock that you witnessed is also fixed in the 3.10<br>
release.<br>
<br>
Thank you for the answer. Could you please tell us more about the<br>
crash? What will actually happen, or is there a bug report about it?<br>
We just want to make sure that we can do everything to secure the<br>
data on the bricks. We will look into upgrading, but we have to make<br>
sure that the new version works for us and, of course, get<br>
self-healing working before doing anything :)<br>
<br>
The locks xlator/module maintains a list of the locks granted to a<br>
client. Clear-locks had an issue where it forgot to remove the lock<br>
from this list, so the connection&#39;s list ends up pointing to data<br>
that was freed by the clear-lock. When a disconnect happens, all the<br>
locks granted to that client need to be unlocked, so the process<br>
starts traversing the list, and when it tries to access the freed<br>
data it crashes. I found this while reviewing a feature patch sent by<br>
the Facebook folks to the locks xlator (<a href="http://review.gluster.org/14816" rel="noreferrer" target="_blank">http://review.gluster.org/14816</a><br>
[2]) for 3.9.0, and they fixed this bug as part of<div><div class="m_-746125927662200449gmail-h5"><br>
that feature patch.<br>
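To make the sequence concrete, the lifecycle can be modelled schematically. The names below are invented for illustration; this is not Gluster's actual code (which is C inside the locks xlator), only the shape of the bug:

```python
# Purely illustrative model of the clear-locks bug (invented names).

class Lock:
    def __init__(self, path):
        self.path = path
        self.freed = False          # stands in for free()'d memory

class Client:
    def __init__(self):
        self.granted = []           # per-client list walked on disconnect

def grant(client, path):
    lk = Lock(path)
    client.granted.append(lk)
    return lk

def clear_lock_buggy(client, lk):
    lk.freed = True                 # lock memory is released...
    # BUG: lk is never unlinked from client.granted

def on_disconnect(client):
    # Disconnect must unlock everything still granted to the client.
    for lk in client.granted:
        if lk.freed:                # in C this is a use-after-free
            raise RuntimeError("use-after-free on " + lk.path)
    client.granted.clear()
```

The 3.9.0 fix presumably amounts to doing the unlink inside the clear step, so the disconnect walk never sees freed entries.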
<br>
Br,<br>
Samuli<br>
<br>
3.8.x is EOLed, so I recommend that you upgrade to a supported<br>
version soon.<br>
<br>
Best regards,<br>
Samuli Heinonen<br>
<br>
Samuli Heinonen<br>
20 January 2018 at 21.57<br>
<br>
Hi all!<br>
<br>
One hypervisor in our virtualization environment crashed, and now<br>
some of the VM images cannot be accessed. After investigation we<br>
found out that there were lots of images that still had an active<br>
lock held by the crashed hypervisor. We were able to remove locks<br>
from &quot;regular files&quot;, but it doesn&#39;t seem possible to remove locks<br>
from shards.<br>
<br>
We are running GlusterFS 3.8.15 on all nodes.<br>
<br>
Here is part of statedump that shows shard having<br>
active lock on<br>
crashed node:<br>
<br>
[xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]<br>
path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21<br>
mandatory=0<br>
inodelk-count=1<br>
<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal<br>
lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0<br>
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id<br>
<br>
<br>
<br>
</div></div></blockquote>
<br>
</blockquote>
ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0,<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5">
<br>
granted at 2018-01-20 08:57:24<br>
<br>
If we try to run clear-locks, we get the following error message:<br>
# gluster volume clear-locks zone2-ssd1-vmstor1<br>
/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode<br>
Volume clear-locks unsuccessful<br>
clear-locks getxattr command failed. Reason: Operation not permitted<br>
<br>
Gluster vol info if needed:<br>
Volume Name: zone2-ssd1-vmstor1<br>
Type: Replicate<br>
Volume ID: b6319968-690b-4060-8fff-b212d2295208<br>
Status: Started<br>
Snapshot Count: 0<br>
Number of Bricks: 1 x 2 = 2<br>
Transport-type: rdma<br>
Bricks:<br>
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export<br>
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export<br>
Options Reconfigured:<br>
cluster.shd-wait-qlength: 10000<br>
cluster.shd-max-threads: 8<br>
cluster.locking-scheme: granular<br>
performance.low-prio-threads: 32<br>
cluster.data-self-heal-algorithm: full<br>
performance.client-io-threads: off<br>
storage.linux-aio: off<br>
performance.readdir-ahead: on<br>
client.event-threads: 16<br>
server.event-threads: 16<br>
performance.strict-write-ordering: off<br>
performance.quick-read: off<br>
performance.read-ahead: on<br>
performance.io-cache: off<br>
performance.stat-prefetch: off<br>
cluster.eager-lock: enable<br>
network.remote-dio: on<br>
cluster.quorum-type: none<br>
network.ping-timeout: 22<br>
performance.write-behind: off<br>
nfs.disable: on<br>
features.shard: on<br>
features.shard-block-size: 512MB<br>
storage.owner-uid: 36<br>
storage.owner-gid: 36<br>
performance.io-thread-count: 64<br>
performance.cache-size: 2048MB<br>
performance.write-behind-window-size: 256MB<br>
server.allow-insecure: on<br>
cluster.ensure-durability: off<br>
config.transport: rdma<br>
server.outstanding-rpc-limit: 512<br>
diagnostics.brick-log-level: INFO<br>
<br>
Any recommendations on how to proceed from here?<br>
<br>
Best regards,<br>
Samuli Heinonen<br>
<br>
______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
&lt;mailto:<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.<wbr>org</a>&gt;<br>
<br>
</div></div><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a> [3]<br>
&lt;<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mail<wbr>man/listinfo/gluster-users</a> [3]&gt;<span class="m_-746125927662200449gmail-"><br>
[1]<br>
<br>
______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
&lt;mailto:<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.<wbr>org</a>&gt;<br>
<br>
</span><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a> [3]<br>
<br>
&lt;<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mail<wbr>man/listinfo/gluster-users</a> [3]&gt; [1]<span class="m_-746125927662200449gmail-"><br>
<br>
--<br>
<br>
Pranith<br>
<br>
Links:<br>
------<br>
[1]<br>
</span><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a> [3]<br>
&lt;<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mail<wbr>man/listinfo/gluster-users</a><br>
[3]&gt;<div><div class="m_-746125927662200449gmail-h5"><br>
<br>
--<br>
Pranith<br>
Samuli Heinonen &lt;mailto:<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt;<br>
21 January 2018 at 21.03<br>
Hi again,<br>
<br>
here is more information regarding issue described earlier<br>
<br>
It looks like self-healing is stuck. According to &quot;heal<br>
statistics&quot;, the crawl began at Sat Jan 20 12:56:19 2018 and it&#39;s still<br>
going on (it&#39;s around Sun Jan 21 20:30 as I write this). However,<br>
glustershd.log says that the last heal was completed at &quot;2018-01-20<br>
11:00:13.090697&quot; (which is 13:00 UTC+2). Also, &quot;heal info&quot; has been<br>
running for over 16 hours now without printing any information. In the<br>
statedump I can see that the storage nodes hold locks on files, and<br>
some of those are blocked. I.e., here again it says that ovirt8z2 is<br>
holding an active lock even though ovirt8z2 crashed after the lock was<br>
granted:<br>
<br>
[xlator.features.locks.zone2-s<wbr>sd1-vmstor1-locks.inode]<br>
path=/.shard/3d55f8cc-cda9-489<wbr>a-b0a3-fd0f43d67876.27<br>
mandatory=0<br>
inodelk-count=3<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0:self-<wbr>heal<br>
inodelk.inodelk[0](ACTIVE)=typ<wbr>e=WRITE, whence=0, start=0, len=0,<br>
pid = 18446744073709551610, owner=d0c6d857a87f0000,<br>
client=0x7f885845efa0,<br>
<br>
</div></div></blockquote>
<br>
</blockquote>
connection-id=sto2z2.xxx-10975<wbr>-2018/01/20-10:56:14:649541-zo<wbr>ne2-ssd1-vmstor1-client-0-0-0,<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
granted at 2018-01-20 10:59:52<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0:metad<wbr>ata<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0<br>
inodelk.inodelk[0](ACTIVE)=typ<wbr>e=WRITE, whence=0, start=0, len=0,<br>
pid = 3420, owner=d8b9372c397f0000, client=0x7f8858410be0,<br>
connection-id=<a href="http://ovirt8z2.xxx.com" rel="noreferrer" target="_blank">ovirt8z2.xxx.com</a><br>
<br>
</blockquote></div></div>
[1]-5652-2017/12/27-09:49:02:9<wbr>46825-zone2-ssd1-vmstor1-clien<wbr>t-0-7-0,<span class="m_-746125927662200449gmail-"><br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
granted at 2018-01-20 08:57:23<br>
inodelk.inodelk[1](BLOCKED)=ty<wbr>pe=WRITE, whence=0, start=0, len=0,<br>
pid = 18446744073709551610, owner=d0c6d857a87f0000,<br>
client=0x7f885845efa0,<br>
<br>
</blockquote>
<br>
</span></blockquote>
connection-id=sto2z2.xxx-10975<wbr>-2018/01/20-10:56:14:649541-zo<wbr>ne2-ssd1-vmstor1-client-0-0-0,<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="m_-746125927662200449gmail-">
blocked at 2018-01-20 10:59:52<br>
<br>
I&#39;d also like to add that the volume had an arbiter brick before the<br>
crash happened. We decided to remove it because we thought it was<br>
causing issues; however, now I think that this was unnecessary.<br>
After the crash the arbiter logs had lots of messages like this:<br>
[2018-01-20 10:19:36.515717] I [MSGID: 115072]<br>
[server-rpc-fops.c:1640:server_setattr_cbk]<br>
0-zone2-ssd1-vmstor1-server: 37374187: SETATTR<br>
&lt;gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe&gt;<br>
(a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==&gt; (Operation not<br>
permitted) [Operation not permitted]<br>
<br>
Is there any way to force self-heal to stop? Any help would be<br>
very much appreciated :)<br>
<br></span><span class="m_-746125927662200449gmail-">
Best regards,<br>
Samuli Heinonen<br>
<br>
______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
</span><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a> [3]<div><div class="m_-746125927662200449gmail-h5"><br>
Samuli Heinonen &lt;mailto:<a href="mailto:samppah@neutraali.net" target="_blank">samppah@neutraali.net</a>&gt;<br>
<br>
20 January 2018 at 21.57<br>
Hi all!<br>
<br>
One hypervisor in our virtualization environment crashed, and now<br>
some of the VM images cannot be accessed. After investigation we<br>
found out that there were lots of images that still had an active lock<br>
held by the crashed hypervisor. We were able to remove locks from &quot;regular<br>
files&quot;, but it doesn&#39;t seem possible to remove locks from shards.<br>
<br>
We are running GlusterFS 3.8.15 on all nodes.<br>
<br>
Here is part of statedump that shows shard having active lock on<br>
crashed node:<br>
[xlator.features.locks.zone2-s<wbr>sd1-vmstor1-locks.inode]<br>
path=/.shard/75353c17-d6b8-485<wbr>d-9baf-fd6c700e39a1.21<br>
mandatory=0<br>
inodelk-count=1<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0:metad<wbr>ata<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0:self-<wbr>heal<br>
lock-dump.domain.domain=zone2-<wbr>ssd1-vmstor1-replicate-0<br>
inodelk.inodelk[0](ACTIVE)=typ<wbr>e=WRITE, whence=0, start=0, len=0,<br>
pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770,<br>
connection-id<br>
<br>
</div></div></blockquote>
<br>
</blockquote>
ovirt8z2.xxx-5652-2017/12/27-0<wbr>9:49:02:946825-zone2-ssd1-vmst<wbr>or1-client-1-7-0,<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="m_-746125927662200449gmail-h5">
granted at 2018-01-20 08:57:24<br>
<br>
If we try to run clear-locks, we get the following error message:<br>
# gluster volume clear-locks zone2-ssd1-vmstor1<br>
/.shard/75353c17-d6b8-485d-9ba<wbr>f-fd6c700e39a1.21 kind all inode<br>
Volume clear-locks unsuccessful<br>
clear-locks getxattr command failed. Reason: Operation not<br>
permitted<br>
<br>
Gluster vol info if needed:<br>
Volume Name: zone2-ssd1-vmstor1<br>
Type: Replicate<br>
Volume ID: b6319968-690b-4060-8fff-b212d2<wbr>295208<br>
Status: Started<br>
Snapshot Count: 0<br>
Number of Bricks: 1 x 2 = 2<br>
Transport-type: rdma<br>
Bricks:<br>
Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1<wbr>/export<br>
Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1<wbr>/export<br>
Options Reconfigured:<br>
cluster.shd-wait-qlength: 10000<br>
cluster.shd-max-threads: 8<br>
cluster.locking-scheme: granular<br>
performance.low-prio-threads: 32<br>
cluster.data-self-heal-algorit<wbr>hm: full<br>
performance.client-io-threads: off<br>
storage.linux-aio: off<br>
performance.readdir-ahead: on<br>
client.event-threads: 16<br>
server.event-threads: 16<br>
performance.strict-write-order<wbr>ing: off<br>
performance.quick-read: off<br>
performance.read-ahead: on<br>
performance.io-cache: off<br>
performance.stat-prefetch: off<br>
cluster.eager-lock: enable<br>
network.remote-dio: on<br>
cluster.quorum-type: none<br>
network.ping-timeout: 22<br>
performance.write-behind: off<br>
nfs.disable: on<br>
features.shard: on<br>
features.shard-block-size: 512MB<br>
storage.owner-uid: 36<br>
storage.owner-gid: 36<br>
performance.io-thread-count: 64<br>
performance.cache-size: 2048MB<br>
performance.write-behind-windo<wbr>w-size: 256MB<br>
server.allow-insecure: on<br>
cluster.ensure-durability: off<br>
config.transport: rdma<br>
server.outstanding-rpc-limit: 512<br>
diagnostics.brick-log-level: INFO<br>
<br>
Any recommendations on how to proceed from here?<br>
<br>
Best regards,<br>
Samuli Heinonen<br>
<br>
______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
</div></div><a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a> [3]<br>
</blockquote></blockquote>
<br><span class="m_-746125927662200449gmail-HOEnZb"><font color="#888888">
--<br>
<br>
Pranith<br>
<br>
<br>
Links:<br>
------<br>
[1] <a href="http://ovirt8z2.xxx.com" rel="noreferrer" target="_blank">http://ovirt8z2.xxx.com</a><br>
[2] <a href="http://review.gluster.org/14816" rel="noreferrer" target="_blank">http://review.gluster.org/1481<wbr>6</a><br>
[3] <a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/mailm<wbr>an/listinfo/gluster-users</a><br>
</font></span></blockquote>
</blockquote></div><br><br clear="all"><br>-- <br><div class="m_-746125927662200449gmail_signature"><div dir="ltr">Pranith<br></div></div>
</div></div>
</blockquote></div></div>