<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 24, 2019 at 12:47 PM Hu Bert <<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Good morning,<br>
<br>
We are currently transferring some data to a new glusterfs volume. To check<br>
the throughput of the new volume/setup while the transfer is running, I<br>
decided to create some files on one of the gluster servers with dd in a loop:<br>
<br>
while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M count=1024; rm /shared/private/1G.file; done<br>
<br>
/shared/private is the mount point of the glusterfs volume. The dd loop<br>
should run for about an hour, but it has now happened twice that during<br>
this loop the transport endpoint gets disconnected:<br>
<br>
dd: failed to open '/shared/private/1G.file': Transport endpoint is not connected<br>
rm: cannot remove '/shared/private/1G.file': Transport endpoint is not connected<br>
<br>
In /var/log/glusterfs/shared-private.log I see:<br>
<br>
[2019-01-24 07:03:28.938745] W [MSGID: 108001] [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0: 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is not met [Transport endpoint is not connected]<br>
[2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free] (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c) [0x7eff84248e8c] -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18) [0x7eff84248e18] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6) [0x7eff8a9485a6] ) 0-: Assertion failed: GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)<br>
[----snip----]<br>
<br>
The whole output can be found here: <a href="https://pastebin.com/qTMmFxx0" rel="noreferrer" target="_blank">https://pastebin.com/qTMmFxx0</a><br>
gluster volume info here: <a href="https://pastebin.com/ENTWZ7j3" rel="noreferrer" target="_blank">https://pastebin.com/ENTWZ7j3</a><br>
<br>
After umount + mount the transport endpoint is connected again - until<br>
the next disconnect. A /core file gets generated. Maybe someone wants<br>
to have a look at this file?<br>
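<br>
If it helps, I can also pull a backtrace out of the core with something like<br>
the following (I am guessing /usr/sbin/glusterfs as the fuse client binary on<br>
debian, please correct me if the path is wrong):<br>
<br>
# dump the backtraces of all threads from the core file<br>
gdb -batch -ex 'thread apply all bt full' /usr/sbin/glusterfs /core<br>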
_________________</blockquote><div><br></div><div>Hi Hu Bert,</div><div><br></div><div>Thanks for the logs and the report. 'Transport endpoint is not connected' on a mount happens for one of two reasons:</div><div><br></div><div>1. The brick holding the file (in the case of a replica, all of its bricks) is unreachable or down. This returns to normal once the bricks are restarted.</div><div><br></div><div>2. The client process crashes or hits an assertion. In this case /dev/fuse is no longer connected to a process, but the mount still holds a reference, so it needs an umount and a fresh mount to work again.</div><div><br></div><div>We will look into this issue and get back to you.</div>
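<div><br></div><div>In the meantime, here is a rough way to tell which of the two cases you are hitting the next time it happens. This is only a sketch: I am guessing the volume name 'persistent' from the "0-persistent-replicate-0" prefix in your log, so adjust the names to your setup.</div><div><br></div><div># case 1: check whether all bricks of the volume are online ("Online" column)<br>gluster volume status persistent<br><br># case 2: the mount is still listed, but no fuse client process serves it anymore<br>grep /shared/private /proc/mounts<br>ps ax | grep '[g]lusterfs.*shared/private'<br># if the mount shows up in /proc/mounts but there is no matching glusterfs<br># client process, the client has crashed and only umount + mount will recover it</div>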
<div><br></div><div>Regards,</div><div>Amar</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">______________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Amar Tumballi (amarts)<br></div></div></div></div></div></div>