<div><div dir="auto">Could you please send all the Gluster log files from the server where the “Transport endpoint is not connected” error is reported? Since restarting glusterd didn’t resolve the issue, I believe this isn’t a stale port problem but something else. Please also provide the output of ‘gluster v info &lt;volname&gt;’.</div></div><div dir="auto"><br></div><div dir="auto">(@cc Ravi, Karthik)</div><div><br><div class="gmail_quote"><div dir="ltr">On Fri, 31 Aug 2018 at 23:24, Johnson, Tim &lt;<a href="mailto:tjj@uic.edu">tjj@uic.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="m_-7027967429663119338WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">Hello all,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> We have Gluster replicate (with arbiter) volumes on which we are getting “Transport endpoint is not connected” errors on a rotating basis from each of the two file servers and from a third host that holds the arbiter bricks.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">This happens when we try to run a heal on all the volumes on the Gluster hosts. When I check the status of the volumes, everything looks good.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> This behavior seems to foreshadow the Gluster volumes becoming unresponsive to our VM cluster. In addition, one of the file servers has two processes for each volume instead of one per volume. Eventually the affected file server drops off the list of peers. Restarting glusterd/glusterfsd on the affected file server does not resolve the issue; we have to bring down both file servers, because the VM cluster can no longer see the volumes once the errors start occurring. I had seen bug reports about “Transport endpoint is not connected” on earlier versions of Gluster, but I had thought the issue had been addressed.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> dmesg did show some entries for “a possible syn flood on port *”, so we set the sysctl “net.ipv4.tcp_max_syn_backlog = 2048”, which seemed to reduce the SYN flood messages but did not fix the underlying volume issues.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> I have included below the versions of all the installed Gluster packages, as well as the output of the “Heal” and “Status” commands for the volumes.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> This has only just started happening, but I cannot say definitively whether it began after an update.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> <u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Thanks for any assistance.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Running heal:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">gluster volume heal ovirt_engine info<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick ****1.rrc.local:/bricks/brick0/ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Status: Connected<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Number of entries: 0<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick ****3.rrc.local:/bricks/brick0/ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Status: Transport endpoint is not connected<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Number of entries: -<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick *****3.rrc.local:/bricks/arb-brick/ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Status: Transport endpoint is not connected<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Number of entries: - <u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
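<p class="MsoNormal"><span style="font-size:11.0pt">For checking this across all volumes, a minimal Python sketch that lists the bricks whose status is anything other than “Connected”. It assumes the heal info output keeps the three-line Brick / Status / Number of entries layout shown above; the function name disconnected_bricks is illustrative and not part of any Gluster tooling, and the sample text is simply the output quoted above with the hostnames still redacted.</span></p>

```python
# Sample text copied from the heal info output above (hostnames are redacted there).
SAMPLE = """\
Brick ****1.rrc.local:/bricks/brick0/ovirt_engine
Status: Connected
Number of entries: 0

Brick ****3.rrc.local:/bricks/brick0/ovirt_engine
Status: Transport endpoint is not connected
Number of entries: -

Brick *****3.rrc.local:/bricks/arb-brick/ovirt_engine
Status: Transport endpoint is not connected
Number of entries: -
"""


def disconnected_bricks(heal_info_text):
    """Return (brick, status) pairs for bricks whose status is not 'Connected'.

    Assumes each 'Brick ...' line is followed by a 'Status: ...' line,
    as in the output quoted above.
    """
    results = []
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Remember the brick until its Status line arrives.
            brick = line[len("Brick "):]
        elif line.startswith("Status:") and brick is not None:
            status = line[len("Status:"):].strip()
            if status != "Connected":
                results.append((brick, status))
            brick = None
    return results


if __name__ == "__main__":
    for brick, status in disconnected_bricks(SAMPLE):
        print(f"{brick}: {status}")
```

<p class="MsoNormal"><span style="font-size:11.0pt">In practice the text would come from the output of “gluster volume heal ovirt_engine info” saved to a file or captured on each host, so the same check can be repeated as the error rotates between servers.</span></p>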
<p class="MsoNormal"><span style="font-size:11.0pt">Running status:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">gluster volume status ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Status of volume: ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Gluster process TCP Port RDMA Port Online Pid<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">------------------------------------------------------------------------------<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick *****.rrc.local:/bricks/brick0/ov<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">irt_engine 49152 0 Y 5521<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick fs2-tier3.rrc.local:/bricks/brick0/ov<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">irt_engine 49152 0 Y 6245<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Brick ****.rrc.local:/bricks/arb-b<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">rick/ovirt_engine 49152 0 Y 3526<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on localhost N/A N/A Y 5509<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on ***.rrc.local N/A N/A Y 6218<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on ***.rrc.local N/A N/A Y 3501<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on ****.rrc.local N/A N/A Y 3657<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on *****.rrc.local N/A N/A Y 3753<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Self-heal Daemon on ****.rrc.local N/A N/A Y 17284<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Task Status of Volume ovirt_engine<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">------------------------------------------------------------------------------<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">There are no active volume tasks<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">/etc/glusterd.vol:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">volume management<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> type mgmt/glusterd<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option working-directory /var/lib/glusterd<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option transport-type socket,rdma<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option transport.socket.keepalive-time 10<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option transport.socket.keepalive-interval 2<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option transport.socket.read-fail-log off<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option ping-timeout 0<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option event-threads 1<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> option rpc-auth-allow-insecure on<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"># option transport.address-family inet6<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"># option base-port 49152<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">end-volume<u></u><u></u></span></p>
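<p class="MsoNormal"><span style="font-size:11.0pt">Since the errors rotate between hosts, it may be worth confirming that this file is identical on all of them. A minimal Python sketch, assuming the file keeps the simple “option key value” layout shown above; parse_volfile_options and diff_options are illustrative names, not part of any Gluster tooling, and the sample is the file content quoted above.</span></p>

```python
# Sample text copied from the glusterd.vol content above.
SAMPLE = """\
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option ping-timeout 0
    option event-threads 1
    option rpc-auth-allow-insecure on
#   option transport.address-family inet6
#   option base-port 49152
end-volume
"""


def parse_volfile_options(text):
    """Collect active 'option key value' lines into a dict, skipping comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank or commented-out line
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "option":
            options[parts[1]] = parts[2]
    return options


def diff_options(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for key, value in parse_volfile_options(SAMPLE).items():
        print(f"{key} = {value}")
```

<p class="MsoNormal"><span style="font-size:11.0pt">Running parse_volfile_options on the copy from each host and passing two of the results to diff_options would show any option that differs between them; an empty result means the active options match.</span></p>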
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">rpm -qa |grep gluster<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-gnfs-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-api-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-cli-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-client-xlators-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-fuse-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">centos-release-gluster312-1.0-2.el7.centos.noarch<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-rdma-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-libs-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">glusterfs-server-3.12.13-1.el7.x86_64<u></u><u></u></span></p>
</div>
</div>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a></blockquote></div></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature">- Atin (atinm)</div>