[Gluster-users] gluster connection interrupted during transfer
Richard Neuboeck
hawk at tbi.univie.ac.at
Thu Aug 30 11:48:39 UTC 2018
Hi Nithya,
On 08/30/2018 09:45 AM, Nithya Balachandran wrote:
> Hi Richard,
>
>
>
> On 29 August 2018 at 18:11, Richard Neuboeck <hawk at tbi.univie.ac.at> wrote:
>
> Hi Gluster Community,
>
> I have problems with a glusterfs 'Transport endpoint not connected'
> connection abort during file transfers that I can reproduce (every
> time now) but cannot pinpoint the cause of.
>
> The volume is set up in replica 3 mode and accessed with the fuse
> gluster client. Client and servers are running CentOS and the
> supplied gluster version 3.12.11.
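>
> For reference, the volume is mounted on the client roughly like
> this (the mount point here is just an example, not the real one):
>
> mount -t glusterfs sphere-four:/home /home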
>
> The connection abort happens at different times during rsync but
> occurs every time I try to sync all our files (1.1TB) to the empty
> volume.
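>
> The transfer itself is a plain rsync from the old file server onto
> the fuse mount, roughly along these lines (flags and paths are only
> placeholders, not the exact command):
>
> rsync -aHAX --numeric-ids /oldhome/ /home/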
>
> Neither on the client nor on the server side do I find errors in
> the gluster log files. rsync logs the obvious transfer problem. The
> only log that shows anything related is the brick log on the
> server, which states that the connection is shutting down:
>
> [2018-08-18 22:40:35.502510] I [MSGID: 115036]
> [server.c:527:server_rpc_notify] 0-home-server: disconnecting
> connection from
> brax-110405-2018/08/16-08:36:28:575972-home-client-0-0-0
> [2018-08-18 22:40:35.502620] W
> [inodelk.c:499:pl_inodelk_log_cleanup] 0-home-server: releasing lock
> on eaeb0398-fefd-486d-84a7-f13744d1cf10 held by
> {client=0x7f83ec0b3ce0, pid=110423 lk-owner=d0fd5ffb427f0000}
> [2018-08-18 22:40:35.502692] W
> [entrylk.c:864:pl_entrylk_log_cleanup] 0-home-server: releasing lock
> on faa93f7b-6c46-4251-b2b2-abcd2f2613e1 held by
> {client=0x7f83ec0b3ce0, pid=110423 lk-owner=703dd4cc407f0000}
> [2018-08-18 22:40:35.502719] W
> [entrylk.c:864:pl_entrylk_log_cleanup] 0-home-server: releasing lock
> on faa93f7b-6c46-4251-b2b2-abcd2f2613e1 held by
> {client=0x7f83ec0b3ce0, pid=110423 lk-owner=703dd4cc407f0000}
> [2018-08-18 22:40:35.505950] I [MSGID: 101055]
> [client_t.c:443:gf_client_unref] 0-home-server: Shutting down
> connection brax-110405-2018/08/16-08:36:28:575972-home-client-0-0-0
>
>
> Since I've been running another replica 3 setup for oVirt for a long time
>
>
> Is this setup running with the same gluster version and on the same
> nodes or is it a different cluster?
It's a different cluster (sphere-one, sphere-two and sphere-three)
but the same gluster version and basically the same hardware.
Cheers
Richard
>
>
>
> now, which is completely stable, I initially thought I had made a
> mistake by setting different volume options. However, even after
> resetting those options I can still reproduce the connection
> problem.
>
> The unoptimized volume setup looks like this:
>
>
> Volume Name: home
> Type: Replicate
> Volume ID: c92fa4cc-4a26-41ff-8c70-1dd07f733ac8
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: sphere-four:/srv/gluster_home/brick
> Brick2: sphere-five:/srv/gluster_home/brick
> Brick3: sphere-six:/srv/gluster_home/brick
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.server-quorum-ratio: 50%
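>
> (For reference, options like these are set with the usual volume
> set commands, e.g.
>
> gluster volume set home cluster.quorum-type auto
> gluster volume set home cluster.server-quorum-type server
> gluster volume set all cluster.server-quorum-ratio 50%
>
> the server-quorum-ratio being a cluster-wide option, hence 'all'.)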
>
>
> The following additional options were used before:
>
> performance.cache-size: 5GB
> client.event-threads: 4
> server.event-threads: 4
> cluster.lookup-optimize: on
> features.cache-invalidation: on
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> network.inode-lru-limit: 50000
> features.cache-invalidation-timeout: 600
> performance.md-cache-timeout: 600
> performance.parallel-readdir: on
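>
> Resetting a single option works along these lines; the same pattern
> was used for all of the options above:
>
> gluster volume reset home performance.parallel-readdir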
>
>
> In this case the gluster servers as well as the client are using a
> bonded network device running in adaptive load balancing mode.
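>
> The bonding mode can be checked on every machine like this
> (assuming the bond device is called bond0):
>
> grep "Bonding Mode" /proc/net/bonding/bond0
> Bonding Mode: adaptive load balancing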
>
> I've tried using the debug option for the client mount, but apart
> from a ~0.5TB log file I didn't get any information that seems
> helpful to me.
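>
> For the debug run the volume was mounted with the client log level
> raised, roughly like this (mount point again just an example):
>
> mount -t glusterfs -o log-level=DEBUG sphere-four:/home /home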
>
> Transferring just a couple of GB works without problems.
>
> It may very well be that I'm already blind to the obvious, but
> after many long-running tests I can't pin down the problem in the
> setup.
>
> Does anyone have an idea how to approach this problem in a way
> that turns up some useful information?
>
> Any help is highly appreciated!
> Cheers
> Richard
>
> --
> /dev/null
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>
--
/dev/null