[Gluster-users] RE : Frequent connect and disconnect messages flooded in logs

Micha Ober micha2k at gmail.com
Tue Dec 20 12:31:02 UTC 2016


Hi Rafi,

here are the log files:

NFS: http://paste.ubuntu.com/23658653/
Brick: http://paste.ubuntu.com/23658656/

The brick log is from the brick which caused the last disconnect
at 2016-12-20 06:46:36 (0-gv0-client-7).

For completeness, here is also dmesg output:
http://paste.ubuntu.com/23658691/

Regards,
Micha

2016-12-19 7:28 GMT+01:00 Mohammed Rafi K C <rkavunga at redhat.com>:

> Hi Micha,
>
> Sorry for the late reply. I was busy with some other things.
>
> If you still have the setup available, can you enable the TRACE log
> level [1],[2] and see if you can find any log entries when the network
> starts disconnecting? Basically, I'm trying to find out whether any
> disconnection occurred other than the ping timer expiry issue.
>
>
>
> [1] : gluster volume set <volname> diagnostics.brick-log-level TRACE
>
> [2] : gluster volume set <volname> diagnostics.client-log-level TRACE
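>
> For example, a sketch assuming a volume named gv0 (TRACE is extremely
> verbose, so the levels should be reset once a disconnect has been
> captured):
>
>     gluster volume set gv0 diagnostics.brick-log-level TRACE
>     gluster volume set gv0 diagnostics.client-log-level TRACE
>     # ... reproduce a disconnect, collect the logs ...
>     gluster volume reset gv0 diagnostics.brick-log-level
>     gluster volume reset gv0 diagnostics.client-log-level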
>
>
> Regards
>
> Rafi KC
>
> On 12/08/2016 07:59 PM, Atin Mukherjee wrote:
>
>
>
> On Thu, Dec 8, 2016 at 4:37 PM, Micha Ober <micha2k at gmail.com> wrote:
>
>> Hi Rafi,
>>
>> thank you for your support. It is greatly appreciated.
>>
>> Just some more thoughts from my side:
>>
>> There have been no reports from other users in *this* thread until now,
>> but I have found at least one user with a very similar problem in an
>> older thread:
>>
>> https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html
>>
>> He is also reporting disconnects with no apparent reason, although his
>> setup is a bit more complicated, also involving a firewall. In our setup,
>> all servers/clients are connected via 1 GbE with no firewall or anything
>> that might block/throttle traffic. Also, we are using exactly the same
>> software versions on all nodes.
>>
>>
>> I can also find some reports in the bug tracker when searching for
>> "rpc_client_ping_timer_expired" and "rpc_clnt_ping_timer_expired" (the
>> spelling apparently changed between versions).
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
>>
>
> Just FYI, this is a different issue: here GlusterD fails to handle the
> volume of incoming requests in time because MT-epoll is not enabled.
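>
> For bricks and clients, the analogous multi-threaded epoll knobs are
> exposed as volume options; a sketch, assuming a volume named gv0 (note
> this tunes brick/client event threads, not GlusterD itself):
>
>     gluster volume set gv0 server.event-threads 4
>     gluster volume set gv0 client.event-threads 4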
>
>
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1370683
>>
>> But both reports involve heavy traffic/load on the bricks/disks, which is
>> not the case for our setup.
>> To give a ballpark figure: over three days, 30 GiB were written. And the
>> data was not written all at once, but continuously over the whole time.
>>
>>
>> Just to be sure, I have checked the logfiles of one of the other clusters
>> right now, which is sitting in the same building, in the same rack, even
>> on the same switch, running the same jobs, but with glusterfs 3.4.2, and I
>> can see no disconnects in the logfiles. So I can definitely rule out our
>> infrastructure as the problem.
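>>
>> A sketch of that log check, assuming the default log directory:
>>
>>     grep -iE "disconnect|ping_timer" /var/log/glusterfs/*.log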
>>
>> Regards,
>> Micha
>>
>>
>>
>> Am 07.12.2016 um 18:08 schrieb Mohammed Rafi K C:
>>
>> Hi Micha,
>>
>> This is great. I will provide you a debug build which has two fixes
>> that I suspect as possible causes of the frequent disconnect issue,
>> though I don't have much data to validate my theory. So I will take one
>> more day to dig into that.
>>
>> Thanks for your support, and opensource++
>>
>> Regards
>>
>> Rafi KC
>> On 12/07/2016 05:02 AM, Micha Ober wrote:
>>
>> Hi,
>>
>> thank you for your answer and even more for the question!
>> Until now, I was using FUSE. Today I changed all mounts to NFS using the
>> same 3.7.17 version.
>>
>> But: The problem is still the same. Now, the NFS logfile contains lines
>> like these:
>>
>> [2016-12-06 15:12:29.006325] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.18.62:49153 has not responded in the last 42 seconds, disconnecting.
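>>
>> The 42 seconds correspond to the network.ping-timeout volume option (42
>> is the default). A sketch for inspecting or raising it, assuming the gv0
>> volume, although a longer timeout would only mask whatever is stalling
>> the replies:
>>
>>     gluster volume get gv0 network.ping-timeout
>>     gluster volume set gv0 network.ping-timeout 60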
>>
>> Interestingly enough, the IP address X.X.18.62 is the same machine! As I
>> wrote earlier, each node serves both as a server and a client, as each node
>> contributes bricks to the volume. Every server is connecting to itself via
>> its hostname. For example, the fstab on the node "giant2" looks like this:
>>
>> #giant2:/gv0    /shared_data    glusterfs       defaults,noauto 0       0
>> #giant2:/gv2    /shared_slurm   glusterfs       defaults,noauto 0       0
>>
>> giant2:/gv0     /shared_data    nfs             defaults,_netdev,vers=3  0  0
>> giant2:/gv2     /shared_slurm   nfs             defaults,_netdev,vers=3  0  0
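>>
>> A sketch for testing such an NFS mount by hand, assuming a scratch
>> mount point:
>>
>>     sudo mkdir -p /mnt/nfstest
>>     sudo mount -t nfs -o vers=3 giant2:/gv0 /mnt/nfstest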
>>
>> So I understand the disconnects even less.
>>
>> I don't know if it's possible to create a dummy cluster which exhibits the
>> same behaviour, because the disconnects only happen when there are compute
>> jobs running on those nodes - and they are GPU compute jobs, so that's
>> something which cannot easily be emulated in a VM.
>>
>> As we have more clusters (which are running fine with an ancient 3.4
>> version :-)) and we are currently not dependent on this particular cluster
>> (which may stay that way for this month, I think), I should be able to
>> deploy the debug build on the "real" cluster, if you can provide one.
>>
>> Regards and thanks,
>> Micha
>>
>>
>>
>> Am 06.12.2016 um 08:15 schrieb Mohammed Rafi K C:
>>
>>
>>
>> On 12/03/2016 12:56 AM, Micha Ober wrote:
>>
>> ** Update: ** I have downgraded from 3.8.6 to 3.7.17 now, but the problem
>> still exists.
>>
>>
>> Client log: http://paste.ubuntu.com/23569065/
>> Brick log: http://paste.ubuntu.com/23569067/
>>
>> Please note that each server has two bricks. According to the logs, one
>> brick loses the connection to all other hosts:
>>
>> [2016-12-02 18:38:53.703301] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.219:49121 failed (Broken pipe)
>> [2016-12-02 18:38:53.703381] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.62:49118 failed (Broken pipe)
>> [2016-12-02 18:38:53.703380] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.107:49121 failed (Broken pipe)
>> [2016-12-02 18:38:53.703424] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.206:49120 failed (Broken pipe)
>> [2016-12-02 18:38:53.703359] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.58:49121 failed (Broken pipe)
>>
>> The SECOND brick on the SAME host is NOT affected, i.e. no disconnects!
>> As I said, the network connection is fine and the disks are idle.
>> The CPU always has 2 free cores.
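>>
>> A sketch for comparing the client connections of both bricks from the
>> server side:
>>
>>     gluster volume status gv0 clients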
>>
>> It looks like I have to downgrade to 3.4 now in order for the disconnects to stop.
>>
>>
>> Hi Micha,
>>
>> Thanks for the update, and sorry for what has happened with the newer
>> Gluster versions. I can understand the need for a downgrade, as it is a
>> production setup.
>>
>> Can you tell me which clients are used here? Is it FUSE, NFS,
>> NFS-Ganesha, SMB, or libgfapi?
>>
>> Since I'm not able to reproduce the issue (I have been trying for the
>> last three days) and the logs are not of much help here (we don't have
>> many logs in the socket layer), could you please create a dummy cluster
>> and try to reproduce the issue? Then we can play with that volume, and I
>> could provide a debug build which we can use for further debugging.
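>>
>> A minimal dummy setup could look like this; a sketch in which the
>> hostnames and brick paths are placeholders ("force" is only needed if
>> the bricks live on the root filesystem):
>>
>>     gluster peer probe node2
>>     gluster volume create dummyvol replica 2 node1:/bricks/b1 node2:/bricks/b1 force
>>     gluster volume start dummyvol
>>     mount -t glusterfs node1:/dummyvol /mnt/dummy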
>>
>> If you don't have the bandwidth for this, please leave it ;).
>>
>> Regards
>> Rafi KC
>>
>> - Micha
>>
>>
>> Am 30.11.2016 um 06:57 schrieb Mohammed Rafi K C:
>>
>> Hi Micha,
>>
>> I have changed the thread and subject so that your original thread
>> remains intact for your query. Let's try to fix the problem you observed
>> with 3.8.4, so I have started a new thread to discuss the frequent
>> disconnect problem.
>>
>> *If anyone else has experienced the same problem, please respond to this
>> mail.*
>>
>> It would be very helpful if you could give us some more logs from clients
>> and bricks. Also, any steps to reproduce will surely help to chase the
>> problem further.
>>
>> Regards
>>
>> Rafi KC
>> On 11/30/2016 04:44 AM, Micha Ober wrote:
>>
>> I had opened another thread on this mailing list (Subject: "After upgrade
>> from 3.4.2 to 3.8.5 - High CPU usage resulting in disconnects and
>> split-brain").
>>
>> The title may be a bit misleading now, as I am no longer observing high
>> CPU usage after upgrading to 3.8.6, but the disconnects are still happening
>> and the number of files in split-brain is growing.
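>>
>> For reference, the files in split-brain can be listed per volume; a
>> sketch, using the gv0 volume described below:
>>
>>     gluster volume heal gv0 info split-brain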
>>
>> Setup: 6 compute nodes, each serving as a glusterfs server and client,
>> Ubuntu 14.04, two bricks per node, distribute-replicate
>>
>> I have two gluster volumes set up (one for scratch data, one for the
>> slurm scheduler). Only the scratch data volume shows the critical error
>> "[...] has not responded in the last 42 seconds, disconnecting.". So I
>> can rule out network problems; the gigabit link between the nodes is not
>> saturated at all. The disks are almost idle (<10% utilisation).
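>>
>> A sketch of how those figures can be confirmed while jobs are running,
>> assuming the sysstat tools are installed:
>>
>>     iostat -dx 5    # per-disk utilisation
>>     sar -n DEV 5    # per-NIC throughput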
>>
>> I have glusterfs 3.4.2 on Ubuntu 12.04 on another compute cluster,
>> running fine since it was deployed.
>> I had glusterfs 3.4.2 on Ubuntu 14.04 on this cluster, running fine for
>> almost a year.
>>
>> After upgrading to 3.8.5, the problems (as described) started. I would
>> like to use some of the new features of the newer versions (like bitrot),
>> but the users can't run their compute jobs right now because the result
>> files are garbled.
>>
>> There also seems to be a bug report with a similar problem (but no
>> progress):
>> https://bugzilla.redhat.com/show_bug.cgi?id=1370683
>>
>> For me, ALL servers are affected (not isolated to one or two servers).
>>
>> I also see messages like "INFO: task gpu_graphene_bv:4476 blocked for
>> more than 120 seconds." in the syslog.
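>>
>> The full stack traces of those blocked tasks end up in the kernel log;
>> a sketch for extracting them:
>>
>>     dmesg | grep -B1 -A15 "blocked for more than 120 seconds"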
>>
>> For completeness (gv0 is the scratch volume, gv2 the slurm volume):
>>
>> [root at giant2: ~]# gluster v info
>>
>> Volume Name: gv0
>> Type: Distributed-Replicate
>> Volume ID: 993ec7c9-e4bc-44d0-b7c4-2d977e622e86
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 6 x 2 = 12
>> Transport-type: tcp
>> Bricks:
>> Brick1: giant1:/gluster/sdc/gv0
>> Brick2: giant2:/gluster/sdc/gv0
>> Brick3: giant3:/gluster/sdc/gv0
>> Brick4: giant4:/gluster/sdc/gv0
>> Brick5: giant5:/gluster/sdc/gv0
>> Brick6: giant6:/gluster/sdc/gv0
>> Brick7: giant1:/gluster/sdd/gv0
>> Brick8: giant2:/gluster/sdd/gv0
>> Brick9: giant3:/gluster/sdd/gv0
>> Brick10: giant4:/gluster/sdd/gv0
>> Brick11: giant5:/gluster/sdd/gv0
>> Brick12: giant6:/gluster/sdd/gv0
>> Options Reconfigured:
>> auth.allow: X.X.X.*,127.0.0.1
>> nfs.disable: on
>>
>> Volume Name: gv2
>> Type: Replicate
>> Volume ID: 30c78928-5f2c-4671-becc-8deaee1a7a8d
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: giant1:/gluster/sdd/gv2
>> Brick2: giant2:/gluster/sdd/gv2
>> Options Reconfigured:
>> auth.allow: X.X.X.*,127.0.0.1
>> cluster.granular-entry-heal: on
>> cluster.locking-scheme: granular
>> nfs.disable: on
>>
>>
>> 2016-11-30 0:10 GMT+01:00 Micha Ober <micha2k at gmail.com>:
>>
>>> [...]
>>>
>>> 2016-11-29 19:21 GMT+01:00 Micha Ober <micha2k at gmail.com>:
>>>
>>>> [...]
>>>>
>>>> 2016-11-29 18:53 GMT+01:00 Atin Mukherjee <amukherj at redhat.com>:
>>>>
>>>>> Would you be able to share what is not working for you in 3.8.x
>>>>> (mention the exact version)? 3.4 is quite old, and falling back to an
>>>>> unsupported version doesn't look like a feasible option.
>>>>>
>>>>> On Tue, 29 Nov 2016 at 17:01, Micha Ober <micha2k at gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I was using gluster 3.4 and upgraded to 3.8, but that version proved
>>>>>> to be unusable for me. I now need to downgrade.
>>>>>>
>>>>>> I'm running Ubuntu 14.04. As upgrades of the op version are
>>>>>> irreversible, I guess I have to delete all gluster volumes and
>>>>>> re-create them with the downgraded version.
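>>>>>>
>>>>>> The current cluster op version can be checked beforehand; a sketch,
>>>>>> reading glusterd's state file:
>>>>>>
>>>>>>     grep operating-version /var/lib/glusterd/glusterd.info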
>>>>>>
>>>>>> 0. Backup data
>>>>>> 1. Unmount all gluster volumes
>>>>>> 2. apt-get purge glusterfs-server glusterfs-client
>>>>>> 3. Remove PPA for 3.8
>>>>>> 4. Add PPA for older version
>>>>>> 5. apt-get install glusterfs-server glusterfs-client
>>>>>> 6. Create volumes
>>>>>>
>>>>>> Is "purge" enough to delete all configuration files of the currently
>>>>>> installed version or do I need to  manually clear some residues before
>>>>>> installing an older version?
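>>>>>>
>>>>>> A sketch of the residue that typically needs manual removal after a
>>>>>> purge (paths assume the stock Ubuntu/PPA packaging; verify before
>>>>>> deleting):
>>>>>>
>>>>>>     sudo apt-get purge glusterfs-server glusterfs-client glusterfs-common
>>>>>>     sudo rm -rf /var/lib/glusterd /var/log/glusterfs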
>>>>>>
>>>>>> Thanks.
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>> --
>>>>> - Atin (atinm)
>>>>>
>>>>
>>>>
>>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
> --
> ~ Atin (atinm)
>
>