[Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

Milind Changire mchangir at redhat.com
Wed Jun 13 10:34:15 UTC 2018


Kashif,
FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/
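
Once a glusterfs-debuginfo package matching your installed version is in place, the backtrace can be pulled out of the core without an interactive gdb session. This is only a rough sketch: the helper name core_backtrace and the output file name are illustrative, and the binary and core file names are simply the ones from your earlier mail.

```shell
# Sketch: dump backtraces for all threads from a core file, non-interactively.
# core_backtrace is a hypothetical helper name, not part of gluster or gdb.
core_backtrace() {
    binary="$1"
    core="$2"
    # Fail early with a readable message if either file is missing.
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: core_backtrace <binary> <core-file>" >&2
        return 1
    fi
    # -batch makes gdb exit after running the -ex commands in order.
    gdb -batch -ex "bt" -ex "thread apply all bt full" "$binary" "$core"
}

# Example, using the names from this thread:
# core_backtrace /usr/sbin/glusterfs core.138536 > backtrace.txt
```

With debuginfo installed, the frames should carry source file and line information instead of bare addresses, and the resulting text file is much smaller to share than the core itself.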


On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif <kashif.alig at gmail.com>
wrote:

> Hi Milind
>
> There is no glusterfs-debuginfo package available for gluster-3.12 in the
> http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/ repo. Do
> you know where I can get it?
> Also, when I run gdb, it says:
>
> Missing separate debuginfos, use: debuginfo-install
> glusterfs-fuse-3.12.9-1.el6.x86_64
>
> I can't find a debuginfo package for glusterfs-fuse either.
>
> Thanks from the pit of despair ;)
>
> Kashif
>
>
> On Tue, Jun 12, 2018 at 5:01 PM, mohammad kashif <kashif.alig at gmail.com>
> wrote:
>
>> Hi Milind
>>
>> I will send you links for logs.
>>
>> I collected these core dumps on the client, and there is no glusterd
>> process running on the client.
>>
>> Kashif
>>
>>
>>
>> On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire <mchangir at redhat.com>
>> wrote:
>>
>>> Kashif,
>>> Could you also send over the client/mount log file, as Vijay suggested?
>>> Or at least the lines containing the crash backtrace.
>>>
>>> Also, you've mentioned that you straced glusterd, but when you ran gdb,
>>> you ran it over /usr/sbin/glusterfs
>>>
>>>
>>> On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur <vbellur at redhat.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif <kashif.alig at gmail.com
>>>> > wrote:
>>>>
>>>>> Hi Milind
>>>>>
>>>>> The operating system is Scientific Linux 6 which is based on RHEL6.
>>>>> The cpu arch is Intel x86_64.
>>>>>
>>>>> I will send you a separate email with a link to the core dump.
>>>>>
>>>>
>>>>
>>>> You could also grep for "crash" in the client log file; in most cases
>>>> the lines following it contain a backtrace.
>>>>
>>>> HTH,
>>>> Vijay
>>>>
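The grep suggested above could be wrapped roughly as follows. This is a sketch only: the helper name show_crash_context and the default of 30 trailing lines are assumptions, and the log path has to be replaced with the actual client/mount log.

```shell
# Sketch: print each line mentioning "crash" plus the lines that follow it.
# show_crash_context is a hypothetical helper name.
show_crash_context() {
    logfile="$1"
    lines_after="${2:-30}"   # trailing lines to show; 30 is an arbitrary guess
    # -n prefixes line numbers, -i ignores case, -A adds trailing context.
    grep -n -i -A "$lines_after" 'crash' "$logfile"
}

# Example (substitute the real client log file):
# show_crash_context /var/log/glusterfs/<mount-point>.log
```

If the backtrace turns out to be longer than the default, the second argument can be raised.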
>>>>
>>>>>
>>>>> Thanks for your help.
>>>>>
>>>>> Kashif
>>>>>
>>>>>
>>>>> On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire <mchangir at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> Kashif,
>>>>>> Could you share the core dump via Google Drive or something similar?
>>>>>>
>>>>>> Also, let me know the CPU arch and OS Distribution on which you are
>>>>>> running gluster.
>>>>>>
>>>>>> If you've installed the glusterfs-debuginfo package, you'll also get
>>>>>> the source lines in the backtrace via gdb
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 12, 2018 at 5:59 PM, mohammad kashif <
>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Milind, Vijay
>>>>>>>
>>>>>>> Thanks. I have some more information now, as I straced glusterd on the
>>>>>>> client:
>>>>>>>
>>>>>>> 138544      0.000131 mprotect(0x7f2f70785000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000026>
>>>>>>> 138544      0.000128 mprotect(0x7f2f70786000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>> 138544      0.000126 mprotect(0x7f2f70787000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>> 138544      0.000124 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} ---
>>>>>>> 138544      0.000051 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
>>>>>>> 138551      0.105048 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138550      0.000041 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138547      0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138546      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138545      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138544      0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>> 138543      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>
>>>>>>> As far as I understand, gluster is somehow accessing memory in an
>>>>>>> inappropriate manner and the kernel sends SIGSEGV.
>>>>>>>
>>>>>>> I also got the core dump. This is my first time using gdb, so I am
>>>>>>> not sure whether I am using it correctly:
>>>>>>>
>>>>>>> gdb /usr/sbin/glusterfs core.138536
>>>>>>>
>>>>>>> It just tells me that the program terminated with signal 11,
>>>>>>> segmentation fault.
>>>>>>>
>>>>>>> The problem is not limited to one client; it is happening on many
>>>>>>> clients.
>>>>>>>
>>>>>>> I would really appreciate any help, as the whole file system has
>>>>>>> become unusable.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Kashif
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 12, 2018 at 12:26 PM, Milind Changire <
>>>>>>> mchangir at redhat.com> wrote:
>>>>>>>
>>>>>>>> Kashif,
>>>>>>>> You can change the log level by:
>>>>>>>> $ gluster volume set <vol> diagnostics.brick-log-level TRACE
>>>>>>>> $ gluster volume set <vol> diagnostics.client-log-level TRACE
>>>>>>>>
>>>>>>>> and see how things fare
>>>>>>>>
>>>>>>>> If you want fewer logs you can change the log-level to DEBUG
>>>>>>>> instead of TRACE.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 12, 2018 at 3:37 PM, mohammad kashif <
>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Vijay
>>>>>>>>>
>>>>>>>>> Now it is unmounting every 30 minutes!
>>>>>>>>>
>>>>>>>>> The server log at /var/log/glusterfs/bricks/glusteratlas-brics001-gv0.log
>>>>>>>>> has only these lines:
>>>>>>>>>
>>>>>>>>> [2018-06-12 09:53:19.303102] I [MSGID: 115013] [server-helpers.c:289:do_fd_cleanup] 0-atlasglust-server: fd cleanup on /atlas/atlasdata/zgubic/hmumu/histograms/v14.3/Signal
>>>>>>>>> [2018-06-12 09:53:19.306190] I [MSGID: 101055] [client_t.c:443:gf_client_unref] 0-atlasglust-server: Shutting down connection <server-name> -2224879-2018/06/12-09:51:01:460889-atlasglust-client-0-0-0
>>>>>>>>>
>>>>>>>>> There is no other information. Is there any way to increase log
>>>>>>>>> verbosity?
>>>>>>>>>
>>>>>>>>> On the client:
>>>>>>>>>
>>>>>>>>> [2018-06-12 09:51:01.744980] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-atlasglust-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>>>>>> [2018-06-12 09:51:01.746508] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-atlasglust-client-5: Connected to atlasglust-client-5, attached to remote volume '/glusteratlas/brick006/gv0'.
>>>>>>>>> [2018-06-12 09:51:01.746543] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-atlasglust-client-5: Server and Client lk-version numbers are not same, reopening the fds
>>>>>>>>> [2018-06-12 09:51:01.746814] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-atlasglust-client-5: Server lk version = 1
>>>>>>>>> [2018-06-12 09:51:01.748449] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-atlasglust-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>>>>>> [2018-06-12 09:51:01.750219] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-atlasglust-client-6: Connected to atlasglust-client-6, attached to remote volume '/glusteratlas/brick007/gv0'.
>>>>>>>>> [2018-06-12 09:51:01.750261] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-atlasglust-client-6: Server and Client lk-version numbers are not same, reopening the fds
>>>>>>>>> [2018-06-12 09:51:01.750503] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-atlasglust-client-6: Server lk version = 1
>>>>>>>>> [2018-06-12 09:51:01.752207] I [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.14
>>>>>>>>> [2018-06-12 09:51:01.752261] I [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: switched to graph 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is there a problem with the server and client lk-version?
>>>>>>>>>
>>>>>>>>> Thanks for your help.
>>>>>>>>>
>>>>>>>>> Kashif
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 11, 2018 at 11:52 PM, Vijay Bellur <vbellur at redhat.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif <
>>>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi
>>>>>>>>>>>
>>>>>>>>>>> Since I updated our gluster servers and clients to the latest
>>>>>>>>>>> version, 3.12.9-1, gluster keeps getting unmounted from the clients
>>>>>>>>>>> very regularly. This was not a problem before the update.
>>>>>>>>>>>
>>>>>>>>>>> It's a distributed file system with no replication. We have seven
>>>>>>>>>>> servers totaling around 480TB of data. It's 97% full.
>>>>>>>>>>>
>>>>>>>>>>> I am using the following config on the server:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> gluster volume set atlasglust features.cache-invalidation on
>>>>>>>>>>> gluster volume set atlasglust features.cache-invalidation-timeout 600
>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>> gluster volume set atlasglust performance.cache-invalidation on
>>>>>>>>>>> gluster volume set atlasglust performance.md-cache-timeout 600
>>>>>>>>>>> gluster volume set atlasglust performance.parallel-readdir on
>>>>>>>>>>> gluster volume set atlasglust performance.cache-size 1GB
>>>>>>>>>>> gluster volume set atlasglust performance.client-io-threads on
>>>>>>>>>>> gluster volume set atlasglust cluster.lookup-optimize on
>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>> gluster volume set atlasglust client.event-threads 4
>>>>>>>>>>> gluster volume set atlasglust server.event-threads 4
>>>>>>>>>>>
>>>>>>>>>>> Clients are mounted with these options:
>>>>>>>>>>>
>>>>>>>>>>> defaults,direct-io-mode=disable,attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,rw,_netdev
>>>>>>>>>>>
>>>>>>>>>>> I can't see anything in the log file. Can someone suggest how to
>>>>>>>>>>> troubleshoot this issue?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Can you please share the log file? Checking for messages related
>>>>>>>>>> to disconnections/crashes in the log file would be a good way to start
>>>>>>>>>> troubleshooting the problem.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Vijay
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Milind
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Milind
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Milind
>>>
>>>
>>
>


-- 
Milind