[Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version
mohammad kashif
kashif.alig at gmail.com
Wed Jun 20 16:12:34 UTC 2018
Hi Nithya
Thanks for the bug report. This new crash has happened only once, and only at
one client, in the last 6 days. I will let you know if it happens again or
more frequently.
Cheers
Kashif
On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran <nbalacha at redhat.com>
wrote:
> Hi Mohammad,
>
> This is a different crash. How often does it happen?
>
>
> We have managed to reproduce the first crash you reported and a bug has
> been filed at [1].
> We will work on a fix for this.
>
>
> Regards,
> Nithya
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199
>
>
> On 18 June 2018 at 14:09, mohammad kashif <kashif.alig at gmail.com> wrote:
>
>> Hi
>>
>> Problem appeared again after a few days. This time, the client
>> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is
>> off. The log level was set to ERROR and I got this log at the time of the crash
>>
>> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] (-->
>> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68]
>> ))))) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3)
>> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7
>>
>> Core dump was enabled on the client, so it created a dump. It is here:
>>
>> http://www-pnp.physics.ox.ac.uk/~mohammad/core.1002074
>>
>> I produced a gdb backtrace using this command:
>>
>> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee
>> backtrace.log_18_16_1
>>
>>
>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log_18_16_1
>>
>> I haven't used gdb much, so let me know if you want me to run gdb in a
>> different manner.
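>>
>> If a fuller trace would help, I could run something along these lines to
>> dump every thread with local variables (just a guess on my part; same
>> binary and core as above, the output file name is arbitrary):
>>
>> gdb /usr/sbin/glusterfs core.1002074 -ex 'set pagination off' \
>>     -ex 'thread apply all bt full' -ex quit |& tee backtrace_full.log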
>>
>> Thanks
>>
>> Kashif
>>
>>
>> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa <
>> rgowdapp at redhat.com> wrote:
>>
>>>
>>>
>>> On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <
>>> rgowdapp at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <
>>>> rgowdapp at redhat.com> wrote:
>>>>
>>>>> From the bt:
>>>>>
>>>>> #8 0x00007f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #9 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #10 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #11 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #12 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #13 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #14 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #15 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #16 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #17 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #18 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #19 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #20 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #21 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>> #22 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861bb0,
>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>> #23 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>> dht-common.c:5388
>>>>>
>>>>> It looks like an infinite recursion. Note that readdirp is wound to
>>>>> the same subvol (the value of "this" is the same in all calls to
>>>>> rda_readdirp) at the same offset (value 2). This may be a bug in DHT
>>>>> (winding down readdirp with a wrong offset) or in readdir-ahead
>>>>> (populating incorrect offset values in the dentries it returns as the
>>>>> readdirp response).
>>>>>
>>>>
>>>> It looks to be a corruption. The value of the size argument in
>>>> rda_readdirp is too big (around 127 TB) to be sane. If you have a
>>>> reproducer, please run it under valgrind or ASAN.
>>>>
>>>
>>>
>>> I spoke too early. It could be a negative value, and hence it may not be
>>> a corruption. Is it possible to upload the core somewhere? Or, better
>>> still, access to a gdb session with this core would be more helpful.
>>>
>>>
>>>> To make it explicit, ATM it's not clear whether there is a bug in
>>>> readdir-ahead or DHT, as it looks to be a memory corruption. Till I get a
>>>> reproducer or valgrind/ASAN output of the client process when the issue
>>>> occurs, I won't be working on this problem.
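>>>>
>>>> For reference, one way to capture that on a test mount would be to run
>>>> the fuse client in the foreground under valgrind; a rough sketch (the
>>>> volfile server, volume id and mount point below are placeholders):
>>>>
>>>> valgrind --leak-check=full --track-origins=yes \
>>>>     --log-file=/var/tmp/glusterfs-valgrind.%p.log \
>>>>     /usr/sbin/glusterfs -N --volfile-server=<server> \
>>>>     --volfile-id=<volname> /mnt/test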
>>>>
>>>>
>>>>>
>>>>> On Wed, Jun 13, 2018 at 4:29 PM, mohammad kashif <
>>>>> kashif.alig at gmail.com> wrote:
>>>>>
>>>>>> Hi Milind
>>>>>>
>>>>>> Thanks a lot, I managed to run gdb and produced a backtrace as well. It's
>>>>>> here:
>>>>>>
>>>>>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log
>>>>>>
>>>>>>
>>>>>> I am trying to understand it but am still not able to make sense of it.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Kashif
>>>>>>
>>>>>> On Wed, Jun 13, 2018 at 11:34 AM, Milind Changire <
>>>>>> mchangir at redhat.com> wrote:
>>>>>>
>>>>>>> Kashif,
>>>>>>> FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/
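>>>>>>>
>>>>>>> One way to consume that location is to drop a small repo file on the
>>>>>>> client and then install the debuginfo package; a sketch (the repo id
>>>>>>> and file name are only examples, gpgcheck disabled for brevity):
>>>>>>>
>>>>>>> # /etc/yum.repos.d/centos-storage-debuginfo.repo
>>>>>>> [centos-storage-debuginfo]
>>>>>>> name=CentOS-6 Storage SIG debuginfo
>>>>>>> baseurl=http://debuginfo.centos.org/centos/6/storage/x86_64/
>>>>>>> enabled=1
>>>>>>> gpgcheck=0
>>>>>>>
>>>>>>> # then, matching the installed 3.12.9-1 packages
>>>>>>> yum install glusterfs-debuginfo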
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif <
>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Milind
>>>>>>>>
>>>>>>>> There is no glusterfs-debuginfo available for gluster-3.12 in the
>>>>>>>> http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/
>>>>>>>> repo. Do you know where I can get it?
>>>>>>>> Also, when I run gdb, it says
>>>>>>>>
>>>>>>>> Missing separate debuginfos, use: debuginfo-install
>>>>>>>> glusterfs-fuse-3.12.9-1.el6.x86_64
>>>>>>>>
>>>>>>>> I can't find a debug package for glusterfs-fuse either.
>>>>>>>>
>>>>>>>> Thanks from the pit of despair ;)
>>>>>>>>
>>>>>>>> Kashif
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 12, 2018 at 5:01 PM, mohammad kashif <
>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Milind
>>>>>>>>>
>>>>>>>>> I will send you links for logs.
>>>>>>>>>
>>>>>>>>> I collected these core dumps on the client, and there is no glusterd
>>>>>>>>> process running on the client.
>>>>>>>>>
>>>>>>>>> Kashif
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire <
>>>>>>>>> mchangir at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> Kashif,
>>>>>>>>>> Could you also send over the client/mount log file as Vijay
>>>>>>>>>> suggested?
>>>>>>>>>> Or maybe just the lines around the crash backtrace
>>>>>>>>>>
>>>>>>>>>> Also, you've mentioned that you straced glusterd, but when you
>>>>>>>>>> ran gdb, you ran it over /usr/sbin/glusterfs
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur <vbellur at redhat.com
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif <
>>>>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Milind
>>>>>>>>>>>>
>>>>>>>>>>>> The operating system is Scientific Linux 6, which is based on
>>>>>>>>>>>> RHEL 6. The CPU arch is Intel x86_64.
>>>>>>>>>>>>
>>>>>>>>>>>> I will send you a separate email with a link to the core dump.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You could also grep for 'crash' in the client log file; the
>>>>>>>>>>> lines following it would have a backtrace in most cases.
>>>>>>>>>>>
>>>>>>>>>>> HTH,
>>>>>>>>>>> Vijay
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>
>>>>>>>>>>>> Kashif
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire <
>>>>>>>>>>>> mchangir at redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Kashif,
>>>>>>>>>>>>> Could you share the core dump via Google Drive or something
>>>>>>>>>>>>> similar
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, let me know the CPU arch and OS Distribution on which
>>>>>>>>>>>>> you are running gluster.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you've installed the glusterfs-debuginfo package, you'll
>>>>>>>>>>>>> also get the source lines in the backtrace via gdb
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 5:59 PM, mohammad kashif <
>>>>>>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Milind, Vijay
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks, I have some more information now, as I straced
>>>>>>>>>>>>>> glusterd on the client:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 138544 0.000131 mprotect(0x7f2f70785000, 4096,
>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000026>
>>>>>>>>>>>>>> 138544 0.000128 mprotect(0x7f2f70786000, 4096,
>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>>>>>>>>> 138544 0.000126 mprotect(0x7f2f70787000, 4096,
>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>>>>>>>>> 138544 0.000124 --- SIGSEGV {si_signo=SIGSEGV,
>>>>>>>>>>>>>> si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} ---
>>>>>>>>>>>>>> 138544 0.000051 --- SIGSEGV {si_signo=SIGSEGV,
>>>>>>>>>>>>>> si_code=SI_KERNEL, si_addr=0} ---
>>>>>>>>>>>>>> 138551 0.105048 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138550 0.000041 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138547 0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138546 0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138545 0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138544 0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>> 138543 0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As far as I understand, gluster is somehow trying to access
>>>>>>>>>>>>>> memory in an inappropriate manner and the kernel sends SIGSEGV.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also got the core dump. I am trying gdb for the first time, so I
>>>>>>>>>>>>>> am not sure whether I am using it correctly:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> gdb /usr/sbin/glusterfs core.138536
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It just tells me that the program terminated with signal 11,
>>>>>>>>>>>>>> segmentation fault.
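>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I assume the useful part only shows up once I ask for a backtrace at
>>>>>>>>>>>>>> the gdb prompt, roughly like this, but please correct me if not:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (gdb) bt
>>>>>>>>>>>>>> (gdb) thread apply all bt
>>>>>>>>>>>>>> (gdb) quit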
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem is not limited to one client but is happening on
>>>>>>>>>>>>>> many clients.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would really appreciate any help, as the whole file system has
>>>>>>>>>>>>>> become unusable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kashif
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 12:26 PM, Milind Changire <
>>>>>>>>>>>>>> mchangir at redhat.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kashif,
>>>>>>>>>>>>>>> You can change the log level by:
>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.brick-log-level TRACE
>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.client-log-level TRACE
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and see how things fare
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you want fewer logs you can change the log-level to DEBUG
>>>>>>>>>>>>>>> instead of TRACE.
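>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Once you have captured what you need, the defaults can be restored
>>>>>>>>>>>>>>> with something like (a sketch, volume name as in your setup):
>>>>>>>>>>>>>>> $ gluster volume reset <vol> diagnostics.brick-log-level
>>>>>>>>>>>>>>> $ gluster volume reset <vol> diagnostics.client-log-level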
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:37 PM, mohammad kashif <
>>>>>>>>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Vijay
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now it is unmounting every 30 mins!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The server log at /var/log/glusterfs/bricks/glusteratlas-brics001-gv0.log
>>>>>>>>>>>>>>>> has only these lines:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2018-06-12 09:53:19.303102] I [MSGID: 115013]
>>>>>>>>>>>>>>>> [server-helpers.c:289:do_fd_cleanup] 0-atlasglust-server:
>>>>>>>>>>>>>>>> fd cleanup on /atlas/atlasdata/zgubic/hmumu/
>>>>>>>>>>>>>>>> histograms/v14.3/Signal
>>>>>>>>>>>>>>>> [2018-06-12 09:53:19.306190] I [MSGID: 101055]
>>>>>>>>>>>>>>>> [client_t.c:443:gf_client_unref] 0-atlasglust-server:
>>>>>>>>>>>>>>>> Shutting down connection <server-name> -2224879-2018/06/12-09:51:01:4
>>>>>>>>>>>>>>>> 60889-atlasglust-client-0-0-0
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There is no other information. Is there any way to increase
>>>>>>>>>>>>>>>> log verbosity?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> on the client
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.744980] I [MSGID: 114057]
>>>>>>>>>>>>>>>> [client-handshake.c:1478:select_server_supported_programs]
>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Using Program GlusterFS 3.3, Num (1298437), Version
>>>>>>>>>>>>>>>> (330)
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746508] I [MSGID: 114046]
>>>>>>>>>>>>>>>> [client-handshake.c:1231:client_setvolume_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Connected to atlasglust-client-5, attached to remote
>>>>>>>>>>>>>>>> volume '/glusteratlas/brick006/gv0'.
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746543] I [MSGID: 114047]
>>>>>>>>>>>>>>>> [client-handshake.c:1242:client_setvolume_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Server and Client lk-version numbers are not same,
>>>>>>>>>>>>>>>> reopening the fds
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746814] I [MSGID: 114035]
>>>>>>>>>>>>>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Server lk version = 1
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.748449] I [MSGID: 114057]
>>>>>>>>>>>>>>>> [client-handshake.c:1478:select_server_supported_programs]
>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Using Program GlusterFS 3.3, Num (1298437), Version
>>>>>>>>>>>>>>>> (330)
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750219] I [MSGID: 114046]
>>>>>>>>>>>>>>>> [client-handshake.c:1231:client_setvolume_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Connected to atlasglust-client-6, attached to remote
>>>>>>>>>>>>>>>> volume '/glusteratlas/brick007/gv0'.
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750261] I [MSGID: 114047]
>>>>>>>>>>>>>>>> [client-handshake.c:1242:client_setvolume_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Server and Client lk-version numbers are not same,
>>>>>>>>>>>>>>>> reopening the fds
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750503] I [MSGID: 114035]
>>>>>>>>>>>>>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Server lk version = 1
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752207] I
>>>>>>>>>>>>>>>> [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol
>>>>>>>>>>>>>>>> versions: glusterfs 7.24 kernel 7.14
>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752261] I
>>>>>>>>>>>>>>>> [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: switched to
>>>>>>>>>>>>>>>> graph 0
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is there a problem with the server and client lk-version?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kashif
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 11:52 PM, Vijay Bellur <
>>>>>>>>>>>>>>>> vbellur at redhat.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif <
>>>>>>>>>>>>>>>>> kashif.alig at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Since I updated our gluster server and client to the
>>>>>>>>>>>>>>>>>> latest version, 3.12.9-1, I have been having this issue of gluster
>>>>>>>>>>>>>>>>>> getting unmounted from the client very regularly. It was not a problem
>>>>>>>>>>>>>>>>>> before the update.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It's a distributed file system with no replication. We
>>>>>>>>>>>>>>>>>> have seven servers totaling around 480TB of data. It's 97% full.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am using following config on server
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust features.cache-invalidation
>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust
>>>>>>>>>>>>>>>>>> features.cache-invalidation-timeout 600
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust
>>>>>>>>>>>>>>>>>> performance.cache-invalidation on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust
>>>>>>>>>>>>>>>>>> performance.md-cache-timeout 600
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust
>>>>>>>>>>>>>>>>>> performance.parallel-readdir on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.cache-size 1GB
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust
>>>>>>>>>>>>>>>>>> performance.client-io-threads on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust cluster.lookup-optimize on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust client.event-threads 4
>>>>>>>>>>>>>>>>>> gluster volume set atlasglust server.event-threads 4
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Clients are mounted with these options:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> defaults,direct-io-mode=disable,attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,rw,_netdev
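>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For completeness, I believe the options as actually applied can be
>>>>>>>>>>>>>>>>>> listed on any server with something like the following, if that output
>>>>>>>>>>>>>>>>>> helps:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> gluster volume info atlasglust
>>>>>>>>>>>>>>>>>> gluster volume get atlasglust all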
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I can't see anything in the log file. Can someone suggest
>>>>>>>>>>>>>>>>>> how to troubleshoot this issue?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Can you please share the log file? Checking for messages
>>>>>>>>>>>>>>>>> related to disconnections/crashes in the log file would be a good way to
>>>>>>>>>>>>>>>>> start troubleshooting the problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Milind
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Milind
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Milind
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Milind
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>