<div dir="ltr">Thank you. In the meantime, turning off parallel readdir should prevent the first crash.<div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 20 June 2018 at 21:42, mohammad kashif <span dir="ltr"><<a href="mailto:kashif.alig@gmail.com" target="_blank">kashif.alig@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Hi Nithya<br><br></div>Thanks for the bug report. This new crash happened only once and only at one client in the last 6 days. I will let you know if it happened again or more frequently. <br><br></div>Cheers<span class="HOEnZb"><font color="#888888"><br><br></font></span></div><span class="HOEnZb"><font color="#888888">Kashif <br></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran <span dir="ltr"><<a href="mailto:nbalacha@redhat.com" target="_blank">nbalacha@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Mohammad,<div><br></div><div>This is a different crash. How often does it happen?</div><div><br></div><div><br></div><div>We have managed to reproduce the first crash you reported and a bug has been filed at [1].</div><div>We will work on a fix for this.</div><div><br></div><div><br></div><div>Regards,</div><div>Nithya</div><div><br></div><div>[1] <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1593199" target="_blank">https://bugzilla.redhat.co<wbr>m/show_bug.cgi?id=1593199</a></div><div><br></div></div><div class="m_5953770867640638257HOEnZb"><div class="m_5953770867640638257h5"><div class="gmail_extra"><br><div class="gmail_quote">On 18 June 2018 at 14:09, mohammad kashif <span dir="ltr"><<a href="mailto:kashif.alig@gmail.com" target="_blank">kashif.alig@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi <div><br></div><div>Problem appeared again after few days. This time, the client is glusterfs-3.10.12-1.el6.x86<wbr>_64 and performance.parallel-readd<wbr>ir is off. The log level was set to ERROR and I got this log at the time of crash</div><div><br></div><div>[2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_u<wbr>nwind] (--> /usr/lib64/libglusterfs.so.0(_<wbr>gf_log_callingfn+0x153)[0x7fac<wbr>2e66ce03] (--> /usr/lib64/libgfrpc.so.0(saved<wbr>_frames_unwind+0x1e7)[0x7fac2e<wbr>434867] (--> /usr/lib64/libgfrpc.so.0(saved<wbr>_frames_destroy+0xe)[0x7fac2e4<wbr>3497e] (--> /usr/lib64/libgfrpc.so.0(rpc_c<wbr>lnt_connection_cleanup+0xa5)[0<wbr>x7fac2e434a45] (--> /usr/lib64/libgfrpc.so.0(rpc_c<wbr>lnt_notify+0x278)[0x7fac2e434d<wbr>68] ))))) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3) op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7<br></div><div><br></div><div>Core dump was enabled on client so it created a dump. It is here</div><div><br></div><div>
<a href="http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log" style="color:rgb(17,85,204);font-size:12.8px;background-color:rgb(255,255,255)" target="_blank">http://www-pnp.physics.ox.ac.u<wbr>k/~mohammad</a>/core.1002074 <br></div><div><br></div><div>I used a gdb trace using this command</div><div><br></div><div>gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee backtrace.log_18_16_1<br></div><div><br></div><div><br></div><div>
<a href="http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log" style="color:rgb(17,85,204);font-size:12.8px;background-color:rgb(255,255,255)" target="_blank">http://www-pnp.physics.ox.ac.u<wbr>k/~mohammad</a>/backtrace.log_18_1<wbr>6_1
I haven't used gdb much, so let me know if you want me to run gdb in a different manner.

Thanks

Kashif

On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa <rgowdapp@redhat.com> wrote:

On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <rgowdapp@redhat.com> wrote:

On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <rgowdapp@redhat.com> wrote:

From the bt:

#8 0x00007f6ef977e6de in rda_readdirp (frame=0x7f6eec862320, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#9 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#10 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#11 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#12 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#13 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#14 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#15 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#16 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#17 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#18 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#19 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#20 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#21 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388
#22 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861bb0, this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, xdata=0x7f6eec0085a0) at readdir-ahead.c:266
#23 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at dht-common.c:5388

It looks like an infinite recursion. Note that readdirp is wound to the same subvol (the value of "this" is the same in all calls to rda_readdirp) at the same offset (value 2). This may be a bug in DHT (winding down readdirp with a wrong offset) or in readdir-ahead (populating incorrect offset values in the dentries it returns in the readdirp response).

It looks to be a corruption. The value of the size argument in rda_readdirp is too big (around 127 TB) to be sane. If you have a reproducer, please run it in valgrind or ASAN.

I spoke too early. It could be a negative value, and hence it may not be a corruption. Is it possible to upload the core somewhere? Better still, access to a gdb session with this core would be more helpful.

To make it explicit: at the moment it is not clear whether the bug is in readdir-ahead or DHT, as it looks to be a memory corruption. Till I get a reproducer or valgrind/ASAN output of the client process when the issue occurs, I won't be working on this problem.
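For anyone who can reproduce it, one way to capture the valgrind output Raghavendra is asking for is to start the fuse client by hand under valgrind instead of through mount(8); the volfile server and mount point below are placeholders, not values taken from this thread:

$ valgrind --track-origins=yes --log-file=/tmp/glusterfs-valgrind.%p.log \
      /usr/sbin/glusterfs --volfile-server=<server> --volfile-id=atlasglust -N /mnt/atlasglust

The -N flag keeps the client in the foreground so valgrind can follow it; unmount the directory as usual when finished and the log file will hold the report.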
On Wed, Jun 13, 2018 at 4:29 PM, mohammad kashif <kashif.alig@gmail.com> wrote:

Hi Milind

Thanks a lot, I managed to run gdb and produced a backtrace as well. It's here:

http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log

I am trying to understand it but am still not able to make sense of it.

Thanks

Kashif

On Wed, Jun 13, 2018 at 11:34 AM, Milind Changire <mchangir@redhat.com> wrote:

Kashif,
FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/

On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif <kashif.alig@gmail.com> wrote:

Hi Milind

There is no glusterfs-debuginfo available for gluster-3.12 from the http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/ repo. Do you know where I can get it?

Also, when I run gdb, it says:

Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.12.9-1.el6.x86_64

I can't find a debug package for glusterfs-fuse either.

Thanks from the pit of despair ;)

Kashif

On Tue, Jun 12, 2018 at 5:01 PM, mohammad kashif <kashif.alig@gmail.com> wrote:

Hi Milind

I will send you links for the logs.

I collected these core dumps at the client, and there is no glusterd process running on the client.

Kashif

On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire <mchangir@redhat.com> wrote:

Kashif,
Could you also send over the client/mount log file as Vijay suggested? Or at least the lines with the crash backtrace.

Also, you've mentioned that you straced glusterd, but when you ran gdb, you ran it over /usr/sbin/glusterfs.
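Coming back to the missing debuginfo a few messages up: the CentOS Storage SIG publishes debuginfo packages in a separate tree at the URL Milind gave, so one approach (untested here; the repo id is made up and the package name is taken from Milind's earlier message) is to point yum at that tree and install from it:

$ cat > /etc/yum.repos.d/centos-storage-debuginfo.repo <<'EOF'
[centos-storage-sig-debuginfo]
name=CentOS-6 Storage SIG debuginfo
baseurl=http://debuginfo.centos.org/centos/6/storage/x86_64/
enabled=1
gpgcheck=0
EOF
$ yum install glusterfs-debuginfo

With that package in place, re-running the same gdb command should show source files and line numbers in the backtrace.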
On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur <vbellur@redhat.com> wrote:

On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif <kashif.alig@gmail.com> wrote:

Hi Milind

The operating system is Scientific Linux 6, which is based on RHEL 6. The CPU arch is Intel x86_64.

I will send you a separate email with a link to the core dump.

You could also grep for crash in the client log file; the lines following the crash would have a backtrace in most cases.

HTH,
Vijay

Thanks for your help.

Kashif

On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire <mchangir@redhat.com> wrote:

Kashif,
Could you share the core dump via Google Drive or something similar?

Also, let me know the CPU arch and OS distribution on which you are running gluster.

If you've installed the glusterfs-debuginfo package, you'll also get the source lines in the backtrace via gdb.

On Tue, Jun 12, 2018 at 5:59 PM, mohammad kashif <kashif.alig@gmail.com> wrote:

Hi Milind, Vijay

Thanks, I have some more information now, as I straced glusterd on the client:

138544 0.000131 mprotect(0x7f2f70785000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000026>
138544 0.000128 mprotect(0x7f2f70786000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
138544 0.000126 mprotect(0x7f2f70787000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
138544 0.000124 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} ---
138544 0.000051 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
138551 0.105048 +++ killed by SIGSEGV (core dumped) +++
138550 0.000041 +++ killed by SIGSEGV (core dumped) +++
138547 0.000008 +++ killed by SIGSEGV (core dumped) +++
138546 0.000007 +++ killed by SIGSEGV (core dumped) +++
138545 0.000007 +++ killed by SIGSEGV (core dumped) +++
138544 0.000008 +++ killed by SIGSEGV (core dumped) +++
138543 0.000007 +++ killed by SIGSEGV (core dumped) +++

As far as I understand, gluster is somehow trying to access memory in an inappropriate manner and the kernel sends SIGSEGV.

I also got the core dump. I am trying gdb for the first time, so I am not sure whether I am using it correctly:

gdb /usr/sbin/glusterfs core.138536

It just tells me that the program terminated with signal 11, segmentation fault.

The problem is not limited to one client but is happening on many clients.

I would really appreciate any help, as the whole file system has become unusable.

Thanks

Kashif

On Tue, Jun 12, 2018 at 12:26 PM, Milind Changire <mchangir@redhat.com> wrote:

Kashif,
You can change the log level by:

$ gluster volume set <vol> diagnostics.brick-log-level TRACE
$ gluster volume set <vol> diagnostics.client-log-level TRACE

and see how things fare.

If you want fewer logs you can change the log level to DEBUG instead of TRACE.
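As a complement to those two commands: the client log can also be made verbose for a single mount, and the options can be put back to their defaults once enough has been captured. This is only a sketch; the server name and mount point are placeholders:

$ mount -t glusterfs -o log-level=DEBUG <server>:/atlasglust /mnt/atlasglust   # verbose log for this mount only
$ gluster volume reset atlasglust diagnostics.client-log-level                 # back to the default when done
$ gluster volume reset atlasglust diagnostics.brick-log-level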
target="_blank">kashif.alig@gmail.com</a>></span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="m_5953770867640638257m_-7941410713842890653m_467285906852344872m_7598081866559255810m_196425271823366050m_3873033849228088648m_978616729599433356m_5014525246591411878m_2285939528783175411m_4375213844299906572m_4931964523437139568m_4152801743731486923m_-4451941051880752073m_3059762644582098629m_7147468144477528676h5"><div dir="ltr"><div>Hi Vijay</div><div><br></div><div>Now it is unmounting every 30 mins ! <br></div><div><br></div><div>The server log at /var/log/glusterfs/bricks/glus<wbr>teratlas-brics001-gv0.log have this line only</div><div><br></div><div>2018-06-12 09:53:19.303102] I [MSGID: 115013] [server-helpers.c:289:do_fd_cl<wbr>eanup] 0-atlasglust-server: fd cleanup on /atlas/atlasdata/zgubic/hmumu/<wbr>histograms/v14.3/Signal<br>[2018-06-12 09:53:19.306190] I [MSGID: 101055] [client_t.c:443:gf_client_unre<wbr>f] 0-atlasglust-server: Shutting down connection <server-name> -2224879-2018/06/12-09:51:01:4<wbr>60889-atlasglust-client-0-0-0</div><div><br></div><div>There is no other information. Is there any way to increase log verbosity?</div><div><br></div><div>on the client <br></div><div><br></div><div>2018-06-12 09:51:01.744980] I [MSGID: 114057] [client-handshake.c:1478:selec<wbr>t_server_supported_programs] 0-atlasglust-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)<br>[2018-06-12 09:51:01.746508] I [MSGID: 114046] [client-handshake.c:1231:clien<wbr>t_setvolume_cbk] 0-atlasglust-client-5: Connected to atlasglust-client-5, attached to remote volume '/glusteratlas/brick006/gv0'.<br>[2018-06-12 09:51:01.746543] I [MSGID: 114047] [client-handshake.c:1242:clien<wbr>t_setvolume_cbk] 0-atlasglust-client-5: Server and Client lk-version numbers are not same, reopening the fds<br>[2018-06-12 09:51:01.746814] I [MSGID: 114035] [client-handshake.c:202:client<wbr>_set_lk_version_cbk] 0-atlasglust-client-5: Server lk version = 1<br>[2018-06-12 09:51:01.748449] I [MSGID: 114057] [client-handshake.c:1478:selec<wbr>t_server_supported_programs] 0-atlasglust-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)<br>[2018-06-12 09:51:01.750219] I [MSGID: 114046] [client-handshake.c:1231:clien<wbr>t_setvolume_cbk] 0-atlasglust-client-6: Connected to atlasglust-client-6, attached to remote volume '/glusteratlas/brick007/gv0'.<br>[2018-06-12 09:51:01.750261] I [MSGID: 114047] [client-handshake.c:1242:clien<wbr>t_setvolume_cbk] 0-atlasglust-client-6: Server and Client lk-version numbers are not same, reopening the fds<br>[2018-06-12 09:51:01.750503] I [MSGID: 114035] [client-handshake.c:202:client<wbr>_set_lk_version_cbk] 0-atlasglust-client-6: Server lk version = 1<br>[2018-06-12 09:51:01.752207] I [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.14<br>[2018-06-12 09:51:01.752261] I [fuse-bridge.c:4835:fuse_graph<wbr>_sync] 0-fuse: switched to graph 0<br></div><div><br></div><div><br></div><div>is there a problem with server and client 1k version?</div><div><br></div><div>Thanks for your help.</div><span 
class="m_5953770867640638257m_-7941410713842890653m_467285906852344872m_7598081866559255810m_196425271823366050m_3873033849228088648m_978616729599433356m_5014525246591411878m_2285939528783175411m_4375213844299906572m_4931964523437139568m_4152801743731486923m_-4451941051880752073m_3059762644582098629m_7147468144477528676m_605157100837625209HOEnZb"><font color="#888888"><div><br></div><div>Kashif<br></div><div><br></div><div><br></div><div><br></div><div> </div></font></span></div><div class="m_5953770867640638257m_-7941410713842890653m_467285906852344872m_7598081866559255810m_196425271823366050m_3873033849228088648m_978616729599433356m_5014525246591411878m_2285939528783175411m_4375213844299906572m_4931964523437139568m_4152801743731486923m_-4451941051880752073m_3059762644582098629m_7147468144477528676m_605157100837625209HOEnZb"><div class="m_5953770867640638257m_-7941410713842890653m_467285906852344872m_7598081866559255810m_196425271823366050m_3873033849228088648m_978616729599433356m_5014525246591411878m_2285939528783175411m_4375213844299906572m_4931964523437139568m_4152801743731486923m_-4451941051880752073m_3059762644582098629m_7147468144477528676m_605157100837625209h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jun 11, 2018 at 11:52 PM, Vijay Bellur <span dir="ltr"><<a href="mailto:vbellur@redhat.com" target="_blank">vbellur@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif <span dir="ltr"><<a href="mailto:kashif.alig@gmail.com" target="_blank">kashif.alig@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi</div><div><br></div><div>Since I have updated our gluster server and client to latest version 3.12.9-1, I am having this issue of gluster getting unmounted from client very regularly. It was not a problem before update.<br></div><div><br></div><div>Its a distributed file system with no replication. We have seven servers totaling around 480TB data. Its 97% full. <br></div><div><br></div><div>I am using following config on server</div><div><br></div><div><br></div>gluster volume set atlasglust features.cache-invalidation on<br>gluster volume set atlasglust features.cache-invalidation-ti<wbr>meout 600<br>gluster volume set atlasglust performance.stat-prefetch on<br>gluster volume set atlasglust performance.cache-invalidation on<br>gluster volume set atlasglust performance.md-cache-timeout 600<br>gluster volume set atlasglust performance.parallel-readdir on<br>gluster volume set atlasglust performance.cache-size 1GB<br>gluster volume set atlasglust performance.client-io-threads on<br>gluster volume set atlasglust cluster.lookup-optimize on<br>gluster volume set atlasglust performance.stat-prefetch on<br>gluster volume set atlasglust client.event-threads 4<br>gluster volume set atlasglust server.event-threads 4<br><div><br></div><div>clients are mounted with this option</div><div><br></div><div>defaults,direct-io-mode=disabl<wbr>e,attribute-timeout=600,entry-<wbr>timeout=600,negative-timeout=6<wbr>00,fopen-keep-cache,rw,_netdev <br></div><div><br></div><div>I can't see anything in the log file. 
I can't see anything in the log file. Can someone suggest how to troubleshoot this issue?

Can you please share the log file? Checking for messages related to disconnections/crashes in the log file would be a good way to start troubleshooting the problem.

Thanks,
Vijay
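A hedged sketch of what that first pass over the client log can look like; the exact log file name is derived from the mount point, so check for the actual name under /var/log/glusterfs/ before running these:

$ grep -iE 'crash|disconnect' /var/log/glusterfs/mnt-atlasglust.log    # quick scan for trouble
$ grep -A 30 'signal received' /var/log/glusterfs/mnt-atlasglust.log   # a crash report usually carries a backtrace after this line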
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users