<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p><br>
    </p>
    <p>Hi Strahil,</p>
    <p>I ran a couple of tests in which I tried to gunzip the file with
      top running on the client (mseas) and on the brick server
      (mseas-data3), and with iotop running on the client (mseas).  I
      have not been able to install iotop on the brick server yet (the
      external line is down).  I'll repeat the tests once I fix that
      problem.</p>
    <p>I now get one of two error messages when gunzip fails:</p>
    <ul>
      <li>gzip:
        /projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz:
        File descriptor in bad state</li>
      <ul>
        <li>a new error message<br>
        </li>
      </ul>
      <li>gzip:
        /projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz:
        Transport endpoint is not connected</li>
      <ul>
        <li>the original error message<br>
        </li>
      </ul>
    </ul>
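    <p>For reference, both messages look like standard C-library errno
      strings. A minimal sketch, assuming a Linux machine with Python 3
      (the helper name <code>errno_names_for</code> is made up for
      illustration), that looks up which errno symbols correspond to a
      given message:</p>
    <pre>
```python
import errno
import os


def errno_names_for(message):
    """Return the errno symbols whose strerror() text equals `message`."""
    names = []
    for code, name in sorted(errno.errorcode.items()):
        if os.strerror(code) == message:
            names.append(name)
    return names


if __name__ == "__main__":
    # The two messages gunzip printed in the tests above.
    for text in ("File descriptor in bad state",
                 "Transport endpoint is not connected"):
        print(text, "=", errno_names_for(text))
```
    </pre>
    <p>On Linux with glibc these should come back as EBADFD and ENOTCONN
      respectively; other platforms may word the messages differently.</p>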
    <p>What I observed while waiting for gunzip to fail:</p>
    <ul>
      <li>top</li>
      <ul>
        <li>no significant load (usually less than 0.1) on either
          machine</li>
        <li>zero IO-wait on both machines</li>
      </ul>
      <li>iotop (only running on the client)</li>
      <ul>
        <li>nothing related to gluster showing up in the display at all</li>
      </ul>
    </ul>
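    <p>A minimal sketch of one way to log I/O-wait with timestamps so it
      can be lined up against the gluster logs. It assumes a Linux
      /proc/stat in the usual field order (user, nice, system, idle,
      iowait, irq, softirq, steal); the function names are illustrative
      and not from any existing tool. Timestamps are printed in UTC on
      the assumption that the gluster logs use UTC (they run four hours
      ahead of the syslog prefix in the messages file).</p>
    <pre>
```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples.

    iowait is the fifth field (index 4). Only the first eight fields are
    summed because guest time is already counted inside user time.
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_ticks(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())


def watch_iowait(samples, interval=1.0, path="/proc/stat"):
    """Print one UTC-stamped iowait reading per interval, `samples` times."""
    previous = read_cpu_ticks(path)
    for _ in range(samples):
        time.sleep(interval)
        current = read_cpu_ticks(path)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
        print(stamp, "iowait %.1f%%" % iowait_percent(previous, current))
        previous = current
```
    </pre>
    <p>Calling <code>watch_iowait(120)</code> in one terminal while
      gunzip runs in another would print one reading per second for two
      minutes.</p>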
    <p>Below I include what I found in the log files corresponding to
      these tests, along with the gluster-related lines from dmesg on
      the brick server (nothing showed up in dmesg on the client).</p>
    <p>Please let me know what I should try next.</p>
    <p>Thanks</p>
    <p>Pat<br>
    </p>
    <p><font face="monospace"><br>
        ------------------------------------------<br>
        mseas-data3: dmesg | grep glust<br>
        ------------------------------------------<br>
        many repeats of the following pairs of lines:<br>
        <br>
        glusterfsd: page allocation failure. order:1, mode:0x20<br>
        Pid: 14245, comm: glusterfsd Not tainted
        2.6.32-754.2.1.el6.x86_64 #1<br>
        <br>
        ------------------------------------------<br>
        mseas:messages<br>
        ------------------------------------------<br>
        Jun 21 17:04:35 mseas gdata[155485]: [2022-06-21
        21:04:35.638810] C
        [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
        0-data-volume-client-2: server 172.16.1.113:49153 has not
        responded in the last 42 seconds, disconnecting.<br>
        <br>
        Jun 21 17:21:04 mseas gdata[155485]: [2022-06-21
        21:21:04.786083] C
        [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
        0-data-volume-client-2: server 172.16.1.113:49153 has not
        responded in the last 42 seconds, disconnecting.<br>
        <br>
        ------------------------------------------<br>
        mseas:gdata.log<br>
        ------------------------------------------<br>
        [2022-06-21 21:04:35.638810] C
        [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
        0-data-volume-client-2: server 172.16.1.113:49153 has not
        responded in the last 42 seconds, disconnecting.<br>
        [2022-06-21 21:04:35.639261] E
        [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
        (-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
        (-->
        /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
        ))))) 0-data-volume-client-2: forced unwinding frame
        type(GlusterFS 3.3) op(READ(12)) called at 2022-06-21
        21:03:29.735807 (xid=0xc05d54)<br>
        [2022-06-21 21:04:35.639494] E
        [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
        (-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
        (-->
        /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
        ))))) 0-data-volume-client-2: forced unwinding frame
        type(GF-DUMP) op(NULL(2)) called at 2022-06-21 21:03:53.633472
        (xid=0xc05d55)<br>
        <br>
        <br>
        [2022-06-21 21:21:04.786083] C
        [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
        0-data-volume-client-2: server 172.16.1.113:49153 has not
        responded in the last 42 seconds, disconnecting.<br>
        [2022-06-21 21:21:04.786732] E
        [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
        (-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
        (-->
        /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
        ))))) 0-data-volume-client-2: forced unwinding frame
        type(GlusterFS 3.3) op(READ(12)) called at 2022-06-21
        21:19:52.634383 (xid=0xc05e31)<br>
        [2022-06-21 21:21:04.787172] E
        [rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
        (-->
        /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
        (-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
        (-->
        /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
        ))))) 0-data-volume-client-2: forced unwinding frame
        type(GF-DUMP) op(NULL(2)) called at 2022-06-21 21:20:22.780023
        (xid=0xc05e32)<br>
        <br>
        ------------------------------------------<br>
        mseas-data3: bricks/export-sda-brick3.log<br>
        ------------------------------------------<br>
        [2022-06-21 21:03:54.489638] I [MSGID: 115036]
        [server.c:552:server_rpc_notify] 0-data-volume-server:
        disconnecting connection from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-31<br>
        [2022-06-21 21:03:54.489752] I [MSGID: 115013]
        [server-helpers.c:294:do_fd_cleanup] 0-data-volume-server: fd
        cleanup on
        /projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz<br>
        [2022-06-21 21:03:54.489817] I [MSGID: 101055]
        [client_t.c:420:gf_client_unref] 0-data-volume-server: Shutting
        down connection
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-31<br>
        [2022-06-21 21:04:04.506544] I [MSGID: 115029]
        [server-handshake.c:690:server_setvolume] 0-data-volume-server:
        accepted client from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-32
        (version: 3.7.11)<br>
        <br>
        <br>
        [2022-06-21 21:20:23.625096] I [MSGID: 115036]
        [server.c:552:server_rpc_notify] 0-data-volume-server:
        disconnecting connection from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-32<br>
        [2022-06-21 21:20:23.625189] I [MSGID: 115013]
        [server-helpers.c:294:do_fd_cleanup] 0-data-volume-server: fd
        cleanup on
        /projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz<br>
        [2022-06-21 21:20:23.625255] I [MSGID: 101055]
        [client_t.c:420:gf_client_unref] 0-data-volume-server: Shutting
        down connection
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-32<br>
        [2022-06-21 21:20:23.641462] I [MSGID: 115029]
        [server-handshake.c:690:server_setvolume] 0-data-volume-server:
        accepted client from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-33
        (version: 3.7.11)<br>
      </font><br>
    </p>
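    <p>The 42-second figure can be checked against the timestamps in
      gdata.log above. A small sketch, with the timestamps copied by
      hand from that log, that adds the 42 s timeout to the time each
      GF-DUMP NULL call (which appears to be the ping request) was made
      and compares the result with the time the disconnect was logged.
      The variable and function names are illustrative only.</p>
    <pre>
```python
from datetime import datetime, timedelta

PING_TIMEOUT = timedelta(seconds=42)  # value reported in the log messages
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# (time the GF-DUMP NULL call was made, time the timeout was logged),
# both copied from mseas:gdata.log above.
EVENTS = [
    ("2022-06-21 21:03:53.633472", "2022-06-21 21:04:35.638810"),
    ("2022-06-21 21:20:22.780023", "2022-06-21 21:21:04.786083"),
]


def seconds_past_deadline(ping_called, timeout_logged):
    """Seconds between the ping's 42 s deadline and the logged disconnect."""
    called = datetime.strptime(ping_called, STAMP_FORMAT)
    logged = datetime.strptime(timeout_logged, STAMP_FORMAT)
    return (logged - (called + PING_TIMEOUT)).total_seconds()


if __name__ == "__main__":
    for ping_called, timeout_logged in EVENTS:
        lag = seconds_past_deadline(ping_called, timeout_logged)
        print(ping_called, "->", timeout_logged, "lag %.6f s" % lag)
```
    </pre>
    <p>For both failures the disconnect is logged within a few
      milliseconds of the deadline, which is consistent with the brick
      not answering that call at all during the 42 s window.</p>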
    <div class="moz-cite-prefix">On 6/17/22 2:18 AM, Strahil Nikolov
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:302258344.2824399.1655446681644@mail.yahoo.com">
      Check the load with top &amp; iotop.
      <div>Especially check the wait for I/O in top.</div>
      <div><br>
      </div>
      <div>Did you check dmesg for any clues ?</div>
      <div><br>
      </div>
      <div>Best Regards,</div>
      <div>Strahil Nikolov<br>
        <br>
        <blockquote style="margin: 0 0 20px 0;">
          <div style="font-family:Roboto, sans-serif; color:#6D00F6;">
            <div>On Thu, Jun 16, 2022 at 22:59, Pat Haley</div>
            <div><a class="moz-txt-link-rfc2396E" href="mailto:phaley@mit.edu">&lt;phaley@mit.edu&gt;</a> wrote:</div>
          </div>
          <div style="padding: 10px 0 0 20px; margin: 10px 0 0 0;
            border-left: 1px solid #6D00F6;">
            <div id="yiv5506566618">
              <div>
                <p><br clear="none">
                </p>
                <p>Hi Strahil,</p>
                <p>I poked around our logs and found this on the
                  front-end (from the day and time we last had the
                  issue):</p>
                <p><br clear="none">
                </p>
                <p><font face="monospace">Jun 15 10:51:17 mseas
                    gdata[155485]: [2022-06-15 14:51:17.263858] C
                    [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
                    0-data-volume-client-2: server 172.16.1.113:49153
                    has not responded in the last 42 seconds,
                    disconnecting.<br clear="none">
                  </font></p>
                <p><br clear="none">
                </p>
                <p>This would indicate that the problem is related.  For
                  us, however, I believe we can reproduce this issue at
                  will (i.e. simply try to gunzip the same file).
                  Unfortunately I have to go to a meeting now, but if
                  you have some specific tests you'd like me to try, I
                  can try them when I get back.</p>
                <p>Thanks</p>
                <p>Pat</p>
                <p><br clear="none">
                </p>
                <p><br clear="none">
                </p>
                <div class="yiv5506566618moz-cite-prefix">On 6/16/22
                  3:07 PM, Strahil Nikolov wrote:<br clear="none">
                </div>
                <blockquote type="cite"> </blockquote>
              </div>
              <div> Pat, 
                <div><br clear="none">
                </div>
                <div>
                  <div>Can you check the CPU and disk performance when
                    the volume reports the issue?</div>
                  <div><br clear="none">
                  </div>
                </div>
                <div><br clear="none">
                </div>
                <div>It seems that a similar issue was reported
                  in <a rel="nofollow noopener noreferrer" shape="rect"
                    target="_blank"
href="https://lists.gluster.org/pipermail/gluster-users/2019-March/035944.html"
                    class="yiv5506566618moz-txt-link-freetext
                    moz-txt-link-freetext" moz-do-not-send="true">https://lists.gluster.org/pipermail/gluster-users/2019-March/035944.html</a>
                  but I don't see a clear solution.</div>
                <div>Take a look in the thread and check if it matches
                  your symptoms.</div>
                <div><br clear="none">
                </div>
                <div><br clear="none">
                </div>
                <div>Best Regards,</div>
                <div>Strahil Nikolov<br clear="none">
                  <br clear="none">
                  <blockquote style="margin:0 0 20px 0;">
                    <div style="font-family:Roboto,
                      sans-serif;color:#6D00F6;">
                      <div>On Thu, Jun 16, 2022 at 18:14, Pat Haley</div>
                      <div><a rel="nofollow noopener noreferrer"
                          shape="rect" ymailto="mailto:phaley@mit.edu"
                          target="_blank" href="mailto:phaley@mit.edu"
                          class="yiv5506566618moz-txt-link-rfc2396E"
                          moz-do-not-send="true">&lt;phaley@mit.edu&gt;</a>
                        wrote:</div>
                    </div>
                    <div style="padding:10px 0 0 20px;margin:10px 0 0
                      0;border-left:1px solid #6D00F6;">
                      <div id="yiv5506566618">
                        <div>
                          <p><br clear="none">
                          </p>
                          <p>Hi Strahil,</p>
                          <p>I poked around again, and for brick 3 (where
                            the file we were testing resides) I only
                            found the same log file that was at the
                            bottom of my first email:</p>
                          <p><br clear="none">
                            <font face="monospace">---------------------------------------------------<br
                                clear="none">
                              mseas-data3:  bricks/export-sda-brick3.log<br
                                clear="none">
                              -----------------------------------------<br
                                clear="none">
                              [2022-06-15 14:50:42.588143] I [MSGID:
                              115036] [server.c:552:server_rpc_notify]
                              0-data-volume-server: disconnecting
                              connection from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-28<br
                                clear="none">
                              [2022-06-15 14:50:42.588220] I [MSGID:
                              115013]
                              [server-helpers.c:294:do_fd_cleanup]
                              0-data-volume-server: fd cleanup on
/projects/posydon/Acoustics_ASA/MSEAS-ParEq-DO/Save/2D/Test_Cases/RI/DO_NAPE_JASA_Paper/Uncertain_Pekeris_Waveguide_DO_MC<br
                                clear="none">
                              [2022-06-15 14:50:42.588259] I [MSGID:
                              115013]
                              [server-helpers.c:294:do_fd_cleanup]
                              0-data-volume-server: fd cleanup on
                              /projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz<br
                                clear="none">
                              [2022-06-15 14:50:42.588288] I [MSGID:
                              101055] [client_t.c:420:gf_client_unref]
                              0-data-volume-server: Shutting down
                              connection
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-28<br
                                clear="none">
                              [2022-06-15 14:50:53.605215] I [MSGID:
                              115029]
                              [server-handshake.c:690:server_setvolume]
                              0-data-volume-server: accepted client from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-29
                              (version: 3.7.11)<br clear="none">
                              [2022-06-15 14:50:42.588247] I [MSGID:
                              115013]
                              [server-helpers.c:294:do_fd_cleanup]
                              0-data-volume-server: fd cleanup on
/projects/posydon/Acoustics_ASA/MSEAS-ParEq-DO/Save/2D/Test_Cases/RI/DO_NAPE_JASA_Paper/Uncertain_Pekeris_Waveguide_DO_MC</font><br
                              clear="none">
                          </p>
                          <p>Thanks</p>
                          <p>Pat</p>
                          <p><br clear="none">
                          </p>
                          <div id="yiv5506566618yqt80756"
                            class="yiv5506566618yqt6062330617">
                            <div class="yiv5506566618moz-cite-prefix">On
                              6/15/22 6:47 PM, Strahil Nikolov wrote:<br
                                clear="none">
                            </div>
                            <blockquote type="cite"> </blockquote>
                          </div>
                        </div>
                        <div id="yiv5506566618yqt04765"
                          class="yiv5506566618yqt6062330617">
                          <div>
                            <div id="yiv5506566618">
                              <div>I agree. It will be very hard to
                                debug.
                                <div><br clear="none">
                                </div>
                                <div>Anything in the brick logs ?</div>
                                <div><br clear="none">
                                </div>
                                <div>I think it's pointless to mention
                                  that EL6 is dead and Gluster v3 is so
                                  old that it's worth considering a
                                  migration to a newer setup.</div>
                                <div><br clear="none">
                                </div>
                                <div>Best Regards,</div>
                                <div>Strahil Nikolov<br clear="none">
                                  <br clear="none">
                                  <blockquote style="margin:0 0 20px 0;">
                                    <div style="font-family:Roboto,
                                      sans-serif;color:#6D00F6;">
                                      <div id="yiv5506566618yqtfd58221"
class="yiv5506566618yqt6679717770">
                                        <div>On Wed, Jun 15, 2022 at
                                          22:51, Yaniv Kaul</div>
                                        <div><a rel="nofollow noopener
                                            noreferrer" shape="rect"
                                            ymailto="mailto:ykaul@redhat.com"
                                            target="_blank"
                                            href="mailto:ykaul@redhat.com"
class="yiv5506566618moz-txt-link-rfc2396E" moz-do-not-send="true">&lt;ykaul@redhat.com&gt;</a>
                                          wrote:</div>
                                      </div>
                                    </div>
                                    <div id="yiv5506566618yqtfd56990"
                                      class="yiv5506566618yqt6679717770">
                                      <div style="padding:10px 0 0
                                        20px;margin:10px 0 0
                                        0;border-left:1px solid
                                        #6D00F6;"> ________<br
                                          clear="none">
                                        <br clear="none">
                                        <br clear="none">
                                        <br clear="none">
                                        Community Meeting Calendar:<br
                                          clear="none">
                                        <br clear="none">
                                        Schedule -<br clear="none">
                                        Every 2nd and 4th Tuesday at
                                        14:30 IST / 09:00 UTC<br
                                          clear="none">
                                        Bridge: <a rel="nofollow
                                          noopener noreferrer"
                                          shape="rect" target="_blank"
                                          href="https://meet.google.com/cpu-eiue-hvk"
class="yiv5506566618moz-txt-link-freetext moz-txt-link-freetext"
                                          moz-do-not-send="true">https://meet.google.com/cpu-eiue-hvk</a><br
                                          clear="none">
                                        Gluster-users mailing list<br
                                          clear="none">
                                        <a rel="nofollow noopener
                                          noreferrer" shape="rect"
                                          ymailto="mailto:Gluster-users@gluster.org"
                                          target="_blank"
                                          href="mailto:Gluster-users@gluster.org"
class="yiv5506566618moz-txt-link-freetext moz-txt-link-freetext"
                                          moz-do-not-send="true">Gluster-users@gluster.org</a><br
                                          clear="none">
                                        <a rel="nofollow noopener
                                          noreferrer" shape="rect"
                                          target="_blank"
                                          href="https://lists.gluster.org/mailman/listinfo/gluster-users"
class="yiv5506566618moz-txt-link-freetext moz-txt-link-freetext"
                                          moz-do-not-send="true">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br
                                          clear="none">
                                      </div>
                                    </div>
                                  </blockquote>
                                </div>
                              </div>
                            </div>
                            <pre class="yiv5506566618moz-signature">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  <a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:phaley@mit.edu" target="_blank" href="mailto:phaley@mit.edu" class="yiv5506566618moz-txt-link-abbreviated yiv5506566618moz-txt-link-freetext moz-txt-link-freetext" moz-do-not-send="true">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a rel="nofollow noopener noreferrer" shape="rect" target="_blank" href="http://web.mit.edu/phaley/www/" class="yiv5506566618moz-txt-link-freetext moz-txt-link-freetext" moz-do-not-send="true">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre>
                          </div>
                          <div id="yiv5506566618yqtfd71224"
                            class="yiv5506566618yqt0138793630"> </div>
                        </div>
                        <div id="yiv5506566618yqtfd75825"
                          class="yiv5506566618yqt0138793630"> </div>
                      </div>
                      <div id="yiv5506566618yqtfd56693"
                        class="yiv5506566618yqt0138793630"> </div>
                    </div>
                    <div id="yiv5506566618yqtfd94279"
                      class="yiv5506566618yqt0138793630"> </div>
                  </blockquote>
                  <div id="yiv5506566618yqtfd14360"
                    class="yiv5506566618yqt0138793630"> </div>
                </div>
                <div id="yiv5506566618yqtfd34488"
                  class="yiv5506566618yqt0138793630">
                </div>
              </div>
            </div>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <pre class="moz-signature" cols="72">-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  <a class="moz-txt-link-abbreviated" href="mailto:phaley@mit.edu">phaley@mit.edu</a>
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    <a class="moz-txt-link-freetext" href="http://web.mit.edu/phaley/www/">http://web.mit.edu/phaley/www/</a>
77 Massachusetts Avenue
Cambridge, MA  02139-4301
</pre>
  </body>
</html>