<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
@Jim Kusznir<br>
<br>
For the heal issue, can you provide the getfattr output of one of
the 8 files in question from all 3 bricks?<br>
Example: `getfattr -d -m . -e hex
/gluster/brick3/data-hdd/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9`<br>
Also provide the stat output of the same file from all 3 bricks.<br>
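Example, using the same file path as in the getfattr example above: `stat
/gluster/brick3/data-hdd/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9`<br>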
<br>
Thanks,<br>
Ravi<br>
<p><br>
</p>
<br>
<div class="moz-cite-prefix">On 05/30/2018 09:47 AM, Krutika
Dhananjay wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAPhYV8M6AGAMZCmSPi9m2ktffha3UeP0dpnuN1mBUsUjzNNj-w@mail.gmail.com">
<div dir="ltr">
<div>Adding Ravi to look into the heal issue.</div>
<div><br>
</div>
<div>As for the fsync hang and subsequent IO errors, it seems a
lot like <a
href="https://bugzilla.redhat.com/show_bug.cgi?id=1497156"
moz-do-not-send="true">https://bugzilla.redhat.com/show_bug.cgi?id=1497156</a>
and Paolo Bonzini from qemu had pointed out that this would be
fixed by the following commit:</div>
<div><br>
</div>
<div>
<pre class="gmail-bz_comment_text gmail-bz_wrap_comment_text" id="gmail-comment_text_20"> commit e72c9a2a67a6400c8ef3d01d4c461dbbbfa0e1f0
Author: Paolo Bonzini <<a href="mailto:pbonzini@redhat.com" moz-do-not-send="true">pbonzini@redhat.com</a>>
Date: Wed Jun 21 16:35:46 2017 +0200
scsi: virtio_scsi: let host do exception handling
virtio_scsi tries to do exception handling after the default 30 seconds
timeout expires. However, it's better to let the host control the
timeout, otherwise with a heavy I/O load it is likely that an abort will
also timeout. This leads to fatal errors like filesystems going
offline.
Disable the 'sd' timeout and allow the host to do exception handling,
following the precedent of the storvsc driver.
Hannes has a proposal to introduce timeouts in virtio, but this provides
an immediate solution for stable kernels too.
[mkp: fixed typo]
Reported-by: Douglas Miller <<a href="mailto:dougmill@linux.vnet.ibm.com" moz-do-not-send="true">dougmill@linux.vnet.ibm.com</a>>
Cc: "James E.J. Bottomley" <<a href="mailto:jejb@linux.vnet.ibm.com" moz-do-not-send="true">jejb@linux.vnet.ibm.com</a>>
Cc: "Martin K. Petersen" <<a href="mailto:martin.petersen@oracle.com" moz-do-not-send="true">martin.petersen@oracle.com</a>>
Cc: Hannes Reinecke <<a href="mailto:hare@suse.de" moz-do-not-send="true">hare@suse.de</a>>
Cc: <a href="mailto:linux-scsi@vger.kernel.org" moz-do-not-send="true">linux-scsi@vger.kernel.org</a>
Cc: <a href="mailto:stable@vger.kernel.org" moz-do-not-send="true">stable@vger.kernel.org</a>
Signed-off-by: Paolo Bonzini <<a href="mailto:pbonzini@redhat.com" moz-do-not-send="true">pbonzini@redhat.com</a>>
Signed-off-by: Martin K. Petersen <<a href="mailto:martin.petersen@oracle.com" moz-do-not-send="true">martin.petersen@oracle.com</a>></pre>
</div>
<div><br>
</div>
<div>Adding Paolo/Kevin to comment.</div>
<div><br>
</div>
<div>As for the poor gluster performance, could you disable
cluster.eager-lock and see if that makes any difference:</div>
<div><br>
</div>
<div># gluster volume set <VOL> cluster.eager-lock off</div>
<div><br>
</div>
<div>Do also capture the volume profile again if you still see
performance issues after disabling eager-lock.<br>
</div>
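<div><br>
</div>
<div>For reference, one way to capture it is the standard gluster CLI
sequence (substitute your volume name, and let the workload run for a
few minutes between start and info):</div>
<div><br>
</div>
<div># gluster volume profile <VOL> start</div>
<div># gluster volume profile <VOL> info</div>
<div># gluster volume profile <VOL> stop</div>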
<div><br>
</div>
<div>-Krutika</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, May 30, 2018 at 6:55 AM, Jim
Kusznir <span dir="ltr"><<a
href="mailto:jim@palousetech.com" target="_blank"
moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">I also finally found the following in my
system log on one server:
<div><br>
</div>
<div>
<div>[10679.524491] INFO: task glusterclogro:14933
blocked for more than 120 seconds.</div>
<div>[10679.525826] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[10679.527144] glusterclogro D ffff97209832bf40
0 14933 1 0x00000080</div>
<div>[10679.527150] Call Trace:</div>
<div>[10679.527161] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[10679.527218] [<ffffffffc060e388>]
_xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
<div>[10679.527225] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[10679.527254] [<ffffffffc05eeb97>]
xfs_file_fsync+0x107/0x1e0 [xfs]</div>
<div>[10679.527260] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[10679.527268] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[10679.527271] [<ffffffffb944f3d0>]
SyS_fsync+0x10/0x20</div>
<div>[10679.527275] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[10679.527279] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[10679.527283] INFO: task glusterposixfsy:14941
blocked for more than 120 seconds.</div>
<div>[10679.528608] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[10679.529956] glusterposixfsy D ffff972495f84f10
0 14941 1 0x00000080</div>
<div>[10679.529961] Call Trace:</div>
<div>[10679.529966] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[10679.530003] [<ffffffffc060e388>]
_xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
<div>[10679.530008] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[10679.530038] [<ffffffffc05eeb97>]
xfs_file_fsync+0x107/0x1e0 [xfs]</div>
<div>[10679.530042] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[10679.530046] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[10679.530050] [<ffffffffb944f3f3>]
SyS_fdatasync+0x13/0x20</div>
<div>[10679.530054] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[10679.530058] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[10679.530062] INFO: task glusteriotwr13:15486
blocked for more than 120 seconds.</div>
<div>[10679.531805] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[10679.533732] glusteriotwr13 D ffff9720a83f0000
0 15486 1 0x00000080</div>
<div>[10679.533738] Call Trace:</div>
<div>[10679.533747] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[10679.533799] [<ffffffffc060e388>]
_xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
<div>[10679.533806] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[10679.533846] [<ffffffffc05eeb97>]
xfs_file_fsync+0x107/0x1e0 [xfs]</div>
<div>[10679.533852] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[10679.533858] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[10679.533863] [<ffffffffb944f3f3>]
SyS_fdatasync+0x13/0x20</div>
<div>[10679.533868] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[10679.533873] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[10919.512757] INFO: task glusterclogro:14933
blocked for more than 120 seconds.</div>
<div>[10919.514714] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[10919.516663] glusterclogro D ffff97209832bf40
0 14933 1 0x00000080</div>
<div>[10919.516677] Call Trace:</div>
<div>[10919.516690] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[10919.516696] [<ffffffffb99118e9>]
schedule_timeout+0x239/0x2c0</div>
<div>[10919.516703] [<ffffffffb951cc04>] ?
blk_finish_plug+0x14/0x40</div>
<div>[10919.516768] [<ffffffffc05e9224>] ?
_xfs_buf_ioapply+0x334/0x460 [xfs]</div>
<div>[10919.516774] [<ffffffffb991432d>]
wait_for_completion+0xfd/0x140</div>
<div>[10919.516782] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[10919.516821] [<ffffffffc05eb0a3>] ?
_xfs_buf_read+0x23/0x40 [xfs]</div>
<div>[10919.516859] [<ffffffffc05eafa9>]
xfs_buf_submit_wait+0xf9/0x1d0 [xfs]</div>
<div>[10919.516902] [<ffffffffc061b279>] ?
xfs_trans_read_buf_map+0x199/<wbr>0x400 [xfs]</div>
<div>[10919.516940] [<ffffffffc05eb0a3>]
_xfs_buf_read+0x23/0x40 [xfs]</div>
<div>[10919.516977] [<ffffffffc05eb1b9>]
xfs_buf_read_map+0xf9/0x160 [xfs]</div>
<div>[10919.517022] [<ffffffffc061b279>]
xfs_trans_read_buf_map+0x199/<wbr>0x400 [xfs]</div>
<div>[10919.517057] [<ffffffffc05c8d04>]
xfs_da_read_buf+0xd4/0x100 [xfs]</div>
<div>[10919.517091] [<ffffffffc05c8d53>]
xfs_da3_node_read+0x23/0xd0 [xfs]</div>
<div>[10919.517126] [<ffffffffc05c9fee>]
xfs_da3_node_lookup_int+0x6e/<wbr>0x2f0 [xfs]</div>
<div>[10919.517160] [<ffffffffc05d5a1d>]
xfs_dir2_node_lookup+0x4d/<wbr>0x170 [xfs]</div>
<div>[10919.517194] [<ffffffffc05ccf5d>]
xfs_dir_lookup+0x1bd/0x1e0 [xfs]</div>
<div>[10919.517233] [<ffffffffc05fd8d9>]
xfs_lookup+0x69/0x140 [xfs]</div>
<div>[10919.517271] [<ffffffffc05fa018>]
xfs_vn_lookup+0x78/0xc0 [xfs]</div>
<div>[10919.517278] [<ffffffffb9425cf3>]
lookup_real+0x23/0x60</div>
<div>[10919.517283] [<ffffffffb9426702>]
__lookup_hash+0x42/0x60</div>
<div>[10919.517288] [<ffffffffb942d519>]
SYSC_renameat2+0x3a9/0x5a0</div>
<div>[10919.517296] [<ffffffffb94d3753>] ?
selinux_file_free_security+<wbr>0x23/0x30</div>
<div>[10919.517304] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[10919.517309] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[10919.517313] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[10919.517318] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[10919.517323] [<ffffffffb942e58e>]
SyS_renameat2+0xe/0x10</div>
<div>[10919.517328] [<ffffffffb942e5ce>]
SyS_rename+0x1e/0x20</div>
<div>[10919.517333] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[10919.517339] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[11159.496095] INFO: task glusteriotwr9:15482
blocked for more than 120 seconds.</div>
<div>[11159.497546] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11159.498978] glusteriotwr9 D ffff971fa0fa1fa0
0 15482 1 0x00000080</div>
<div>[11159.498984] Call Trace:</div>
<div>[11159.498995] [<ffffffffb9911f00>] ?
bit_wait+0x50/0x50</div>
<div>[11159.498999] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11159.499003] [<ffffffffb99118e9>]
schedule_timeout+0x239/0x2c0</div>
<div>[11159.499056] [<ffffffffc05dd9b7>] ?
xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]</div>
<div>[11159.499082] [<ffffffffc05dd43e>] ?
xfs_iext_bno_to_irec+0x8e/0xd0 [xfs]</div>
<div>[11159.499090] [<ffffffffb92f7a12>] ?
ktime_get_ts64+0x52/0xf0</div>
<div>[11159.499093] [<ffffffffb9911f00>] ?
bit_wait+0x50/0x50</div>
<div>[11159.499097] [<ffffffffb991348d>]
io_schedule_timeout+0xad/0x130</div>
<div>[11159.499101] [<ffffffffb9913528>]
io_schedule+0x18/0x20</div>
<div>[11159.499104] [<ffffffffb9911f11>]
bit_wait_io+0x11/0x50</div>
<div>[11159.499107] [<ffffffffb9911ac1>]
__wait_on_bit_lock+0x61/0xc0</div>
<div>[11159.499113] [<ffffffffb9393634>]
__lock_page+0x74/0x90</div>
<div>[11159.499118] [<ffffffffb92bc210>] ?
wake_bit_function+0x40/0x40</div>
<div>[11159.499121] [<ffffffffb9394154>]
__find_lock_page+0x54/0x70</div>
<div>[11159.499125] [<ffffffffb9394e85>]
grab_cache_page_write_begin+<wbr>0x55/0xc0</div>
<div>[11159.499130] [<ffffffffb9484b76>]
iomap_write_begin+0x66/0x100</div>
<div>[11159.499135] [<ffffffffb9484edf>]
iomap_write_actor+0xcf/0x1d0</div>
<div>[11159.499140] [<ffffffffb9484e10>] ?
iomap_write_end+0x80/0x80</div>
<div>[11159.499144] [<ffffffffb94854e7>]
iomap_apply+0xb7/0x150</div>
<div>[11159.499149] [<ffffffffb9485621>]
iomap_file_buffered_write+<wbr>0xa1/0xe0</div>
<div>[11159.499153] [<ffffffffb9484e10>] ?
iomap_write_end+0x80/0x80</div>
<div>[11159.499182] [<ffffffffc05f025d>]
xfs_file_buffered_aio_write+<wbr>0x12d/0x2c0 [xfs]</div>
<div>[11159.499213] [<ffffffffc05f057d>]
xfs_file_aio_write+0x18d/0x1b0 [xfs]</div>
<div>[11159.499217] [<ffffffffb941a533>]
do_sync_write+0x93/0xe0</div>
<div>[11159.499222] [<ffffffffb941b010>]
vfs_write+0xc0/0x1f0</div>
<div>[11159.499225] [<ffffffffb941c002>]
SyS_pwrite64+0x92/0xc0</div>
<div>[11159.499230] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[11159.499234] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[11159.499238] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[11279.488720] INFO: task xfsaild/dm-10:1134
blocked for more than 120 seconds.</div>
<div>[11279.490197] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11279.491665] xfsaild/dm-10 D ffff9720a8660fd0
0 1134 2 0x00000000</div>
<div>[11279.491671] Call Trace:</div>
<div>[11279.491682] [<ffffffffb92a3a2e>] ?
try_to_del_timer_sync+0x5e/<wbr>0x90</div>
<div>[11279.491688] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11279.491744] [<ffffffffc060de36>]
_xfs_log_force+0x1c6/0x2c0 [xfs]</div>
<div>[11279.491750] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[11279.491783] [<ffffffffc0619fec>] ?
xfsaild+0x16c/0x6f0 [xfs]</div>
<div>[11279.491817] [<ffffffffc060df5c>]
xfs_log_force+0x2c/0x70 [xfs]</div>
<div>[11279.491849] [<ffffffffc0619e80>] ?
xfs_trans_ail_cursor_first+<wbr>0x90/0x90 [xfs]</div>
<div>[11279.491880] [<ffffffffc0619fec>]
xfsaild+0x16c/0x6f0 [xfs]</div>
<div>[11279.491913] [<ffffffffc0619e80>] ?
xfs_trans_ail_cursor_first+<wbr>0x90/0x90 [xfs]</div>
<div>[11279.491919] [<ffffffffb92bb161>]
kthread+0xd1/0xe0</div>
<div>[11279.491926] [<ffffffffb92bb090>] ?
insert_kthread_work+0x40/0x40</div>
<div>[11279.491932] [<ffffffffb9920677>]
ret_from_fork_nospec_begin+<wbr>0x21/0x21</div>
<div>[11279.491936] [<ffffffffb92bb090>] ?
insert_kthread_work+0x40/0x40</div>
<div>[11279.491976] INFO: task glusterclogfsyn:14934
blocked for more than 120 seconds.</div>
<div>[11279.493466] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11279.494952] glusterclogfsyn D ffff97209832af70
0 14934 1 0x00000080</div>
<div>[11279.494957] Call Trace:</div>
<div>[11279.494979] [<ffffffffc0309839>] ?
__split_and_process_bio+0x2e9/<wbr>0x520 [dm_mod]</div>
<div>[11279.494983] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11279.494987] [<ffffffffb99118e9>]
schedule_timeout+0x239/0x2c0</div>
<div>[11279.494997] [<ffffffffc0309d98>] ?
dm_make_request+0x128/0x1a0 [dm_mod]</div>
<div>[11279.495001] [<ffffffffb991348d>]
io_schedule_timeout+0xad/0x130</div>
<div>[11279.495005] [<ffffffffb99145ad>]
wait_for_completion_io+0xfd/<wbr>0x140</div>
<div>[11279.495010] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[11279.495016] [<ffffffffb951e574>]
blkdev_issue_flush+0xb4/0x110</div>
<div>[11279.495049] [<ffffffffc06064b9>]
xfs_blkdev_issue_flush+0x19/<wbr>0x20 [xfs]</div>
<div>[11279.495079] [<ffffffffc05eec40>]
xfs_file_fsync+0x1b0/0x1e0 [xfs]</div>
<div>[11279.495086] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[11279.495090] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[11279.495094] [<ffffffffb944f3d0>]
SyS_fsync+0x10/0x20</div>
<div>[11279.495098] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[11279.495102] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[11279.495105] INFO: task glusterposixfsy:14941
blocked for more than 120 seconds.</div>
<div>[11279.496606] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11279.498114] glusterposixfsy D ffff972495f84f10
0 14941 1 0x00000080</div>
<div>[11279.498118] Call Trace:</div>
<div>[11279.498134] [<ffffffffc0309839>] ?
__split_and_process_bio+0x2e9/<wbr>0x520 [dm_mod]</div>
<div>[11279.498138] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11279.498142] [<ffffffffb99118e9>]
schedule_timeout+0x239/0x2c0</div>
<div>[11279.498152] [<ffffffffc0309d98>] ?
dm_make_request+0x128/0x1a0 [dm_mod]</div>
<div>[11279.498156] [<ffffffffb991348d>]
io_schedule_timeout+0xad/0x130</div>
<div>[11279.498160] [<ffffffffb99145ad>]
wait_for_completion_io+0xfd/<wbr>0x140</div>
<div>[11279.498165] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[11279.498169] [<ffffffffb951e574>]
blkdev_issue_flush+0xb4/0x110</div>
<div>[11279.498202] [<ffffffffc06064b9>]
xfs_blkdev_issue_flush+0x19/<wbr>0x20 [xfs]</div>
<div>[11279.498231] [<ffffffffc05eec40>]
xfs_file_fsync+0x1b0/0x1e0 [xfs]</div>
<div>[11279.498238] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[11279.498242] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[11279.498246] [<ffffffffb944f3f3>]
SyS_fdatasync+0x13/0x20</div>
<div>[11279.498250] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[11279.498254] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[11279.498257] INFO: task glusteriotwr1:14950
blocked for more than 120 seconds.</div>
<div>[11279.499789] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11279.501343] glusteriotwr1 D ffff97208b6daf70
0 14950 1 0x00000080</div>
<div>[11279.501348] Call Trace:</div>
<div>[11279.501353] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11279.501390] [<ffffffffc060e388>]
_xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
<div>[11279.501396] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[11279.501428] [<ffffffffc05eeb97>]
xfs_file_fsync+0x107/0x1e0 [xfs]</div>
<div>[11279.501432] [<ffffffffb944ef3f>]
generic_write_sync+0x4f/0x70</div>
<div>[11279.501461] [<ffffffffc05f0545>]
xfs_file_aio_write+0x155/0x1b0 [xfs]</div>
<div>[11279.501466] [<ffffffffb941a533>]
do_sync_write+0x93/0xe0</div>
<div>[11279.501471] [<ffffffffb941b010>]
vfs_write+0xc0/0x1f0</div>
<div>[11279.501475] [<ffffffffb941c002>]
SyS_pwrite64+0x92/0xc0</div>
<div>[11279.501479] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[11279.501483] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[11279.501489] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[11279.501493] INFO: task glusteriotwr4:14953
blocked for more than 120 seconds.</div>
<div>[11279.503047] "echo 0 >
/proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
this message.</div>
<div>[11279.504630] glusteriotwr4 D ffff972499f2bf40
0 14953 1 0x00000080</div>
<div>[11279.504635] Call Trace:</div>
<div>[11279.504640] [<ffffffffb9913f79>]
schedule+0x29/0x70</div>
<div>[11279.504676] [<ffffffffc060e388>]
_xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
<div>[11279.504681] [<ffffffffb92cf1b0>] ?
wake_up_state+0x20/0x20</div>
<div>[11279.504710] [<ffffffffc05eeb97>]
xfs_file_fsync+0x107/0x1e0 [xfs]</div>
<div>[11279.504714] [<ffffffffb944f0e7>]
do_fsync+0x67/0xb0</div>
<div>[11279.504718] [<ffffffffb992076f>] ?
system_call_after_swapgs+0xbc/<wbr>0x160</div>
<div>[11279.504722] [<ffffffffb944f3d0>]
SyS_fsync+0x10/0x20</div>
<div>[11279.504725] [<ffffffffb992082f>]
system_call_fastpath+0x1c/0x21</div>
<div>[11279.504730] [<ffffffffb992077b>] ?
system_call_after_swapgs+0xc8/<wbr>0x160</div>
<div>[12127.466494] perf: interrupt took too long (8263
> 8150), lowering kernel.perf_event_max_sample_<wbr>rate
to 24000</div>
</div>
<div><br>
</div>
<div>--------------------</div>
<div>I think this is the cause of the massive ovirt
performance issues irrespective of gluster volume. At
the time this happened, I was also ssh'ed into the host
and was running some rpm query commands. I had just run
rpm -qa | grep glusterfs (to verify which version was
actually installed), and that command took almost 2
minutes to return! Normally it takes less than 2
seconds. That is all pure local SSD IO, too....</div>
<div><br>
</div>
<div>I'm no expert, but it's my understanding that any time
software causes these kinds of issues, it's a serious
bug in the software, even if it's mishandled
exceptions. Is this correct?</div>
<span class="HOEnZb"><font color="#888888">
<div><br>
</div>
<div>--Jim</div>
</font></span></div>
<div class="HOEnZb">
<div class="h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 29, 2018 at 3:01
PM, Jim Kusznir <span dir="ltr"><<a
href="mailto:jim@palousetech.com"
target="_blank" moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">I think this is the profile
information for one of the volumes that lives on
the SSDs and is fully operational with no
down/problem disks:
<div><br>
</div>
<div>
<div>[root@ovirt2 yum.repos.d]# gluster volume
profile data info</div>
<div>Brick: ovirt2.nwfiber.com:/gluster/br<wbr>ick2/data</div>
<div>------------------------------<wbr>----------------</div>
<div>Cumulative Stats:</div>
<div> Block Size: 256b+
512b+ 1024b+ </div>
<div> No. of Reads: 983
2696 1059 </div>
<div>No. of Writes: 0
1113 302 </div>
<div> </div>
<div> Block Size: 2048b+
4096b+ 8192b+ </div>
<div> No. of Reads: 852
88608 53526 </div>
<div>No. of Writes: 522
812340 76257 </div>
<div> </div>
<div> Block Size: 16384b+
32768b+ 65536b+ </div>
<div> No. of Reads: 54351
241901 15024 </div>
<div>No. of Writes: 21636
8656 8976 </div>
<div> </div>
<div> Block Size: 131072b+ </div>
<div> No. of Reads: 524156 </div>
<div>No. of Writes: 296071 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 4189 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 1257 RELEASEDIR</div>
<div> 0.00 46.19 us 12.00 us
187.00 us 69 FLUSH</div>
<div> 0.00 147.00 us 78.00 us
367.00 us 86 REMOVEXATTR</div>
<div> 0.00 223.46 us 24.00 us
1166.00 us 149 READDIR</div>
<div> 0.00 565.34 us 76.00 us
3639.00 us 88 FTRUNCATE</div>
<div> 0.00 263.28 us 20.00 us
28385.00 us 228 LK</div>
<div> 0.00 98.84 us 2.00 us
880.00 us 1198 OPENDIR</div>
<div> 0.00 91.59 us 26.00 us
10371.00 us 3853 STATFS</div>
<div> 0.00 494.14 us 17.00 us
193439.00 us 1171 GETXATTR</div>
<div> 0.00 299.42 us 35.00 us
9799.00 us 2044 READDIRP</div>
<div> 0.00 1965.31 us 110.00 us
382258.00 us 321 XATTROP</div>
<div> 0.01 113.40 us 24.00 us
61061.00 us 8134 STAT</div>
<div> 0.01 755.38 us 57.00 us
607603.00 us 3196 DISCARD</div>
<div> 0.05 2690.09 us 58.00 us
2704761.00 us 3206 OPEN</div>
<div> 0.10 119978.25 us 97.00 us
9406684.00 us 154 SETATTR</div>
<div> 0.18 101.73 us 28.00 us
700477.00 us 313379 FSTAT</div>
<div> 0.23 1059.84 us 25.00 us
2716124.00 us 38255 LOOKUP</div>
<div> 0.47 1024.11 us 54.00 us
6197164.00 us 81455 FXATTROP</div>
<div> 1.72 2984.00 us 15.00 us
37098954.00 us 103020 FINODELK</div>
<div> 5.92 44315.32 us 51.00 us
24731536.00 us 23957 FSYNC</div>
<div> 13.27 2399.78 us 25.00 us
22089540.00 us 991005 READ</div>
<div> 37.00 5980.43 us 52.00 us
22099889.00 us 1108976 WRITE</div>
<div> 41.04 5452.75 us 13.00 us
22102452.00 us 1349053 INODELK</div>
<div> </div>
<div> Duration: 10026 seconds</div>
<div> Data Read: 80046027759 bytes</div>
<div>Data Written: 44496632320 bytes</div>
<div> </div>
<div>Interval 1 Stats:</div>
<div> Block Size: 256b+
512b+ 1024b+ </div>
<div> No. of Reads: 983
2696 1059 </div>
<div>No. of Writes: 0
838 185 </div>
<div> </div>
<div> Block Size: 2048b+
4096b+ 8192b+ </div>
<div> No. of Reads: 852
85856 51575 </div>
<div>No. of Writes: 382
705802 57812 </div>
<div> </div>
<div> Block Size: 16384b+
32768b+ 65536b+ </div>
<div> No. of Reads: 52673
232093 14984 </div>
<div>No. of Writes: 13499
4908 4242 </div>
<div> </div>
<div> Block Size: 131072b+ </div>
<div> No. of Reads: 460040 </div>
<div>No. of Writes: 6411 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 2093 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 1093 RELEASEDIR</div>
<div> 0.00 53.38 us 26.00 us
111.00 us 16 FLUSH</div>
<div> 0.00 145.14 us 78.00 us
367.00 us 71 REMOVEXATTR</div>
<div> 0.00 190.96 us 114.00 us
298.00 us 71 SETATTR</div>
<div> 0.00 213.38 us 24.00 us
1145.00 us 90 READDIR</div>
<div> 0.00 263.28 us 20.00 us
28385.00 us 228 LK</div>
<div> 0.00 101.76 us 2.00 us
880.00 us 1093 OPENDIR</div>
<div> 0.01 93.60 us 27.00 us
10371.00 us 3090 STATFS</div>
<div> 0.02 537.47 us 17.00 us
193439.00 us 1038 GETXATTR</div>
<div> 0.03 297.44 us 35.00 us
9799.00 us 1990 READDIRP</div>
<div> 0.03 2357.28 us 110.00 us
382258.00 us 253 XATTROP</div>
<div> 0.04 385.93 us 58.00 us
47593.00 us 2091 OPEN</div>
<div> 0.04 114.86 us 24.00 us
61061.00 us 7715 STAT</div>
<div> 0.06 444.59 us 57.00 us
333240.00 us 3053 DISCARD</div>
<div> 0.42 316.24 us 25.00 us
290728.00 us 29823 LOOKUP</div>
<div> 0.73 257.92 us 54.00 us
344812.00 us 63296 FXATTROP</div>
<div> 1.37 98.30 us 28.00 us
67621.00 us 313172 FSTAT</div>
<div> 1.58 2124.69 us 51.00 us
849200.00 us 16717 FSYNC</div>
<div> 5.73 162.46 us 52.00 us
748492.00 us 794079 WRITE</div>
<div> 7.19 2065.17 us 16.00 us
37098954.00 us 78381 FINODELK</div>
<div> 36.44 886.32 us 25.00 us
2216436.00 us 925421 READ</div>
<div> 46.30 1178.04 us 13.00 us
1700704.00 us 884635 INODELK</div>
<div> </div>
<div> Duration: 7485 seconds</div>
<div> Data Read: 71250527215 bytes</div>
<div>Data Written: 5119903744 bytes</div>
<div> </div>
<div>Brick: ovirt3.nwfiber.com:/gluster/br<wbr>ick2/data</div>
<div>------------------------------<wbr>----------------</div>
<div>Cumulative Stats:</div>
<div> Block Size: 1b+ </div>
<div> No. of Reads: 0 </div>
<div>No. of Writes: 3264419 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 90 FORGET</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 9462 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 4254 RELEASEDIR</div>
<div> 0.00 50.52 us 13.00 us
190.00 us 71 FLUSH</div>
<div> 0.00 186.97 us 87.00 us
713.00 us 86 REMOVEXATTR</div>
<div> 0.00 79.32 us 33.00 us
189.00 us 228 LK</div>
<div> 0.00 220.98 us 129.00 us
513.00 us 86 SETATTR</div>
<div> 0.01 259.30 us 26.00 us
2632.00 us 137 READDIR</div>
<div> 0.02 322.76 us 145.00 us
2125.00 us 321 XATTROP</div>
<div> 0.03 109.55 us 2.00 us
1258.00 us 1193 OPENDIR</div>
<div> 0.05 70.21 us 21.00 us
431.00 us 3196 DISCARD</div>
<div> 0.05 169.26 us 21.00 us
2315.00 us 1545 GETXATTR</div>
<div> 0.12 176.85 us 63.00 us
2844.00 us 3206 OPEN</div>
<div> 0.61 303.49 us 90.00 us
3085.00 us 9633 FSTAT</div>
<div> 2.44 305.66 us 28.00 us
3716.00 us 38230 LOOKUP</div>
<div> 4.52 266.22 us 55.00 us
53424.00 us 81455 FXATTROP</div>
<div> 6.96 1397.99 us 51.00 us
64822.00 us 23889 FSYNC</div>
<div> 16.48 84.74 us 25.00 us
6917.00 us 932592 WRITE</div>
<div> 30.16 106.90 us 13.00 us
3920189.00 us 1353046 INODELK</div>
<div> 38.55 1794.52 us 14.00 us
16210553.00 us 103039 FINODELK</div>
<div> </div>
<div> Duration: 66562 seconds</div>
<div> Data Read: 0 bytes</div>
<div>Data Written: 3264419 bytes</div>
<div> </div>
<div>Interval 1 Stats:</div>
<div> Block Size: 1b+ </div>
<div> No. of Reads: 0 </div>
<div>No. of Writes: 794080 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 2093 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 1093 RELEASEDIR</div>
<div> 0.00 70.31 us 26.00 us
125.00 us 16 FLUSH</div>
<div> 0.00 193.10 us 103.00 us
713.00 us 71 REMOVEXATTR</div>
<div> 0.01 227.32 us 133.00 us
513.00 us 71 SETATTR</div>
<div> 0.01 79.32 us 33.00 us
189.00 us 228 LK</div>
<div> 0.01 259.83 us 35.00 us
1138.00 us 89 READDIR</div>
<div> 0.03 318.26 us 145.00 us
2047.00 us 253 XATTROP</div>
<div> 0.04 112.67 us 3.00 us
1258.00 us 1093 OPENDIR</div>
<div> 0.06 167.98 us 23.00 us
1951.00 us 1014 GETXATTR</div>
<div> 0.08 70.97 us 22.00 us
431.00 us 3053 DISCARD</div>
<div> 0.13 183.78 us 66.00 us
2844.00 us 2091 OPEN</div>
<div> 1.01 303.82 us 90.00 us
3085.00 us 9610 FSTAT</div>
<div> 3.27 316.59 us 30.00 us
3716.00 us 29820 LOOKUP</div>
<div> 5.83 265.79 us 59.00 us
53424.00 us 63296 FXATTROP</div>
<div> 7.95 1373.89 us 51.00 us
64822.00 us 16717 FSYNC</div>
<div> 23.17 851.99 us 14.00 us
16210553.00 us 78555 FINODELK</div>
<div> 24.04 87.44 us 27.00 us
6917.00 us 794081 WRITE</div>
<div> 34.36 111.91 us 14.00 us
984871.00 us 886790 INODELK</div>
<div> </div>
<div> Duration: 7485 seconds</div>
<div> Data Read: 0 bytes</div>
<div>Data Written: 794080 bytes</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>-----------------------</div>
<div>Here is the data from the volume that is
backed by the SSHDs and has one failed disk:</div>
<div>
<div>[root@ovirt2 yum.repos.d]# gluster volume
profile data-hdd info</div>
<div>Brick: 172.172.1.12:/gluster/brick3/d<wbr>ata-hdd</div>
<div>------------------------------<wbr>--------------</div>
<div>Cumulative Stats:</div>
<div> Block Size: 256b+
512b+ 1024b+ </div>
<div> No. of Reads: 1702
86 16 </div>
<div>No. of Writes: 0
767 71 </div>
<div> </div>
<div> Block Size: 2048b+
4096b+ 8192b+ </div>
<div> No. of Reads: 19
51841 2049 </div>
<div>No. of Writes: 76
60668 35727 </div>
<div> </div>
<div> Block Size: 16384b+
32768b+ 65536b+ </div>
<div> No. of Reads: 1744
639 1088 </div>
<div>No. of Writes: 8524
2410 1285 </div>
<div> </div>
<div> Block Size: 131072b+ </div>
<div> No. of Reads: 771999 </div>
<div>No. of Writes: 29584 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 2902 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 1517 RELEASEDIR</div>
<div> 0.00 197.00 us 197.00 us
197.00 us 1 FTRUNCATE</div>
<div> 0.00 70.24 us 16.00 us
758.00 us 51 FLUSH</div>
<div> 0.00 143.93 us 82.00 us
305.00 us 57 REMOVEXATTR</div>
<div> 0.00 178.63 us 105.00 us
712.00 us 60 SETATTR</div>
<div> 0.00 67.30 us 19.00 us
572.00 us 555 LK</div>
<div> 0.00 322.80 us 23.00 us
4673.00 us 138 READDIR</div>
<div> 0.00 336.56 us 106.00 us
11994.00 us 237 XATTROP</div>
<div> 0.00 84.70 us 28.00 us
1071.00 us 3469 STATFS</div>
<div> 0.01 387.75 us 2.00 us
146017.00 us 1467 OPENDIR</div>
<div> 0.01 148.59 us 21.00 us
64374.00 us 4454 STAT</div>
<div> 0.02 783.02 us 16.00 us
93502.00 us 1902 GETXATTR</div>
<div> 0.03 1516.10 us 17.00 us
210690.00 us 1364 ENTRYLK</div>
<div> 0.03 2555.47 us 300.00 us
674454.00 us 1064 READDIRP</div>
<div> 0.07 85.74 us 19.00 us
68340.00 us 62849 FSTAT</div>
<div> 0.07 1978.12 us 59.00 us
202596.00 us 2729 OPEN</div>
<div> 0.22 708.57 us 15.00 us
394799.00 us 25447 LOOKUP</div>
<div> 5.94 2331.74 us 15.00 us
1099530.00 us 207534 FINODELK</div>
<div> 7.31 8311.75 us 58.00 us
1800216.00 us 71668 FXATTROP</div>
<div> 12.49 7735.19 us 51.00 us
3595513.00 us 131642 WRITE</div>
<div> 17.70 957.08 us 16.00 us
13700466.00 us 1508160 INODELK</div>
<div> 24.55 2546.43 us 26.00 us
5077347.00 us 786060 READ</div>
<div> 31.56 49699.15 us 47.00 us
3746331.00 us 51777 FSYNC</div>
<div> </div>
<div> Duration: 10101 seconds</div>
<div> Data Read: 101562897361 bytes</div>
<div>Data Written: 4834450432 bytes</div>
<div> </div>
<div>Interval 0 Stats:</div>
<div> Block Size: 256b+
512b+ 1024b+ </div>
<div> No. of Reads: 1702
86 16 </div>
<div>No. of Writes: 0
767 71 </div>
<div> </div>
<div> Block Size: 2048b+
4096b+ 8192b+ </div>
<div> No. of Reads: 19
51841 2049 </div>
<div>No. of Writes: 76
60668 35727 </div>
<div> </div>
<div> Block Size: 16384b+
32768b+ 65536b+ </div>
<div> No. of Reads: 1744
639 1088 </div>
<div>No. of Writes: 8524
2410 1285 </div>
<div> </div>
<div> Block Size: 131072b+ </div>
<div> No. of Reads: 771999 </div>
<div>No. of Writes: 29584 </div>
<div> %-latency Avg-latency Min-Latency
Max-Latency No. of calls Fop</div>
<div> --------- ----------- -----------
----------- ------------ ----</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 2902 RELEASE</div>
<div> 0.00 0.00 us 0.00 us
0.00 us 1517 RELEASEDIR</div>
<div> 0.00 197.00 us 197.00 us
197.00 us 1 FTRUNCATE</div>
<div> 0.00 70.24 us 16.00 us
758.00 us 51 FLUSH</div>
<div> 0.00 143.93 us 82.00 us
305.00 us 57 REMOVEXATTR</div>
<div> 0.00 178.63 us 105.00 us
712.00 us 60 SETATTR</div>
<div> 0.00 67.30 us 19.00 us
572.00 us 555 LK</div>
<div> 0.00 322.80 us 23.00 us
4673.00 us 138 READDIR</div>
<div> 0.00 336.56 us 106.00 us
11994.00 us 237 XATTROP</div>
<div> 0.00 84.70 us 28.00 us
1071.00 us 3469 STATFS</div>
<div> 0.01 387.75 us 2.00 us
146017.00 us 1467 OPENDIR</div>
<div> 0.01 148.59 us 21.00 us
64374.00 us 4454 STAT</div>
<div> 0.02 783.02 us 16.00 us
93502.00 us 1902 GETXATTR</div>
<div> 0.03 1516.10 us 17.00 us
210690.00 us 1364 ENTRYLK</div>
<div> 0.03 2555.47 us 300.00 us
674454.00 us 1064 READDIRP</div>
<div> 0.07 85.73 us 19.00 us
68340.00 us 62849 FSTAT</div>
<div> 0.07 1978.12 us 59.00 us
202596.00 us 2729 OPEN</div>
<div> 0.22 708.57 us 15.00 us
394799.00 us 25447 LOOKUP</div>
<div> 5.94 2334.57 us 15.00 us
1099530.00 us 207534 FINODELK</div>
<div> 7.31 8311.49 us 58.00 us
1800216.00 us 71668 FXATTROP</div>
<div> 12.49 7735.32 us 51.00 us
3595513.00 us 131642 WRITE</div>
<div> 17.71 957.08 us 16.00 us
13700466.00 us 1508160 INODELK</div>
<div> 24.56 2546.42 us 26.00 us
5077347.00 us 786060 READ</div>
<div> 31.54 49651.63 us 47.00 us
3746331.00 us 51777 FSYNC</div>
<div> </div>
<div> Duration: 10101 seconds</div>
<div> Data Read: 101562897361 bytes</div>
<div>Data Written: 4834450432 bytes</div>
</div>
<div><br>
</div>
</div>
<div class="m_-5992202424066002276HOEnZb">
<div class="m_-5992202424066002276h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May 29,
2018 at 2:55 PM, Jim Kusznir <span
dir="ltr"><<a
href="mailto:jim@palousetech.com"
target="_blank" moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div dir="ltr">Thank you for your
response.
<div><br>
</div>
<div>I have 4 gluster volumes. 3 are
replica 2 + arbiter; the replica
bricks are on ovirt1 and ovirt2, and
the arbiter brick is on ovirt3. The
4th volume is replica 3, with a brick
on all three ovirt machines.</div>
<div><br>
</div>
<div>The first 3 volumes are on an SSD
disk; the 4th is on a Seagate SSHD
(same in all three machines). On
ovirt3, the SSHD has reported hard
IO failures, and that brick is
offline. However, the other two
replicas are fully operational
(although they still show contents
in the heal info command that won't
go away, but that may be the case
until I replace the failed disk).</div>
<div><br>
</div>
<div>What is bothering me is that ALL
4 gluster volumes are showing
horrible performance issues. At
this point, as the bad disk has been
completely offlined, I would expect
gluster to perform at normal speed,
but that is definitely not the case.</div>
<div><br>
</div>
<div>I've also noticed that the
performance hits seem to come in
waves: things seem to work
acceptably (but slowly) for a while,
then suddenly it's as if all disk IO
on all volumes (including
non-gluster local OS disk volumes
for the hosts) pauses for about 30
seconds, then IO resumes again.
During those times, I start getting
"VM not responding" and "host not
responding" notices, as well as the
applications having major issues.</div>
<div><br>
</div>
<div>I've shut down most of my VMs and
am down to just my essential core
VMs (shed about 75% of my VMs).
I am still experiencing the same
issues.</div>
<div><br>
</div>
<div>Am I correct in believing that
once the failed disk was brought
offline, performance should
return to normal?</div>
</div>
<div
class="m_-5992202424066002276m_1037085839393797930HOEnZb">
<div
class="m_-5992202424066002276m_1037085839393797930h5">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue,
May 29, 2018 at 1:27 PM, Alex K
<span dir="ltr"><<a
href="mailto:rightkicktech@gmail.com"
target="_blank"
moz-do-not-send="true">rightkicktech@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div dir="auto">I would check
disks status and
accessibility of mount
points where your gluster
volumes reside.</div>
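<div dir="auto">For example (device names here are just
placeholders): smartctl -H /dev/sdX for each disk
backing a brick, and df -h /gluster/brick2
/gluster/brick3 to confirm the brick mount
points are still mounted.</div>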
<br>
<div class="gmail_quote">
<div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844h5">
<div dir="ltr">On Tue,
May 29, 2018, 22:28
Jim Kusznir <<a
href="mailto:jim@palousetech.com"
target="_blank"
moz-do-not-send="true">jim@palousetech.com</a>>
wrote:<br>
</div>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844h5">
<div dir="ltr">On one
ovirt server, I'm
now seeing these
messages:
<div>
<div>[56474.239725]
blk_update_request: 63 callbacks suppressed</div>
<div>[56474.239732]
blk_update_request: I/O error, dev dm-2, sector 0</div>
<div>[56474.240602]
blk_update_request: I/O error, dev dm-2, sector 3905945472</div>
<div>[56474.241346]
blk_update_request: I/O error, dev dm-2, sector 3905945584</div>
<div>[56474.242236]
blk_update_request: I/O error, dev dm-2, sector 2048</div>
<div>[56474.243072]
blk_update_request: I/O error, dev dm-2, sector 3905943424</div>
<div>[56474.243997]
blk_update_request: I/O error, dev dm-2, sector 3905943536</div>
<div>[56474.247347]
blk_update_request: I/O error, dev dm-2, sector 0</div>
<div>[56474.248315]
blk_update_request: I/O error, dev dm-2, sector 3905945472</div>
<div>[56474.249231]
blk_update_request: I/O error, dev dm-2, sector 3905945584</div>
<div>[56474.250221]
blk_update_request: I/O error, dev dm-2, sector 2048</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Tue, May 29, 2018
at 11:59 AM, Jim
Kusznir <span
dir="ltr"><<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div dir="ltr">I
see in
messages on
ovirt3 (my 3rd
machine, the
one upgraded
to 4.2):
<div><br>
</div>
<div>
<div>May 29
11:54:41
ovirt3
ovs-vsctl:
ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
database
connection
failed (No
such file or
directory)</div>
<div>May 29
11:54:51
ovirt3
ovs-vsctl:
ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
database
connection
failed (No
such file or
directory)</div>
<div>May 29
11:55:01
ovirt3
ovs-vsctl:
ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
database
connection
failed (No
such file or
directory)</div>
</div>
<div>(appears
a lot).</div>
<div><br>
</div>
<div>I also found, in the ssh session
on that host, some sysv warnings
about the backing disk for one of
the gluster volumes (straight
replica 3). The glusterfs process
for that disk on that machine went
offline. It's my understanding that
it should continue to work with the
other two machines while I attempt
to replace that disk, right?
Attempted writes (touching an empty
file) can take 15 seconds; repeating
them later is much faster.</div>
<div><br>
</div>
<div>Gluster generates a bunch of
different log files; I don't know
which ones you want, or from which
machine(s).</div>
<div><br>
</div>
<div>How do I
do "volume
profiling"?</div>
<div><br>
</div>
<div>Thanks!</div>
</div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928HOEnZb">
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928h5">
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Tue, May 29,
2018 at 11:53
AM, Sahina
Bose <span
dir="ltr"><<a
href="mailto:sabose@redhat.com" rel="noreferrer" target="_blank"
moz-do-not-send="true">sabose@redhat.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div>Do you
see errors
reported in
the mount logs
for the
volume? If so,
could you
attach the
logs?<br>
</div>
Any issues
with your
underlying
disks. Can you
also attach
output of
volume
profiling?<br>
</div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702HOEnZb">
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702h5">
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Wed, May 30,
2018 at 12:13
AM, Jim
Kusznir <span
dir="ltr"><<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Ok,
things have
gotten MUCH
worse this
morning. I'm
getting random
errors from
VMs, right
now, about a
third of my
VMs have been
paused due to
storage
issues, and
most of the
remaining VMs
are not
performing
well.
<div><br>
</div>
<div>At this
point, I am in
full EMERGENCY
mode, as my
production
services are
now impacted,
and I'm
getting calls
coming in with
problems...</div>
<div><br>
</div>
<div>I'd greatly appreciate help...
VMs are running VERY slowly (when
they run at all), and they are
steadily getting worse. I don't
know why. I was seeing CPU peaks
(to 100%) on several VMs, in
perfect sync, for a few minutes at
a time (while the VM became
unresponsive, and any Linux VMs I
was logged into were giving me the
CPU-stuck messages from my original
post). Is all this storage
related?</div>
<div><br>
</div>
<div>I also
have two
different
gluster
volumes for VM
storage, and
only one had
the issues,
but now VMs in
both are being
affected at
the same time
and same way.</div>
<span
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339HOEnZb"><font
color="#888888">
<div><br>
</div>
<div>--Jim</div>
</font></span></div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339HOEnZb">
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339h5">
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Mon, May 28,
2018 at 10:50
PM, Sahina
Bose <span
dir="ltr"><<a
href="mailto:sabose@redhat.com" rel="noreferrer" target="_blank"
moz-do-not-send="true">sabose@redhat.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">[Adding
gluster-users
to look at the
heal issue]<br>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">
<div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125h5">On
Tue, May 29,
2018 at 9:17
AM, Jim
Kusznir <span
dir="ltr"><<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
moz-do-not-send="true">jim@palousetech.com</a>></span>
wrote:<br>
</div>
</div>
<blockquote
class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125h5">
<div dir="ltr">Hello:
<div><br>
</div>
<div>I've been
having some
cluster and
gluster
performance
issues lately.
I also found
that my
cluster was
out of date
and was trying
to apply
updates
(hoping to fix
some of
these), and
discovered the
ovirt 4.1
repos were
taken
completely
offline. So
I was forced
to begin an
upgrade to
4.2.
According to
the docs I
found/read, I
needed only to
add the new
repo, do a yum
update, and
reboot to be
good on my
hosts (I did
the yum
update, and
ran
engine-setup
on my hosted
engine).
Things seemed
to work
relatively
well, except
for a gluster
sync issue
that showed
up.</div>
<div><br>
</div>
<div>My
cluster is a 3
node
hyperconverged
cluster. I
upgraded the
hosted engine
first, then
engine 3.
When engine 3
came back up,
for some
reason one of
my gluster
volumes would
not sync.
Here's sample
output:</div>
<div><br>
</div>
<div>
<div>[root@ovirt3
~]# gluster
volume heal
data-hdd info</div>
<div>Brick
172.172.1.11:/gluster/brick3/d<wbr>ata-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/48d7ecb8-7ac5-4<wbr>725-bca5-b3519681cf2f/0d6080b0<wbr>-7018-4fa3-bb82-1dd9ef07d9b9 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/647be733-f153-4<wbr>cdc-85bd-ba72544c2631/b453a300<wbr>-0602-4be1-8310-8bd5abe00971 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/6da854d1-b6be-4<wbr>46b-9bf0-90a0dbbea830/3c93bd1f<wbr>-b7fa-4aa2-b445-6904e31839ba </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/7f647567-d18c-4<wbr>4f1-a58e-9b8865833acb/f9364470<wbr>-9770-4bb1-a6b9-a54861849625 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/f3c8e7aa-6ef2-4<wbr>2a7-93d4-e0a4df6dd2fa/2eb0b1ad<wbr>-2606-44ef-9cd3-ae59610a504b </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/b1ea3f62-0f05-4<wbr>ded-8c82-9c91c90e0b61/d5d6bf5a<wbr>-499f-431d-9013-5453db93ed32 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/8c8b5147-e9d6-4<wbr>810-b45b-185e3ed65727/16f08231<wbr>-93b0-489d-a2fd-687b6bf88eaa </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/12924435-b9c2-4<wbr>aab-ba19-1c1bc31310ef/07b3db69<wbr>-440e-491e-854c-bbfa18a7cff2 </div>
<div>Status:
Connected</div>
<div>Number of
entries: 8</div>
<div><br>
</div>
<div>Brick
172.172.1.12:/gluster/brick3/d<wbr>ata-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/48d7ecb8-7ac5-4<wbr>725-bca5-b3519681cf2f/0d6080b0<wbr>-7018-4fa3-bb82-1dd9ef07d9b9 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/647be733-f153-4<wbr>cdc-85bd-ba72544c2631/b453a300<wbr>-0602-4be1-8310-8bd5abe00971 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/b1ea3f62-0f05-4<wbr>ded-8c82-9c91c90e0b61/d5d6bf5a<wbr>-499f-431d-9013-5453db93ed32 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/6da854d1-b6be-4<wbr>46b-9bf0-90a0dbbea830/3c93bd1f<wbr>-b7fa-4aa2-b445-6904e31839ba </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/7f647567-d18c-4<wbr>4f1-a58e-9b8865833acb/f9364470<wbr>-9770-4bb1-a6b9-a54861849625 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/8c8b5147-e9d6-4<wbr>810-b45b-185e3ed65727/16f08231<wbr>-93b0-489d-a2fd-687b6bf88eaa </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/12924435-b9c2-4<wbr>aab-ba19-1c1bc31310ef/07b3db69<wbr>-440e-491e-854c-bbfa18a7cff2 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/f3c8e7aa-6ef2-4<wbr>2a7-93d4-e0a4df6dd2fa/2eb0b1ad<wbr>-2606-44ef-9cd3-ae59610a504b </div>
<div>Status:
Connected</div>
<div>Number of
entries: 8</div>
<div><br>
</div>
<div>Brick
172.172.1.13:/gluster/brick3/d<wbr>ata-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/b1ea3f62-0f05-4<wbr>ded-8c82-9c91c90e0b61/d5d6bf5a<wbr>-499f-431d-9013-5453db93ed32 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/8c8b5147-e9d6-4<wbr>810-b45b-185e3ed65727/16f08231<wbr>-93b0-489d-a2fd-687b6bf88eaa </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/12924435-b9c2-4<wbr>aab-ba19-1c1bc31310ef/07b3db69<wbr>-440e-491e-854c-bbfa18a7cff2 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/f3c8e7aa-6ef2-4<wbr>2a7-93d4-e0a4df6dd2fa/2eb0b1ad<wbr>-2606-44ef-9cd3-ae59610a504b </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/647be733-f153-4<wbr>cdc-85bd-ba72544c2631/b453a300<wbr>-0602-4be1-8310-8bd5abe00971 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/48d7ecb8-7ac5-4<wbr>725-bca5-b3519681cf2f/0d6080b0<wbr>-7018-4fa3-bb82-1dd9ef07d9b9 </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/6da854d1-b6be-4<wbr>46b-9bf0-90a0dbbea830/3c93bd1f<wbr>-b7fa-4aa2-b445-6904e31839ba </div>
<div>/cc65f671-3377-494a-a7d4-1d9f7<wbr>c3ae46c/images/7f647567-d18c-4<wbr>4f1-a58e-9b8865833acb/f9364470<wbr>-9770-4bb1-a6b9-a54861849625 </div>
<div>Status:
Connected</div>
<div>Number of
entries: 8</div>
</div>
<div><br>
</div>
<div>---------</div>
<div>It's been
in this state
for a couple
of days now,
and bandwidth
monitoring
shows no
appreciable
data moving.
I've
repeatedly
tried
commanding a
full heal from
all three
nodes in the
cluster. It's
always the
same files
that need
healing.</div>
<div><br>
</div>
<div>When
running
gluster volume
heal data-hdd
statistics, I
sometimes see
different
information,
but always
some number of
"heal failed"
entries. It
shows 0 for
split brain.</div>
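<div><br>
</div>
<div>(For
reference, the
standard
commands for
this are
gluster volume
heal data-hdd
full to kick
off a full
heal, and
gluster volume
heal data-hdd
info or
gluster volume
heal data-hdd
statistics
heal-count to
check what
still needs
healing.)</div>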
<div><br>
</div>
<div>I'm not
quite sure
what to do. I
suspect it may
be due to
nodes 1 and 2
still being on
the older
ovirt/gluster
release, but
I'm afraid to
upgrade and
reboot them
until I have a
good gluster
sync (don't
need to create
a split brain
issue). How
do I proceed
with this?</div>
<div><br>
</div>
<div>Second
issue: I've
been
experiencing
VERY POOR
performance on
most of my
VMs, to the
point that
logging into a
Windows 10 VM
via remote
desktop can
take 5
minutes, and
launching
QuickBooks
inside said VM
can easily
take 10
minutes. On
some Linux
VMs, I get
random
messages like
this:</div>
<div>
<div>Message
from
syslogd@unifi
at May 28
20:39:23 ...</div>
<div> kernel:[6171996.308904]
NMI watchdog:
BUG: soft
lockup - CPU#0
stuck for 22s!
[mongod:14766]</div>
</div>
<div><br>
</div>
<div>(the
process and
PID are often
different)</div>
<div><br>
</div>
<div>I'm not
quite sure
what to do
about this
either. My
initial
thought was to
upgrade
everything to
current and
see if it's
still there,
but I cannot
move forward
with that
until my
gluster is
healed...</div>
<div><br>
</div>
<div>Thanks!</div>
<span
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125m_-3484925472286407273HOEnZb"><font
color="#888888">
<div>--Jim</div>
</font></span></div>
<br>
</div>
</div>
______________________________<wbr>_________________<br>
Users mailing
list -- <a
href="mailto:users@ovirt.org"
rel="noreferrer" target="_blank" moz-do-not-send="true">users@ovirt.org</a><br>
To unsubscribe
send an email
to <a
href="mailto:users-leave@ovirt.org"
rel="noreferrer" target="_blank" moz-do-not-send="true">users-leave@ovirt.org</a><br>
Privacy
Statement: <a
href="https://www.ovirt.org/site/privacy-policy/" rel="noreferrer
noreferrer"
target="_blank"
moz-do-not-send="true">https://www.ovirt.org/site/pri<wbr>vacy-policy/</a><br>
oVirt Code of
Conduct: <a
href="https://www.ovirt.org/community/about/community-guidelines/"
rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://www.ovirt.org/communit<wbr>y/about/community-guidelines/</a><br>
List Archives:
<a
href="https://lists.ovirt.org/archives/list/users@ovirt.org/message/3LEV6ZQ3JV2XLAL7NYBTXOYMYUOTIRQF/"
rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://lists.ovirt.org/archiv<wbr>es/list/users@ovirt.org/messag<wbr>e/3LEV6ZQ3JV2XLAL7NYBTXOYMYUOT<wbr>IRQF/</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
______________________________<wbr>_________________<br>
Users mailing list --
<a
href="mailto:users@ovirt.org"
rel="noreferrer"
target="_blank"
moz-do-not-send="true">users@ovirt.org</a><br>
To unsubscribe send an
email to <a
href="mailto:users-leave@ovirt.org"
rel="noreferrer"
target="_blank"
moz-do-not-send="true">users-leave@ovirt.org</a><br>
Privacy Statement: <a
href="https://www.ovirt.org/site/privacy-policy/" rel="noreferrer
noreferrer"
target="_blank"
moz-do-not-send="true">https://www.ovirt.org/site/pri<wbr>vacy-policy/</a><br>
oVirt Code of Conduct:
<a
href="https://www.ovirt.org/community/about/community-guidelines/"
rel="noreferrer
noreferrer"
target="_blank"
moz-do-not-send="true">https://www.ovirt.org/communit<wbr>y/about/community-guidelines/</a><br>
</div>
</div>
List Archives: <a
href="https://lists.ovirt.org/archives/list/users@ovirt.org/message/ACO7RFSLBSRBAIONIC2HQ6Z24ZDES5MF/"
rel="noreferrer
noreferrer"
target="_blank"
moz-do-not-send="true">https://lists.ovirt.org/archiv<wbr>es/list/users@ovirt.org/messag<wbr>e/ACO7RFSLBSRBAIONIC2HQ6Z24ZDE<wbr>S5MF/</a><br>
</blockquote>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
______________________________<wbr>_________________<br>
Users mailing list -- <a href="mailto:users@ovirt.org"
moz-do-not-send="true">users@ovirt.org</a><br>
To unsubscribe send an email to <a
href="mailto:users-leave@ovirt.org" moz-do-not-send="true">users-leave@ovirt.org</a><br>
Privacy Statement: <a
href="https://www.ovirt.org/site/privacy-policy/"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.ovirt.org/site/<wbr>privacy-policy/</a><br>
oVirt Code of Conduct: <a
href="https://www.ovirt.org/community/about/community-guidelines/"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.ovirt.org/<wbr>community/about/community-<wbr>guidelines/</a><br>
List Archives: <a
href="https://lists.ovirt.org/archives/list/users@ovirt.org/message/3DEQQLJM3WHQNZJ7KEMRZVFZ52MTIL74/"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://lists.ovirt.org/<wbr>archives/list/users@ovirt.org/<wbr>message/<wbr>3DEQQLJM3WHQNZJ7KEMRZVFZ52MTIL<wbr>74/</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>