<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    @Jim Kusznir<br>
    <br>
    For the heal issue, can you provide the getfattr output of one of
    the 8 files in question from all 3 bricks?<br>
    Example: `getfattr -d -m . -e hex
/gluster/brick3/data-hdd/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9`<br>
    Also provide the stat output of the same file from all 3 bricks.<br>
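    For example, something like the following on each of the 3 brick hosts
    (the path is the sample path from above and will differ per brick; this
    is a sketch of the two commands requested, not exact output):<br>

```shell
# Run on each brick host; adjust the brick path to match that host's brick.
F=/gluster/brick3/data-hdd/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9

getfattr -d -m . -e hex "$F"   # dump all extended attributes in hex (afr changelog, gfid, ...)
stat "$F"                      # size, inode, timestamps for cross-brick comparison
```
    <br>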
    <br>
    Thanks,<br>
    Ravi<br>
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 05/30/2018 09:47 AM, Krutika
      Dhananjay wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAPhYV8M6AGAMZCmSPi9m2ktffha3UeP0dpnuN1mBUsUjzNNj-w@mail.gmail.com">
      <div dir="ltr">
        <div>Adding Ravi to look into the heal issue.</div>
        <div><br>
        </div>
        <div>As for the fsync hang and subsequent IO errors, it looks a
          lot like <a
            href="https://bugzilla.redhat.com/show_bug.cgi?id=1497156"
            moz-do-not-send="true">https://bugzilla.redhat.com/show_bug.cgi?id=1497156</a>,
          where Paolo Bonzini from the QEMU team pointed out that it
          would be fixed by the following commit:</div>
        <div><br>
        </div>
        <div>
          <pre class="gmail-bz_comment_text gmail-bz_wrap_comment_text" id="gmail-comment_text_20">  commit e72c9a2a67a6400c8ef3d01d4c461dbbbfa0e1f0
    Author: Paolo Bonzini &lt;<a href="mailto:pbonzini@redhat.com" moz-do-not-send="true">pbonzini@redhat.com</a>&gt;
    Date:   Wed Jun 21 16:35:46 2017 +0200

    scsi: virtio_scsi: let host do exception handling
    
    virtio_scsi tries to do exception handling after the default 30 seconds
    timeout expires.  However, it's better to let the host control the
    timeout, otherwise with a heavy I/O load it is likely that an abort will
    also timeout.  This leads to fatal errors like filesystems going
    offline.
    
    Disable the 'sd' timeout and allow the host to do exception handling,
    following the precedent of the storvsc driver.
    
    Hannes has a proposal to introduce timeouts in virtio, but this provides
    an immediate solution for stable kernels too.
    
    [mkp: fixed typo]
    
    Reported-by: Douglas Miller &lt;<a href="mailto:dougmill@linux.vnet.ibm.com" moz-do-not-send="true">dougmill@linux.vnet.ibm.com</a>&gt;
    Cc: "James E.J. Bottomley" &lt;<a href="mailto:jejb@linux.vnet.ibm.com" moz-do-not-send="true">jejb@linux.vnet.ibm.com</a>&gt;
    Cc: "Martin K. Petersen" &lt;<a href="mailto:martin.petersen@oracle.com" moz-do-not-send="true">martin.petersen@oracle.com</a>&gt;
    Cc: Hannes Reinecke &lt;<a href="mailto:hare@suse.de" moz-do-not-send="true">hare@suse.de</a>&gt;
    Cc: <a href="mailto:linux-scsi@vger.kernel.org" moz-do-not-send="true">linux-scsi@vger.kernel.org</a>
    Cc: <a href="mailto:stable@vger.kernel.org" moz-do-not-send="true">stable@vger.kernel.org</a>
    Signed-off-by: Paolo Bonzini &lt;<a href="mailto:pbonzini@redhat.com" moz-do-not-send="true">pbonzini@redhat.com</a>&gt;
    Signed-off-by: Martin K. Petersen &lt;<a href="mailto:martin.petersen@oracle.com" moz-do-not-send="true">martin.petersen@oracle.com</a>&gt;</pre>
        </div>
        <div><br>
        </div>
        <div>Adding Paolo/Kevin to comment.</div>
        <div><br>
        </div>
        <div>As for the poor gluster performance, could you disable
          cluster.eager-lock and see if that makes any difference:</div>
        <div><br>
        </div>
        <div># gluster volume set &lt;VOL&gt; cluster.eager-lock off</div>
        <div><br>
        </div>
        <div>Do also capture the volume profile again if you still see
          performance issues after disabling eager-lock.<br>
        </div>
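        <div>A possible way to capture a fresh profile interval while the
          workload is running (a sketch; the volume name below is a
          placeholder):</div>

```shell
# Hypothetical volume name -- substitute your own.
VOL=data

gluster volume profile "$VOL" start   # enable per-fop profiling (no-op if already on)
sleep 60                              # let the workload run for a measurement interval
gluster volume profile "$VOL" info    # dump cumulative and interval stats
gluster volume profile "$VOL" stop    # optionally disable profiling again
```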
        <div><br>
        </div>
        <div>-Krutika</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Wed, May 30, 2018 at 6:55 AM, Jim
          Kusznir <span dir="ltr">&lt;<a
              href="mailto:jim@palousetech.com" target="_blank"
              moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">I also finally found the following in my
              system log on one server:
              <div><br>
              </div>
              <div>
                <div>[10679.524491] INFO: task glusterclogro:14933
                  blocked for more than 120 seconds.</div>
                <div>[10679.525826] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[10679.527144] glusterclogro   D ffff97209832bf40 
                     0 14933      1 0x00000080</div>
                <div>[10679.527150] Call Trace:</div>
                <div>[10679.527161]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[10679.527218]  [&lt;ffffffffc060e388&gt;]
                  _xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
                <div>[10679.527225]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[10679.527254]  [&lt;ffffffffc05eeb97&gt;]
                  xfs_file_fsync+0x107/0x1e0 [xfs]</div>
                <div>[10679.527260]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[10679.527268]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[10679.527271]  [&lt;ffffffffb944f3d0&gt;]
                  SyS_fsync+0x10/0x20</div>
                <div>[10679.527275]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[10679.527279]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[10679.527283] INFO: task glusterposixfsy:14941
                  blocked for more than 120 seconds.</div>
                <div>[10679.528608] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[10679.529956] glusterposixfsy D ffff972495f84f10 
                     0 14941      1 0x00000080</div>
                <div>[10679.529961] Call Trace:</div>
                <div>[10679.529966]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[10679.530003]  [&lt;ffffffffc060e388&gt;]
                  _xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
                <div>[10679.530008]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[10679.530038]  [&lt;ffffffffc05eeb97&gt;]
                  xfs_file_fsync+0x107/0x1e0 [xfs]</div>
                <div>[10679.530042]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[10679.530046]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[10679.530050]  [&lt;ffffffffb944f3f3&gt;]
                  SyS_fdatasync+0x13/0x20</div>
                <div>[10679.530054]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[10679.530058]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[10679.530062] INFO: task glusteriotwr13:15486
                  blocked for more than 120 seconds.</div>
                <div>[10679.531805] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[10679.533732] glusteriotwr13  D ffff9720a83f0000 
                     0 15486      1 0x00000080</div>
                <div>[10679.533738] Call Trace:</div>
                <div>[10679.533747]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[10679.533799]  [&lt;ffffffffc060e388&gt;]
                  _xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
                <div>[10679.533806]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[10679.533846]  [&lt;ffffffffc05eeb97&gt;]
                  xfs_file_fsync+0x107/0x1e0 [xfs]</div>
                <div>[10679.533852]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[10679.533858]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[10679.533863]  [&lt;ffffffffb944f3f3&gt;]
                  SyS_fdatasync+0x13/0x20</div>
                <div>[10679.533868]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[10679.533873]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[10919.512757] INFO: task glusterclogro:14933
                  blocked for more than 120 seconds.</div>
                <div>[10919.514714] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[10919.516663] glusterclogro   D ffff97209832bf40 
                     0 14933      1 0x00000080</div>
                <div>[10919.516677] Call Trace:</div>
                <div>[10919.516690]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[10919.516696]  [&lt;ffffffffb99118e9&gt;]
                  schedule_timeout+0x239/0x2c0</div>
                <div>[10919.516703]  [&lt;ffffffffb951cc04&gt;] ?
                  blk_finish_plug+0x14/0x40</div>
                <div>[10919.516768]  [&lt;ffffffffc05e9224&gt;] ?
                  _xfs_buf_ioapply+0x334/0x460 [xfs]</div>
                <div>[10919.516774]  [&lt;ffffffffb991432d&gt;]
                  wait_for_completion+0xfd/0x140</div>
                <div>[10919.516782]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[10919.516821]  [&lt;ffffffffc05eb0a3&gt;] ?
                  _xfs_buf_read+0x23/0x40 [xfs]</div>
                <div>[10919.516859]  [&lt;ffffffffc05eafa9&gt;]
                  xfs_buf_submit_wait+0xf9/0x1d0 [xfs]</div>
                <div>[10919.516902]  [&lt;ffffffffc061b279&gt;] ?
                  xfs_trans_read_buf_map+0x199/<wbr>0x400 [xfs]</div>
                <div>[10919.516940]  [&lt;ffffffffc05eb0a3&gt;]
                  _xfs_buf_read+0x23/0x40 [xfs]</div>
                <div>[10919.516977]  [&lt;ffffffffc05eb1b9&gt;]
                  xfs_buf_read_map+0xf9/0x160 [xfs]</div>
                <div>[10919.517022]  [&lt;ffffffffc061b279&gt;]
                  xfs_trans_read_buf_map+0x199/<wbr>0x400 [xfs]</div>
                <div>[10919.517057]  [&lt;ffffffffc05c8d04&gt;]
                  xfs_da_read_buf+0xd4/0x100 [xfs]</div>
                <div>[10919.517091]  [&lt;ffffffffc05c8d53&gt;]
                  xfs_da3_node_read+0x23/0xd0 [xfs]</div>
                <div>[10919.517126]  [&lt;ffffffffc05c9fee&gt;]
                  xfs_da3_node_lookup_int+0x6e/<wbr>0x2f0 [xfs]</div>
                <div>[10919.517160]  [&lt;ffffffffc05d5a1d&gt;]
                  xfs_dir2_node_lookup+0x4d/<wbr>0x170 [xfs]</div>
                <div>[10919.517194]  [&lt;ffffffffc05ccf5d&gt;]
                  xfs_dir_lookup+0x1bd/0x1e0 [xfs]</div>
                <div>[10919.517233]  [&lt;ffffffffc05fd8d9&gt;]
                  xfs_lookup+0x69/0x140 [xfs]</div>
                <div>[10919.517271]  [&lt;ffffffffc05fa018&gt;]
                  xfs_vn_lookup+0x78/0xc0 [xfs]</div>
                <div>[10919.517278]  [&lt;ffffffffb9425cf3&gt;]
                  lookup_real+0x23/0x60</div>
                <div>[10919.517283]  [&lt;ffffffffb9426702&gt;]
                  __lookup_hash+0x42/0x60</div>
                <div>[10919.517288]  [&lt;ffffffffb942d519&gt;]
                  SYSC_renameat2+0x3a9/0x5a0</div>
                <div>[10919.517296]  [&lt;ffffffffb94d3753&gt;] ?
                  selinux_file_free_security+<wbr>0x23/0x30</div>
                <div>[10919.517304]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[10919.517309]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[10919.517313]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[10919.517318]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[10919.517323]  [&lt;ffffffffb942e58e&gt;]
                  SyS_renameat2+0xe/0x10</div>
                <div>[10919.517328]  [&lt;ffffffffb942e5ce&gt;]
                  SyS_rename+0x1e/0x20</div>
                <div>[10919.517333]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[10919.517339]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[11159.496095] INFO: task glusteriotwr9:15482
                  blocked for more than 120 seconds.</div>
                <div>[11159.497546] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11159.498978] glusteriotwr9   D ffff971fa0fa1fa0 
                     0 15482      1 0x00000080</div>
                <div>[11159.498984] Call Trace:</div>
                <div>[11159.498995]  [&lt;ffffffffb9911f00&gt;] ?
                  bit_wait+0x50/0x50</div>
                <div>[11159.498999]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11159.499003]  [&lt;ffffffffb99118e9&gt;]
                  schedule_timeout+0x239/0x2c0</div>
                <div>[11159.499056]  [&lt;ffffffffc05dd9b7&gt;] ?
                  xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]</div>
                <div>[11159.499082]  [&lt;ffffffffc05dd43e&gt;] ?
                  xfs_iext_bno_to_irec+0x8e/0xd0 [xfs]</div>
                <div>[11159.499090]  [&lt;ffffffffb92f7a12&gt;] ?
                  ktime_get_ts64+0x52/0xf0</div>
                <div>[11159.499093]  [&lt;ffffffffb9911f00&gt;] ?
                  bit_wait+0x50/0x50</div>
                <div>[11159.499097]  [&lt;ffffffffb991348d&gt;]
                  io_schedule_timeout+0xad/0x130</div>
                <div>[11159.499101]  [&lt;ffffffffb9913528&gt;]
                  io_schedule+0x18/0x20</div>
                <div>[11159.499104]  [&lt;ffffffffb9911f11&gt;]
                  bit_wait_io+0x11/0x50</div>
                <div>[11159.499107]  [&lt;ffffffffb9911ac1&gt;]
                  __wait_on_bit_lock+0x61/0xc0</div>
                <div>[11159.499113]  [&lt;ffffffffb9393634&gt;]
                  __lock_page+0x74/0x90</div>
                <div>[11159.499118]  [&lt;ffffffffb92bc210&gt;] ?
                  wake_bit_function+0x40/0x40</div>
                <div>[11159.499121]  [&lt;ffffffffb9394154&gt;]
                  __find_lock_page+0x54/0x70</div>
                <div>[11159.499125]  [&lt;ffffffffb9394e85&gt;]
                  grab_cache_page_write_begin+<wbr>0x55/0xc0</div>
                <div>[11159.499130]  [&lt;ffffffffb9484b76&gt;]
                  iomap_write_begin+0x66/0x100</div>
                <div>[11159.499135]  [&lt;ffffffffb9484edf&gt;]
                  iomap_write_actor+0xcf/0x1d0</div>
                <div>[11159.499140]  [&lt;ffffffffb9484e10&gt;] ?
                  iomap_write_end+0x80/0x80</div>
                <div>[11159.499144]  [&lt;ffffffffb94854e7&gt;]
                  iomap_apply+0xb7/0x150</div>
                <div>[11159.499149]  [&lt;ffffffffb9485621&gt;]
                  iomap_file_buffered_write+<wbr>0xa1/0xe0</div>
                <div>[11159.499153]  [&lt;ffffffffb9484e10&gt;] ?
                  iomap_write_end+0x80/0x80</div>
                <div>[11159.499182]  [&lt;ffffffffc05f025d&gt;]
                  xfs_file_buffered_aio_write+<wbr>0x12d/0x2c0 [xfs]</div>
                <div>[11159.499213]  [&lt;ffffffffc05f057d&gt;]
                  xfs_file_aio_write+0x18d/0x1b0 [xfs]</div>
                <div>[11159.499217]  [&lt;ffffffffb941a533&gt;]
                  do_sync_write+0x93/0xe0</div>
                <div>[11159.499222]  [&lt;ffffffffb941b010&gt;]
                  vfs_write+0xc0/0x1f0</div>
                <div>[11159.499225]  [&lt;ffffffffb941c002&gt;]
                  SyS_pwrite64+0x92/0xc0</div>
                <div>[11159.499230]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[11159.499234]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[11159.499238]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[11279.488720] INFO: task xfsaild/dm-10:1134
                  blocked for more than 120 seconds.</div>
                <div>[11279.490197] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11279.491665] xfsaild/dm-10   D ffff9720a8660fd0 
                     0  1134      2 0x00000000</div>
                <div>[11279.491671] Call Trace:</div>
                <div>[11279.491682]  [&lt;ffffffffb92a3a2e&gt;] ?
                  try_to_del_timer_sync+0x5e/<wbr>0x90</div>
                <div>[11279.491688]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11279.491744]  [&lt;ffffffffc060de36&gt;]
                  _xfs_log_force+0x1c6/0x2c0 [xfs]</div>
                <div>[11279.491750]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[11279.491783]  [&lt;ffffffffc0619fec&gt;] ?
                  xfsaild+0x16c/0x6f0 [xfs]</div>
                <div>[11279.491817]  [&lt;ffffffffc060df5c&gt;]
                  xfs_log_force+0x2c/0x70 [xfs]</div>
                <div>[11279.491849]  [&lt;ffffffffc0619e80&gt;] ?
                  xfs_trans_ail_cursor_first+<wbr>0x90/0x90 [xfs]</div>
                <div>[11279.491880]  [&lt;ffffffffc0619fec&gt;]
                  xfsaild+0x16c/0x6f0 [xfs]</div>
                <div>[11279.491913]  [&lt;ffffffffc0619e80&gt;] ?
                  xfs_trans_ail_cursor_first+<wbr>0x90/0x90 [xfs]</div>
                <div>[11279.491919]  [&lt;ffffffffb92bb161&gt;]
                  kthread+0xd1/0xe0</div>
                <div>[11279.491926]  [&lt;ffffffffb92bb090&gt;] ?
                  insert_kthread_work+0x40/0x40</div>
                <div>[11279.491932]  [&lt;ffffffffb9920677&gt;]
                  ret_from_fork_nospec_begin+<wbr>0x21/0x21</div>
                <div>[11279.491936]  [&lt;ffffffffb92bb090&gt;] ?
                  insert_kthread_work+0x40/0x40</div>
                <div>[11279.491976] INFO: task glusterclogfsyn:14934
                  blocked for more than 120 seconds.</div>
                <div>[11279.493466] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11279.494952] glusterclogfsyn D ffff97209832af70 
                     0 14934      1 0x00000080</div>
                <div>[11279.494957] Call Trace:</div>
                <div>[11279.494979]  [&lt;ffffffffc0309839&gt;] ?
                  __split_and_process_bio+0x2e9/<wbr>0x520 [dm_mod]</div>
                <div>[11279.494983]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11279.494987]  [&lt;ffffffffb99118e9&gt;]
                  schedule_timeout+0x239/0x2c0</div>
                <div>[11279.494997]  [&lt;ffffffffc0309d98&gt;] ?
                  dm_make_request+0x128/0x1a0 [dm_mod]</div>
                <div>[11279.495001]  [&lt;ffffffffb991348d&gt;]
                  io_schedule_timeout+0xad/0x130</div>
                <div>[11279.495005]  [&lt;ffffffffb99145ad&gt;]
                  wait_for_completion_io+0xfd/<wbr>0x140</div>
                <div>[11279.495010]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[11279.495016]  [&lt;ffffffffb951e574&gt;]
                  blkdev_issue_flush+0xb4/0x110</div>
                <div>[11279.495049]  [&lt;ffffffffc06064b9&gt;]
                  xfs_blkdev_issue_flush+0x19/<wbr>0x20 [xfs]</div>
                <div>[11279.495079]  [&lt;ffffffffc05eec40&gt;]
                  xfs_file_fsync+0x1b0/0x1e0 [xfs]</div>
                <div>[11279.495086]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[11279.495090]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[11279.495094]  [&lt;ffffffffb944f3d0&gt;]
                  SyS_fsync+0x10/0x20</div>
                <div>[11279.495098]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[11279.495102]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[11279.495105] INFO: task glusterposixfsy:14941
                  blocked for more than 120 seconds.</div>
                <div>[11279.496606] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11279.498114] glusterposixfsy D ffff972495f84f10 
                     0 14941      1 0x00000080</div>
                <div>[11279.498118] Call Trace:</div>
                <div>[11279.498134]  [&lt;ffffffffc0309839&gt;] ?
                  __split_and_process_bio+0x2e9/<wbr>0x520 [dm_mod]</div>
                <div>[11279.498138]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11279.498142]  [&lt;ffffffffb99118e9&gt;]
                  schedule_timeout+0x239/0x2c0</div>
                <div>[11279.498152]  [&lt;ffffffffc0309d98&gt;] ?
                  dm_make_request+0x128/0x1a0 [dm_mod]</div>
                <div>[11279.498156]  [&lt;ffffffffb991348d&gt;]
                  io_schedule_timeout+0xad/0x130</div>
                <div>[11279.498160]  [&lt;ffffffffb99145ad&gt;]
                  wait_for_completion_io+0xfd/<wbr>0x140</div>
                <div>[11279.498165]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[11279.498169]  [&lt;ffffffffb951e574&gt;]
                  blkdev_issue_flush+0xb4/0x110</div>
                <div>[11279.498202]  [&lt;ffffffffc06064b9&gt;]
                  xfs_blkdev_issue_flush+0x19/<wbr>0x20 [xfs]</div>
                <div>[11279.498231]  [&lt;ffffffffc05eec40&gt;]
                  xfs_file_fsync+0x1b0/0x1e0 [xfs]</div>
                <div>[11279.498238]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[11279.498242]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[11279.498246]  [&lt;ffffffffb944f3f3&gt;]
                  SyS_fdatasync+0x13/0x20</div>
                <div>[11279.498250]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[11279.498254]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[11279.498257] INFO: task glusteriotwr1:14950
                  blocked for more than 120 seconds.</div>
                <div>[11279.499789] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11279.501343] glusteriotwr1   D ffff97208b6daf70 
                     0 14950      1 0x00000080</div>
                <div>[11279.501348] Call Trace:</div>
                <div>[11279.501353]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11279.501390]  [&lt;ffffffffc060e388&gt;]
                  _xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
                <div>[11279.501396]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[11279.501428]  [&lt;ffffffffc05eeb97&gt;]
                  xfs_file_fsync+0x107/0x1e0 [xfs]</div>
                <div>[11279.501432]  [&lt;ffffffffb944ef3f&gt;]
                  generic_write_sync+0x4f/0x70</div>
                <div>[11279.501461]  [&lt;ffffffffc05f0545&gt;]
                  xfs_file_aio_write+0x155/0x1b0 [xfs]</div>
                <div>[11279.501466]  [&lt;ffffffffb941a533&gt;]
                  do_sync_write+0x93/0xe0</div>
                <div>[11279.501471]  [&lt;ffffffffb941b010&gt;]
                  vfs_write+0xc0/0x1f0</div>
                <div>[11279.501475]  [&lt;ffffffffb941c002&gt;]
                  SyS_pwrite64+0x92/0xc0</div>
                <div>[11279.501479]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[11279.501483]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[11279.501489]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[11279.501493] INFO: task glusteriotwr4:14953
                  blocked for more than 120 seconds.</div>
                <div>[11279.503047] "echo 0 &gt;
                  /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables
                  this message.</div>
                <div>[11279.504630] glusteriotwr4   D ffff972499f2bf40 
                     0 14953      1 0x00000080</div>
                <div>[11279.504635] Call Trace:</div>
                <div>[11279.504640]  [&lt;ffffffffb9913f79&gt;]
                  schedule+0x29/0x70</div>
                <div>[11279.504676]  [&lt;ffffffffc060e388&gt;]
                  _xfs_log_force_lsn+0x2e8/0x340 [xfs]</div>
                <div>[11279.504681]  [&lt;ffffffffb92cf1b0&gt;] ?
                  wake_up_state+0x20/0x20</div>
                <div>[11279.504710]  [&lt;ffffffffc05eeb97&gt;]
                  xfs_file_fsync+0x107/0x1e0 [xfs]</div>
                <div>[11279.504714]  [&lt;ffffffffb944f0e7&gt;]
                  do_fsync+0x67/0xb0</div>
                <div>[11279.504718]  [&lt;ffffffffb992076f&gt;] ?
                  system_call_after_swapgs+0xbc/<wbr>0x160</div>
                <div>[11279.504722]  [&lt;ffffffffb944f3d0&gt;]
                  SyS_fsync+0x10/0x20</div>
                <div>[11279.504725]  [&lt;ffffffffb992082f&gt;]
                  system_call_fastpath+0x1c/0x21</div>
                <div>[11279.504730]  [&lt;ffffffffb992077b&gt;] ?
                  system_call_after_swapgs+0xc8/<wbr>0x160</div>
                <div>[12127.466494] perf: interrupt took too long (8263
                  &gt; 8150), lowering kernel.perf_event_max_sample_<wbr>rate
                  to 24000</div>
              </div>
              <div><br>
              </div>
              <div>--------------------</div>
              <div>I think this is the cause of the massive oVirt
                performance issues, irrespective of gluster volume.  At
                the time this happened, I was also ssh'ed into the host
                running some rpm query commands.  I had just run
                rpm -qa | grep glusterfs (to verify which version was
                actually installed), and that command took almost 2
                minutes to return!  Normally it takes less than 2
                seconds.  That is all pure local SSD IO, too....</div>
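              <div>The hung-task reports above can be summarized
                mechanically; a minimal sketch (the here-string is an
                abbreviated excerpt of the messages above):</div>

```shell
# Abbreviated excerpt of the hung-task messages quoted above.
log='[10679.524491] INFO: task glusterclogro:14933 blocked for more than 120 seconds.
[10679.527283] INFO: task glusterposixfsy:14941 blocked for more than 120 seconds.
[11279.488720] INFO: task xfsaild/dm-10:1134 blocked for more than 120 seconds.'

# Pull out the unique task:pid pairs the hung-task detector flagged.
blocked=$(printf '%s\n' "$log" | sed -n 's/.*INFO: task \([^ ]*\) blocked.*/\1/p' | sort -u)
printf '%s\n' "$blocked"
```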
              <div><br>
              </div>
              <div>I'm no expert, but it's my understanding that any time
                software causes these kinds of issues, it's a serious
                bug in that software, even if the root cause is
                mishandled exceptions.  Is this correct?</div>
              <span class="HOEnZb"><font color="#888888">
                  <div><br>
                  </div>
                  <div>--Jim</div>
                </font></span></div>
            <div class="HOEnZb">
              <div class="h5">
                <div class="gmail_extra"><br>
                  <div class="gmail_quote">On Tue, May 29, 2018 at 3:01
                    PM, Jim Kusznir <span dir="ltr">&lt;<a
                        href="mailto:jim@palousetech.com"
                        target="_blank" moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div dir="ltr">I think this is the profile
                        information for one of the volumes that lives on
                        the SSDs and is fully operational with no
                        down/problem disks:
                        <div><br>
                        </div>
                        <div>
                          <div>[root@ovirt2 yum.repos.d]# gluster volume
                            profile data info</div>
                          <div>Brick: ovirt2.nwfiber.com:/gluster/br<wbr>ick2/data</div>
                          <div>------------------------------<wbr>----------------</div>
                          <div>Cumulative Stats:</div>
                          <div>   Block Size:                256b+     
                                       512b+                1024b+ </div>
                          <div> No. of Reads:                  983     
                                        2696                  1059 </div>
                          <div>No. of Writes:                    0     
                                        1113                   302 </div>
                          <div> </div>
                          <div>   Block Size:               2048b+     
                                      4096b+                8192b+ </div>
                          <div> No. of Reads:                  852     
                                       88608                 53526 </div>
                          <div>No. of Writes:                  522     
                                      812340                 76257 </div>
                          <div> </div>
                          <div>   Block Size:              16384b+     
                                     32768b+               65536b+ </div>
                          <div> No. of Reads:                54351     
                                      241901                 15024 </div>
                          <div>No. of Writes:                21636     
                                        8656                  8976 </div>
                          <div> </div>
                          <div>   Block Size:             131072b+ </div>
                          <div> No. of Reads:               524156 </div>
                          <div>No. of Writes:               296071 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           4189     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           1257  RELEASEDIR</div>
                          <div>      0.00      46.19 us      12.00 us   
                             187.00 us             69       FLUSH</div>
                          <div>      0.00     147.00 us      78.00 us   
                             367.00 us             86 REMOVEXATTR</div>
                          <div>      0.00     223.46 us      24.00 us   
                            1166.00 us            149     READDIR</div>
                          <div>      0.00     565.34 us      76.00 us   
                            3639.00 us             88   FTRUNCATE</div>
                          <div>      0.00     263.28 us      20.00 us 
                             28385.00 us            228          LK</div>
                          <div>      0.00      98.84 us       2.00 us   
                             880.00 us           1198     OPENDIR</div>
                          <div>      0.00      91.59 us      26.00 us 
                             10371.00 us           3853      STATFS</div>
                          <div>      0.00     494.14 us      17.00 us 
                            193439.00 us           1171    GETXATTR</div>
                          <div>      0.00     299.42 us      35.00 us   
                            9799.00 us           2044    READDIRP</div>
                          <div>      0.00    1965.31 us     110.00 us 
                            382258.00 us            321     XATTROP</div>
                          <div>      0.01     113.40 us      24.00 us 
                             61061.00 us           8134        STAT</div>
                          <div>      0.01     755.38 us      57.00 us 
                            607603.00 us           3196     DISCARD</div>
                          <div>      0.05    2690.09 us      58.00 us
                            2704761.00 us           3206        OPEN</div>
                          <div>      0.10  119978.25 us      97.00 us
                            9406684.00 us            154     SETATTR</div>
                          <div>      0.18     101.73 us      28.00 us 
                            700477.00 us         313379       FSTAT</div>
                          <div>      0.23    1059.84 us      25.00 us
                            2716124.00 us          38255      LOOKUP</div>
                          <div>      0.47    1024.11 us      54.00 us
                            6197164.00 us          81455    FXATTROP</div>
                          <div>      1.72    2984.00 us      15.00 us
                            37098954.00 us         103020    FINODELK</div>
                          <div>      5.92   44315.32 us      51.00 us
                            24731536.00 us          23957       FSYNC</div>
                          <div>     13.27    2399.78 us      25.00 us
                            22089540.00 us         991005        READ</div>
                          <div>     37.00    5980.43 us      52.00 us
                            22099889.00 us        1108976       WRITE</div>
                          <div>     41.04    5452.75 us      13.00 us
                            22102452.00 us        1349053     INODELK</div>
                          <div> </div>
                          <div>    Duration: 10026 seconds</div>
                          <div>   Data Read: 80046027759 bytes</div>
                          <div>Data Written: 44496632320 bytes</div>
                          <div> </div>
                          <div>Interval 1 Stats:</div>
                          <div>   Block Size:                256b+     
                                       512b+                1024b+ </div>
                          <div> No. of Reads:                  983     
                                        2696                  1059 </div>
                          <div>No. of Writes:                    0     
                                         838                   185 </div>
                          <div> </div>
                          <div>   Block Size:               2048b+     
                                      4096b+                8192b+ </div>
                          <div> No. of Reads:                  852     
                                       85856                 51575 </div>
                          <div>No. of Writes:                  382     
                                      705802                 57812 </div>
                          <div> </div>
                          <div>   Block Size:              16384b+     
                                     32768b+               65536b+ </div>
                          <div> No. of Reads:                52673     
                                      232093                 14984 </div>
                          <div>No. of Writes:                13499     
                                        4908                  4242 </div>
                          <div> </div>
                          <div>   Block Size:             131072b+ </div>
                          <div> No. of Reads:               460040 </div>
                          <div>No. of Writes:                 6411 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           2093     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           1093  RELEASEDIR</div>
                          <div>      0.00      53.38 us      26.00 us   
                             111.00 us             16       FLUSH</div>
                          <div>      0.00     145.14 us      78.00 us   
                             367.00 us             71 REMOVEXATTR</div>
                          <div>      0.00     190.96 us     114.00 us   
                             298.00 us             71     SETATTR</div>
                          <div>      0.00     213.38 us      24.00 us   
                            1145.00 us             90     READDIR</div>
                          <div>      0.00     263.28 us      20.00 us 
                             28385.00 us            228          LK</div>
                          <div>      0.00     101.76 us       2.00 us   
                             880.00 us           1093     OPENDIR</div>
                          <div>      0.01      93.60 us      27.00 us 
                             10371.00 us           3090      STATFS</div>
                          <div>      0.02     537.47 us      17.00 us 
                            193439.00 us           1038    GETXATTR</div>
                          <div>      0.03     297.44 us      35.00 us   
                            9799.00 us           1990    READDIRP</div>
                          <div>      0.03    2357.28 us     110.00 us 
                            382258.00 us            253     XATTROP</div>
                          <div>      0.04     385.93 us      58.00 us 
                             47593.00 us           2091        OPEN</div>
                          <div>      0.04     114.86 us      24.00 us 
                             61061.00 us           7715        STAT</div>
                          <div>      0.06     444.59 us      57.00 us 
                            333240.00 us           3053     DISCARD</div>
                          <div>      0.42     316.24 us      25.00 us 
                            290728.00 us          29823      LOOKUP</div>
                          <div>      0.73     257.92 us      54.00 us 
                            344812.00 us          63296    FXATTROP</div>
                          <div>      1.37      98.30 us      28.00 us 
                             67621.00 us         313172       FSTAT</div>
                          <div>      1.58    2124.69 us      51.00 us 
                            849200.00 us          16717       FSYNC</div>
                          <div>      5.73     162.46 us      52.00 us 
                            748492.00 us         794079       WRITE</div>
                          <div>      7.19    2065.17 us      16.00 us
                            37098954.00 us          78381    FINODELK</div>
                          <div>     36.44     886.32 us      25.00 us
                            2216436.00 us         925421        READ</div>
                          <div>     46.30    1178.04 us      13.00 us
                            1700704.00 us         884635     INODELK</div>
                          <div> </div>
                          <div>    Duration: 7485 seconds</div>
                          <div>   Data Read: 71250527215 bytes</div>
                          <div>Data Written: 5119903744 bytes</div>
                          <div> </div>
                          <div>Brick: ovirt3.nwfiber.com:/gluster/brick2/data</div>
                          <div>----------------------------------------------</div>
                          <div>Cumulative Stats:</div>
                          <div>   Block Size:                  1b+ </div>
                          <div> No. of Reads:                    0 </div>
                          <div>No. of Writes:              3264419 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us             90      FORGET</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           9462     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           4254  RELEASEDIR</div>
                          <div>      0.00      50.52 us      13.00 us   
                             190.00 us             71       FLUSH</div>
                          <div>      0.00     186.97 us      87.00 us   
                             713.00 us             86 REMOVEXATTR</div>
                          <div>      0.00      79.32 us      33.00 us   
                             189.00 us            228          LK</div>
                          <div>      0.00     220.98 us     129.00 us   
                             513.00 us             86     SETATTR</div>
                          <div>      0.01     259.30 us      26.00 us   
                            2632.00 us            137     READDIR</div>
                          <div>      0.02     322.76 us     145.00 us   
                            2125.00 us            321     XATTROP</div>
                          <div>      0.03     109.55 us       2.00 us   
                            1258.00 us           1193     OPENDIR</div>
                          <div>      0.05      70.21 us      21.00 us   
                             431.00 us           3196     DISCARD</div>
                          <div>      0.05     169.26 us      21.00 us   
                            2315.00 us           1545    GETXATTR</div>
                          <div>      0.12     176.85 us      63.00 us   
                            2844.00 us           3206        OPEN</div>
                          <div>      0.61     303.49 us      90.00 us   
                            3085.00 us           9633       FSTAT</div>
                          <div>      2.44     305.66 us      28.00 us   
                            3716.00 us          38230      LOOKUP</div>
                          <div>      4.52     266.22 us      55.00 us 
                             53424.00 us          81455    FXATTROP</div>
                          <div>      6.96    1397.99 us      51.00 us 
                             64822.00 us          23889       FSYNC</div>
                          <div>     16.48      84.74 us      25.00 us   
                            6917.00 us         932592       WRITE</div>
                          <div>     30.16     106.90 us      13.00 us
                            3920189.00 us        1353046     INODELK</div>
                          <div>     38.55    1794.52 us      14.00 us
                            16210553.00 us         103039    FINODELK</div>
                          <div> </div>
                          <div>    Duration: 66562 seconds</div>
                          <div>   Data Read: 0 bytes</div>
                          <div>Data Written: 3264419 bytes</div>
                          <div> </div>
                          <div>Interval 1 Stats:</div>
                          <div>   Block Size:                  1b+ </div>
                          <div> No. of Reads:                    0 </div>
                          <div>No. of Writes:               794080 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           2093     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           1093  RELEASEDIR</div>
                          <div>      0.00      70.31 us      26.00 us   
                             125.00 us             16       FLUSH</div>
                          <div>      0.00     193.10 us     103.00 us   
                             713.00 us             71 REMOVEXATTR</div>
                          <div>      0.01     227.32 us     133.00 us   
                             513.00 us             71     SETATTR</div>
                          <div>      0.01      79.32 us      33.00 us   
                             189.00 us            228          LK</div>
                          <div>      0.01     259.83 us      35.00 us   
                            1138.00 us             89     READDIR</div>
                          <div>      0.03     318.26 us     145.00 us   
                            2047.00 us            253     XATTROP</div>
                          <div>      0.04     112.67 us       3.00 us   
                            1258.00 us           1093     OPENDIR</div>
                          <div>      0.06     167.98 us      23.00 us   
                            1951.00 us           1014    GETXATTR</div>
                          <div>      0.08      70.97 us      22.00 us   
                             431.00 us           3053     DISCARD</div>
                          <div>      0.13     183.78 us      66.00 us   
                            2844.00 us           2091        OPEN</div>
                          <div>      1.01     303.82 us      90.00 us   
                            3085.00 us           9610       FSTAT</div>
                          <div>      3.27     316.59 us      30.00 us   
                            3716.00 us          29820      LOOKUP</div>
                          <div>      5.83     265.79 us      59.00 us 
                             53424.00 us          63296    FXATTROP</div>
                          <div>      7.95    1373.89 us      51.00 us 
                             64822.00 us          16717       FSYNC</div>
                          <div>     23.17     851.99 us      14.00 us
                            16210553.00 us          78555    FINODELK</div>
                          <div>     24.04      87.44 us      27.00 us   
                            6917.00 us         794081       WRITE</div>
                          <div>     34.36     111.91 us      14.00 us 
                            984871.00 us         886790     INODELK</div>
                          <div> </div>
                          <div>    Duration: 7485 seconds</div>
                          <div>   Data Read: 0 bytes</div>
                          <div>Data Written: 794080 bytes</div>
                        </div>
                        <div><br>
                        </div>
                        <div><br>
                        </div>
                        <div>-----------------------</div>
                        <div>Here is the data from the volume that is
                          backed by the SSHDs and has one failed disk:</div>
                        <div>
                          <div>[root@ovirt2 yum.repos.d]# gluster volume
                            profile data-hdd info</div>
                          <div>Brick: 172.172.1.12:/gluster/brick3/data-hdd</div>
                          <div>--------------------------------------------</div>
                          <div>Cumulative Stats:</div>
                          <div>   Block Size:                256b+     
                                       512b+                1024b+ </div>
                          <div> No. of Reads:                 1702     
                                          86                    16 </div>
                          <div>No. of Writes:                    0     
                                         767                    71 </div>
                          <div> </div>
                          <div>   Block Size:               2048b+     
                                      4096b+                8192b+ </div>
                          <div> No. of Reads:                   19     
                                       51841                  2049 </div>
                          <div>No. of Writes:                   76     
                                       60668                 35727 </div>
                          <div> </div>
                          <div>   Block Size:              16384b+     
                                     32768b+               65536b+ </div>
                          <div> No. of Reads:                 1744     
                                         639                  1088 </div>
                          <div>No. of Writes:                 8524     
                                        2410                  1285 </div>
                          <div> </div>
                          <div>   Block Size:             131072b+ </div>
                          <div> No. of Reads:               771999 </div>
                          <div>No. of Writes:                29584 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           2902     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           1517  RELEASEDIR</div>
                          <div>      0.00     197.00 us     197.00 us   
                             197.00 us              1   FTRUNCATE</div>
                          <div>      0.00      70.24 us      16.00 us   
                             758.00 us             51       FLUSH</div>
                          <div>      0.00     143.93 us      82.00 us   
                             305.00 us             57 REMOVEXATTR</div>
                          <div>      0.00     178.63 us     105.00 us   
                             712.00 us             60     SETATTR</div>
                          <div>      0.00      67.30 us      19.00 us   
                             572.00 us            555          LK</div>
                          <div>      0.00     322.80 us      23.00 us   
                            4673.00 us            138     READDIR</div>
                          <div>      0.00     336.56 us     106.00 us 
                             11994.00 us            237     XATTROP</div>
                          <div>      0.00      84.70 us      28.00 us   
                            1071.00 us           3469      STATFS</div>
                          <div>      0.01     387.75 us       2.00 us 
                            146017.00 us           1467     OPENDIR</div>
                          <div>      0.01     148.59 us      21.00 us 
                             64374.00 us           4454        STAT</div>
                          <div>      0.02     783.02 us      16.00 us 
                             93502.00 us           1902    GETXATTR</div>
                          <div>      0.03    1516.10 us      17.00 us 
                            210690.00 us           1364     ENTRYLK</div>
                          <div>      0.03    2555.47 us     300.00 us 
                            674454.00 us           1064    READDIRP</div>
                          <div>      0.07      85.74 us      19.00 us 
                             68340.00 us          62849       FSTAT</div>
                          <div>      0.07    1978.12 us      59.00 us 
                            202596.00 us           2729        OPEN</div>
                          <div>      0.22     708.57 us      15.00 us 
                            394799.00 us          25447      LOOKUP</div>
                          <div>      5.94    2331.74 us      15.00 us
                            1099530.00 us         207534    FINODELK</div>
                          <div>      7.31    8311.75 us      58.00 us
                            1800216.00 us          71668    FXATTROP</div>
                          <div>     12.49    7735.19 us      51.00 us
                            3595513.00 us         131642       WRITE</div>
                          <div>     17.70     957.08 us      16.00 us
                            13700466.00 us        1508160     INODELK</div>
                          <div>     24.55    2546.43 us      26.00 us
                            5077347.00 us         786060        READ</div>
                          <div>     31.56   49699.15 us      47.00 us
                            3746331.00 us          51777       FSYNC</div>
                          <div> </div>
                          <div>    Duration: 10101 seconds</div>
                          <div>   Data Read: 101562897361 bytes</div>
                          <div>Data Written: 4834450432 bytes</div>
                          <div> </div>
                          <div>Interval 0 Stats:</div>
                          <div>   Block Size:                256b+     
                                       512b+                1024b+ </div>
                          <div> No. of Reads:                 1702     
                                          86                    16 </div>
                          <div>No. of Writes:                    0     
                                         767                    71 </div>
                          <div> </div>
                          <div>   Block Size:               2048b+     
                                      4096b+                8192b+ </div>
                          <div> No. of Reads:                   19     
                                       51841                  2049 </div>
                          <div>No. of Writes:                   76     
                                       60668                 35727 </div>
                          <div> </div>
                          <div>   Block Size:              16384b+     
                                     32768b+               65536b+ </div>
                          <div> No. of Reads:                 1744     
                                         639                  1088 </div>
                          <div>No. of Writes:                 8524     
                                        2410                  1285 </div>
                          <div> </div>
                          <div>   Block Size:             131072b+ </div>
                          <div> No. of Reads:               771999 </div>
                          <div>No. of Writes:                29584 </div>
                          <div> %-latency   Avg-latency   Min-Latency 
                             Max-Latency   No. of calls         Fop</div>
                          <div> ---------   -----------   ----------- 
                             -----------   ------------        ----</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           2902     RELEASE</div>
                          <div>      0.00       0.00 us       0.00 us   
                               0.00 us           1517  RELEASEDIR</div>
                          <div>      0.00     197.00 us     197.00 us   
                             197.00 us              1   FTRUNCATE</div>
                          <div>      0.00      70.24 us      16.00 us   
                             758.00 us             51       FLUSH</div>
                          <div>      0.00     143.93 us      82.00 us   
                             305.00 us             57 REMOVEXATTR</div>
                          <div>      0.00     178.63 us     105.00 us   
                             712.00 us             60     SETATTR</div>
                          <div>      0.00      67.30 us      19.00 us   
                             572.00 us            555          LK</div>
                          <div>      0.00     322.80 us      23.00 us   
                            4673.00 us            138     READDIR</div>
                          <div>      0.00     336.56 us     106.00 us 
                             11994.00 us            237     XATTROP</div>
                          <div>      0.00      84.70 us      28.00 us   
                            1071.00 us           3469      STATFS</div>
                          <div>      0.01     387.75 us       2.00 us 
                            146017.00 us           1467     OPENDIR</div>
                          <div>      0.01     148.59 us      21.00 us 
                             64374.00 us           4454        STAT</div>
                          <div>      0.02     783.02 us      16.00 us 
                             93502.00 us           1902    GETXATTR</div>
                          <div>      0.03    1516.10 us      17.00 us 
                            210690.00 us           1364     ENTRYLK</div>
                          <div>      0.03    2555.47 us     300.00 us 
                            674454.00 us           1064    READDIRP</div>
                          <div>      0.07      85.73 us      19.00 us 
                             68340.00 us          62849       FSTAT</div>
                          <div>      0.07    1978.12 us      59.00 us 
                            202596.00 us           2729        OPEN</div>
                          <div>      0.22     708.57 us      15.00 us 
                            394799.00 us          25447      LOOKUP</div>
                          <div>      5.94    2334.57 us      15.00 us
                            1099530.00 us         207534    FINODELK</div>
                          <div>      7.31    8311.49 us      58.00 us
                            1800216.00 us          71668    FXATTROP</div>
                          <div>     12.49    7735.32 us      51.00 us
                            3595513.00 us         131642       WRITE</div>
                          <div>     17.71     957.08 us      16.00 us
                            13700466.00 us        1508160     INODELK</div>
                          <div>     24.56    2546.42 us      26.00 us
                            5077347.00 us         786060        READ</div>
                          <div>     31.54   49651.63 us      47.00 us
                            3746331.00 us          51777       FSYNC</div>
                          <div> </div>
                          <div>    Duration: 10101 seconds</div>
                          <div>   Data Read: 101562897361 bytes</div>
                          <div>Data Written: 4834450432 bytes</div>
                        </div>
                        <div><br>
                        </div>
                      </div>
                      <div class="m_-5992202424066002276HOEnZb">
                        <div class="m_-5992202424066002276h5">
                          <div class="gmail_extra"><br>
                            <div class="gmail_quote">On Tue, May 29,
                              2018 at 2:55 PM, Jim Kusznir <span
                                dir="ltr">&lt;<a
                                  href="mailto:jim@palousetech.com"
                                  target="_blank" moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
                              wrote:<br>
                              <blockquote class="gmail_quote"
                                style="margin:0 0 0 .8ex;border-left:1px
                                #ccc solid;padding-left:1ex">
                                <div dir="ltr">Thank you for your
                                  response.
                                  <div><br>
                                  </div>
                                  <div>I have 4 gluster volumes.  3 are
                                    replica 2 + arbiter; the replica
                                    bricks are on ovirt1 and ovirt2,
                                    with the arbiter on ovirt3.  The 4th
                                    volume is replica 3, with a brick on
                                    all three ovirt machines.</div>
                                  <div><br>
                                  </div>
                                  <div>The first 3 volumes are on an SSD
                                    disk; the 4th is on a Seagate SSHD
                                    (the same model in all three machines).
                                    On ovirt3, the SSHD has reported hard
                                    IO failures, and that brick is
                                    offline.  However, the other two
                                    replicas are fully operational
                                    (although they still show entries in
                                    the heal info output that won't go
                                    away; that may be expected until I
                                    replace the failed disk).</div>
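                                  <div>For reference, the pending heal entries mentioned here can be inspected per volume; a typical check (volume name `data-hdd` assumed, Gluster 3.x CLI syntax) looks like:

```shell
# List files pending heal on each brick of the replica
gluster volume heal data-hdd info

# Per-brick count of entries pending heal
gluster volume heal data-hdd statistics heal-count
```

Entries that persist while a brick is down are expected; they should drain once the failed brick is replaced and self-heal completes.</div>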
                                  <div><br>
                                  </div>
                                  <div>What is bothering me is that ALL
                                    4 gluster volumes are showing
                                    horrible performance issues.  At
                                    this point, as the bad disk has been
                                    completely offlined, I would expect
                                    gluster to perform at normal speed,
                                    but that is definitely not the case.</div>
                                  <div><br>
                                  </div>
                                  <div>I've also noticed that the
                                    performance hits seem to come in
                                    waves: things seem to work
                                    acceptably (but slow) for a while,
                                    then suddenly, it's as if all disk IO
                                    on all volumes (including
                                    non-gluster local OS disk volumes
                                    for the hosts) pause for about 30
                                    seconds, then IO resumes again. 
                                    During those times, I start getting
                                    VM not responding and host not
                                    responding notices as well as the
                                    applications having major issues.</div>
                                  <div><br>
                                  </div>
                                  <div>I've shut down most of my VMs and
                                    am down to just my essential core
                                    VMs (shed about 75% of my VMs). 
                                    I still am experiencing the same
                                    issues.</div>
                                  <div><br>
                                  </div>
                                  <div>Am I correct in believing that
                                    once the failed disk was brought
                                    offline, performance should
                                    return to normal?</div>
                                </div>
                                <div
                                  class="m_-5992202424066002276m_1037085839393797930HOEnZb">
                                  <div
                                    class="m_-5992202424066002276m_1037085839393797930h5">
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote">On Tue,
                                        May 29, 2018 at 1:27 PM, Alex K
                                        <span dir="ltr">&lt;<a
                                            href="mailto:rightkicktech@gmail.com"
                                            target="_blank"
                                            moz-do-not-send="true">rightkicktech@gmail.com</a>&gt;</span>
                                        wrote:<br>
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex">
                                          <div dir="auto">I would check
                                            disks status and
                                            accessibility of mount
                                            points where your gluster
                                            volumes reside.</div>
                                          <br>
                                          <div class="gmail_quote">
                                            <div>
                                              <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844h5">
                                                <div dir="ltr">On Tue,
                                                  May 29, 2018, 22:28
                                                  Jim Kusznir &lt;<a
                                                    href="mailto:jim@palousetech.com"
                                                    target="_blank"
                                                    moz-do-not-send="true">jim@palousetech.com</a>&gt;
                                                  wrote:<br>
                                                </div>
                                              </div>
                                            </div>
                                            <blockquote
                                              class="gmail_quote"
                                              style="margin:0 0 0
                                              .8ex;border-left:1px #ccc
                                              solid;padding-left:1ex">
                                              <div>
                                                <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844h5">
                                                  <div dir="ltr">On one
                                                    ovirt server, I'm
                                                    now seeing these
                                                    messages:
                                                    <div>
                                                      <div>[56474.239725]
blk_update_request: 63 callbacks suppressed</div>
                                                      <div>[56474.239732]
blk_update_request: I/O error, dev dm-2, sector 0</div>
                                                      <div>[56474.240602]
blk_update_request: I/O error, dev dm-2, sector 3905945472</div>
                                                      <div>[56474.241346]
blk_update_request: I/O error, dev dm-2, sector 3905945584</div>
                                                      <div>[56474.242236]
blk_update_request: I/O error, dev dm-2, sector 2048</div>
                                                      <div>[56474.243072]
blk_update_request: I/O error, dev dm-2, sector 3905943424</div>
                                                      <div>[56474.243997]
blk_update_request: I/O error, dev dm-2, sector 3905943536</div>
                                                      <div>[56474.247347]
blk_update_request: I/O error, dev dm-2, sector 0</div>
                                                      <div>[56474.248315]
blk_update_request: I/O error, dev dm-2, sector 3905945472</div>
                                                      <div>[56474.249231]
blk_update_request: I/O error, dev dm-2, sector 3905945584</div>
                                                      <div>[56474.250221]
blk_update_request: I/O error, dev dm-2, sector 2048</div>
                                                    </div>
                                                    <div><br>
                                                    </div>
                                                    <div><br>
                                                    </div>
                                                    <div><br>
                                                    </div>
                                                  </div>
                                                  <div
                                                    class="gmail_extra"><br>
                                                    <div
                                                      class="gmail_quote">On
                                                      Tue, May 29, 2018
                                                      at 11:59 AM, Jim
                                                      Kusznir <span
                                                        dir="ltr">&lt;<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
                                                          moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
                                                      wrote:<br>
                                                      <blockquote
                                                        class="gmail_quote"
                                                        style="margin:0
                                                        0 0
                                                        .8ex;border-left:1px
                                                        #ccc
                                                        solid;padding-left:1ex">
                                                        <div dir="ltr">I
                                                          see in
                                                          messages on
                                                          ovirt3 (my 3rd
                                                          machine, the
                                                          one upgraded
                                                          to 4.2):
                                                          <div><br>
                                                          </div>
                                                          <div>
                                                          <div>May 29
                                                          11:54:41
                                                          ovirt3
                                                          ovs-vsctl:
                                                          ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
                                                          database
                                                          connection
                                                          failed (No
                                                          such file or
                                                          directory)</div>
                                                          <div>May 29
                                                          11:54:51
                                                          ovirt3
                                                          ovs-vsctl:
                                                          ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
                                                          database
                                                          connection
                                                          failed (No
                                                          such file or
                                                          directory)</div>
                                                          <div>May 29
                                                          11:55:01
                                                          ovirt3
                                                          ovs-vsctl:
                                                          ovs|00001|db_ctl_base|ERR|unix<wbr>:/var/run/openvswitch/db.sock:
                                                          database
                                                          connection
                                                          failed (No
                                                          such file or
                                                          directory)</div>
                                                          </div>
                                                          <div>(appears
                                                          a lot).</div>
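Those ovs-vsctl messages only say that the OVSDB socket is absent, i.e. the Open vSwitch daemons are not running on ovirt3; that is likely unrelated noise next to the storage problem, but it is easy to confirm. A sketch (the exact systemd unit names are an assumption and may differ by distro):

```shell
# Does the socket the error complains about exist?
if [ -S /var/run/openvswitch/db.sock ]; then
    echo "OVSDB socket present"
else
    echo "OVSDB socket missing -- ovsdb-server is probably not running"
fi

# Check (and if needed restart) the daemons:
systemctl status ovsdb-server ovs-vswitchd
# systemctl restart openvswitch
```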
                                                          <div><br>
                                                          </div>
                                                          <div>I also
                                                          found on the
                                                          ssh session of
                                                          that, some
                                                          sysv warnings
                                                          about the
                                                          backing disk
                                                          for one of the
                                                          gluster
                                                          volumes
                                                          (straight
                                                          replica 3). 
                                                          The glusterfs
                                                          process for
                                                          that disk on
                                                          that machine
                                                          went offline. 
                                                          It's my
                                                          understanding
                                                          that it should
                                                          continue to
                                                          work with the
                                                          other two
                                                          machines while
                                                          I attempt to
                                                          replace that
                                                          disk, right? 
                                                          Attempted
                                                          writes
                                                          (touching an
                                                          empty file)
                                                          can take 15
                                                          seconds;
                                                          repeating it
                                                          later will be
                                                          much faster.</div>
                                                          <div><br>
                                                          </div>
                                                          <div>Gluster
                                                          generates a
                                                          bunch of
                                                          different log
                                                          files; I don't
                                                          know which ones
                                                          you want, or
                                                          from which
                                                          machine(s).</div>
                                                          <div><br>
                                                          </div>
                                                          <div>How do I
                                                          do "volume
                                                          profiling"?</div>
                                                          <div><br>
                                                          </div>
                                                          <div>Thanks!</div>
                                                        </div>
                                                        <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928HOEnZb">
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928h5">
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">On
                                                          Tue, May 29,
                                                          2018 at 11:53
                                                          AM, Sahina
                                                          Bose <span
                                                          dir="ltr">&lt;<a
href="mailto:sabose@redhat.com" rel="noreferrer" target="_blank"
                                                          moz-do-not-send="true">sabose@redhat.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div dir="ltr">
                                                          <div>Do you
                                                          see errors
                                                          reported in
                                                          the mount logs
                                                          for the
                                                          volume? If so,
                                                          could you
                                                          attach the
                                                          logs?<br>
                                                          </div>
                                                          Any issues
                                                          with your
                                                          underlying
                                                          disks. Can you
                                                          also attach
                                                          output of
                                                          volume
                                                          profiling?<br>
                                                          </div>
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702HOEnZb">
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702h5">
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">On
                                                          Wed, May 30,
                                                          2018 at 12:13
                                                          AM, Jim
                                                          Kusznir <span
                                                          dir="ltr">&lt;<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
                                                          moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div dir="ltr">Ok,
                                                          things have
                                                          gotten MUCH
                                                          worse this
                                                          morning.  I'm
                                                          getting random
                                                          errors from
                                                          VMs, right
                                                          now, about a
                                                          third of my
                                                          VMs have been
                                                          paused due to
                                                          storage
                                                          issues, and
                                                          most of the
                                                          remaining VMs
                                                          are not
                                                          performing
                                                          well.
                                                          <div><br>
                                                          </div>
                                                          <div>At this
                                                          point, I am in
                                                          full EMERGENCY
                                                          mode, as my
                                                          production
                                                          services are
                                                          now impacted,
                                                          and I'm
                                                          getting calls
                                                          coming in with
                                                          problems...</div>
                                                          <div><br>
                                                          </div>
                                                          <div>I'd
                                                          greatly
                                                          appreciate
                                                          help...VMs are
                                                          running VERY
                                                          slowly (when
                                                          they run), and
                                                          they are
                                                          steadily
                                                          getting
                                                          worse.  I
                                                          don't know
                                                          why.  I was
                                                          seeing CPU
                                                          peaks (to
                                                          100%) on
                                                          several VMs,
                                                          in perfect
                                                          sync, for a
                                                          few minutes at
                                                          a time (while
                                                          the VM became
                                                          unresponsive
                                                          and any VMs I
                                                          was logged
                                                          into that were
                                                          linux were
                                                          giving me the
                                                          CPU stuck
                                                          messages in my
                                                          original
                                                          post).  Is all
                                                          this storage
                                                          related?</div>
                                                          <div><br>
                                                          </div>
                                                          <div>I also
                                                          have two
                                                          different
                                                          gluster
                                                          volumes for VM
                                                          storage, and
                                                          only one had
                                                          the issues,
                                                          but now VMs in
                                                          both are being
                                                          affected at
                                                          the same time
                                                          and same way.</div>
                                                          <span
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339HOEnZb"><font
color="#888888">
                                                          <div><br>
                                                          </div>
                                                          <div>--Jim</div>
                                                          </font></span></div>
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339HOEnZb">
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339h5">
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">On
                                                          Mon, May 28,
                                                          2018 at 10:50
                                                          PM, Sahina
                                                          Bose <span
                                                          dir="ltr">&lt;<a
href="mailto:sabose@redhat.com" rel="noreferrer" target="_blank"
                                                          moz-do-not-send="true">sabose@redhat.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div dir="ltr">[Adding
                                                          gluster-users
                                                          to look at the
                                                          heal issue]<br>
                                                          </div>
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">
                                                          <div>
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125h5">On
                                                          Tue, May 29,
                                                          2018 at 9:17
                                                          AM, Jim
                                                          Kusznir <span
                                                          dir="ltr">&lt;<a
href="mailto:jim@palousetech.com" rel="noreferrer" target="_blank"
                                                          moz-do-not-send="true">jim@palousetech.com</a>&gt;</span>
                                                          wrote:<br>
                                                          </div>
                                                          </div>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div>
                                                          <div
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125h5">
                                                          <div dir="ltr">Hello:
                                                          <div><br>
                                                          </div>
<div>I've been having some cluster and gluster performance issues lately. I also found that my cluster was out of date, and while trying to apply updates (hoping to fix some of these) I discovered the oVirt 4.1 repos had been taken completely offline. So I was forced to begin an upgrade to 4.2. According to the docs I found/read, I needed only to add the new repo, do a yum update, and reboot to be good on my hosts (I did the yum update, then engine-setup, on my hosted engine). Things seemed to work relatively well, except for a gluster sync issue that showed up.</div>
                                                          <div><br>
                                                          </div>
<div>My cluster is a 3-node hyperconverged cluster. I upgraded the hosted engine first, then node 3. When node 3 came back up, for some reason one of my gluster volumes would not sync. Here's sample output:</div>
                                                          <div><br>
                                                          </div>
<div>
<div>[root@ovirt3 ~]# gluster volume heal data-hdd info</div>
<div>Brick 172.172.1.11:/gluster/brick3/data-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2</div>
<div>Status: Connected</div>
<div>Number of entries: 8</div>
<div><br>
</div>
<div>Brick 172.172.1.12:/gluster/brick3/data-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b</div>
<div>Status: Connected</div>
<div>Number of entries: 8</div>
<div><br>
</div>
<div>Brick 172.172.1.13:/gluster/brick3/data-hdd</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/b1ea3f62-0f05-4ded-8c82-9c91c90e0b61/d5d6bf5a-499f-431d-9013-5453db93ed32</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/8c8b5147-e9d6-4810-b45b-185e3ed65727/16f08231-93b0-489d-a2fd-687b6bf88eaa</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/12924435-b9c2-4aab-ba19-1c1bc31310ef/07b3db69-440e-491e-854c-bbfa18a7cff2</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/f3c8e7aa-6ef2-42a7-93d4-e0a4df6dd2fa/2eb0b1ad-2606-44ef-9cd3-ae59610a504b</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/647be733-f153-4cdc-85bd-ba72544c2631/b453a300-0602-4be1-8310-8bd5abe00971</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/48d7ecb8-7ac5-4725-bca5-b3519681cf2f/0d6080b0-7018-4fa3-bb82-1dd9ef07d9b9</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/6da854d1-b6be-446b-9bf0-90a0dbbea830/3c93bd1f-b7fa-4aa2-b445-6904e31839ba</div>
<div>/cc65f671-3377-494a-a7d4-1d9f7c3ae46c/images/7f647567-d18c-44f1-a58e-9b8865833acb/f9364470-9770-4bb1-a6b9-a54861849625</div>
<div>Status: Connected</div>
<div>Number of entries: 8</div>
</div>
                                                          <div><br>
                                                          </div>
                                                          <div>---------</div>
<div>It's been in this state for a couple of days now, and bandwidth monitoring shows no appreciable data moving. I've tried repeatedly commanding a full heal from all three nodes in the cluster. It's always the same files that need healing.</div>
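<div><br>
</div>
<div>For what it's worth, each of those stuck files can also be inspected on the bricks through its gfid hard link under .glusterfs. A minimal sketch of deriving that path from a hex gfid as printed by `getfattr -e hex -n trusted.gfid` (the gfid value below is made up for illustration):</div>
<pre>
```shell
# Hypothetical gfid, 0x prefix already stripped (made-up value for illustration)
gfid_hex=cc65f6713377494aa7d41d9f7c3ae46c

# Re-insert the UUID dashes: 8-4-4-4-12 hex digits
uuid="${gfid_hex:0:8}-${gfid_hex:8:4}-${gfid_hex:12:4}-${gfid_hex:16:4}-${gfid_hex:20:12}"

# Gluster keeps the hard link at .glusterfs/&lt;first 2 hex&gt;/&lt;next 2 hex&gt;/&lt;uuid&gt; on each brick
path=".glusterfs/${gfid_hex:0:2}/${gfid_hex:2:2}/${uuid}"
echo "$path"
```
</pre>
<div>Stat-ing that path on each brick (relative to the brick root) is another way to compare the three copies.</div>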
                                                          <div><br>
                                                          </div>
<div>When running "gluster volume heal data-hdd statistics", I sometimes see different information, but always some number of "heal failed" entries. It shows 0 for split brain.</div>
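<div><br>
</div>
<div>If it helps anyone looking at this: my understanding is that the trusted.afr.* xattrs that drive these heals are three big-endian 32-bit counters (data / metadata / entry pending operations). A rough sketch of decoding one by hand, using a made-up value in the hex form that `getfattr -e hex` prints:</div>
<pre>
```shell
# Made-up trusted.afr.data-hdd-client-N value for illustration
val="0x000000010000000000000000"
hex=${val#0x}

# Three big-endian 32-bit counters: data, metadata, entry pending ops
data=$((16#${hex:0:8}))
meta=$((16#${hex:8:8}))
entry=$((16#${hex:16:8}))
echo "data=$data metadata=$meta entry=$entry"
```
</pre>
<div>A non-zero counter on one brick pointing at another indicates pending heals against that replica.</div>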
                                                          <div><br>
                                                          </div>
<div>I'm not quite sure what to do. I suspect it may be due to nodes 1 and 2 still being on the older ovirt/gluster release, but I'm afraid to upgrade and reboot them until I have a good gluster sync (I don't need to create a split-brain issue). How do I proceed with this?</div>
                                                          <div><br>
                                                          </div>
<div>Second issue: I've been experiencing VERY POOR performance on most of my VMs, to the point that logging into a Windows 10 VM via remote desktop can take 5 minutes, and launching QuickBooks inside said VM can easily take 10 minutes. On some Linux VMs, I get random messages like this:</div>
<div>
<div>Message from syslogd@unifi at May 28 20:39:23 ...</div>
<div> kernel:[6171996.308904] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [mongod:14766]</div>
</div>
                                                          <div><br>
                                                          </div>
<div>(the process and PID are often different)</div>
                                                          <div><br>
                                                          </div>
<div>I'm not quite sure what to do about this either. My initial thought was to upgrade everything to current and see if it's still there, but I cannot move forward with that until my gluster is healed...</div>
                                                          <div><br>
                                                          </div>
                                                          <div>Thanks!</div>
                                                          <span
class="m_-5992202424066002276m_1037085839393797930m_-4909453786756208844m_-1594786904780884718m_492621309039667928m_-6088757787094439702m_1448879657997877339m_2506865858631215125m_-3484925472286407273HOEnZb"><font
color="#888888">
                                                          <div>--Jim</div>
                                                          </font></span></div>
                                                          <br>
                                                          </div>
                                                          </div>
______________________________<wbr>_________________<br>
                                                          Users mailing
                                                          list -- <a
                                                          href="mailto:users@ovirt.org"
rel="noreferrer" target="_blank" moz-do-not-send="true">users@ovirt.org</a><br>
                                                          To unsubscribe
                                                          send an email
                                                          to <a
                                                          href="mailto:users-leave@ovirt.org"
rel="noreferrer" target="_blank" moz-do-not-send="true">users-leave@ovirt.org</a><br>
                                                          Privacy
                                                          Statement: <a
href="https://www.ovirt.org/site/privacy-policy/" rel="noreferrer
                                                          noreferrer"
                                                          target="_blank"
moz-do-not-send="true">https://www.ovirt.org/site/pri<wbr>vacy-policy/</a><br>
                                                          oVirt Code of
                                                          Conduct: <a
                                                          href="https://www.ovirt.org/community/about/community-guidelines/"
rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://www.ovirt.org/communit<wbr>y/about/community-guidelines/</a><br>
                                                          List Archives:
                                                          <a
href="https://lists.ovirt.org/archives/list/users@ovirt.org/message/3LEV6ZQ3JV2XLAL7NYBTXOYMYUOTIRQF/"
rel="noreferrer noreferrer" target="_blank" moz-do-not-send="true">https://lists.ovirt.org/archiv<wbr>es/list/users@ovirt.org/messag<wbr>e/3LEV6ZQ3JV2XLAL7NYBTXOYMYUOT<wbr>IRQF/</a><br>
                                                          <br>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </blockquote>
                                                    </div>
                                                    <br>
                                                  </div>
                                                </div>
                                              </div>
                                            </blockquote>
                                          </div>
                                        </blockquote>
                                      </div>
                                      <br>
                                    </div>
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                            <br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                </div>
              </div>
            </div>
            <br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>