<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jul 27, 2018 at 11:11 AM, Hu Bert <span dir="ltr">&lt;<a href="mailto:revirii@googlemail.com" target="_blank">revirii@googlemail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Good Morning :-)<br>

<br>

on server gluster11 about 1.25 million and on gluster13 about 1.35<br>

million log entries in glustershd.log file. About 70 GB got healed,<br>

overall ~700GB of 2.0TB. Doesn&#39;t seem to run faster. I&#39;m calling<br>

&#39;find...&#39; whenever i notice that it has finished. Hmm... is it<br>

possible and reasonable to run 2 finds in parallel, maybe on different<br>

subdirectories? E.g. running one on $volume/public/ and one on<br>

$volume/private/ ?<br></blockquote><div><br></div><div>Are all of the 190000 directories already created? If not, could you find out which paths still need it and stat them directly instead of running find?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="HOEnZb"><div class="h5"><br>

2018-07-26 11:29 GMT+02:00 Pranith Kumar Karampuri &lt;<a href="mailto:pkarampu@redhat.com">pkarampu@redhat.com</a>&gt;:<br>

&gt;<br>

&gt;<br>

&gt; On Thu, Jul 26, 2018 at 2:41 PM, Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt; wrote:<br>

&gt;&gt;<br>

&gt;&gt; &gt; Sorry, bad copy/paste :-(.<br>

&gt;&gt;<br>

&gt;&gt; np :-)<br>

&gt;&gt;<br>

&gt;&gt; The question regarding version 4.1 was meant more generally: does<br>

&gt;&gt; gluster v4.0 etc. have a better performance than version 3.12 etc.?<br>

&gt;&gt; Just curious :-) Sooner or later we have to upgrade anyway.<br>

&gt;<br>

&gt;<br>

&gt; You can check what changed @<br>

&gt; <a href="https://github.com/gluster/glusterfs/blob/release-4.0/doc/release-notes/4.0.0.md#performance" rel="noreferrer" target="_blank">https://github.com/gluster/<wbr>glusterfs/blob/release-4.0/<wbr>doc/release-notes/4.0.0.md#<wbr>performance</a><br>

&gt; <a href="https://github.com/gluster/glusterfs/blob/release-4.1/doc/release-notes/4.1.0.md#performance" rel="noreferrer" target="_blank">https://github.com/gluster/<wbr>glusterfs/blob/release-4.1/<wbr>doc/release-notes/4.1.0.md#<wbr>performance</a><br>

&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; btw.: gluster12 was the node with the failed brick, and i started the<br>

&gt;&gt; full heal on this node (has the biggest uuid as well). Is it normal<br>

&gt;&gt; that the glustershd.log on this node is rather empty (some hundred<br>

&gt;&gt; entries), but the glustershd.log files on the 2 other nodes have<br>

&gt;&gt; hundreds of thousands of entries?<br>

&gt;<br>

&gt;<br>

&gt; heals happen on the good bricks, so this is expected.<br>

&gt;<br>

&gt;&gt;<br>

&gt;&gt; (sry, mail twice, didn&#39;t go to the list, but maybe others are<br>

&gt;&gt; interested... :-) )<br>

&gt;&gt;<br>

&gt;&gt; 2018-07-26 10:17 GMT+02:00 Pranith Kumar Karampuri &lt;<a href="mailto:pkarampu@redhat.com">pkarampu@redhat.com</a>&gt;:<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; On Thu, Jul 26, 2018 at 12:59 PM, Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt;<br>

&gt;&gt; &gt; wrote:<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; Hi Pranith,<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; thanks a lot for your efforts and for tracking &quot;my&quot; problem with an<br>

&gt;&gt; &gt;&gt; issue.<br>

&gt;&gt; &gt;&gt; :-)<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; I&#39;ve set these params on the gluster volume and will start the<br>

&gt;&gt; &gt;&gt; &#39;find...&#39; command within a short time. I&#39;ll probably add another<br>

&gt;&gt; &gt;&gt; answer to the list to document the progress.<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; btw. - you had some typos:<br>

&gt;&gt; &gt;&gt; gluster volume set &lt;volname&gt; cluster.cluster.heal-wait-<wbr>queue-length<br>

&gt;&gt; &gt;&gt; 10000 =&gt; cluster is doubled<br>

&gt;&gt; &gt;&gt; gluster volume set &lt;volname&gt; cluster.data-self-heal-window-<wbr>size 16 =&gt;<br>

&gt;&gt; &gt;&gt; it&#39;s actually cluster.self-heal-window-size<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; but actually no problem :-)<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; Sorry, bad copy/paste :-(.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; Just curious: would gluster 4.1 improve the performance for healing<br>

&gt;&gt; &gt;&gt; and in general for &quot;my&quot; scenario?<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; No, this issue is present in all the existing releases. But it is<br>

&gt;&gt; &gt; solvable.<br>

&gt;&gt; &gt; You can follow that issue to see progress and when it is fixed etc.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; 2018-07-26 8:56 GMT+02:00 Pranith Kumar Karampuri<br>

&gt;&gt; &gt;&gt; &lt;<a href="mailto:pkarampu@redhat.com">pkarampu@redhat.com</a>&gt;:<br>

&gt;&gt; &gt;&gt; &gt; Thanks a lot for the detailed write-up; it helps find the bottlenecks<br>

&gt;&gt; &gt;&gt; &gt; easily.<br>

&gt;&gt; &gt;&gt; &gt; At a high level, to handle this directory hierarchy, i.e. lots of<br>

&gt;&gt; &gt;&gt; &gt; directories with files, we need to improve the healing algorithms.<br>

&gt;&gt; &gt;&gt; &gt; Based on the data you provided, we need to make the following<br>

&gt;&gt; &gt;&gt; &gt; enhancements:<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; 1) At the moment directories are healed one at a time, but up to 64<br>

&gt;&gt; &gt;&gt; &gt; files can be healed in parallel per replica subvolume.<br>

&gt;&gt; &gt;&gt; &gt; So if you have n x 2 or n x 3 distributed subvolumes, it can heal 64n<br>

&gt;&gt; &gt;&gt; &gt; files in parallel.<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; I raised <a href="https://github.com/gluster/glusterfs/issues/477" rel="noreferrer" target="_blank">https://github.com/gluster/<wbr>glusterfs/issues/477</a> to track<br>

&gt;&gt; &gt;&gt; &gt; this. In the meantime you can use the following workaround:<br>

&gt;&gt; &gt;&gt; &gt; a) Increase background heals on the mount:<br>

&gt;&gt; &gt;&gt; &gt; gluster volume set &lt;volname&gt; cluster.background-self-heal-<wbr>count 256<br>

&gt;&gt; &gt;&gt; &gt; gluster volume set &lt;volname&gt; cluster.cluster.heal-wait-<wbr>queue-length<br>

&gt;&gt; &gt;&gt; &gt; 10000<br>

&gt;&gt; &gt;&gt; &gt; find &lt;mnt&gt; -type d | xargs stat<br>

&gt;&gt; &gt;&gt; &gt;<br>

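&gt;&gt; &gt;&gt; &gt; One &#39;find&#39; will trigger heals for at most 10256 directories (256<br>

&gt;&gt; &gt;&gt; &gt; in the background plus 10000 in the wait queue), so you may have to<br>

&gt;&gt; &gt;&gt; &gt; repeat this periodically until all directories are healed.<br>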

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; 2) Self-heal heals a file 128KB at a time (data-self-heal-window-<wbr>size).<br>

&gt;&gt; &gt;&gt; &gt; I think for your environment bumping it up to MBs is better. Say 2MB,<br>

&gt;&gt; &gt;&gt; &gt; i.e. 16*128KB?<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; Command to do that is:<br>

&gt;&gt; &gt;&gt; &gt; gluster volume set &lt;volname&gt; cluster.data-self-heal-window-<wbr>size 16<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt; On Thu, Jul 26, 2018 at 10:40 AM, Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt;<br>

&gt;&gt; &gt;&gt; &gt; wrote:<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Hi Pranith,<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Sry, it took a while to count the directories. I&#39;ll try to answer<br>

&gt;&gt; &gt;&gt; &gt;&gt; your questions as well as possible.<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What kind of data do you have?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; How many directories in the filesystem?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; On average how many files per directory?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What is the depth of your directory hierarchy on average?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What is average filesize?<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; We have mostly images (more than 95% of disk usage, 90% of file<br>

&gt;&gt; &gt;&gt; &gt;&gt; count), some text files (like css, jsp, gpx etc.) and some binaries.<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; There are about 190,000 directories in the file system; maybe there<br>

&gt;&gt; &gt;&gt; &gt;&gt; are some more because we&#39;re hit by bug 1512371 (parallel-readdir =<br>

&gt;&gt; &gt;&gt; &gt;&gt; TRUE prevents directory listing). But the number of directories<br>

&gt;&gt; &gt;&gt; &gt;&gt; could/will rise in the future (maybe millions).<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Files per directory: ranges from 0 to 100; on average it should be<br>

&gt;&gt; &gt;&gt; &gt;&gt; 20 files per directory (well, at least in the deepest dirs, see<br>

&gt;&gt; &gt;&gt; &gt;&gt; explanation below).<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Average filesize: ranges from a few hundred bytes up to 30 MB, on<br>

&gt;&gt; &gt;&gt; &gt;&gt; average it should be 2-3 MB.<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Directory hierarchy: maximum depth as seen from within the volume is<br>

&gt;&gt; &gt;&gt; &gt;&gt; 6, the average should be 3.<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; volume name: shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; mount point on clients: /data/repository/shared/<br>

&gt;&gt; &gt;&gt; &gt;&gt; below /shared/ there are 2 directories:<br>

&gt;&gt; &gt;&gt; &gt;&gt; - public/: mainly calculated images (file sizes from a few KB up to<br>

&gt;&gt; &gt;&gt; &gt;&gt; max 1 MB) and some resources (small PNGs with a size of a few hundred<br>

&gt;&gt; &gt;&gt; &gt;&gt; bytes).<br>

&gt;&gt; &gt;&gt; &gt;&gt; - private/: mainly source images; file sizes from 50 KB up to 30MB<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; We migrated from an NFS server (SPOF) to glusterfs and simply copied<br>

&gt;&gt; &gt;&gt; &gt;&gt; our files. The images (which have an ID) are stored in the deepest<br>

&gt;&gt; &gt;&gt; &gt;&gt; directories of the dir tree. I&#39;d better explain it :-)<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; Directory structure for the images (I&#39;ll omit some other<br>

&gt;&gt; &gt;&gt; &gt;&gt; miscellaneous stuff, but it looks quite similar):<br>

&gt;&gt; &gt;&gt; &gt;&gt; - ID of an image has 7 or 8 digits<br>

&gt;&gt; &gt;&gt; &gt;&gt; - /shared/private/: /(first 3 digits of ID)/(next 3 digits of<br>

&gt;&gt; &gt;&gt; &gt;&gt; ID)/$ID.jpg<br>

&gt;&gt; &gt;&gt; &gt;&gt; - /shared/public/: /(first 3 digits of ID)/(next 3 digits of<br>

&gt;&gt; &gt;&gt; &gt;&gt; ID)/$ID/$misc_formats.jpg<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; That&#39;s why we have that many (sub-)directories. Files are only<br>

&gt;&gt; &gt;&gt; &gt;&gt; stored at the lowest level of the directory hierarchy. I hope this<br>

&gt;&gt; &gt;&gt; &gt;&gt; makes our structure at least a bit more transparent.<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; I hope there&#39;s something we can do to raise performance a bit. Thx<br>

&gt;&gt; &gt;&gt; &gt;&gt; in advance :-)<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; 2018-07-24 10:40 GMT+02:00 Pranith Kumar Karampuri<br>

&gt;&gt; &gt;&gt; &gt;&gt; &lt;<a href="mailto:pkarampu@redhat.com">pkarampu@redhat.com</a>&gt;:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; On Mon, Jul 23, 2018 at 4:16 PM, Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; wrote:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Well, over the weekend about 200GB were copied, so now there are<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; ~400GB copied to the brick. That&#39;s far below a speed of 10GB per<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; hour. If I copied the 1.6 TB directly, that would be done within<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; 2 days at most. But with the self heal this will take at least<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; 20 days.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Why is the performance that bad? No chance of speeding this up?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What kind of data do you have?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; How many directories in the filesystem?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; On average how many files per directory?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What is the depth of your directory hierarchy on average?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; What is average filesize?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; Based on this data we can see if anything can be improved, or if<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; there are some enhancements that need to be implemented in gluster<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; to address this kind of data layout.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; 2018-07-20 9:41 GMT+02:00 Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt;:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; hmm... no one any idea?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; Additional question: the hdd on server gluster12 was changed;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; so far ~220 GB were copied. On the other 2 servers I see a lot of<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; entries in glustershd.log, about 312,000 and 336,000 respectively<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; as of yesterday, most of them (current log output) looking like this:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:30:49.757595] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-common.c:1724:<wbr>afr_log_selfheal]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; Completed data selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0d863a62-0dd8-401c-b699-<wbr>2b642d9fd2b6.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; sources=0 [2]  sinks=1<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:30:49.992398] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-metadata.c:52:_<wbr>_afr_selfheal_metadata_do]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3: performing metadata selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0d863a62-0dd8-401c-b699-<wbr>2b642d9fd2b6<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:30:50.243551] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-common.c:1724:<wbr>afr_log_selfheal]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; Completed metadata selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0d863a62-0dd8-401c-b699-<wbr>2b642d9fd2b6.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; sources=0 [2]  sinks=1<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; or like this:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:38:41.726943] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-metadata.c:52:_<wbr>_afr_selfheal_metadata_do]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3: performing metadata selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 9276097a-cdac-4d12-9dc6-<wbr>04b1ea4458ba<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:38:41.855737] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-common.c:1724:<wbr>afr_log_selfheal]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; Completed metadata selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 9276097a-cdac-4d12-9dc6-<wbr>04b1ea4458ba.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; sources=[0] 2  sinks=1<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [2018-07-20 07:38:44.755800] I [MSGID: 108026]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; [afr-self-heal-entry.c:887:<wbr>afr_selfheal_entry_do]<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 0-shared-replicate-3: performing entry selfheal on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 9276097a-cdac-4d12-9dc6-<wbr>04b1ea4458ba<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; Is this behaviour normal? I&#39;d expect these messages on the<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; server with the failed brick, not on the other ones.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt; 2018-07-19 8:31 GMT+02:00 Hu Bert &lt;<a href="mailto:revirii@googlemail.com">revirii@googlemail.com</a>&gt;:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Hi there,<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Sent this mail yesterday, but somehow it didn&#39;t work? It wasn&#39;t<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; archived, so please be indulgent if you receive this mail again :-)<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; We are currently running a replicate setup and are experiencing<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; quite poor performance. It got even worse when 2 bricks (disks)<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; crashed within a couple of weeks. Some general information about<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; our setup:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; 3 Dell PowerEdge R530 (Xeon E5-1650 v3 Hexa-Core, 64 GB DDR4, OS on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; separate disks); each server has 4 10TB disks -&gt; each is a brick;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; replica 3 setup (see gluster volume status below). Debian stretch,<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; kernel 4.9.0, gluster version 3.12.12. Servers and clients are<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; connected via 10 GBit ethernet.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; About a month ago and 2 days ago a disk died (on different servers);<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; the disks were replaced and brought back into the volume, and a full<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; self heal was started. But the speed of this is quite... disappointing.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Each brick has ~1.6TB of data on it (mostly the infamous small files).<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; The full heal I started yesterday copied only ~50GB within 24 hours<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; (48 hours: about 100GB); at this rate it would take weeks until the<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; self heal finishes.<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; After the first heal (started on gluster13 about a month ago, took<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; about 3 weeks) finished, we had terrible performance; CPU on one or<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; two of the nodes (gluster11, gluster12) was up to 1200%, consumed by<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; the brick process of the formerly crashed brick (bricksdd1),<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; interestingly not on the server with the failed disk, but on the<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; other 2...<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Well... am I doing something wrong? Some options wrongly configured?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Terrible setup? Anyone got an idea? Any additional information needed?<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Thx in advance :-)<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; gluster volume status<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Volume Name: shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Type: Distributed-Replicate<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Volume ID: e879d208-1d8c-4089-85f3-<wbr>ef1b3aa45d36<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Status: Started<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Snapshot Count: 0<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Number of Bricks: 4 x 3 = 12<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Transport-type: tcp<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Bricks:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick1: gluster11:/gluster/bricksda1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick2: gluster12:/gluster/bricksda1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick3: gluster13:/gluster/bricksda1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick4: gluster11:/gluster/bricksdb1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick5: gluster12:/gluster/bricksdb1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick6: gluster13:/gluster/bricksdb1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick7: gluster11:/gluster/bricksdc1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick8: gluster12:/gluster/bricksdc1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick9: gluster13:/gluster/bricksdc1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick10: gluster11:/gluster/bricksdd1/<wbr>shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick11: gluster12:/gluster/bricksdd1_<wbr>new/shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Brick12: gluster13:/gluster/bricksdd1_<wbr>new/shared<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Options Reconfigured:<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; cluster.shd-max-threads: 4<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.md-cache-timeout: 60<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; cluster.lookup-optimize: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; cluster.readdir-optimize: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.cache-refresh-<wbr>timeout: 4<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.parallel-readdir: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; server.event-threads: 8<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; client.event-threads: 8<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.cache-max-file-<wbr>size: 128MB<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.write-behind-<wbr>window-size: 16MB<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.io-thread-count: 64<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; cluster.min-free-disk: 1%<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.cache-size: 24GB<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; nfs.disable: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; transport.address-family: inet<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.high-prio-threads: 32<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.normal-prio-<wbr>threads: 32<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.low-prio-threads: 32<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.least-prio-<wbr>threads: 8<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.io-cache: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; server.allow-insecure: on<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; performance.strict-o-direct: off<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; transport.listen-backlog: 100<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; server.outstanding-rpc-limit: 128<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; ______________________________<wbr>_________________<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; Gluster-users mailing list<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;&gt; <a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt;<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; --<br>

&gt;&gt; &gt;&gt; &gt;&gt; &gt; Pranith<br>


</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div>

</div></div>
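<div dir="ltr">The suggestion above, to stat the affected paths directly and possibly to work on public/ and private/ at the same time, can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested healing tool: it assumes that a plain stat on the client mount triggers the heal in the same way as the `find &lt;mnt&gt; -type d | xargs stat` workaround quoted in the thread. The function names and the worker count of 8 are made up for this example; the mount point and the two subdirectories are the ones mentioned in the thread.</div>
<pre>
```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_directories(root):
    """Return every directory below root, including root itself."""
    found = []
    for current, _subdirs, _files in os.walk(root):
        found.append(current)
    return found


def stat_directories(paths, workers=8):
    """stat() each path from a thread pool; return how many calls succeeded."""
    def stat_one(path):
        try:
            os.stat(path)
            return 1
        except OSError:
            # A path that vanished or cannot be reached is skipped, not fatal.
            return 0

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(stat_one, paths))


if __name__ == "__main__":
    # Mount point and subtrees as described in the thread; adjust as needed.
    mount = "/data/repository/shared"
    for subtree in ("public", "private"):
        dirs = list_directories(os.path.join(mount, subtree))
        print(subtree, stat_directories(dirs), "of", len(dirs), "directories stat'ed")
```
</pre>
<div dir="ltr">If the list of paths that still need healing is already known, it can be passed straight to stat_directories() and the directory walk can be skipped. Whether more workers actually speed up healing depends on the background heal count and wait queue length set on the volume, so the worker count here is a guess rather than a recommendation.</div>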