<div dir="ltr"><div>Hello Rafi,</div><div><br></div><div>I tried to set the xattr via</div><div><br></div><div>setfattr -n trusted.io-stats-dump -v '/tmp/iostat.log' /gluster/repositories/repo1/</div><div><br></div><div>but it had no effect. There is no such a xattr via getfattr and no logfile. The command setxattr is not available. What I am doing wrong?</div><div>By the way, you mean to increase the inode size of xfs layer from 512 Bytes to 1024KB(!)? I think it should be 1024 Bytes because 2048 Bytes is the maximum</div><div><br></div><div>Regards</div><div>David<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Am Mi., 6. Nov. 2019 um 04:10 Uhr schrieb RAFI KC <<a href="mailto:rkavunga@redhat.com">rkavunga@redhat.com</a>>:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>I will take a look at the profile info you shared. Since there is a
huge difference in the performance numbers between fuse and Samba,
it would be great if we could also get the profile info of fuse (on v7).
This will help to compare the number of calls for each fop. There
are probably some fops that Samba repeats, and we can find them by
comparing with fuse.</p>
<p>Also, if possible, can you please get the client profile info from the
fuse mount using the command `setxattr -n trusted.io-stats-dump -v
<logfile, e.g. /tmp/iostat.log> <fuse mount point, e.g. /mnt/fuse>`.</p>
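<p>Roughly like this, assuming a FUSE mount at /mnt/fuse (adjust the mount point
to your setup). Note that the user-space tool is setfattr (setxattr is the name
of the underlying syscall), and that trusted.io-stats-dump is a virtual xattr
intercepted by the io-stats translator, so getfattr will not list it:</p>
<pre>
# mount the volume via FUSE (hypothetical mount point)
mount -t glusterfs fs-dl380-c1-n1:/archive1 /mnt/fuse

# trigger a client-side io-stats dump; the value names the output file
# written on the client node
setfattr -n trusted.io-stats-dump -v /tmp/iostat.log /mnt/fuse

# if the dump comes out mostly empty, latency measurement may need to be
# enabled on the volume first:
# gluster volume set archive1 diagnostics.latency-measurement on
# gluster volume set archive1 diagnostics.count-fop-hits on
</pre>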
<p><br>
</p>
<p>Regards</p>
<p>Rafi KC<br>
</p>
<br>
<div>On 11/5/19 11:05 PM, David Spisla
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I did the test with Gluster 7.0 and ctime disabled. But it had
no effect:</div>
<div>
<div>(All values in MiB/s)<br>
</div>
<div>
64KiB 1MiB 10MiB</div>
<div>0,16 2,60 54,74</div>
<div><br>
</div>
<div>Attached you will now find the complete profile file, also with
the results from the last test. I will not repeat it with a
higher inode size because I don't think this will have an
effect.</div>
<div>There must be another cause for the low performance.<br>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>Yes. No need to try with a higher inode size.<br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div>
<div><br>
</div>
<div>Regards</div>
<div>David Spisla<br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">Am Di., 5. Nov. 2019 um
16:25 Uhr schrieb David Spisla <<a href="mailto:spisla80@gmail.com" target="_blank">spisla80@gmail.com</a>>:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">Am Di., 5. Nov. 2019 um
12:06 Uhr schrieb RAFI KC <<a href="mailto:rkavunga@redhat.com" target="_blank">rkavunga@redhat.com</a>>:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p><br>
</p>
<div>On 11/4/19 8:46 PM, David Spisla wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Dear Gluster Community,</div>
<div><br>
</div>
<div>I also have an issue concerning
performance. Over the last few days I updated our
test cluster from GlusterFS v5.5 to v7.0.
The setup in general:</div>
<div><br>
</div>
<div>2 HP DL380 servers with 10Gbit NICs, 1
Distributed-Replicate 2 volume with 2 replica
pairs. The client is Samba (access via
vfs_glusterfs). I did several tests to
ensure that Samba itself does not cause the drop.</div>
<div>The setup is completely the same except for
the Gluster version.<br>
</div>
<div>Here are my results:</div>
<div>64KiB 1MiB 10MiB
(Filesize)</div>
<div>
<div>
<div>3,49 47,41
300,50 (Values in MiB/s with
GlusterFS v5.5) <br>
</div>
<div>0,16 2,61
76,63 (Values in MiB/s with
GlusterFS v7.0) <br>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p>Can you please share the profile information [1]
for both versions? Also, it would be really helpful
if you could mention the IO patterns that were used for
these tests.<br>
</p>
<p>[1] :
<a href="https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/" target="_blank">https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/</a></p>
</div>
</blockquote>
<div>Hello Rafi,<br>
</div>
<div>thank you for your help.</div>
<div> <br>
</div>
<div>* First, more information about the IO patterns: As a
client we use a DL360 Windows Server 2017 machine with a
10Gbit NIC connected to the storage machines. The share
is mounted via SMB and the tests write with fio.
We use these job files (see attachment). Each job file
is executed separately and there is a sleep of about
60s between each test run to calm down the system before
starting a new test.</div>
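<div>For illustration only (the real job files are in the attachment), a
hypothetical job of this kind for the 64KiB case might look like the following;
the drive letter, file count and IO engine are placeholders, not the actual
settings used:</div>
<pre>
; hypothetical fio job file, not the attached one
; drive letter and path are placeholders for the mounted SMB share
[seq-write-64k]
ioengine=windowsaio
directory=Z\:\fiotest
rw=write
bs=64k
filesize=64k
nrfiles=1024
openfiles=1
direct=1
</pre>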
<div><br>
</div>
<div>* Attached below you will find the profile output from the
tests with v5.5 (ctime enabled) and v7.0 (ctime enabled).</div>
<div><br>
</div>
<div>
<div>* Besides the tests with Samba I also did some
fio tests directly on the FUSE mounts (locally on one
of the storage nodes). The results show that there is
only a small decrease in performance between v5.5 and
v7.0<br>
</div>
<div>(All values in MiB/s)<br>
</div>
<div>
64KiB 1MiB 10MiB<br>
50,09 679,96 1023,02 (v5.5)<br>
47,00 656,46 977,60 (v7.0)<br>
</div>
<div><br>
</div>
<div>It seems that the combination of Samba +
Gluster 7.0 has a lot of problems, doesn't it?</div>
</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>
<div>
<div><br>
</div>
<div>We use this volume options (GlusterFS
7.0):</div>
<div><br>
</div>
<div>Volume Name: archive1<br>
Type: Distributed-Replicate<br>
Volume ID:
44c17844-0bd4-4ca2-98d8-a1474add790c<br>
Status: Started<br>
Snapshot Count: 0<br>
Number of Bricks: 2 x 2 = 4<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1:
fs-dl380-c1-n1:/gluster/brick1/glusterbrick<br>
Brick2:
fs-dl380-c1-n2:/gluster/brick1/glusterbrick<br>
Brick3:
fs-dl380-c1-n1:/gluster/brick2/glusterbrick<br>
Brick4:
fs-dl380-c1-n2:/gluster/brick2/glusterbrick<br>
Options Reconfigured:<br>
performance.client-io-threads: off<br>
nfs.disable: on<br>
storage.fips-mode-rchecksum: on<br>
transport.address-family: inet<br>
user.smb: disable<br>
features.read-only: off<br>
features.worm: off<br>
features.worm-file-level: on<br>
features.retention-mode: enterprise<br>
features.default-retention-period: 120<br>
network.ping-timeout: 10<br>
features.cache-invalidation: on<br>
features.cache-invalidation-timeout: 600<br>
performance.nl-cache: on<br>
performance.nl-cache-timeout: 600<br>
client.event-threads: 32<br>
server.event-threads: 32<br>
cluster.lookup-optimize: on<br>
performance.stat-prefetch: on<br>
performance.cache-invalidation: on<br>
performance.md-cache-timeout: 600<br>
performance.cache-samba-metadata: on<br>
performance.cache-ima-xattrs: on<br>
performance.io-thread-count: 64<br>
cluster.use-compound-fops: on<br>
performance.cache-size: 512MB<br>
performance.cache-refresh-timeout: 10<br>
performance.read-ahead: off<br>
performance.write-behind-window-size:
4MB<br>
performance.write-behind: on<br>
storage.build-pgfid: on<br>
features.ctime: on<br>
cluster.quorum-type: fixed<br>
cluster.quorum-count: 1<br>
features.bitrot: on<br>
features.scrub: Active<br>
features.scrub-freq: daily<br>
</div>
</div>
</div>
<div><br>
</div>
<div>For GlusterFS 5.5 it is nearly the same,
except for the fact that there were 2 options needed to
enable the ctime feature.<br>
</div>
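<div>If I remember correctly, the two options were the utime and ctime ones,
i.e. something like the following (a sketch from memory; the exact option names
in the 5.x line may differ):</div>
<pre>
# GlusterFS 5.x: both options had to be set to get consistent time attributes
gluster volume set archive1 features.utime on
gluster volume set archive1 features.ctime on
</pre>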
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
<p><br>
Ctime stores additional metadata information as an
extended attribute, which sometimes exceeds the
default inode size. In such scenarios the additional
xattrs won't fit into the default size. This will
result in additional blocks being used to store the
xattrs outside the inode, which will affect the latency.
This depends purely on the I/O operations and the
total xattr size stored in the inode.<br>
<br>
Is it possible for you to repeat the test after
disabling ctime or increasing the inode size to a
higher value, say 1024KB?<br>
</p>
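<p>A rough sketch of both variants (the volume name is taken from this thread,
the device path is a placeholder, and mkfs.xfs takes the inode size in bytes;
reformatting a brick is destructive, so this is only an illustration):</p>
<pre>
# variant 1: disable ctime on the volume
gluster volume set archive1 features.ctime off

# variant 2 (wipes the brick, illustration only): recreate the brick
# filesystem with a larger inode size, given in bytes
mkfs.xfs -f -i size=1024 /dev/mapper/brick1
# then remount the brick and let self-heal rebuild its contents
</pre>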
</div>
</blockquote>
<div>I will do so, but today I could not finish the tests
with ctime disabled (or a higher inode value) because they
take a lot of time with v7.0 due to the low performance.
I will perform them tomorrow and give you the results as
soon as possible.</div>
<div>By the way: Do you really mean an inode size of 1024KB
on the XFS layer? Or do you mean 1024 bytes? We use
512 bytes per default, because this has been the recommended
size until now. But it seems there is a need for a new
recommendation when using the ctime feature by default. I
cannot imagine that this is the real cause for the low
performance, because in v5.5 we also use the ctime feature
with an inode size of 512 bytes.<br>
</div>
<div><br>
</div>
<div>Regards</div>
<div>David<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p> </p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">
<div dir="ltr">
<div>Our optimization for Samba looks like
this (for every version):</div>
<div><br>
</div>
<div>[global]<br>
workgroup = SAMBA<br>
netbios name = CLUSTER<br>
kernel share modes = no<br>
aio read size = 1<br>
aio write size = 1<br>
kernel oplocks = no<br>
max open files = 100000<br>
nt acl support = no<br>
security = user<br>
server min protocol = SMB2<br>
store dos attributes = no<br>
strict locking = no<br>
full_audit:failure = pwrite_send pwrite_recv
pwrite offload_write_send offload_write_recv
create_file open unlink connect disconnect
rename chown fchown lchown chmod fchmod
mkdir rmdir ntimes ftruncate fallocate <br>
full_audit:success = pwrite_send pwrite_recv
pwrite offload_write_send offload_write_recv
create_file open unlink connect disconnect
rename chown fchown lchown chmod fchmod
mkdir rmdir ntimes ftruncate fallocate <br>
full_audit:facility = local5<br>
durable handles = yes<br>
posix locking = no<br>
log level = 2<br>
max log size = 100000<br>
debug pid = yes<br>
</div>
<div><br>
</div>
<div>What can be the cause of this rapid
drop in performance for small files?
Are some of our volume options not recommended
anymore?<br>
</div>
<div>There were some patches concerning
performance for small files in v6.0 and v7.0:<br>
</div>
<div>
<p class="MsoNormal"><a href="https://bugzilla.redhat.com/1670031" target="_blank">#1670031</a>:
performance regression seen with smallfile workload tests</p>
</div>
<div>
<p><a href="https://bugzilla.redhat.com/1659327" target="_blank">#1659327</a>:
43% regression in small-file sequential read performance</p>
<p>And one patch for the io-cache:</p>
<p><a href="https://bugzilla.redhat.com/1659869" target="_blank">#1659869</a>:
improvements to io-cache</p>
<p>Regards</p>
<p>David Spisla<br>
</p>
</div>
</div>
</div>
</div>
<br>
<fieldset></fieldset>
<pre>________
Community Meeting Calendar:
APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: <a href="https://bluejeans.com/118564314" target="_blank">https://bluejeans.com/118564314</a>
NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: <a href="https://bluejeans.com/118564314" target="_blank">https://bluejeans.com/118564314</a>
Gluster-users mailing list
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a>
</pre>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote></div>