[Gluster-users] Quota issue
Geoffrey Letessier
geoffrey.letessier at cnrs.fr
Mon Jun 8 13:41:35 UTC 2015
In addition, I notice a very large difference between the sum of du over the bricks and what "quota list" displays, as you can see below:
[root at lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/amyloid_team
cl-storage1: 1,6T /export/brick_home/brick1/amyloid_team
cl-storage3: 1,6T /export/brick_home/brick1/amyloid_team
cl-storage1: 1,6T /export/brick_home/brick2/amyloid_team
cl-storage3: 1,6T /export/brick_home/brick2/amyloid_team
[root at lucifer ~]# gluster volume quota vol_home list /amyloid_team
Path Hard-limit Soft-limit Used Available
--------------------------------------------------------------------------------
/amyloid_team 9.0TB 90% 7.8TB 1.2TB
As you can see, the sum over all bricks gives me roughly 6.4TB while "quota list" reports around 7.8TB, so there is a difference of 1.4TB that I am not able to explain. Do you have any idea?
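For reference, the arithmetic above can be reproduced with a minimal sketch like the one below. It only uses the rounded values printed by du and "quota list" above (du prints a decimal comma because of my locale), so the result is approximate; sum_tb is just a hypothetical helper name, not a gluster command.

```shell
#!/bin/sh
# Sketch only: the sizes are copied by hand from the du / "quota list" output above.
# sum_tb is a hypothetical helper that adds sizes written like "1,6T".

sum_tb() {
    # Turn the decimal comma into a point, drop the trailing unit, add everything up.
    printf '%s\n' "$@" | tr ',' '.' \
        | LC_ALL=C awk '{ sub(/T$/, ""); total += $1 } END { printf "%.1f\n", total }'
}

# Four bricks (brick1 and brick2 on cl-storage1 and cl-storage3), 1,6T each.
brick_total=$(sum_tb 1,6T 1,6T 1,6T 1,6T)

# "Used" column reported by: gluster volume quota vol_home list /amyloid_team
quota_used=7.8

gap=$(LC_ALL=C awk -v q="$quota_used" -v b="$brick_total" 'BEGIN { printf "%.1f\n", q - b }')

echo "bricks: ${brick_total}TB, quota: ${quota_used}TB, gap: ${gap}TB"
```

With the figures above this prints a brick total of 6.4TB and a gap of 1.4TB.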
Thanks,
Geoffrey
------------------------------------------------------
Geoffrey Letessier
IT manager & systems engineer
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
> On 8 June 2015 at 14:30, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>
> Hello,
>
> With GlusterFS 3.5.3, I ran into a strange issue this morning when writing a file while the quota is exceeded.
>
> One member of my lab, whose quota was exceeded (without her knowing it), tried to modify a file but, because of the exceeded quota, she was unable to do so and decided to exit vi. Now her file is empty, as you can see below:
> pdsh at lucifer: cl-storage3: ssh exited with exit code 2
> cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
> cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0 8 juin 12:38 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
>
> In addition, since my volume is a distributed volume on top of replicas (cl-storage[1,3] are replicated only on cl-storage[2,4]), I don't understand why I have 2 "identical" files (same full path) on 2 different bricks, as you can see above.
>
> Thanks in advance for your help and clarification.
> Geoffrey
>> On 2 June 2015 at 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> wrote:
>>
>> Hi Ben,
>>
>> I just checked my messages log files, on both client and server, and I don't find any hung tasks like the ones you noticed on yours.
>>
>> As you can see below, I don't observe the performance issue with a simple dd, but I think my issue concerns sets of small files (tens of thousands or more).
>>
>> [root at nisus test]# ddt -t 10g /mnt/test/
>> Writing to /mnt/test/ddt.8362 ... syncing ... done.
>> sleeping 10 seconds ... done.
>> Reading from /mnt/test/ddt.8362 ... done.
>> 10240MiB KiB/s CPU%
>> Write 114770 4
>> Read 40675 4
>>
>> For info: /mnt/test is the single (v2) GlusterFS volume.
>>
>> [root at nisus test]# ddt -t 10g /mnt/fhgfs/
>> Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
>> sleeping 10 seconds ... done.
>> Reading from /mnt/fhgfs/ddt.8380 ... done.
>> 10240MiB KiB/s CPU%
>> Write 102591 1
>> Read 98079 2
>>
>> Do you have any idea how to tune/optimize the performance settings and/or the TCP settings (MTU, etc.)?
>>
>> ---------------------------------------------------------------
>> | | UNTAR | DU | FIND | TAR | RM |
>> ---------------------------------------------------------------
>> | single | ~3m45s | ~43s | ~47s | ~3m10s | ~3m15s |
>> ---------------------------------------------------------------
>> | replicated | ~5m10s | ~59s | ~1m6s | ~1m19s | ~1m49s |
>> ---------------------------------------------------------------
>> | distributed | ~4m18s | ~41s | ~57s | ~2m24s | ~1m38s |
>> ---------------------------------------------------------------
>> | dist-repl | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>> ---------------------------------------------------------------
>> | native FS | ~11s | ~4s | ~2s | ~56s | ~10s |
>> ---------------------------------------------------------------
>> | BeeGFS | ~3m43s | ~15s | ~3s | ~1m33s | ~46s |
>> ---------------------------------------------------------------
>> | single (v2) | ~3m6s | ~14s | ~32s | ~1m2s | ~44s |
>> ---------------------------------------------------------------
>> For info:
>> - BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 servers)
>> - single (v2): simple gluster volume with default settings
>>
>> I also note that I get the same tar/untar performance issue with FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks OK.
>>
>> Thank you very much for your reply and help.
>> Geoffrey
>> On 2 June 2015 at 21:53, Ben Turner <bturner at redhat.com> wrote:
>>
>>> I am seeing problems on 3.7 as well. Can you check /var/log/messages on both the clients and servers for hung tasks like:
>>>
>>> Jun 2 15:23:14 gqac006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> Jun 2 15:23:14 gqac006 kernel: iozone D 0000000000000001 0 21999 1 0x00000080
>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321cc8 0000000000000082 ffff880611321c18 ffffffffa027236e
>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321c48 ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
>>> Jun 2 15:23:14 gqac006 kernel: ffff88052bd1e0f0 ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
>>> Jun 2 15:23:14 gqac006 kernel: Call Trace:
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ? rpc_execute+0x50/0xa0 [sunrpc]
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ? sync_page+0x0/0x50
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>] io_schedule+0x73/0xc0
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112430d>] sync_page+0x3d/0x50
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>] __wait_on_bit+0x5f/0x90
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124543>] wait_on_page_bit+0x73/0x80
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ? wake_bit_function+0x0/0x50
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ? pagevec_lookup_tag+0x25/0x40
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112496b>] wait_on_page_writeback_range+0xfb/0x190
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124b38>] filemap_write_and_wait_range+0x78/0x90
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>] vfs_fsync_range+0x7e/0x100
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>] vfs_fsync+0x1d/0x20
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>] do_fsync+0x3e/0x60
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c0950>] sys_fsync+0x10/0x20
>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
>>>
>>> Do you see a perf problem with just a simple DD or do you need a more complex workload to hit the issue? I think I saw an issue with metadata performance that I am trying to run down, let me know if you can see the problem with simple DD reads / writes or if we need to do some sort of dir / metadata access as well.
>>>
>>> -b
>>>
>>> ----- Original Message -----
>>>> From: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>> Cc: gluster-users at gluster.org
>>>> Sent: Tuesday, June 2, 2015 8:09:04 AM
>>>> Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
>>>>
>>>> Hi Pranith,
>>>>
>>>> I'm sorry, but I cannot give you any comparison, because it would be
>>>> distorted by the fact that, on my production HPC cluster, the network is
>>>> InfiniBand QDR and the volumes are quite different (bricks on RAID6
>>>> (12x2TB), 2 bricks per server and 4 servers in my pool).
>>>>
>>>> Regarding your request, you can find all the requested results attached,
>>>> hoping they help you solve this serious performance issue (maybe I need
>>>> to play with glusterfs parameters?).
>>>>
>>>> Thank you very much in advance,
>>>> Geoffrey
>>>> On 2 June 2015 at 10:09, Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
>>>>
>>>> Hi Geoffrey,
>>>> Since you are saying it happens on all types of volumes, let's do the
>>>> following:
>>>> 1) Create a dist-repl volume.
>>>> 2) Set the options etc. you need.
>>>> 3) Enable gluster volume profile using "gluster volume profile <volname>
>>>> start".
>>>> 4) Run the workload.
>>>> 5) Give the output of "gluster volume profile <volname> info".
>>>>
>>>> Repeat the steps above on both the new version and the old version you are
>>>> comparing it with. That should give us insight into what could be causing
>>>> the slowness.
>>>>
>>>> Pranith
>>>> On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:
>>>>
>>>>
>>>> Dear all,
>>>>
>>>> I have a crash-test cluster where I've tested the new version of GlusterFS
>>>> (v3.7) before upgrading my production HPC cluster.
>>>> But all my tests show very poor performance.
>>>>
>>>> For my benchmarks, as you can see below, I run a few operations (untar, du,
>>>> find, tar, rm) on the Linux kernel sources, dropping caches each time, on
>>>> distributed, replicated, distributed-replicated and single (single-brick)
>>>> volumes, and on the native FS of one brick.
>>>>
>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
>>>> # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>> # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/|wc -l; echo 3 > /proc/sys/vm/drop_caches)
>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>> # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>
>>>> And here are the process times:
>>>>
>>>> ---------------------------------------------------------------
>>>> | | UNTAR | DU | FIND | TAR | RM |
>>>> ---------------------------------------------------------------
>>>> | single | ~3m45s | ~43s | ~47s | ~3m10s | ~3m15s |
>>>> ---------------------------------------------------------------
>>>> | replicated | ~5m10s | ~59s | ~1m6s | ~1m19s | ~1m49s |
>>>> ---------------------------------------------------------------
>>>> | distributed | ~4m18s | ~41s | ~57s | ~2m24s | ~1m38s |
>>>> ---------------------------------------------------------------
>>>> | dist-repl | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>>>> ---------------------------------------------------------------
>>>> | native FS | ~11s | ~4s | ~2s | ~56s | ~10s |
>>>> ---------------------------------------------------------------
>>>>
>>>> I get the same results whether I use the default configuration or custom
>>>> configurations.
>>>>
>>>> Looking at the ifstat output, I note that my write I/O never exceeds
>>>> 3MBs...
>>>>
>>>> The native ext4 FS seems to be faster (roughly 15-20%, no more) than the
>>>> XFS one.
>>>>
>>>> My [test] storage cluster is composed of 2 identical servers (dual-CPU
>>>> Intel Xeon X5355, 8GB of RAM, 2x2TB HDD (no RAID) and Gb Ethernet).
>>>>
>>>> My volume settings:
>>>> single: 1server 1 brick
>>>> replicated: 2 servers 1 brick each
>>>> distributed: 2 servers 2 bricks each
>>>> dist-repl: 2 bricks in the same server and replica 2
>>>>
>>>> Everything seems to be OK in the gluster status command output.
>>>>
>>>> Do you have any idea why I get such bad results?
>>>> Thanks in advance.
>>>> Geoffrey