[Gluster-users] Quota issue
Vijaikumar M
vmallika at redhat.com
Wed Jun 10 04:12:30 UTC 2015
Hi Geoffrey,
Please grep the log files for 'ERROR'; sending only those lines would be
sufficient.
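A minimal sketch of that filtering step, assuming the quota-verify output was saved with tee as in the usage line quoted below and that the relevant lines contain the literal string ERROR. The file names and the stand-in data are hypothetical and only make the sketch runnable on its own:

```shell
# Path of a quota-verify log; pass the real one as the first argument.
log=${1:-}
if [ -z "$log" ]; then
    # No log given: create a small stand-in file (hypothetical content).
    log=$(mktemp)
    printf 'ok line\nERROR: sample mismatch\n' > "$log"
fi

# Keep only the lines containing ERROR. grep exits with status 1 when
# nothing matches, so '|| true' keeps the script going in that case.
errors="$log.errors"
grep 'ERROR' "$log" > "$errors" || true

# Compress the (much smaller) result before sending it.
gzip -f "$errors"
echo "filtered log written to $errors.gz"
```

Run once per brick log, this should reduce the multi-gigabyte outputs to something that can be attached to a mail.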
Thanks,
Vijay
On Wednesday 10 June 2015 04:38 AM, Geoffrey Letessier wrote:
> Hello Vijay,
>
> Quota-verify has been running for more than 10 hours, and the output
> files (4 files, because there are 4 bricks per replica) are huge:
> around 800MB per file on the first server and 5GB per file on the
> second. Do you still want them? How can I send them to you?
>
> Good night (from France)
> Geoffrey
> ------------------------------------------------------
> Geoffrey Letessier
> IT manager & systems engineer
> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
> Institut de Biologie Physico-Chimique
> 13, rue Pierre et Marie Curie - 75005 Paris
> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>
> On 9 June 2015 at 12:46, Vijaikumar M <vmallika at redhat.com> wrote:
>
>> Hi Geoffrey,
>>
>> The file content was deleted because of the vi editor's behaviour of
>> truncating the file when writing the updated content.
>>
>> Regarding the quota size/usage problem, can you please run the
>> attached script on each brick and send us the output it generates?
>> This will help us analyse why quota list is showing the wrong size.
>> The script crawls the directory given as argument. It collects the
>> quota "contri" and "size" extended attributes, and also the
>> "block size" from the stat call.
>>
>> Usage:
>>
>> ./quota-verify -b <brick_path> | tee brick_name.log
>>
>>
>> Thanks,
>> Vijay
>>
>>
>>
>> On Tuesday 09 June 2015 03:45 PM, Vijaikumar M wrote:
>>>
>>>
>>> On Tuesday 09 June 2015 03:40 PM, Geoffrey Letessier wrote:
>>>> Hi Vijay,
>>>>
>>>> Thanks for replying.
>>>>
>>>> Unfortunately, I checked every brick in my storage pool and didn't
>>>> find any backup file... too bad!
>>>
>>> Please check for a backup file on the client machine where the file
>>> was edited, and in the home dir of the user whose login was used to
>>> edit the file.
>>>
>>> Thanks,
>>> Vijay
>>>
>>>
>>>>
>>>> Thank you again!
>>>> Good luck and see you,
>>>> Geoffrey
>>>>
>>>>> On 9 June 2015 at 10:05, Vijaikumar M <vmallika at redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Tuesday 09 June 2015 01:08 PM, Geoffrey Letessier wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Yes of course:
>>>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] du -s /export/brick_home/brick*/amyloid_team
>>>>>> cl-storage1: 1608522280 /export/brick_home/brick1/amyloid_team
>>>>>> cl-storage3: 1619630616 /export/brick_home/brick1/amyloid_team
>>>>>> cl-storage1: 1614057836 /export/brick_home/brick2/amyloid_team
>>>>>> cl-storage3: 1602653808 /export/brick_home/brick2/amyloid_team
>>>>>>
>>>>>> The sum is 6444864540 (around 6.4-6.5TB), while the quota list
>>>>>> displays 7.7TB.
>>>>>> So the discrepancy is roughly 1.2-1.3TB, in other words around
>>>>>> 16%, which is far too large, no?
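The arithmetic can be re-checked with a short sketch. It assumes the du -s figures are in 1 KiB blocks (the GNU du default) and prints the total in both decimal and binary units; which of the two the quota list output uses is an assumption that should be checked before comparing the numbers:

```shell
# Per-brick figures copied from the du -s output quoted above
# (assumed to be 1 KiB blocks, the GNU du default).
total_kib=$((1608522280 + 1619630616 + 1614057836 + 1602653808))
echo "sum: $total_kib KiB"

# The same total expressed in decimal TB (10^12 bytes) and binary TiB (2^40 bytes).
awk -v k="$total_kib" 'BEGIN {
    printf "decimal: %.2f TB\n", k * 1024 / 1e12
    printf "binary:  %.2f TiB\n", k / (1024 * 1024 * 1024)
}'
```

The gap between the brick total and the quota figure differs depending on the unit, so stating the unit alongside each number avoids comparing mismatched values.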
>>>>>>
>>>>>> In addition, since the quota was exceeded, I have noticed a lot
>>>>>> of files like the following:
>>>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] "cd
>>>>>> /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/;
>>>>>> ls -ail remd_100.sh 2> /dev/null" 2>/dev/null
>>>>>> cl-storage3: 133325688 ---------T 2 tarus amyloid_team 0 16 févr. 10:20 remd_100.sh
>>>>>> Note the 'T' at the end of the permissions and the file size of 0B.
>>>>>>
>>>>>> Also, yesterday some files were duplicated, but they no longer
>>>>>> are...
>>>>>>
>>>>>> The worst part is that all these files were previously OK. In
>>>>>> other words, exceeding the quota caused file or content deletion
>>>>>> or corruption… What can I do to prevent this situation in the
>>>>>> future? I guess I cannot do anything to roll back this situation
>>>>>> now, right?
>>>>>>
>>>>>
>>>>> Hi Geoffrey,
>>>>>
>>>>> I tried re-creating the problem.
>>>>>
>>>>> Here is the behaviour of the vi editor.
>>>>> When a file is saved in vi, it creates a backup file under the
>>>>> home dir and opens the original file with the 'O_TRUNC' flag, and
>>>>> hence the file was truncated.
>>>>>
>>>>>
>>>>> Here is the strace of vi editor when it gets 'EDQUOT' error:
>>>>>
>>>>> open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 3
>>>>> write(3, "line one\nline two\n", 18) = 18
>>>>> fsync(3) = 0
>>>>> close(3) = -1 EDQUOT (Disk quota exceeded)
>>>>> chmod("hello", 0100644) = 0
>>>>> open("/root/hello~", O_RDONLY) = 3
>>>>> *open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 7*
>>>>> read(3, "line one\n", 256) = 9
>>>>> write(7, "line one\n", 9) = 9
>>>>> read(3, "", 256) = 0
>>>>> close(7) = -1 EDQUOT (Disk quota exceeded)
>>>>> close(3) = 0
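The truncation itself can be reproduced without vi or a quota. The sketch below uses a scratch file (hypothetical name) and relies on the fact that the shell's '>' redirection opens its target with O_WRONLY|O_CREAT|O_TRUNC, the same flags as in the trace above:

```shell
# Scratch directory so the demonstration touches no real file.
dir=$(mktemp -d)
printf 'line one\nline two\n' > "$dir/hello"
before=$(($(wc -c < "$dir/hello")))

# '>' opens the file with O_WRONLY|O_CREAT|O_TRUNC; ':' then writes nothing,
# which mimics an open that succeeds followed by a write that never lands.
: > "$dir/hello"
after=$(($(wc -c < "$dir/hello")))

echo "before: $before bytes, after: $after bytes"
```

The old content is discarded at open time, before any new data is written, so a failure later in the save (such as EDQUOT) leaves an empty file behind.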
>>>>>
>>>>>
>>>>> To recover the truncated file, please check whether there is a
>>>>> backup file 'remd_115.sh~' under '~/' or in the same dir as this
>>>>> file. If it exists, you can copy it back.
>>>>>
>>>>> Thanks,
>>>>> Vijay
>>>>>
>>>>>
>>>>>> Geoffrey
>>>>>>
>>>>>>> On 9 June 2015 at 09:01, Vijaikumar M <vmallika at redhat.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Monday 08 June 2015 07:11 PM, Geoffrey Letessier wrote:
>>>>>>>> In addition, I notice a very big difference between the sum of
>>>>>>>> du on each brick and what « quota list » displays, as you can
>>>>>>>> read below:
>>>>>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/amyloid_team
>>>>>>>> cl-storage1: 1,6T /export/brick_home/brick1/amyloid_team
>>>>>>>> cl-storage3: 1,6T /export/brick_home/brick1/amyloid_team
>>>>>>>> cl-storage1: 1,6T /export/brick_home/brick2/amyloid_team
>>>>>>>> cl-storage3: 1,6T /export/brick_home/brick2/amyloid_team
>>>>>>>> [root at lucifer ~]# gluster volume quota vol_home list /amyloid_team
>>>>>>>> Path            Hard-limit  Soft-limit  Used    Available
>>>>>>>> --------------------------------------------------------------------------------
>>>>>>>> /amyloid_team   9.0TB       90%         7.8TB   1.2TB
>>>>>>>>
>>>>>>>> As you can see, the sum over all bricks gives me roughly 6.4TB
>>>>>>>> and « quota list » around 7.8TB, so there is a difference of
>>>>>>>> 1.4TB that I'm not able to explain… Do you have any idea?
>>>>>>>>
>>>>>>>
>>>>>>> There were a few issues in how quota accounted for the size; we
>>>>>>> have fixed some of these issues in 3.7.
>>>>>>> 'du -h' rounds off the values; can you please provide the
>>>>>>> output of 'du' without the -h option?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Geoffrey
>>>>>>>>
>>>>>>>>> On 8 June 2015 at 14:30, Geoffrey Letessier
>>>>>>>>> <geoffrey.letessier at cnrs.fr> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> With version 3.5.3 of GlusterFS, I ran into a strange issue
>>>>>>>>> this morning when writing a file while the quota is exceeded.
>>>>>>>>>
>>>>>>>>> Someone in my lab, whose quota is exceeded (but she didn't
>>>>>>>>> know it), tried to modify a file but, because of the exceeded
>>>>>>>>> quota, was unable to and decided to exit vi. Now her file is
>>>>>>>>> empty/blank, as you can read below:
>>>>>>> We suspect 'vi' might have created a tmp file before writing to
>>>>>>> the file. We are working on re-creating this problem and will
>>>>>>> update you on the same.
>>>>>>>
>>>>>>>
>>>>>>>>> pdsh at lucifer: cl-storage3: ssh exited with exit code 2
>>>>>>>>> cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34
>>>>>>>>> /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
>>>>>>>>> cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0 8 juin 12:38
>>>>>>>>> /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
>>>>>>>>>
>>>>>>>>> In addition, I don't understand why, my volume being a
>>>>>>>>> distributed-replicated volume (cl-storage[1,3] is
>>>>>>>>> replicated only on cl-storage[2,4]), I have 2 « same » files
>>>>>>>>> (same complete path) in 2 different bricks (as you can read above).
>>>>>>>>>
>>>>>>>>> Thanks in advance for your help and clarification.
>>>>>>>>> Geoffrey
>>>>>>>>>
>>>>>>>>>> On 2 June 2015 at 23:45, Geoffrey Letessier
>>>>>>>>>> <geoffrey.letessier at cnrs.fr> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Ben,
>>>>>>>>>>
>>>>>>>>>> I just checked my messages log files, on both client and
>>>>>>>>>> server, and I don't find any hung tasks like the ones you see
>>>>>>>>>> on yours.
>>>>>>>>>>
>>>>>>>>>> As you can read below, I don't see the performance issue with
>>>>>>>>>> a simple DD, but I think my issue concerns sets of small
>>>>>>>>>> files (tens of thousands or more)…
>>>>>>>>>>
>>>>>>>>>> [root at nisus test]# ddt -t 10g /mnt/test/
>>>>>>>>>> Writing to /mnt/test/ddt.8362 ... syncing ... done.
>>>>>>>>>> sleeping 10 seconds ... done.
>>>>>>>>>> Reading from /mnt/test/ddt.8362 ... done.
>>>>>>>>>> 10240MiB    KiB/s   CPU%
>>>>>>>>>> Write      114770      4
>>>>>>>>>> Read        40675      4
>>>>>>>>>>
>>>>>>>>>> For info: /mnt/test is on the single (v2) GlusterFS volume.
>>>>>>>>>>
>>>>>>>>>> [root at nisus test]# ddt -t 10g /mnt/fhgfs/
>>>>>>>>>> Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
>>>>>>>>>> sleeping 10 seconds ... done.
>>>>>>>>>> Reading from /mnt/fhgfs/ddt.8380 ... done.
>>>>>>>>>> 10240MiB    KiB/s   CPU%
>>>>>>>>>> Write      102591      1
>>>>>>>>>> Read        98079      2
>>>>>>>>>>
>>>>>>>>>> Do you have an idea how to tune/optimize the performance
>>>>>>>>>> settings and/or the TCP settings (MTU, etc.)?
>>>>>>>>>>
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> |             | UNTAR  | DU     | FIND   | TAR    | RM     |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | single      | ~3m45s | ~43s   | ~47s   | ~3m10s | ~3m15s |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | replicated  | ~5m10s | ~59s   | ~1m6s  | ~1m19s | ~1m49s |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | distributed | ~4m18s | ~41s   | ~57s   | ~2m24s | ~1m38s |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | dist-repl   | ~8m18s | ~1m4s  | ~1m11s | ~1m24s | ~2m40s |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | native FS   | ~11s   | ~4s    | ~2s    | ~56s   | ~10s   |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | BeeGFS      | ~3m43s | ~15s   | ~3s    | ~1m33s | ~46s   |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> | single (v2) | ~3m6s  | ~14s   | ~32s   | ~1m2s  | ~44s   |
>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>> For info:
>>>>>>>>>> - BeeGFS is a distributed FS (4 bricks, 2 bricks per server
>>>>>>>>>> and 2 servers)
>>>>>>>>>> - single (v2): simple gluster volume with default settings
>>>>>>>>>>
>>>>>>>>>> I also note that I get the same tar/untar performance issue
>>>>>>>>>> with FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks OK.
>>>>>>>>>>
>>>>>>>>>> Thank you very much for your reply and help.
>>>>>>>>>> Geoffrey
>>>>>>>>>>
>>>>>>>>>> On 2 June 2015 at 21:53, Ben Turner <bturner at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I am seeing problems on 3.7 as well. Can you check
>>>>>>>>>>> /var/log/messages on both the clients and servers for hung
>>>>>>>>>>> tasks like:
>>>>>>>>>>>
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: "echo 0 >
>>>>>>>>>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: iozone D
>>>>>>>>>>> 0000000000000001 0 21999 1 0x00000080
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321cc8
>>>>>>>>>>> 0000000000000082 ffff880611321c18 ffffffffa027236e
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321c48
>>>>>>>>>>> ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff88052bd1e0f0
>>>>>>>>>>> ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: Call Trace:
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ?
>>>>>>>>>>> rpc_make_runnable+0x7e/0x80 [sunrpc]
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ?
>>>>>>>>>>> rpc_execute+0x50/0xa0 [sunrpc]
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ?
>>>>>>>>>>> ktime_get_ts+0xb1/0xf0
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ?
>>>>>>>>>>> sync_page+0x0/0x50
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>]
>>>>>>>>>>> io_schedule+0x73/0xc0
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112430d>]
>>>>>>>>>>> sync_page+0x3d/0x50
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>]
>>>>>>>>>>> __wait_on_bit+0x5f/0x90
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124543>]
>>>>>>>>>>> wait_on_page_bit+0x73/0x80
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ?
>>>>>>>>>>> wake_bit_function+0x0/0x50
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ?
>>>>>>>>>>> pagevec_lookup_tag+0x25/0x40
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112496b>]
>>>>>>>>>>> wait_on_page_writeback_range+0xfb/0x190
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124b38>]
>>>>>>>>>>> filemap_write_and_wait_range+0x78/0x90
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>]
>>>>>>>>>>> vfs_fsync_range+0x7e/0x100
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>]
>>>>>>>>>>> vfs_fsync+0x1d/0x20
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>]
>>>>>>>>>>> do_fsync+0x3e/0x60
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c0950>]
>>>>>>>>>>> sys_fsync+0x10/0x20
>>>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8100b072>]
>>>>>>>>>>> system_call_fastpath+0x16/0x1b
>>>>>>>>>>>
>>>>>>>>>>> Do you see a perf problem with just a simple DD or do you
>>>>>>>>>>> need a more complex workload to hit the issue? I think I
>>>>>>>>>>> saw an issue with metadata performance that I am trying to
>>>>>>>>>>> run down, let me know if you can see the problem with simple
>>>>>>>>>>> DD reads / writes or if we need to do some sort of dir /
>>>>>>>>>>> metadata access as well.
>>>>>>>>>>>
>>>>>>>>>>> -b
>>>>>>>>>>>
>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> From: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
>>>>>>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>>>>>>>> Cc: gluster-users at gluster.org
>>>>>>>>>>>> Sent: Tuesday, June 2, 2015 8:09:04 AM
>>>>>>>>>>>> Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor
>>>>>>>>>>>> performances
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Pranith,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm sorry, but I cannot give you any comparison, because it
>>>>>>>>>>>> would be distorted by the fact that on my production HPC
>>>>>>>>>>>> cluster the network technology is InfiniBand QDR and my
>>>>>>>>>>>> volumes are quite different (bricks in RAID6 (12x2TB),
>>>>>>>>>>>> 2 bricks per server and 4 servers in my pool).
>>>>>>>>>>>>
>>>>>>>>>>>> Concerning your request, you can find all the expected
>>>>>>>>>>>> results in the attachments; I hope they help you solve this
>>>>>>>>>>>> serious performance issue (maybe I need to play with
>>>>>>>>>>>> glusterfs parameters?).
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you very much in advance,
>>>>>>>>>>>> Geoffrey
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2 June 2015 at 10:09, Pranith Kumar Karampuri
>>>>>>>>>>>> <pkarampu at redhat.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Geoffrey,
>>>>>>>>>>>> Since you are saying it happens on all types of volumes,
>>>>>>>>>>>> let's do the following:
>>>>>>>>>>>> 1) Create a dist-repl volume
>>>>>>>>>>>> 2) Set the options etc. you need.
>>>>>>>>>>>> 3) Enable gluster volume profile using "gluster volume profile <volname> start"
>>>>>>>>>>>> 4) Run the workload
>>>>>>>>>>>> 5) Give the output of "gluster volume profile <volname> info"
>>>>>>>>>>>>
>>>>>>>>>>>> Repeat the steps above on the new and old versions you are
>>>>>>>>>>>> comparing.
>>>>>>>>>>>> That should give us insight into what could be causing the
>>>>>>>>>>>> slowness.
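Steps 3 to 5 above can be wrapped in a small helper so that the same sequence is run on each version. This is only a sketch: profile_workload and the GLUSTER variable are hypothetical names introduced here, and the only gluster subcommands used are the two quoted in the steps.

```shell
# GLUSTER can be overridden (for example GLUSTER=echo) for a dry run
# on a machine without a gluster installation.
GLUSTER=${GLUSTER:-gluster}

# Hypothetical helper. Usage: profile_workload <volname> <command> [args...]
profile_workload() {
    vol=$1
    shift
    "$GLUSTER" volume profile "$vol" start   # step 3: enable profiling
    "$@"                                     # step 4: run the workload
    "$GLUSTER" volume profile "$vol" info    # step 5: print the statistics
}
```

For example, `profile_workload <volname> tar xJf ~/linux-4.1-rc5.tar.xz`, run from a directory on the mounted volume, would profile the untar step of the benchmark described in this thread.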
>>>>>>>>>>>>
>>>>>>>>>>>> Pranith
>>>>>>>>>>>> On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Dear all,
>>>>>>>>>>>>
>>>>>>>>>>>> I have a crash-test cluster where I've tested the new
>>>>>>>>>>>> version of GlusterFS (v3.7) before upgrading my production
>>>>>>>>>>>> HPC cluster. But… all my tests show very poor performance.
>>>>>>>>>>>>
>>>>>>>>>>>> For my benchmarks, as you can read below, I run some
>>>>>>>>>>>> operations (untar, du, find, tar, rm) on the Linux kernel
>>>>>>>>>>>> sources, dropping caches, on each of the distributed,
>>>>>>>>>>>> replicated, distributed-replicated and single (single brick)
>>>>>>>>>>>> volumes, and on the native FS of one brick.
>>>>>>>>>>>>
>>>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/|wc -l; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>>>
>>>>>>>>>>>> And here are the process times:
>>>>>>>>>>>>
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> |             | UNTAR  | DU     | FIND   | TAR    | RM     |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> | single      | ~3m45s | ~43s   | ~47s   | ~3m10s | ~3m15s |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> | replicated  | ~5m10s | ~59s   | ~1m6s  | ~1m19s | ~1m49s |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> | distributed | ~4m18s | ~41s   | ~57s   | ~2m24s | ~1m38s |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> | dist-repl   | ~8m18s | ~1m4s  | ~1m11s | ~1m24s | ~2m40s |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>> | native FS   | ~11s   | ~4s    | ~2s    | ~56s   | ~10s   |
>>>>>>>>>>>> ------------------------------------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>> I get the same results with both default and custom
>>>>>>>>>>>> configurations.
>>>>>>>>>>>>
>>>>>>>>>>>> Looking at the ifstat output, I note that my write I/O
>>>>>>>>>>>> never exceeds 3MB/s...
>>>>>>>>>>>>
>>>>>>>>>>>> The native EXT4 FS seems to be faster (roughly 15-20%, but no
>>>>>>>>>>>> more) than the XFS one.
>>>>>>>>>>>>
>>>>>>>>>>>> My [test] storage cluster consists of 2 identical servers
>>>>>>>>>>>> (dual-CPU Intel Xeon X5355, 8GB of RAM, 2x2TB HDD (no RAID)
>>>>>>>>>>>> and Gb Ethernet).
>>>>>>>>>>>>
>>>>>>>>>>>> My volume settings:
>>>>>>>>>>>> single: 1 server, 1 brick
>>>>>>>>>>>> replicated: 2 servers, 1 brick each
>>>>>>>>>>>> distributed: 2 servers, 2 bricks each
>>>>>>>>>>>> dist-repl: 2 bricks in the same server and replica 2
>>>>>>>>>>>>
>>>>>>>>>>>> Everything seems to be OK in the gluster status output.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have any idea why I get such bad results?
>>>>>>>>>>>> Thanks in advance.
>>>>>>>>>>>> Geoffrey
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Gluster-users mailing list Gluster-users at gluster.org
>>>>>>>>>>>> <mailto:Gluster-users at gluster.org>
>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>> <quota-verify.gz>
>