[Gluster-users] Cascading errors and very bad write performance
Vijaikumar M
vmallika at redhat.com
Fri Aug 7 04:28:04 UTC 2015
Hi Geoffrey,
Some performance improvements have been made to quota in glusterfs-3.7.3.
Could you upgrade to glusterfs-3.7.3 and see if this helps?
Thanks,
Vijay
On Friday 07 August 2015 05:02 AM, Geoffrey Letessier wrote:
> Hi,
>
> Does anyone have an idea to help me fix this issue (huge logs, write
> performance divided by 4, etc.)?
>
> For comparison, here are two volumes:
> - home: distributed on 4 bricks / 2 nodes (and replicated on 4 other
> bricks / 2 other nodes):
> # ddt -t 35g /home
> Writing to /home/ddt.24172 ... syncing ... done.
> sleeping 10 seconds ... done.
> Reading from /home/ddt.24172 ... done.
> 33792MiB    KiB/s  CPU%
> Write      103659     1
> Read       391955     3
>
> - workdir: distributed on 4 bricks / 2 nodes (on the same RAID
> volumes and servers as home):
> # ddt -t 35g /workdir
> Writing to /workdir/ddt.24717 ... syncing ... done.
> sleeping 10 seconds ... done.
> Reading from /workdir/ddt.24717 ... done.
> 35840MiB    KiB/s  CPU%
> Write      738314     4
> Read       536497     4
>
> For information, previously, on version 3.5.3-2, I obtained roughly
> 1.1 GB/s for the workdir volume and ~550-600 MB/s for home.
>
> All my tests (cp, rsync, etc.) give me the same result (write
> throughput between 100 MB/s and 150 MB/s).
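
To put the two ddt runs and the older figures in the same unit, here is a rough sketch. It assumes the "MBs" and "GBs" figures in the thread are decimal MB/s and GB/s, and it takes the midpoint of the 550-600 range for home; neither assumption is confirmed in the thread.

```python
KIB = 1024  # bytes per KiB, the unit ddt prints


def kib_s_to_mb_s(kib_per_s):
    """Convert KiB/s to decimal MB/s (10**6 bytes per second)."""
    return kib_per_s * KIB / 1_000_000


# Write throughput from the two ddt runs quoted above, in KiB/s.
measured_kib_s = {"home": 103659, "workdir": 738314}

# Figures reported for 3.5.3-2, in MB/s. Assumptions: "1.1GBs" is read as
# 1100 MB/s, and the midpoint of the 550-600 range is used for home.
previous_mb_s = {"home": 575.0, "workdir": 1100.0}


def slowdown(volume):
    """Return how many times slower the current write throughput is."""
    return previous_mb_s[volume] / kib_s_to_mb_s(measured_kib_s[volume])


if __name__ == "__main__":
    for volume in measured_kib_s:
        now = kib_s_to_mb_s(measured_kib_s[volume])
        print(f"{volume}: {now:.0f} MB/s now, {slowdown(volume):.1f}x slower")
```

Under those assumptions, the home volume writes at roughly 106 MB/s and workdir at roughly 756 MB/s, so the regression is far larger on the replicated volume.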
>
> Thanks.
> Geoffrey
> ------------------------------------------------------
> Geoffrey Letessier
> IT Manager & Systems Engineer
> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
> Institut de Biologie Physico-Chimique
> 13, rue Pierre et Marie Curie - 75005 Paris
> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
> <mailto:geoffrey.letessier at ibpc.fr>
>
> On 5 August 2015 at 10:40, Geoffrey Letessier <geoffrey.letessier at cnrs.fr
> <mailto:geoffrey.letessier at cnrs.fr>> wrote:
>
>> Hello,
>>
>> In addition, bearing in mind that I re-enabled logging (brick-log-level =
>> INFO instead of CRITICAL) only for the duration of the file creation
>> (i.e. a few minutes), have you noticed the log sizes and the number of
>> lines they contain?
>> # ls -lh storage*
>> -rw------- 1 letessier staff 18M 5 aoû 00:54
>> storage1__export-brick_home-brick1-data.log
>> -rw------- 1 letessier staff 2,1K 5 aoû 00:54
>> storage1__export-brick_home-brick2-data.log
>> -rw------- 1 letessier staff 15M 5 aoû 00:56
>> storage2__export-brick_home-brick1-data.log
>> -rw------- 1 letessier staff 2,1K 5 aoû 00:54
>> storage2__export-brick_home-brick2-data.log
>> -rw------- 1 letessier staff 47M 5 aoû 00:55
>> storage3__export-brick_home-brick1-data.log
>> -rw------- 1 letessier staff 2,1K 5 aoû 00:54
>> storage3__export-brick_home-brick2-data.log
>> -rw------- 1 letessier staff 47M 5 aoû 00:55
>> storage4__export-brick_home-brick1-data.log
>> -rw------- 1 letessier staff 2,1K 5 aoû 00:55
>> storage4__export-brick_home-brick2-data.log
>>
>> # wc -l storage*
>> 55381 storage1__export-brick_home-brick1-data.log
>> 17 storage1__export-brick_home-brick2-data.log
>> 41636 storage2__export-brick_home-brick1-data.log
>> 17 storage2__export-brick_home-brick2-data.log
>> 270360 storage3__export-brick_home-brick1-data.log
>> 17 storage3__export-brick_home-brick2-data.log
>> 270358 storage4__export-brick_home-brick1-data.log
>> 17 storage4__export-brick_home-brick2-data.log
>> 637803 total
>>
>> If I leave brick-log-level at INFO, the brick log files on each
>> server will consume all my /var partition capacity within only a few
>> hours or days…
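
A back-of-the-envelope sketch of that growth estimate, using the largest log from the listing above (47M, 270360 lines). The test duration and the free space on /var are not given in the thread, so the values below are purely hypothetical placeholders.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def hours_to_fill(log_bytes, duration_s, free_bytes):
    """Hours until free_bytes is used up if logging continues at the observed rate."""
    rate = log_bytes / duration_s  # bytes written per second
    return free_bytes / rate / 3600


# From the listing above: the largest brick log is about 47M with 270360 lines.
largest_log_bytes = 47 * MIB
largest_log_lines = 270360
bytes_per_line = largest_log_bytes / largest_log_lines

# Hypothetical values, NOT taken from the thread.
test_duration_s = 5 * 60   # "a few minutes" read as 5 minutes
var_free_bytes = 20 * GIB  # assumed free space on /var

if __name__ == "__main__":
    print(f"about {bytes_per_line:.0f} bytes per log line")
    hours = hours_to_fill(largest_log_bytes, test_duration_s, var_free_bytes)
    print(f"about {hours:.0f} hours to fill /var at this rate")
```

This only models a single brick log under constant load; with several bricks logging to the same partition the time would be shorter.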
>>
>> Thanks in advance,
>> Geoffrey
>>
>> On 5 August 2015 at 01:12, Geoffrey Letessier
>> <geoffrey.letessier at cnrs.fr <mailto:geoffrey.letessier at cnrs.fr>>
>> wrote:
>>
>>> Hello,
>>>
>>> Since the problem mentioned previously (all the errors noticed in the
>>> brick log files), I have noticed very bad performance: my write
>>> throughput is about 4 times lower than before, and it was not very
>>> good to begin with.
>>> Now, when writing a 33 GB file, my write throughput is around 150 MB/s
>>> (over InfiniBand), whereas before it was around 550-600 MB/s; this is
>>> the case with both the RDMA and TCP transports.
>>>
>>> During this test, more than 40,000 error lines (like the following)
>>> were added to the brick log files.
>>> [2015-08-04 22:34:27.337622] E [dict.c:1418:dict_copy_with_ref]
>>> (-->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(server_resolve_inode+0x60)
>>> [0x7f021c6f7410]
>>> -->/usr/lib64/glusterfs/3.7.3/xlator/protocol/server.so(resolve_gfid+0x88)
>>> [0x7f021c6f7188]
>>> -->/usr/lib64/libglusterfs.so.0(dict_copy_with_ref+0xa4)
>>> [0x7f0229cba674] ) 0-dict: invalid argument: dict [Argument invalide]
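
To see which messages dominate a brick log, entries can be tallied by severity and originating function. This is a minimal sketch that assumes every entry starts with the "[timestamp] LEVEL [file:line:function]" header seen in the error above; the regular expression and the `count_by_source` helper are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Header layout inferred from the single log entry quoted above.
HEADER = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def count_by_source(lines):
    """Count log entries per (level, function); lines without a header are skipped."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            counts[(match["level"], match["func"])] += 1
    return counts


# Start of the entry quoted above (the backtrace that follows is left out).
sample = ["[2015-08-04 22:34:27.337622] E [dict.c:1418:dict_copy_with_ref]"]

if __name__ == "__main__":
    for (level, func), n in count_by_source(sample).most_common():
        print(level, func, n)
```

Passing an open log file instead of `sample` would give the same tally for a whole brick log, since the function only iterates over lines.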
>>>
>>>
>>> All brick log files are attached.
>>>
>>> Thanks in advance for all your help and for a fix,
>>> Best,
>>> Geoffrey
>>>
>>> PS: A question: is it possible to easily downgrade GlusterFS from 3.7
>>> to a previous version (for example, v3.5)?
>>>
>>> <bricks-logs.tgz>
>>
>