[Gluster-devel] Slowness and segfault with 313
Amar S. Tumballi
amar at zresearch.com
Wed Jul 25 15:34:21 UTC 2007
Hi Harris,
Thanks for taking the time to pin down which patch made the difference.
This will help us improve the performance of glusterfs considerably.
Thanks again.
-amar
On 7/25/07, Harris Landgarten <harrisl at lhjonline.com> wrote:
>
> Patch 354 is responsible for most of the speedup. See the data:
>
> Patch 353
>
> real 6m40.655s
> user 0m0.120s
> sys 0m1.380s
>
> Patch 354
>
> real 2m56.825s
> user 0m0.130s
> sys 0m1.400s
>
> Patch 362
>
> real 2m54.541s
> user 0m0.100s
> sys 0m1.220s
>
> Harris
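
To put the three timings above side by side, here is a minimal Python sketch that converts the "real" values printed by time into seconds and reports each one relative to patch 353. The helper name parse_real and the choice of patch 353 as the baseline are assumptions made for illustration; only the numbers come from the thread.

```python
import re

def parse_real(s):
    """Convert a `time` duration such as '6m40.655s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", s.strip())
    if m is None:
        raise ValueError(f"unrecognised duration: {s!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# "real" times quoted in this thread, keyed by patch number.
real_times = {353: "6m40.655s", 354: "2m56.825s", 362: "2m54.541s"}

# Assumption: patch 353 is used as the baseline for the comparison.
baseline = parse_real(real_times[353])
for patch, text in sorted(real_times.items()):
    secs = parse_real(text)
    print(f"patch {patch}: {secs:8.3f} s  ({baseline / secs:.2f}x vs patch 353)")
```

On these numbers the run with patch 354 comes out a little over twice as fast as the run with patch 353, and patch 362 is essentially the same as 354.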
>
> ----- Original Message -----
> From: "Harris Landgarten" <harrisl at lhjonline.com>
> To: "Harris Landgarten" <harrisl at lhjonline.com>
> Cc: "gluster-devel" <gluster-devel at nongnu.org>, "Anand Avati" <avati at zresearch.com>
> Sent: Wednesday, July 25, 2007 9:05:56 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Testing with Patch 361
>
> real 2m58.516s
> user 0m0.140s
> sys 0m1.470s
>
> Unbelievable.
>
> Harris
>
> ----- Original Message -----
> From: "Harris Landgarten" <harrisl at lhjonline.com>
> To: "Harris Landgarten" <harrisl at lhjonline.com>
> Cc: "gluster-devel" <gluster-devel at nongnu.org>, "Anand Avati" <avati at zresearch.com>
> Sent: Monday, July 23, 2007 9:58:36 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> I have narrowed the problem down to Patch-333:
>
> Patch 332
>
> real 6m37.204s
> user 0m0.180s
> sys 0m1.590s
>
> Patch 333
>
> real 10m38.697s
> user 0m0.110s
> sys 0m1.300s
>
> Patch 334
>
> real 10m47.733s
> user 0m0.190s
> sys 0m1.570s
>
> Patch 347
>
> real 10m20.522s
> user 0m0.130s
> sys 0m1.590s
>
> Patch-333 introduced some changes to readahead to handle atimes. My
> configs haven't changed and atime updates are not occurring, but the
> slowdown is dramatic.
>
> Harris
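
The narrowing-down described above is in effect a bisection over patch levels. Below is a rough Python sketch of that idea, replayed against the timings already posted in this thread. The function name first_slow_patch and the 480-second threshold are assumptions for illustration; in a real run, is_slow would build that patch level, remount, and time the tar command rather than look up a table.

```python
def first_slow_patch(lo, hi, is_slow):
    """Return the first patch in (lo, hi] for which is_slow(patch) is True.

    Assumes patch `lo` is known to be fast, patch `hi` is known to be slow,
    and that every patch after the regression stays slow within the range.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(mid):
            hi = mid   # regression is at mid or earlier
        else:
            lo = mid   # regression is after mid
    return hi

# "real" times in seconds, taken from the measurements in this thread.
measured = {331: 396.388, 332: 397.204, 333: 638.697,
            334: 647.733, 336: 665.022}

# Assumption: anything over 8 minutes counts as a slow run.
THRESHOLD_SECONDS = 480.0

def is_slow(patch):
    return measured[patch] > THRESHOLD_SECONDS

print("first slow patch:", first_slow_patch(331, 336, is_slow))
```

Starting from 331 (fast) and 336 (slow), the search only needs the results for 333 and 332 to settle on 333, which matches the conclusion above.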
>
> ----- Original Message -----
> From: "Harris Landgarten" <harrisl at lhjonline.com>
> To: "Anand Avati" <avati at zresearch.com>
> Cc: "gluster-devel" <gluster-devel at nongnu.org>
> Sent: Sunday, July 22, 2007 8:33:56 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> # du -h 0 1 2 3
> 134M 0
> 79M 1
> 121M 2
> 151M 3
>
> # ls -lh /mnt/glusterfs/test/test.tbz
> -rw-r--r-- 1 root root 460M Jul 21 20:04 /mnt/glusterfs/test/test.tbz
>
> Harris
>
> ----- Original Message -----
> From: "Anand Avati" <avati at zresearch.com>
> To: "Harris Landgarten" <harrisl at lhjonline.com>
> Cc: "Amar S. Tumballi" <amar at zresearch.com>, "gluster-devel" <gluster-devel at nongnu.org>
> Sent: Sunday, July 22, 2007 8:26:21 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> What is the approximate combined size of 0, 1, 2 and 3?
>
> avati
>
>
> 2007/7/22, Harris Landgarten < harrisl at lhjonline.com >:
>
> The 0 1 2 3 folders are on the same glusterfs mount. This is a test of
> gluster to gluster tar. The 0 1 2 3 folders are part of a mail server store
> tree. The resulting tar contains 15547 objects ranging in size from 100b to
> 10m.
>
> Harris
>
> ----- Original Message -----
> From: "Anand Avati" < avati at zresearch.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "Amar S. Tumballi" < amar at zresearch.com >, "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Sunday, July 22, 2007 8:08:17 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Harris,
> where are the 0 1 and 2 files/dirs? on the same glusterfs mount or on
> local disk?
>
> avati
>
>
> 2007/7/22, Harris Landgarten < harrisl at lhjonline.com >:
>
> Nothing in spec about flush-behind.
>
> ### Add writeback feature
> volume writeback
> type performance/write-behind
> option aggregate-size 131072 # unit in bytes
> subvolumes bricks
> end-volume
>
> Harris
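
A quick way to confirm which options a write-behind volume actually sets is to scan the spec text. The Python sketch below is a deliberately simplified reader written for this illustration, not the parser glusterfs itself uses: it assumes one keyword per line, that '#' starts a comment, and that each volume is closed by end-volume. The function name volume_options is hypothetical.

```python
def volume_options(spec_text):
    """Map volume name -> (type, {option: value}) for a volume spec.

    Simplified illustration only; not the parser glusterfs uses.
    """
    volumes = {}
    name = vtype = None
    options = {}
    for raw in spec_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        words = line.split()
        if words[0] == "volume" and len(words) > 1:
            name, vtype, options = words[1], None, {}
        elif words[0] == "type" and len(words) > 1:
            vtype = words[1]
        elif words[0] == "option" and len(words) > 2:
            options[words[1]] = " ".join(words[2:])
        elif words[0] == "end-volume" and name is not None:
            volumes[name] = (vtype, options)
            name = None
    return volumes

# The write-behind section quoted above.
SPEC = """\
### Add writeback feature
volume writeback
  type performance/write-behind
  option aggregate-size 131072 # unit in bytes
  subvolumes bricks
end-volume
"""

for vol, (vtype, opts) in volume_options(SPEC).items():
    if vtype == "performance/write-behind":
        print(vol, "flush-behind =", opts.get("flush-behind", "(not set)"))
```

For the section above this reports that flush-behind is not set, which agrees with the answer given.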
>
> ----- Original Message -----
> From: "Anand Avati" < avati at zresearch.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "Amar S. Tumballi" < amar at zresearch.com >, "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Sunday, July 22, 2007 2:29:08 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Harris,
> do you have 'option flush-behind on' set in the write-behind section of
> your client spec file?
>
> avati
>
>
> 2007/7/22, Harris Landgarten < harrisl at lhjonline.com >:
>
> There is a major slowdown in patch 336 and beyond. Here are some numbers:
>
> Patch 331
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 6m36.388s
> user 0m0.150s
> sys 0m1.400s
>
> Patch 336
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 11m5.022s
> user 0m0.180s
> sys 0m1.420s
>
> Patch 341
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 12m8.700s
> user 0m0.170s
> sys 0m1.550s
>
>
> Patch 344
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 11m10.577s
> user 0m0.130s
> sys 0m1.700s
>
> Something in patch 332-336 seems to be the problem.
>
> Harris
>
> ----- Original Message -----
> From: "Amar S. Tumballi" < amar at zresearch.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "Anand Avati" < avati at zresearch.com >, "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Friday, July 20, 2007 7:17:56 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Hi Harris,
> Thanks for notifying us. Fix committed. (patch 331)
>
> -amar
>
>
> On 7/20/07, Harris Landgarten < harrisl at lhjonline.com > wrote:
>
> Strange crash.
>
> After restarting client, df -h crashes. If ls /mnt/glusterfs is run first,
> df -h runs fine. Verified on both clients.
>
> Here is the bt from Ubuntu
>
> Program terminated with signal 11, Segmentation fault.
> #0 data_to_ptr (data=0x8091e30) at dict.c:812
> 812 {
> (gdb) bt
> #0 data_to_ptr (data=0x8091e30) at dict.c:812
> #1 0xb7faacc6 in default_statfs (frame=0x8091e30, this=0x80586c0,
> loc=0x8091cc4) at defaults.c:1001
> #2 0xb7faacc6 in default_statfs (frame=0x8091da0, this=0x8058760,
> loc=0x8091cc4) at defaults.c:1001
> #3 0x0804c19b in fuse_statfs (req=0x8091c68, ino=0) at fuse-bridge.c:1496
> #4 0xb7f9ab12 in fuse_reply_statfs_compat () from /usr/lib/libfuse.so.2
> #5 0xb7f9b1e3 in fuse_reply_entry () from /usr/lib/libfuse.so.2
> #6 0xb7f9c9c6 in fuse_session_process () from /usr/lib/libfuse.so.2
> #7 0x0804e019 in fuse_transport_notify (xl=0x8058c90, event=2,
> data=0x8052408) at fuse-bridge.c:2028
> #8 0xb7fad7c7 in transport_notify (this=0x8057a7c, event=0) at transport.c:152
> #9 0xb7fae239 in sys_epoll_iteration (ctx=0xbfecfec4) at epoll.c:54
> #10 0xb7fad89d in poll_iteration (ctx=0xbfecfec4) at transport.c:260
> #11 0x0804a36b in main (argc=6, argv=0xbfecffa4) at glusterfs.c:382
>
> Harris
>
> ----- Original Message -----
> From: "Anand Avati" < avati at zresearch.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "Amar S. Tumballi" < amar at zresearch.com >, "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Thursday, July 19, 2007 4:17:39 PM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Harris,
> both the slowness bug and the segfault you reported have been fixed in the
> latest TLA patchset. Please update and confirm that the fixes work for you.
>
> thanks,
> avati
>
>
> 2007/7/17, Harris Landgarten < harrisl at lhjonline.com >:
>
> Amar,
>
> Anything on this bug? If you can tell me the tla command to get specific
> patch levels, I will try to narrow the bug down further.
>
> Harris
>
> ----- Original Message -----
> From: "Amar S. Tumballi" < amar at zresearch.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Monday, July 16, 2007 12:31:19 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Hi Harris,
> Thanks for cornering the bugs between 309-313. We are looking into it.
>
> -amar
>
>
> On 7/15/07, Harris Landgarten < harrisl at lhjonline.com > wrote:
>
> I just tested a full zimbra backup on my mailbox with 308 client and 313
> bricks and it completed in normal time with no errors. This is more evidence
> that the problem is client-only and was introduced in 309-313.
>
> Harris
>
> ----- Original Message -----
> From: "Harris Landgarten" < harrisl at lhjonline.com >
> To: "Harris Landgarten" < harrisl at lhjonline.com >
> Cc: "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Sunday, July 15, 2007 8:30:15 AM (GMT-0500) America/New_York
> Subject: Re: [Gluster-devel] Slowness and segfault with 313
>
> Some more testing:
>
> Patch 308
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 6m47.947s
> user 0m0.180s
> sys 0m1.220s
>
> Patch 313
>
> time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3
>
> real 9m21.909s
> user 0m0.160s
> sys 0m1.470s
>
>
> Patch 313 also used 50% more memory.
>
> This leads me to suspect the problem is in write-behind or posix-locks.
>
> Harris
>
> ----- Original Message -----
> From: "Harris Landgarten" < harrisl at lhjonline.com >
> To: "gluster-devel" < gluster-devel at nongnu.org >
> Sent: Saturday, July 14, 2007 2:23:20 PM (GMT-0500) America/New_York
> Subject: [Gluster-devel] Slowness and segfault with 313
>
> Last weekend with 299 a full Zimbra backup completed in 27 minutes. This
> weekend with 313 the backup was only about 1/2 through after 5 hrs. I
> aborted the backup and the client crashed. The abort would have tried to
> remove about 4G of files from the /mnt/glusterfs/backups/tmp folder. The
> following BT was generated from the core:
>
> Core was generated by `[glusterfs] '.
> Program terminated with signal 11, Segmentation fault.
> #0 unify_unlink (frame=0xdec4b30, this=0x80585b0, loc=0xddb98dc) at
> unify.c:2256
> 2256 list = loc->inode->private;
> (gdb) bt
> #0 unify_unlink (frame=0xdec4b30, this=0x80585b0, loc=0xddb98dc) at
> unify.c:2256
> #1 0xb7f32c66 in default_unlink (frame=0xdbc51f8, this=0x80592c0,
> loc=0xddb98dc) at defaults.c:480
> #2 0xb7f32c66 in default_unlink (frame=0xde06ca8, this=0x8059350,
> loc=0xddb98dc) at defaults.c:480
> #3 0x0804ccf3 in fuse_unlink (req=0xdf68980, par=4786914, name=0xcc16770
> "BbZslzxnS2Y,8Az0m2v5ExLBXbs= 6411-6246.msg1") at fuse-bridge.c:781
> #4 0xb7f21461 in fuse_reply_err () from /usr/lib/libfuse.so.2
> #5 0xb7f221e3 in fuse_reply_entry () from /usr/lib/libfuse.so.2
> #6 0xb7f239c6 in fuse_session_process () from /usr/lib/libfuse.so.2
> #7 0x0804abae in fuse_transport_notify (xl=0x8059910, event=2,
> data=0x8053410) at fuse-bridge.c:1942
> #8 0xb7f34cc7 in transport_notify (this=0xddb98dc, event=204699600) at
> transport.c:152
> #9 0xb7f35979 in sys_epoll_iteration (ctx=0xbfb56b14) at epoll.c:54
> #10 0xb7f34d9d in poll_iteration (ctx=0xbfb56b14) at transport.c:260
> #11 0x0804a29b in main (argc=5, argv=0xbfb56bf4) at glusterfs.c:348
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
--
Amar Tumballi
http://amar.80x25.org
[bulde on #gluster/irc.gnu.org]