[Gluster-devel] Slowness and segfault with 313

Harris Landgarten harrisl at lhjonline.com
Thu Jul 19 21:01:12 UTC 2007


Strange crash.

After restarting the client, running df -h crashes the client. If ls /mnt/glusterfs is run first, df -h runs fine. Verified on both clients.

Here is the backtrace from Ubuntu:

Program terminated with signal 11, Segmentation fault.
#0  data_to_ptr (data=0x8091e30) at dict.c:812
812     {
(gdb) bt
#0  data_to_ptr (data=0x8091e30) at dict.c:812
#1  0xb7faacc6 in default_statfs (frame=0x8091e30, this=0x80586c0, loc=0x8091cc4) at defaults.c:1001
#2  0xb7faacc6 in default_statfs (frame=0x8091da0, this=0x8058760, loc=0x8091cc4) at defaults.c:1001
#3  0x0804c19b in fuse_statfs (req=0x8091c68, ino=0) at fuse-bridge.c:1496
#4  0xb7f9ab12 in fuse_reply_statfs_compat () from /usr/lib/libfuse.so.2
#5  0xb7f9b1e3 in fuse_reply_entry () from /usr/lib/libfuse.so.2
#6  0xb7f9c9c6 in fuse_session_process () from /usr/lib/libfuse.so.2
#7  0x0804e019 in fuse_transport_notify (xl=0x8058c90, event=2, data=0x8052408) at fuse-bridge.c:2028
#8  0xb7fad7c7 in transport_notify (this=0x8057a7c, event=0) at transport.c:152
#9  0xb7fae239 in sys_epoll_iteration (ctx=0xbfecfec4) at epoll.c:54
#10 0xb7fad89d in poll_iteration (ctx=0xbfecfec4) at transport.c:260
#11 0x0804a36b in main (argc=6, argv=0xbfecffa4) at glusterfs.c:382

Harris

----- Original Message -----
From: "Anand Avati" <avati at zresearch.com>
To: "Harris Landgarten" <harrisl at lhjonline.com>
Cc: "Amar S. Tumballi" <amar at zresearch.com>, "gluster-devel" <gluster-devel at nongnu.org>
Sent: Thursday, July 19, 2007 4:17:39 PM (GMT-0500) America/New_York
Subject: Re: [Gluster-devel] Slowness and segfault with 313

Harris, 
both the slowness bug and the segfault you reported have been fixed in the latest TLA patchset. Please update and confirm that the fixes work for you. 

thanks, 
avati 


2007/7/17, Harris Landgarten <harrisl at lhjonline.com>:

Amar, 

Anything on this bug? If you can tell me the tla command to get specific patch levels, I will try to narrow the bug down further.

Harris 

----- Original Message ----- 
From: "Amar S. Tumballi" <amar at zresearch.com>
To: "Harris Landgarten" <harrisl at lhjonline.com>
Cc: "gluster-devel" <gluster-devel at nongnu.org>
Sent: Monday, July 16, 2007 12:31:19 AM (GMT-0500) America/New_York 
Subject: Re: [Gluster-devel] Slowness and segfault with 313 

Hi Harris, 
Thanks for narrowing the bugs down to patches 309-313. We are looking into it.

-amar 


On 7/15/07, Harris Landgarten <harrisl at lhjonline.com> wrote:

I just tested a full Zimbra backup of my mailbox with a 308 client and 313 bricks, and it completed in normal time with no errors. This is more evidence that the problem is client-only and was introduced in 309-313.

Harris 

----- Original Message ----- 
From: "Harris Landgarten" <harrisl at lhjonline.com>
To: "Harris Landgarten" <harrisl at lhjonline.com>
Cc: "gluster-devel" <gluster-devel at nongnu.org>
Sent: Sunday, July 15, 2007 8:30:15 AM (GMT-0500) America/New_York 
Subject: Re: [Gluster-devel] Slowness and segfault with 313 

Some more testing: 

Patch 308 

time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3 

real 6m47.947s 
user 0m0.180s 
sys 0m1.220s 

Patch 313 

time tar -cvf /mnt/glusterfs/test/test.tbz 0 1 2 3 

real 9m21.909s 
user 0m0.160s 
sys 0m1.470s 


Patch 313 also used 50% more memory. 
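
To put the two runs above on one scale, the wall-clock ("real") times can be converted to seconds and compared. The following is a minimal C sketch of that arithmetic only; the helper names to_seconds and slowdown are made up for illustration and are not part of GlusterFS.

```c
#include <assert.h>

/* Convert the "XmY.YYYs" value printed by the shell's `time` into seconds. */
static double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/*
 * Relative slowdown of `after` versus `before`, as a fraction
 * (0.25 means `after` took 25% longer than `before`).
 */
static double slowdown(double before, double after)
{
    assert(before > 0.0); /* a zero baseline would divide by zero */
    return (after - before) / before;
}
```

With the values above, to_seconds(6, 47.947) is 407.947 s and to_seconds(9, 21.909) is 561.909 s, so slowdown() gives about 0.377: patch 313 took roughly 38% longer than 308 on this test.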

This leads me to suspect the problem is in writebehind or posix-locks.

Harris 

----- Original Message ----- 
From: "Harris Landgarten" <harrisl at lhjonline.com>
To: "gluster-devel" <gluster-devel at nongnu.org>
Sent: Saturday, July 14, 2007 2:23:20 PM (GMT-0500) America/New_York 
Subject: [Gluster-devel] Slowness and segfault with 313 

Last weekend, with 299, a full Zimbra backup completed in 27 minutes. This weekend, with 313, the backup was only about halfway through after 5 hours. I aborted the backup and the client crashed. The abort would have tried to remove about 4 GB of files from the /mnt/glusterfs/backups/tmp folder. The following backtrace was generated from the core:

Core was generated by `[glusterfs] '. 
Program terminated with signal 11, Segmentation fault. 
#0 unify_unlink (frame=0xdec4b30, this=0x80585b0, loc=0xddb98dc) at unify.c:2256 
2256 list = loc->inode->private; 
(gdb) bt 
#0 unify_unlink (frame=0xdec4b30, this=0x80585b0, loc=0xddb98dc) at unify.c:2256 
#1 0xb7f32c66 in default_unlink (frame=0xdbc51f8, this=0x80592c0, loc=0xddb98dc) at defaults.c:480 
#2 0xb7f32c66 in default_unlink (frame=0xde06ca8, this=0x8059350, loc=0xddb98dc) at defaults.c:480 
#3 0x0804ccf3 in fuse_unlink (req=0xdf68980, par=4786914, name=0xcc16770 "BbZslzxnS2Y,8Az0m2v5ExLBXbs= 6411-6246.msg1") at fuse-bridge.c:781
#4 0xb7f21461 in fuse_reply_err () from /usr/lib/libfuse.so.2 
#5 0xb7f221e3 in fuse_reply_entry () from /usr/lib/libfuse.so.2 
#6 0xb7f239c6 in fuse_session_process () from /usr/lib/libfuse.so.2 
#7 0x0804abae in fuse_transport_notify (xl=0x8059910, event=2, data=0x8053410) at fuse-bridge.c:1942 
#8 0xb7f34cc7 in transport_notify (this=0xddb98dc, event=204699600) at transport.c:152
#9 0xb7f35979 in sys_epoll_iteration (ctx=0xbfb56b14) at epoll.c:54 
#10 0xb7f34d9d in poll_iteration (ctx=0xbfb56b14) at transport.c:260 
#11 0x0804a29b in main (argc=5, argv=0xbfb56bf4) at glusterfs.c:348 
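
The faulting line in frame #0 is `list = loc->inode->private;`, which dereferences two pointers in a row. The trace does not show which of them was invalid, so the following is only a sketch of a defensive check, assuming loc->inode was NULL at that point. The types inode_sketch and loc_sketch and the function get_inode_private are hypothetical stand-ins, not the real GlusterFS definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the real GlusterFS types. */
struct inode_sketch { void *private; };
struct loc_sketch   { const char *path; struct inode_sketch *inode; };

/*
 * Fetch the per-inode private data without dereferencing a NULL pointer.
 * Returns 0 and stores the pointer in *list_out on success, or -EINVAL
 * (with *list_out set to NULL) when loc or loc->inode is NULL.
 */
static int get_inode_private(const struct loc_sketch *loc, void **list_out)
{
    assert(list_out != NULL);

    if (loc == NULL || loc->inode == NULL) {
        *list_out = NULL;
        return -EINVAL;
    }

    *list_out = loc->inode->private;
    return 0;
}
```

A caller in the position of unify_unlink could check the return value and fail the unlink with an error rather than crash. Whether that matches the actual fix in the TLA patchset is not stated in this thread.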



_______________________________________________ 
Gluster-devel mailing list 
Gluster-devel at nongnu.org 
http://lists.nongnu.org/mailman/listinfo/gluster-devel 







-- 
Amar Tumballi 
http://amar.80x25.org 
[bulde on #gluster/irc.gnu.org] 





-- 
Anand V. Avati 




