[Gluster-devel] difficult bug in 2.5 mainline

Amar S. Tumballi amar at zresearch.com
Sun Jul 1 23:55:09 UTC 2007


Hi Harris,
This bug is fixed with the latest patch. I hope it also fixes the
'rm -rf' problem; please confirm.

I am looking into the other strange bug you reported.

-bulde

On 7/2/07, Harris Landgarten <harrisl at lhjonline.com> wrote:
>
> Disabling posix-locks changes the problem
>
> The client crashes along with the lock-server brick
>
> Here is the bt from the client:
>
> #0  unify_bg_cbk (frame=0xe080168, cookie=0xe1109c8, this=0x8057730,
> op_ret=0, op_errno=13) at unify.c:83
> 83          callcnt = --local->call_count;
> (gdb) bt
> #0  unify_bg_cbk (frame=0xe080168, cookie=0xe1109c8, this=0x8057730,
> op_ret=0, op_errno=13) at unify.c:83
> #1  0xb75b96e5 in client_unlink_cbk (frame=0xe1109c8, args=0x8059248) at
> client-protocol.c:2969
> #2  0xb75beff5 in notify (this=0x8057730, event=2, data=0x8095338) at
> client-protocol.c:4184
> #3  0xb7f73827 in transport_notify (this=0x0, event=235405672) at
> transport.c:152
> #4  0xb7f74299 in sys_epoll_iteration (ctx=0xbfb96248) at epoll.c:54
> #5  0xb7f738fd in poll_iteration (ctx=0xbfb96248) at transport.c:260
> #6  0x0804a170 in main (argc=6, argv=0xbfb96324) at glusterfs.c:341
> (gdb) print local
> $1 = (unify_local_t *) 0x0
>
> Harris
>
> ----- Original Message -----
> From: "Harris Landgarten" <harrisl at lhjonline.com>
> To: "gluster-devel" <gluster-devel at nongnu.org>
> Sent: Sunday, July 1, 2007 10:56:05 AM (GMT-0500) America/New_York
> Subject: [Gluster-devel] difficult bug in 2.5 mainline
>
> I am trying to track down a bug that is causing hangs in 2.5-patch-249 and
> all previous.
>
> This happens during a full Zimbra backup of certain accounts to
> /mnt/glusterfs/backups
>
> The first stage of the backup copies indexes and primary storage to
> /mnt/glusterfs/backups/tmp
> All of this data resides in local storage and the writing to gluster is
> successful.
>
> The next stage copies secondary storage to /mnt/glusterfs/backups/tmp
> This fails in the following way:
>
> Brick1 hangs with no errors
> Brick2 hangs with no errors
> Zimbra client hangs with no errors
> second client loses connectivity
>
> The second client bails after 2 min but cannot connect
> The Zimbra client never bails
>
> I then restart the bricks
>
> After both bricks are restarted, the second client reconnects and a hung
> df -h completes
>
> Zimbra client stays in a hung, unconnected state
>
> ls -l /mnt/glusterfs hangs
>
> The only way to reset is
>
> kill -9 $(pidof glusterfs)
> umount /mnt/glusterfs
>
> glusterfs
>
> Post mortem examination of /mnt/glusterfs/backups/tmp shows that only a few
> files have been written from the secondary storage volume. In this case over
> 15,000 files should have been written.
>
> Note: this only happens with large mailboxes containing some large (>10M) files.
>
> Note: with patch-247 the Zimbra client would segfault. With 249 it just
> hangs in an unrecoverable state.
>
>
> Harris
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>
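For readers following the backtrace quoted above: `print local` returns 0x0, so the decrement at unify.c:83 dereferences a NULL pointer. Below is a minimal sketch of a defensive guard for that kind of callback. The names sketch_local_t, sketch_frame_t and sketch_bg_cbk are hypothetical stand-ins invented for illustration. They are not the GlusterFS definitions, and this is not the patch that fixed the bug.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for per-call state; NOT the real unify_local_t. */
typedef struct {
    int call_count;            /* callbacks still outstanding */
} sketch_local_t;

/* Hypothetical stand-in for a call frame; NOT the real GlusterFS frame. */
typedef struct {
    sketch_local_t *local;     /* NULL if the state was never set or already freed */
} sketch_frame_t;

/*
 * Decrement the outstanding-call counter and return the new value.
 * Returns -1 instead of crashing when the frame carries no local state.
 */
int sketch_bg_cbk(sketch_frame_t *frame)
{
    if (frame == NULL || frame->local == NULL) {
        fprintf(stderr, "sketch_bg_cbk: no local state, ignoring callback\n");
        return -1;
    }
    return --frame->local->call_count;
}
```

A guard like this only turns the crash into a logged error. It does not explain why the callback ran on a frame with no local state, which is the underlying question the backtrace raises.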



-- 
Amar Tumballi
http://amar.80x25.org
[bulde on #gluster/irc.gnu.org]


