[Gluster-users] 3.3.1 breaks NFS CARP setup

Jeff Darcy jdarcy at redhat.com
Mon Oct 22 14:08:12 UTC 2012


On 10/22/2012 09:42 AM, Dan Bretherton wrote:
> Dear All-
> I upgraded from 3.3.0 to 3.3.1 from the epel-glusterfs repository a few 
> days ago, but I discovered that NFS in the new version does not work 
> with virtual IP addresses managed by CARP.  NFS crashed as soon as an 
> NFS client made an attempt to mount a volume using a virtual IP address, 
> but mounting from a server's fixed IP address did not cause any 
> problems.  The following errors appeared in nfs.log every time NFS crashed.

To resolve this, we need to find out which code some of those addresses in the
backtrace really belong to. If you're comfortable with gdb, you could try the
following on one of the core files.

	$ gdb glusterfs core.XXXXXX
	(gdb) thread apply all bt full

It would be preferable to have this info in a bug report instead of email for
tracking purposes.

	https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

Unfortunately, the fact that we're hitting __stack_chk_fail suggests that we
might be dealing with stack corruption, so the actual culprit might not be the
code in the stack trace that was printed.  :(  In any case, this does seem like
something that should be addressed quickly, so the more information we can get
the better.  Thanks!
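
As a rough aid for spotting which frames in a trace like the one quoted below
have no symbol name (those are the addresses that still need resolving in gdb),
here is a minimal Python sketch. The function name, the sample text, and the
regular expression are illustrative assumptions based only on the frame format
visible in this log; none of this is part of GlusterFS.

```python
import re

# Assumed frame format, taken from the log in this thread:
#   /path/to/module(symbol+0xOFF)[0xADDR]   (symbol resolved)
#   /path/to/module[0xADDR]                 (no symbol resolved)
# An optional leading "> " (email quoting) is tolerated.
FRAME_RE = re.compile(
    r"^(?:>\s*)?(?P<module>/\S+?)"
    r"(?:\((?P<symbol>[^)]*)\))?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def unresolved_frames(log_text):
    """Return (module, address) pairs for frames that carry no symbol name."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and not match.group("symbol"):
            frames.append((match.group("module"), match.group("addr")))
    return frames


# Three frames copied from the quoted log, used here only as sample input.
SAMPLE = """\
> /lib64/libc.so.6(__stack_chk_fail+0x2f)[0x3b5a8e969f]
> /usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so[0x2aaaaba23f24]
> /usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x2bb)[0x2aef5ef0b22b]
"""

if __name__ == "__main__":
    for module, addr in unresolved_frames(SAMPLE):
        print(module, addr)
```

The frame in nfs/server.so directly above __stack_chk_fail is the one reported
without a symbol, which is why the full gdb backtrace is needed.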

> [2012-10-18 23:26:49.493953] E [nfs3.c:1409:nfs3_lookup] 0-nfs-nfsv3: 
> Volume is disabled: odin
> pending frames:
> 
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 6
> time of crash: 2012-10-18 23:26:49
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.3.1
> /lib64/libc.so.6[0x3b5a8302d0]
> /lib64/libc.so.6(gsignal+0x35)[0x3b5a830265]
> /lib64/libc.so.6(abort+0x110)[0x3b5a831d10]
> /lib64/libc.so.6[0x3b5a86a99b]
> /lib64/libc.so.6(__stack_chk_fail+0x2f)[0x3b5a8e969f]
> /usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so[0x2aaaaba23f24]
> /usr/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x2bb)[0x2aef5ef0b22b]
> /usr/lib64/libgfrpc.so.0(rpcsvc_notify+0x16c)[0x2aef5ef0b42c]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x27)[0x2aef5ef0ce17]
> /usr/lib64/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x2aaaaaab6c5f]
> /usr/lib64/glusterfs/3.3.1/rpc-transport/socket.so(socket_event_handler+0x188)[0x2aaaaaab6e08]
> /usr/lib64/libglusterfs.so.0[0x2aef5ecba781]
> /usr/sbin/glusterfs(main+0x502)[0x406712]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3b5a81d994]
> /usr/sbin/glusterfs[0x404439]
> ---------
> 
> A large core dump was also produced each time, usually about 800MB in 
> size. If anyone would be interested in looking at one of these I will 
> make it available for download from our web server.
> 
> I don't know if this can be called a bug since CARP is not a standard 
> GlusterFS feature, but I really would prefer to retain the measure of 
> resiliency that CARP provides.  Any suggestions would be much 
> appreciated.  I ended up rolling back to version 3.3.0 to get things 
> working again.
> 
> Incidentally, when I decided to downgrade to 3.3.0 I discovered that those 
> RPMs aren't available for download from http://download.glusterfs.org  
> or http://repos.fedorapeople.org/repos/kkeithle/glusterfs 
> (epel-glusterfs) any more.  I eventually found RPMs for version 3.3.0 by 
> Googling for the file names; they are here: 
> http://bits.gluster.com/gluster/glusterfs/3.3.0/x86_64/.  I don't know 
> if this is an official download site, but it might be worth putting a 
> link to it from the Gluster community web site in case anyone else needs 
> to downgrade.
> 
> -Dan.



