[Gluster-devel] pre6 hanging problems

August R. Wohlt glusterfs at isidore.net
Mon Jul 30 10:41:54 UTC 2007


Hi,

One more thing --

I noticed that the reason the entries in the clients' logs were
inconsistent (sometimes the "try again later" message, sometimes just the
disconnect message) is that the logs are sometimes flushed and sometimes
not. When I killed my server and restarted the tail -f on the client logs, I
saw that every time it's this "try again later" message:

2007-07-30 03:08:15 D [fuse-bridge.c:344:fuse_entry_cbk] glusterfs-fuse: ERR
=> -1 (107)
2007-07-30 03:08:15 D [inode.c:285:__destroy_inode] fuse/inode: destroy
inode(0)
2007-07-30 03:08:15 D [inode.c:357:__passive_inode] fuse/inode: passivating
inode(52759082), lru=59885/1024
2007-07-30 03:08:15 D [inode.c:327:__active_inode] fuse/inode: activating
inode(52759082), lru=59884/1024
2007-07-30 03:08:15 D [fuse-bridge.c:366:fuse_lookup] glusterfs-fuse: LOOKUP
52759082/rex (/20070730/home/rex)
2007-07-30 03:08:15 D [tcp-client.c:178:tcp_connect] brick_2: connection on
7 still in progress - try later
2007-07-30 03:08:15 W [client-protocol.c:344:client_protocol_xfer] brick_2:
not connected at the moment to submit frame type(0) op(34)
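
For reference, that "still in progress - try later" line is what a
non-blocking connect() looks like while the TCP handshake hasn't completed
yet. A minimal Python sketch of the same pattern (illustrative only, not
GlusterFS code):

```python
import errno
import socket

# Set up a local listener so the connect attempt has somewhere to go.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# A non-blocking connect() usually returns EINPROGRESS and has to be
# retried/polled later -- exactly the "try later" case in the log above.
# (On loopback it may also complete immediately and return 0.)
cli = socket.socket()
cli.setblocking(False)
rc = cli.connect_ex(srv.getsockname())
print(rc in (0, errno.EINPROGRESS))
```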

thanks,
:august

On 7/30/07, August R. Wohlt <glusterfs at isidore.net> wrote:
>
> Hi -
>
> So still the same problem with pre6 hanging regularly with or without gdb:
>
> To summarize:
>
> After starting 2 physically distinct clients (separate machines on a gigE
> switch with the server), each running pre6 on CentOS 5 against a pre6
> server also on CentOS 5, a heavy rsync to these mounts will regularly
> freeze up after a few hours. The clients have only write-behind enabled,
> and the server has no performance translators at all.
>
> When the clients' mounts hang, nothing at all shows up in the server log
> with debug turned on. The log simply goes along as usual and then stops:
>
> ...
> 2007-07-30 05:18:36 D [inode.c:285:__destroy_inode] brick_1/inode: destroy
> inode(0)
> 2007-07-30 05:18:36 D [inode.c:534:__create_inode] brick_1/inode: create
> inode(8979319)
> 2007-07-30 05:18:36 D [inode.c:327:__active_inode] brick_1/inode:
> activating inode(8979319), lru=1024/1024
> 2007-07-30 05:18:36 D [inode.c:357:__passive_inode] brick_1/inode:
> passivating inode(8979319), lru=1025/1024
> 2007-07-30 05:18:36 D [inode.c:285:__destroy_inode] brick_1/inode: destroy
> inode(5031361)
> 2007-07-30 05:18:36 D [inode.c:285:__destroy_inode] brick_1/inode: destroy
> inode(0)
>
>
> Both of the clients seem to have disconnected, because their logs show:
>
> ...
> 2007-07-29 21:45:18 D [inode.c:285:__destroy_inode] fuse/inode: destroy
> inode(46470909)
> 2007-07-29 21:45:18 D [inode.c:327:__active_inode] fuse/inode: activating
> inode(46470877), lru=236831/1024
> 2007-07-29 21:45:18 D [inode.c:285:__destroy_inode] fuse/inode: destroy
> inode(46470877)
> 2007-07-29 21:45:18 E [fuse-bridge.c:2017:fuse_transport_notify]
> glusterfs-fuse: got GF_EVENT_POLLERR
> 2007-07-29 21:45:18 D [fuse-bridge.c:1857:fuse_transport_disconnect]
> glusterfs-fuse: cleaning up fuse transport in disconnect handler
>
> ...
> 2007-07-29 21:45:13 D [inode.c:285:__destroy_inode] fuse/inode: destroy
> inode(13058343)
> 2007-07-29 21:45:13 D [inode.c:327:__active_inode] fuse/inode: activating
> inode(13058335), lru=330169/1024
> 2007-07-29 21:45:13 D [inode.c:285:__destroy_inode] fuse/inode: destroy
> inode(13058335)
> 2007-07-29 21:45:13 E [fuse-bridge.c:2017:fuse_transport_notify]
> glusterfs-fuse: got GF_EVENT_POLLERR
> 2007-07-29 21:45:13 D [fuse-bridge.c:1857:fuse_transport_disconnect]
> glusterfs-fuse: cleaning up fuse transport in disconnect handler
>
> Both of them raise SIGCONT in gdb, but they behave differently when I
> continue:
>
> The first one just loops on SIGCONT; the other gets a SIGPIPE:
>
> client1:
>
> (gdb) run
> Starting program: /usr/sbin/glusterfs -N -LDEBUG
> -l/var/log/glusterfs/glusterfs.log -f/etc/glusterfs/glusterfs-client.vol /backups
> [Thread debugging using libthread_db enabled]
> [New Thread -1209063744 (LWP 12708)]
> Detaching after fork from child process 12710.
> [New Thread 16427936 (LWP 12711)]
>
> Program received signal SIGCONT, Continued.
> [Switching to Thread 16427936 (LWP 12711)]
> 0x0032e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) c
> Continuing.
>
> Program received signal SIGCONT, Continued.
> 0x0032e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) bt
> #0  0x0032e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x005a3824 in raise () from /lib/tls/libpthread.so.0
> #2  0x0013bb0c in tcp_bail (this=0x947b3d0) at
> ../../../../transport/tcp/tcp.c:146
> #3  0x00122bbc in transport_bail (this=0x947b3d0) at transport.c:192
> #4  0x0012ea16 in call_bail (trans=0x947b3d0) at client-protocol.c:220
> #5  0x00123870 in gf_timer_proc (ctx=0xbfe97140) at timer.c:119
> #6  0x0059d3cc in start_thread () from /lib/tls/libpthread.so.0
> #7  0x00414c3e in clone () from /lib/tls/libc.so.6
> (gdb)
>
>
> client2:
>
> Starting program: /usr/sbin/glusterfs -f/usr/etc/glusterfs/glusterfs-client.vol
> -N -LDEBUG -l/var/log/glusterfs/glusterfs.log /backups
> [Thread debugging using libthread_db enabled]
> [New Thread -1208265024 (LWP 7022)]
> Detaching after fork from child process 7023.
> [New Thread 26753968 (LWP 7024)]
>
> Program received signal SIGCONT, Continued.
> [Switching to Thread 26753968 (LWP 7024)]
> 0x007167a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) c
> Continuing.
>
> Program received signal SIGPIPE, Broken pipe.
> [Switching to Thread -1208265024 (LWP 7022)]
> 0x007167a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> (gdb) bt
> #0  0x007167a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> #1  0x007ef544 in writev () from /lib/tls/libc.so.6
> #2  0x00f799a1 in full_rwv (fd=7, vector=0x20157, count=39, fn=0x7ef490
> <writev>) at common-utils.c:100
> #3  0x00962386 in tcp_client_writev (this=0x8ecf418, vector=0xbffe0fd0,
> count=55) at tcp-client.c:251
> #4  0x002a7b3b in client_protocol_xfer (frame=0xdd003b0, this=0x8e935c8,
> type=GF_OP_TYPE_FOP_REQUEST, op=GF_FOP_WRITE,
>     request=0xdea0028) at client-protocol.c:324
> #5  0x002a8ec5 in client_writev (frame=0xdd003b0, this=0x8e935c8,
> fd=0xc379bc8, vector=0xa8b05d0, count=32, offset=131072)
>     at client-protocol.c:1126
> #6  0x00efacf0 in wb_sync (frame=0xa8b057c, file=0xdb29698) at
> write-behind.c:180
> #7  0x00efb7eb in wb_writev (frame=0xdd003b0, this=0x8e97f80,
> fd=0xc379bc8, vector=0xbffe13a0, count=1, offset=258048)
>     at write-behind.c:603
> #8  0x0804d4e8 in fuse_write (req=0xc04dcd8, ino=53232572,
>     buf=0xb7b97048 "d6450 at mypacs.net>; Sun, 29 Jul 2007 08:34:58
> -0700\nFrom: \"Ebay Leader\" <DemetriusVenable at nosopoetic.com >\nTo:
> fubbed6450 at mypacs.net\nSubject: Ebay for dummies for Fubbed6450\nDate:
> Sun, 29 Jul 2007 07:28"..., size=4096,
>     off=258048, fi=0xffffffe0) at fuse-bridge.c:1235
> #9  0x00320c1a in do_write (req=0xc04dcd8, nodeid=53232572,
> inarg=0xb7b97030) at fuse_lowlevel.c:635
> #10 0x00321bd0 in fuse_ll_process (data=0x8e93718, buf=0xb7b97008 "@\020",
> len=4160, ch=0x8e936a8) at fuse_lowlevel.c:1171
> #11 0x00322ea3 in fuse_session_process (se=0xffffffe0, buf=0xb7b97008
> "@\020", len=4160, ch=0x8e936a8) at fuse_session.c:89
> #12 0x0804efb9 in fuse_transport_notify (xl=0x8e98020, event=2,
> data=0x8e933d0) at fuse-bridge.c:2050
> #13 0x00f7af6c in transport_notify (this=0x8e933d0, event=1) at
> transport.c:154
> #14 0x00f7b581 in sys_epoll_iteration (ctx=0xbffe1570) at epoll.c:54
> #15 0x00f7b10b in poll_iteration (ctx=0xbffe1570) at transport.c:262
> #16 0x0804a5e3 in main (argc=6, argv=0xbffe1644) at glusterfs.c:382
> (gdb) c
> Continuing.
>
> All commands touching the mountpoint are frozen when this happens, and
> when I run glusterfs without gdb it just freezes up and does not die.
>
> If I suspend the server and do a backtrace on it, it shows:
>
> (gdb) run
> Starting program: /usr/sbin/glusterfsd -N -LDEBUG
> -f/etc/glusterfs/glusterfs-server.vol -l/var/log/glusterfs/glusterfs.log
> [Thread debugging using libthread_db enabled]
> [New Thread 46912496205696 (LWP 22070)]
> glusterfsd: WARNING: ignoring stale pidfile for PID 22044
> [New Thread 1084229952 (LWP 22071)]
> [New Thread 1094719808 (LWP 22072)]
> [New Thread 1105209664 (LWP 22073)]
>
> Program received signal SIGTSTP, Stopped (user).
> [Switching to Thread 46912496205696 (LWP 22070)]
> 0x000000358400cb5b in __read_nocancel () from /lib64/libpthread.so.0
> (gdb) bt
> #0  0x000000358400cb5b in __read_nocancel () from /lib64/libpthread.so.0
> #1  0x00000036a800ed08 in full_rw (fd=8, buf=0x2aaaac155c90
> "00000007\n0000000b:00000006\nCALLER_PID", size=131287,
>     op=0x358400cb10 <read>) at common-utils.c:51
> #2  0x00000036a800ed7a in gf_full_read (fd=8, buf=0x2aaaac155c90
> "00000007\n0000000b:00000006\nCALLER_PID", size=131287)
>     at common-utils.c:71
> #3  0x00002aaaaaedce8f in tcp_recieve (this=0xd9f13c0, buf=0x2aaaac155c90
> "00000007\n0000000b:00000006\nCALLER_PID",
>     len=131287) at ../../../../transport/tcp/tcp.c:49
> #4  0x00000036a80104a8 in gf_block_unserialize_transport (trans=0xd9f13c0)
> at protocol.c:340
> #5  0x00002aaaaacd842c in notify (this=0xd9f7e10, event=2, data=0xd9f13c0)
> at server-protocol.c:5819
> #6  0x00000036a8010df2 in transport_notify (this=0xd9f13c0, event=1) at
> transport.c:154
> #7  0x00000036a80114e9 in epoll_notify (eevent=1, data=0xd9f13c0) at
> epoll.c:54
> #8  0x00000036a80117a7 in sys_epoll_iteration (ctx=0x7fff2458de70) at
> epoll.c:146
> #9  0x00000036a8010fc6 in poll_iteration (ctx=0x7fff2458de70) at
> transport.c:262
> #10 0x0000000000401a58 in main (argc=5, argv=0x7fff2458dfc8) at
> glusterfsd.c:309
> (gdb)
>
>
> When I kill the server and restart it, the client will show "Transport not
> connected" the first time the mount is accessed (e.g., with "ls"), but the
> second time it works and shows the directories.
>
> This happens with great regularity. Sometimes the message in the client
> logs is "connection on x still in progress - try later", and sometimes it's
> this disconnected message. Every time the cure is the same: restart the
> server, and the clients become happy and continue on their merry way
> again.
>
> I don't have any problem when I remove write-behind from the clients so
> that it's just plain vanilla client -> server, but then my throughput drops
> from 40MB/s to 150Kbps, which is not usable for me (or maybe I just haven't
> let it run long enough to hang). I have to use pre6 because I'm doing
> hardlinks on the mount, so I can't drop to an earlier version for
> stability.
>
> Because this happens with great regularity (~ every 6hrs), I'm more than
> happy to do any tracing/debugging that anyone can suggest. Otherwise, I'll
> try to start some debugging in the code myself when I have a few spare
> cycles (few and far between!).
>
> Again, this is a stock-kernel CentOS 5 installation, with pre6 glusterfs
> client and server, and fuse 2.7.0-glfs1.
>
> client configuration:
>
>  ### Add client feature and attach to remote subvolume
>  volume brick
>    type protocol/client
>    option transport-type tcp/client     # for TCP/IP transport
>    option remote-host 192.168.2.5       # IP address of the remote brick
>    option remote-subvolume brick_1  # name of the remote volume
>  end-volume
>
> # #### Add writeback feature
>  volume brick-wb
>    type performance/write-behind
>    option aggregate-size 131072 # unit in bytes
>    subvolumes brick
>  end-volume
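>
> (The write-behind translator above queues writes until aggregate-size
> bytes are pending and then pushes them downstream in one batch; the
> 32-buffer, 131072-byte writev in the client2 backtrace matches that. A toy
> sketch of the idea in Python, not the actual translator code:)

```python
# Toy model of a write-behind buffer: writes are queued until the
# aggregated size reaches a threshold (aggregate-size), then flushed
# downstream as one batch. Hypothetical sketch, not GlusterFS code.
class WriteBehind:
    def __init__(self, aggregate_size, flush_fn):
        self.aggregate_size = aggregate_size
        self.flush_fn = flush_fn
        self.pending = []
        self.pending_bytes = 0

    def write(self, data):
        self.pending.append(data)
        self.pending_bytes += len(data)
        if self.pending_bytes >= self.aggregate_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0

batches = []
wb = WriteBehind(131072, batches.append)
for _ in range(32):
    wb.write(b"x" * 4096)   # 32 * 4096 bytes = 131072 triggers one flush
print(len(batches), len(batches[0]))  # → 1 131072
```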
>
> server configuration:
>
>  ### Export volume "brick" with the contents of "/home/export" directory.
>  volume brick_1
>    type storage/posix
>    option directory /home/vg_3ware1/vivalog/brick_1
>  end-volume
>  volume brick_2
>    type storage/posix
>    option directory /home/vg_3ware1/vivalog/brick_2
>  end-volume
>
>  ### Add network serving capability to above brick.
>  volume server
>    type protocol/server
>    option transport-type tcp/server     # For TCP/IP transport
>    option bind-address 192.168.2.5     # Default is to listen on all
> interfaces
>    subvolumes brick_1
>    option auth.ip.brick_1.allow * # Allow access to "brick" volume
>  end-volume
>
>  ### Add network serving capability to above brick.
>  volume server
>    type protocol/server
>    option transport-type tcp/server     # For TCP/IP transport
>    option bind-address 192.168.2.5     # Default is to listen on all
> interfaces
>    subvolumes brick_2
>    option listen-port 6997
>    option auth.ip.brick_2.allow * # Allow access to "brick" volume
>  end-volume
>
> I tried a single "volume server" entry for both bricks, and I got the same
> thing. Also, I have an ssh connection between the two machines open all
> the time which never drops, so I'm ruling out network problems across the
> switch.
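>
> (For the record, the single "volume server" variant I tried looked roughly
> like this: one protocol/server volume exporting both bricks, with an auth
> line per brick. A sketch from memory, so treat it as illustrative:)

```
 volume server
   type protocol/server
   option transport-type tcp/server
   option bind-address 192.168.2.5
   subvolumes brick_1 brick_2
   option auth.ip.brick_1.allow *
   option auth.ip.brick_2.allow *
 end-volume
```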
>
> Help!
> :august
>
> On 7/26/07, August R. Wohlt <glusterfs at isidore.net> wrote:
> >
> > Hi avati,
> >
> > When I run it without gdb, it still has the same behavior. It'll run
> > fine for a few hours under load and then freeze. When it does, the client
> > spews these to the logs forever. When I kill glusterfs and remount the
> > directory, everything's fine again:
> >
> > 2007-07-26 12:21:31 D [fuse-bridge.c:344:fuse_entry_cbk] glusterfs-fuse:
> > ERR => -1 (107)
> > 2007-07-26 12:21:31 D [inode.c:285:__destroy_inode] fuse/inode: destroy
> > inode(0)
> > 2007-07-26 12:23:34 W [client-protocol.c:4158:client_protocol_reconnect] brick: attempting reconnect
> > 2007-07-26 12:23:34 D [tcp-client.c:178:tcp_connect] brick: connection
> > on 4 still in progress - try later
> > 2007-07-26 12:29:51 W [client-protocol.c:4158:client_protocol_reconnect]
> > brick: attempting reconnect
> > 2007-07-26 12:29:51 E [tcp-client.c:170:tcp_connect] brick: non-blocking
> > connect() returned: 110 (Connection timed out)
> >
> > :g
> >
> > On 7/26/07, Anand Avati <avati at zresearch.com> wrote:
> > >
> > > August,
> > >  It seems to me that you were running the client in GDB, and for some
> > > reason that particular client bailed out. While bailing out, the client
> > > raises SIGCONT, which was caught by gdb (gdb catches all signals before
> > > letting the signal handlers take over). The backtrace you have attached
> > > is NOT a crash; you just had to 'c' (continue) in gdb. Most likely this
> > > is also what gave the 'hung' effect.
> > > Is this reproducible for you?
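> > >
> > > (Side note: you can tell gdb to pass these signals straight through so
> > > they don't pause the process each time:)

```
(gdb) handle SIGCONT nostop noprint pass
(gdb) handle SIGPIPE nostop noprint pass
```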
> > >
> > > thanks,
> > > avati
> > >
> > > 2007/7/26, August R. Wohlt < glusterfs at isidore.net>:
> > > >
> > > > Hi all -
> > > >
> > > > I have a client and server set up with the pre6 version of glusterfs.
> > > > Several
> > > > times a day the client mount will freeze up as does any command that
> > > > tries
> > > > to read from the mountpoint. I have to kill the glusterfs process,
> > > > unmount
> > > > the directory and remount it to get it to work again.
> > > >
> > > > When this happens, there is another glusterfs client on another
> > > > machine connected to the same server that does not get disconnected.
> > > > So the timeout message in the logs is confusing to me. If the
> > > > connection were really timing out, wouldn't the other client be
> > > > disconnected, too?
> > > >
> > > > This is on CentOS 5 with fuse 2.7.0-glfs.
> > > >
> > > > When it happens, here's what shows up in the client:
> > > >
> > > > ...
> > > > 2007-07-25 09:45:59 D [inode.c:327:__active_inode] fuse/inode:
> > > > activating
> > > > inode(4210807), lru=0/1024
> > > > 2007-07-25 09:45:59 D [inode.c:285:__destroy_inode] fuse/inode:
> > > > destroy
> > > > inode(4210807)
> > > > 2007-07-25 12:37:26 W [client-protocol.c:211:call_bail] brick:
> > > > activating bail-out. pending frames = 1. last sent = 2007-07-25
> > > > 12:33:42. last received = 2007-07-25 11:42:59 transport-timeout = 120
> > > > 2007-07-25 12:37:26 C [client-protocol.c:219:call_bail] brick:
> > > > bailing
> > > > transport
> > > > 2007-07-25 12:37:26 W [client-protocol.c:4189:client_protocol_cleanup]
> > > > brick: cleaning up state in transport object 0x80a03d0
> > > > 2007-07-25 12:37:26 W [client-protocol.c:4238:client_protocol_cleanup]
> > > > brick: forced unwinding frame type(0) op(15)
> > > > 2007-07-25 12:37:26 C [tcp.c:81:tcp_disconnect] brick: connection
> > > > disconnected
> > > >
> > > > When it happens, here's what shows up in the server:
> > > >
> > > > 2007-07-25 15:37:40 E [protocol.c:346:gf_block_unserialize_transport]
> > > > libglusterfs/protocol: full_read of block failed: peer (192.168.2.3:1023)
> > > > 2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection
> > > > disconnected
> > > > 2007-07-25 15:37:40 E [protocol.c:251:gf_block_unserialize_transport]
> > > > libglusterfs/protocol: EOF from peer (192.168.2.4:1023)
> > > > 2007-07-25 15:37:40 C [tcp.c:81:tcp_disconnect] server: connection
> > > > disconnected
> > > >
> > > > And here's the client backtrace:
> > > >
> > > > (gdb) bt
> > > > #0  0x0032e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
> > > > #1  0x005a3824 in raise () from /lib/tls/libpthread.so.0
> > > > #2  0x00655b0c in tcp_bail (this=0x80a03d0) at
> > > > ../../../../transport/tcp/tcp.c:146
> > > > #3  0x00695bbc in transport_bail (this=0x80a03d0) at transport.c:192
> > > > #4  0x00603a16 in call_bail (trans=0x80a03d0) at client-protocol.c:220
> > > > #5  0x00696870 in gf_timer_proc (ctx=0xbffeec30) at timer.c:119
> > > > #6  0x0059d3cc in start_thread () from /lib/tls/libpthread.so.0
> > > > #7  0x00414c3e in clone () from /lib/tls/libc.so.6
> > > >
> > > >
> > > > client config:
> > > >
> > > > ### Add client feature and attach to remote subvolume
> > > > volume brick
> > > >    type protocol/client
> > > >    option transport-type tcp/client     # for TCP/IP transport
> > > >    option remote-host 192.168.2.5       # IP address of the remote
> > > > brick
> > > >    option remote-subvolume brick_1  # name of the remote volume
> > > > end-volume
> > > >
> > > > # #### Add writeback feature
> > > >   volume brick-wb
> > > >     type performance/write-behind
> > > >     option aggregate-size 131072 # unit in bytes
> > > >     subvolumes brick
> > > >   end-volume
> > > >
> > > > server config:
> > > >
> > > > ### Export volume "brick" with the contents of "/home/export"
> > > > directory.
> > > > volume brick_1
> > > >    type storage/posix
> > > >    option directory /home/vg_3ware1/vivalog/brick_1
> > > > end-volume
> > > >
> > > > volume brick_2
> > > >    type storage/posix
> > > >    option directory /home/vg_3ware1/vivalog/brick_2
> > > > end-volume
> > > >
> > > > ### Add network serving capability to above brick.
> > > > volume server
> > > >    type protocol/server
> > > >    option transport-type tcp/server     # For TCP/IP transport
> > > >    option bind-address 192.168.2.5     # Default is to listen on all
> > > > interfaces
> > > >    subvolumes brick_1
> > > >    option auth.ip.brick_2.allow * # Allow access to "brick" volume
> > > >    option auth.ip.brick_1.allow * # Allow access to "brick" volume
> > > > end-volume
> > > >
> > > > P.S. I have one server serving two volume bricks to two physically
> > > > distinct clients. I assume this is okay, and that I don't need two
> > > > separate server declarations.
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel at nongnu.org
> > > > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > >
> > >
> > >
> > > --
> > > Anand V. Avati
> >
> >
> >
>


