[Gluster-devel] GlusterFS 1.3 mainline 2.5-patch-317 crashes when second afr-node goes down

Anand Avati avati at zresearch.com
Thu Jul 19 06:44:31 UTC 2007


Urban,
  this bug fix will be available in mainline when glusterfs--tmp--2.5 is
merged (very soon).
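
For the archives: the backtrace below points at a getxattr callback that
touches its reply payload without checking op_ret first. Here is a minimal
sketch of that failure class and its guard -- hypothetical names only, not
the actual patch:

  /* Hypothetical sketch, not the real GlusterFS source: when the second
   * AFR child is down, the child call completes with op_ret = -1,
   * op_errno = 107 (ENOTCONN) and a NULL reply dictionary; a callback
   * that dereferences the dictionary before checking op_ret gets the
   * SIGSEGV seen in the trace below. */
  #include <stddef.h>

  typedef struct dict { int count; } dict_t;  /* stand-in for the real dict_t */

  int getxattr_cbk (int op_ret, int op_errno, dict_t *dict)
  {
          /* Buggy pattern -- faults when dict is NULL on error:
           *     return dict->count;                               */

          /* Guarded pattern -- propagate the error up the stack: */
          if (op_ret == -1 || dict == NULL)
                  return -op_errno;   /* -107: not connected */

          return dict->count;
  }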

thanks,
avati

2007/7/18, Urban Loesch <ul at enas.net>:
>
> Hi,
>
> I checked out the latest source and compiled it on Debian testing.
> I tried the two-server setup described at
>
> http://www.gluster.org/docs/index.php/GlusterFS_High_Availability_Storage_with_GlusterFS
>
> (thanks to Paul England).
>
> Attached you can find server and client configuration.
> - I use 3 servers with Debian testing installed
> - fuse version:
>   ii  fuse-utils    2.6.5-1   Filesystem in USErspace (utilities)
>   ii  libfuse-dev   2.6.5-1   Filesystem in USErspace (development files)
>   ii  libfuse2     2.6.5-1   Filesystem in USErspace library
>   fuse init (API version 7.8)
> - 2 servers for storage
> - 1 server as a client
>
> Everything works until I shut down the second storage server. If I then
> try to write a file or run ls on the client, the "glusterfsd" on server
> one crashes and the client gives me this error message:
> ls: /mnt/gluster/data1/: Transport endpoint is not connected
> ls: /mnt/gluster/data1/: Transport endpoint is not connected
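>
> Errno 107 in the server log below is ENOTCONN, i.e. exactly this
> "Transport endpoint is not connected" condition; a quick check:
>
>     #include <stdio.h>
>     #include <string.h>
>
>     int main (void)
>     {
>         /* errno 107 on Linux is ENOTCONN */
>         printf ("%d: %s\n", 107, strerror (107));
>         /* prints: 107: Transport endpoint is not connected */
>         return 0;
>     }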
>
> The log from server 1 is attached (error-server1.txt).
>
> Do you have any idea where the error could be?
> If you need further information, please let me know.
>
> Thanks and regards
> Urban Loesch
>
> 2007-07-18 16:50:14 E [tcp-client.c:170:tcp_connect] data1-gluster2-ds:
> non-blocking connect() returned: 111 (Connection refused)
> 2007-07-18 16:50:14 W [client-protocol.c:340:client_protocol_xfer]
> data1-gluster2-ds: not connected at the moment to submit frame type(0)
> op(22)
> 2007-07-18 16:50:14 D [tcp-client.c:70:tcp_connect] data1-gluster2-ds:
> socket fd = 3
> 2007-07-18 16:50:14 D [tcp-client.c:88:tcp_connect] data1-gluster2-ds:
> finalized on port `1023'
> 2007-07-18 16:50:14 D [tcp-client.c:109:tcp_connect] data1-gluster2-ds:
> defaulting remote-port to 6996
> 2007-07-18 16:50:14 D [tcp-client.c:141:tcp_connect] data1-gluster2-ds:
> connect on 3 in progress (non-blocking)
> 2007-07-18 16:50:14 E [tcp-client.c:170:tcp_connect] data1-gluster2-ds:
> non-blocking connect() returned: 111 (Connection refused)
> 2007-07-18 16:50:14 W [client-protocol.c:340:client_protocol_xfer]
> data1-gluster2-ds: not connected at the moment to submit frame type(0)
> op(20)
> 2007-07-18 16:50:14 E [afr.c:576:afr_getxattr_cbk] data1-ds-afr: (path=)
> op_ret=-1 op_errno=107
> 2007-07-18 16:50:14 C [common-utils.c:208:gf_print_trace] debug-backtrace:
> Got signal (11), printing backtrace
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(gf_print_trace+0x2b) [0xb7fc44fb]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /lib/i686/cmov/libc.so.6 [0xb7e7e208]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/server.so [0xb7607e1f]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0 [0xb7fc32a7]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/cluster/unify.so [0xb762410c]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/cluster/afr.so [0xb762cf4c]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/client.so [0xb7645da2]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/client.so [0xb763f160]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/client.so [0xb7640c9c]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/cluster/afr.so [0xb762d0da]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/cluster/unify.so(unify_getxattr+0x140)
> [0xb7624253]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(default_getxattr+0xe1) [0xb7fc338f]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/server.so [0xb760ce6f]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0 [0xb7fcc151]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(call_resume+0x33) [0xb7fce6b5]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/server.so [0xb760d067]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/server.so [0xb76114f1]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/glusterfs/1.3.pre6/xlator/protocol/server.so(notify+0xc9)
> [0xb7611e8b]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(transport_notify+0x62) [0xb7fc631f]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0 [0xb7fc6a28]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(sys_epoll_iteration+0x147) [0xb7fc6d0c]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /usr/lib/libglusterfs.so.0(poll_iteration+0x1d) [0xb7fc654c]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> [glusterfsd] [0x8049340]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> /lib/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7e6a030]
> 2007-07-18 16:50:14 C [common-utils.c:210:gf_print_trace] debug-backtrace:
> [glusterfsd] [0x8048d51]
>
>    ### Add client feature and attach to remote subvolume
>    volume gluster1
>      type protocol/client
>      option transport-type tcp/client     # for TCP/IP transport
>      option remote-host 10.137.252.137    # IP address of the remote brick
>      option remote-subvolume data1        # name of the remote volume
>    end-volume
>
>    volume gluster2
>      type protocol/client
>      option transport-type tcp/client     # for TCP/IP transport
>      option remote-host 10.137.252.138    # IP address of the remote brick
>      option remote-subvolume data1        # name of the remote volume
>    end-volume
>
>    ### Add writeback feature
>    volume writeback
>      type performance/write-behind
>      option aggregate-size 131072 # unit in bytes
>      subvolumes gluster1
>    end-volume
>
>    ### Add readahead feature
>    volume readahead
>      type performance/read-ahead
>      option page-size 65536     # unit in bytes
>      option page-count 16       # cache per file = page-count x page-size
>                                 #               = 16 x 65536 bytes = 1 MiB
>      subvolumes writeback
>    end-volume
>
>    volume data1-ds
>            type storage/posix                       # POSIX FS translator
>            option directory /glusterfs/data1        # Export this directory
>    end-volume
>
>    volume data1-ns
>            type storage/posix                       # POSIX FS translator
>            option directory /glusterfs/namespace1   # Export this directory
>    end-volume
>
>    volume data1-gluster1-ds
>            type protocol/client
>            option transport-type tcp/client
>            option remote-host 127.0.0.1
>            option remote-subvolume data1-ds
>    end-volume
>
>    volume data1-gluster1-ns
>            type protocol/client
>            option transport-type tcp/client
>            option remote-host 127.0.0.1
>            option remote-subvolume data1-ns
>    end-volume
>
>    volume data1-gluster2-ds
>            type protocol/client
>            option transport-type tcp/client
>            option remote-host 192.168.0.138
>            option remote-subvolume data1-ds
>    end-volume
>
>    volume data1-gluster2-ns
>            type protocol/client
>            option transport-type tcp/client
>            option remote-host 192.168.0.138
>            option remote-subvolume data1-ns
>    end-volume
>
> # Add AFR to Datastorage
>    volume data1-ds-afr
>            type cluster/afr
>            # There appears to be a bug with AFR and local posix volumes.
>            # To get around this we pretend the local volume is remote,
>            # using the extra client volume data1-gluster1-ds above.
>            subvolumes data1-gluster1-ds data1-gluster2-ds
>            option replicate *:2   # keep 2 copies of every file
>    end-volume
>
> # Add AFR to Namespacestorage
>    volume data1-ns-afr
>            type cluster/afr
>            # Same workaround as above: the local volume is reached
>            # through the extra client volume data1-gluster1-ns.
>            # subvolumes mailspool-ns mailspool-santa2-ns mailspool-santa3-ns
>            subvolumes data1-gluster1-ns data1-gluster2-ns
>            option replicate *:2   # keep 2 copies of every file
>    end-volume
>
> # Unify
>    volume data1-unify
>            type cluster/unify
>            subvolumes data1-ds-afr
>            option namespace data1-ns-afr
>            option scheduler rr
>    end-volume
>
> # Performance
>    volume data1
>            type performance/io-threads
>            option thread-count 8
>            option cache-size 64MB
>            subvolumes data1-unify
>    end-volume
>
>    ### Add network serving capability to the above brick.
>    volume server
>      type protocol/server
>      option transport-type tcp/server     # For TCP/IP transport
>      subvolumes data1
>      # Allow access to the exported volumes
>      option auth.ip.data1-ds.allow 192.168.0.*,127.0.0.1
>      option auth.ip.data1-ns.allow 192.168.0.*,127.0.0.1
>      option auth.ip.data1.allow *
>    end-volume
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
>


-- 
Anand V. Avati


