[Gluster-devel] Segfault in read_ahead in 1.3.0_pre4

Anand Avati avati at zresearch.com
Thu May 24 12:20:29 UTC 2007


Harris,
  this bug was fixed a few days back, and the fix is available in the
latest checkout of the glusterfs--mainline--2.4 repository.

thanks,
avati

2007/5/24, Harris Landgarten <harrisl at lhjonline.com>:
> I am running glusterfs in a very basic configuration on Amazon EC2 instances. I have a 2-brick cluster and 2 clients. One of the clients is running Zimbra, and I am using the cluster as secondary storage for the mail store. I have repeatedly tried to reindex a mailbox with 31000 items; most of the email is on the cluster, and the entire process takes about 2 hours. Partway through I get at least one TCP disconnect, which seems random. With read_ahead enabled on the client, the disconnect results in a segfault and the mount point disappears. When I disabled read_ahead on the client, the disconnect was recovered from and the process completed. This is the backtrace from the read_ahead segfault:
>
> [May 23 20:00:37] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
> [May 23 20:00:37] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
> [May 23 20:00:37] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [May 23 20:00:37] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
> [May 23 20:00:37] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8078a08
> [May 23 20:00:37] [CRITICAL/common-utils.c:215/gf_print_trace()] debug-backtrace:Got signal (11), printing backtrace
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(gf_print_trace+0x2d) [0xb7f2584d]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[0xbfffe420]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so(ra_page_error+0x47) [0xb755e587]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so [0xb755ecf0]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/write-behind.so [0xb7561809]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/cluster/unify.so [0xb7564919]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so [0xb756d17b]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so [0xb75717a5]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(transport_notify+0x1d) [0xb7f26d2d]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xe7) [0xb7f279d7]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(poll_iteration+0x1d) [0xb7f26ddd]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:glusterfs [0x804a15e]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/lib/libc.so.6(__libc_start_main+0xdc) [0xb7dca8cc]
> [May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:glusterfs [0x8049e71]
> Segmentation fault (core dumped)
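
[A note on the `errno=115` in the `full_rw` lines above: on Linux that value is `EINPROGRESS`, i.e. the non-blocking socket operation had not completed when the transport was forcibly broken, so `full_rw` saw 0 bytes transferred. A quick way to confirm the mapping, assuming Linux errno numbering:]

```python
import errno
import os

# On Linux, errno 115 maps to EINPROGRESS ("Operation now in progress"),
# which is what a non-blocking connect/read reports before completion.
print(errno.errorcode[115])             # on Linux: 'EINPROGRESS'
print(os.strerror(errno.EINPROGRESS))   # human-readable description
```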
>
> This is a sample of the debug log with read_ahead turned off
>
> [May 24 05:35:05] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
> [May 24 05:35:05] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
> [May 24 05:35:05] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [May 24 05:35:05] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
> [May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
> [May 24 05:35:05] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
> [May 24 05:35:05] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp: :try_connect: socket fd = 4
> [May 24 05:35:05] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp: :try_connect: finalized on port `1022'
> [May 24 05:35:05] [DEBUG/tcp-client.c:226/tcp_connect()] tcp/client:try_connect: defaulting remote-port to 6996
> [May 24 05:35:05] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect on 4 in progress (non-blocking)
> [May 24 05:35:05] [DEBUG/tcp-client.c:301/tcp_connect()] tcp/client:connection on 4 still in progress - try later
> [May 24 05:35:05] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
> [May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
> [May 24 05:35:11] [DEBUG/tcp-client.c:310/tcp_connect()] tcp/client:connection on 4 success, attempting to handshake
> [May 24 05:35:11] [DEBUG/tcp-client.c:54/do_handshake()] transport/tcp-client:dictionary length = 50
> [May 24 07:20:10] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()] stat-prefetch:flush on: /
> [May 24 07:20:20] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()] stat-prefetch:flush on: /backups/sessions
> [May 24 07:57:12] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
> [May 24 07:57:12] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
> [May 24 07:57:12] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
> [May 24 07:57:12] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
> [May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
> [May 24 07:57:12] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
> [May 24 07:57:12] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp: :try_connect: socket fd = 4
> [May 24 07:57:12] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp: :try_connect: finalized on port `1023'
> [May 24 07:57:12] [DEBUG/tcp-client.c:226/tcp_connect()] tcp/client:try_connect: defaulting remote-port to 6996
> [May 24 07:57:12] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect on 4 in progress (non-blocking)
> [May 24 07:57:12] [DEBUG/tcp-client.c:301/tcp_connect()] tcp/client:connection on 4 still in progress - try later
> [May 24 07:57:12] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
> [May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
> [May 24 07:57:12] [DEBUG/tcp-client.c:310/tcp_connect()] tcp/client:connection on 4 success, attempting to handshake
> [May 24 07:57:12] [DEBUG/tcp-client.c:54/do_handshake()] transport/tcp-client:dictionary length = 50
>
> This is the client config with read_ahead
>
> ### Add client feature and attach to remote subvolume
> volume client1
>   type protocol/client
>   option transport-type tcp/client     # for TCP/IP transport
> # option ibv-send-work-request-size  131072
> # option ibv-send-work-request-count 64
> # option ibv-recv-work-request-size  131072
> # option ibv-recv-work-request-count 64
> # option transport-type ib-sdp/client  # for Infiniband transport
> # option transport-type ib-verbs/client # for ib-verbs transport
>   option remote-host xx.xxx.xx.xxx     # IP address of the remote brick
> # option remote-port 6996              # default server port is 6996
>
> # option transport-timeout 30          # seconds to wait for a reply
>                                        # from server for each request
>   option remote-subvolume brick        # name of the remote volume
> end-volume
>
> ### Add client feature and attach to remote subvolume
> volume client2
>   type protocol/client
>   option transport-type tcp/client     # for TCP/IP transport
> # option ibv-send-work-request-size  131072
> # option ibv-send-work-request-count 64
> # option ibv-recv-work-request-size  131072
> # option ibv-recv-work-request-count 64
> # option transport-type ib-sdp/client  # for Infiniband transport
> # option transport-type ib-verbs/client # for ib-verbs transport
>   option remote-host yy.yyy.yy.yyy     # IP address of the remote brick
> # option remote-port 6996              # default server port is 6996
>
> # option transport-timeout 30          # seconds to wait for a reply
>                                        # from server for each request
>   option remote-subvolume brick        # name of the remote volume
> end-volume
>
> volume bricks
>   type cluster/unify
>     subvolumes client1 client2
>     option scheduler alu
>     option alu.limits.min-free-disk 4GB
>     option alu.limits.max-open-files 10000
>
>     option alu.order disk-usage:read-usage:write-usage:open-files-usage
>     option alu.disk-usage.entry-threshold 2GB
>     option alu.disk-usage.exit-threshold 10GB
>     option alu.open-files-usage.entry-threshold 1024
>     option alu.open-files-usage.exit-threshold 32
>     option alu.stat-refresh.interval 10sec
>
> end-volume
> #
>
> ### Add writeback feature
> volume writeback
>   type performance/write-behind
>   option aggregate-size 131072 # unit in bytes
>   subvolumes bricks
> end-volume
>
> ### Add readahead feature
> volume readahead
>   type performance/read-ahead
>   option page-size 65536     # unit in bytes
>   option page-count 16       # cache per file  = (page-count x page-size)
>   subvolumes writeback
> end-volume
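
[The read-ahead cache implied by the volume above is per open file, as the config comment notes. Working out the numbers:]

```python
page_size = 65536      # option page-size, in bytes
page_count = 16        # option page-count
cache_per_file = page_count * page_size
print(cache_per_file)  # 1048576 bytes, i.e. 1 MiB of read-ahead cache per open file
```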
>
> ### Add stat-prefetch feature
> ### If you are not concerned about performance of interactive commands
> ### like "ls -l", you wouldn't need this translator.
> volume statprefetch
>    type performance/stat-prefetch
>    option cache-seconds 2   # timeout for stat cache
>    subvolumes readahead
> end-volume
>
> This is the brick config:
>
> ### Export volume "brick" with the contents of "/home/export" directory.
> volume brick
>   type storage/posix                   # POSIX FS translator
>   option directory /mnt/export        # Export this directory
> end-volume
>
> volume iothreads
>   type performance/io-threads
>   option thread-count 8
>   subvolumes brick
> end-volume
>
> ### Add network serving capability to above brick.
> volume server
>   type protocol/server
>   option transport-type tcp/server     # For TCP/IP transport
> # option ibv-send-work-request-size  131072
> # option ibv-send-work-request-count 64
> # option ibv-recv-work-request-size  131072
> # option ibv-recv-work-request-count 64
> # option transport-type ib-sdp/server  # For Infiniband transport
> # option transport-type ib-verbs/server # For ib-verbs transport
> # option bind-address 192.168.1.10     # Default is to listen on all interfaces
> # option listen-port 6996              # Default is 6996
> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
>   subvolumes iothreads
> # NOTE: Access to any volume through protocol/server is denied by
> # default. You need to explicitly grant access through the "auth"
> # option.
>   option auth.ip.brick.allow * # Allow access to "brick" volume
> end-volume
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>


-- 
Anand V. Avati




