[Gluster-devel] Segfault in read_ahead in 1.3.0_pre4
Harris Landgarten
harrisl at lhjonline.com
Thu May 24 12:18:32 UTC 2007
I am running glusterfs in a very basic configuration on Amazon EC2 instances: a 2-brick cluster and 2 clients. One of the clients runs Zimbra, and I am using the cluster as secondary storage for the mail store. I have repeatedly tried to reindex a mailbox with 31,000 items, most of whose email lives on the cluster. The whole process takes about 2 hours, and part way through I get at least one TCP disconnect at what seems to be a random point.

With read_ahead enabled on the client, the disconnect results in a segfault and the mount point disappears. When I disabled read_ahead on the client, the disconnect was recovered from and the process completed. This is the backtrace from the read_ahead segfault:
[May 23 20:00:37] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 23 20:00:37] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[May 23 20:00:37] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 23 20:00:37] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
[May 23 20:00:37] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x8078a08
[May 23 20:00:37] [CRITICAL/common-utils.c:215/gf_print_trace()] debug-backtrace:Got signal (11), printing backtrace
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(gf_print_trace+0x2d) [0xb7f2584d]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[0xbfffe420]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so(ra_page_error+0x47) [0xb755e587]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/read-ahead.so [0xb755ecf0]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/performance/write-behind.so [0xb7561809]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/cluster/unify.so [0xb7564919]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so [0xb756d17b]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre4/xlator/protocol/client.so [0xb75717a5]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(transport_notify+0x1d) [0xb7f26d2d]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xe7) [0xb7f279d7]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(poll_iteration+0x1d) [0xb7f26ddd]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:glusterfs [0x804a15e]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/lib/libc.so.6(__libc_start_main+0xdc) [0xb7dca8cc]
[May 23 20:00:37] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:glusterfs [0x8049e71]
Segmentation fault (core dumped)
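Worth noting: each disconnect is immediately preceded by a "bailing transport" message from call_bail(), which looks like the client giving up on an unanswered request after the transport timeout expires, rather than the TCP session dropping on its own. If that is what is happening, one thing worth experimenting with (a guess at reducing the spurious disconnects, not a fix for the segfault itself) is uncommenting and raising the transport-timeout option in the client volumes of the spec below, e.g.:

volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host xx.xxx.xx.xxx   # IP address of the remote brick
  option remote-subvolume brick
  option transport-timeout 120       # hypothetical value, up from the 30s example in the spec
end-volume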
This is a sample of the debug log with read_ahead turned off:
[May 24 05:35:05] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 24 05:35:05] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[May 24 05:35:05] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 24 05:35:05] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
[May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
[May 24 05:35:05] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
[May 24 05:35:05] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp: :try_connect: socket fd = 4
[May 24 05:35:05] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp: :try_connect: finalized on port `1022'
[May 24 05:35:05] [DEBUG/tcp-client.c:226/tcp_connect()] tcp/client:try_connect: defaulting remote-port to 6996
[May 24 05:35:05] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect on 4 in progress (non-blocking)
[May 24 05:35:05] [DEBUG/tcp-client.c:301/tcp_connect()] tcp/client:connection on 4 still in progress - try later
[May 24 05:35:05] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 24 05:35:05] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
[May 24 05:35:11] [DEBUG/tcp-client.c:310/tcp_connect()] tcp/client:connection on 4 success, attempting to handshake
[May 24 05:35:11] [DEBUG/tcp-client.c:54/do_handshake()] transport/tcp-client:dictionary length = 50
[May 24 07:20:10] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()] stat-prefetch:flush on: /
[May 24 07:20:20] [DEBUG/stat-prefetch.c:58/stat_prefetch_cache_flush()] stat-prefetch:flush on: /backups/sessions
[May 24 07:57:12] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 24 07:57:12] [DEBUG/tcp.c:123/cont_hand()] tcp:forcing poll/read/write to break on blocked socket (if any)
[May 24 07:57:12] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 24 07:57:12] [DEBUG/protocol.c:244/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of header failed
[May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
[May 24 07:57:12] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
[May 24 07:57:12] [DEBUG/tcp-client.c:180/tcp_connect()] transport: tcp: :try_connect: socket fd = 4
[May 24 07:57:12] [DEBUG/tcp-client.c:202/tcp_connect()] transport: tcp: :try_connect: finalized on port `1023'
[May 24 07:57:12] [DEBUG/tcp-client.c:226/tcp_connect()] tcp/client:try_connect: defaulting remote-port to 6996
[May 24 07:57:12] [DEBUG/tcp-client.c:262/tcp_connect()] tcp/client:connect on 4 in progress (non-blocking)
[May 24 07:57:12] [DEBUG/tcp-client.c:301/tcp_connect()] tcp/client:connection on 4 still in progress - try later
[May 24 07:57:12] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 24 07:57:12] [DEBUG/client-protocol.c:2605/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x80783d0
[May 24 07:57:12] [DEBUG/tcp-client.c:310/tcp_connect()] tcp/client:connection on 4 success, attempting to handshake
[May 24 07:57:12] [DEBUG/tcp-client.c:54/do_handshake()] transport/tcp-client:dictionary length = 50
This is the client config with read_ahead enabled:
### Add client feature and attach to remote subvolume
volume client1
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
# option ibv-send-work-request-size 131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size 131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/client    # for Infiniband transport
# option transport-type ib-verbs/client  # for ib-verbs transport
  option remote-host xx.xxx.xx.xxx       # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a reply
#                                        # from server for each request
  option remote-subvolume brick          # name of the remote volume
end-volume
### Add client feature and attach to remote subvolume
volume client2
  type protocol/client
  option transport-type tcp/client       # for TCP/IP transport
# option ibv-send-work-request-size 131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size 131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/client    # for Infiniband transport
# option transport-type ib-verbs/client  # for ib-verbs transport
  option remote-host yy.yyy.yy.yyy       # IP address of the remote brick
# option remote-port 6996                # default server port is 6996
# option transport-timeout 30            # seconds to wait for a reply
#                                        # from server for each request
  option remote-subvolume brick          # name of the remote volume
end-volume
volume bricks
  type cluster/unify
  subvolumes client1 client2
  option scheduler alu
  option alu.limits.min-free-disk 4GB
  option alu.limits.max-open-files 10000
  option alu.order disk-usage:read-usage:write-usage:open-files-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold 10GB
  option alu.open-files-usage.entry-threshold 1024
  option alu.open-files-usage.exit-threshold 32
  option alu.stat-refresh.interval 10sec
end-volume
#
### Add writeback feature
volume writeback
  type performance/write-behind
  option aggregate-size 131072 # unit in bytes
  subvolumes bricks
end-volume
### Add readahead feature
volume readahead
  type performance/read-ahead
  option page-size 65536 # unit in bytes
  option page-count 16   # cache per file = (page-count x page-size)
  subvolumes writeback
end-volume
### Add stat-prefetch feature
### If you are not concerned about the performance of interactive commands
### like "ls -l", you wouldn't need this translator.
volume statprefetch
  type performance/stat-prefetch
  option cache-seconds 2 # timeout for stat cache
  subvolumes readahead
end-volume
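For reference, disabling read_ahead amounts to dropping the readahead volume out of the translator stack. A minimal sketch of that change, assuming everything else in the spec stays the same (this may not be exactly how it was disabled here):

### statprefetch attached directly to writeback, bypassing read-ahead
volume statprefetch
  type performance/stat-prefetch
  option cache-seconds 2   # timeout for stat cache
  subvolumes writeback     # was: subvolumes readahead
end-volume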
This is the brick config:
### Export volume "brick" with the contents of the "/mnt/export" directory.
volume brick
  type storage/posix           # POSIX FS translator
  option directory /mnt/export # Export this directory
end-volume
volume iothreads
  type performance/io-threads
  option thread-count 8
  subvolumes brick
end-volume
### Add network serving capability to above brick.
volume server
  type protocol/server
  option transport-type tcp/server       # for TCP/IP transport
# option ibv-send-work-request-size 131072
# option ibv-send-work-request-count 64
# option ibv-recv-work-request-size 131072
# option ibv-recv-work-request-count 64
# option transport-type ib-sdp/server    # for Infiniband transport
# option transport-type ib-verbs/server  # for ib-verbs transport
# option bind-address 192.168.1.10       # default is to listen on all interfaces
# option listen-port 6996                # default is 6996
# option client-volume-filename /etc/glusterfs/glusterfs-client.vol
  subvolumes iothreads
# NOTE: Access to any volume through protocol/server is denied by
# default. You need to explicitly grant access through the "auth" option.
  option auth.ip.brick.allow *           # Allow access to "brick" volume
end-volume
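One side note on the server spec: auth.ip.brick.allow * admits any host that can reach the port, which on EC2 may be wider than intended. A tighter sketch (address pattern hypothetical) would list only the client machines:

  option auth.ip.brick.allow 10.252.1.*  # hypothetical client subnet instead of *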