[Bugs] [Bug 1218167] New: [GlusterFS 3.6.3]: Brick crashed after setting up SSL/TLS in I/O access path with error: "E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop"

bugzilla at redhat.com bugzilla at redhat.com
Mon May 4 10:50:36 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1218167

            Bug ID: 1218167
           Summary: [GlusterFS 3.6.3]: Brick crashed after setting up
                    SSL/TLS in I/O access path with error: "E
                    [socket.c:2495:socket_poller]
                    0-tcp.gluster-native-volume-3G-1-server: error in
                    polling loop"
           Product: GlusterFS
           Version: 3.6.3
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: ssamanta at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
A brick crashed on a node with the following error in the brick log:
"E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop"

Version-Release number of selected component (if applicable):
GlusterFS 3.6.3

[root at gqas006 ssl]# rpm -qa | grep gluster
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_hadoop-0.1-122.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_gluster_selfheal-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_bigtop-0.2.1-24.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_user_mapred_job-0.1-4.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_file_dir_permissions-0.1-9.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_home_dir_listing-0.1-5.noarch
glusterfs-libs-3.6.3-1.fc20.x86_64
glusterfs-geo-replication-3.6.3-1.fc20.x86_64
glusterfs-resource-agents-3.5.3-1.fc20.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_default_block_size-0.1-4.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_multiuser_support-0.1-4.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_multiple_volumes-0.1-18.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hive-0.1-12.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_gridmix3-0.1-2.noarch
glusterfs-devel-3.6.3-1.fc20.x86_64
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_common-0.2-119.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_gluster-0.2-78.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-glusterd_tests-0.2-1.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop-0.1-7.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_special_char_in_path-0.1-2.noarch
glusterfs-debuginfo-3.6.2-1.fc20.x86_64
glusterfs-hadoop-distribution-glusterfs-hadoop-test_dfsio_io_exception-0.1-8.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_ldap-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hadoop_hcfs_fileappend-0.1-5.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_missing_dirs_create-0.1-4.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_sqoop-0.1-2.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hadoop_hcfs_quota-0.1-6.noarch
glusterfs-3.6.3-1.fc20.x86_64
glusterfs-cli-3.6.3-1.fc20.x86_64
glusterfs-rdma-3.6.3-1.fc20.x86_64
glusterfs-hadoop-2.1.2-2.fc20.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hadoop_hcfs_testcli-0.2-7.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_dfsio-0.1-2.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_multifilewc_null_pointer_exception-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_gluster_quota_selfheal-0.2-11.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_append_to_file-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hbase-0.1-4.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_shim_access_error_messages-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_hadoop_mapreduce-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_mahout-0.1-6.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_erroneous_multivolume_filepaths-0.1-4.noarch
glusterfs-fuse-3.6.3-1.fc20.x86_64
glusterfs-server-3.6.3-1.fc20.x86_64
glusterfs-hadoop-javadoc-2.1.2-2.fc20.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_groovy_sync-0.1-24.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_rhs_georep-0.1-3.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_setting_working_directory-0.1-2.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_junit_shim-0.1-13.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-setup_hadoop_security-0.0.1-11.noarch
glusterfs-extra-xlators-3.6.3-1.fc20.x86_64
glusterfs-hadoop-distribution-glusterfs-hadoop-test_brick_sorted_order_of_filenames-0.1-2.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_fs_counters-0.1-11.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_generate_gridmix2_data-0.1-3.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_selinux_persistently_disabled-0.1-2.noarch
glusterfs-hadoop-distribution-glusterfs-hadoop-test_bigtop_pig-0.1-9.noarch
glusterfs-api-3.6.3-1.fc20.x86_64
glusterfs-api-devel-3.6.3-1.fc20.x86_64
[root at gqas006 ssl]# 

How reproducible:
I am not certain what caused the crash. I will add more details if I
reproduce it again.

Steps to Reproduce:
1. Create a 2x2 distributed-replicate volume and start it.
2. Create a private key and a public certificate file on each server and
client node.
3. Concatenate the certificates into a CA file and copy it to the server and
client nodes. Set the necessary volume options for SSL/TLS to work properly:
https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_ssl.md
4. Mount from the client
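
The steps above can be sketched as a dry-run command plan in Python. This is a minimal sketch, not the exact commands that were run: the file names under /etc/ssl, the 2048-bit key size, the use of the host name as certificate CN, and the client.ssl/server.ssl options are assumptions taken from the linked admin guide, and the helper names are hypothetical. Nothing is executed; the functions only build argument lists.

```python
SSL_DIR = "/etc/ssl"  # assumption: default location described in the admin guide


def node_commands(common_name):
    """Step 2: key and self-signed certificate, run on every server and client node."""
    key = f"{SSL_DIR}/glusterfs.key"
    cert = f"{SSL_DIR}/glusterfs.pem"
    return [
        ["openssl", "genrsa", "-out", key, "2048"],
        ["openssl", "req", "-new", "-x509", "-key", key,
         "-subj", f"/CN={common_name}", "-out", cert],
    ]


def build_ca(pem_texts):
    """Step 3: concatenate every node's certificate into one CA bundle (glusterfs.ca)."""
    return "".join(t if t.endswith("\n") else t + "\n" for t in pem_texts)


def volume_commands(volume):
    """Step 3: enable TLS on the I/O path for the volume (assumed option names)."""
    return [
        ["gluster", "volume", "set", volume, "client.ssl", "on"],
        ["gluster", "volume", "set", volume, "server.ssl", "on"],
    ]


def mount_command(server, volume, mount_point):
    """Step 4: native (FUSE) mount from the client."""
    return ["mount", "-t", "glusterfs", f"{server}:/{volume}", mount_point]


# Print the plan for one node and the affected volume; nothing is run.
plan = node_commands("gqas006") + volume_commands("gluster-native-volume-3G-1")
plan.append(mount_command("10.16.156.12", "gluster-native-volume-3G-1", "/mnt/gluster"))
for argv in plan:
    print(" ".join(argv))
```

The mount point /mnt/gluster is a placeholder; the report does not say where the volume was mounted.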

Actual results:
Bricks crashed on some nodes.

Expected results:
Bricks should not crash.

Additional info:

[root at remote-gluster-server ~]# gluster volume status
Status of volume: gluster-native-volume-1G-1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick 10.16.156.12:/rhs/brick1/newvol7            49159    Y    22418
Brick 10.16.156.15:/rhs/brick1/newvol7            49159    Y    6059
Brick 10.16.156.24:/rhs/brick1/newvol7            49170    Y    24581
Brick 10.16.156.24:/rhs/brick2/newvol7            49171    Y    24605
NFS Server on localhost                    2049    Y    24043
Self-heal Daemon on localhost                N/A    Y    24050
NFS Server on gqas006.sbu.lab.eng.bos.redhat.com    2049    Y    7212
Self-heal Daemon on gqas006.sbu.lab.eng.bos.redhat.com    N/A    Y    7219
NFS Server on gqas009.sbu.lab.eng.bos.redhat.com    2049    Y    26026
Self-heal Daemon on gqas009.sbu.lab.eng.bos.redhat.com    N/A    Y    26033

Task Status of Volume gluster-native-volume-1G-1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: gluster-native-volume-3G-1
Gluster process                        Port    Online    Pid
------------------------------------------------------------------------------
Brick 10.16.156.12:/rhs/brick1/newvol8            49160    Y    13626
Brick 10.16.156.15:/rhs/brick1/newvol8            N/A    N    31104  ---> Brick crashed
Brick 10.16.156.24:/rhs/brick1/newvol8            N/A    N    14854
Brick 10.16.156.24:/rhs/brick2/newvol8            49173    Y    14865
NFS Server on localhost                    2049    Y    24043
Self-heal Daemon on localhost                N/A    Y    24050
NFS Server on gqas006.sbu.lab.eng.bos.redhat.com    2049    Y    7212
Self-heal Daemon on gqas006.sbu.lab.eng.bos.redhat.com    N/A    Y    7219
NFS Server on gqas009.sbu.lab.eng.bos.redhat.com    2049    Y    26026
Self-heal Daemon on gqas009.sbu.lab.eng.bos.redhat.com    N/A    Y    26033

Task Status of Volume gluster-native-volume-3G-1
------------------------------------------------------------------------------
There are no active volume tasks
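
A minimal Python sketch for picking the offline bricks out of `gluster volume status` output like the above. It assumes the four-column row layout shown here (brick, port, online flag, pid); the function name and regular expression are illustrative and not part of any GlusterFS tool.

```python
import re

# Matches brick rows such as:
# "Brick 10.16.156.15:/rhs/brick1/newvol8   N/A   N   31104"
BRICK_ROW = re.compile(
    r"^Brick\s+(?P<brick>\S+:\S+)\s+(?P<port>\S+)\s+(?P<online>[YN])\s+(?P<pid>\S+)"
)


def offline_bricks(status_text):
    """Return (volume, brick) pairs whose Online column is 'N'."""
    volume = None
    offline = []
    for line in status_text.splitlines():
        if line.startswith("Status of volume:"):
            # Remember which volume the following brick rows belong to.
            volume = line.split(":", 1)[1].strip()
            continue
        match = BRICK_ROW.match(line)
        if match and match.group("online") == "N":
            offline.append((volume, match.group("brick")))
    return offline


SAMPLE = """\
Status of volume: gluster-native-volume-3G-1
Gluster process                        Port    Online    Pid
Brick 10.16.156.12:/rhs/brick1/newvol8            49160    Y    13626
Brick 10.16.156.15:/rhs/brick1/newvol8            N/A    N    31104
Brick 10.16.156.24:/rhs/brick1/newvol8            N/A    N    14854
Brick 10.16.156.24:/rhs/brick2/newvol8            49173    Y    14865
"""

for volume, brick in offline_bricks(SAMPLE):
    print(volume, brick)
```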

[root at remote-gluster-server ~]# yum info openssl
Installed Packages
Name        : openssl
Arch        : x86_64
Epoch       : 1
Version     : 1.0.1e
Release     : 42.fc20
Size        : 1.5 M
Repo        : installed
From repo   : fedora-updates
Summary     : Utilities from the general purpose cryptography library with TLS implementation
URL         : http://www.openssl.org/
License     : OpenSSL
Description : The OpenSSL toolkit provides support for secure communications between
            : machines. OpenSSL includes a certificate management tool and shared
            : libraries which provide various cryptographic algorithms and
            : protocols.

[root at remote-gluster-server ~]# 


[2015-04-29 09:32:46.921692] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
[2015-04-29 09:32:47.927424] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
[2015-04-29 09:32:49.084098] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
[2015-04-29 09:32:49.242428] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
[2015-04-29 09:32:50.089756] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
[2015-04-29 09:32:50.250215] E [socket.c:2495:socket_poller] 0-tcp.gluster-native-volume-3G-1-server: error in polling loop
pending frames:
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-04-29 09:32:51
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.3
pending frames:
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2015-04-29 09:32:51
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.3
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f2022cac362]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f2022cc385d]
/lib64/libc.so.6(+0x358f0)[0x7f2021cc68f0]
/lib64/libcrypto.so.10(sk_value+0x19)[0x7f20221323f9]
/lib64/libcrypto.so.10(+0x10126b)[0x7f202215026b]
/lib64/libcrypto.so.10(ASN1_item_ex_i2d+0x163)[0x7f2022154f03]
/lib64/libcrypto.so.10(+0x1061ff)[0x7f20221551ff]
/lib64/libcrypto.so.10(X509_NAME_cmp+0x5a)[0x7f202216963a]
/lib64/libcrypto.so.10(X509_check_issued+0x28)[0x7f202217b628]
/lib64/libcrypto.so.10(+0x11b8a5)[0x7f202216a8a5]
/lib64/libcrypto.so.10(X509_verify_cert+0xb4)[0x7f202216bfa4]
/lib64/libssl.so.10(ssl3_output_cert_chain+0x1a8)[0x7f2013bacb68]
/lib64/libssl.so.10(ssl3_send_server_certificate+0x35)[0x7f2013ba03d5]
/lib64/libssl.so.10(ssl3_accept+0xd1d)[0x7f2013ba184d]
/usr/lib64/glusterfs/3.6.3/rpc-transport/socket.so(+0x478a)[0x7f2013def78a]
/usr/lib64/glusterfs/3.6.3/rpc-transport/socket.so(+0x5e50)[0x7f2013df0e50]
/usr/lib64/glusterfs/3.6.3/rpc-transport/socket.so(+0xb159)[0x7f2013df6159]
/lib64/libpthread.so.0(+0x7ee5)[0x7f202243eee5]
/lib64/libc.so.6(clone+0x6d)[0x7f2021d85d1d]
---------
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f2022cac362]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f2022cc385d]
/lib64/libc.so.6(+0x358f0)[0x7f2021cc68f0]
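
A minimal Python sketch for reading a backtrace like the one above. It assumes glibc-style frames of the form `path(symbol+offset)[address]` and uses a simple heuristic: the frame directly below the first libc frame (the signal trampoline) is treated as the faulting call. The helper names are hypothetical.

```python
import re

# One backtrace frame: path(symbol+offset)[address]; the symbol may be empty.
FRAME = re.compile(
    r"^(?P<lib>/\S+?)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(trace_text):
    """Return (library, symbol or None) for every recognizable frame line."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match.group("lib"), match.group("symbol") or None))
    return frames


def faulting_frame(frames):
    """Heuristic: the frame right after the first libc frame is where the signal hit."""
    for index, (lib, _symbol) in enumerate(frames):
        if "/libc.so" in lib:
            return frames[index + 1] if index + 1 < len(frames) else None
    return None


SAMPLE = """\
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f2022cac362]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f2022cc385d]
/lib64/libc.so.6(+0x358f0)[0x7f2021cc68f0]
/lib64/libcrypto.so.10(sk_value+0x19)[0x7f20221323f9]
/lib64/libcrypto.so.10(+0x10126b)[0x7f202215026b]
/lib64/libcrypto.so.10(X509_verify_cert+0xb4)[0x7f202216bfa4]
"""

print(faulting_frame(parse_frames(SAMPLE)))
```

On the sample frames this points at `sk_value` in libcrypto, reached through `X509_verify_cert` during the TLS accept path shown in the full trace.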

sosreports:
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/sosreport-gqas006.sbu.lab.eng.bos.redhat.com-20150504043010.tar.xz
