[Gluster-users] Instable server with server/server encryption

Kaushal M kshlmster at gmail.com
Mon Dec 7 10:32:45 UTC 2015


On Mon, Dec 7, 2015 at 2:55 PM, Yannick Perret
<yannick.perret at liris.cnrs.fr> wrote:
> Hello,
>
> I'm having problems with glusterfs and server/server encryption.
>
> I have 2 servers (sto1 & sto2) with latest stable version (3.6.7-1 from
> gluster repo) on Debian 8.2 (amd64), with one single volume with
> replication.
>
> Without /var/lib/glusterd/secure-access all works as expected.
>

Enabling encryption requires a little more work before touching
/var/lib/glusterd/secure-access. I have written a blog post [1] which
should help with the steps for getting encryption working with
GlusterFS. Please check it out, and see if you've done everything
required.

[1] https://kshlm.in/network-encryption-in-glusterfs/
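
As a rough illustration of the "little more work" (this is only a sketch,
not taken from the blog post): before creating the secure-access file, each
server should already have its TLS material in place. The snippet below
assumes the usual default locations /etc/ssl/glusterfs.pem (certificate),
/etc/ssl/glusterfs.key (private key) and /etc/ssl/glusterfs.ca (CA file);
check those paths against your installation. The function names
missing_tls_files and enable_secure_access are made up for this example.

```shell
#!/bin/sh
# Pre-flight check before enabling management encryption.
# ASSUMPTION: the TLS files live at the default locations
# /etc/ssl/glusterfs.pem, /etc/ssl/glusterfs.key and /etc/ssl/glusterfs.ca.
# Both function names below are hypothetical helpers, not gluster commands.

# Print the name of every required TLS file that is missing or empty
# in the given directory (default: /etc/ssl).
missing_tls_files() {
    ssl_dir="${1:-/etc/ssl}"
    for name in glusterfs.pem glusterfs.key glusterfs.ca; do
        # -s is true only if the file exists and is non-empty
        [ -s "$ssl_dir/$name" ] || echo "$name"
    done
}

# Create the secure-access flag file only when no TLS file is missing.
# The working directory is a parameter so the function can be tried
# on a scratch directory first (default: /var/lib/glusterd).
enable_secure_access() {
    ssl_dir="${1:-/etc/ssl}"
    workdir="${2:-/var/lib/glusterd}"
    missing="$(missing_tls_files "$ssl_dir")"
    if [ -n "$missing" ]; then
        # $missing is left unquoted on purpose to print the names on one line
        echo "not enabling encryption, missing:" $missing >&2
        return 1
    fi
    touch "$workdir/secure-access"
}
```

The idea would be to run enable_secure_access on every server while
glusterd is stopped, and to start the service only once it succeeds
everywhere. It only checks that the files exist; it does not verify that
the certificates are valid or that the peers trust each other.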

> Then I shut down both servers (without any client mounting the volume),
> touch /var/lib/glusterd/secure-access on both servers, and start service on
> one of the servers:
> root@sto2:~# /etc/init.d/glusterfs-server stop
> [ ok ] Stopping glusterfs-server (via systemctl): glusterfs-server.service.
>
> I touch the file:
> root@sto2:~# touch /var/lib/glusterd/secure-access
>
> I start the service (the other server is still down):
> root@sto2:~# /etc/init.d/glusterfs-server start
> [ ok ] Starting glusterfs-server (via systemctl): glusterfs-server.service.
> root@sto2:~# ps aux | grep glus
> root     22538  1.3  0.4 402828 18668 ?        Ssl  10:07   0:00
> /usr/sbin/glusterd -p /var/run/glusterd.pid
> -> it is running.
>
> I check the pool:
> root@sto2:~# gluster pool list
> UUID                    Hostname              State
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    sto1...    Disconnected
> 8f51f101-254e-43f9-82a3-ec02591110b5    localhost Connected
>
> This is what is expected at this point.
> But now the gluster daemon is dead:
> root@sto2:~# gluster pool list
> Connection failed. Please check if gluster daemon is operational.
>
> I can stop and restart the service, and it dies after the 1st command,
> whatever the command (tested with 'gluster volume status', which answers
> 'Volume HOME is not started', the correct state since I stopped the only
> volume before activating server/server encryption).
>
> Note that at this point the other server is still down and no client is
> started.
> See at the end the "crash log" from the server.
>
>
> I guess it is not the expected behavior, and it is clearly a different
> behavior than without server/server encryption. For example if I remove the
> secure-access file:
>
> root@sto2:~# /etc/init.d/glusterfs-server stop
> [ ok ] Stopping glusterfs-server (via systemctl): glusterfs-server.service.
> root@sto2:~# rm /var/lib/glusterd/secure-access
> root@sto2:~# /etc/init.d/glusterfs-server start
> [ ok ] Starting glusterfs-server (via systemctl): glusterfs-server.service.
> root@sto2:~# gluster pool list
> UUID                    Hostname              State
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    sto1...    Disconnected
> 8f51f101-254e-43f9-82a3-ec02591110b5    localhost Connected
>
> And whatever I do the daemon is still alive and responding.
>
>
> Is this a bug, or did I miss something needed when moving to server/server
> encryption?
>
>
> Moreover, if I try to start the other server without performing any action on
> the 1st (to prevent the crash), I get a "ping-pong" crash (start on sto2, then
> start on sto1):
> root@sto2:~# /etc/init.d/glusterfs-server start
> [ ok ] Starting glusterfs-server (via systemctl): glusterfs-server.service.
> root@sto1:~# /etc/init.d/glusterfs-server start
> [ ok ] Starting glusterfs-server (via systemctl): glusterfs-server.service.
> root@sto1:~# gluster pool list
> UUID                    Hostname              State
> 8f51f101-254e-43f9-82a3-ec02591110b5    sto2.liris.cnrs.fr Disconnected
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    localhost Connected
> -> here the daemon is dead on sto2. Let's restart the sto2 daemon:
> root@sto2:~# /etc/init.d/glusterfs-server restart
> [ ok ] Restarting glusterfs-server (via systemctl):
> glusterfs-server.service.
> root@sto2:~# gluster pool list
> UUID                    Hostname              State
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    sto1.liris.cnrs.fr Disconnected
> 8f51f101-254e-43f9-82a3-ec02591110b5    localhost Connected
> -> here the daemon is dead on sto1.
> root@sto1:~# gluster pool list
> Connection failed. Please check if gluster daemon is operational.
>
>
> If I restart both daemons (mostly) at the same time it works fine:
> root@sto1:~# /etc/init.d/glusterfs-server restart
> [ ok ] Restarting glusterfs-server (via systemctl):
> glusterfs-server.service.
> root@sto2:~# /etc/init.d/glusterfs-server restart
> [ ok ] Restarting glusterfs-server (via systemctl): glusterfs-server.service
> root@sto1:~# gluster pool list
> UUID                    Hostname              State
> 8f51f101-254e-43f9-82a3-ec02591110b5    sto2.liris.cnrs.fr Connected
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    localhost Connected
> root@sto2:~# gluster pool list
> UUID                    Hostname              State
> 5fdb629d-886f-43cb-9a71-582051b0dbb2    sto1.liris.cnrs.fr Connected
> 8f51f101-254e-43f9-82a3-ec02591110b5    localhost Connected
>
>
> Of course this is not the expected behavior, as after a global shutdown the
> servers may not restart at the same time. Moreover, it is a real problem when
> shutting down a single server (e.g. for maintenance), as I get the
> "ping-pong" problem again.
>
>
> Any help would be appreciated.
>
> Note: before this, these 2 servers were used for testing replicated volumes
> (without encryption) without any problem.
>
> Regards,
> --
> Y.
>
> Log from sto2:
>
> cat /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>
> [2015-12-07 09:09:43.345640] I [MSGID: 100030] [glusterfsd.c:2035:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.7
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
> [2015-12-07 09:09:43.352452] I [glusterd.c:1214:init] 0-management: Maximum
> allowed open file descriptors set to 65536
> [2015-12-07 09:09:43.352516] I [glusterd.c:1259:init] 0-management: Using
> /var/lib/glusterd as working directory
> [2015-12-07 09:09:43.359063] I [socket.c:3880:socket_init]
> 0-socket.management: SSL support on the I/O path is ENABLED
> [2015-12-07 09:09:43.359102] I [socket.c:3883:socket_init]
> 0-socket.management: SSL support for glusterd is ENABLED
> [2015-12-07 09:09:43.359138] I [socket.c:3900:socket_init]
> 0-socket.management: using private polling thread
> [2015-12-07 09:09:43.361848] W [rdma.c:4440:__gf_rdma_ctx_create]
> 0-rpc-transport/rdma: rdma_cm event channel creation failed (Aucun
> périphérique de ce type)
> [2015-12-07 09:09:43.361885] E [rdma.c:4744:init] 0-rdma.management: Failed
> to initialize IB Device
> [2015-12-07 09:09:43.361902] E [rpc-transport.c:333:rpc_transport_load]
> 0-rpc-transport: 'rdma' initialization failed
> [2015-12-07 09:09:43.362023] W [rpcsvc.c:1524:rpcsvc_transport_create]
> 0-rpc-service: cannot create listener, initing the transport failed
> [2015-12-07 09:09:43.362267] I [socket.c:3883:socket_init]
> 0-socket.management: SSL support for glusterd is ENABLED
> [2015-12-07 09:09:46.812491] I
> [glusterd-store.c:2048:glusterd_restore_op_version] 0-glusterd: retrieved
> op-version: 30603
> [2015-12-07 09:09:47.192205] I
> [glusterd-handler.c:3179:glusterd_friend_add_from_peerinfo] 0-management:
> connect returned 0
> [2015-12-07 09:09:47.192321] I [rpc-clnt.c:969:rpc_clnt_connection_init]
> 0-management: setting frame-timeout to 600
> [2015-12-07 09:09:47.192564] I [socket.c:3880:socket_init] 0-management: SSL
> support on the I/O path is ENABLED
> [2015-12-07 09:09:47.192585] I [socket.c:3883:socket_init] 0-management: SSL
> support for glusterd is ENABLED
> [2015-12-07 09:09:47.192601] I [socket.c:3900:socket_init] 0-management:
> using private polling thread
> [2015-12-07 09:09:47.195831] E [socket.c:3016:socket_connect] 0-management:
> connection attempt on  failed, (Connexion refusée)
> [2015-12-07 09:09:47.196341] I [MSGID: 106004]
> [glusterd-handler.c:4398:__glusterd_peer_rpc_notify] 0-management: Peer
> 5fdb629d-886f-43cb-9a71-582051b0dbb2, in Peer in Cluster state, has
> disconnected from glusterd.
> [2015-12-07 09:09:47.196413] E [socket.c:384:ssl_setup_connection]
> 0-management: SSL connect error
> [2015-12-07 09:09:47.196480] E [socket.c:2386:socket_poller] 0-management:
> client setup failed
> [2015-12-07 09:09:47.196534] E [glusterd-utils.c:181:glusterd_unlock]
> 0-management: Cluster lock not held!
> [2015-12-07 09:09:47.196642] I [mem-pool.c:545:mem_pool_destroy]
> 0-management: size=588 max=0 total=0
> [2015-12-07 09:09:47.196671] I [mem-pool.c:545:mem_pool_destroy]
> 0-management: size=124 max=0 total=0
> [2015-12-07 09:09:47.196787] I [glusterd.c:146:glusterd_uuid_init]
> 0-management: retrieved UUID: 8f51f101-254e-43f9-82a3-ec02591110b5
> Final graph:
> +------------------------------------------------------------------------------+
>   1: volume management
>   2:     type mgmt/glusterd
>   3:     option transport.socket.ssl-enabled on
>   4:     option rpc-auth.auth-glusterfs on
>   5:     option rpc-auth.auth-unix on
>   6:     option rpc-auth.auth-null on
>   7:     option transport.socket.listen-backlog 128
>   8:     option ping-timeout 30
>   9:     option transport.socket.read-fail-log off
>  10:     option transport.socket.keepalive-interval 2
>  11:     option transport.socket.keepalive-time 10
>  12:     option transport-type rdma
>  13:     option working-directory /var/lib/glusterd
>  14: end-volume
>  15:
> +------------------------------------------------------------------------------+
> [2015-12-07 09:09:50.348636] E [socket.c:2859:socket_connect] (-->
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x147)[0x7f1b5a951497]
> (-->
> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.7/rpc-transport/socket.so(+0x6c32)[0x7f1b545c3c32]
> (-->
> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9)[0x7f1b5a723469]
> (-->
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xcd)[0x7f1b5a96b40d]
> (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f1b5a0e50a4] )))))
> 0-socket: invalid argument: this->private
> [2015-12-07 09:09:53.349724] E [socket.c:2859:socket_connect] (-->
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x147)[0x7f1b5a951497]
> (-->
> /usr/lib/x86_64-linux-gnu/glusterfs/3.6.7/rpc-transport/socket.so(+0x6c32)[0x7f1b545c3c32]
> (-->
> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9)[0x7f1b5a723469]
> (-->
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xcd)[0x7f1b5a96b40d]
> (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f1b5a0e50a4] )))))
> 0-socket: invalid argument: this->private
> [2015-12-07 09:09:55.604061] W
> [glusterd-op-sm.c:4073:glusterd_op_modify_op_ctx] 0-management: op_ctx
> modification failed
> [2015-12-07 09:09:55.604797] I
> [glusterd-handler.c:3836:__glusterd_handle_status_volume] 0-management:
> Received status volume req for volume HOME
> [2015-12-07 09:09:55.605488] E [glusterd-syncop.c:1184:gd_stage_op_phase]
> 0-management: Staging of operation 'Volume Status' failed on localhost :
> Volume HOME is not started
> [2015-12-07 09:09:47.196634] I [MSGID: 106004]
> [glusterd-handler.c:4398:__glusterd_peer_rpc_notify] 0-management: Peer
> 5fdb629d-886f-43cb-9a71-582051b0dbb2, in Peer in Cluster state, has
> disconnected from glusterd.
> pending frames:
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash:
> 2015-12-07 09:09:56
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.6.7
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb1)[0x7f1b5a9522a1]
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f1b5a96919d]
> /lib/x86_64-linux-gnu/libc.so.6(+0x35180)[0x7f1b5996e180]
> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_connect+0x8)[0x7f1b5a721f48]
> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_reconnect+0xb9)[0x7f1b5a723469]
> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_timer_proc+0xcd)[0x7f1b5a96b40d]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)[0x7f1b5a0e50a4]
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f1b59a1f04d]
> ---------
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
