[Gluster-users] Mysterious volume unmounts

Jamie Lawrence jlawrence at squaretrade.com
Tue Nov 19 20:13:39 UTC 2019


Hello,

I have a bizarre situation, and cannot figure out what is going on. One Gluster volume will spontaneously unmount. This happens across multiple clients, but only with this volume - other gluster volumes from the same cluster mounted on the same guests do not have this problem.

This is Gluster 5.9, distributed-replicate 3x3, on CentOS 7.7. Clients are Ubuntu 16.04. The volume:

gluster volume create inf_prod_shared replica 3 \
  storage-1:/gluster-bricks/pool-1/inf_prod_shared \
  storage-2:/gluster-bricks/pool-1/inf_prod_shared \
  storage-3:/gluster-bricks/pool-1/inf_prod_shared \
  storage-4:/gluster-bricks/pool-1/inf_prod_shared \
  storage-5:/gluster-bricks/pool-1/inf_prod_shared \
  storage-6:/gluster-bricks/pool-1/inf_prod_shared \
  storage-7:/gluster-bricks/pool-1/inf_prod_shared \
  storage-8:/gluster-bricks/pool-1/inf_prod_shared \
  storage-9:/gluster-bricks/pool-1/inf_prod_shared

gluster v set inf_prod_shared rpc-auth-allow 10.1.2.190,10.1.2.191,10.1.2.192,10.1.2.193

gluster v quota inf_prod_shared enable
gluster v quota inf_prod_shared limit-usage / 1TB
gluster v set inf_prod_shared quota-deem-statfs on

The client logs show what seems to me like an orderly unmount:

[2019-11-18 19:52:16.301715] I [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount of /mnt/inf
[2019-11-18 19:52:16.302067] W [glusterfsd.c:1500:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f9d6a7c66ba] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xed) [0x564cfdec094d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x564cfdec07b4] ) 0-: received signum (15), shutting down
[2019-11-18 19:52:16.302093] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting '/mnt/inf'.
[2019-11-18 19:52:16.302104] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse connection to '/mnt/inf'.

Brick log from one of the servers:

[2019-11-18 19:52:16.305067] I [MSGID: 115036] [server.c:469:server_rpc_notify] 0-inf_prod_shared-server: disconnecting connection from CTX_ID:61d1fc2c-9c73-4ca0-a0e9-ad074b65b1a9-GRAPH_ID:0-PID:14661-HOST:inf-2-PC_NAME:inf_prod_shared-client-0-RECON_NO:-0

Nothing in glusterd.log for the same time.

I was curious where signal 15 (SIGTERM) was coming from, so I asked auditd to tell me:

time->Wed Nov 13 12:07:24 2019
node=inf-2 type=PROCTITLE msg=audit(1573675644.059:161681): proctitle=2F7573722F7362696E2F676C75737465726673002D2D70726F636573732D6E616D650066757365002D2D766F6C66696C652D7365727665723D7363352D73746F726167652D31002D2D766F6C66696C652D7365727665723D7363352D73746F726167652D32002D2D766F6C66696C652D69643D2F7363355F696E666F726D6174
node=inf-2 type=OBJ_PID msg=audit(1573675644.059:161681): opid=11946 oauid=12662 ouid=0 oses=3922 ocomm="glusterfs"
node=inf-2 type=SYSCALL msg=audit(1573675644.059:161681): arch=c000003e syscall=62 success=yes exit=0 a0=2eaa a1=f a2=2eaa a3=2 items=0 ppid=1 pid=11949 auid=12662 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=3922 comm="glfs_sigwait" exe="/usr/sbin/glusterfsd" key="jal_test"
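
For anyone reading the record above by hand, here is a minimal sketch of decoding its hex fields in POSIX shell. The function names decode_proctitle and hex_to_dec are made up for illustration and are not part of auditd or Gluster. Only the first part of the proctitle string is decoded here, to keep the example short.

```shell
# Decode an auditd hex-encoded proctitle (hypothetical helper).
# Arguments are NUL-separated in the record; NUL bytes are printed as spaces.
decode_proctitle() {
  for byte in $(printf '%s' "$1" | sed 's/../& /g'); do
    if [ "$byte" = 00 ]; then
      printf ' '
    else
      # Convert the hex pair to octal, which printf escapes support portably.
      printf "\\$(printf '%03o' "0x$byte")"
    fi
  done
  printf '\n'
}

# Convert a hexadecimal syscall argument (a0, a1, ...) to decimal
# (hypothetical helper).
hex_to_dec() {
  printf '%d\n' "0x$1"
}

# First three arguments of the proctitle from the record above.
decode_proctitle 2F7573722F7362696E2F676C75737465726673002D2D70726F636573732D6E616D650066757365
hex_to_dec 2eaa   # a0 of the syscall
hex_to_dec f      # a1 of the syscall
```

This prints "/usr/sbin/glusterfs --process-name fuse", then 11946 and 15. On x86_64, syscall 62 is kill, so the record describes a kill of PID 11946 (matching opid in the OBJ_PID line) with signal 15. Running ausearch with the -i option should give the same interpretation without manual decoding.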

So glusterfsd is telling glusterfs to exit. I just have no idea why.

The fact that only one volume of many has this trouble (and across multiple clients) makes me think it is a configuration problem, but there is nothing special about this volume: configuration-wise, it is identical to other volumes that are behaving normally. I could be searching on the wrong keywords, but I am not finding examples of this happening to others.

Even though these are controlled servers with very restricted access, I've dug through systemd's unit files, wondering if something bizarre ended up enabled, and checked cron jobs for anything that might unmount a volume. Nothing.

Correlating timing is inconclusive; sometimes one client will unmount while the others don't, sometimes more than one will unmount within a short time frame. I can't find anything external that would cause this, and am running out of ideas for things to check.
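
For lining up the unmount times across clients, something like the following sketch could pull the relevant timestamps out of each client log. The function name sigterm_times is made up for illustration, and it assumes the shutdown line has the same format as the "received signum (15)" line quoted earlier.

```shell
# Print the timestamp of every SIGTERM-triggered shutdown found in a
# glusterfs client log read from stdin (hypothetical helper).
sigterm_times() {
  grep -F 'received signum (15)' | sed 's/^\[\([^]]*\)\].*/\1/'
}
```

It would be run on each client as, for example, sigterm_times < /var/log/glusterfs/mnt-inf.log, where that log path is an assumption based on the /mnt/inf mount point. The resulting lists can then be compared side by side.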

Does anyone have any suggestions?

-j