[Gluster-devel] Glusto failures with dispersed volumes + Samba
Henrik Kuhn
henrik.kuhn at gmail.com
Thu Jan 25 10:39:31 UTC 2018
Hi gluster-devel experts,
I've stumbled upon the very same issue/observations as the ones
described in
http://lists.gluster.org/pipermail/gluster-devel/2017-July/053234.html.
Nigel Babu and Pranith Kumar Karampuri told me that this SELinux-related
issue had also been fixed in >= v3.13.0.
The test setup is as follows:
OS: openSUSE Leap 42.3 (SELinux features installed/enabled)
SW: Gluster 3.13.1
- 3 Gluster nodes (gnode[1,2,3], with their own VLAN for Gluster communication)
- 1 virtualization server (snode) on which the test Samba server
instance runs under KVM/QEMU. Its disk image is provided by the
Gluster volume /vmvol/.
The Gluster volume /vol1/ is used for the Samba share.
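For context, the "1 x (4 + 2) = 6" layout of /vol1/ means 4 data bricks
plus 2 redundancy bricks per stripe. A back-of-the-envelope sketch of
that arithmetic (illustrative Python, not GlusterFS code; the per-brick
size is a made-up example value):

```python
# Disperse geometry of vol1: "1 x (4 + 2) = 6".
BRICKS = 6
REDUNDANCY = 2
DATA = BRICKS - REDUNDANCY          # 4 data fragments per stripe

# Any REDUNDANCY bricks may fail while the volume stays available:
failures_tolerated = REDUNDANCY     # -> 2

# Usable capacity is the data fraction of the raw capacity
# (assuming equal-sized bricks; 100 GiB here is hypothetical):
brick_size_gib = 100
usable_gib = DATA * brick_size_gib  # -> 400 GiB of 600 GiB raw

print(failures_tolerated, usable_gib)
```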
snode:~ # ssh gnode1 gluster vol info all
Volume Name: vmvol
Type: Replicate
Volume ID: a03b8fc1-4fcb-4268-bf09-0f554ba5e7a5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gs-gnode1:/data/glusterfs/vmvol/brick1/brick
Brick2: gs-gnode2:/data/glusterfs/vmvol/brick1/brick
Brick3: gs-gnode3:/data/glusterfs/vmvol/brick1/brick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
server.allow-insecure: on
storage.owner-uid: 500
storage.owner-gid: 500
Volume Name: vol1
Type: Disperse
Volume ID: fb081b58-bffc-4ddd-bf62-a87a13abec9b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: gs-gnode1:/data/glusterfs/vol1/brick1/brick
Brick2: gs-gnode2:/data/glusterfs/vol1/brick1/brick
Brick3: gs-gnode3:/data/glusterfs/vol1/brick1/brick
Brick4: gs-gnode1:/data/glusterfs/vol1/brick2/brick
Brick5: gs-gnode2:/data/glusterfs/vol1/brick2/brick
Brick6: gs-gnode3:/data/glusterfs/vol1/brick2/brick
Options Reconfigured:
performance.readdir-ahead: on
storage.batch-fsync-delay-usec: 0
performance.stat-prefetch: off
nfs.disable: on
transport.address-family: inet
server.allow-insecure: on
The samba server config is:
samba:~ # cat /etc/samba/smb.conf
[global]
workgroup = TESTGROUP
server string = Samba Server Version %v
log file = /var/log/samba/log.%m
log level = 2
map to guest = Bad User
unix charset = UTF-8
idmap config * : backend = autorid
idmap config * : range = 1000000-1999999
[gluster]
kernel share modes = no
vfs objects = acl_xattr glusterfs
comment = glusterfs based volume
browseable = Yes
read only = No
writeable = Yes
public = Yes
guest ok = Yes
inherit acls = Yes
path = /shares/data/
glusterfs:volume = vol1
glusterfs:volfile_server = gs-gnode1.origenis.de
gs-gnode2.origenis.de gs-gnode3.origenis.de
# glusterfs:volfile_server = 172.17.20.1 172.17.20.2 172.17.20.3
glusterfs:loglevel = 9
glusterfs:logfile = /var/log/samba/glusterfs-vol1.%M.log
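Since smb.conf is INI-style, the share section can be sanity-checked
offline with a few lines of Python (a sketch only; the embedded fragment
is abridged from the config above, and this of course does not validate
Samba semantics, just the key/value structure):

```python
import configparser

# Abridged copy of the share definition from smb.conf above.
SMB_CONF = """\
[global]
workgroup = TESTGROUP
map to guest = Bad User

[gluster]
kernel share modes = no
vfs objects = acl_xattr glusterfs
path = /shares/data/
glusterfs:volume = vol1
glusterfs:loglevel = 9
"""

# Use '=' as the only delimiter so "glusterfs:volume"-style keys survive
# (the default delimiters include ':', which would mis-split them).
cp = configparser.ConfigParser(delimiters=("=",))
cp.read_string(SMB_CONF)

share = cp["gluster"]
assert share["glusterfs:volume"] == "vol1"
# The glusterfs VFS object must be loaded for glusterfs:* options to apply.
assert "glusterfs" in share["vfs objects"].split()
```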
The corresponding logs on the samba server are:
* The vfs_gluster log shows lots of:
[2018-01-15 15:27:49.349995] D
[logging.c:1817:__gf_log_inject_timer_event] 0-logging-infra: Starting
timer now. Timeout = 120, current buf size = 5
[2018-01-15 15:27:49.351598] D
[rpc-clnt.c:1047:rpc_clnt_connection_init] 0-gfapi: defaulting
frame-timeout to 30mins
[2018-01-15 15:27:49.351625] D
[rpc-clnt.c:1061:rpc_clnt_connection_init] 0-gfapi: disable ping-timeout
[2018-01-15 15:27:49.351644] D [rpc-transport.c:279:rpc_transport_load]
0-rpc-transport: attempt to load file
/usr/lib64/glusterfs/3.13.1/rpc-transport/socket.so
[2018-01-15 15:27:49.352372] W [MSGID: 101002]
[options.c:995:xl_opt_validate] 0-gfapi: option 'address-family' is
deprecated, preferred is 'transport.address-family', continuing with
correction
[2018-01-15 15:27:49.352402] T [MSGID: 0]
[options.c:86:xlator_option_validate_int] 0-gfapi: no range check
required for 'option remote-port 24007'
[2018-01-15 15:27:49.354865] D [socket.c:4236:socket_init] 0-gfapi:
Configued transport.tcp-user-timeout=0
[2018-01-15 15:27:49.354885] D [socket.c:4254:socket_init] 0-gfapi:
Reconfigued transport.keepalivecnt=9
[2018-01-15 15:27:49.354901] D [socket.c:4339:socket_init] 0-gfapi: SSL
support on the I/O path is NOT enabled
[2018-01-15 15:27:49.354914] D [socket.c:4342:socket_init] 0-gfapi: SSL
support for glusterd is NOT enabled
[2018-01-15 15:27:49.354926] D [socket.c:4359:socket_init] 0-gfapi:
using system polling thread
[2018-01-15 15:27:49.354943] D
[rpc-clnt.c:1567:rpcclnt_cbk_program_register] 0-gfapi: New program
registered: GlusterFS Callback, Num: 52743234, Ver: 1
[2018-01-15 15:27:49.354958] T [rpc-clnt.c:406:rpc_clnt_reconnect]
0-gfapi: attempting reconnect
[2018-01-15 15:27:49.354971] T [socket.c:3146:socket_connect] 0-gfapi:
connecting 0x55e1aca9f9b0, state=0 gen=0 sock=-1
[2018-01-15 15:27:49.354991] T [MSGID: 0]
[common-utils.c:306:gf_resolve_ip6] 0-resolver: DNS cache not present,
freshly probing hostname: gs-gnode1.origenis.de
[2018-01-15 15:27:49.360012] D [MSGID: 0]
[common-utils.c:346:gf_resolve_ip6] 0-resolver: returning ip-172.17.20.1
(port-24007) for hostname: gs-gnode1.origenis.de and port: 24007
[2018-01-15 15:27:49.360039] D [socket.c:3063:socket_fix_ssl_opts]
0-gfapi: disabling SSL for portmapper connection
[2018-01-15 15:27:49.360068] W [socket.c:3216:socket_connect] 0-gfapi:
Error disabling sockopt IPV6_V6ONLY: "Protocol not available"
[2018-01-15 15:27:49.360086] T [socket.c:834:__socket_nodelay] 0-gfapi:
NODELAY enabled for socket 40
[2018-01-15 15:27:49.360109] T [socket.c:922:__socket_keepalive]
0-gfapi: Keep-alive enabled for socket: 40, (idle: 20, interval: 2,
max-probes: 9, timeout: 0)
[2018-01-15 15:27:49.360776] T [rpc-clnt.c:1496:rpc_clnt_record]
0-gfapi: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2018-01-15 15:27:49.360811] T
[rpc-clnt.c:1353:rpc_clnt_record_build_header] 0-rpc-clnt: Request
fraglen 140, payload: 76, rpc hdr: 64
[2018-01-15 15:27:49.360856] T [rpc-clnt.c:1699:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x2 Program: GlusterFS Handshake,
ProgVers: 2, Proc: 2) to rpc-transport (gfapi)
[2018-01-15 15:27:49.360875] D [rpc-clnt-ping.c:317:rpc_clnt_start_ping]
0-gfapi: ping timeout is 0, returning
[2018-01-15 15:27:49.362364] T [rpc-clnt.c:675:rpc_clnt_reply_init]
0-gfapi: received rpc message (RPC XID: 0x2 Program: GlusterFS
Handshake, ProgVers: 2, Proc: 2) from rpc-transport (gfapi)
[2018-01-15 15:27:49.362932] T [MSGID: 0] [graph.y:187:new_volume]
0-parser: New node for 'vol1-client-0'
....
....
[2018-01-15 15:27:49.386102] T [socket.c:922:__socket_keepalive]
0-vol1-client-5: Keep-alive enabled for socket: 47, (idle: 20, interval:
2, max-probes: 9, timeout: 0)
Final graph:
+------------------------------------------------------------------------------+
1: volume vol1-client-0
2: type protocol/client
3: option ping-timeout 42
4: option remote-host gs-gnode1
5: option remote-subvolume /data/glusterfs/vol1/brick1/brick
6: option transport-type socket
7: option transport.address-family inet
8: option transport.tcp-user-timeout 0
9: option transport.socket.keepalive-time 20
10: option transport.socket.keepalive-interval 2
11: option transport.socket.keepalive-count 9
12: option send-gids true
13: end-volume
14:
15: volume vol1-client-1
16: type protocol/client
17: option ping-timeout 42
18: option remote-host gs-gnode2
19: option remote-subvolume /data/glusterfs/vol1/brick1/brick
20: option transport-type socket
21: option transport.address-family inet
22: option transport.tcp-user-timeout 0
23: option transport.socket.keepalive-time 20
24: option transport.socket.keepalive-interval 2
25: option transport.socket.keepalive-count 9
26: option send-gids true
27: end-volume
28:
29: volume vol1-client-2
30: type protocol/client
31: option ping-timeout 42
32: option remote-host gs-gnode3
33: option remote-subvolume /data/glusterfs/vol1/brick1/brick
34: option transport-type socket
35: option transport.address-family inet
36: option transport.tcp-user-timeout 0
37: option transport.socket.keepalive-time 20
38: option transport.socket.keepalive-interval 2
39: option transport.socket.keepalive-count 9
40: option send-gids true
41: end-volume
42:
43: volume vol1-client-3
44: type protocol/client
45: option ping-timeout 42
46: option remote-host gs-gnode1
47: option remote-subvolume /data/glusterfs/vol1/brick2/brick
48: option transport-type socket
49: option transport.address-family inet
50: option transport.tcp-user-timeout 0
51: option transport.socket.keepalive-time 20
52: option transport.socket.keepalive-interval 2
53: option transport.socket.keepalive-count 9
54: option send-gids true
55: end-volume
56:
57: volume vol1-client-4
58: type protocol/client
59: option ping-timeout 42
60: option remote-host gs-gnode2
61: option remote-subvolume /data/glusterfs/vol1/brick2/brick
62: option transport-type socket
63: option transport.address-family inet
64: option transport.tcp-user-timeout 0
65: option transport.socket.keepalive-time 20
66: option transport.socket.keepalive-interval 2
67: option transport.socket.keepalive-count 9
68: option send-gids true
69: end-volume
70:
71: volume vol1-client-5
72: type protocol/client
73: option ping-timeout 42
74: option remote-host gs-gnode3
75: option remote-subvolume /data/glusterfs/vol1/brick2/brick
76: option transport-type socket
77: option transport.address-family inet
78: option transport.tcp-user-timeout 0
79: option transport.socket.keepalive-time 20
80: option transport.socket.keepalive-interval 2
81: option transport.socket.keepalive-count 9
82: option send-gids true
83: end-volume
84:
85: volume vol1-disperse-0
86: type cluster/disperse
87: option redundancy 2
88: subvolumes vol1-client-0 vol1-client-1 vol1-client-2
vol1-client-3 vol1-client-4 vol1-client-5
89: end-volume
90:
91: volume vol1-dht
92: type cluster/distribute
93: option lock-migration off
94: subvolumes vol1-disperse-0
95: end-volume
96:
97: volume vol1-write-behind
98: type performance/write-behind
99: subvolumes vol1-dht
100: end-volume
101:
102: volume vol1-read-ahead
103: type performance/read-ahead
104: subvolumes vol1-write-behind
105: end-volume
106:
107: volume vol1-readdir-ahead
108: type performance/readdir-ahead
109: option parallel-readdir off
[2018-01-15 15:27:49.387104] T [rpc-clnt.c:675:rpc_clnt_reply_init]
0-vol1-client-0: received rpc message (RPC XID: 0x2 Program: GF-DUMP,
ProgVers: 1, Proc: 1) from rpc-transport (vol1-client-0)
....
....
[2018-01-15 15:27:49.390946] T [rpc-clnt.c:1699:rpc_clnt_submit]
0-rpc-clnt: submitted request (XID: 0x3 Program: PORTMAP, ProgVers: 1,
Proc: 1) to rpc-transport (vol1-client-1)
110: option rda-request-size 131072
111: option rda-cache-limit 10MB
112: subvolumes vol1-read-ahead
113: end-volume
114:
115: volume vol1-io-cache
116: type performance/io-cache
117: subvolumes vol1-readdir-ahead
118: end-volume
119:
120: volume vol1-quick-read
121: type performance/quick-read
122: subvolumes vol1-io-cache
123: end-volume
124:
125: volume vol1-open-behind
126: type performance/open-behind
127: subvolumes vol1-quick-read
128: end-volume
129:
130: volume vol1-io-threads
131: type performance/io-threads
132: subvolumes vol1-open-behind
133: end-volume
134:
135: volume vol1
136: type debug/io-stats
137: option log-level INFO
138: option latency-measurement off
139: option count-fop-hits off
140: subvolumes vol1-io-threads
141: end-volume
142:
143: volume meta-autoload
144: type meta
145: subvolumes vol1
146: end-volume
147:
+------------------------------------------------------------------------------+
[2018-01-15 15:27:49.391515] T [rpc-clnt.c:675:rpc_clnt_reply_init]
0-vol1-client-2: received rpc message (RPC XID: 0x2 Program: GF-DUMP,
ProgVers: 1, Proc: 1) from rpc-transport (vol1-client-2)
....
....
[2018-01-15 15:27:49.446699] W [MSGID: 122019]
[ec-helpers.c:412:ec_loc_gfid_check] 0-vol1-disperse-0: Mismatching
GFID's in loc
[2018-01-15 15:27:49.446719] T [MSGID: 0] [ec-helpers.c:89:ec_trace]
0-ec: MANAGER(LOOKUP) 0x7f92bc00db48((nil)) [refs=3, winds=0, jobs=0]
frame=0x7f92bc0013c8/0x7f92bc003878, min/exp=4/6, err=0 state=5
{111111:000000:111111} error=22
[2018-01-15 15:27:49.446739] D [MSGID: 0]
[defaults.c:1266:default_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-disperse-0 returned -1 error: Invalid argument
[Invalid argument]
[2018-01-15 15:27:49.446759] D [MSGID: 0]
[dht-common.c:1501:dht_revalidate_cbk] 0-vol1-dht: revalidate lookup of
/shares/data returned with op_ret -1 [Invalid argument]
[2018-01-15 15:27:49.446778] I [MSGID: 109094]
[dht-common.c:1514:dht_revalidate_cbk] 0-vol1-dht: Revalidate: subvolume
vol1-disperse-0 for /shares/data (gfid =
2f5a8895-bf08-41df-a13f-3205c62125ab) returned -1 [Invalid argument]
[2018-01-15 15:27:49.446796] E [MSGID: 101046]
[dht-common.c:1810:dht_revalidate_cbk] 0-vol1-dht: dict is null
[2018-01-15 15:27:49.446814] D [MSGID: 0]
[dht-common.c:1814:dht_revalidate_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-dht returned -1 error: Invalid argument [Invalid
argument]
[2018-01-15 15:27:49.446832] D [MSGID: 0]
[write-behind.c:2394:wb_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-write-behind returned -1 error: Invalid argument
[Invalid argument]
[2018-01-15 15:27:49.446852] D [MSGID: 0]
[io-cache.c:268:ioc_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-io-cache returned -1 error: Invalid argument
[Invalid argument]
[2018-01-15 15:27:49.446871] D [MSGID: 0]
[quick-read.c:450:qr_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-quick-read returned -1 error: Invalid argument
[Invalid argument]
[2018-01-15 15:27:49.446890] D [MSGID: 0]
[defaults.c:1266:default_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1-io-threads returned -1 error: Invalid argument
[Invalid argument]
[2018-01-15 15:27:49.446908] D [MSGID: 0]
[io-stats.c:2216:io_stats_lookup_cbk] 0-stack-trace: stack-address:
0x55e1acad3c38, vol1 returned -1 error: Invalid argument [Invalid argument]
[2018-01-15 15:27:49.446988] T [MSGID: 0] [syncop.c:1239:syncop_lookup]
0-stack-trace: stack-address: 0x55e1acad3c38, winding from gfapi to
meta-autoload
[2018-01-15 15:27:49.447011] T [MSGID: 0]
[defaults.c:2574:default_lookup] 0-stack-trace: stack-address:
0x55e1acad3c38, winding from meta-autoload to vol1
[2018-01-15 15:27:49.447028] T [MSGID: 0]
[io-stats.c:2728:io_stats_lookup] 0-stack-trace: stack-address:
0x55e1acad3c38, winding from vol1 to vol1-io-threads
[2018-01-15 15:27:49.447047] D [MSGID: 0]
[io-threads.c:358:iot_schedule] 0-vol1-io-threads: LOOKUP scheduled as
fast fop
[2018-01-15 15:27:49.447113] T [MSGID: 0] [ec-helpers.c:89:ec_trace]
0-ec: MANAGER(LOOKUP) 0x7f92bc00db48((nil)) [refs=3, winds=0, jobs=0]
frame=0x7f92bc0013c8/0x7f92bc003878, min/exp=4/6, err=0 state=0
{111111:000000:111111} error=22
[2018-01-15 15:27:49.447140] T [MSGID: 0] [ec-helpers.c:89:ec_trace]
0-ec: RELEASE(LOOKUP) 0x7f92bc00db48((nil)) [refs=3, winds=0, jobs=0]
frame=0x7f92bc0013c8/0x7f92bc003878, min/exp=4/6, err=22 state=0
{111111:000000:111111}
[2018-01-15 15:27:49.447159] T [MSGID: 0] [ec-helpers.c:89:ec_trace]
0-ec: RELEASE(LOOKUP) 0x7f92bc00db48((nil)) [refs=2, winds=0, jobs=0]
frame=0x7f92bc0013c8/0x7f92bc003878, min/exp=4/6, err=22 state=0
{111111:000000:111111}
[2018-01-15 15:27:49.447177] T [MSGID: 0] [ec-helpers.c:89:ec_trace]
0-ec: RELEASE(LOOKUP) 0x7f92bc00db48((nil)) [refs=1, winds=0, jobs=0]
frame=0x7f92bc0013c8/0x7f92bc003878, min/exp=4/6, err=22 state=0
{111111:000000:111111}
...
...
* The system journal lists (newest entries first):
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.495269, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.489101, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.483684, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.478285, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.465356, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.459377, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.454542, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: glfs_stat(.) failed: Invalid argument
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.449593, 0]
../source3/modules/vfs_glusterfs.c:996(vfs_gluster_stat)
Jan 15 15:27:49 samba smbd[3129]: vfs_gluster_connect: vol1:
Initialized volume from servers gs-gnode1.origenis.de
gs-gnode2.origenis.de gs-gnode3.origenis.de
Jan 15 15:27:49 samba smbd[3129]: [2018/01/15 15:27:49.420543, 0]
../source3/modules/vfs_glusterfs.c:371(vfs_gluster_connect)
Jan 15 15:27:08 samba smbd[3117]: STATUS=daemon 'smbd' finished
starting up and ready to serve connections
Jan 15 15:27:08 samba smbd[3117]: [2018/01/15 15:27:08.920529, 0]
../lib/util/become_daemon.c:124(daemon_ready)
Jan 15 15:27:08 samba systemd[1]: Started Samba SMB Daemon.
Jan 15 15:27:08 samba systemd[1]: Starting Samba SMB Daemon...
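For what it's worth, my reading of the vfs_gluster trace above is that a
single EINVAL raised by ec_loc_gfid_check ("Mismatching GFID's in loc")
is handed unchanged up through every translator's lookup callback until
glfs_stat() surfaces it to Samba as "Invalid argument". A minimal sketch
of that propagation (illustrative Python, not GlusterFS code; translator
names taken from the Final graph above):

```python
import errno

# Translator stack from the "Final graph" above, bottom (EC) upward.
STACK = [
    "vol1-disperse-0", "vol1-dht", "vol1-write-behind", "vol1-io-cache",
    "vol1-quick-read", "vol1-io-threads", "vol1", "meta-autoload", "gfapi",
]

def ec_lookup(path):
    # ec_loc_gfid_check fails the lookup: op_ret=-1, op_errno=EINVAL (22).
    return -1, errno.EINVAL

def wind_lookup(path):
    """Each layer's _cbk simply forwards (op_ret, op_errno) to its parent."""
    op_ret, op_errno = ec_lookup(path)
    for xlator in STACK:
        print(f"{xlator} returned {op_ret} [{errno.errorcode[op_errno]}]")
    return op_ret, op_errno

op_ret, op_errno = wind_lookup("/shares/data")
# glfs_stat() maps this to errno EINVAL, hence Samba's repeated
# "glfs_stat(.) failed: Invalid argument".
assert (op_ret, op_errno) == (-1, errno.EINVAL)
```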
Kind regards, and thanks for your help,
Henrik