[Bugs] [Bug 1672318] "failed to fetch volume file" when trying to activate host in DC with glusterfs 3.12 domains

bugzilla at redhat.com bugzilla at redhat.com
Mon Mar 18 11:37:04 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1672318

Netbulae <info at netbulae.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(info at netbulae.com)|



--- Comment #21 from Netbulae <info at netbulae.com> ---
Upgraded to node 4.3.2-2019031310.

As stated earlier, there is nothing in the brick or glusterd logs. I've set
"diagnostics.client-log-level" to DEBUG, but I still see nothing.
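
For reference, this is the standard volume option used to raise client-side
verbosity (a sketch; "ssd9" stands in for each affected volume, so repeat per
volume, e.g. hdd2 and ssd4):

```shell
# Raise client-side log verbosity for one volume (run on any peer in the pool):
gluster volume set ssd9 diagnostics.client-log-level DEBUG

# Revert once logs have been collected, to avoid log growth:
gluster volume set ssd9 diagnostics.client-log-level INFO
```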

glustershd.log


[2019-03-18 11:10:54.005050] D [MSGID: 114031]
[client-rpc-fops.c:1508:client3_3_inodelk_cbk] 0-hdd2-client-2: remote
operation failed [Resource temporarily unavailable]
[2019-03-18 11:10:54.005079] D [MSGID: 0]
[client-rpc-fops.c:1511:client3_3_inodelk_cbk] 0-stack-trace: stack-address:
0x7f1254002560, hdd2-client-2 returned -1 error: Resource temporarily
unavailable [Resource temporarily unavailable]
[2019-03-18 11:10:55.545357] D [MSGID: 0]
[afr-self-heald.c:218:afr_shd_index_inode] 0-hdd2-replicate-0:
glusterfs.xattrop_entry_changes_gfid dir gfid for hdd2-client-0:
b8f46fe6-ef56-44ac-8586-22061b0f2d5b
[2019-03-18 11:10:55.546606] D [MSGID: 0]
[afr-self-heald.c:597:afr_shd_index_healer] 0-hdd2-replicate-0: finished index
sweep on subvol hdd2-client-0
[2019-03-18 11:11:24.234360] D [rpc-clnt-ping.c:336:rpc_clnt_start_ping]
0-hdd2-client-0: returning as transport is already disconnected OR there are no
frames (0 || 0)
[2019-03-18 11:11:24.234416] D [rpc-clnt-ping.c:336:rpc_clnt_start_ping]
0-hdd2-client-1: returning as transport is already disconnected OR there are no
frames (0 || 0)
[2019-03-18 11:11:24.234425] D [rpc-clnt-ping.c:336:rpc_clnt_start_ping]
0-hdd2-client-2: returning as transport is already disconnected OR there are no
frames (0 || 0)
[2019-03-18 11:12:46.240335] D [logging.c:1855:gf_log_flush_timeout_cbk]
0-logging-infra: Log timer timed out. About to flush outstanding messages if
present
[2019-03-18 11:12:46.240388] D [logging.c:1817:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2019-03-18 11:14:09.000515] D
[rpc-clnt-ping.c:99:rpc_clnt_remove_ping_timer_locked] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f12ab9f7ebb] (-->
/lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7f12ab7c228b]
(--> /lib64/libgfrpc.so.0(+0x14a31)[0x7f12ab7c2a31] (-->
/lib64/libgfrpc.so.0(rpc_clnt_submit+0x4a1)[0x7f12ab7bf651] (-->
/usr/lib64/glusterfs/3.12.15/xlator/protocol/client.so(+0x102e2)[0x7f129e0e02e2]
))))) 0-: 192.168.99.14:49154: ping timer event already removed
[2019-03-18 11:14:09.000619] D [MSGID: 0]
[syncop-utils.c:548:syncop_is_subvol_local] 0-ssd4-client-0: subvol
ssd4-client-0 is local
[2019-03-18 11:14:09.000637] D [MSGID: 0]
[afr-self-heald.c:580:afr_shd_index_healer] 0-ssd4-replicate-0: starting index
sweep on subvol ssd4-client-0
[2019-03-18 11:14:09.000635] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk]
0-ssd4-client-0: Ping latency is 0ms

glusterfs client log

[2019-03-18 11:03:57.442215] I [MSGID: 100030] [glusterfsd.c:2715:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 5.3 (args:
/usr/sbin/glusterfs --process-name fuse --volfile-server=*.*.*.14
--volfile-server=*.*.*.15 --volfile-server=*.*.*.16 --volfile-id=/ssd9
/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9)
[2019-03-18 11:03:57.472426] I [MSGID: 101190]
[event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2019-03-18 11:06:03.636977] W [glusterfsd.c:1500:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7dd5) [0x7f392add8dd5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x556b04278e75]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x556b04278ceb] ) 0-: received
signum (15), shutting down
[2019-03-18 11:06:03.637039] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
'/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9'.
[2019-03-18 11:06:03.654263] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse
connection to '/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9'.
[2019-03-18 11:13:29.415760] I [MSGID: 100030] [glusterfsd.c:2715:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 5.3 (args:
/usr/sbin/glusterfs --process-name fuse --volfile-server=*.*.*.14
--volfile-server=*.*.*.15 --volfile-server=*.*.*.16 --volfile-id=/ssd9
/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9)
[2019-03-18 11:13:29.444824] I [MSGID: 101190]
[event-epoll.c:622:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2019-03-18 11:29:01.000279] I [glusterfsd-mgmt.c:2424:mgmt_rpc_notify]
0-glusterfsd-mgmt: disconnected from remote-host: *.*.*.14
[2019-03-18 11:29:01.000330] I [glusterfsd-mgmt.c:2464:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting to next volfile server *.*.*.15
[2019-03-18 11:29:01.002495] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fb4beddbfbb] (-->
/lib64/libgfrpc.so.0(+0xce11)[0x7fb4beba4e11] (-->
/lib64/libgfrpc.so.0(+0xcf2e)[0x7fb4beba4f2e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7fb4beba6531] (-->
/lib64/libgfrpc.so.0(+0xf0d8)[0x7fb4beba70d8] ))))) 0-glusterfs: forced
unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2019-03-18
11:13:29.445101 (xid=0x2)
[2019-03-18 11:29:01.002517] E [glusterfsd-mgmt.c:2136:mgmt_getspec_cbk]
0-mgmt: failed to fetch volume file (key:/ssd9)
[2019-03-18 11:29:01.002550] W [glusterfsd.c:1500:cleanup_and_exit]
(-->/lib64/libgfrpc.so.0(+0xce32) [0x7fb4beba4e32]
-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x841) [0x5586d9f3c361]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x5586d9f34ceb] ) 0-: received
signum (0), shutting down
[2019-03-18 11:29:01.002578] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
'/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9'.
[2019-03-18 11:29:01.009036] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse
connection to '/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9'.
[2019-03-18 11:29:01.009655] W [glusterfsd.c:1500:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7dd5) [0x7fb4bdc3ddd5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5586d9f34e75]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x5586d9f34ceb] ) 0-: received
signum (15), shutting down

supervdsm.log

MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:11:29,020::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) call
volumeInfo with (u'ssd9', u'*.*.*.14') {}
MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:11:29,020::commands::198::root::(execCmd) /usr/bin/taskset --cpu-list 0-63
/usr/sbin/gluster --mode=script volume info --remote-host=*.*.*.14 ssd9 --xml
(cwd None)
MainProcess|jsonrpc/6::DEBUG::2019-03-18
12:11:31,460::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) call
ksmTune with ({u'run': 0, u'merge_across_nodes': 1},) {}
MainProcess|jsonrpc/6::DEBUG::2019-03-18
12:11:31,461::supervdsm_server::106::SuperVdsm.ServerCallback::(wrapper) return
ksmTune with None
MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:13:29,227::commands::219::root::(execCmd) FAILED: <err> = ''; <rc> = 1
MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:13:29,227::logutils::319::root::(_report_stats) ThreadedHandler is ok in the
last 122 seconds (max pending: 1)
MainProcess|jsonrpc/5::ERROR::2019-03-18
12:13:29,227::supervdsm_server::103::SuperVdsm.ServerCallback::(wrapper) Error
in volumeInfo
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 101,
in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/cli.py", line 529, in
volumeInfo
    xmltree = _execGlusterXml(command)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/cli.py", line 131, in
_execGlusterXml
    return _getTree(rc, out, err)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/cli.py", line 112, in
_getTree
    raise ge.GlusterCmdExecFailedException(rc, out, err)
GlusterCmdExecFailedException: Command execution failed: rc=1 out='Error :
Request timed out\n' err=''
MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:13:29,229::supervdsm_server::99::SuperVdsm.ServerCallback::(wrapper) call
mount with (<vdsm.supervdsm_server._SuperVdsm object at 0x7f3877683a10>,
u'*.*.*.14:/ssd9', u'/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9')
{'vfstype': u'glusterfs', 'mntOpts':
u'backup-volfile-servers=*.*.*.15:*.*.*.16', 'cgroup': 'vdsm-glusterfs'}
MainProcess|jsonrpc/5::DEBUG::2019-03-18
12:13:29,230::commands::198::root::(execCmd) /usr/bin/taskset --cpu-list 0-63
/usr/bin/systemd-run --scope --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs
-o backup-volfile-servers=*.*.*.15:*.*.*.16 *.*.*.14:/ssd9
/rhev/data-center/mnt/glusterSD/*.*.*.14:_ssd9 (cwd None)
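
The timed-out volumeInfo call can be re-run by hand to separate a glusterd-side
problem from a vdsm one (command copied from the supervdsm log above; the
masked *.*.*.14 must be replaced with the real volfile-server address):

```shell
# The same query supervdsm issues; rc=1 with "Request timed out"
# indicates glusterd on the remote host is not answering:
/usr/sbin/gluster --mode=script volume info --remote-host=*.*.*.14 ssd9 --xml
```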
