[Bugs] [Bug 1672318] "failed to fetch volume file" when trying to activate host in DC with glusterfs 3.12 domains

Mon Apr 1 09:37:41 UTC 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1672318

Netbulae <info at netbulae.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(amukherj at redhat.c |
                   |om)                         |
                   |needinfo?(info at netbulae.com |
                   |)                           |

--- Comment #26 from Netbulae <info at netbulae.com> ---
(In reply to Atin Mukherjee from comment #24)
> [2019-03-18 11:29:01.000279] I [glusterfsd-mgmt.c:2424:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: disconnected from remote-host: *.*.*.14
> 
> Why did we get a disconnect. Was glusterd service at *.14 not running?
> 
> [2019-03-18 11:29:01.000330] I [glusterfsd-mgmt.c:2464:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: connecting to next volfile server *.*.*.15
> [2019-03-18 11:29:01.002495] E [rpc-clnt.c:346:saved_frames_unwind] (-->
> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fb4beddbfbb] (-->
> /lib64/libgfrpc.so.0(+0xce11)[0x7fb4beba4e11] (-->
> /lib64/libgfrpc.so.0(+0xcf2e)[0x7fb4beba4f2e] (-->
> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7fb4beba6531] (-->
> /lib64/libgfrpc.so.0(+0xf0d8)[0x7fb4beba70d8] ))))) 0-glusterfs: forced
> unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at
> 2019-03-18 11:13:29.445101 (xid=0x2)
> 
> The above log seems to be the culprit here. 
> 
> [2019-03-18 11:29:01.002517] E [glusterfsd-mgmt.c:2136:mgmt_getspec_cbk]
> 0-mgmt: failed to fetch volume file (key:/ssd9)
> 
> And the above log is the after effect.
> 
> 
> I have few questions:
> 
> 1. Does the mount fail everytime?

Yes, every time. The failure also persists when we switch the primary storage
domain to another one.

> 2. Do you see any change in the behaviour when the primary volfile server is
> changed?

No. I use a different primary volfile server for each volume to spread the
load a bit more, and the effect is always the same.

> 3. What are the gluster version in the individual peers?

All nodes and servers are on 3.12.15.

> 
> (Keeping the needinfo intact for now, but request Sahina to get us these
> details to work on).
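
For anyone comparing timestamps in the quoted log, here is a minimal Python
sketch that parses the three single-message lines and computes how long the
GETSPEC request stayed pending before it was force-unwound. The regular
expression is an assumption inferred only from the lines quoted above, not
from any documented gluster log format, and the names LINE_RE, parse and
pending_seconds are made up for this illustration. Wrapped lines have been
rejoined by hand.

```python
import re
from datetime import datetime

# Log lines copied from comment #24, with mail line wraps rejoined.
LOG = """\
[2019-03-18 11:29:01.000279] I [glusterfsd-mgmt.c:2424:mgmt_rpc_notify] 0-glusterfsd-mgmt: disconnected from remote-host: *.*.*.14
[2019-03-18 11:29:01.000330] I [glusterfsd-mgmt.c:2464:mgmt_rpc_notify] 0-glusterfsd-mgmt: connecting to next volfile server *.*.*.15
[2019-03-18 11:29:01.002517] E [glusterfsd-mgmt.c:2136:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/ssd9)
"""

# Assumed layout: [timestamp] LEVEL [file:line:function] domain: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<src>[^\]]+)\] "
    r"(?P<domain>[^:]+): (?P<msg>.*)$"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse(text):
    """Return one dict per line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entry = match.groupdict()
            entry["ts"] = datetime.strptime(entry["ts"], TS_FORMAT)
            entries.append(entry)
    return entries


entries = parse(LOG)
errors = [e for e in entries if e["level"] == "E"]

# Timestamps taken from the saved_frames_unwind message: the frame was
# "called at" the first time and force-unwound at the second.
called_at = datetime.strptime("2019-03-18 11:13:29.445101", TS_FORMAT)
unwound_at = datetime.strptime("2019-03-18 11:29:01.002495", TS_FORMAT)
pending_seconds = (unwound_at - called_at).total_seconds()

for e in errors:
    print(e["ts"], e["src"], e["msg"])
print("GETSPEC pending for %.1f seconds" % pending_seconds)
```

With the timestamps above, the request was outstanding for about 931.6
seconds (roughly 15.5 minutes) before the disconnect from *.14 unwound it.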
