[Bugs] [Bug 1420993] New: Modified volume options not synced once offline nodes comes up.

bugzilla at redhat.com bugzilla at redhat.com
Fri Feb 10 05:05:22 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1420993

            Bug ID: 1420993
           Summary: Modified volume options not synced once offline nodes
                    comes up.
           Product: GlusterFS
           Version: 3.8
         Component: glusterd
          Keywords: Triaged
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bsrirama at redhat.com, bugs at gluster.org,
                    rhs-bugs at redhat.com, storage-qa-internal at redhat.com,
                    vbellur at redhat.com
        Depends On: 1420637
            Blocks: 1420635, 1420991



+++ This bug was initially created as a clone of Bug #1420637 +++

+++ This bug was initially created as a clone of Bug #1420635 +++

Description of problem:
=======================
Modifications made to a volume while some cluster nodes are down are not
synced once the offline nodes come back up.



Version-Release number of selected component (if applicable):
==============================================================
glusterfs-3.8.4-14

How reproducible:
=================
Always


Steps to Reproduce:
====================
1. Have a 3-node cluster.
2. Create and start a distributed volume using 3 bricks (one from each node).
3. Stop glusterd on two nodes (say n2 and n3).
4. Change these volume options from their defaults:
   performance.readdir-ahead from on to off
   cluster.server-quorum-ratio from the default value to 30
5. Start glusterd on the n2 and n3 nodes.
6. Check the volume info on both nodes and verify that the modified volume
options are synced.
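On a 3-node cluster, the steps above translate roughly to the commands below. The volume name (distvol), brick paths, and hostnames (n1, n2, n3) are placeholders; note that cluster.server-quorum-ratio is a cluster-wide option and is set on the "all" target:

```shell
# Step 2: create and start a distributed volume, one brick per node
gluster volume create distvol n1:/bricks/b1 n2:/bricks/b2 n3:/bricks/b3
gluster volume start distvol

# Step 3: stop glusterd on two nodes (run on n2 and n3)
systemctl stop glusterd

# Step 4: change options from their defaults (run on n1)
gluster volume set distvol performance.readdir-ahead off
gluster volume set all cluster.server-quorum-ratio 30

# Step 5: bring the offline nodes back (run on n2 and n3)
systemctl start glusterd

# Step 6: verify the modified options were synced (run on n2 and n3)
gluster volume info distvol
```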

Actual results:
===============
Modified volume options are not synced once the offline nodes come up.


Expected results:
=================
The sync should happen once the nodes come back up.


Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-02-09
01:49:18 EST ---

This bug is automatically being proposed for the current release of Red Hat
Gluster Storage 3 under active development, by setting the release flag
'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from Byreddy on 2017-02-09 01:52:24 EST ---

errors in glusterd log:
=======================
[2017-02-09 06:40:29.199737] E [MSGID: 106422]
[glusterd-utils.c:4357:glusterd_compare_friend_data] 0-management: Importing
global options failed
[2017-02-09 06:40:29.199775] E [MSGID: 106376]
[glusterd-sm.c:1397:glusterd_friend_sm] 0-glusterd: handler returned: 2
[2017-02-09 06:40:29.199926] I [MSGID: 106493]
[glusterd-rpc-ops.c:478:__glusterd_friend_add_cbk] 0-glusterd: Received ACC
from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18, host:
dhcp41-198.lab.eng.blr.redhat.com, port: 0
[2017-02-09 06:40:29.238089] I [MSGID: 106492]
[glusterd-handler.c:2788:__glusterd_handle_friend_update] 0-glusterd: Received
friend update from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18
[2017-02-09 06:40:29.238127] I [MSGID: 106502]
[glusterd-handler.c:2833:__glusterd_handle_friend_update] 0-management:
Received my uuid as Friend
[2017-02-09 06:40:29.270561] I [MSGID: 106493]
[glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received
ACC from uuid: 273c5136-66a9-4b3e-8f1d-fb45509a4a18
[2017-02-09 06:40:29.270981] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-02-09 06:40:29.271042] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is
stopped
[2017-02-09 06:40:29.271475] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already
stopped
[2017-02-09 06:40:29.271522] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is
stopped
[2017-02-09 06:40:29.271591] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already
stopped
[2017-02-09 06:40:29.271715] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: quotad service is
stopped
[2017-02-09 06:40:29.271807] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-02-09 06:40:29.271841] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is
stopped
[2017-02-09 06:40:29.271901] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already
stopped
[2017-02-09 06:40:29.271947] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is
stopped
[2017-02-09 06:40:29.272106] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-snapd: setting frame-timeout to 600
[2017-02-09 06:40:30.976089] I [MSGID: 106488]
[glusterd-handler.c:1539:__glusterd_handle_cli_get_volume] 0-management:
Received get vol req
[2017-02-09 06:40:30.977864] I [MSGID: 106488]
[glusterd-handler.c:1539:__glusterd_handle_cli_get_volume] 0-management:
Received get vol req
[2017-02-09 06:40:44.641763] I [MSGID: 106163]
[glusterd-handshake.c:1274:__glusterd_mgmt_hndsk_versions_ack] 0-management:
using the op-version 30901
[2017-02-09 06:40:44.723849] I [MSGID: 106490]
[glusterd-handler.c:2610:__glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: c744d8ef-71ba-4429-9243-0456d2654824
[2017-02-09 06:40:44.764377] I [MSGID: 106493]
[glusterd-handler.c:3865:glusterd_xfer_friend_add_resp] 0-glusterd: Responded
to 10.70.43.71 (0), ret: 0, op_ret: 0
[2017-02-09 06:40:44.916543] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-02-09 06:40:44.916586] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: nfs service is
stopped
[2017-02-09 06:40:44.916926] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already
stopped
[2017-02-09 06:40:44.916951] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is
stopped
[2017-02-09 06:40:44.916985] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already
stopped
[2017-02-09 06:40:44.917006] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: quotad service is
stopped
[2017-02-09 06:40:44.917041] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-02-09 06:40:44.917067] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: bitd service is
stopped
[2017-02-09 06:40:44.917133] I [MSGID: 106132]
[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already
stopped
[2017-02-09 06:40:44.917161] I [MSGID: 106568]
[glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: scrub service is
stopped
[2017-02-09 06:40:44.924636] I [MSGID: 106492]
[glusterd-handler.c:2788:__glusterd_handle_friend_update] 0-glusterd: Received
friend update from uuid: c744d8ef-71ba-4429-9243-0456d2654824
[2017-02-09 06:40:44.941841] I [MSGID: 106502]
[glusterd-handler.c:2833:__glusterd_handle_friend_update] 0-management:
Received my uuid as Friend
[2017-02-09 06:50:54.497245] E [rpc-clnt.c:200:call_bail] 0-management: bailing
out frame type(Peer mgmt) op(--(2)) xid = 0x4 sent = 2017-02-09
06:40:44.860661. timeout = 600 for 10.70.43.71:24007

--- Additional comment from Worker Ant on 2017-02-09 02:33:00 EST ---

REVIEW: https://review.gluster.org/16574 (glusterd: ignore return code of
glusterd_restart_bricks) posted (#1) for review on master by Atin Mukherjee
(amukherj at redhat.com)

--- Additional comment from Worker Ant on 2017-02-09 11:46:03 EST ---

COMMIT: https://review.gluster.org/16574 committed in master by Atin Mukherjee
(amukherj at redhat.com) 
------
commit 55625293093d485623f3f3d98687cd1e2c594460
Author: Atin Mukherjee <amukherj at redhat.com>
Date:   Thu Feb 9 12:56:38 2017 +0530

    glusterd: ignore return code of glusterd_restart_bricks

    When GlusterD is restarted on a multi-node cluster, while syncing the
    global options from another GlusterD it checks for quorum and, based
    on that, decides whether to stop or start a brick. However, the return
    code of this function was being honoured: when no bricks are to be
    started the return value is non-zero, and the import ends up being
    marked as failed, which is incorrect.

    The fix is simply to ignore the return code of glusterd_restart_bricks ()

    Change-Id: I37766b0bba138d2e61d3c6034bd00e93ba43e553
    BUG: 1420637
    Signed-off-by: Atin Mukherjee <amukherj at redhat.com>
    Reviewed-on: https://review.gluster.org/16574
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Samikshan Bairagya <samikshan at gmail.com>
    Reviewed-by: Jeff Darcy <jdarcy at redhat.com>
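The fix pattern in the commit above can be sketched as follows. This is a minimal, self-contained illustration, not the real glusterd code: glusterd_restart_bricks and the two import_global_opts_* functions here are hypothetical stand-ins for the logic in glusterd-utils.c (the real functions take cluster state arguments):

```c
#include <assert.h>

/* Hypothetical stand-in: returns non-zero when quorum dictates that no
 * bricks should be started -- a legitimate outcome, not an error. */
static int
glusterd_restart_bricks (void)
{
        return 1; /* simulate "no bricks to start" */
}

/* Buggy pattern: the restart return code is treated as an import
 * failure, so syncing global options fails whenever quorum decides
 * not to start bricks. */
static int
import_global_opts_buggy (void)
{
        int ret = glusterd_restart_bricks ();
        if (ret)
                return -1; /* import wrongly reported as failed */
        return 0;
}

/* Fixed pattern (per commit 5562529): the return code of
 * glusterd_restart_bricks () is deliberately ignored, so the global
 * option import succeeds regardless of the brick start decision. */
static int
import_global_opts_fixed (void)
{
        (void) glusterd_restart_bricks (); /* ret intentionally ignored */
        return 0;
}
```

The buggy variant reproduces the "Importing global options failed" error seen in the log above; the fixed variant lets the import complete.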


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1420635
[Bug 1420635] Modified volume options not synced once offline nodes comes
up.
https://bugzilla.redhat.com/show_bug.cgi?id=1420637
[Bug 1420637] Modified volume options not synced once offline nodes comes
up.
https://bugzilla.redhat.com/show_bug.cgi?id=1420991
[Bug 1420991] Modified volume options not synced once offline nodes comes
up.