[Gluster-users] Previously replaced brick not coming up after reboot

Hu Bert revirii at googlemail.com
Thu Aug 16 07:28:02 UTC 2018


glusterfs 3.12.12

2018-08-16 9:26 GMT+02:00 Serkan Çoban <cobanserkan at gmail.com>:
> What is your gluster version? There was a bug in 3.10 where some
> bricks might not come online after rebooting a node, but it was fixed
> in later versions.
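>
> (For reference, the installed release can be confirmed on each node
> with:
>
>     gluster --version
>
> which prints the glusterfs version string.)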
>
> On 8/16/18, Hu Bert <revirii at googlemail.com> wrote:
>> Hi there,
>>
>> Twice I had to replace a brick, on 2 different servers; the
>> replacement went fine, and the heal took very long but finally
>> finished. From time to time you have to reboot a server (kernel
>> upgrades), and I've noticed that the replaced brick doesn't come up
>> after the reboot. Status after the reboot:
>>
>> gluster volume status
>> Status of volume: shared
>> Gluster process                                TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick gluster11:/gluster/bricksda1/shared      49164     0          Y       6425
>> Brick gluster12:/gluster/bricksda1/shared      49152     0          Y       2078
>> Brick gluster13:/gluster/bricksda1/shared      49152     0          Y       2478
>> Brick gluster11:/gluster/bricksdb1/shared      49165     0          Y       6452
>> Brick gluster12:/gluster/bricksdb1/shared      49153     0          Y       2084
>> Brick gluster13:/gluster/bricksdb1/shared      49153     0          Y       2497
>> Brick gluster11:/gluster/bricksdc1/shared      49166     0          Y       6479
>> Brick gluster12:/gluster/bricksdc1/shared      49154     0          Y       2090
>> Brick gluster13:/gluster/bricksdc1/shared      49154     0          Y       2485
>> Brick gluster11:/gluster/bricksdd1/shared      49168     0          Y       7897
>> Brick gluster12:/gluster/bricksdd1_new/shared  49157     0          Y       7632
>> Brick gluster13:/gluster/bricksdd1_new/shared  N/A       N/A        N       N/A
>> Self-heal Daemon on localhost                  N/A       N/A        Y       25483
>> Self-heal Daemon on gluster13                  N/A       N/A        Y       2463
>> Self-heal Daemon on gluster12                  N/A       N/A        Y       17619
>>
>> Task Status of Volume shared
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> Here gluster13:/gluster/bricksdd1_new/shared is not up. Related log
>> messages after the reboot, from glusterd.log:
>>
>> [2018-08-16 05:22:52.986757] W [socket.c:593:__socket_rwv] 0-management: readv on /var/run/gluster/02d086b75bfc97f2cce96fe47e26dcf3.socket failed (No data available)
>> [2018-08-16 05:22:52.987648] I [MSGID: 106005] [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management: Brick gluster13:/gluster/bricksdd1_new/shared has disconnected from glusterd.
>> [2018-08-16 05:22:52.987908] E [rpc-clnt.c:350:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13e)[0x7fdbaa398b8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1d1)[0x7fdbaa15f111] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fdbaa15f23e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7fdbaa1608d1] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x288)[0x7fdbaa1613f8] ))))) 0-management: forced unwinding frame type(brick operations) op(--(4)) called at 2018-08-16 05:22:52.941332 (xid=0x2)
>> [2018-08-16 05:22:52.988058] W [dict.c:426:dict_set] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.12/xlator/mgmt/glusterd.so(+0xd1e59) [0x7fdba4f9ce59] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set_int32+0x2b) [0x7fdbaa39122b] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set+0xd3) [0x7fdbaa38fa13] ) 0-dict: !this || !value for key=index [Invalid argument]
>> [2018-08-16 05:22:52.988092] E [MSGID: 106060] [glusterd-syncop.c:1014:gd_syncop_mgmt_brick_op] 0-management: Error setting index on brick status rsp dict
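>>
>> (Assuming the default log layout, the failing brick also has its own
>> log under /var/log/glusterfs/bricks/ -- for this brick presumably
>> /var/log/glusterfs/bricks/gluster-bricksdd1_new-shared.log -- which
>> may show why the brick process failed to start.)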
>>
>> This problem could be related to my previous mail. After executing
>> "gluster volume start shared force" the brick comes up, which results
>> in the brick being healed (and in high load, too). Is there any way
>> to track down why this happens, and to ensure that the brick comes up
>> at boot?
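>>
>> (For reference, a minimal sketch of the workaround and the follow-up
>> checks, assuming the volume name "shared" as above:
>>
>>     # force-start the volume so the missing brick process is spawned
>>     gluster volume start shared force
>>
>>     # verify that all bricks are online again
>>     gluster volume status shared
>>
>>     # watch the resulting heal until no entries are pending
>>     gluster volume heal shared info
>>
>> The heal info output lists the entries still pending per brick.)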
>>
>>
>> Best regards
>> Hubert
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>

