[Gluster-users] Issue in Adding/Removing the gluster node
ABHISHEK PALIWAL
abhishpaliwal at gmail.com
Tue Feb 23 07:40:27 UTC 2016
Hi Gaurav,
In my case we remove the brick while it is offline, using the force
option, as follows:
*gluster volume remove-brick %s replica 1 %s:%s force --mode=script*
but remove-brick still fails.
It seems that the brick we are trying to remove is not present. Here are
the log snippets from both boards:
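For reference, the three %s slots of the template above could be filled in from a script roughly as follows. This is only a sketch: the helper name `build_remove_brick_cmd` is made up for illustration, the sample values are the volume and brick that appear in the output in this mail, and only the command words already present in the template are used.

```python
import shlex


def build_remove_brick_cmd(volume, host, brick_path, replica=1):
    """Return the argv list for the forced remove-brick call.

    Hypothetical helper: it only fills the three %s slots of the
    template quoted in this mail and adds nothing else.
    """
    return [
        "gluster", "volume", "remove-brick", volume,
        "replica", str(replica),
        f"{host}:{brick_path}",
        "force", "--mode=script",
    ]


if __name__ == "__main__":
    cmd = build_remove_brick_cmd(
        "c_glusterfs", "10.32.1.144", "/opt/lvmdir/c2/brick"
    )
    # Only print the command; running it needs a node with the gluster CLI.
    print(shlex.join(cmd))
```

Building the command as a list keeps the host and brick path as one argument, so nothing depends on how a shell would split the string.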
*1st board:*
# gluster volume info
status
gluster volume status c_glusterfs
Volume Name: c_glusterfs
Type: Replicate
Volume ID: 32793e91-6f88-4f29-b3e4-0d53d02a4b99
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
nfs.disable: on
network.ping-timeout: 4
performance.readdir-ahead: on
# gluster peer status
Number of Peers: 1
Hostname: 10.32.1.144
Uuid: b88c74b9-457d-4864-9fe6-403f6934d7d1
State: Peer in Cluster (Connected)
# gluster volume status c_glusterfs
Status of volume: c_glusterfs
Gluster process                          TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.32.0.48:/opt/lvmdir/c2/brick    49153     0          Y       2537
Self-heal Daemon on localhost            N/A       N/A        Y       5577
Self-heal Daemon on 10.32.1.144          N/A       N/A        Y       3850
Task Status of Volume c_glusterfs
------------------------------------------------------------------------------
There are no active volume tasks
*2nd Board*:
# gluster volume info
status
gluster volume status c_glusterfs
gluster volume heal c_glusterfs info
Volume Name: c_glusterfs
Type: Replicate
Volume ID: 32793e91-6f88-4f29-b3e4-0d53d02a4b99
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on
# gluster peer status
Number of Peers: 1
Hostname: 10.32.0.48
Uuid: e7c4494e-aa04-4909-81c9-27a462f6f9e7
State: Peer in Cluster (Connected)
# gluster volume status c_glusterfs
Status of volume: c_glusterfs
Gluster process                          TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.32.0.48:/opt/lvmdir/c2/brick    49153     0          Y       2537
Self-heal Daemon on localhost            N/A       N/A        Y       3850
Self-heal Daemon on 10.32.0.48           N/A       N/A        Y       5577
Task Status of Volume c_glusterfs
------------------------------------------------------------------------------
There are no active volume tasks
Do you know why gluster volume status does not show the brick info for
10.32.1.144, although gluster volume info lists it as Brick2?
We are sending this output because we are not able to collect the
cmd_history.log file from the 2nd board.
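To make the mismatch easier to spot on the boards, the two outputs could be compared by a small script. This is only a sketch under the assumption that the text has the shape shown above; the function names are made up for illustration, and the sample strings are shortened copies of the output in this mail.

```python
import re


def bricks_in_info(info_text):
    """Bricks configured in the volume, from 'gluster volume info' text."""
    return re.findall(r"^Brick\d+:\s*(\S+)", info_text, flags=re.MULTILINE)


def bricks_in_status(status_text):
    """Bricks that have a row in 'gluster volume status' text."""
    return re.findall(r"^Brick\s+(\S+)", status_text, flags=re.MULTILINE)


def missing_bricks(info_text, status_text):
    """Configured bricks that have no row at all in the status output."""
    listed = set(bricks_in_status(status_text))
    return [b for b in bricks_in_info(info_text) if b not in listed]


SAMPLE_INFO = """Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
"""

SAMPLE_STATUS = """Brick 10.32.0.48:/opt/lvmdir/c2/brick 49153 0 Y 2537
Self-heal Daemon on localhost N/A N/A Y 5577
Self-heal Daemon on 10.32.1.144 N/A N/A Y 3850
"""

if __name__ == "__main__":
    for brick in missing_bricks(SAMPLE_INFO, SAMPLE_STATUS):
        print("configured but not in status:", brick)
```

With the sample above it reports only 10.32.1.144:/opt/lvmdir/c2/brick, which is the brick the remove-brick call complains about.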
Regards,
Abhishek
On Tue, Feb 23, 2016 at 12:02 PM, Gaurav Garg <ggarg at redhat.com> wrote:
> Hi abhishek,
>
> >> Can we perform remove-brick operation on the offline brick? What is the
> >> meaning of offline and online brick?
>
> No, you can't perform a remove-brick operation on an offline brick. A brick
> being offline means that its brick process is not running. You can see this
> by executing #gluster volume status: if a brick is offline, it shows an "N"
> entry in the Online column of the output. Alternatively, you can check
> whether the glusterfsd process for that brick is running by executing
> #ps aux | grep glusterfsd. This command lists all the brick processes, and
> from that list you can work out which bricks are online and which are
> not.
>
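The Online column check described above could be scripted along these lines. It is a sketch only: the function name is made up, and the second sample row (a brick with "N" in the Online column) is an assumed illustration, since none of the output in this thread shows an offline brick row.

```python
def brick_online_flags(status_text):
    """Map each brick row of 'gluster volume status' text to True/False.

    True means the Online column shows Y, False means it shows N.
    """
    flags = {}
    for line in status_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[0] != "Brick":
            continue
        # The Online column is the first bare Y or N after the brick path.
        # Port columns may read "N/A", which does not match either value.
        for tok in tokens[2:]:
            if tok in ("Y", "N"):
                flags[tokens[1]] = (tok == "Y")
                break
    return flags


SAMPLE = """Brick 10.32.0.48:/opt/lvmdir/c2/brick 49153 0 Y 2537
Brick 10.32.1.144:/opt/lvmdir/c2/brick N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 5577
"""

if __name__ == "__main__":
    for brick, online in brick_online_flags(SAMPLE).items():
        print(brick, "online" if online else "offline")
```

A brick that has no row at all is not reported by this function, so it cannot tell "offline" apart from "not listed".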
> But if you want to perform a remove-brick operation on an offline brick,
> you need to execute it with the force option: #gluster volume remove-brick
> <volname> hostname:/brick_name force. This might lead to data loss.
>
>
>
> >> Also, is there any logic in gluster through which we can check whether
> >> connectivity to a node is established before performing any operation
> >> on a brick?
>
> Yes, you can check it by executing #gluster peer status command.
>
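A pre-check based on the peer status output could look roughly like this. It is a sketch that assumes the text has the Hostname / Uuid / State layout shown in this thread; the function names are made up for illustration.

```python
def parse_peer_status(text):
    """Split 'gluster peer status' text into one dict per peer."""
    peers, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip().lower()] = value.strip()
    return peers


def peer_connected(text, hostname):
    """True if the given peer is listed and its State ends in (Connected)."""
    return any(
        p["hostname"] == hostname
        and p.get("state", "").endswith("(Connected)")
        for p in parse_peer_status(text)
    )


SAMPLE = """Number of Peers: 1
Hostname: 10.32.1.144
Uuid: b88c74b9-457d-4864-9fe6-403f6934d7d1
State: Peer in Cluster (Connected)
"""

if __name__ == "__main__":
    print(peer_connected(SAMPLE, "10.32.1.144"))
```

A script could call `peer_connected` before issuing a brick operation and skip or retry the operation when it returns False.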
>
> Thanks,
>
> ~Gaurav
>
>
> ----- Original Message -----
> From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> To: "Gaurav Garg" <ggarg at redhat.com>
> Cc: gluster-users at gluster.org
> Sent: Tuesday, February 23, 2016 11:50:43 AM
> Subject: Re: [Gluster-users] Issue in Adding/Removing the gluster node
>
> Hi Gaurav,
>
> one general question related to gluster bricks.
>
> Can we perform remove-brick operation on the offline brick? What is the
> meaning of offline and online brick?
> Also, is there any logic in gluster through which we can check whether
> connectivity to a node is established before performing any operation
> on a brick?
>
> Regards,
> Abhishek
>
> On Mon, Feb 22, 2016 at 2:42 PM, Gaurav Garg <ggarg at redhat.com> wrote:
>
> > Hi abhishek,
> >
> > I went through the logs of node 1. The glusterd logs clearly indicate
> > that your 2nd node (10.32.1.144) had disconnected from the cluster, and
> > because of that the remove-brick operation failed. I think you need to
> > check your network interface.
> >
> > The surprising thing is that I did not see the duplicate peer entry in
> > the #gluster peer status command output.
> >
> > Maybe I will get some more information from the logs of your 2nd node
> > (10.32.1.144). Could you also attach your 2nd node logs?
> >
> > After restarting glusterd, are you seeing the duplicate peer entry in the
> > #gluster peer status command output?
> >
> > I will wait for the 2nd node logs to analyze the duplicate peer entry
> > problem further.
> >
> > Thanks,
> >
> > ~Gaurav
> >
> > ----- Original Message -----
> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > To: "Gaurav Garg" <ggarg at redhat.com>
> > Cc: gluster-users at gluster.org
> > Sent: Monday, February 22, 2016 12:48:55 PM
> > Subject: Re: [Gluster-users] Issue in Adding/Removing the gluster node
> >
> > Hi Gaurav,
> >
> > Please find attached the logs from the boards for the remove-brick
> > failure case.
> > These logs do not include cmd_history.log and
> > etc-glusterfs-glusterd.vol.log for the second board.
> >
> > We may need some more time to get those.
> >
> >
> > Regards,
> > Abhishek
> >
> > On Mon, Feb 22, 2016 at 10:18 AM, Gaurav Garg <ggarg at redhat.com> wrote:
> >
> > > Hi Abhishek,
> > >
> > > >> I'll provide the required log to you.
> > >
> > > sure
> > >
> > > On both nodes, do "pkill glusterd" and then start the glusterd service.
> > >
> > > Thanks,
> > >
> > > ~Gaurav
> > >
> > > ----- Original Message -----
> > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > > To: "Gaurav Garg" <ggarg at redhat.com>
> > > Cc: gluster-users at gluster.org
> > > Sent: Monday, February 22, 2016 10:11:48 AM
> > > Subject: Re: [Gluster-users] Issue in Adding/Removing the gluster node
> > >
> > > Hi Gaurav,
> > >
> > > Thanks for your prompt reply.
> > >
> > > I'll provide the required log to you.
> > >
> > > As a workaround you suggested restarting the glusterd service. Could you
> > > please tell me at which point I should do this?
> > >
> > > Regards,
> > > Abhishek
> > >
> > > On Fri, Feb 19, 2016 at 6:11 PM, Gaurav Garg <ggarg at redhat.com> wrote:
> > >
> > > > Hi Abhishek,
> > > >
> > > > The peer status output looks interesting in that it has a stale entry;
> > > > technically this should not happen. I need to ask a few things:
> > > >
> > > > Did you perform any manual operation on the GlusterFS configuration
> > > > files which reside in the /var/lib/glusterd/* folder?
> > > >
> > > > Can you provide the output of "ls /var/lib/glusterd/peers" from both of
> > > > your nodes?
> > > >
> > > > Could you provide the output of the #gluster peer status command when
> > > > the 2nd node is down?
> > > >
> > > > Can you provide the output of the #gluster volume info command?
> > > >
> > > > Can you provide the full cmd_history.log and
> > > > etc-glusterfs-glusterd.vol.log from both nodes?
> > > >
> > > >
> > > > You can restart your glusterd for now as a workaround, but we need to
> > > > analyze this issue further.
> > > >
> > > > Thanks,
> > > > Gaurav
> > > >
> > > > ----- Original Message -----
> > > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > > > To: "Gaurav Garg" <ggarg at redhat.com>
> > > > Cc: gluster-users at gluster.org
> > > > Sent: Friday, February 19, 2016 5:27:21 PM
> > > > Subject: Re: [Gluster-users] Issue in Adding/Removing the gluster node
> > > >
> > > > Hi Gaurav,
> > > >
> > > > After the add-brick failure, the following is the output of the
> > > > "gluster peer status" command:
> > > >
> > > > Number of Peers: 2
> > > >
> > > > Hostname: 10.32.1.144
> > > > Uuid: bbe2a458-ad3d-406d-b233-b6027c12174e
> > > > State: Peer in Cluster (Connected)
> > > >
> > > > Hostname: 10.32.1.144
> > > > Uuid: bbe2a458-ad3d-406d-b233-b6027c12174e
> > > > State: Peer in Cluster (Connected)
> > > >
> > > > Regards,
> > > > Abhishek
> > > >
> > > > On Fri, Feb 19, 2016 at 5:21 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
> > > >
> > > > > Hi Gaurav,
> > > > >
> > > > > Both boards are connected through the backplane using Ethernet.
> > > > >
> > > > > This inconsistency also occurs when I am trying to bring the node
> > > > > back into the slot. That is, sometimes add-brick executes without
> > > > > failure, but sometimes the following error occurs:
> > > > >
> > > > > volume add-brick c_glusterfs replica 2 10.32.1.144:/opt/lvmdir/c2/brick
> > > > > force : FAILED : Another transaction is in progress for c_glusterfs.
> > > > > Please try again after sometime.
> > > > >
> > > > >
> > > > > You can also see the attached logs for add-brick failure scenario.
> > > > >
> > > > > Please let me know if you need more logs.
> > > > >
> > > > > Regards,
> > > > > Abhishek
> > > > >
> > > > >
> > > > > On Fri, Feb 19, 2016 at 5:03 PM, Gaurav Garg <ggarg at redhat.com> wrote:
> > > > >
> > > > >> Hi Abhishek,
> > > > >>
> > > > >> How are you connecting the two boards, and how are you removing one
> > > > >> manually? I need to know this because if you remove your 2nd board
> > > > >> from the cluster (abrupt shutdown), then you can't perform a
> > > > >> remove-brick operation on the 2nd node from the first node, yet it is
> > > > >> happening successfully in your case. Could you check your network
> > > > >> connection once again while removing and bringing back your node?
> > > > >>
> > > > >> Thanks,
> > > > >> Gaurav
> > > > >>
> > > > >> ------------------------------
> > > > >> *From: *"ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > > > >> *To: *"Gaurav Garg" <ggarg at redhat.com>
> > > > >> *Cc: *gluster-users at gluster.org
> > > > >> *Sent: *Friday, February 19, 2016 3:36:21 PM
> > > > >>
> > > > >> *Subject: *Re: [Gluster-users] Issue in Adding/Removing the gluster node
> > > > >>
> > > > >> Hi Gaurav,
> > > > >>
> > > > >> Thanks for reply
> > > > >>
> > > > >> 1. Here I removed the board manually, and this time it works fine:
> > > > >>
> > > > >> [2016-02-18 10:03:40.601472] : volume remove-brick c_glusterfs replica 1
> > > > >> 10.32.1.144:/opt/lvmdir/c2/brick force : SUCCESS
> > > > >> [2016-02-18 10:03:40.885973] : peer detach 10.32.1.144 : SUCCESS
> > > > >>
> > > > >> Yes, this time the board is reachable, but how? I don't know, because
> > > > >> the board is detached.
> > > > >>
> > > > >> 2. Here I attached the board, and this time add-brick works fine:
> > > > >>
> > > > >> [2016-02-18 10:03:42.065038] : peer probe 10.32.1.144 : SUCCESS
> > > > >> [2016-02-18 10:03:44.563546] : volume add-brick c_glusterfs replica 2
> > > > >> 10.32.1.144:/opt/lvmdir/c2/brick force : SUCCESS
> > > > >>
> > > > >> 3. Here I removed the board again, and this time a failure occurs:
> > > > >>
> > > > >> [2016-02-18 10:37:02.816089] : volume remove-brick c_glusterfs replica 1
> > > > >> 10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick
> > > > >> 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs
> > > > >>
> > > > >> but here the board is not reachable.
> > > > >>
> > > > >> Why is there this inconsistency when doing the same steps multiple
> > > > >> times?
> > > > >>
> > > > >> I hope you get my point.
> > > > >>
> > > > >> Regards,
> > > > >> Abhishek
> > > > >>
> > > > >> On Fri, Feb 19, 2016 at 3:25 PM, Gaurav Garg <ggarg at redhat.com> wrote:
> > > > >>
> > > > >>> Abhishek,
> > > > >>>
> > > > >>> When it sometimes works fine, it means the 2nd board's network
> > > > >>> connection is reachable from the first node. You can confirm this by
> > > > >>> executing the same #gluster peer status command.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Gaurav
> > > > >>>
> > > > >>> ----- Original Message -----
> > > > >>> From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > > > >>> To: "Gaurav Garg" <ggarg at redhat.com>
> > > > >>> Cc: gluster-users at gluster.org
> > > > >>> Sent: Friday, February 19, 2016 3:12:22 PM
> > > > >>> Subject: Re: [Gluster-users] Issue in Adding/Removing the gluster node
> > > > >>>
> > > > >>> Hi Gaurav,
> > > > >>>
> > > > >>> Yes, you are right. Actually I am forcefully detaching the node from
> > > > >>> the slave, and when we removed the board it disconnected from the
> > > > >>> other board.
> > > > >>>
> > > > >>> But my question is: I am doing this process multiple times; sometimes
> > > > >>> it works fine, but sometimes it gives these errors.
> > > > >>>
> > > > >>>
> > > > >>> You can see the following logs from the cmd_history.log file:
> > > > >>>
> > > > >>> [2016-02-18 10:03:34.497996] : volume set c_glusterfs nfs.disable on : SUCCESS
> > > > >>> [2016-02-18 10:03:34.915036] : volume start c_glusterfs force : SUCCESS
> > > > >>> [2016-02-18 10:03:40.250326] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:03:40.273275] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:03:40.601472] : volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick force : SUCCESS
> > > > >>> [2016-02-18 10:03:40.885973] : peer detach 10.32.1.144 : SUCCESS
> > > > >>> [2016-02-18 10:03:42.065038] : peer probe 10.32.1.144 : SUCCESS
> > > > >>> [2016-02-18 10:03:44.563546] : volume add-brick c_glusterfs replica 2 10.32.1.144:/opt/lvmdir/c2/brick force : SUCCESS
> > > > >>> [2016-02-18 10:30:53.297415] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:30:53.313096] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:37:02.748714] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:37:02.762091] : volume status : SUCCESS
> > > > >>> [2016-02-18 10:37:02.816089] : volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs
> > > > >>>
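The failed entries can be picked out of cmd_history.log text with a short script. This is a sketch that assumes each entry is on one line in the "[timestamp] : command : SUCCESS" or "[timestamp] : command : FAILED : message" shape seen above, with quote markers removed; the function names are made up for illustration.

```python
import re

# One cmd_history.log entry: [timestamp] : command : RESULT [: message]
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s*:\s*(?P<cmd>.*?)\s+:\s+"
    r"(?P<result>SUCCESS|FAILED)(?:\s+:\s+(?P<msg>.*))?$"
)


def parse_history(log_text):
    """Yield a dict per entry that matches the assumed line shape."""
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            yield match.groupdict()


def failed_commands(log_text):
    """Return (timestamp, command, message) for every FAILED entry."""
    return [
        (e["ts"], e["cmd"], e["msg"])
        for e in parse_history(log_text)
        if e["result"] == "FAILED"
    ]


SAMPLE = (
    "[2016-02-18 10:03:40.885973] : peer detach 10.32.1.144 : SUCCESS\n"
    "[2016-02-18 10:37:02.816089] : volume remove-brick c_glusterfs replica 1 "
    "10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick "
    "10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs\n"
)

if __name__ == "__main__":
    for ts, cmd, msg in failed_commands(SAMPLE):
        print(ts, "|", cmd, "|", msg)
```

The separator is matched as a colon with whitespace on both sides, so the colon inside 10.32.1.144:/opt/lvmdir/c2/brick does not split the command.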
> > > > >>>
> > > > >>> On Fri, Feb 19, 2016 at 3:05 PM, Gaurav Garg <ggarg at redhat.com> wrote:
> > > > >>>
> > > > >>> > Hi Abhishek,
> > > > >>> >
> > > > >>> > It seems your peer 10.32.1.144 disconnected while doing the
> > > > >>> > remove-brick. See the glusterd logs below:
> > > > >>> >
> > > > >>> > [2016-02-18 10:37:02.816009] E [MSGID: 106256]
> > > > >>> > [glusterd-brick-ops.c:1047:__glusterd_handle_remove_brick] 0-management:
> > > > >>> > Incorrect brick 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs
> > > > >>> > [Invalid argument]
> > > > >>> > [2016-02-18 10:37:02.816061] E [MSGID: 106265]
> > > > >>> > [glusterd-brick-ops.c:1088:__glusterd_handle_remove_brick] 0-management:
> > > > >>> > Incorrect brick 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs
> > > > >>> > The message "I [MSGID: 106004]
> > > > >>> > [glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer
> > > > >>> > <10.32.1.144> (<6adf57dc-c619-4e56-ae40-90e6aef75fe9>), in state <Peer in
> > > > >>> > Cluster>, has disconnected from glusterd." repeated 25 times between
> > > > >>> > [2016-02-18 10:35:43.131945] and [2016-02-18 10:36:58.160458]
> > > > >>> >
> > > > >>> >
> > > > >>> >
> > > > >>> > If you are facing the same issue now, could you paste your
> > > > >>> > #gluster peer status command output here?
> > > > >>> >
> > > > >>> > Thanks,
> > > > >>> > ~Gaurav
> > > > >>> >
> > > > >>> > ----- Original Message -----
> > > > >>> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > > > >>> > To: gluster-users at gluster.org
> > > > >>> > Sent: Friday, February 19, 2016 2:46:35 PM
> > > > >>> > Subject: [Gluster-users] Issue in Adding/Removing the gluster node
> > > > >>> >
> > > > >>> > Hi,
> > > > >>> >
> > > > >>> >
> > > > >>> > I am working on a setup of two boards connected to each other. Gluster
> > > > >>> > version 3.7.6 is running, and I added two bricks in replica 2 mode, but
> > > > >>> > when I manually removed (detached) one board from the setup I got the
> > > > >>> > following error:
> > > > >>> >
> > > > >>> > volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick
> > > > >>> > force : FAILED : Incorrect brick 10.32.1.144:/opt/lvmdir/c2/brick for
> > > > >>> > volume c_glusterfs
> > > > >>> >
> > > > >>> > Please find the logs file as an attachment.
> > > > >>> >
> > > > >>> >
> > > > >>> > Regards,
> > > > >>> > Abhishek
> > > > >>> >
> > > > >>> >
> > > > >>> > _______________________________________________
> > > > >>> > Gluster-users mailing list
> > > > >>> > Gluster-users at gluster.org
> > > > >>> > http://www.gluster.org/mailman/listinfo/gluster-users
> > > > >>> >
--
Regards
Abhishek Paliwal