[Gluster-users] Fwd: Replica brick not working
Miloš Čučulović - MDPI
cuculovic at mdpi.com
Wed Dec 14 08:04:16 UTC 2016
Atin,
I was able to move forward a bit. Initially, I had this:
sudo gluster peer status
Number of Peers: 1
Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Peer Rejected (Connected)
Then, on storage2, I removed everything from /var/lib/glusterd except the info file.
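(For the record, what I did on storage2 was roughly the following; the exact
commands are my own reconstruction, and the service name may be glusterd or
glusterfs-server depending on the distro:)

sudo systemctl stop glusterd
sudo cp /var/lib/glusterd/glusterd.info /root/glusterd.info.bak
sudo find /var/lib/glusterd -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
sudo systemctl start glusterd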
Now I am getting another error message:
sudo gluster peer status
Number of Peers: 1
Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Sent and Received peer request (Connected)
But the add-brick is still not working. I checked the hosts file and all
seems ok; ping is also working well.
The thing I also need to know: when adding a new replicated brick, do I
need to first sync all the files, or does the new brick server need to be empty?
Also, do I first need to create the same volume on the new server, or
will adding it to the volume of server1 do it automatically?
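(To make the question concrete, the sequence I have in mind, with an empty
brick directory on storage2, would be roughly the following; the final heal
command is my assumption about how the data would then get copied over:)

* sudo gluster peer probe storage2
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force
* sudo gluster volume heal storage full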
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic at mdpi.com
Skype: milos.cuculovic.mdpi
On 14.12.2016 05:13, Atin Mukherjee wrote:
> Milos,
>
> I just managed to take a look into a similar issue and my analysis is at
> [1]. I remember you mentioning some incorrect /etc/hosts entries
> which led to this same problem in an earlier case; do you mind rechecking
> the same?
>
> [1]
> http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html
>
> On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI
> <cuculovic at mdpi.com> wrote:
>
> Hi All,
>
> Moving forward with my issue, sorry for the late reply!
>
>     I had some issues with the storage2 server (original volume), then
>     decided to use 3.9.0, so I have the latest version.
>
>     For that, I manually synced all the files to the storage server. I
>     installed gluster 3.9.0 there, started it, created a new volume called
>     storage, and all seems to work ok.
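>
>     (For completeness, the volume on the storage server was created roughly
>     like this; the exact command line is from memory, not copied from my
>     shell history:)
>
>     sudo gluster volume create storage storage:/data/data-cluster
>     sudo gluster volume start storage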
>
>     Now, I need to create my replicated volume (add a new brick on the
>     storage2 server). Almost all the files are there. So, I ran the following
>     on the storage server:
>
>     * sudo gluster peer probe storage2
> * sudo gluster volume add-brick storage replica 2
> storage2:/data/data-cluster force
>
>     But then I am receiving "volume add-brick: failed: Host storage2 is
>     not in 'Peer in Cluster' state".
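>
>     (Before retrying the add-brick, I am double-checking the peer state with,
>     for example:)
>
>     sudo gluster peer status     # should report "State: Peer in Cluster (Connected)"
>     sudo gluster pool list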
>
> Any idea?
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
>     Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 08.12.2016 17:52, Ravishankar N wrote:
>
> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>
>             I was able to fix the sync by rsync-ing all the directories, and
>             then the heal started. The next problem :) is that as soon as
>             there are files on the new brick, the gluster mount starts serving
>             reads from this brick as well, but the new brick is not ready yet,
>             as the sync is not yet done, so it results in missing files on the
>             client side. I temporarily removed the new brick; now I am running
>             a manual rsync and will add the brick again, and hope this will work.
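>
>             (For reference, the workaround sequence I am following is roughly
>             the following; the exact rsync flags and the heal trigger at the
>             end are my own additions, and <new-host> stands for the server
>             holding the new brick:)
>
>             sudo gluster volume remove-brick storage replica 1 <new-host>:/data/data-cluster force
>             rsync -a /data/data-cluster/ <new-host>:/data/data-cluster/
>             sudo gluster volume add-brick storage replica 2 <new-host>:/data/data-cluster force
>             sudo gluster volume heal storage full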
>
>             What mechanism is managing this? I guess there is something
>             built in to make a replica brick available only once the data is
>             completely synced.
>
>         This mechanism was introduced in 3.7.9 or 3.7.10
>         (http://review.gluster.org/#/c/13806/). Before that version, you
>         manually needed to set some xattrs on the bricks so that healing could
>         happen in parallel while the client would still serve reads from the
>         original brick. I can't find the link to the doc which describes these
>         steps for setting xattrs. :-(
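>
>         (From memory, and purely as an illustration rather than the documented
>         steps: the xattrs in question are the AFR changelog attributes on the
>         brick root, which could be inspected and set along these lines; the
>         attribute name and value below are placeholders, not the exact ones
>         from the doc.)
>
>         getfattr -d -m trusted.afr -e hex /data/data-cluster
>         setfattr -n trusted.afr.<volname>-client-<index> -v <pending-changelog-value> /data/data-cluster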
>
> Calling it a day,
> Ravi
>
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
>             Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 08.12.2016 16:17, Ravishankar N wrote:
>
> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>
>
>
>             On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>             <cuculovic at mdpi.com> wrote:
>
>                 Ah, damn! I found the issue. On the storage server, the
>                 storage2 IP address was wrong; I inverted two digits in the
>                 /etc/hosts file, sorry for that :(
>
>                 I was able to add the brick now and started the heal, but
>                 still no data transfer is visible.
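>
>                 (To see whether the heal is progressing, I am checking, for
>                 example:)
>
>                 sudo gluster volume heal storage info
>                 sudo gluster volume heal storage statistics heal-count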
>
>             1. Are the files getting created on the new brick though?
>             2. Can you provide the output of `getfattr -d -m . -e hex
>             /data/data-cluster` on both bricks?
>             3. Is it possible to attach gdb to the self-heal daemon on the
>             original (old) brick and get a backtrace?
>             `gdb -p <pid of self-heal daemon on the original brick>`
>             thread apply all bt --> share this output
>             quit gdb.
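>
>             (One non-interactive way to capture that backtrace, assuming the
>             self-heal daemon PID is the one shown in the Pid column of
>             `gluster volume status`, could be:)
>
>             sudo gluster volume status storage
>             sudo gdb -p <shd-pid> -batch -ex "thread apply all bt" > /tmp/shd-backtrace.txt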
>
>
> -Ravi
>
>
> @Ravi/Pranith - can you help here?
>
>
>
> By doing gluster volume status, I have
>
>                 Status of volume: storage
>                 Gluster process                            TCP Port  RDMA Port  Online  Pid
>                 ------------------------------------------------------------------------------
>                 Brick storage2:/data/data-cluster          49152     0          Y       23101
>                 Brick storage:/data/data-cluster           49152     0          Y       30773
>                 Self-heal Daemon on localhost              N/A       N/A        Y       30050
>                 Self-heal Daemon on storage                N/A       N/A        Y       30792
>
>
> Any idea?
>
> On storage I have:
> Number of Peers: 1
>
> Hostname: 195.65.194.217
> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
> State: Peer in Cluster (Connected)
>
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
>                 Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 08.12.2016 13:55, Atin Mukherjee wrote:
>
>                     Can you resend the attachment as zip? I am unable to
>                     extract the content. We shouldn't have a 0-size info file.
>                     What does gluster peer status output say?
>
>                     On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>                     <cuculovic at mdpi.com> wrote:
>
>                         I hope you received my last email, Atin, thank you!
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel,
> Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
>                         Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 08.12.2016 10:28, Atin Mukherjee wrote:
>
>
>                             ---------- Forwarded message ----------
>                             From: *Atin Mukherjee* <amukherj at redhat.com>
>                             Date: Thu, Dec 8, 2016 at 11:56 AM
>                             Subject: Re: [Gluster-users] Replica brick not working
>                             To: Ravishankar N <ravishankar at redhat.com>
>                             Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>                             Pranith Kumar Karampuri <pkarampu at redhat.com>,
>                             gluster-users <gluster-users at gluster.org>
>
>
>
>
>                             On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>                             <ravishankar at redhat.com> wrote:
>
> On 12/08/2016 10:43 AM, Atin
> Mukherjee wrote:
>
>                                     From the log snippet:
>
>                                     [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
>                                     [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
>                                     [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>
>                                     The last log entry indicates that we hit
>                                     the code path in
>                                     gd_addbr_validate_replica_count ():
>
>                                         if (replica_count == volinfo->replica_count) {
>                                                 if (!(total_bricks % volinfo->dist_leaf_count)) {
>                                                         ret = 1;
>                                                         goto out;
>                                                 }
>                                         }
>
>
>                                     It seems unlikely that this snippet was
>                                     hit, because we print the E [MSGID: 106291]
>                                     in the above message only if ret == -1.
>
>                                     gd_addbr_validate_replica_count() returns
>                                     -1 and yet does not populate err_str only
>                                     when volinfo->type doesn't match any of the
>                                     known volume types, so perhaps
>                                     volinfo->type is corrupted?
>
>
> You are right, I missed that ret is
> set to 1 here in
> the above
> snippet.
>
>                                 @Milos - Can you please provide us the volume
>                                 info file from /var/lib/glusterd/vols/<volname>/
>                                 from all three nodes so we can continue the
>                                 analysis?
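>
>                                 (For example, gathered on each node with
>                                 something like the following; the tar step is
>                                 just a convenient way to bundle it:)
>
>                                 sudo cat /var/lib/glusterd/vols/<volname>/info
>                                 sudo tar czf /tmp/$(hostname)-volinfo.tar.gz /var/lib/glusterd/vols/<volname>/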
>
>
>
> -Ravi
>
>                                         @Pranith, Ravi - Milos was trying to
>                                         convert a dist (1 X 1) volume to a
>                                         replicate (1 X 2) using add-brick and
>                                         hit this issue where add-brick failed.
>                                         The cluster is operating with 3.7.6.
>                                         Could you help with figuring out in
>                                         what scenario this code path can be
>                                         hit? One straightforward issue I see
>                                         here is the missing err_str in this path.
>
>
>
>
>
>
>                             --
>
>                             ~ Atin (atinm)