[Gluster-users] Fwd: Replica brick not working
Atin Mukherjee
amukherj at redhat.com
Wed Dec 14 12:50:55 UTC 2016
On Wed, Dec 14, 2016 at 1:34 PM, Miloš Čučulović - MDPI <cuculovic at mdpi.com>
wrote:
> Atin,
>
> I was able to move forward a bit. Initially, I had this:
>
> sudo gluster peer status
> Number of Peers: 1
>
> Hostname: storage2
> Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
> State: Peer Rejected (Connected)
>
> Then, on storage2, I removed everything from /var/lib/glusterd except the
> info file.
>
> Now I am getting another error message:
>
> sudo gluster peer status
> Number of Peers: 1
>
> Hostname: storage2
> Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
> State: Sent and Received peer request (Connected)
>
Please edit the /var/lib/glusterd/peers/32bef70a-9e31-403e-b9f3-ec9e1bd162ad
file on storage1, set the state to 3, and restart the glusterd instance.
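
For example, something along these lines on storage1 (a rough sketch; adjust
the service name and editing approach to your environment):

    sudo sed -i 's/^state=.*/state=3/' \
        /var/lib/glusterd/peers/32bef70a-9e31-403e-b9f3-ec9e1bd162ad
    sudo systemctl restart glusterd   # the unit may be called glusterfs-server on Debian/Ubuntu
    sudo gluster peer status          # storage2 should now show "Peer in Cluster (Connected)"
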
> But the add-brick is still not working. I checked the hosts file and all
> seems OK; ping is also working well.
>
> The thing I also need to know: when adding a new replicated brick, do I
> need to first sync all files, or does the new brick server need to be empty?
> Also, do I first need to create the same volume on the new server, or will
> adding it to the volume of server1 do it automatically?
>
>
> - Kindest regards,
>
> Milos Cuculovic
> IT Manager
>
> ---
> MDPI AG
> Postfach, CH-4020 Basel, Switzerland
> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
> Tel. +41 61 683 77 35
> Fax +41 61 302 89 18
> Email: cuculovic at mdpi.com
> Skype: milos.cuculovic.mdpi
>
> On 14.12.2016 05:13, Atin Mukherjee wrote:
>
>> Milos,
>>
>> I just managed to take a look into a similar issue and my analysis is at
>> [1]. I remember you mentioning some incorrect /etc/hosts entries which
>> led to this same problem in an earlier case; do you mind rechecking?
>>
>> [1]
>> http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html
>>
>> On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Hi All,
>>
>> Moving forward with my issue, sorry for the late reply!
>>
>> I had some issues with the storage2 server (original volume), then
>> decided to use 3.9.0, so I have the latest version.
>>
>> For that, I manually synced all the files to the storage server. I
>> installed gluster 3.9.0 there, started it, created a new volume called
>> storage, and all seems to work OK.
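>>
>> For example, a quick sanity check of the new volume before adding the
>> replica brick (standard gluster CLI, nothing specific to this setup):
>>
>>     sudo gluster volume info storage
>>     sudo gluster volume status storage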
>>
>> Now, I need to create my replicated volume (add a new brick on the
>> storage2 server). Almost all the files are there. So, on the storage
>> server I ran:
>>
>> * sudo gluster peer probe storage2
>> * sudo gluster volume add-brick storage replica 2
>> storage2:/data/data-cluster force
>>
>> But I am receiving "volume add-brick: failed: Host storage2 is not in
>> 'Peer in Cluster' state".
>>
>> Any idea?
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 17:52, Ravishankar N wrote:
>>
>> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>>
>> I was able to fix the sync by rsync-ing all the directories, and then
>> the heal started. The next problem :) : as soon as there are files on
>> the new brick, the gluster mount also serves this brick to clients,
>> and since the new brick is not ready yet (the sync is not done), this
>> results in missing files on the client side. I temporarily removed the
>> new brick; now I am running a manual rsync and will add the brick
>> again, hoping this will work.
>>
>> What mechanism manages this? I guess there is something built in to
>> make a replica brick available only once the data is completely
>> synced.
>>
>> This mechanism was introduced in 3.7.9 or 3.7.10
>> (http://review.gluster.org/#/c/13806/). Before that version, you
>> needed to manually set some xattrs on the bricks so that healing could
>> happen in parallel while the client would still serve reads from the
>> original brick. I can't find the link to the doc which describes these
>> steps for setting the xattrs. :-(
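>>
>> For reference, once the brick is added, the heal can be triggered and
>> monitored with the standard CLI; a minimal sketch, with <volname> as a
>> placeholder:
>>
>>     gluster volume heal <volname> full   # kick off a full self-heal
>>     gluster volume heal <volname> info   # list entries still pending heal, per brick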
>>
>> Calling it a day,
>> Ravi
>>
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 16:17, Ravishankar N wrote:
>>
>> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>
>>
>>
>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> Ah, damn! I found the issue. On the storage server, the storage2
>> IP address was wrong; I had inverted two digits in the /etc/hosts
>> file, sorry for that :(
>>
>> I was able to add the brick now and I started the heal, but still
>> no data transfer is visible.
>>
>> 1. Are the files getting created on the new brick though?
>> 2. Can you provide the output of `getfattr -d -m . -e hex
>>    /data/data-cluster` on both bricks?
>> 3. Is it possible to attach gdb to the self-heal daemon on the
>>    original (old) brick and get a backtrace?
>>    `gdb -p <pid of self-heal daemon on the original brick>`
>>    thread apply all bt  --> share this output
>>    quit gdb.
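>>
>> (Equivalently, as a non-interactive one-liner using standard gdb options:
>>  gdb -p <pid> -batch -ex "thread apply all bt" > shd-backtrace.txt)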
>>
>>
>> -Ravi
>>
>>
>> @Ravi/Pranith - can you help here?
>>
>>
>>
>> By doing gluster volume status, I have
>>
>> Status of volume: storage
>> Gluster process                    TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick storage2:/data/data-cluster  49152     0          Y       23101
>> Brick storage:/data/data-cluster   49152     0          Y       30773
>> Self-heal Daemon on localhost      N/A       N/A        Y       30050
>> Self-heal Daemon on storage        N/A       N/A        Y       30792
>>
>>
>> Any idea?
>>
>> On storage I have:
>> Number of Peers: 1
>>
>> Hostname: 195.65.194.217
>> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>> State: Peer in Cluster (Connected)
>>
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 13:55, Atin Mukherjee wrote:
>>
>> Can you resend the attachment as zip? I am unable to extract the
>> content. We shouldn't have a zero-byte info file. What does gluster
>> peer status output say?
>>
>> On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>> <cuculovic at mdpi.com> wrote:
>>
>> I hope you received my last email Atin,
>> thank you!
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>
>>
>> ---------- Forwarded message ----------
>> From: *Atin Mukherjee* <amukherj at redhat.com>
>> Date: Thu, Dec 8, 2016 at 11:56 AM
>> Subject: Re: [Gluster-users] Replica brick not working
>> To: Ravishankar N <ravishankar at redhat.com>
>> Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>> Pranith Kumar Karampuri <pkarampu at redhat.com>,
>> gluster-users <gluster-users at gluster.org>
>>
>>
>>
>>
>> On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>> <ravishankar at redhat.com> wrote:
>>
>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>
>> From the log snippet:
>>
>> [2016-12-07 09:15:35.677645] I [MSGID: 106482]
>> [glusterd-brick-ops.c:442:__glusterd_handle_add_brick]
>> 0-management: Received add brick req
>> [2016-12-07 09:15:35.677708] I [MSGID: 106062]
>> [glusterd-brick-ops.c:494:__glusterd_handle_add_brick]
>> 0-management: replica-count is 2
>> [2016-12-07 09:15:35.677735] E [MSGID: 106291]
>> [glusterd-brick-ops.c:614:__glusterd_handle_add_brick]
>> 0-management:
>>
>> The last log entry indicates that we hit the code path in
>> gd_addbr_validate_replica_count ():
>>
>>     if (replica_count == volinfo->replica_count) {
>>         if (!(total_bricks % volinfo->dist_leaf_count)) {
>>             ret = 1;
>>             goto out;
>>         }
>>     }
>>
>>
>> It seems unlikely that this snippet was hit, because we print the
>> E [MSGID: 106291] message above only if ret == -1.
>>
>> gd_addbr_validate_replica_count() returns -1 without populating
>> err_str only when volinfo->type doesn't match any of the known volume
>> types, so perhaps volinfo->type is corrupted?
>>
>>
>> You are right, I missed that ret is set to 1 in the above snippet.
>>
>> @Milos - Can you please provide us the volume info file from
>> /var/lib/glusterd/vols/<volname>/ from all three nodes so we can
>> continue the analysis?
>>
>>
>>
>> -Ravi
>>
>> @Pranith, Ravi - Milos was trying to convert a dist (1 x 1) volume to
>> a replicate (1 x 2) volume using add-brick and hit this issue where
>> add-brick failed. The cluster is running 3.7.6. Could you help with
>> what scenario this code path can be hit in? One straightforward issue
>> I see here is the missing err_str in this path.
>>
>>
>>
>>
>>
>>
>> --
>>
>> ~ Atin (atinm)
>
--
~ Atin (atinm)