[Gluster-users] Fwd: Replica brick not working
Miloš Čučulović - MDPI
cuculovic at mdpi.com
Tue Dec 13 21:27:33 UTC 2016
Hi All,
Moving forward with my issue, sorry for the late reply!
I had some issues with the storage2 server (original volume), then
decided to use 3.9.0, so I have the latest version.
For that, I synced all the files manually to the storage server. I
installed gluster 3.9.0 there, started it, created a new volume called
storage, and all seems to work OK.
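For reference, a rough sketch of that step on the storage server (the exact
options I used may have differed slightly):
* sudo gluster volume create storage storage:/data/data-cluster
* sudo gluster volume start storage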
Now, I need to create my replicated volume (add a new brick on the storage2
server). Almost all the files are there. So, on the storage server I ran:
* sudo gluster peer probe storage2
* sudo gluster volume add-brick storage replica 2
storage2:/data/data-cluster force
But I am receiving "volume add-brick: failed: Host storage2 is not
in 'Peer in Cluster' state"
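For what it's worth, this is how I am checking the peer state on the storage
server (I assume storage2 has to show "State: Peer in Cluster (Connected)"
before the add-brick can succeed):
* sudo gluster peer status
* sudo gluster pool list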
Any idea?
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic at mdpi.com
Skype: milos.cuculovic.mdpi
On 08.12.2016 17:52, Ravishankar N wrote:
> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>> I was able to fix the sync by rsync-ing all the directories, then the
>> heal started. The next problem :): as soon as there are files on the
>> new brick, the gluster mount also serves reads from this brick, but the
>> new brick is not ready yet (the sync is not done), so clients see
>> missing files. I temporarily removed the new brick; now I am running a
>> manual rsync and will add the brick again, hoping this will work.
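>>
>> The manual rsync is roughly along these lines, run on the new server and
>> pulling from the old brick (hostnames, paths and direction are from my
>> setup, so adjust as needed; -a preserves ownership and times, -X copies
>> extended attributes):
>>
>> rsync -aX --progress storage2:/data/data-cluster/ /data/data-cluster/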
>>
>> What mechanism manages this? I guess there is something built in to make
>> a replica brick available only once the data is completely synced.
> This mechanism was introduced in 3.7.9 or 3.7.10
> (http://review.gluster.org/#/c/13806/). Before that version, you needed
> to manually set some xattrs on the bricks so that healing could happen
> in parallel while the client would still serve reads from the original
> brick. I can't find the link to the doc which describes these steps for
> setting the xattrs. :-(
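> From memory, the idea was roughly the following, done on the root of the
> original (source) brick; the client index in the xattr name and the exact
> value depend on the volume, so treat this as a sketch and verify it before
> using it:
>
> # mark pending heals for the new brick (here assumed to be client-1)
> setfattr -n trusted.afr.<volname>-client-1 -v 0x000000000000000100000001 /path/to/original/brick
> gluster volume heal <volname> full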
>
> Calling it a day,
> Ravi
>>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 16:17, Ravishankar N wrote:
>>> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>>>
>>>>
>>>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>>>> <cuculovic at mdpi.com> wrote:
>>>>
>>>>     Ah, damn! I found the issue. On the storage server, the storage2
>>>>     IP address was wrong, I had swapped two digits in the /etc/hosts
>>>>     file, sorry for that :(
>>>>
>>>>     I was able to add the brick now and started the heal, but still no
>>>>     data transfer is visible.
>>>>
>>> 1. Are the files getting created on the new brick though?
>>> 2. Can you provide the output of `getfattr -d -m . -e hex
>>> /data/data-cluster` on both bricks?
>>> 3. Is it possible to attach gdb to the self-heal daemon on the original
>>> (old) brick and get a backtrace?
>>> `gdb -p <pid of self-heal daemon on the original brick>`
>>> thread apply all bt --> share this output
>>> quit gdb.
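>>>
>>> For step 3, a non-interactive way to capture this would be something like
>>> the following (the pid of the self-heal daemon is shown in `gluster volume
>>> status`):
>>>
>>> gdb -p <pid of self-heal daemon> -batch -ex 'thread apply all bt' > /tmp/shd-backtrace.txt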
>>>
>>>
>>> -Ravi
>>>>
>>>> @Ravi/Pranith - can you help here?
>>>>
>>>>
>>>>
>>>> By doing gluster volume status, I have
>>>>
>>>> Status of volume: storage
>>>>     Gluster process                            TCP Port  RDMA Port  Online  Pid
>>>>     ------------------------------------------------------------------------------
>>>>     Brick storage2:/data/data-cluster          49152     0          Y       23101
>>>>     Brick storage:/data/data-cluster           49152     0          Y       30773
>>>>     Self-heal Daemon on localhost              N/A       N/A        Y       30050
>>>>     Self-heal Daemon on storage                N/A       N/A        Y       30792
>>>>
>>>>
>>>> Any idea?
>>>>
>>>> On storage I have:
>>>> Number of Peers: 1
>>>>
>>>> Hostname: 195.65.194.217
>>>> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>>>> State: Peer in Cluster (Connected)
>>>>
>>>>
>>>> - Kindest regards,
>>>>
>>>> Milos Cuculovic
>>>> IT Manager
>>>>
>>>> ---
>>>> MDPI AG
>>>> Postfach, CH-4020 Basel, Switzerland
>>>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>>>> Tel. +41 61 683 77 35
>>>> Fax +41 61 302 89 18
>>>>     Email: cuculovic at mdpi.com
>>>> Skype: milos.cuculovic.mdpi
>>>>
>>>> On 08.12.2016 13:55, Atin Mukherjee wrote:
>>>>
>>>>             Can you resend the attachment as a zip? I am unable to extract
>>>>             the content. We shouldn't have a zero-byte info file. What does
>>>>             gluster peer status output say?
>>>>
>>>>             On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>>>>             <cuculovic at mdpi.com> wrote:
>>>>
>>>> I hope you received my last email Atin, thank you!
>>>>
>>>> - Kindest regards,
>>>>
>>>> Milos Cuculovic
>>>> IT Manager
>>>>
>>>> ---
>>>> MDPI AG
>>>> Postfach, CH-4020 Basel, Switzerland
>>>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>>>> Tel. +41 61 683 77 35
>>>> Fax +41 61 302 89 18
>>>>                 Email: cuculovic at mdpi.com
>>>> Skype: milos.cuculovic.mdpi
>>>>
>>>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>>>
>>>>
>>>>                 ---------- Forwarded message ----------
>>>>                 From: Atin Mukherjee <amukherj at redhat.com>
>>>>                 Date: Thu, Dec 8, 2016 at 11:56 AM
>>>>                 Subject: Re: [Gluster-users] Replica brick not working
>>>>                 To: Ravishankar N <ravishankar at redhat.com>
>>>>                 Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>>>>                 Pranith Kumar Karampuri <pkarampu at redhat.com>,
>>>>                 gluster-users <gluster-users at gluster.org>
>>>>
>>>>
>>>>
>>>>
>>>>                 On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>>>>                 <ravishankar at redhat.com> wrote:
>>>>
>>>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>>>
>>>>                     From the log snippet:
>>>>
>>>>                     [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
>>>>                     [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
>>>>                     [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
>>>>
>>>>                     The last log entry indicates that we hit this code path in
>>>>                     gd_addbr_validate_replica_count():
>>>>
>>>>                             if (replica_count == volinfo->replica_count) {
>>>>                                     if (!(total_bricks % volinfo->dist_leaf_count)) {
>>>>                                             ret = 1;
>>>>                                             goto out;
>>>>                                     }
>>>>                             }
>>>>
>>>>
>>>>                     It seems unlikely that this snippet was hit, because we print
>>>>                     the E [MSGID: 106291] message above only if ret == -1.
>>>>                     gd_addbr_validate_replica_count() returns -1 without populating
>>>>                     err_str only when volinfo->type doesn't match any of the known
>>>>                     volume types, so perhaps volinfo->type is corrupted?
>>>>
>>>>
>>>>             You are right, I missed that ret is set to 1 here in the above
>>>>             snippet.
>>>>
>>>>             @Milos - Can you please provide us with the volume info file from
>>>>             /var/lib/glusterd/vols/<volname>/ from all three nodes so we can
>>>>             continue the analysis?
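>>>>
>>>>             (If I'm reading the code right, the 'type' field in that info
>>>>             file is what gd_addbr_validate_replica_count() switches on; a
>>>>             quick way to pull the relevant fields on each node, assuming the
>>>>             volume is named 'storage' as above, would be something like:
>>>>
>>>>             grep -E '^(type|replica_count|sub_count|stripe_count)=' /var/lib/glusterd/vols/storage/info
>>>>             )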
>>>>
>>>>
>>>>
>>>> -Ravi
>>>>
>>>>                         @Pranith, Ravi - Milos was trying to convert a dist (1 x 1)
>>>>                         volume to a replicate (1 x 2) using add-brick and hit this
>>>>                         issue where add-brick failed. The cluster is operating with
>>>>                         3.7.6. Could you help on what scenario this code path can be
>>>>                         hit? One straightforward issue I see here is the missing
>>>>                         err_str in this path.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>     --
>>>>
>>>>     ~ Atin (atinm)
>>>
>>>
>
>