[Gluster-users] Fwd: Replica brick not working
Pranith Kumar Karampuri
pkarampu at redhat.com
Thu Dec 8 17:47:28 UTC 2016
On Thu, Dec 8, 2016 at 10:22 PM, Ravishankar N <ravishankar at redhat.com>
wrote:
> On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
>
>> I was able to fix the sync by rsync-ing all the directories, and then the
>> heal started. The next problem :): as soon as there are files on the new
>> brick, the gluster mount starts serving this brick too, even though the
>> sync is not yet done, which results in missing files on the client side. I
>> temporarily removed the new brick and am now running a manual rsync; I will
>> add the brick again and hope this works.
>>
>> What mechanism manages this issue? I guess there is something built in to
>> make a replica brick available only once the data is completely synced.
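The remove/rsync/re-add workaround described above would look roughly like the following sketch (volume name `storage` and brick paths are taken from later in this thread; this is not a vetted procedure, and as discussed below, pre-seeding a brick with rsync without the xattr steps is exactly what can confuse the heal):

```
# Drop back to a plain distribute volume (replica 1) by removing the new brick
gluster volume remove-brick storage replica 1 storage2:/data/data-cluster force

# Pre-seed the new brick from the original one (run on storage2)
rsync -av storage:/data/data-cluster/ /data/data-cluster/

# Re-add the brick as a replica and let self-heal finish the job
gluster volume add-brick storage replica 2 storage2:/data/data-cluster force
gluster volume heal storage full
```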
>>
> This mechanism was introduced in 3.7.9 or 3.7.10 (
> http://review.gluster.org/#/c/13806/). Before that version, you needed to
> manually set some xattrs on the bricks so that healing could happen in
> parallel while the client would still serve reads from the original
> brick. I can't find the link to the doc which describes these steps for
> setting xattrs. :-(
>
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
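For reference, the manual procedure for pre-3.7.9/3.7.10 releases, as described in the replace-brick guide linked above, is roughly the following sketch (the mount point /mnt/storage and the dummy directory name are placeholders, not from this thread):

```
# On a client mount of the volume, create and delete a dummy entry, then
# set and remove a dummy xattr on the mount root. This marks pending
# self-heal metadata on the root directory pointing at the new/empty
# brick, so healing proceeds in the background while reads are still
# served from the original brick.
mkdir /mnt/storage/tmp-heal-trigger
rmdir /mnt/storage/tmp-heal-trigger
setfattr -n trusted.non-existent-key -v abc /mnt/storage
setfattr -x trusted.non-existent-key /mnt/storage
```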
> Calling it a day,
> Ravi
>
>
>> - Kindest regards,
>>
>> Milos Cuculovic
>> IT Manager
>>
>> ---
>> MDPI AG
>> Postfach, CH-4020 Basel, Switzerland
>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>> Tel. +41 61 683 77 35
>> Fax +41 61 302 89 18
>> Email: cuculovic at mdpi.com
>> Skype: milos.cuculovic.mdpi
>>
>> On 08.12.2016 16:17, Ravishankar N wrote:
>>
>>> On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
>>>
>>>>
>>>>
>>>> On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
>>>> <cuculovic at mdpi.com> wrote:
>>>>
>>>> Ah, damn! I found the issue. On the storage server, the storage2
>>>> IP address was wrong; I had swapped two digits in the /etc/hosts
>>>> file, sorry for that :(
>>>>
>>>> I was able to add the brick now, I started the heal, but still no
>>>> data transfer visible.
>>>>
>>> 1. Are the files getting created on the new brick though?
>>> 2. Can you provide the output of `getfattr -d -m . -e hex
>>> /data/data-cluster` on both bricks?
>>> 3. Is it possible to attach gdb to the self-heal daemon on the original
>>> (old) brick and get a backtrace?
>>>    `gdb -p <pid of the self-heal daemon on the original brick>`
>>>    `thread apply all bt` --> share this output
>>>    quit gdb.
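A non-interactive way to capture the same backtrace (assuming the self-heal daemon process matches `glustershd` in pgrep, which may need adjusting on your system):

```
gdb -p "$(pgrep -f glustershd | head -n1)" -batch \
    -ex "thread apply all bt" > shd-backtrace.txt
```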
>>>
>>>
>>> -Ravi
>>>
>>>>
>>>> @Ravi/Pranith - can you help here?
>>>>
>>>>
>>>>
>>>> By doing gluster volume status, I have
>>>>
>>>> Status of volume: storage
>>>> Gluster process                     TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick storage2:/data/data-cluster   49152     0          Y       23101
>>>> Brick storage:/data/data-cluster    49152     0          Y       30773
>>>> Self-heal Daemon on localhost       N/A       N/A        Y       30050
>>>> Self-heal Daemon on storage         N/A       N/A        Y       30792
>>>>
>>>>
>>>> Any idea?
>>>>
>>>> On storage I have:
>>>> Number of Peers: 1
>>>>
>>>> Hostname: 195.65.194.217
>>>> Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
>>>> State: Peer in Cluster (Connected)
>>>>
>>>>
>>>> - Kindest regards,
>>>>
>>>> Milos Cuculovic
>>>> IT Manager
>>>>
>>>> ---
>>>> MDPI AG
>>>> Postfach, CH-4020 Basel, Switzerland
>>>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>>>> Tel. +41 61 683 77 35
>>>> Fax +41 61 302 89 18
>>>> Email: cuculovic at mdpi.com
>>>> Skype: milos.cuculovic.mdpi
>>>>
>>>> On 08.12.2016 13:55, Atin Mukherjee wrote:
>>>>
>>>> Can you resend the attachment as a zip? I am unable to extract the
>>>> content. We shouldn't have a zero-byte info file. What does gluster
>>>> peer status
>>>>
>>>> On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
>>>> <cuculovic at mdpi.com> wrote:
>>>>
>>>> I hope you received my last email Atin, thank you!
>>>>
>>>> - Kindest regards,
>>>>
>>>> Milos Cuculovic
>>>> IT Manager
>>>>
>>>> ---
>>>> MDPI AG
>>>> Postfach, CH-4020 Basel, Switzerland
>>>> Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
>>>> Tel. +41 61 683 77 35
>>>> Fax +41 61 302 89 18
>>>> Email: cuculovic at mdpi.com
>>>> Skype: milos.cuculovic.mdpi
>>>>
>>>> On 08.12.2016 10:28, Atin Mukherjee wrote:
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Atin Mukherjee <amukherj at redhat.com>
>>>> Date: Thu, Dec 8, 2016 at 11:56 AM
>>>> Subject: Re: [Gluster-users] Replica brick not working
>>>> To: Ravishankar N <ravishankar at redhat.com>
>>>> Cc: Miloš Čučulović - MDPI <cuculovic at mdpi.com>,
>>>> Pranith Kumar Karampuri <pkarampu at redhat.com>,
>>>> gluster-users <gluster-users at gluster.org>
>>>>
>>>>
>>>> On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
>>>> <ravishankar at redhat.com> wrote:
>>>>
>>>> On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
>>>>
>>>> From the log snippet:
>>>>
>>>> [2016-12-07 09:15:35.677645] I [MSGID: 106482]
>>>> [glusterd-brick-ops.c:442:__glusterd_handle_add_brick]
>>>> 0-management: Received add brick req
>>>> [2016-12-07 09:15:35.677708] I [MSGID: 106062]
>>>> [glusterd-brick-ops.c:494:__glusterd_handle_add_brick]
>>>> 0-management: replica-count is 2
>>>> [2016-12-07 09:15:35.677735] E [MSGID: 106291]
>>>> [glusterd-brick-ops.c:614:__glusterd_handle_add_brick]
>>>> 0-management:
>>>>
>>>> The last log entry indicates that we hit this code path in
>>>> gd_addbr_validate_replica_count ():
>>>>
>>>>     if (replica_count == volinfo->replica_count) {
>>>>             if (!(total_bricks % volinfo->dist_leaf_count)) {
>>>>                     ret = 1;
>>>>                     goto out;
>>>>             }
>>>>     }
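As a minimal standalone sketch (not the actual glusterd source; the function and parameter names here are simplified stand-ins for the quoted code), the check behaves like this: the add-brick request is accepted when the requested replica count equals the volume's current replica count and the total brick count divides evenly among the distribute leaves; otherwise the caller falls through toward the error path that logs MSGID 106291.

```c
/* Simplified sketch of the replica-count validation quoted above.
 * Returns 1 (accept) when the requested replica count matches the
 * volume's and the bricks divide evenly into distribute leaves;
 * returns -1 (reject) otherwise, mirroring the error path. */
int validate_replica_count(int replica_count, int vol_replica_count,
                           int total_bricks, int dist_leaf_count)
{
        if (replica_count == vol_replica_count &&
            (total_bricks % dist_leaf_count) == 0)
                return 1;
        return -1;
}
```

For this thread's scenario, converting 1 x 1 to 1 x 2 (replica_count 2, total_bricks 2, dist_leaf_count 2), the check passes, which is consistent with the observation below that ret is set to 1 in this snippet.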
>>>>
>>>>
>>>> It seems unlikely that this snippet was hit, because we print the E
>>>> [MSGID: 106291] in the above message only if ret == -1.
>>>> gd_addbr_validate_replica_count() returns -1 without populating
>>>> err_str only when volinfo->type doesn't match any of the known
>>>> volume types, so perhaps volinfo->type is corrupted?
>>>>
>>>>
>>>> You are right, I missed that ret is set to 1 here in the above
>>>> snippet.
>>>>
>>>> @Milos - Can you please provide us the volume info file from
>>>> /var/lib/glusterd/vols/<volname>/ from all three nodes so we can
>>>> continue the analysis?
>>>>
>>>>
>>>>
>>>> -Ravi
>>>>
>>>> @Pranith, Ravi - Milos was trying to convert a dist (1 x 1)
>>>> volume to a replicate (1 x 2) using add-brick and hit this issue
>>>> where add-brick failed. The cluster is running 3.7.6. Could you
>>>> help with which scenarios can hit this code path? One
>>>> straightforward issue I see here is the missing err_str in this
>>>> path.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ~ Atin (atinm)
>>>
>>>
>>>
>
--
Pranith