[Gluster-users] Does replace-brick migrate data?

Alan Orth alan.orth at gmail.com
Wed May 29 14:20:20 UTC 2019


Dear Ravi,

Thank you for the link to the blog post series—it is very informative and
current! If I understand your blog posts correctly, then I think the answer
to your previous question about pending AFR xattrs is: no, there are none.
I have identified one file that is a good test case for trying to
understand what happened after I issued the `gluster volume replace-brick
... commit force` a few days ago and then added the same original brick
back to the volume later. This is the current state of the replica 2
distribute/replicate volume:

[root@wingu0 ~]# gluster volume info apps

Volume Name: apps
Type: Distributed-Replicate
Volume ID: f118d2da-79df-4ee1-919d-53884cd34eda
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: wingu3:/mnt/gluster/apps
Brick2: wingu4:/mnt/gluster/apps
Brick3: wingu05:/data/glusterfs/sdb/apps
Brick4: wingu06:/data/glusterfs/sdb/apps
Brick5: wingu0:/mnt/gluster/apps
Brick6: wingu05:/data/glusterfs/sdc/apps
Options Reconfigured:
diagnostics.client-log-level: DEBUG
storage.health-check-interval: 10
nfs.disable: on

I checked the xattrs of one file that is missing from the volume's FUSE
mount's directory listing (though I can read it if I access its full path
explicitly) but is present on several of the volume's bricks (some copies
at full size, others empty):

[root@wingu0 ~]# getfattr -d -m. -e hex
/mnt/gluster/apps/clcgenomics/clclicsrv/licenseserver.cfg

getfattr: Removing leading '/' from absolute path names
# file: mnt/gluster/apps/clcgenomics/clclicsrv/licenseserver.cfg
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.apps-client-3=0x000000000000000000000000
trusted.afr.apps-client-5=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000585a396f00046e15
trusted.gfid=0x878003a2fb5243b6a0d14d2f8b4306bd

[root@wingu05 ~]# getfattr -d -m. -e hex
/data/glusterfs/sdb/apps/clcgenomics/clclicsrv/licenseserver.cfg
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/sdb/apps/clcgenomics/clclicsrv/licenseserver.cfg
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x878003a2fb5243b6a0d14d2f8b4306bd
trusted.gfid2path.82586deefbc539c3=0x34666437323861612d356462392d343836382d616232662d6564393031636566333561392f6c6963656e73657365727665722e636667
trusted.glusterfs.dht.linkto=0x617070732d7265706c69636174652d3200

[root@wingu05 ~]# getfattr -d -m. -e hex
/data/glusterfs/sdc/apps/clcgenomics/clclicsrv/licenseserver.cfg
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/sdc/apps/clcgenomics/clclicsrv/licenseserver.cfg
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x878003a2fb5243b6a0d14d2f8b4306bd
trusted.gfid2path.82586deefbc539c3=0x34666437323861612d356462392d343836382d616232662d6564393031636566333561392f6c6963656e73657365727665722e636667

[root@wingu06 ~]# getfattr -d -m. -e hex
/data/glusterfs/sdb/apps/clcgenomics/clclicsrv/licenseserver.cfg
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/sdb/apps/clcgenomics/clclicsrv/licenseserver.cfg
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.gfid=0x878003a2fb5243b6a0d14d2f8b4306bd
trusted.gfid2path.82586deefbc539c3=0x34666437323861612d356462392d343836382d616232662d6564393031636566333561392f6c6963656e73657365727665722e636667
trusted.glusterfs.dht.linkto=0x617070732d7265706c69636174652d3200
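As a side note for anyone following along: the hex values that `getfattr -e
hex` prints are just ASCII. A quick way to decode them, using the values from
my output above:

```shell
# Decode a hex xattr value (as printed by getfattr -e hex) to ASCII,
# stripping any trailing NUL byte.
decode_xattr() {
  printf '%s' "${1#0x}" | xxd -r -p | tr -d '\0'
}

# The dht.linkto value from the wingu05/wingu06 bricks above:
decode_xattr 0x617070732d7265706c69636174652d3200; echo
# -> apps-replicate-2 (the replica subvolume DHT expects the data on)

# The gfid2path value (parent directory gfid, then the file name):
decode_xattr 0x34666437323861612d356462392d343836382d616232662d6564393031636566333561392f6c6963656e73657365727665722e636667; echo
# -> 4fd728aa-5db9-4868-ab2f-ed901cef35a9/licenseserver.cfg
```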

According to the trusted.afr.apps-client-xx xattrs this particular file
should be on bricks with id "apps-client-3" and "apps-client-5". It took me
a few hours to realize that the brick-id values are recorded in the
volume's volfiles in /var/lib/glusterd/vols/apps/bricks. After comparing
those brick-id values with a volfile backup from before the replace-brick,
I realized that the files are simply on the wrong brick now as far as
Gluster is concerned. This particular file is now on the brick for
"apps-client-4". As an experiment I copied this one file to the two bricks
listed in the xattrs and I was then able to see the file from the FUSE
mount (yay!).
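For anyone doing similar archaeology, this is roughly how I matched brick-ids
to bricks. It assumes the per-brick files under
/var/lib/glusterd/vols/<volume>/bricks/ are key=value files containing a
brick-id= line, which is what I see on our Gluster 5.6; I haven't checked
other versions:

```shell
# Sketch: print "brick-id -> brick file name" for every brick of a volume by
# reading the per-brick info files that glusterd keeps. Assumes key=value
# files with a "brick-id=" line (seen on Gluster 5.x; unverified elsewhere).
map_brick_ids() {
  dir=$1
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    printf '%s -> %s\n' "$(sed -n 's/^brick-id=//p' "$f")" "$(basename "$f")"
  done
}

map_brick_ids /var/lib/glusterd/vols/apps/bricks
```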

Other than replacing the brick, removing it, and then adding the old brick
back on the original server, there has been no change to the data this
entire time. Can I change the brick IDs in the volfiles so they reflect
where the data actually is? Or perhaps script something to reset the
xattrs on the files/directories to point to the correct bricks?
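Relatedly, a small sketch for finding a file's .glusterfs backing path from
its trusted.gfid xattr (assuming the standard .glusterfs/<aa>/<bb>/<gfid>
layout on the brick):

```shell
# Compute the .glusterfs backing path for a gfid as printed by getfattr -e
# hex (layout: .glusterfs/<first two hex chars>/<next two>/<uuid form>).
gfid_to_backend_path() {
  g=${1#0x}
  a=$(printf '%.2s' "$g")
  b=$(printf '%s' "$g" | cut -c3-4)
  uuid=$(printf '%s' "$g" |
    sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
  printf '.glusterfs/%s/%s/%s\n' "$a" "$b" "$uuid"
}

gfid_to_backend_path 0x878003a2fb5243b6a0d14d2f8b4306bd
# -> .glusterfs/87/80/878003a2-fb52-43b6-a0d1-4d2f8b4306bd
```

On a healthy brick that backing path and the named file are hard links to
the same inode, so comparing `stat -c %i` of both on each brick shows
whether the link survived the move.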

Thank you for any help or pointers,

On Wed, May 29, 2019 at 7:24 AM Ravishankar N <ravishankar at redhat.com>
wrote:

>
> On 29/05/19 9:50 AM, Ravishankar N wrote:
>
>
> On 29/05/19 3:59 AM, Alan Orth wrote:
>
> Dear Ravishankar,
>
> I'm not sure if Brick4 had pending AFRs because I don't know what that
> means and it's been a few days so I am not sure I would be able to find
> that information.
>
> When you find some time, have a look at a blog series <http://wp.me/peiBB-6b>
> I wrote about AFR; I've tried to explain what one needs to know to debug
> replication-related issues in it.
>
> Made a typo. The URL for the blog is https://wp.me/peiBB-6b
>
> -Ravi
>
>
> Anyways, after wasting a few days rsyncing the old brick to a new host I
> decided to just try to add the old brick back into the volume instead of
> bringing it up on the new host. I created a new brick directory on the old
> host, moved the old brick's contents into that new directory (minus the
> .glusterfs directory), added the new brick to the volume, and then did
> Vlad's find/stat trick¹ from the brick to the FUSE mount point.
>
> The interesting problem I have now is that some files don't appear in the
> FUSE mount's directory listings, but I can actually list them directly and
> even read them. What could cause that?
>
> Not sure; there are too many variables in the hacks you did to take a guess.
> You can check whether the contents of the .glusterfs folder are in order on
> the new brick (for example, that the hard links for files and the symlinks
> for directories are present, etc.).
> Regards,
> Ravi
>
>
> Thanks,
>
> ¹
> https://lists.gluster.org/pipermail/gluster-users/2018-February/033584.html
>
> On Fri, May 24, 2019 at 4:59 PM Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>>
>> On 23/05/19 2:40 AM, Alan Orth wrote:
>>
>> Dear list,
>>
>> I seem to have gotten into a tricky situation. Today I brought up a shiny
>> new server with new disk arrays and attempted to replace one brick of a
>> replica 2 distribute/replicate volume on an older server using the
>> `replace-brick` command:
>>
>> # gluster volume replace-brick homes wingu0:/mnt/gluster/homes
>> wingu06:/data/glusterfs/sdb/homes commit force
>>
>> The command was successful and I see the new brick in the output of
>> `gluster volume info`. The problem is that Gluster doesn't seem to be
>> migrating the data,
>>
>> `replace-brick` definitely must heal (not migrate) the data. In your
>> case, the data must have been healed from Brick-4 to the replaced Brick-3.
>> Are there any errors in the self-heal daemon logs on Brick-4's node? Does
>> Brick-4 have pending AFR xattrs blaming Brick-3? The doc is a bit out of
>> date; the replace-brick command internally does all of the setfattr steps
>> mentioned there.
>>
>> -Ravi
>>
>>
>> and now the original brick that I replaced is no longer part of the
>> volume (and a few terabytes of data are just sitting on the old brick):
>>
>> # gluster volume info homes | grep -E "Brick[0-9]:"
>> Brick1: wingu4:/mnt/gluster/homes
>> Brick2: wingu3:/mnt/gluster/homes
>> Brick3: wingu06:/data/glusterfs/sdb/homes
>> Brick4: wingu05:/data/glusterfs/sdb/homes
>> Brick5: wingu05:/data/glusterfs/sdc/homes
>> Brick6: wingu06:/data/glusterfs/sdc/homes
>>
>> I see the Gluster docs have a more complicated procedure for replacing
>> bricks that involves getfattr/setfattr¹. How can I tell Gluster about the
>> old brick? I see that I have a backup of the old volfile thanks to yum's
>> rpmsave function if that helps.
>>
>> We are using Gluster 5.6 on CentOS 7. Thank you for any advice you can
>> give.
>>
>> ¹
>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>
>
>
>
>

-- 
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." ―Friedrich Nietzsche

