[Gluster-users] Advice for running out of space on a replicated 4-brick gluster

Strahil Nikolov hunter86_bg at yahoo.com
Wed Feb 19 07:56:47 UTC 2020


On February 19, 2020 1:59:12 AM GMT+02:00, Artem Russakovskii <archon810 at gmail.com> wrote:
>Hi Strahil,
>
>We have 4 main servers, and I wanted to run gluster on all of them,
>with
>everything working even if 3/4 are down, so I set up a replica 4 with
>quorum at 1. It's been working well for several years now, and I can
>lose 3
>out of 4 servers to outages and remain up.
>
>Amar, so to clarify, right now I set up the volume using "gluster v
>create
>$GLUSTER_VOL replica 4 server1:brick1 server2:brick2 server3:brick3
>server4:brick4".
>In order to turn it into a replica 4 but distributed across 4 old
>bricks
>and 4 new bricks (say server1:brick5 server2:brick6 server3:brick7
>server4:brick8), what exact commands do I need to issue?
>
>The docs are a bit confusing for this case IMO:
>
>> volume add-brick <VOLNAME> [<stripe|replica> <COUNT> [arbiter
><COUNT>]]
>> <NEW-BRICK> ... [force] - add brick to volume <VOLNAME>
>
>
>Do I need to specify a stripe? Do I need to repeat the replica param
>and
>keep it at 4? I.e.:
>
>> gluster v add-brick $GLUSTER_VOL replicate 4 server1:brick5
>> gluster v add-brick $GLUSTER_VOL replicate 4 server2:brick6
>> gluster v add-brick $GLUSTER_VOL replicate 4 server3:brick7
>> gluster v add-brick $GLUSTER_VOL replicate 4 server4:brick8
>> gluster v rebalance $GLUSTER_VOL fix-layout start
>
>
>My reservations about going with this new approach also include the
>fact
>that right now I can back up and restore just the brick data itself as
>each
>brick contains the full copy of the data, and it's a loooot faster to
>access the brick data during backups (probably an order of magnitude
>due to
>unresolved list issues). If I go distributed replicated, my current
>backup
>strategy will need to shift to backing up the gluster volume itself
>(not
>sure what kind of additional load that would put on the servers), or
>maybe
>backing up one brick from each replica would work too, though it's
>unclear
>if I'd be able to restore by just copying the data from such backups
>back
>into one restore location to recreate the full set of data (would that
>work?).
>
>Thanks again for your answers.
>
>Sincerely,
>Artem
>
>--
>Founder, Android Police <http://www.androidpolice.com>, APK Mirror
><http://www.apkmirror.com/>, Illogical Robot LLC
>beerpla.net | @ArtemR
><http://twitter.com/ArtemR>
>
>
>On Mon, Feb 17, 2020 at 3:29 PM Strahil Nikolov <hunter86_bg at yahoo.com>
>wrote:
>
>> On February 18, 2020 1:16:19 AM GMT+02:00, Artem Russakovskii <
>> archon810 at gmail.com> wrote:
>> >Hi all,
>> >
>> >We currently have an 8TB 4-brick replicated volume on our 4 servers,
>> >and
>> >are at 80% capacity. The max disk size on our host is 10TB. I'm
>> >starting to
>> >think about what happens closer to 100% and see 2 options.
>> >
>> >Either we go with another new 4-brick replicated volume and start
>> >dealing
>> >with symlinks in our webapp to make sure it knows which volumes the
>> >data is
>> >on, which is a bit of a pain (but not too much) on the sysops side
>of
>> >things. Right now the whole volume mount is symlinked to a single
>> >location
>> >in the webapps (an uploads/ directory) and life is good. After such
>a
>> >split, I'd have to split uploads into yeardir symlinks, make sure
>> >future
>> >yeardir symlinks are created ahead of time and point to the right
>> >volume,
>> >etc).
>> >
>> >The other direction would be converting the replicated volume to a
>> >distributed replicated one
>> >
>>
>https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/#creating-distributed-replicated-volumes
>> ,
>> >but I'm a bit scared to do it with production data (even after
>testing,
>> >of
>> >course), and having never dealt with a distributed replicated
>volume.
>> >
>> >1. Is it possible to convert our existing volume on the fly by
>adding 4
>> >   bricks but keeping the replica count at 4?
>> >2. What happens if bricks 5-8 which contain the replicated volume #2
>go
>> >down for whatever reason or can't meet their quorum, but the
>replicated
>> >   volume #1 is still up? Does the whole main combined volume become
>> >unavailable or only a portion of it which has data residing on
>> >replicated
>> >   volume #2?
>> >   3. Any other gotchas?
>> >
>> >Thank you very much in advance.
>> >
>> >Sincerely,
>> >Artem
>> >
>> >--
>> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>> ><http://www.apkmirror.com/>, Illogical Robot LLC
>> >beerpla.net | @ArtemR
>> ><http://twitter.com/ArtemR>
>>
>> Distributed replicated sounds more reasonable.
>>
>> Out of curiosity, why did you decide to have an even number of bricks
>> in the replica - it can still suffer from split-brain?
>>
>> 1. It should be OK, but I have never done it. Test on some VMs before
>> proceeding. Rebalance might take some time, so keep that in mind.
>>
>> 2. All files on replica set 5-8 will be unavailable until you recover
>> that set of bricks.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>

Hi Artem,

That's interesting...
In order to extend the volume you will need to:
gluster peer probe node5
gluster peer probe node6
gluster peer probe node7
gluster peer probe node8

gluster volume add-brick $GLUSTER_VOL replica 4 node{5..8}:/brick/path

Note: Assuming that the brick paths are the same on all nodes.
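For reference, here is a hedged sketch of the whole sequence as a dry-run script. The volume name ($GLUSTER_VOL) and the brick path are placeholders taken from Artem's mail, not verified against his setup; the node5..node8 host names follow the example above:

```shell
#!/bin/bash
# Dry-run sketch: 'run' only prints each command instead of executing
# it. Replace the echo with real execution once the names are adjusted.
GLUSTER_VOL=${GLUSTER_VOL:-myvol}   # placeholder volume name
run() { echo "+ $*"; }

# Bring the new nodes into the trusted pool (no-op if already probed)
for n in node5 node6 node7 node8; do
    run gluster peer probe "$n"
done

# Keeping "replica 4" while adding 4 more bricks creates a second
# replica set, i.e. a 2 x 4 distributed-replicate volume
run gluster volume add-brick "$GLUSTER_VOL" replica 4 node{5..8}:/brick/path

# Spread the directory layout over the new replica set, then watch it
run gluster volume rebalance "$GLUSTER_VOL" fix-layout start
run gluster volume rebalance "$GLUSTER_VOL" status
```

The key point the script illustrates is that the replica count stays at 4; gluster then treats the new bricks as a second replica set rather than extra copies.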

For the backup, you can still back up via the gluster bricks, but you need to pick one node per replica set, as your data will be laid out like this:
Node1..4 -> replica set A
Node5..8 -> replica set B

So, you need to back up from two hosts instead of one.
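On Artem's restore question: merging one backed-up brick from each replica set into a single tree should recreate the full file set, since DHT places every file on exactly one replica set (you would exclude gluster's internal .glusterfs directory from the brick backups). A small local simulation of that idea - the file names and paths are made up, no real bricks are involved:

```shell
#!/bin/bash
# Simulation only: two local directories stand in for one brick from
# each replica set of a 2 x 4 distributed-replicate volume.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/setA_brick" "$tmp/setB_brick" "$tmp/restore"

# Pretend DHT hashed fileA onto replica set A and fileB onto set B,
# so each brick backup holds a distinct half of the namespace
echo "data A" > "$tmp/setA_brick/fileA"
echo "data B" > "$tmp/setB_brick/fileB"

# "Restore" by merging one backed-up brick from each set into one tree
cp -a "$tmp"/setA_brick/. "$tmp/restore/"
cp -a "$tmp"/setB_brick/. "$tmp/restore/"

ls "$tmp/restore"   # both files are back in a single location
```

This mirrors the two-host backup above: one healthy brick per replica set covers the complete data set exactly once.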

Best Regards,
Strahil Nikolov

