[Gluster-users] Client Handling of Elastic Clusters

Amar Tumballi amarts at gmail.com
Wed Oct 16 03:51:52 UTC 2019


Hi Timothy,

Thanks for this report. This seems to be a genuine issue. I don't think we
have a solution for it right now, other than maybe pointing 'serverA' at one
of the new servers' IPs (e.g. serverD's) in /etc/hosts on that particular
client as a hack.
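For example, if serverD's address were 10.0.0.14 (the address here is just a
placeholder), the hack would be an /etc/hosts entry on the client like:

   10.0.0.14    serverA

so that the mount's existing volfile-server name resolves to a server that is
still part of the pool.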

In the meantime, it would be great if you could copy-paste this into an issue
(https://github.com/gluster/glusterfs/issues/new) so we can track it.

Regards,
Amar

On Wed, Oct 16, 2019 at 12:35 AM Timothy Orme <torme at ancestry.com> wrote:

> Hello,
>
> I'm trying to set up an elastic Gluster cluster and am running into a few
> odd edge cases that I'm unsure how to address.  I'll try to walk through
> the setup as best I can.
>
> If I have a replica 3 distributed-replicated volume, with two replica sets
> to start:
>
> MyVolume
>    Replica set 1
>       serverA
>       serverB
>       serverC
>    Replica set 2
>       serverD
>       serverE
>       serverF
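>
> For reference, the volume was created along these lines (the brick paths
> here are just placeholders):
>
>    gluster volume create MyVolume replica 3 \
>        serverA:/bricks/b1 serverB:/bricks/b1 serverC:/bricks/b1 \
>        serverD:/bricks/b1 serverE:/bricks/b1 serverF:/bricks/b1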
>
> The client mounts the volume with serverA as the primary volfile server
> and serverB and serverC as the backups.
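>
> Concretely, the mount looks something like this (the mount point is a
> placeholder):
>
>    mount -t glusterfs -o backup-volfile-servers=serverB:serverC \
>        serverA:/MyVolume /mnt/myvolume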
>
> Then, if I perform a scale-down event, it selects the first replica set as
> the one to remove, so I end up with a configuration like:
>
> MyVolume
>    Replica set 2
>       serverD
>       serverE
>       serverF
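>
> The scale-down boils down to a remove-brick of the first replica set,
> roughly (brick paths are placeholders):
>
>    gluster volume remove-brick MyVolume \
>        serverA:/bricks/b1 serverB:/bricks/b1 serverC:/bricks/b1 start
>    # ...wait for remove-brick status to report completed, then:
>    gluster volume remove-brick MyVolume \
>        serverA:/bricks/b1 serverB:/bricks/b1 serverC:/bricks/b1 commit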
>
> Everything rebalances and works great.  However, at this point, the client
> has lost any connection with a volfile server.  It knows about D, E, and F,
> so my data is all fine, but it can no longer retrieve a volfile.  In the
> logs I see:
>
> [2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: Exhausted all volfile servers
>
> This becomes problematic when I try to scale back up and add a replica set
> back in:
>
> MyVolume
>    Replica set 2
>       serverD
>       serverE
>       serverF
>    Replica set 3
>       serverG
>       serverH
>       serverI
>
> And then I rebalance the volume.  Now all my data is present, but the
> client only knows about D, E, and F, so when I run an `ls` on a directory,
> only about half of the files are returned, since the other half live on G,
> H, and I, which the client doesn't know about.  The data is still there,
> but reaching it would require a remount against one of the new servers.
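>
> The scale-up step is roughly (again, brick paths are placeholders):
>
>    gluster volume add-brick MyVolume \
>        serverG:/bricks/b1 serverH:/bricks/b1 serverI:/bricks/b1
>    gluster volume rebalance MyVolume start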
>
> My question, then: is there a way to have a more dynamic set of volfile
> servers?  It would be great if there were a way to tell the mount to fall
> back on the servers returned in the volfile itself in case the primary
> ones go away.
>
> If there's not an easy way to do this, is there a flag on the mount helper
> that can cause the mount to die or error out when it is unable to retrieve
> volfiles?  The problem now is that it fails more or less silently and
> returns incomplete file listings, which for my use cases can cause
> improper processing of that data.  I'd obviously rather have it hard-error
> than silently return bad results.
>
> Hope that makes sense; if you need further clarity, please let me know.
>
> Thanks,
> Tim
>
>