[Gluster-users] Replica 3 scale out and ZFS bricks
Strahil Nikolov
hunter86_bg at yahoo.com
Thu Sep 17 14:47:20 UTC 2020
On Thursday, 17 September 2020 at 13:16:06 GMT+3, Alexander Iliev <ailiev+gluster at mamul.org> wrote:
On 9/16/20 9:53 PM, Strahil Nikolov wrote:
> On Wednesday, 16 September 2020 at 11:54:57 GMT+3, Alexander Iliev <ailiev+gluster at mamul.org> wrote:
>
> From what I understood, in order to be able to scale it one node at a
> time, I need to set up the initial nodes with a number of bricks that is
> a multiple of 3 (e.g., 3, 6, 9, etc. bricks). The initial cluster will
> be able to export a volume as large as the storage of a single node and
> adding one more node will grow the volume by 1/3 (assuming homogeneous
> nodes).
>
> You can't add 1 node to a replica 3, so no - you won't get 1/3 with that extra node.
OK, then I guess I was totally confused on this point.
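(To make the constraint concrete - this is only my reading of the docs, using the hypothetical volume1 and brick paths from below: on a replica 3 volume, bricks can only be added in multiples of the replica count, so something like

any# gluster volume add-brick volume1 node4:/gfs/1/brick

should be rejected, while adding three bricks at once - one complete new replica set - should be accepted.)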
I'd imagined something like this would work:
   node1          node2          node3
+---------+    +---------+    +---------+
| brick 1 |    | brick 1 |    | brick 1 |
| brick 2 |    | brick 2 |    | brick 2 |
| brick 3 |    | brick 3 |    | brick 3 |
+---------+    +---------+    +---------+

                    |
                    v

   node1          node2          node3          node4
+---------+    +---------+    +---------+    +---------+
| brick 1 |    | brick 1 |    | brick 4 |    | brick 1 |
| brick 2 |    | brick 4 |    | brick 2 |    | brick 2 |
| brick 3 |    | brick 3 |    | brick 3 |    | brick 4 |
+---------+    +---------+    +---------+    +---------+
any# gluster peer probe node4
any# gluster volume replace-brick volume1 \
        node2:/gfs/2/brick node4:/gfs/2/brick commit force
any# gluster volume replace-brick volume1 \
        node3:/gfs/1/brick node4:/gfs/1/brick commit force
node2# umount /gfs/2 && mkfs /dev/... && mv /gfs/2 /gfs/4 \
        && mount /dev/... /gfs/4    # or clean up the replaced brick by other means
node3# umount /gfs/1 && mkfs /dev/... && mv /gfs/1 /gfs/4 \
        && mount /dev/... /gfs/4    # or clean up the replaced brick by other means
any# gluster volume add-brick volume1 \
        node2:/gfs/4/brick node3:/gfs/4/brick node4:/gfs/4/brick
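One assumption I'm making here (from reading the docs, not tested): between the two replace-brick steps it's safer to wait for self-heal to finish, and after the final add-brick a rebalance is needed so that existing data spreads onto the new replica set - roughly:

any# gluster volume heal volume1 info          # wait until no entries remain
any# gluster volume rebalance volume1 start
any# gluster volume rebalance volume1 status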
I guess I misunderstood you - if I decode the diagram correctly, it should be OK: you will always have at least 2 copies of each brick available when a node goes down.
It would be way simpler to add a 5th node (probably a VM) as an arbiter and switch to 'replica 3 arbiter 1'.
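For example (only a sketch with hypothetical host and brick names - please check the docs for your Gluster version), a fresh 'replica 3 arbiter 1' volume where every third brick is a small arbiter brick on that extra node would look like:

any# gluster volume create volume1 replica 3 arbiter 1 \
        node1:/gfs/1/brick node2:/gfs/1/brick arbiter:/gfs/arb/brick1 \
        node3:/gfs/1/brick node4:/gfs/1/brick arbiter:/gfs/arb/brick2

The arbiter bricks store only file names and metadata, so the VM needs very little disk space while still providing quorum for each replica set.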
Best Regards,
Strahil Nikolov