[Gluster-users] 2 nodes volume with existing data

Thu May 19 10:06:18 UTC 2016

----- Original Message -----
> From: "Atin Mukherjee" <amukherj at redhat.com>
> To: "CHEVALIER Pierre" <pchevalier at imajing.org>, gluster-users at gluster.org
> Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Ravishankar N" <ravishankar at redhat.com>, "Anuradha Talur"
> <atalur at redhat.com>
> Sent: Tuesday, May 17, 2016 10:02:55 PM
> Subject: Re: [Gluster-users] 2 nodes volume with existing data
> 
> 
> 
> On 05/17/2016 01:59 PM, CHEVALIER Pierre wrote:
> > Hi everyone,
> > 
> > I have found a similar question in the ML but didn't find the answer I
> > wanted.
> > My situation is the following:
> > We have 2 existing data storage units with 72TB each, with 10TB left.
> > Currently these storage units are synced periodically using rsync. We
> > want to transform these 2 nodes into glusterfs bricks without losing any
> > data or deleting a node.
> > 
> > I was thinking of setting up the 2 nodes with glusterfs in replica this way:
> > 
> > 1) Rsync server1/brick1 and server2/brick1 content
> > 2) Create the gluster replica and start :
> > gluster volume create volume1 replica 2 transport tcp server1:/brick1
> > server2:/brick1
> > gluster volume start volume1
> Gluster depends heavily on extended attributes for its business logic.
> Configuring a brick with existing data means the extended attributes
> would not be set on those files, so this may not work as expected.
> 
> Given that you cannot use an additional brick, I'd suggest the
> following:
> 
> 1. Delete all the contents of server2/brick1, as this is a secondary copy.
> 2. Create a plain distributed volume on server2 with server2/brick1.
> 3. Start the volume.
> 4. Mount the volume and then copy the content from server1/brick1 to
> the mount point.
> 5. Probe server2 into the cluster.
> 6. Perform an add-brick with server1/brick1 by running gluster volume add-brick
> <volname> replica 2 <brick name> (again, make sure the content is deleted
> first).
> 7. This should trigger the self-heal.

As of now, converting a plain distribute volume to a distribute-replicate volume does not
automatically heal the files. (We are working on a solution for this.)

If you really want to go ahead with this method, I'd recommend the following:

1) Stop I/O on your volume.
2) Add the new brick.
3) Stop and start your gluster volume.
4) Trigger a lookup on the files by running find . | xargs stat on the mount point.

This should trigger heals on the new brick.
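
For illustration, the four steps above could be scripted roughly as follows. This is
a minimal sketch, not a tested procedure: the volume name volume1 and the brick
server1:/brick1 are taken from earlier in this thread, while the mount point
/mnt/volume1, the DRY_RUN switch and the helper names are assumptions made for the
example. Stopping I/O is application-specific and is not scripted. The sketch uses
find -print0 with xargs -0 rather than the plain pipe so that file names containing
spaces survive, and it ends with gluster volume heal <volname> info to list entries
still pending heal.

```shell
#!/bin/sh
# Sketch only: VOLNAME, NEW_BRICK and MOUNT are hypothetical placeholders.
VOLNAME="${VOLNAME:-volume1}"
NEW_BRICK="${NEW_BRICK:-server1:/brick1}"
MOUNT="${MOUNT:-/mnt/volume1}"      # assumed client mount point of the volume
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

convert_to_replica() {
    # 1) Stop I/O on the volume first (not scripted here).
    # 2) Add the new, empty brick and raise the replica count to 2.
    run gluster volume add-brick "$VOLNAME" replica 2 "$NEW_BRICK"
    # 3) Stop and start the volume; --mode=script skips the confirmation prompt.
    run gluster --mode=script volume stop "$VOLNAME"
    run gluster volume start "$VOLNAME"
    # 4) Trigger a lookup on every file from the mount point.
    if [ "$DRY_RUN" = "1" ]; then
        echo "cd $MOUNT && find . -print0 | xargs -0 stat > /dev/null"
    else
        ( cd "$MOUNT" && find . -print0 | xargs -0 stat > /dev/null )
    fi
    # Optionally list the entries that still need healing.
    run gluster volume heal "$VOLNAME" info
}

convert_to_replica
```

With DRY_RUN left at 1 the script only prints the commands it would run; set
DRY_RUN=0 to execute them once the output looks right.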

> 
> However, since the data volume is very large, self-heal can take a long
> time to sync and finish.
> 
> Also, with a two-node setup, prevention of split-brain is not guaranteed.
> An arbiter volume or a 3-way replica would be the way to tackle
> this problem.
> 
> Since I am not an AFR expert, I'd like to get a sign-off from the AFR team
> (in cc) before you try this out.
> 
> ~Atin
> 	
> > 
> > Then, should I run a healing process, or stat each file in order to
> > self-heal?
> > 
> > What happens if there is a slight difference between the 2 nodes,
> > mathematically if :
> > server1/brick1= server2/brick1+delta
> > 
> > Will the delta part impact server1, or will it be lost?
> > 
> > Maybe I worry for nothing, but as these 2 nodes host important data, I
> > wanted to make sure that everything is OK.
> > 
> > Thanks for your help !
> > 
> > --
> > Pierre CHEVALIER

-- 
Thanks,
Anuradha.