[Gluster-users] Adding new storage nodes to existing GlusterFS network
Craig Carl
craig at gluster.com
Sun Sep 26 02:50:48 UTC 2010
Roland -
3.1 works the same way. What type of behavior would you prefer or expect to see?
Thanks.
--
Craig Carl
Gluster, Inc.
Cell - (408) 829-9953 (California, USA)
Gtalk - craig.carl at gmail.com
From: "Roland Rabben" <roland at jotta.no>
To: "Craig Carl" <craig at gluster.com>
Cc: gluster-users at gluster.org
Sent: Friday, September 24, 2010 2:01:47 AM
Subject: Re: [Gluster-users] Adding new storage nodes to existing GlusterFS network
Oh no. This is a big problem for me. My folder structure is locked.
I am running 3.0.5, so the messy symlink solution won't work for me, even if I wanted to use it.
Doing nothing means I can't scale my Glusterfs system, which kind of defeats the purpose of a scalable distributed file system.
Option 3 is to change file attributes for all folders and files on my system, and copy a large portion of my files over to the new servers. I have millions of files and folders and about 100 TB of data. This will take weeks. What if something fails during this process?
And, I have to do it over again when I need to add more storage servers.
Is this really it? It's not practically possible and it doesn't scale at all.
Does 3.1 address these issues, or would you still need to use the scale-and-defrag script?
Should I be using something different than the Distribute translator?
Best Regards
Roland Rabben
2010/9/24 Craig Carl < craig at gluster.com >
Roland -
The behavior you are seeing now is expected in your cluster's current state. The elastic hash algorithm assigned each folder a hash range when the folder was created, before you added the new servers. Unless you update this range after adding the new storage servers, you will continue to see the current behavior.
The scale-n-defrag.sh (dependent on defrag.sh) script does two things -
1. Updates the hash range on each folder.
2. Moves any file that needs to be moved to its 'correct' server.
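
To illustrate why files in existing directories keep landing on the old servers, here is a minimal Python sketch of per-directory hash ranges. It is a simplified model, not GlusterFS's actual implementation: the even split of a 32-bit hash space, the use of zlib.crc32 as the hash, and the names assign_ranges and pick_server are all assumptions made for illustration.

```python
import zlib

HASH_SPACE = 2**32  # assumed size of the hash space


def assign_ranges(servers):
    """Split the hash space evenly across servers; returns [(start, end, server)]."""
    step = HASH_SPACE // len(servers)
    ranges = []
    for i, server in enumerate(servers):
        start = i * step
        # The last server takes whatever remains so the whole space is covered.
        end = HASH_SPACE if i == len(servers) - 1 else start + step
        ranges.append((start, end, server))
    return ranges


def pick_server(layout, filename):
    """Return the server whose range contains the hash of the file name."""
    h = zlib.crc32(filename.encode("utf-8"))  # stand-in hash, not Gluster's
    for start, end, server in layout:
        if start <= h < end:
            return server
    raise ValueError("layout does not cover the hash space")


old_servers = ["server1", "server2"]
all_servers = old_servers + ["server3", "server4"]

# Layout stored on a directory created before the new servers were added.
existing_dir_layout = assign_ranges(old_servers)
# Layout a directory receives when it is created after the expansion.
new_dir_layout = assign_ranges(all_servers)

names = [f"file{i}.dat" for i in range(1000)]
used_by_existing = {pick_server(existing_dir_layout, n) for n in names}
used_by_new = {pick_server(new_dir_layout, n) for n in names}

print(sorted(used_by_existing))  # only the old servers can appear here
print(sorted(used_by_new))
```

In this model the existing directory can never place a file on server3 or server4, because its stored ranges only mention the old servers. Updating the ranges corresponds to replacing existing_dir_layout with a layout that covers all servers.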
You can do 1 of 3 things at this point - (only options 1 or 3 with Gluster 3.0.5)
1. Nothing.
1a. New directories will be created across all the storage nodes and files in those directories will be distributed across all storage servers.
1b. Files written to existing directories will not be distributed onto the new storage servers.
2. Update the hash ranges on each directory but DO NOT move any data. (NOT AN OPTION WITH GLUSTER 3.0.5!)
2a. New directories will be created across all the storage nodes and files in those directories will be distributed across all storage servers.
2b. New files written to existing directories will be distributed onto the new storage servers.
2c. A link file will be created for every file that isn't on the 'correct' server. The link file points Gluster to the server on which the file actually exists. Creating the link file takes time, the redirect takes time, and the additional network I/O takes time; together these slow down your cluster. In a cluster with a lot of nodes, creating the link files takes longer.
2d. Your cluster won't get any faster. Only if you redistribute the data can you take advantage of the new servers' cache (memory), I/O (disks), and bandwidth (IP).
3. Update the hash ranges on each directory and move your files.
3a. New directories will be created across all the storage nodes and files in those directories will be distributed across all storage servers.
3b. New files written to existing directories will be distributed onto the new storage servers.
3c. Your cluster will get faster.
3d. You will take a performance hit while scale-n-defrag.sh is running.
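
The difference between options 2 and 3 can be sketched with a toy in-memory model. This is only an illustration of the link-file idea described above, not how Gluster stores link files on disk: the dictionary layout and the function names read_file, update_ranges_only, and update_ranges_and_move are hypothetical.

```python
# Each server is modelled as a dict mapping a file name to either
# ("data", payload) or ("link", name_of_server_holding_the_data).


def read_file(servers, hashed_server, name):
    """Read a file via its hashed server; return (payload, servers_contacted)."""
    kind, value = servers[hashed_server][name]
    if kind == "data":
        return value, 1
    # A link entry redirects to the server that actually holds the data,
    # which costs one extra hop.
    _, payload = servers[value][name]
    return payload, 2


def update_ranges_only(servers, name, old_server, hashed_server):
    """Option 2: leave the data in place and add a link on the hashed server."""
    if old_server != hashed_server:
        servers[hashed_server][name] = ("link", old_server)


def update_ranges_and_move(servers, name, old_server, hashed_server):
    """Option 3: move the data to the hashed server, so no link is needed."""
    if old_server != hashed_server:
        servers[hashed_server][name] = servers[old_server].pop(name)


def fresh():
    # One file on an old server; server3 is the newly added, empty server.
    return {"server1": {"a.txt": ("data", "hello")}, "server3": {}}


opt2 = fresh()
update_ranges_only(opt2, "a.txt", "server1", "server3")
print(read_file(opt2, "server3", "a.txt"))  # ('hello', 2)

opt3 = fresh()
update_ranges_and_move(opt3, "a.txt", "server1", "server3")
print(read_file(opt3, "server3", "a.txt"))  # ('hello', 1)
```

Under option 2 every read of a misplaced file touches two servers in this model, which is the extra redirect cost from point 2c. Under option 3 the one-time move removes that hop.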
I don't recommend it, but if you want to use option 2 and you are NOT RUNNING 3.0.5, you just need to comment out the last 6 lines of scale-n-defrag.sh. If you are running 3.0.5 you must either do nothing or run the full scale-n-defrag. To check the version, run 'glusterfs --version'. I hope this helps you understand the defrag process. Please let me know if you have any other questions.
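
As a rough aid, the version rule above could be encoded as in the sketch below. The helper names are hypothetical, the layout of the 'glusterfs --version' output is an assumption (the code simply takes the first dotted number it finds), and the only rule encoded is the one stated in this message: option 2 is unavailable on 3.0.5.

```python
import re
import subprocess


def extract_version(text):
    """Return the first dotted version number found in text, or None."""
    match = re.search(r"\d+\.\d+(?:\.\d+)?", text)
    return match.group(0) if match else None


def available_options(version):
    """Option numbers from the list above that apply to the given version."""
    if version == "3.0.5":
        return [1, 3]  # option 2 is not available on 3.0.5
    return [1, 2, 3]


def installed_version():
    """Run the command from the message; the output format is assumed."""
    result = subprocess.run(
        ["glusterfs", "--version"], capture_output=True, text=True, check=True
    )
    return extract_version(result.stdout)


# Illustrative strings only; these are not captured command output.
print(available_options(extract_version("glusterfs 3.0.5")))  # [1, 3]
print(available_options("3.0.4"))  # [1, 2, 3]
```

installed_version() is not called here, so the sketch runs on a machine without GlusterFS installed.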
Thanks,
Craig
--
Craig Carl
Sales Engineer; Gluster, Inc.
Cell - (408) 829-9953 (California, USA)
Office - (408) 770-1884
Gtalk - craig.carl at gmail.com
Twitter - @gluster
Installing Gluster Storage Platform, the movie!
http://rackerhacker.com/2010/08/11/one-month-with-glusterfs-in-production/
From: "Roland Rabben" < roland at jotta.no >
To: "Craig Carl" < craig at gluster.com >
Cc: gluster-users at gluster.org
Sent: Thursday, September 23, 2010 5:03:03 AM
Subject: Re: [Gluster-users] Adding new storage nodes to existing GlusterFS network
Hi Craig
After unmounting the client, modifying my client vol file to include the new storage servers and mounting the volume on my client, it does not seem that new files written to existing folders are stored on the new servers. They only end up on the old servers. Is this expected?
If I create a new folder and store new files here, the files are stored on both new and old servers.
Is this where the scale-n-defrag script comes into play?
Can you please describe what it does?
Does it move any files, or does it just update metadata information to include the new servers?
All new files are written to existing folders, so I need a solution to this. I also have many TB of data and millions of files, so moving files around will take a long time.
Thanks
Roland Rabben
2010/9/22 Craig Carl < craig at gluster.com >
Roland -
You can find the scale-n-defrag script here: http://ftp.gluster.com/pub/gluster/glusterfs/misc/defrag/ . Be sure to edit the script first; instructions are inline.
Please let us know if you have any other questions.
Thanks,
Craig
From: "Roland Rabben" < roland at jotta.no >
To: gluster-users at gluster.org
Sent: Wednesday, September 22, 2010 7:48:23 AM
Subject: Re: [Gluster-users] Adding new storage nodes to existing GlusterFS network
Hi James,
Thanks for the answer. I will try this tomorrow. I am adding the new servers
to increase capacity.
Do you have any idea how to rebalance old files?
Regards
Roland Rabben
2010/9/22 Burnash, James < jburnash at knight.com >
> Hi Roland.
>
> The short answer is - I'm not sure because I'm in the midst of doing this
> myself, but my setup is just Replicated.
>
> I believe that you can do what you said with the clients because they are
> the only entities with knowledge of what servers are in the backend, so
> adding new servers to their configs and restarting those clients should work
> just fine, assuming you get the replication/distributed part of their
> configs correct.
>
> The one thing is, from what I understand, no rebalancing of old files will
> take place on the new servers automatically - that's a manual procedure -
> but any new files written by the clients will be hashed out to all servers -
> including the new ones.
>
> Just for my information - did you add the extra servers for capacity /
> redundancy / performance increases?
>
> James Burnash, Unix Engineering
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org [mailto:
> gluster-users-bounces at gluster.org ] On Behalf Of Roland Rabben
> Sent: Wednesday, September 22, 2010 7:37 AM
> To: gluster-users at gluster.org
> Subject: Re: [Gluster-users] Adding new storage nodes to existing GlusterFS
> network
>
> Does anyone know this?
>
> Regards
> Roland Rabben
>
> 2010/9/21 Roland Rabben < roland at jotta.no >
>
> > Hi, this is probably a newbie question, but here goes.
> >
> > I am adding two new servers to my existing GlusterFS network and I am
> > wondering what the correct procedure is.
> > I am using a Distributed / Replicated setup. My existing network has
> > two servers.
> >
> > Do I need to take down the whole network with client and servers?
> > Can I just unmount the client, update the client config file and then
> > mount the client again with the new servers?
> > Will new files be written to the new servers only, or both the new and
> > old?
> >
> > Best regards
> >
> > Roland Rabben
> > Founder & CEO Jotta AS
> > Cell: +47 90 85 85 39
> > Phone: +47 21 04 29 00
> > Email: roland at jotta.no
> >
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>