[Gluster-users] Small Tests in EC2 failing...

Adam Lindsay adam at nextfeature.com
Sun Nov 14 18:03:21 UTC 2010

A little background. I have gone through a lot of GlusterFS
documentation and outdated tutorials on installing and setting up a
standard 2-server replication setup, with both servers acting as
clients as well. I am using Ubuntu 10.04 and GlusterFS 3.1. My goals
are not that ambitious. I don't have terabytes of data and only need
the most modest of replication, to the point where I have strongly
considered rsync or unison. GlusterFS seems to be the hotness, so I
figured I would give it a try. Initially I spawned 2 m1.micro instances
and got everything installed and running. I set up Gluster using the
command line tool. The relevant commands are below. I do have a few
questions about this that the documentation isn't very clear on.

# On Server 1
gluster peer probe <server2 ip>
gluster volume create websites replica 2 transport tcp <server1 ip>:/exp1 <server2 ip>:/exp2
gluster volume start websites

mkdir -p /mnt/websites
modprobe fuse
mount -t glusterfs <server1 ip>:/websites /mnt/websites

As you can see, this is extremely straightforward. What is weird is
that when I run even simple tests, like creating a text file in the
/mnt/websites mount and saving it, it doesn't take long for
/mnt/websites on the two servers to stop matching. What's odd is that
the /exp1 and /exp2 directories match nearly instantly. I figure the
problem lies between the client and the volume. I have tried all kinds
of configurations: mounting the client on each server against the
server1 IP, against its own local IP, and I even tried crossing them.
Finally I figured maybe the m1.micro instances are just too small, so
I redid this with m1.smalls. Yes, these are 32-bit, so I had to compile
the code to install it. This went smoothly, and yet I got the same
results.
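
For anyone wanting to reproduce the comparison, here is a rough sketch
of how the "do the two mounts match" check could be scripted. The
helper name dir_manifest is hypothetical and not part of GlusterFS; it
only assumes find, sha256sum and sort from a stock Ubuntu install.

```shell
#!/bin/sh
# Hypothetical helper (not a GlusterFS tool): print a sorted
# "checksum  ./relative-path" manifest for every regular file in a directory.
dir_manifest() {
    # Subshell so the caller's working directory is left unchanged.
    (
        cd "$1" || exit 1
        # Hash each file, then sort by path so the output is stable
        # regardless of the order find returns entries in.
        find . -type f -exec sha256sum {} + | sort -k 2
    )
}
```

Running dir_manifest /mnt/websites on each server and diffing the two
outputs should show which files differ between the mounts; running it
against /exp1 and /exp2 gives the same view of the bricks.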

So my questions:

1) Do I have to use clients or can I just read/write to the /exp1 and
/exp2 directories directly?

2) Am I expecting too much from an m1.micro or even an m1.small? Again,
this was a single, simple text file. Kinda surprised it would take
more CPU just to do that much.

3) I feel this is probably a configuration/optimization issue. It
seems as though replication to the /exp1 and /exp2 directories
happens quickly and the files are ready to go, but something in the
default client configuration isn't right.

4) Could it be the way I am connecting the clients? Should they always
point to the server1 IP, or to localhost?

Before it's recommended: m1.large and a 4-server config is probably
out of the budget. If this is what it takes, though, then I will simply
need to search for another solution. DRBD has come up as a potential
fit for what I want, but it seems as though it might suffer from split
brain on EC2. Again, though, given the very simple test, I would expect
this to work even if the instances are a bit underpowered compared to
what most people on this list use. Any advice or help is greatly
appreciated.

