[Gluster-users] problems with replication & NFS
Lonni J Friedman
netllama at gmail.com
Thu Sep 13 19:17:52 UTC 2012
Greetings,
I'm trying to set up a small GlusterFS test cluster in order to gauge
the feasibility of using it in a large production environment. I've
been working through the official Admin Guide
(Gluster_File_System-3.3.0-Administration_Guide-en-US.pdf) along with
the website setup instructions (
http://www.gluster.org/community/documentation/index.php/Getting_started_overview
).
What I have are two Fedora16-x86_64 servers, each with a 20GB
XFS-formatted partition set aside as a brick. I'm using version 3.3.0.
I set up the volume for replication, and it appears to be set up and working:
####
$ gluster volume info gv0
Volume Name: gv0
Type: Replicate
Volume ID: 6c9fbbc7-e382-4f26-afae-60f8658207c5
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.31.99.166:/mnt/sdb1
Brick2: 10.31.99.165:/mnt/sdb1
####
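For reference, the setup steps I followed were roughly these (hostnames
and brick paths as above; the exact commands are reconstructed from
memory, so treat this as a sketch rather than a verbatim transcript):

```shell
# On 10.31.99.166, with glusterd running on both servers:
gluster peer probe 10.31.99.165

# Create a 2-way replicated volume, one brick per server
gluster volume create gv0 replica 2 \
    10.31.99.166:/mnt/sdb1 10.31.99.165:/mnt/sdb1

gluster volume start gv0
```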
This is where my problems begin. I assumed that if replication was
truly working, then any changes to the contents of /mnt/sdb1 on one
brick would automatically be replicated to the other brick. However,
that isn't happening. In fact, nothing seems to be happening. I've
added new files and changed pre-existing ones, yet none of it ever
replicates to the other brick. Both bricks were empty prior to
formatting the filesystem and setting them up for this test instance.
Surely I must be missing something obvious, as something this
fundamental and basic must work, right?
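In case it matters, my replication test was simply writing directly into
the brick directory on one server and then looking for the file on the
other (filename is just an example; paths as above):

```shell
# On 10.31.99.166: create a file directly in the brick directory
echo "replication test" > /mnt/sdb1/testfile

# On 10.31.99.165: check whether it showed up -- it never does
ls -l /mnt/sdb1/testfile
```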
Next problem: my production environment would need to access the volume
via NFS (rather than the 'native' Gluster client). I set up a 3rd
system (also Fedora16-x86_64) and was able to successfully NFS-mount
the gluster volume. Or so I thought. When I attempted to simply list
the files on the mount point (using 'ls'), it seemed to work at first,
but shortly afterwards it failed with a cryptic "Invalid argument"
error. So I manually unmounted, then remounted, and tried again. Once
again, it worked ok for a few seconds, then died again with the same
"Invalid argument" error:
########
[root at cuda-fs3 basebackups]# mount -t nfs -o vers=3,mountproto=tcp
10.31.99.165:/gv0 /mnt/gv0
[root at cuda-fs3 basebackups]# ls -l /mnt/gv0/
total 8
-rw-r--r-- 0 root root 6670 Sep 13 10:21 foo1
[root at cuda-fs3 basebackups]# ls -l /mnt/gv0/
total 8
-rw-r--r-- 0 root root 6670 Sep 13 10:21 foo1
[root at cuda-fs3 basebackups]# ls -l /mnt/gv0/
ls: cannot access /mnt/gv0/foo1: Invalid argument
total 0
-????????? ? ? ? ? ? foo1
########
The duration between the mount command invocation and the failed 'ls'
was literally about 5 seconds. I have numerous other traditional NFS
mounts that work just fine; it's only the gluster volume that exhibits
this behavior. I did some googling, and this bug seems to match my
problem exactly:
https://bugzilla.redhat.com/show_bug.cgi?id=800755
I can't quite tell from the bug whether it's actually fixed in the
released 3.3.0 or not. Can someone clarify whether NFS is supposed to
work in 3.3.0? Am I doing something wrong?
thanks!