[Gluster-devel] short summary of issues experienced in testing

Matt Drew matt.drew at gmail.com
Sun Jan 27 14:04:35 UTC 2008


I apologize in advance for this - it's not complete, and I can't
easily go back and retest the specific issues I ran across, as we're
moving fast and trying to get things stable enough to go into
production.  I'll summarize the issues I saw and dealt with as best I
can, in the hope that it will be useful.  I retained two sets of
configuration files, one for the single-server setup and one for the
dual-server setup; see the end of the mail.

The performance issue (see previous thread) is present in all versions
- I'm now pretty convinced that it's not a glusterfs issue directly,
but rather some interaction between Apache/PHP and glusterfs/fuse - so
I'll set that aside for now.

Patch 628, fuse 2.7.2glfs8, one server with two bricks, two clients
running unify (glusterfs-server.vol, glusterfs-client-old.vol)

1) The namespace showed files with full sizes on the local disk.  This
occurred on files that we wrote to the glusterfs mount on the client.
The namespace did *not* show filesystem space used when I checked it
with "du -sh", but when I attempted to copy the files out from under
the glusterfs namespace on the server, the whole file was copied (more
on this in a second).  In other words, these files in the namespace
were behaving like hard links.

From what I understand, this problem has shown up before but wasn't
reproducible, so I missed this opportunity for troubleshooting.
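
If it shows up again, a quick way to tell whether a namespace entry is
a sparse placeholder or actually holds the data is to compare its
apparent size against the blocks allocated on disk - something like
this on the server (the file paths are just examples from my layout):

  ls -l /namespace/some/file                        # apparent size
  du -sh /namespace/some/file                       # on-disk usage
  stat -c 'size=%s blocks=%b' /namespace/some/file
  # a large size with few allocated blocks means a sparse placeholder;
  # comparing a checksum against the copy on whichever brick actually
  # holds the file would show whether the namespace carries real data
  md5sum /namespace/some/file /mnt/qbert1/some/file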

Patch 628, fuse 2.7.2glfs8, two servers with two bricks, two clients
running afr'd namespace and afr/unify on the bricks
(glusterfs-server.vol, glusterfs-client.vol)

2) When I first set this up, I attempted to add the second server to
the first, the plan being to add afr and then run the find command
from the documentation to trigger the afr self-heal.  This turned out
to be impossible, because when I added the namespace volumes (one
blank, one full) to the namespace afr volume, the namespace
essentially stopped working.  I could mount the share and access any
file directly by name, but I couldn't list directories, nor could I
run the find command to trigger the afr.  Once a file or directory was
accessed, it would show up in the namespace.  I ended up pulling
directory listings from underneath the working namespace and running
"head -c 1" on all of those files, one at a time, to get the namespace
to come back.  This partially worked, in that I got the mount into a
usable state, but it was not fully functional.
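
For reference, the self-heal trigger from the docs and my workaround
looked roughly like this (the mount point is an example, and the exact
find invocation in the documentation may differ slightly):

  # self-heal trigger from the documentation (roughly): walk the whole
  # mount and read the first byte of every file
  find /mnt/glusterfs -type f -exec head -c 1 {} \; > /dev/null

  # workaround when directory listing was broken: pull the file list
  # from the backend namespace on the server and read each file through
  # the client mount
  ( cd /namespace && find . -type f ) | while IFS= read -r f; do
    head -c 1 "/mnt/glusterfs/$f" > /dev/null
  done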

3) I know it wasn't fully functional because I then tried to wipe out
the mount via the client using "rm -rf *".  This failed in a number of
interesting ways.  One was a client deadlock - everything was still
running, but any attempt to access the mount resulted in a hung
process (I was able to recover from this by terminating every process
with an open file on the mount, then unmounting and remounting).
There were also files that were apparently in the namespace but not
present (or accessible?) via the mount - I didn't get a chance to take
a good look at this.
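
The recovery procedure, spelled out (the mount point is an example,
and the remount line is just however you normally start the client;
note that fuser/lsof can themselves hang on a wedged fuse mount):

  fuser -km /mnt/glusterfs     # kill everything with open files on the mount
  umount /mnt/glusterfs || umount -l /mnt/glusterfs
  glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs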

Patch 640, fuse 2.7.2glfs8, two servers, two clients running afr/unify
(glusterfs-test.vol, glusterfs-test-client.vol)

4) I upgraded to 640 to avoid the "always writes files with the group
as root" issue; after seeing it on the mailing list, I checked and
found it was occurring on our mount.
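
The check itself is trivial - write a file to the mount as a non-root
user and look at the group it ends up with (mount point is an example):

  touch /mnt/glusterfs/group-test
  stat -c '%U:%G' /mnt/glusterfs/group-test   # group came back as root on affected versions
  rm /mnt/glusterfs/group-test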

5) I then moved the brick directories and set up two separate
glusterfsd configurations, so that I started with a clean slate and
had everything split into production and test mounts (on ports 6996
and 6997, respectively) that I could mount and unmount independently.
I then attempted to rsync about 200G of data from an NFS mount to the
glusterfs mount.  This went okay, except that the client deadlocked
twice during the rsync.  The deadlock had the same symptoms as in #3:
no error messages, no indication of a problem, just hung processes and
no access to the mount.  I fixed it in the same way, by killing all
processes with open files, unmounting glusterfs, and remounting.
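
For reference, running the two server instances side by side is just
one glusterfsd per spec file; the test instance picks up port 6997
from the listen-port option in its spec, and production stays on the
default 6996 (the rsync paths below are examples):

  glusterfsd -f /etc/glusterfs/glusterfs-server.vol   # production, port 6996
  glusterfsd -f /etc/glusterfs/glusterfs-test.vol     # test, port 6997

  # the bulk copy that hit the two client deadlocks
  rsync -a /mnt/nfs/data/ /mnt/glusterfs-test/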

I have since updated to 642 because of the 0-size file replication
issue, so that's what I'm running now.

********************
glusterfs-server.vol (corresponds with issues 1, 2, and 3 on the server side)

volume qbert-ns
  type storage/posix
  option directory /namespace
end-volume

volume qbert1
  type storage/posix
  option directory /mnt/qbert1
end-volume

volume qbert2
  type storage/posix
  option directory /mnt/qbert2
end-volume

volume qbert1-locks
  type features/posix-locks
  #option mandatory on
  subvolumes qbert1
end-volume

volume qbert2-locks
  type features/posix-locks
  #option mandatory on
  subvolumes qbert2
end-volume

volume qbert1-export
  type performance/io-threads
  option thread-count 4
  option cache-size 64MB
  subvolumes qbert1-locks
end-volume

volume qbert2-export
  type performance/io-threads
  option thread-count 4
  option cache-size 64MB
  subvolumes qbert2-locks
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option client-volume-filename /etc/glusterfs/glusterfs-client.vol
  subvolumes qbert-ns qbert1-export qbert2-export
  option auth.ip.qbert-ns.allow *
  option auth.ip.qbert1-export.allow *
  option auth.ip.qbert2-export.allow *
end-volume

********************
glusterfs-client-old.vol (corresponds with issue 1)

volume qbert-ns-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert-ns
end-volume

volume qbert1-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert1-export
end-volume

volume qbert2-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert2-export
end-volume

volume unify
  type cluster/unify
  subvolumes qbert1-client qbert2-client
  option scheduler alu
  option namespace qbert-ns-client
  option alu.limits.min-free-disk 2GB
  option alu.order disk-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold  500MB
  option alu.stat-refresh.interval 10sec
  # option self-heal off
end-volume

volume unify-ra
  type performance/read-ahead
  option page-size 1MB
  option page-count 16
  subvolumes unify
end-volume

volume unify-iocache
  type performance/io-cache
  option cache-size 512MB
  option page-size 1MB
  subvolumes unify-ra
end-volume

volume unify-writeback
  type performance/write-behind
  option aggregate-size 1MB
  option flush-behind off
  subvolumes unify-iocache
end-volume

********************

glusterfs-client.vol (corresponds to issues 2 and 3)

volume qbert-ns-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert-ns
end-volume

volume qbert1-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert1-export
end-volume

volume qbert2-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option transport-timeout 30
  option remote-subvolume qbert2-export
end-volume

volume pacman-ns-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option transport-timeout 30
  option remote-subvolume pacman-ns
end-volume

volume pacman1-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option transport-timeout 30
  option remote-subvolume pacman1-export
end-volume

volume pacman2-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option transport-timeout 30
  option remote-subvolume pacman2-export
end-volume

volume ns-afr
  type cluster/afr
  subvolumes qbert-ns-client pacman-ns-client
end-volume

volume 1-afr
  type cluster/afr
  subvolumes qbert1-client pacman1-client
end-volume

volume 2-afr
  type cluster/afr
  subvolumes qbert2-client pacman2-client
end-volume

volume unify
  type cluster/unify
  subvolumes 1-afr 2-afr
  option scheduler alu
  option namespace ns-afr
  option alu.limits.min-free-disk 2GB
  option alu.order disk-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold  500MB
  option alu.stat-refresh.interval 10sec
  # option self-heal off
end-volume

volume unify-ra
  type performance/read-ahead
  option page-size 1MB
  option page-count 16
  subvolumes unify
end-volume

volume unify-iocache
  type performance/io-cache
  option cache-size 512MB
  option page-size 1MB
  subvolumes unify-ra
end-volume

volume unify-writeback
  type performance/write-behind
  option aggregate-size 1MB
  option flush-behind off
  subvolumes unify-iocache
end-volume

********************

glusterfs-test.vol (both servers, identical except for the names;
issues 4 and 5, currently in use)

# namespace volumes

volume qbert-test-ns
  type storage/posix
  option directory /mnt/qbert1/test-ns
end-volume

# base volumes

volume qbert1-test
  type storage/posix
  option directory /mnt/qbert1/test
end-volume

volume qbert2-test
  type storage/posix
  option directory /mnt/qbert2/test
end-volume

volume qbert1-test-locks
  type features/posix-locks
  #option mandatory on
  subvolumes qbert1-test
end-volume

volume qbert2-test-locks
  type features/posix-locks
  #option mandatory on
  subvolumes qbert2-test
end-volume

# io-threads (should be just before the server, always last; if you change
# this, make sure the last translator is still named *-export so the client
# config doesn't need to change)

volume qbert1-test-export
  type performance/io-threads
  option thread-count 4
  option cache-size 64MB
  subvolumes qbert1-test-locks
end-volume

volume qbert2-test-export
  type performance/io-threads
  option thread-count 4
  option cache-size 64MB
  subvolumes qbert2-test-locks
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6997
  option client-volume-filename /etc/glusterfs/glusterfs-test-client.vol
  subvolumes qbert-test-ns qbert1-test-export qbert2-test-export
  option auth.ip.qbert-test-ns.allow *
  option auth.ip.qbert1-test-export.allow *
  option auth.ip.qbert2-test-export.allow *
end-volume

********************

glusterfs-test-client.vol (issues 4 and 5, current)

# client connections

volume qbert-test-ns-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume qbert-test-ns
end-volume

volume qbert1-test-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume qbert1-test-export
end-volume

volume qbert2-test-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.40
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume qbert2-test-export
end-volume

volume pacman-test-ns-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume pacman-test-ns
end-volume

volume pacman1-test-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume pacman1-test-export
end-volume

volume pacman2-test-client
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.0.0.41
  option remote-port 6997
  option transport-timeout 30
  option remote-subvolume pacman2-test-export
end-volume

# afr volumes

volume test-ns-afr
  type cluster/afr
  subvolumes qbert-test-ns-client pacman-test-ns-client
end-volume

volume test-1-afr
  type cluster/afr
  subvolumes qbert1-test-client pacman1-test-client
end-volume

volume test-2-afr
  type cluster/afr
  subvolumes qbert2-test-client pacman2-test-client
end-volume

# unify

volume unify
  type cluster/unify
  subvolumes test-1-afr test-2-afr
  option scheduler alu
  option namespace test-ns-afr
  option alu.limits.min-free-disk 2GB
  option alu.order disk-usage
  option alu.disk-usage.entry-threshold 2GB
  option alu.disk-usage.exit-threshold  500MB
  option alu.stat-refresh.interval 10sec
  # option self-heal off
end-volume

# performance translators

volume unify-ra
  type performance/read-ahead
  option page-size 1MB
  option page-count 16
  subvolumes unify
end-volume

volume unify-iocache
  type performance/io-cache
  option cache-size 512MB
  option page-size 1MB
  subvolumes unify-ra
end-volume

volume unify-writeback
  type performance/write-behind
  option aggregate-size 1MB
  option flush-behind off
  subvolumes unify-iocache
end-volume




