[Gluster-devel] AFR+locks bug?

Székelyi Szabolcs cc at avaxio.hu
Thu Jan 17 17:08:22 UTC 2008


Hi,

AFR with posix-locks has been behaving really strangely lately. GlusterFS
is a fresh TLA checkout (patch-636), and FUSE is the brand new 2.7.2-glfs8.

I have 4 servers, each running a 4-way AFR, with features/posix-locks
loaded just above the storage/posix bricks. On each AFR, one replica is
the local storage; the remaining 3 are on the other 3 servers.

Each of the 4 servers mounts its own AFR brick from 'localhost'.

The machines are freshly booted. Basic FS functions (ls, copy, cat) work
fine.

Now I run a distributed locking test using [1]. On the "master" locker I
get:

> # /tmp/locktests -n 10 -c 3  -f /mnt/glusterfs/testfile
> Init
> process initalization
> ....................
> --------------------------------------
> 
> TEST : TRY TO WRITE ON A READ  LOCK:==========
> TEST : TRY TO WRITE ON A WRITE LOCK:==========
> TEST : TRY TO READ  ON A READ  LOCK:==========
> TEST : TRY TO READ  ON A WRITE LOCK:==========
> TEST : TRY TO SET A READ  LOCK ON A READ  LOCK:

After about 5 minutes, another

> RDONLY: fcntl: Transport endpoint is not connected

appears, and the locking processes exit on all slave servers, while the
master blocks.

The mount point locks up. Even an `ls` from a different terminal seems
to block forever.

You can find my server config below. The client configs are simple: just
a protocol/client brick pointing at localhost. I can provide server debug
logs if you need them.
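
For reference, a client volfile of that shape would look roughly like
the sketch below. The volume name "client" is made up for illustration,
and pointing remote-subvolume at data-afr is an assumption inferred from
the auth.ip.data-afr.allow line in the server config, which admits
127.0.0.1; the actual client files may differ.

```
# Hypothetical client volfile: a single protocol/client brick that
# connects to the server process running on the same machine.
volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host localhost
  # assumed: the exported AFR volume, not the raw data1 brick
  option remote-subvolume data-afr
end-volume
```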

Any idea?

Thanks,
--
Szabolcs


[1] http://nfsv4.bullopensource.org/tools/tests_tools/locktests-net.tar.gz


My server config (from a single node, lu1):

volume data-posix
  type storage/posix
  option directory /srv/glusterfs
end-volume

volume data1
  type features/posix-locks
  subvolumes data-posix
end-volume

volume data2
  type protocol/client
  option transport-type tcp/client
  option remote-host lu2
  option remote-subvolume data2
end-volume

volume data3
  type protocol/client
  option transport-type tcp/client
  option remote-host lu3
  option remote-subvolume data3
end-volume

volume data4
  type protocol/client
  option transport-type tcp/client
  option remote-host lu4
  option remote-subvolume data4
end-volume

volume data-afr
  type cluster/afr
  subvolumes data1 data2 data3 data4
end-volume

volume server
  type protocol/server
  subvolumes data1 data-afr
  option transport-type tcp/server
  option auth.ip.data1.allow 10.0.0.*
  option auth.ip.data-afr.allow 127.0.0.1,10.0.0.*
end-volume




