[Gluster-users] NFS file locks
Steinmetz, Ian
Ian_Steinmetz at cable.comcast.com
Mon May 3 14:38:53 UTC 2010
Shehjar,
I've turned on debug logging for the GlusterFS server and client and
appended the logs below. I've replaced the first three octets of each
IP address with "x.x.x" and left the last octet, for security. It
appears I'm able to lock the file when I run the program directly on
the GlusterFS mount point, just not when it's mounted via NFS. I
checked, and rpc.statd is running. One odd thing: when I run the Perl
locking program directly on the mount point, it appears to work but
spits out the following log message:
[2010-05-03 09:16:39] D [read-ahead.c:468:ra_readv] readahead:
unexpected offset (4096 != 362) resetting
You can see where I tried it several times in the log file. Running the
locking program on the NFS mount doesn't produce any extra error
messages; it just fails. I tested this by mounting my server's NFS
share on the same server (gluster02). Gluster01 (which is x.x.x.17) had
the GlusterFS client and server running at the time of testing; I also
tried without it running.
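For reference, here is a non-blocking variant of the lock probe (a
sketch only; the LOCK_NB flag and the errno report are additions and
were not part of the locktest.pl runs shown below). Failing immediately
and printing the errno distinguishes "No locks available" (ENOLCK) from
a lock that is simply held by another process (EWOULDBLOCK):

#!/usr/bin/perl
# Sketch: non-blocking variant of locktest.pl (not the script used in
# the tests below). LOCK_NB makes flock() return immediately instead of
# blocking, and $! reports the errno, separating ENOLCK ("No locks
# available") from EWOULDBLOCK (lock held by another process).
use strict;
use warnings;
use Fcntl qw(:flock);

my $lock_file = 'lockfile';

open(my $fh, '>>', $lock_file) or die "Cannot open $lock_file: $!\n";

if (flock($fh, LOCK_SH | LOCK_NB)) {
    print "Got shared lock on file $lock_file\n";
} else {
    # In string context $! is the error message, in numeric context
    # it is the errno from the failed flock() call.
    printf "flock failed: %s (errno %d)\n", $!, $! + 0;
}
close($fh);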
I'm able to lock files on normal NFS shares, so this may be some kind
of issue with FUSE and how it interacts with NFS. The version of FUSE
I'm running was compiled/installed from
http://ftp.gluster.com/pub/gluster/glusterfs/fuse/fuse-2.7.4glfs11.tar.gz
Also, since my last post I've upgraded Ubuntu to Lucid (10.04) to pick
up the latest kernel, in case there were fixes there.
[console from gluster02, the NFS server]
root at gluster02:/var/log/glusterfs# rpc.statd
root at gluster02:/var/log/glusterfs# rpc.statd -V
rpc.statd version 1.1.6
root at gluster02:/var/lib/nfs# cd /mnt/glusterfs/
root at gluster02:/mnt/glusterfs# ./locktest.pl
Opened file lockfile
Got shared lock on file lockfile
Closed file lockfile
root at gluster02:/mnt/glusterfs# (IPs removed for security)
root at gluster02:/mnt/glusterfs# exportfs -vfa
exporting x.x.x.0/25:/mnt/test
exporting x.x.x.0/25:/mnt/glusterfs
exporting x.x.x.0/25:/exports/glusterfs
root at gluster02:/mnt/glusterfs# mount x.x.x.16:/mnt/glusterfs /mnt/test
root at gluster02:/mnt/glusterfs# cd /mnt/test
root at gluster02:/mnt/test# ./locktest.pl
Opened file lockfile
Can't get shared lock on lockfile:
No locks available
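Since "No locks available" (ENOLCK) over NFS usually points at the NLM
lock manager or status monitor rather than the filesystem itself, a
quick check like the following (a sketch, not something from the tests
above) confirms whether nlockmgr and status are registered with the
portmapper on the server:

#!/usr/bin/perl
# Sketch: list the NLM-related RPC services registered with the
# portmapper on a host. If nlockmgr or status is missing, NFS lock
# requests from clients will fail.
use strict;
use warnings;

my $host = shift @ARGV || 'localhost';   # NFS server to query
my @reg  = `rpcinfo -p $host`;           # portmapper registration table
my @nlm  = grep { /nlockmgr|status/ } @reg;
print @nlm ? @nlm : "no nlockmgr/status registered on $host\n";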
[glusterfsd server logfile during above testing]
root at gluster02:/var/log/glusterfs# /usr/sbin/glusterfsd -p
/var/run/glusterfsd.pid -f /etc/glusterfs/glusterfsd.vol --log-file
/var/log/glusterfsd.log --debug -N
[2010-05-03 09:09:40] D [glusterfsd.c:424:_get_specfp] glusterfs:
loading volume file /etc/glusterfs/glusterfsd.vol
[2010-05-03 09:09:40] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/features/locks.so: undefined
symbol: notify -- neglecting
[2010-05-03 09:09:40] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so:
undefined symbol: notify -- neglecting
[2010-05-03 09:09:40] D [xlator.c:745:xlator_set_type] xlator:
dlsym(dumpops) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so:
undefined symbol: dumpops -- neglecting
================================================================================
Version : glusterfs 3.0.2 built on Mar 23 2010 01:01:39
git: v3.0.2
Starting Time: 2010-05-03 09:10:20
Command line : /usr/sbin/glusterfsd -p /var/run/glusterfsd.pid -f
/etc/glusterfs/glusterfsd.vol --log-file /var/log/glusterfsd.log --debug
-N
PID : 2231
System name : Linux
Nodename : gluster02
Kernel Release : 2.6.32-21-generic-pae
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume posix
2: type storage/posix
3: option directory /data/export
4: end-volume
5:
6: volume locks
7: type features/locks
8: option manditory on
9: subvolumes posix
10: end-volume
11:
12: volume brick
13: type performance/io-threads
14: option thread-count 8
15: subvolumes locks
16: end-volume
17:
18: volume server
19: type protocol/server
20: option transport-type tcp
21: option auth.addr.brick.allow *
22: option auth.addr.brick-ns.allow *
23: option transport.socket.nodelay on
24: option auth.ip.locks.allow *
25: option auth.ip.locks-afr.allow *
26: subvolumes brick
27: end-volume
+------------------------------------------------------------------------------+
[2010-05-03 09:10:20] D [glusterfsd.c:1370:main] glusterfs: running in
pid 2231
[2010-05-03 09:10:20] D [transport.c:123:transport_load] transport:
attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:10:20] D [name.c:553:server_fill_address_family] server:
option address-family not specified, defaulting to inet/inet6
[2010-05-03 09:10:20] W [server-protocol.c:6521:get_auth_types] server:
assuming 'auth.ip' to be 'auth.addr'
[2010-05-03 09:10:20] W [server-protocol.c:6521:get_auth_types] server:
assuming 'auth.ip' to be 'auth.addr'
[2010-05-03 09:10:20] D [io-threads.c:2841:init] brick: io-threads:
Autoscaling: off, min_threads: 8, max_threads: 8
[2010-05-03 09:10:20] W [glusterfsd.c:540:_log_if_option_is_invalid]
locks: option 'manditory' is not recognized
[2010-05-03 09:10:20] N [glusterfsd.c:1396:main] glusterfs: Successfully
started
[2010-05-03 09:11:20] D [addr.c:190:gf_auth] brick: allowed = "*",
received addr = "x.x.x.17"
[2010-05-03 09:11:20] N [server-protocol.c:5852:mop_setvolume] server:
accepted client from x.x.x.17:1021
[2010-05-03 09:11:20] D [addr.c:190:gf_auth] brick: allowed = "*",
received addr = "x.x.x.17"
[2010-05-03 09:11:20] N [server-protocol.c:5852:mop_setvolume] server:
accepted client from x.x.x.17:1020
[2010-05-03 09:12:21] D [addr.c:190:gf_auth] brick: allowed = "*",
received addr = "x.x.x.18"
[2010-05-03 09:12:21] N [server-protocol.c:5852:mop_setvolume] server:
accepted client from x.x.x.18:1021
[2010-05-03 09:12:21] D [addr.c:190:gf_auth] brick: allowed = "*",
received addr = "x.x.x.18"
[2010-05-03 09:12:21] N [server-protocol.c:5852:mop_setvolume] server:
accepted client from x.x.x.18:1020
[2010-05-03 09:26:21] N [server-protocol.c:6788:notify] server:
x.x.x.18:1021 disconnected
[2010-05-03 09:26:21] N [server-protocol.c:6788:notify] server:
x.x.x.18:1020 disconnected
[2010-05-03 09:26:21] N [server-helpers.c:842:server_connection_destroy]
server: destroyed connection of
gluster02-2242-2010/05/03-09:12:21:41737-brick2
^C
[logfile from gluster client on same server]
root at gluster02:~# /usr/sbin/glusterfs --log-level=DEBUG
--volfile=/etc/glusterfs/glusterfs.vol /mnt/glusterfs --debug -N
[2010-05-03 09:12:21] D [glusterfsd.c:424:_get_specfp] glusterfs:
loading volume file /etc/glusterfs/glusterfs.vol
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/write-behind.so:
undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-cache.so: undefined
symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:745:xlator_set_type] xlator:
dlsym(dumpops) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-cache.so: undefined
symbol: dumpops -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/read-ahead.so:
undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:740:xlator_set_type] xlator:
dlsym(notify) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so:
undefined symbol: notify -- neglecting
[2010-05-03 09:12:21] D [xlator.c:745:xlator_set_type] xlator:
dlsym(dumpops) on
/usr/local/lib/glusterfs/3.0.4/xlator/performance/io-threads.so:
undefined symbol: dumpops -- neglecting
================================================================================
Version : glusterfs 3.0.2 built on Mar 23 2010 01:01:39
git: v3.0.2
Starting Time: 2010-05-03 09:12:21
Command line : /usr/sbin/glusterfs --log-level=DEBUG
--volfile=/etc/glusterfs/glusterfs.vol /mnt/glusterfs --debug -N
PID : 2242
System name : Linux
Nodename : gluster02
Kernel Release : 2.6.32-21-generic-pae
Hardware Identifier: i686
Given volfile:
+------------------------------------------------------------------------------+
1: volume brick1
2: type protocol/client
3: option transport-type tcp/client
4: option remote-host x.x.x.17 # IP address of the remote brick
5: option remote-subvolume brick # name of the remote volume
6: end-volume
7:
8: volume brick2
9: type protocol/client
10: option transport-type tcp/client
11: option remote-host x.x.x.18 # IP address of the remote brick
12: option remote-subvolume brick # name of the remote volume
13: end-volume
14:
15: volume afr1
16: type cluster/afr
17: subvolumes brick1 brick2
18: end-volume
19:
20: volume writebehind
21: type performance/write-behind
22: option window-size 4MB
23: subvolumes afr1
24: end-volume
25:
26: volume cache
27: type performance/io-cache
28: option cache-size 512MB
29: subvolumes writebehind
30: end-volume
31:
32: volume readahead
33: type performance/read-ahead
34: option page-size 128KB # unit in bytes
35: subvolumes cache
36: end-volume
37:
38: volume iothreads
39: type performance/io-threads
40: option thread-count 4
41: option cache-size 64MB
42: subvolumes readahead
43: end-volume
+------------------------------------------------------------------------------+
[2010-05-03 09:12:21] D [glusterfsd.c:1370:main] glusterfs: running in
pid 2242
[2010-05-03 09:12:21] W [xlator.c:656:validate_xlator_volume_options]
writebehind: option 'window-size' is deprecated, preferred is
'cache-size', continuing with correction
[2010-05-03 09:12:21] D [io-threads.c:2841:init] iothreads: io-threads:
Autoscaling: off, min_threads: 4, max_threads: 4
[2010-05-03 09:12:21] D [write-behind.c:2526:init] writebehind:
disabling write-behind for first 1 bytes
[2010-05-03 09:12:21] D [client-protocol.c:6603:init] brick2: defaulting
frame-timeout to 30mins
[2010-05-03 09:12:21] D [client-protocol.c:6614:init] brick2: defaulting
ping-timeout to 42
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport:
attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport:
attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [client-protocol.c:6603:init] brick1: defaulting
frame-timeout to 30mins
[2010-05-03 09:12:21] D [client-protocol.c:6614:init] brick1: defaulting
ping-timeout to 42
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport:
attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [transport.c:123:transport_load] transport:
attempt to load file /usr/local/lib/glusterfs/3.0.4/transport/socket.so
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick1:
address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick1:
address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick2:
address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [name.c:155:client_fill_address_family] brick2:
address-family not specified, guessing it to be inet/inet6
[2010-05-03 09:12:21] W [glusterfsd.c:540:_log_if_option_is_invalid]
iothreads: option 'cache-size' is not recognized
[2010-05-03 09:12:21] W [glusterfsd.c:540:_log_if_option_is_invalid]
readahead: option 'page-size' is not recognized
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick1: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] D [client-protocol.c:7027:notify] brick2: got
GF_EVENT_PARENT_UP, attempting connect on transport
[2010-05-03 09:12:21] N [glusterfsd.c:1396:main] glusterfs: Successfully
started
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick1: got
GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick1: got
GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick2: got
GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] D [client-protocol.c:7041:notify] brick2: got
GF_EVENT_CHILD_UP
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk]
brick2: Connected to x.x.x.18:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [afr.c:2632:notify] afr1: Subvolume 'brick2'
came back up; going online.
[2010-05-03 09:12:21] D [fuse-bridge.c:3100:fuse_thread_proc] fuse:
pthread_cond_timedout returned non zero value ret: 0 errno: 0
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk]
brick2: Connected to x.x.x.18:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [fuse-bridge.c:2950:fuse_init] glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2010-05-03 09:12:21] N [afr.c:2632:notify] afr1: Subvolume 'brick2'
came back up; going online.
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk]
brick1: Connected to x.x.x.17:6996, attached to remote volume 'brick'.
[2010-05-03 09:12:21] N [client-protocol.c:6246:client_setvolume_cbk]
brick1: Connected to x.x.x.17:6996, attached to remote volume 'brick'.
[2010-05-03 09:16:39] D [read-ahead.c:468:ra_readv] readahead:
unexpected offset (4096 != 362) resetting
[2010-05-03 09:26:04] D [read-ahead.c:468:ra_readv] readahead:
unexpected offset (4096 != 362) resetting
[2010-05-03 09:26:21] N [fuse-bridge.c:3140:fuse_thread_proc]
glusterfs-fuse: terminating upon getting ENODEV when reading /dev/fuse
[2010-05-03 09:26:21] N [fuse-bridge.c:3208:fuse_thread_proc] fuse:
unmounting /mnt/glusterfs
[2010-05-03 09:26:21] W [glusterfsd.c:953:cleanup_and_exit] glusterfs:
shutting down
^Croot at gluster02:~#
--
Ian Steinmetz
-----Original Message-----
From: Shehjar Tikoo [mailto:shehjart at gluster.com]
Sent: Sunday, May 02, 2010 11:22 PM
To: Steinmetz, Ian
Cc: gluster-users at gluster.org
Subject: Re: [Gluster-users] NFS file locks
There are a couple of things we can do:
-> Mail us the Glusterfs log files from the NFS server and the glusterfs
servers when the lock script fails. Do file a bug if you can.
-> On the NFS client machine, before you run the mount command, make
sure you run the following command.
$ rpc.statd
-> Run the same perl script but this time at the nfs server over the
glusterfs mount point, not at the NFS client. If it runs fine, it is
probably related to locking over NFS and we'll look at other places to
figure it out.
-Shehjar
Steinmetz, Ian wrote:
> I'm seeing an issue where I can't lock files on an NFS-exported
> GlusterFS mount. I have two servers connected to each other doing AFR
> to provide a highly available NFS server (they mirror the content,
> with one VIP for NFS mounts to clients). Both of the servers have
> mounted "/mnt/glusterfs" using GlusterFS, with the client pointing to
> both servers. I then export the filesystem with NFS. I grabbed a quick
> Perl program that tries to lock a file for testing, which fails only
> on the glusterfs mount. When I export a normal directory "/mnt/test"
> the locking works.
> Any ideas appreciated. I have a feeling I've implemented the
> posix/locks option incorrectly.
>
> Both servers are running Ubuntu with identical setups; the relevant
> configs are below.
> root at gluster01:/mnt/glusterfs# uname -a
> Linux gluster01 2.6.31-20-generic-pae #58-Ubuntu SMP Fri Mar 12 06:25:51 UTC 2010 i686 GNU/Linux
>
> root at gluster01:/mnt/glusterfs# cat /etc/exports
> /mnt/glusterfs <ip removed for security>/25(rw,no_root_squash,no_all_squash,no_subtree_check,sync,insecure,fsid=10)
> /mnt/test <ip removed for security>/25(rw,no_root_squash,no_all_squash,no_subtree_check,sync,insecure,fsid=11)
>
> * I've tried async, sync, and removing all options except fsid.
>
> root at gluster02:/etc/glusterfs# cat glusterfs.vol
> volume brick1
> type protocol/client
> option transport-type tcp/client
> option remote-host <ip removed for security> # IP address of the remote brick
> option remote-subvolume brick # name of the remote volume
> end-volume
>
> volume brick2
> type protocol/client
> option transport-type tcp/client
> option remote-host <ip removed for security> # IP address of the remote brick
> option remote-subvolume brick # name of the remote volume
> end-volume
>
> volume afr1
> type cluster/afr
> subvolumes brick1 brick2
> end-volume
>
> volume writebehind
> type performance/write-behind
> option window-size 4MB
> subvolumes afr1
> end-volume
>
> volume cache
> type performance/io-cache
> option cache-size 512MB
> subvolumes writebehind
> end-volume
>
> volume readahead
> type performance/read-ahead
> option page-size 128KB # unit in bytes
> subvolumes cache
> end-volume
>
> volume iothreads
> type performance/io-threads
> option thread-count 4
> option cache-size 64MB
> subvolumes readahead
> end-volume
>
> root at gluster02:/etc/glusterfs# cat glusterfsd.vol
> volume posix
> type storage/posix
> option directory /data/export
> end-volume
>
> volume locks
> type features/posix-locks
> option manditory on # tried with and without this, found in a search of earlier posts
> subvolumes posix
> end-volume
>
> volume brick
> type performance/io-threads
> option thread-count 8
> subvolumes locks
> end-volume
>
> volume server
> type protocol/server
> option transport-type tcp
> option auth.addr.brick.allow *
> option auth.addr.brick-ns.allow *
> option transport.socket.nodelay on
> option auth.ip.locks.allow *
> subvolumes brick
> end-volume
>
>
> * file to test locking...
> root at gluster02:/mnt/glusterfs# cat locktest.pl
> #!/usr/bin/perl
>
> use Fcntl qw(:flock);
>
> my $lock_file = 'lockfile';
>
> open(LOCKFILE,">>$lock_file") or die "Cannot open $lock_file: $!\n";
> print "Opened file $lock_file\n";
> flock(LOCKFILE, LOCK_SH) or die "Can't get shared lock on $lock_file:
> $!\n";
> print "Got shared lock on file $lock_file\n";
> sleep 2;
> close LOCKFILE;
> print "Closed file $lock_file\n";
>
> exit;
>
> *Test run from gluster02 using normal NFS mount:
> root at gluster02:/# mount <ip removed for security>:/mnt/test /mnt/test
> root at gluster02:/# cd /mnt/test
> root at gluster02:/mnt/test# ./locktest.pl
> Opened file lockfile
> Got shared lock on file lockfile
> Closed file lockfile
>
> *Test run from gluster02 using gluster exported NFS mount:
> root at gluster02:/# mount <ip removed for security>:/mnt/glusterfs /mnt/test
> root at gluster02:/# cd /mnt/test
> root at gluster02:/mnt/test# ./locktest.pl
> Opened file lockfile
> Can't get shared lock on lockfile:
> No locks available
>
> --
> Ian Steinmetz
> Comcast Engineering - Houston
> 713-375-7866
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users