[Gluster-devel] Missing files?

Roger Torrentsgenerós rtorrents at flumotion.com
Wed Jul 22 17:17:12 UTC 2009


We have 2 servers, let's call them file01 and file02. They are synced
very frequently, so we can assume their contents are the same. Then we have
lots of clients, each of which has two glusterfs mounts, one
for each file server.

Before you ask, let me say the clients are in a production environment,
where I can't afford any downtime. To make the migration from glusterfs
v1.3 to glusterfs v2.0 as smooth as possible, I recompiled the packages
to run under the glusterfs2 name. The servers are running two instances of
the glusterfs daemon, and the old one will be stopped once the
migration is complete. So you'll see some glusterfs2 names and build
dates that may look unusual, but you'll also see this has nothing to do
with the problem.

file01 server log:

================================================================================
Version      : glusterfs 2.0.1 built on May 26 2009 05:11:51
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-07-14 18:07:12
Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid 
PID          : 6337
System name  : Linux
Nodename     : file01
Kernel Release : 2.6.18-128.1.14.el5
Hardware Identifier: x86_64

Given volfile:
+------------------------------------------------------------------------------+
  1: # The data store directory to serve
  2: volume filedata-ds
  3:   type storage/posix
  4:   option directory /file/data
  5: end-volume
  6: 
  7: # Make the data store read-only
  8: volume filedata-readonly
  9:   type testing/features/filter
10:   option read-only on
11:   subvolumes filedata-ds
12: end-volume
13: 
14: # Optimize
15: volume filedata-iothreads
16:   type performance/io-threads
17:   option thread-count 64
18: #  option autoscaling on
19: #  option min-threads 16
20: #  option max-threads 256
21:   subvolumes filedata-readonly
22: end-volume
23: 
24: # Add readahead feature
25: volume filedata
26:   type performance/read-ahead   # cache per file = (page-count x page-size)
27: #  option page-size 256kB        # 256KB is the default option ?
28: #  option page-count 8           # 16 is default option ?
29:   subvolumes filedata-iothreads
30: end-volume
31: 
32: # Main server section
33: volume server
34:   type protocol/server
35:   option transport-type tcp
36:   option transport.socket.listen-port 6997
37:   subvolumes filedata
38:   option auth.addr.filedata.allow 192.168.128.* # streamers
39:   option verify-volfile-checksum off # don't have clients complain
40: end-volume
41: 

+------------------------------------------------------------------------------+
[2009-07-14 18:07:12] N [glusterfsd.c:1152:main] glusterfs: Successfully started

file02 server log:

================================================================================
Version      : glusterfs 2.0.1 built on May 26 2009 05:11:51
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-06-28 08:42:13
Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid 
PID          : 5846
System name  : Linux
Nodename     : file02
Kernel Release : 2.6.18-92.1.10.el5
Hardware Identifier: x86_64

Given volfile:
+------------------------------------------------------------------------------+
  1: # The data store directory to serve
  2: volume filedata-ds
  3:   type storage/posix
  4:   option directory /file/data
  5: end-volume
  6: 
  7: # Make the data store read-only
  8: volume filedata-readonly
  9:   type testing/features/filter
10:   option read-only on
11:   subvolumes filedata-ds
12: end-volume
13: 
14: # Optimize
15: volume filedata-iothreads
16:   type performance/io-threads
17:   option thread-count 64
18: #  option autoscaling on
19: #  option min-threads 16
20: #  option max-threads 256
21:   subvolumes filedata-readonly
22: end-volume
23: 
24: # Add readahead feature
25: volume filedata
26:   type performance/read-ahead   # cache per file = (page-count x page-size)
27: #  option page-size 256kB        # 256KB is the default option ?
28: #  option page-count 8           # 16 is default option ?
29:   subvolumes filedata-iothreads
30: end-volume
31: 
32: # Main server section
33: volume server
34:   type protocol/server
35:   option transport-type tcp
36:   option transport.socket.listen-port 6997
37:   subvolumes filedata
38:   option auth.addr.filedata.allow 192.168.128.* # streamers
39:   option verify-volfile-checksum off # don't have clients complain
40: end-volume
41: 

+------------------------------------------------------------------------------+
[2009-06-28 08:42:13] N [glusterfsd.c:1152:main] glusterfs: Successfully started
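
Both servers print their volfile at startup, so one way to confirm the two configurations really are identical is to strip the line-number prefixes from each log and compare what is left. The following is only a minimal sketch: the function names are made up for illustration, the log paths are whatever the caller passes in (the thread does not give them), and it assumes each log contains a single startup banner.

```shell
#!/bin/sh
# Hypothetical helper: print the volfile embedded in a glusterfs log,
# without the "NN: " prefixes. Only numbered lines between the two
# "+------...------+" ruler lines are kept.
extract_volfile() {
    awk '
        /^[+]-+[+]$/ { inside = !inside; next }   # toggle at each ruler line
        inside && /^ *[0-9]+:/ {                  # a numbered volfile line
            sub(/^ *[0-9]+: ?/, "")               # drop the number prefix
            print
        }
    ' "$1"
}

# Hypothetical helper: exit status 0 when both logs embed the same volfile.
same_volfile() {
    a=$(extract_volfile "$1")
    b=$(extract_volfile "$2")
    [ "$a" = "$b" ]
}
```

For example, `same_volfile file01.log file02.log && echo identical` would print `identical` for the two server logs above, assuming they were saved under those (made-up) file names.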

Now let's pick a random client, for example streamer013, and see its
log:

================================================================================
Version      : glusterfs 2.0.1 built on May 26 2009 05:23:52
TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
Starting Time: 2009-07-22 18:34:31
Command line : /usr/sbin/glusterfs2 --log-level=NORMAL --volfile-server=file02.priv --volfile-server-port=6997 /mnt/file02
PID          : 14519
System name  : Linux
Nodename     : streamer013
Kernel Release : 2.6.18-92.1.10.el5PAE
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1: # filedata
  2: volume filedata
  3:   type protocol/client
  4:   option transport-type tcp
  5:   option remote-host file02.priv
  6:   option remote-port 6997          # use non default to run in parallel
  7:   option remote-subvolume filedata
  8: end-volume
  9: 
10: # Add readahead feature
11: volume readahead
12:   type performance/read-ahead   # cache per file = (page-count x page-size)
13: #  option page-size 256kB        # 256KB is the default option ?
14: #  option page-count 2           # 16 is default option ?
15:   subvolumes filedata
16: end-volume
17: 
18: # Add threads
19: volume iothreads
20:   type performance/io-threads
21:   option thread-count 8
22: #  option autoscaling on
23: #  option min-threads 16
24: #  option max-threads 256
25:   subvolumes readahead
26: end-volume
27: 
28: # Add IO-Cache feature
29: volume iocache
30:   type performance/io-cache
31:   option cache-size 64MB        # default is 32MB (in 1.3)
32:   option page-size 256KB        # 128KB is default option (in 1.3)
33:   subvolumes iothreads
34: end-volume
35: 

+------------------------------------------------------------------------------+
[2009-07-22 18:34:31] N [glusterfsd.c:1152:main] glusterfs: Successfully started
[2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk] filedata: Connected to 192.168.128.232:6997, attached to remote volume 'filedata'.
[2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk] filedata: Connected to 192.168.128.232:6997, attached to remote volume 'filedata'.

The mounts look fine:

[root at streamer013 /]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
glusterfs#file01.priv on /mnt/file01 type fuse (rw,max_read=131072,allow_other,default_permissions)
glusterfs#file02.priv on /mnt/file02 type fuse (rw,max_read=131072,allow_other,default_permissions)

They work:

[root at streamer013 /]# ls /mnt/file01/
cust
[root at streamer013 /]# ls /mnt/file02/
cust

And they are seen by both servers:

file01:

[2009-07-22 18:34:19] N [server-helpers.c:723:server_connection_destroy] server: destroyed connection of streamer013.p4.bt.bcn.flumotion.net-14335-2009/07/22-18:34:13:210609-filedata
[2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server: 192.168.128.213:1017 disconnected
[2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server: 192.168.128.213:1018 disconnected
[2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server: accepted client from 192.168.128.213:1017
[2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server: accepted client from 192.168.128.213:1018

file02:

[2009-07-22 18:34:20] N [server-helpers.c:723:server_connection_destroy] server: destroyed connection of streamer013.p4.bt.bcn.flumotion.net-14379-2009/07/22-18:34:13:267495-filedata
[2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server: 192.168.128.213:1014 disconnected
[2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server: 192.168.128.213:1015 disconnected
[2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server: accepted client from 192.168.128.213:1015
[2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server: accepted client from 192.168.128.213:1014

Now for the strange part. First, a listing of a particular
directory, run locally on both servers:

[root at file01 ~]# ls /file/data/cust/cust1
configs  files  outgoing  reports

[root at file02 ~]# ls /file/data/cust/cust1
configs  files  outgoing  reports

Now let's try to see the same from the client side:

[root at streamer013 /]# ls /mnt/file01/cust/cust1
ls: /mnt/file01/cust/cust1: No such file or directory
[root at streamer013 /]# ls /mnt/file02/cust/cust1
configs  files  outgoing  reports

Oops :( And the client log says:

[2009-07-22 18:41:22] W [fuse-bridge.c:1651:fuse_opendir] glusterfs-fuse: 64: OPENDIR (null) (fuse_loc_fill() failed)

Neither server's log says anything.

So the files really exist on the servers, but the same client can see them
on one of the filers and not on the other, although both are running
exactly the same software. But there's more. It seems to happen only
for certain directories (I can't show you the contents for privacy reasons,
but I guess you'll figure it out):

[root at streamer013 /]# ls /mnt/file01/cust/|wc -l
95
[root at streamer013 /]# ls /mnt/file02/cust/|wc -l
95
[root at streamer013 /]# for i in `ls /mnt/file01/cust/`; do ls /mnt/file01/cust/$i; done|grep such
ls: /mnt/file01/cust/cust1: No such file or directory
ls: /mnt/file01/cust/cust2: No such file or directory
[root at streamer013 /]# for i in `ls /mnt/file02/cust/`; do ls /mnt/file02/cust/$i; done|grep such
[root at streamer013 /]# 
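
The two loops above can be folded into one pass that lists each entry on both mounts and prints only the names where the results differ. This is a rough sketch, not a tested diagnostic for glusterfs: the function names `listable` and `report_unlistable` are invented here, entry names are taken from the first mount only, and names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print "ok" if ls can list the path, "FAIL" otherwise.
listable() {
    if ls "$1" >/dev/null 2>&1; then echo ok; else echo FAIL; fi
}

# Hypothetical helper: for every entry found under the first mount point,
# try to list it on both mounts and print the entries whose results differ
# as "name<TAB>status-on-first<TAB>status-on-second".
report_unlistable() {
    mount_a=$1
    mount_b=$2
    ls "$mount_a" | while IFS= read -r name; do
        status_a=$(listable "$mount_a/$name")
        status_b=$(listable "$mount_b/$name")
        if [ "$status_a" != "$status_b" ]; then
            printf '%s\t%s\t%s\n' "$name" "$status_a" "$status_b"
        fi
    done
}
```

It would be called as `report_unlistable /mnt/file01/cust /mnt/file02/cust`, using the mount points from this thread.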

And of course, the client log shows the error twice:

[2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir] glusterfs-fuse: 2119: OPENDIR (null) (fuse_loc_fill() failed)
[2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir] glusterfs-fuse: 2376: OPENDIR (null) (fuse_loc_fill() failed)
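
To see how often this warning occurs and which FUSE operations it affects, the client log can be summarised per calling function. A small sketch, assuming the log file holds each message on a single line (the wrapping in this mail comes from the mail client) and that the log path is supplied by the caller; `count_loc_fill_failures` is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: count "fuse_loc_fill() failed" warnings in a client
# log, grouped by the fuse-bridge.c function that reported them.
# Output format: "<function> <count>", one line per function.
count_loc_fill_failures() {
    grep -F 'fuse_loc_fill() failed' "$1" |
        sed -n 's/.*\[fuse-bridge\.c:[0-9]*:\([a-z_]*\)].*/\1/p' |
        sort | uniq -c |
        awk '{ print $2, $1 }'
}
```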


I hope I have been clear enough this time. If you need more data, just
let me know and I'll see what I can do.

And thanks again for your help.

Roger


On Wed, 2009-07-22 at 09:10 -0700, Anand Avati wrote: 
> > I have been witnessing some strange behaviour with my GlusterFS system.
> > Fact is there are some files which exist and are completely accessible
> > in the server, while they can't be accessed from a client, while other
> > files do.
> >
> > To be sure, I copied the same files to another directory and I still was
> > unable to see them from the client. To be sure it wasn't any kind of
> > file permissions, selinux or whatever issue, I created a copy from a
> > working directory, and still wasn't seen from the client. All I get is
> > a:
> >
> > ls: .: No such file or directory
> >
> > And the client log says:
> >
> > [2009-07-22 14:04:18] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 104778: OPENDIR (null) (fuse_loc_fill() failed)
> >
> > While the server log says nothing.
> >
> > Funniest thing is the same client has another GlusterFS mount to another
> > server, which has exactly the same contents as the first one, and this
> > mount does work.
> >
> > Some data:
> >
> > [root at streamer001 /]# ls /mnt/file01/cust/cust1/
> > ls: /mnt/file01/cust/cust1/: No such file or directory
> >
> > [root at streamer001 /]# ls /mnt/file02/cust/cust1/
> > configs  files  outgoing  reports
> >
> > [root at streamer001 /]# mount
> > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > proc on /proc type proc (rw)
> > sysfs on /sys type sysfs (rw)
> > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > /dev/sda1 on /boot type ext3 (rw)
> > tmpfs on /dev/shm type tmpfs (rw)
> > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
> > glusterfs#file01.priv on /mnt/file01 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> > glusterfs#file02.priv on /mnt/file02 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> >
> > [root at file01 /]# ls /file/data/cust/cust1
> > configs  files  outgoing  reports
> >
> > [root at file02 /]# ls /file/data/cust/cust1
> > configs  files  outgoing  reports
> >
> > Any ideas?
> 
> Can you please post all your client and server logs and volfiles? Are
> you quite certain that this is not a result of some misconfiguration?
> 
> Avati




