[Gluster-devel] Missing files?

Matthias Saou thias at spam.spam.spam.spam.spam.spam.spam.egg.and.spam.freshrpms.net
Wed Jul 22 17:39:26 UTC 2009


Hi,

(Note: I have access to the systems referenced in the initial post)

I think I've found the problem. It's the filesystem, XFS, which is
mounted with the "inode64" option; it can't be mounted without it
since it was grown to 39TB. I've just checked this:

# ls -1 -ai /file/data/cust | sort -n

And the last few lines look like this:

[...]
 2148235729 cust2
 2148236297 cust6
 2148236751 cust5
 2148236974 cust7
 2148237729 cust3
 2148239365 cust4
 2156210172 cust8
61637541899 cust1
96636784146 cust9

Note that "cust1" here is the one where the problem has been seen
initially. I've just checked, and the "cust9" directory is affected in
the exact same way.

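So it seems like the glusterfs build being used has problems with
64-bit inode numbers: "cust1" and "cust9" are the only two entries
above 2^32 (4294967296), and the client in the log below is a 32-bit
(i686) machine. Is this a known limitation? Is it easy to fix or work
around?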
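
For anyone who wants to repeat the check on another directory, here is
a minimal sketch in Python. It assumes the cutoff that matters is 2^32,
i.e. an inode number that no longer fits in an unsigned 32-bit field;
that is my guess at the cause, not something confirmed in the glusterfs
code. The helper names are hypothetical and not part of glusterfs.

```python
import os

# Largest inode number that fits in an unsigned 32-bit field (assumed cutoff).
MAX_32BIT_INODE = 2**32 - 1


def entries_over_32bit(listing):
    """Return the names from (inode, name) pairs whose inode exceeds 32 bits."""
    return [name for inode, name in listing if inode > MAX_32BIT_INODE]


def scan_directory(path):
    """Build sorted (inode, name) pairs for a directory, like `ls -1 -ai | sort -n`."""
    with os.scandir(path) as entries:
        return sorted((entry.inode(), entry.name) for entry in entries)


# Inode numbers copied from the listing above.
sample = [
    (2148235729, "cust2"),
    (2148236297, "cust6"),
    (2148236751, "cust5"),
    (2148236974, "cust7"),
    (2148237729, "cust3"),
    (2148239365, "cust4"),
    (2156210172, "cust8"),
    (61637541899, "cust1"),
    (96636784146, "cust9"),
]

flagged = entries_over_32bit(sample)
print(flagged)
```

On the server itself, entries_over_32bit(scan_directory("/file/data/cust"))
would run the same check against the live directory.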

Matthias

Roger Torrentsgenerós wrote :

> 
> We have 2 servers, let's name them file01 and file02. They are synced
> very frequently, so we can assume contents are the same. Then we have
> lots of clients, each of which has two glusterfs mounts, one against
> each file server.
> 
> Before you ask, let me say the clients are in a production environment,
> where I can't afford any downtime. To make the migration from glusterfs
> v1.3 to glusterfs v2.0 as smooth as possible, I recompiled the packages
> to run under the glusterfs2 name. The servers are running two instances
> of the glusterfs daemon, and the old one will be stopped once the
> migration is complete. So you'll see some glusterfs2 names and build
> dates that may look unusual, but they have nothing to do with this
> matter.
> 
> file01 server log:
> 
> ================================================================================
> Version      : glusterfs 2.0.1 built on May 26 2009 05:11:51
> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> Starting Time: 2009-07-14 18:07:12
> Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid 
> PID          : 6337
> System name  : Linux
> Nodename     : file01
> Kernel Release : 2.6.18-128.1.14.el5
> Hardware Identifier: x86_64
> 
> Given volfile:
> +------------------------------------------------------------------------------+
>   1: # The data store directory to serve
>   2: volume filedata-ds
>   3:   type storage/posix
>   4:   option directory /file/data
>   5: end-volume
>   6: 
>   7: # Make the data store read-only
>   8: volume filedata-readonly
>   9:   type testing/features/filter
> 10:   option read-only on
> 11:   subvolumes filedata-ds
> 12: end-volume
> 13: 
> 14: # Optimize
> 15: volume filedata-iothreads
> 16:   type performance/io-threads
> 17:   option thread-count 64
> 18: #  option autoscaling on
> 19: #  option min-threads 16
> 20: #  option max-threads 256
> 21:   subvolumes filedata-readonly
> 22: end-volume
> 23: 
> 24: # Add readahead feature
> 25: volume filedata
> 26:   type performance/read-ahead   # cache per file = (page-count x page-size)
> 27: #  option page-size 256kB        # 256KB is the default option ?
> 28: #  option page-count 8           # 16 is default option ?
> 29:   subvolumes filedata-iothreads
> 30: end-volume
> 31: 
> 32: # Main server section
> 33: volume server
> 34:   type protocol/server
> 35:   option transport-type tcp
> 36:   option transport.socket.listen-port 6997
> 37:   subvolumes filedata
> 38:   option auth.addr.filedata.allow 192.168.128.* # streamers
> 39:   option verify-volfile-checksum off # don't have clients complain
> 40: end-volume
> 41: 
> 
> +------------------------------------------------------------------------------+
> [2009-07-14 18:07:12] N [glusterfsd.c:1152:main] glusterfs: Successfully
> started
> 
> file02 server log:
> 
> ================================================================================
> Version      : glusterfs 2.0.1 built on May 26 2009 05:11:51
> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> Starting Time: 2009-06-28 08:42:13
> Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid 
> PID          : 5846
> System name  : Linux
> Nodename     : file02
> Kernel Release : 2.6.18-92.1.10.el5
> Hardware Identifier: x86_64
> 
> Given volfile:
> +------------------------------------------------------------------------------+
>   1: # The data store directory to serve
>   2: volume filedata-ds
>   3:   type storage/posix
>   4:   option directory /file/data
>   5: end-volume
>   6: 
>   7: # Make the data store read-only
>   8: volume filedata-readonly
>   9:   type testing/features/filter
> 10:   option read-only on
> 11:   subvolumes filedata-ds
> 12: end-volume
> 13: 
> 14: # Optimize
> 15: volume filedata-iothreads
> 16:   type performance/io-threads
> 17:   option thread-count 64
> 18: #  option autoscaling on
> 19: #  option min-threads 16
> 20: #  option max-threads 256
> 21:   subvolumes filedata-readonly
> 22: end-volume
> 23: 
> 24: # Add readahead feature
> 25: volume filedata
> 26:   type performance/read-ahead   # cache per file = (page-count x page-size)
> 27: #  option page-size 256kB        # 256KB is the default option ?
> 28: #  option page-count 8           # 16 is default option ?
> 29:   subvolumes filedata-iothreads
> 30: end-volume
> 31: 
> 32: # Main server section
> 33: volume server
> 34:   type protocol/server
> 35:   option transport-type tcp
> 36:   option transport.socket.listen-port 6997
> 37:   subvolumes filedata
> 38:   option auth.addr.filedata.allow 192.168.128.* # streamers
> 39:   option verify-volfile-checksum off # don't have clients complain
> 40: end-volume
> 41: 
> 
> +------------------------------------------------------------------------------+
> [2009-06-28 08:42:13] N [glusterfsd.c:1152:main] glusterfs: Successfully
> started
> 
> Now let's pick a random client, for example streamer013, and see its
> log:
> 
> ================================================================================
> Version      : glusterfs 2.0.1 built on May 26 2009 05:23:52
> TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> Starting Time: 2009-07-22 18:34:31
> Command line : /usr/sbin/glusterfs2 --log-level=NORMAL
> --volfile-server=file02.priv --volfile-server-port=6997 /mnt/file02 
> PID          : 14519
> System name  : Linux
> Nodename     : streamer013
> Kernel Release : 2.6.18-92.1.10.el5PAE
> Hardware Identifier: i686
> 
> Given volfile:
> +------------------------------------------------------------------------------+
>   1: # filedata
>   2: volume filedata
>   3:   type protocol/client
>   4:   option transport-type tcp
>   5:   option remote-host file02.priv
>   6:   option remote-port 6997          # use non default to run in parallel
>   7:   option remote-subvolume filedata
>   8: end-volume
>   9: 
> 10: # Add readahead feature
> 11: volume readahead
> 12:   type performance/read-ahead   # cache per file = (page-count x page-size)
> 13: #  option page-size 256kB        # 256KB is the default option ?
> 14: #  option page-count 2           # 16 is default option ?
> 15:   subvolumes filedata
> 16: end-volume
> 17: 
> 18: # Add threads
> 19: volume iothreads
> 20:   type performance/io-threads
> 21:   option thread-count 8
> 22: #  option autoscaling on
> 23: #  option min-threads 16
> 24: #  option max-threads 256
> 25:   subvolumes readahead
> 26: end-volume
> 27: 
> 28: # Add IO-Cache feature
> 29: volume iocache
> 30:   type performance/io-cache
> 31:   option cache-size 64MB        # default is 32MB (in 1.3)
> 32:   option page-size 256KB        # 128KB is default option (in 1.3)
> 33:   subvolumes iothreads
> 34: end-volume
> 35: 
> 
> +------------------------------------------------------------------------------+
> [2009-07-22 18:34:31] N [glusterfsd.c:1152:main] glusterfs: Successfully
> started
> [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> filedata: Connected to 192.168.128.232:6997, attached to remote volume
> 'filedata'.
> [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> filedata: Connected to 192.168.128.232:6997, attached to remote volume
> 'filedata'.
> 
> The mounts seem OK:
> 
> [root@streamer013 /]# mount
> /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> /dev/sda1 on /boot type ext3 (rw)
> tmpfs on /dev/shm type tmpfs (rw)
> none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> glusterfs#file01.priv on /mnt/file01 type fuse
> (rw,max_read=131072,allow_other,default_permissions)
> glusterfs#file02.priv on /mnt/file02 type fuse
> (rw,max_read=131072,allow_other,default_permissions)
> 
> They work:
> 
> [root@streamer013 /]# ls /mnt/file01/
> cust
> [root@streamer013 /]# ls /mnt/file02/
> cust
> 
> And they are seen by both servers:
> 
> file01:
> 
> [2009-07-22 18:34:19] N [server-helpers.c:723:server_connection_destroy]
> server: destroyed connection of
> streamer013.p4.bt.bcn.flumotion.net-14335-2009/07/22-18:34:13:210609-filedata
> [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> 192.168.128.213:1017 disconnected
> [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> 192.168.128.213:1018 disconnected
> [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> accepted client from 192.168.128.213:1017
> [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> accepted client from 192.168.128.213:1018
> 
> file02:
> 
> [2009-07-22 18:34:20] N [server-helpers.c:723:server_connection_destroy]
> server: destroyed connection of
> streamer013.p4.bt.bcn.flumotion.net-14379-2009/07/22-18:34:13:267495-filedata
> [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> 192.168.128.213:1014 disconnected
> [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> 192.168.128.213:1015 disconnected
> [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> accepted client from 192.168.128.213:1015
> [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> accepted client from 192.168.128.213:1014
> 
> Now let's see the funny things. First, a content listing of a particular
> directory, locally from both servers:
> 
> [root@file01 ~]# ls /file/data/cust/cust1
> configs  files  outgoing  reports
> 
> [root@file02 ~]# ls /file/data/cust/cust1
> configs  files  outgoing  reports
> 
> Now let's try to see the same from the client side:
> 
> [root@streamer013 /]# ls /mnt/file01/cust/cust1
> ls: /mnt/file01/cust/cust1: No such file or directory
> [root@streamer013 /]# ls /mnt/file02/cust/cust1
> configs  files  outgoing  reports
> 
> Oops :( And the client log says:
> 
> [2009-07-22 18:41:22] W [fuse-bridge.c:1651:fuse_opendir]
> glusterfs-fuse: 64: OPENDIR (null) (fuse_loc_fill() failed)
> 
> Neither server's log says anything.
> 
> So the files really exist on the servers, but the same client can see
> them on one of the filers and not on the other, although both are
> running exactly the same software. But there's more. It seems it only
> happens for certain directories (I can't show you the contents for
> privacy reasons, but I guess you'll figure it out):
> 
> [root@streamer013 /]# ls /mnt/file01/cust/|wc -l
> 95
> [root@streamer013 /]# ls /mnt/file02/cust/|wc -l
> 95
> [root@streamer013 /]# for i in `ls /mnt/file01/cust/`; do
> ls /mnt/file01/cust/$i; done|grep such
> ls: /mnt/file01/cust/cust1: No such file or directory
> ls: /mnt/file01/cust/cust2: No such file or directory
> [root@streamer013 /]# for i in `ls /mnt/file02/cust/`; do
> ls /mnt/file02/cust/$i; done|grep such
> [root@streamer013 /]# 
> 
> And of course, the error shows up twice in our client log:
> 
> [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> glusterfs-fuse: 2119: OPENDIR (null) (fuse_loc_fill() failed)
> [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> glusterfs-fuse: 2376: OPENDIR (null) (fuse_loc_fill() failed)
> 
> 
> I hope I have been clear enough this time. If you need more data just
> let me know and I'll see what I can do.
> 
> And thanks again for your help.
> 
> Roger
> 
> 
> On Wed, 2009-07-22 at 09:10 -0700, Anand Avati wrote: 
> > > I have been witnessing some strange behaviour with my GlusterFS system.
> > > The fact is that some files exist and are completely accessible on
> > > the server but can't be accessed from a client, while other files
> > > can.
> > >
> > > To be sure, I copied the same files to another directory and I still was
> > > unable to see them from the client. To be sure it wasn't any kind of
> > > file permissions, selinux or whatever issue, I created a copy from a
> > > working directory, and still wasn't seen from the client. All I get is
> > > a:
> > >
> > > ls: .: No such file or directory
> > >
> > > And the client log says:
> > >
> > > [2009-07-22 14:04:18] W [fuse-bridge.c:1651:fuse_opendir]
> > > glusterfs-fuse: 104778: OPENDIR (null) (fuse_loc_fill() failed)
> > >
> > > While the server log says nothing.
> > >
> > > Funniest thing is the same client has another GlusterFS mount to another
> > > server, which has exactly the same contents as the first one, and this
> > > mount does work.
> > >
> > > Some data:
> > >
> > > [root@streamer001 /]# ls /mnt/file01/cust/cust1/
> > > ls: /mnt/file01/cust/cust1/: No such file or directory
> > >
> > > [root@streamer001 /]# ls /mnt/file02/cust/cust1/
> > > configs  files  outgoing  reports
> > >
> > > [root@streamer001 /]# mount
> > > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > > proc on /proc type proc (rw)
> > > sysfs on /sys type sysfs (rw)
> > > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > > /dev/sda1 on /boot type ext3 (rw)
> > > tmpfs on /dev/shm type tmpfs (rw)
> > > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > > sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
> > > glusterfs#file01.priv on /mnt/file01 type fuse
> > > (rw,max_read=131072,allow_other,default_permissions)
> > > glusterfs#file02.priv on /mnt/file02 type fuse
> > > (rw,max_read=131072,allow_other,default_permissions)
> > >
> > > [root@file01 /]# ls /file/data/cust/cust1
> > > configs  files  outgoing  reports
> > >
> > > [root@file02 /]# ls /file/data/cust/cust1
> > > configs  files  outgoing  reports
> > >
> > > Any ideas?
> > 
> > Can you please post all your client and server logs and volfiles? Are
> > you quite certain that this is not a result of some misconfiguration?
> > 
> > Avati
> 
> 


-- 
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 10 (Cambridge) - Linux kernel
2.6.27.25-170.2.72.fc10.x86_64 Load : 12.85 9.21 5.85




