[Gluster-devel] Missing files?
Matthias Saou
thias at spam.spam.spam.spam.spam.spam.spam.egg.and.spam.freshrpms.net
Thu Jul 23 09:59:25 UTC 2009
Hi,
Replying to myself with some more details: the servers are 64-bit
(x86_64) whereas the clients are 32-bit (ix86). It seems like this could
be the cause of the problem...
http://oss.sgi.com/archives/xfs/2009-07/msg00044.html
But if the glusterfs client doesn't know about the original inodes of
the files, then it should be possible to fix, right?
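
A minimal sketch for spotting affected entries, assuming the problem really
is inode numbers that do not fit in 32 bits on the client. The helper
flag_wide_inodes below is hypothetical (not part of glusterfs): it reads
"ls -1 -ai" style output and prints the names whose inode number is greater
than 4294967295 (2^32 - 1). The sample input is taken from the listing
quoted below.

```shell
#!/bin/sh
# Hypothetical helper: print directory entries whose inode number does not
# fit in an unsigned 32-bit integer (greater than 4294967295).
# Input format is that of `ls -1 -ai`: "<inode> <name>" per line.
# Assumption: the 32-bit limit is the relevant threshold for the client.
flag_wide_inodes() {
    awk '$1 ~ /^[0-9]+$/ && $1 > 4294967295 { print $2 }'
}

# Sample lines taken from the directory listing in the quoted message.
printf '%s\n' \
    '2148239365 cust4' \
    '2156210172 cust8' \
    '61637541899 cust1' \
    '96636784146 cust9' | flag_wide_inodes
```

On a server the same filter could be fed directly, for example with
"ls -1 -ai /file/data/cust | flag_wide_inodes". Names containing spaces
would be cut at the first space, so this is only a rough check.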
Matthias
Matthias Saou wrote :
> Hi,
>
> (Note: I have access to the systems referenced in the initial post)
>
> I think I've found the problem. The filesystem is XFS, mounted with
> the "inode64" option; it can't be mounted without it since it has been
> grown to 39TB. I've just checked this:
>
> # ls -1 -ai /file/data/cust | sort -n
>
> And the last few lines are like this :
>
> [...]
> 2148235729 cust2
> 2148236297 cust6
> 2148236751 cust5
> 2148236974 cust7
> 2148237729 cust3
> 2148239365 cust4
> 2156210172 cust8
> 61637541899 cust1
> 96636784146 cust9
>
> Note that "cust1" here is the one where the problem has been seen
> initially. I've just checked, and the "cust9" directory is affected in
> the exact same way.
>
> So it seems like the glusterfs build being used has problems with 64bit
> inodes. Is this a known limitation? Is it easy to fix or work around?
>
> Matthias
>
> Roger Torrentsgenerós wrote :
>
> >
> > We have 2 servers, let's name them file01 and file02. They are synced
> > very frequently, so we can assume their contents are the same. Then we
> > have lots of clients, each of which has two glusterfs mounts, one
> > against each file server.
> >
> > Before you ask, let me say the clients are in a production environment,
> > where I can't afford any downtime. To make the migration from glusterfs
> > v1.3 to glusterfs v2.0 as smooth as possible, I recompiled the packages
> > to run under the glusterfs2 name. The servers are running two instances
> > of the glusterfs daemon, and the old one will be stopped once the
> > migration is complete. So you'll see some glusterfs2 names and build
> > dates that may look unusual, but this has nothing to do with the
> > problem.
> >
> > file01 server log:
> >
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:11:51
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-07-14 18:07:12
> > Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid
> > PID : 6337
> > System name : Linux
> > Nodename : file01
> > Kernel Release : 2.6.18-128.1.14.el5
> > Hardware Identifier: x86_64
> >
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # The data store directory to serve
> > 2: volume filedata-ds
> > 3: type storage/posix
> > 4: option directory /file/data
> > 5: end-volume
> > 6:
> > 7: # Make the data store read-only
> > 8: volume filedata-readonly
> > 9: type testing/features/filter
> > 10: option read-only on
> > 11: subvolumes filedata-ds
> > 12: end-volume
> > 13:
> > 14: # Optimize
> > 15: volume filedata-iothreads
> > 16: type performance/io-threads
> > 17: option thread-count 64
> > 18: # option autoscaling on
> > 19: # option min-threads 16
> > 20: # option max-threads 256
> > 21: subvolumes filedata-readonly
> > 22: end-volume
> > 23:
> > 24: # Add readahead feature
> > 25: volume filedata
> > 26: type performance/read-ahead # cache per file = (page-count x page-size)
> > 27: # option page-size 256kB # 256KB is the default option ?
> > 28: # option page-count 8 # 16 is default option ?
> > 29: subvolumes filedata-iothreads
> > 30: end-volume
> > 31:
> > 32: # Main server section
> > 33: volume server
> > 34: type protocol/server
> > 35: option transport-type tcp
> > 36: option transport.socket.listen-port 6997
> > 37: subvolumes filedata
> > 38: option auth.addr.filedata.allow 192.168.128.* # streamers
> > 39: option verify-volfile-checksum off # don't have clients complain
> > 40: end-volume
> > 41:
> >
> > +------------------------------------------------------------------------------+
> > [2009-07-14 18:07:12] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> >
> > file02 server log:
> >
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:11:51
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-06-28 08:42:13
> > Command line : /usr/sbin/glusterfsd2 -p /var/run/glusterfsd2.pid
> > PID : 5846
> > System name : Linux
> > Nodename : file02
> > Kernel Release : 2.6.18-92.1.10.el5
> > Hardware Identifier: x86_64
> >
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # The data store directory to serve
> > 2: volume filedata-ds
> > 3: type storage/posix
> > 4: option directory /file/data
> > 5: end-volume
> > 6:
> > 7: # Make the data store read-only
> > 8: volume filedata-readonly
> > 9: type testing/features/filter
> > 10: option read-only on
> > 11: subvolumes filedata-ds
> > 12: end-volume
> > 13:
> > 14: # Optimize
> > 15: volume filedata-iothreads
> > 16: type performance/io-threads
> > 17: option thread-count 64
> > 18: # option autoscaling on
> > 19: # option min-threads 16
> > 20: # option max-threads 256
> > 21: subvolumes filedata-readonly
> > 22: end-volume
> > 23:
> > 24: # Add readahead feature
> > 25: volume filedata
> > 26: type performance/read-ahead # cache per file = (page-count x page-size)
> > 27: # option page-size 256kB # 256KB is the default option ?
> > 28: # option page-count 8 # 16 is default option ?
> > 29: subvolumes filedata-iothreads
> > 30: end-volume
> > 31:
> > 32: # Main server section
> > 33: volume server
> > 34: type protocol/server
> > 35: option transport-type tcp
> > 36: option transport.socket.listen-port 6997
> > 37: subvolumes filedata
> > 38: option auth.addr.filedata.allow 192.168.128.* # streamers
> > 39: option verify-volfile-checksum off # don't have clients complain
> > 40: end-volume
> > 41:
> >
> > +------------------------------------------------------------------------------+
> > [2009-06-28 08:42:13] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> >
> > Now let's pick a random client, for example streamer013, and see its
> > log:
> >
> > ================================================================================
> > Version : glusterfs 2.0.1 built on May 26 2009 05:23:52
> > TLA Revision : 5c1d9108c1529a1155963cb1911f8870a674ab5b
> > Starting Time: 2009-07-22 18:34:31
> > Command line : /usr/sbin/glusterfs2 --log-level=NORMAL --volfile-server=file02.priv --volfile-server-port=6997 /mnt/file02
> > PID : 14519
> > System name : Linux
> > Nodename : streamer013
> > Kernel Release : 2.6.18-92.1.10.el5PAE
> > Hardware Identifier: i686
> >
> > Given volfile:
> > +------------------------------------------------------------------------------+
> > 1: # filedata
> > 2: volume filedata
> > 3: type protocol/client
> > 4: option transport-type tcp
> > 5: option remote-host file02.priv
> > 6: option remote-port 6997 # use non default to run in parallel
> > 7: option remote-subvolume filedata
> > 8: end-volume
> > 9:
> > 10: # Add readahead feature
> > 11: volume readahead
> > 12: type performance/read-ahead # cache per file = (page-count x page-size)
> > 13: # option page-size 256kB # 256KB is the default option ?
> > 14: # option page-count 2 # 16 is default option ?
> > 15: subvolumes filedata
> > 16: end-volume
> > 17:
> > 18: # Add threads
> > 19: volume iothreads
> > 20: type performance/io-threads
> > 21: option thread-count 8
> > 22: # option autoscaling on
> > 23: # option min-threads 16
> > 24: # option max-threads 256
> > 25: subvolumes readahead
> > 26: end-volume
> > 27:
> > 28: # Add IO-Cache feature
> > 29: volume iocache
> > 30: type performance/io-cache
> > 31: option cache-size 64MB # default is 32MB (in 1.3)
> > 32: option page-size 256KB # 128KB is default option (in 1.3)
> > 33: subvolumes iothreads
> > 34: end-volume
> > 35:
> >
> > +------------------------------------------------------------------------------+
> > [2009-07-22 18:34:31] N [glusterfsd.c:1152:main] glusterfs: Successfully
> > started
> > [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> > filedata: Connected to 192.168.128.232:6997, attached to remote volume
> > 'filedata'.
> > [2009-07-22 18:34:31] N [client-protocol.c:5557:client_setvolume_cbk]
> > filedata: Connected to 192.168.128.232:6997, attached to remote volume
> > 'filedata'.
> >
> > The mounts seem OK:
> >
> > [root@streamer013 /]# mount
> > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > proc on /proc type proc (rw)
> > sysfs on /sys type sysfs (rw)
> > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > /dev/sda1 on /boot type ext3 (rw)
> > tmpfs on /dev/shm type tmpfs (rw)
> > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > glusterfs#file01.priv on /mnt/file01 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> > glusterfs#file02.priv on /mnt/file02 type fuse
> > (rw,max_read=131072,allow_other,default_permissions)
> >
> > They work:
> >
> > [root@streamer013 /]# ls /mnt/file01/
> > cust
> > [root@streamer013 /]# ls /mnt/file02/
> > cust
> >
> > And they are seen by both servers:
> >
> > file01:
> >
> > [2009-07-22 18:34:19] N [server-helpers.c:723:server_connection_destroy]
> > server: destroyed connection of streamer013.
> > p4.bt.bcn.flumotion.net-14335-2009/07/22-18:34:13:210609-filedata
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1017 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1018 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1017
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1018
> >
> > file02:
> >
> > [2009-07-22 18:34:20] N [server-helpers.c:723:server_connection_destroy]
> > server: destroyed connection of streamer013.
> > p4.bt.bcn.flumotion.net-14379-2009/07/22-18:34:13:267495-filedata
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1014 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7796:notify] server:
> > 192.168.128.213:1015 disconnected
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1015
> > [2009-07-22 18:34:31] N [server-protocol.c:7035:mop_setvolume] server:
> > accepted client from 192.168.128.213:1014
> >
> > Now let's see the funny things. First, a content listing of a particular
> > directory, locally from both servers:
> >
> > [root@file01 ~]# ls /file/data/cust/cust1
> > configs files outgoing reports
> >
> > [root@file02 ~]# ls /file/data/cust/cust1
> > configs files outgoing reports
> >
> > Now let's try to see the same from the client side:
> >
> > [root@streamer013 /]# ls /mnt/file01/cust/cust1
> > ls: /mnt/file01/cust/cust1: No such file or directory
> > [root@streamer013 /]# ls /mnt/file02/cust/cust1
> > configs files outgoing reports
> >
> > Oops :( And the client log says:
> >
> > [2009-07-22 18:41:22] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 64: OPENDIR (null) (fuse_loc_fill() failed)
> >
> > Meanwhile, neither server's log says anything.
> >
> > So the files really exist on the servers, but the same client can see
> > them on one of the filers and not on the other, although both are running
> > exactly the same software. But there's more. It seems to happen only
> > for certain directories (I can't show you the contents for privacy
> > reasons, but I guess you'll figure it out):
> >
> > [root@streamer013 /]# ls /mnt/file01/cust/|wc -l
> > 95
> > [root@streamer013 /]# ls /mnt/file02/cust/|wc -l
> > 95
> > [root@streamer013 /]# for i in `ls /mnt/file01/cust/`; do
> > ls /mnt/file01/cust/$i; done|grep such
> > ls: /mnt/file01/cust/cust1: No such file or directory
> > ls: /mnt/file01/cust/cust2: No such file or directory
> > [root@streamer013 /]# for i in `ls /mnt/file02/cust/`; do
> > ls /mnt/file02/cust/$i; done|grep such
> > [root@streamer013 /]#
> >
> > And of course, the client logs the error twice:
> >
> > [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 2119: OPENDIR (null) (fuse_loc_fill() failed)
> > [2009-07-22 18:49:21] W [fuse-bridge.c:1651:fuse_opendir]
> > glusterfs-fuse: 2376: OPENDIR (null) (fuse_loc_fill() failed)
> >
> >
> > I hope I have been clear enough this time. If you need more data just
> > let me know and I'll see what I can do.
> >
> > And thanks again for your help.
> >
> > Roger
> >
> >
> > On Wed, 2009-07-22 at 09:10 -0700, Anand Avati wrote:
> > > > I have been witnessing some strange behaviour with my GlusterFS system.
> > > > The fact is that some files exist and are fully accessible on the
> > > > server but can't be accessed from a client, while other files can.
> > > >
> > > > To be sure, I copied the same files to another directory and I still was
> > > > unable to see them from the client. To rule out any kind of file
> > > > permissions, selinux or similar issue, I created a copy from a
> > > > working directory, and it still wasn't seen from the client. All I get
> > > > is:
> > > >
> > > > ls: .: No such file or directory
> > > >
> > > > And the client log says:
> > > >
> > > > [2009-07-22 14:04:18] W [fuse-bridge.c:1651:fuse_opendir]
> > > > glusterfs-fuse: 104778: OPENDIR (null) (fuse_loc_fill() failed)
> > > >
> > > > While the server log says nothing.
> > > >
> > > > Funniest thing is the same client has another GlusterFS mount to another
> > > > server, which has exactly the same contents as the first one, and this
> > > > mount does work.
> > > >
> > > > Some data:
> > > >
> > > > [root@streamer001 /]# ls /mnt/file01/cust/cust1/
> > > > ls: /mnt/file01/cust/cust1/: No such file or directory
> > > >
> > > > [root@streamer001 /]# ls /mnt/file02/cust/cust1/
> > > > configs files outgoing reports
> > > >
> > > > [root@streamer001 /]# mount
> > > > /dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
> > > > proc on /proc type proc (rw)
> > > > sysfs on /sys type sysfs (rw)
> > > > devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> > > > /dev/sda1 on /boot type ext3 (rw)
> > > > tmpfs on /dev/shm type tmpfs (rw)
> > > > none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
> > > > sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
> > > > glusterfs#file01.priv on /mnt/file01 type fuse
> > > > (rw,max_read=131072,allow_other,default_permissions)
> > > > glusterfs#file02.priv on /mnt/file02 type fuse
> > > > (rw,max_read=131072,allow_other,default_permissions)
> > > >
> > > > [root@file01 /]# ls /file/data/cust/cust1
> > > > configs files outgoing reports
> > > >
> > > > [root@file02 /]# ls /file/data/cust/cust1
> > > > configs files outgoing reports
> > > >
> > > > Any ideas?
> > >
> > > Can you please post all your client and server logs and volfiles? Are
> > > you quite certain that this is not a result of some misconfiguration?
> > >
> > > Avati
> >
> >
>
>
--
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 10 (Cambridge) - Linux kernel
2.6.27.25-170.2.72.fc10.x86_64 Load : 0.50 3.32 2.58