[Gluster-users] Gluster-users Digest, Vol 27, Issue 1

Brian Smith brs at usf.edu
Thu Jul 1 19:29:30 UTC 2010


Each of the two bricks has just one XFS file system w/ inode64 enabled
on a 10TB LVM LV.  Each of the volumes is less than 40% full and inode
counts look reasonable.
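
For anyone who wants to repeat the fullness and inode-count checks on their own bricks, here is a minimal Python sketch. The function name usage_report and the brick paths passed on the command line are illustrative assumptions, not part of any GlusterFS tooling; it only reads what os.statvfs reports for the filesystem holding each path.

```python
import os
import sys


def usage_report(path):
    """Return (percent of blocks used, percent of inodes used) for the
    filesystem that holds `path`."""
    st = os.statvfs(path)
    # f_blocks and f_bfree are in the same unit, so the ratio needs no conversion.
    blocks_pct = 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks if st.f_blocks else 0.0
    # Some filesystems report f_files == 0; treat that as "no inode limit known".
    inodes_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else 0.0
    return blocks_pct, inodes_pct


if __name__ == "__main__":
    # Hypothetical usage: python usage_report.py /path/to/brick1 /path/to/brick2
    for brick in sys.argv[1:]:
        blocks_pct, inodes_pct = usage_report(brick)
        print("%s: %.1f%% of blocks used, %.1f%% of inodes used"
              % (brick, blocks_pct, inodes_pct))
```

Run on each server against the storage/posix export directory; this should roughly agree with what df and df -i show for the same mount.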

I'm working to get a test environment going so I can reproduce this
off-production and add the trace translator.  I've sent in a bunch of
trace data and opened a bug:
http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=1040

-Brian


-- 
Brian Smith
Senior Systems Administrator
IT Research Computing, University of South Florida
4202 E. Fowler Ave. ENB204
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu



> Date: Wed, 30 Jun 2010 15:33:24 -0400
> From: Brian Smith <brs at usf.edu>
> Subject: Re: [Gluster-users] Revisit: FORTRAN Codes and File I/O
> To: Harshavardhana <harsha at gluster.com>
> Cc: Gluster General Discussion List <gluster-users at gluster.org>
> Message-ID: <1277926404.2687.60.camel at localhost.localdomain>
> Content-Type: text/plain; charset="UTF-8"
> 
> Spoke too soon.  Same problem occurs minus all performance translators.
> Debug logs on the server show
> 
> [2010-06-30 15:30:54] D [server-protocol.c:2104:server_create_cbk]
> server-tcp: create(/b/brs/Si/CHGCAR) inode (ptr=0x2aaab00e05b0,
> ino=2159011921, gen=5488651098262601749) found conflict
> (ptr=0x2aaab40cca00, ino=2159011921, gen=5488651098262601749)
> [2010-06-30 15:30:54] D [server-resolve.c:386:resolve_entry_simple]
> server-tcp: inode (pointer: 0x2aaab40cca00 ino:2159011921) found for
> path (/b/brs/Si/CHGCAR) while type is RESOLVE_NOT
> [2010-06-30 15:30:54] D [server-protocol.c:2132:server_create_cbk]
> server-tcp: 72: CREATE (null) (0) ==> -1 (File exists)
> 
> -Brian
> 
> 
> 
> On Wed, 2010-06-30 at 13:06 -0400, Brian Smith wrote:
> > I received these in my debug output during a run that failed:
> > 
> > [2010-06-30 12:34:25] D [read-ahead.c:468:ra_readv] readahead:
> > unexpected offset (8192 != 1062) resetting
> > [2010-06-30 12:34:25] D [read-ahead.c:468:ra_readv] readahead:
> > unexpected offset (8192 != 1062) resetting
> > [2010-06-30 12:34:25] D [read-ahead.c:468:ra_readv] readahead:
> > unexpected offset (8192 != 1062) resetting
> > [2010-06-30 12:34:25] D [read-ahead.c:468:ra_readv] readahead:
> > unexpected offset (8192 != 1062) resetting
> > 
> > I disabled the read-ahead translator as well as the three other
> > performance translators commented out in my vol file (I'm on GigE; the
> > docs say I can still reach link max anyway) and my processes appear to
> > be running smoothly.  I'll go ahead and submit the bug report with
> > tracing enabled as well.
> > 
> > -Brian
> > 
> > 
> Date: Wed, 30 Jun 2010 21:17:59 -0400
> From: Jeff Darcy <jdarcy at redhat.com>
> Subject: Re: [Gluster-users] Revisit: FORTRAN Codes and File I/O
> To: gluster-users at gluster.org
> Message-ID: <4C2BECC7.5010402 at redhat.com>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
> 
> On 06/30/2010 03:33 PM, Brian Smith wrote:
> > Spoke too soon.  Same problem occurs minus all performance translators.
> > Debug logs on the server show
> >
> > [2010-06-30 15:30:54] D [server-protocol.c:2104:server_create_cbk]
> > server-tcp: create(/b/brs/Si/CHGCAR) inode (ptr=0x2aaab00e05b0,
> > ino=2159011921, gen=5488651098262601749) found conflict
> > (ptr=0x2aaab40cca00, ino=2159011921, gen=5488651098262601749)
> > [2010-06-30 15:30:54] D [server-resolve.c:386:resolve_entry_simple]
> > server-tcp: inode (pointer: 0x2aaab40cca00 ino:2159011921) found for
> > path (/b/brs/Si/CHGCAR) while type is RESOLVE_NOT
> > [2010-06-30 15:30:54] D [server-protocol.c:2132:server_create_cbk]
> > server-tcp: 72: CREATE (null) (0) ==>  -1 (File exists)
> >    
> The first line almost looks like a create attempt for a file that 
> already exists at the server.  The second and third lines look like *yet 
> another* create attempt, failing this time before the request is even 
> passed to the next translator.  This might be a good time to drag out 
> the debug/trace translator, and sit it on top of brick1 to watch the 
> create calls.  That will help nail down the exact sequence of events as 
> the server sees them, so we don't go looking in the wrong places.  It 
> might even be useful to do the same on the client side, but perhaps not 
> yet.  Instructions are here:
> 
> http://www.gluster.com/community/documentation/index.php/Translators/debug/trace
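
To make the first log line easier to read, the sketch below pulls the two inode descriptions out of the server_create_cbk message quoted above and compares them. The regular expression and the function name parse_conflict are assumptions written against this one excerpt, not a general GlusterFS log parser. It only shows what the log itself states: two different in-memory pointers carrying the same ino and gen values.

```python
import re

# Log excerpt copied from the message above, line-wrapped as it was mailed.
LOG = """\
[2010-06-30 15:30:54] D [server-protocol.c:2104:server_create_cbk]
server-tcp: create(/b/brs/Si/CHGCAR) inode (ptr=0x2aaab00e05b0,
ino=2159011921, gen=5488651098262601749) found conflict
(ptr=0x2aaab40cca00, ino=2159011921, gen=5488651098262601749)
"""

# Pattern for one "(ptr=..., ino=..., gen=...)" group; an assumption based on this excerpt.
INODE = r"\(ptr=(0x[0-9a-f]+), ino=(\d+), gen=(\d+)\)"
CONFLICT = re.compile(r"create\((\S+)\) inode " + INODE + " found conflict " + INODE)


def parse_conflict(text):
    """Return (path, new_inode, existing_inode) or None if no conflict line is found."""
    flat = " ".join(text.split())  # undo the mail client's line wrapping
    m = CONFLICT.search(flat)
    if m is None:
        return None
    path, p1, i1, g1, p2, i2, g2 = m.groups()
    new = {"ptr": p1, "ino": int(i1), "gen": int(g1)}
    old = {"ptr": p2, "ino": int(i2), "gen": int(g2)}
    return path, new, old


if __name__ == "__main__":
    path, new, old = parse_conflict(LOG)
    same_file = new["ino"] == old["ino"] and new["gen"] == old["gen"]
    print("path:", path)
    print("same ino/gen:", same_file)
    print("different in-memory inode:", new["ptr"] != old["ptr"])
```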
> 
> In the meantime, to further identify which code paths are most likely 
> to be relevant, it would be helpful to know a couple more things.
> 
> (1) Is each storage/posix volume using just one local filesystem, or is 
> it possible that the underlying directory tree spans more than one?  
> This could lead to inode-number duplication, which requires extra handling.
> 
> (2) Is either of the server-side volumes close to being full?  This 
> could result in creating an extra "linkfile" on the subvolume/server 
> where we'd normally create the file, pointing to where we really created 
> it due to space considerations.
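
Question (1) above can be checked mechanically. The following is a minimal Python sketch, assuming read access to the export directory on the server; the function name devices_under and the command-line path are illustrative. It walks the tree and collects the device ID (st_dev) of every entry. More than one device ID means the tree crosses a mount point, which is the situation where inode numbers could repeat.

```python
import os
import sys


def devices_under(root):
    """Map each device ID found under `root` to one example path on that device."""
    seen = {os.lstat(root).st_dev: root}
    # os.walk does not follow symlinks by default, but it does descend into
    # mount points, which is exactly what we want to detect here.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                dev = os.lstat(path).st_dev
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            seen.setdefault(dev, path)
    return seen


if __name__ == "__main__":
    # Hypothetical usage: python devices_under.py /path/to/brick1
    for root in sys.argv[1:]:
        found = devices_under(root)
        verdict = "single filesystem" if len(found) == 1 else "SPANS MULTIPLE FILESYSTEMS"
        print("%s: %s" % (root, verdict))
        for dev, example in sorted(found.items()):
            print("  st_dev=%d e.g. %s" % (dev, example))
```

On a large brick the walk touches every entry, so it is better run off-peak or against a test copy.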
> 



