[Gluster-devel] Re: more bugs (was Re: io-threads...)
Anand Avati
avati at zresearch.com
Sun Apr 29 07:00:54 UTC 2007
Brent,
if you are using the latest TLA, then this is expected if you have
aggregate-size > 0 and the file size is a multiple of 4096 (not
necessarily an even multiple). Having aggregate-size = 0 and no
io-threads should not produce the mtime glitch at all.
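A quick way to spot candidate files is to list everything whose size is an exact multiple of 4096 bytes. This is only a sketch (GNU find's -printf is assumed); the /scratch path matches the mount in the report below:

```shell
# candidates DIR: print files under DIR whose size is an exact multiple
# of 4096 bytes -- the size pattern behind the mtime glitch. Empty
# files are skipped (-size +0c).
candidates () {
    find "$1" -type f -size +0c -printf '%s %p\n' 2>/dev/null |
        awk '$1 % 4096 == 0 { print $2 }'
}

# e.g. on the mount from the report below:
candidates /scratch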
avati
On Sat, Apr 28, 2007 at 06:18:27PM -0400, Brent A Nelson wrote:
> Adding to the list, there is still an mtime bug when using write-behind
> even without io-threads. It occurs on (big clue!) files with sizes evenly
> divisible by 4096 in my AFR/unified setup:
>
> -rw-r--r-- 1 root root 4096 2007-04-28 16:01 /scratch/usr/src/linux-headers-2.6.15-28/include/net/genetlink.h
> -rwxr-xr-x 1 root root 12288 2007-04-28 15:54 /scratch/usr/bin/last
> -rwxr-xr-x 1 root root 12288 2007-04-28 15:54 /scratch/usr/bin/aseqnet
> -rw-r--r-- 1 root root 528384 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/all-universe.db
> -rw-r--r-- 1 root root 12288 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/all-multiverse.db
> -rw-r--r-- 1 root root 12288 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/i386-restricted.db
> -rw-r--r-- 1 root root 135168 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/i386-multiverse.db
> -rw-r--r-- 1 root root 12288 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/all-restricted.db
> -rw-r--r-- 1 root root 135168 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/all-main.db
> -rw-r--r-- 1 root root 4198400 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/i386-universe.db
> -rw-r--r-- 1 root root 1052672 2007-04-28 15:54 /scratch/usr/share/command-not-found/programs.d/i386-main.db
> -rw-r--r-- 1 root root 65536 2007-04-28 15:53 /scratch/usr/share/samba/valid.dat
> -rw-r--r-- 1 root root 61440 2007-04-28 15:57 /scratch/usr/lib/python2.5/distutils/command/wininst-7.1.exe
> -rw-r--r-- 1 root root 61440 2007-04-28 15:57 /scratch/usr/lib/python2.5/distutils/command/wininst-6.exe
> -rw-r--r-- 1 root root 61440 2007-04-28 15:55 /scratch/usr/lib/python2.4/distutils/command/wininst-7.1.exe
> -rw-r--r-- 1 root root 61440 2007-04-28 15:55 /scratch/usr/lib/python2.4/distutils/command/wininst-6.exe
> -rw-r--r-- 1 root root 4096 2007-04-28 15:57 /scratch/usr/lib/gettext/msgfmt.net.exe
> -rw-r--r-- 1 root root 8192 2007-04-28 15:57 /scratch/usr/lib/GNU.Gettext.dll
> -rw-r--r-- 1 root root 143360 2007-04-28 15:56 /scratch/usr/lib/libgc.so.1.0.2
>
> These files have wrong mtimes on both nodes in the AFR, not just one or
> the other. It resulted from a simple "cp -a" of my /usr directory.
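Since "cp -a" preserves timestamps, one way to confirm the glitch is to diff the mtimes of the source tree and its copy; any file whose mtime changed was clobbered on the glusterfs side. A minimal sketch, assuming GNU stat and space-free filenames:

```shell
# mtime_diff SRC DST: report files whose mtime differs between a source
# tree and its "cp -a" copy (cp -a should have preserved them).
mtime_diff () {
    (cd "$1" && find . -type f) | while read -r f; do
        [ "$(stat -c %Y "$1/$f")" = "$(stat -c %Y "$2/$f")" ] ||
            echo "mtime mismatch: $f"
    done
}

# e.g. against the copy from the report: mtime_diff /usr /scratch/usr
```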
>
> Thanks,
>
> Brent
>
> On Fri, 27 Apr 2007, Brent A Nelson wrote:
>
> >A couple more bugs observed today:
> >
> >1) stat-prefetch still causes glusterfs to die on occasion. I can
> >reproduce this with a bunch of clients doing a du of a complex directory
> >structure; out of 8 clients du'ing simultaneously, one or two glusterfs
> >processes will die before the du finishes. This is probably the same
> >thing I've reported before about stat-prefetch, but I was hoping
> >io-threads might have been responsible (it wasn't).
> >
> >2) NFS reexport is somehow triggering a really rapid memory
> >leak/consumption in glusterfsd, causing it to quickly die. On the NFS
> >client, I did a du of an Ubuntu Edgy mirror, which worked fine. Then I
> >did multiple cp -a's of a simple 30MB directory, which caused rapid
> >memory consumption in the glusterfsd on node1 of an AFR. It soon died
> >(before the sixth copy finished), along with the NFS-exported glusterfs
> >client (also running on node1). This occurs in a simple mirror with
> >storage/posix and protocol/server on the server and protocol/client,
> >cluster/afr, performance/read-ahead, and performance/write-behind on the
> >client.
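For reference, the client-side stack described above would look roughly like this in 1.3-era volfile syntax. This is only a sketch: the volume names, host names, and remote-subvolume are illustrative, not taken from the report.

```
# client volfile sketch matching the stack described above
volume client1
  type protocol/client
  option transport-type tcp/client
  option remote-host node1
  option remote-subvolume brick
end-volume

volume client2
  type protocol/client
  option transport-type tcp/client
  option remote-host node2
  option remote-subvolume brick
end-volume

volume afr
  type cluster/afr
  subvolumes client1 client2
end-volume

volume readahead
  type performance/read-ahead
  subvolumes afr
end-volume

volume writebehind
  type performance/write-behind
  subvolumes readahead
end-volume
```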
> >
> >Thanks,
> >
> >Brent
> >
> >On Fri, 27 Apr 2007, Brent A Nelson wrote:
> >
> >>Hmm, it looks like io-threads is responsible for more than just mtime
> >>glitches when used with write-behind. I just found that the problems I
> >>had with NFS re-export go away when I get rid of io-threads (plus, now
> >>that I can enable write-behind, the NFS write performance is far better,
> >>by at least a factor of 5)!
> >>
> >>It looks like I'll be switching off io-threads for now, and turning on
> >>all the other performance enhancements.
> >>
> >>Thanks,
> >>
> >>Brent
> >>
> >>On Fri, 27 Apr 2007, Brent A Nelson wrote:
> >>
> >>>On Thu, 26 Apr 2007, Anand Avati wrote:
> >>>
> >>>>Brent,
> >>>>I understand what is happening. It is because io-threads lets the
> >>>>mtime update overtake the write call. I assume you have loaded
> >>>>io-threads on the server side (or below write-behind on the client
> >>>>side).
> >>>
> >>>Yes, I have io-threads loaded on the server. This occurs when I load
> >>>write-behind on the client.
> >>>
> >>>>I could provide a temporary 'ugly' fix just for you if the issue is
> >>>>critical (until the proper framework comes in 1.4)
> >>>
> >>>It would be worthwhile if the temporary fix is acceptable for the 1.3
> >>>release (otherwise, you'll need a warning included with the release, so
> >>>that people enabling io-threads and write-behind know what to expect),
> >>>but don't waste your time if it's just for me. Push on to 1.4 and the
> >>>real fix; I'll just leave write-behind disabled for now.
> >>>
> >>>Many Thanks,
> >>>
> >>>Brent
> >>>
> >>
> >
>
--
ultimate_answer_t
deep_thought (void)
{
sleep (years2secs (7500000));
return 42;
}