[Gluster-devel] 2.0.0rc4 (and rc5) locks up when used as root

Gordan Bobic gordan at bobich.net
Sat Mar 21 00:46:32 UTC 2009


I'm not sure what you mean. The root volume is replicate/afr, but 
there is only one node active (I haven't bothered building the 2nd node 
yet).

The init scripts don't change any permissions during boot. What I 
mentioned below is happening in rc.sysinit.
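
For the permission comparison suggested below, a rough sketch along 
these lines could produce dumps that diff cleanly. It assumes GNU find 
and GNU stat (for the -c format option); dump_perms and the example 
paths in the comments are made-up names for illustration, not anything 
from the init scripts or from GlusterFS.

```shell
#!/bin/sh
# Hypothetical helper: print "mode owner:group path" for every entry
# under a directory, sorted by path, so two dumps can be diffed.
# Assumes GNU stat (-c) and a find that supports -exec ... {} +.
dump_perms() {
    # $1: directory to walk; paths are printed relative to it
    ( cd "$1" && find . -exec stat -c '%a %U:%G %n' {} + | LC_ALL=C sort -k3 )
}

# Example use (both paths are placeholders, adjust to the real setup):
#   dump_perms /path/to/backend/etc   > /tmp/perms.backend
#   dump_perms /path/to/replicate/etc > /tmp/perms.replicate
#   diff -u /tmp/perms.backend /tmp/perms.replicate
```

An empty diff would suggest permissions look the same with and without 
replicate in the path.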

If permissions were the problem, the init process wouldn't just lock up. 
If the execute permission weren't there, it would simply skip the 
script, which is what I forced by running chmod -x on udev-stw.modules.
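
To narrow down which script the loop hangs on, the rc.sysinit loop 
could be wrapped with trace output, roughly as sketched here. 
run_modules is a hypothetical name and the trace lines are added for 
illustration; none of this is part of the stock rc.sysinit.

```shell
#!/bin/sh
# Hypothetical tracing variant of the rc.sysinit module loop.
# Prints a line before and after each script so that, on a hang,
# the last "running" line on the console names the culprit.
run_modules() {
    # $1: directory holding *.modules scripts
    for file in "$1"/*.modules; do
        # Skip the literal pattern when nothing matches the glob
        [ -e "$file" ] || continue
        if [ -x "$file" ]; then
            echo "trace: running $file"
            "$file"
            echo "trace: $file exited with status $?"
        else
            echo "trace: skipping $file (not executable)"
        fi
    done
}

# Example use:
#   run_modules /etc/sysconfig/modules
```

A script without the execute bit only produces a "skipping" line, which 
is the behaviour described above.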

Gordan

Harshavardhana wrote:
> Gordan,
> 
>     Were the permissions being changed over replicate by any chance? 
> Can you write an additional script to check the permissions of files, 
> or at least produce a dump to compare permissions with and without 
> replicate?
>  
> Regards
> --
> Harshavardhana
> "Yantra Shilpi"
> Z Research Inc - http://www.zresearch.com
> 
> 
> 
> On Fri, Mar 20, 2009 at 11:46 PM, Gordan Bobic <gordan at bobich.net> 
> wrote:
> 
>     As suspected, chmod -x /etc/sysconfig/modules/udev-stw.modules fixes
>     the immediate problem. It would appear things like:
> 
> 
>     for file in /etc/sysconfig/modules/*.modules ; do
>      [ -x $file ] && $file
>     done
> 
>     seem to cause it to lock up.
> 
>     Unfortunately, that sort of thing happens all over the place in the
>     boot scripts, and now it locks up a few steps later. The last
>     version this worked with was rc2 (possibly rc3, I haven't tested
>     it). It's definitely not working on rc4 and rc5.
> 
>     Gordan
> 
> 
>     Gordan Bobic wrote:
> 
>         Anand Avati wrote:
> 
>                 It looks like it locks up when used as root
>                 (afr/replicate) at the point
>                 where it initially starts up udev (not 100% sure where
>                 exactly yet, will
>                 have to put some trace code in rc.sysinit).
> 
>                 2.0.0rc2 didn't have this problem.
> 
> 
>             Can you try rc5? Though it is still under QA, you
>             might want to give it a try, since some transport
>             related code changes have gone in which might be the
>             reason for your lockup.
> 
> 
>         The lock-up still occurs with rc5.
> 
>         I've done some more digging, however. It appears to die at this
>         point in rc.sysinit, between debug 3 and debug 4:
> 
>         #################
>         echo "debug 3"
>         # Load other user-defined modules
>         for file in /etc/sysconfig/modules/*.modules ; do
>          [ -x $file ] && $file
>         done
> 
>         # Load modules (for backward compatibility with VARs)
>         if [ -f /etc/rc.modules ]; then
>                /etc/rc.modules
>         fi
>         echo "debug 4"
>         #################
> 
>         There is no rc.modules file, so contents of that can be ruled out.
> 
>         # ls -l /etc/sysconfig/modules/
>         total 8
>         -rwxr-xr-x 1 root root 100 May 25  2008 udev-stw.modules
> 
>         # cat /etc/sysconfig/modules/udev-stw.modules
>         #!/bin/sh
>         for i in nvram floppy parport lp snd-powermac;do
>                modprobe $i >/dev/null 2>&1
>         done
> 
>         I have just rebuilt my initrds separately with rc2 and rc5. rc2
>         works fine, rc5 fails. No other changes to the system between
>         the two attempts.
> 
>         Oh, and the first access failure bug is still there.
> 
>         I couldn't test for the memory leak in rc5 since I couldn't get
>         it to boot due to the lock-up mentioned above.
> 
>         I'll try disabling those modules listed above since I don't need
>         them on this setup, but I can confirm that modprobe itself works
>         fine. So it sounds like a problem/bug elsewhere. Possibly a
>         buffer-overrun somewhere that gets triggered by rc4/rc5.
> 
>         BTW, in case it's relevant, I'm using the fuse kernel module
>         from 2.6.24.7, rather than the one from the patched package,
>         because the one in the kernel appears to be later. Can anyone
>         confirm if there are any known problems with this? Is there any
>         strong reason why I should use a different kernel module (e.g.
>         one from the patched fuse 2.7.4 package)?
> 
>         Gordan
> 
> 
>         _______________________________________________
>         Gluster-devel mailing list
>         Gluster-devel at nongnu.org <mailto:Gluster-devel at nongnu.org>
>         http://lists.nongnu.org/mailman/listinfo/gluster-devel





