[Gluster-users] VM Instances Hang When Mounting Root Filesystem

Mon Apr 25 17:36:35 UTC 2011

Greetings Gluster folk,

I'm running a set of Xen instances. The Xen server configuration and disk
images currently live on a local disk, and I'm attempting to migrate the
images/instances off local disk and onto a gluster share.

I have a 4-node cluster of physical systems. Each system in the cluster has
2 disks, and the 2nd drive is dedicated to gluster. It's mounted as /gluster/
and the volume looks like this:

[root@vm-container-0-0 ~]# gluster volume info

Volume Name: pifs
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: vm-container-0-0:/gluster
Brick2: vm-container-0-1:/gluster
Brick3: vm-container-0-2:/gluster
Brick4: vm-container-0-3:/gluster

[root@vm-container-0-1 ~]# df -h /pifs/
Filesystem            Size  Used Avail Use% Mounted on
glusterfs#127.0.0.1:pifs
                      1.8T  421G  1.3T  25% /pifs
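
For readers unfamiliar with the "2 x 2 = 4" line in the volume info above: in a Distributed-Replicate volume, bricks are grouped into replica sets in the order they are listed, and files are distributed across those sets. The sketch below only prints that grouping. It assumes the volume was created with a replica count of 2 and with the bricks in the order shown, neither of which the post states explicitly; `replica_sets` is a hypothetical helper, not a gluster command.

```shell
#!/bin/sh
# Illustrative only: show how consecutively listed bricks pair up into
# replica sets. Assumption: replica count 2, bricks in the listed order.
replica=2
bricks="vm-container-0-0:/gluster vm-container-0-1:/gluster vm-container-0-2:/gluster vm-container-0-3:/gluster"

# replica_sets COUNT BRICK...: print one line per replica set
replica_sets() {
    count=$1; shift
    set_no=1; i=0; line=""
    for b in "$@"; do
        line="$line $b"
        i=$((i + 1))
        if [ "$i" -eq "$count" ]; then
            echo "set $set_no:$line"
            set_no=$((set_no + 1)); i=0; line=""
        fi
    done
}

# $bricks is intentionally unquoted so it splits into one argument per brick
replica_sets "$replica" $bricks
```

Under those assumptions, vm-container-0-0 and vm-container-0-1 hold copies of the same files, as do vm-container-0-2 and vm-container-0-3, so a given disk image lives on two of the four nodes.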

I'm mounting the gluster partition from /etc/fstab, like so:

127.0.0.1:pifs  /pifs  glusterfs  direct-io-mode=disable,_netdev  0 0

on each of the 4 nodes. Thus, each system is both a gluster server and a
gluster client. Everything seems to work fairly happily except when Xen tries
to actually run an instance. Every instance begins to boot and then hits the
same problem: it simply hangs after checking the local disks. So, to be
clear, this is the output of 'xm console' for a VM instance running off the
glusterfs-based /pifs partition:

kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Setting up other filesystems.
Setting up new root fs
no fstab.sys, mounting internal defaults
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /sys
SELinux:  Disabled at runtime.
type=1404 audit(1303751618.915:2): selinux=0 auid=4294967295 ses=4294967295
INIT: version 2.86 booting
                Welcome to  CentOS release 5.5 (Final)
                Press 'I' to enter interactive startup.
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
Setting clock : Mon Apr 25 13:13:39 EDT 2011 [  OK  ]
Starting udev: [  OK  ]
Setting hostname localhost:  [  OK  ]
No devices found
Setting up Logical Volume Management: [  OK  ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda1
/dev/sda1: clean, 55806/262144 files, 383646/524288 blocks
[/sbin/fsck.ext3 (1) -- /opt] fsck.ext3 -a /dev/sda2
/dev/sda2: clean, 11/2359296 files, 118099/4718592 blocks

This is obviously the output from a CentOS 5.5 VM. I've also tried running a
very simple ttylinux 6.0 VM as a test, and it hangs in the same place, when
trying to mount the root file system:

NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 7, 524288 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Bridge firewalling registered
Using IPI Shortcut mode
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 380k freed

                        -=#     ttylinux 6.0    #=-

Mounting proc:                                                  done
Mounting sysfs:                                                 done
Setting console loglevel:                                       done
Setting system clock: hwclock: cannot access RTC: No such file or directory
                                                                failed
Starting fsck for root filesystem.
e2fsck 1.39 (29-May-2006)
/dev/sda1: clean, 427/1280 files, 4215/5120 blocks
Checking root filesystem:                                       done
Remounting root rw:

The instances simply hang here forever... When the instance files are
located on the local disk, they boot up with no problem.
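
Since both guests stall around the point where the root filesystem is checked and remounted read-write, one way to narrow this down might be to test whether a direct (O_DIRECT) write works on the gluster mount at all. This is only a sketch under assumptions: the post does not say which Xen disk backend is in use or whether it opens images with O_DIRECT, `check_odirect` is a hypothetical helper name, and `oflag=direct` requires GNU dd.

```shell
#!/bin/sh
# check_odirect DIR: attempt one 4 KiB O_DIRECT write inside DIR and report.
# Assumption: GNU dd (for oflag=direct). A failure can also mean DIR is not
# writable or dd lacks the flag, so treat the result as a hint only.
check_odirect() {
    f="$1/.odirect-test.$$"
    if dd if=/dev/zero of="$f" bs=4096 count=1 oflag=direct 2>/dev/null; then
        echo "O_DIRECT write succeeded in $1"
        rc=0
    else
        echo "O_DIRECT write failed in $1"
        rc=1
    fi
    rm -f "$f"
    return $rc
}

# Usage (not run here): compare the gluster mount with a local directory
#   check_odirect /pifs
#   check_odirect /var/tmp
```

If the write succeeds on local disk but fails on /pifs, that would point at the mount options rather than at the guest images. If it succeeds on both, the cause lies elsewhere.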

I have mounted /pifs/ with direct-io-mode=disable, as in the fstab entry
above, but am still seeing this behavior.
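
Because the same fstab entry has to be present on all four nodes, a small check like the following could confirm that each node really carries the option. It is a minimal sketch: `mount_opts` and `has_opt` are hypothetical helper names, and it only reads the fstab file, so it says nothing about what the running glusterfs client was actually started with.

```shell
#!/bin/sh
# mount_opts FILE MOUNTPOINT: print the options field (4th column) of the
# fstab-format entry for MOUNTPOINT, skipping comment lines.
mount_opts() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { print $4 }' "$1"
}

# has_opt FILE MOUNTPOINT OPTION: succeed if OPTION appears verbatim in the
# comma-separated options of that entry.
has_opt() {
    mount_opts "$1" "$2" | tr ',' '\n' | grep -qxF -- "$3"
}

# Usage (not run here), on each node:
#   mount_opts /etc/fstab /pifs
#   has_opt /etc/fstab /pifs direct-io-mode=disable && echo "option present"
```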

Does anyone know what the issue with this could be? Thanks!

   --joey