[Gluster-users] Notes regarding PXE-booting from GlusterFS

Fri Jun 18 19:57:04 UTC 2010

Hi all:

I saw the thread "Netboot / PXE-Boot from glusterfs?" in the online
list archives and decided to subscribe to the mailing-list and share
some notes I have.  Sorry for not being able to thread my reply in.

Anyway, I have recently added experimental support for GlusterFS to
Perceus, a provisioning system that supports both stateful (with
disk) and stateless (diskless) configurations.  More information can be
found at the project website: http://www.perceus.org.

The way Perceus stateless provisioning works is that your client
PXE-boots and retrieves the Perceus kernel/initramfs, then retrieves
the OS image via a specific transport (NFS, HTTP, or GlusterFS in this
case) and sets it up in RAM.  The system then kexecs into the new
kernel, so the OS boots and runs from RAM.

Perceus also has a hybrid approach where the OS can reside partly in
RAM and partly on NFS (or GlusterFS).  This reduces the amount of
memory the OS image consumes.

Some notes regarding the integration with GlusterFS:

- the initramfs needs the glusterfs binary, libglusterfs, libdl,
libpthread, libc, and all the loadable libraries that your client
volume file uses
- right now I just build all the loadable libraries from auth,
scheduler, transport, xlators source directories and copy them to the
initramfs -- this is definitely bloated and I would like to figure out
a good way to cherry pick the ones I need
- since the Perceus kernel is fairly recent (2.6.32), you just need to
enable FUSE in the kernel; the FUSE user-space tools are not necessary
- if you are planning to do any operations on the GlusterFS mount
point which require direct mmap (system operations such as yum
require it), you will need a recent kernel + new FUSE user-space tools
I haven't been able to test the GlusterFS support extensively, but I
know that it builds and runs.
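
As a small sanity check related to the FUSE note above, a node can
confirm that its running kernel has FUSE available by looking for
"fuse" in /proc/filesystems.  This is only a minimal sketch: it assumes
Python is available on the node (it may well not be inside a minimal
initramfs), and the function name is made up for illustration.

```python
import os


def has_fuse(proc_filesystems_text):
    """Return True if 'fuse' is listed in the text of /proc/filesystems."""
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        # Each line is "[nodev] <fstype>"; the filesystem name is the last field.
        if fields and fields[-1] == "fuse":
            return True
    return False


if __name__ == "__main__":
    path = "/proc/filesystems"
    if os.path.exists(path):
        with open(path) as f:
            print("FUSE available:", has_fuse(f.read()))
```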

Traditionally, Perceus retrieves the OS image and "hybridizes" the OS
via an NFS server running on the master server.  If you have a lot of
client nodes, you will probably run into scalability problems, and
that's one of the reasons why I started looking at GlusterFS.

Another approach one could take is to just run NFS over GlusterFS,
which avoids all the work of getting the libraries into the initramfs
and such.  The Gluster folks are working on a native NFS server, so it
will be interesting to see how the performance stacks up.

If you have any questions, please let me know -- hope this is informative.

Cheers,

Bernard