Gluster and Cloudera's Hadoop

James Gurtowski gurtowskij at gmail.com
Mon Apr 8 18:17:44 UTC 2013


It seems the Gluster Hadoop plugin assumes all Hadoop daemons/commands are
run as root? I was having trouble getting the jobtracker to start because
every time the fs is initialized, a system call "mount -t glusterfs ..." is
issued. Cloudera runs all daemons as the mapred user, which is not allowed
to run mount, so the call fails. I modified GlusterFileSystem.java (see
attached diff) and set fs.glusterfs.automount to false in core-site.xml so
the mount is no longer attempted. That fixed the initial issue of getting
the daemons to start.
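
The guard described above can be sketched roughly as follows. This is not
the attached diff or the plugin's real code: the class and method names are
made up for illustration, java.util.Properties stands in for Hadoop's
Configuration so the snippet runs on its own, and the default of "true" is
an assumption based on the plugin mounting whenever the option is unset.
Only the property name fs.glusterfs.automount comes from the post.

```java
import java.util.Properties;

// Hypothetical sketch; not the actual GlusterFileSystem code.
class AutomountGuard {
    // Property name taken from the post.
    static final String AUTOMOUNT_KEY = "fs.glusterfs.automount";

    // Assumption: automount stays on unless explicitly set to "false".
    static boolean shouldMount(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(AUTOMOUNT_KEY, "true"));
    }

    // Runs the supplied mount action only when automount is enabled.
    // Returns true if the action was run, false if it was skipped.
    static boolean mountIfEnabled(Properties conf, Runnable mountAction) {
        if (!shouldMount(conf)) {
            return false; // non-root daemons (e.g. mapred) never call mount
        }
        mountAction.run();
        return true;
    }
}
```

With the option set to false, the volume presumably has to be mounted ahead
of time by root (for example at boot), since the daemons no longer do it.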

My next issue is getting hadoop jobs to run. I get an error:

File /mnt/glusterfs/user/james/.staging/job_201304081221_0013/job.xml does
not exist.

I believe this is a permissions issue: I can access this file fine from
my account, but the .staging directory is only accessible by the user who
launches the job:

drwx------ 8 james james 870 Apr  8 14:10 .staging
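
To illustrate why a daemon running as a different user (mapred, in
Cloudera's case) would be locked out by that mode, here is a small sketch
using the standard java.nio POSIX permission helpers. The class and method
names are hypothetical, and it assumes the accessing user is neither the
directory's owner nor a member of its group.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper; not part of Hadoop or the Gluster plugin.
class StagingAccess {
    // Takes the nine-character mode from "ls -l" without the leading type
    // character, e.g. "rwx------", and reports whether a user outside the
    // owner and group can enter the directory (the "x" bit for others).
    static boolean othersCanTraverse(String mode) {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(mode);
        return perms.contains(PosixFilePermission.OTHERS_EXECUTE);
    }
}
```

For the listing above, othersCanTraverse("rwx------") is false, so any file
under .staging would be unreachable for such a user even if the file itself
were world-readable.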

If I change the permissions, Cloudera's Hadoop changes them back when I
launch a job:

Permissions on staging directory
glusterfs://node001:9000/user/james/.staging are incorrect: rwxrwxrwx.
Fixing permissions to correct value rwx------

Any ideas for a workaround would be greatly appreciated.

