[Gluster-users] Can Hadoop run on gluster in 1 JT, N TT setup or only works for 1 JT+TT?

Mon Jan 30 17:14:59 UTC 2012

Hi,

Can you please post the contents of conf/core-site.xml from both the JT and the TT (or attach the files)?
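
If it is easier than pasting the raw files, the properties can be printed as plain name/value pairs. This is only a minimal sketch: it assumes the usual Hadoop layout of <property> elements with <name> and <value> children under <configuration>, and the helper names props_from_root and dump_conf are hypothetical, not part of Hadoop or the plugin.

```python
import xml.etree.ElementTree as ET


def props_from_root(root):
    """Collect name/value pairs from a parsed Hadoop <configuration> element."""
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def dump_conf(path):
    """Print every property of the file at `path` (path supplied by the caller)."""
    props = props_from_root(ET.parse(path).getroot())
    for name in sorted(props):
        print(f"{name} = {props[name]}")
```

Running dump_conf("conf/core-site.xml") on the JT and on the TT and comparing the two outputs should show quickly whether both nodes carry the same settings.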

We have tested the plugin with 1 hadoop master (JT) and 8 Hadoop Task Trackers (TT), so it should work with your setup too.

Additionally, it would help if you could send us the JobTracker and TaskTracker logs. (If they are huge, paste just the last 50 or so lines.)
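
For trimming a large log down to its tail, something like the following would do. It is a rough sketch only: the function name last_lines is hypothetical, and the log path is whatever your installation uses.

```python
from collections import deque


def last_lines(path, count=50):
    """Return the last `count` lines of a text file, reading it as a stream."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return list(deque(fh, maxlen=count))
```

Printing "".join(last_lines(log_path)) for the JobTracker log and for each TaskTracker log gives a chunk small enough to paste into a mail.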

Thanks,
-Venky
________________________________________
From: gluster-users-bounces at gluster.org [gluster-users-bounces at gluster.org] on behalf of Fermín Galán Márquez [fermin at tid.es]
Sent: Monday, January 30, 2012 10:30 PM
To: gluster-users at gluster.org
Subject: [Gluster-users] Can Hadoop run on gluster in 1 JT, N TT setup or only works for 1 JT+TT?

Hi,

Recently I've set up a Gluster cluster to run Hadoop M/R jobs, following
the document at
http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/Gluster_Hadoop_Compatible_Storage.pdf.

As far as I can tell from my tests, the gluster_hadoop.jar plugin
automatically mounts the gluster volume at the JT node, and then
the TT (in the same node) uses that mountpoint to do its work. That's ok
if JT and TT are running in the same node (i.e. a one-node setup (*)).
However, when I test with a 2-node (*) setup in which the JT runs in one
node and the TT in another, it doesn't work (e.g. hadoop jar
stalls at "INFO mapred.JobClient:  map 0% reduce 0%" with no
progress after that), which in the end makes sense, given that the
gluster volume is not mounted in the TT node (it's only mounted in the
JT node).
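
One way to confirm on each node whether the volume is mounted is to look
for a FUSE GlusterFS entry in /proc/mounts. The following is a minimal
sketch, assuming a Linux node and a FUSE mount (which shows up with
filesystem type fuse.glusterfs); the function names and the example
mountpoint are hypothetical.

```python
import os


def glusterfs_mounts(mounts_text):
    """Return mount points of FUSE GlusterFS entries in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "fuse.glusterfs":
            points.append(fields[1])
    return points


def is_gluster_mounted(mount_point, mounts_path="/proc/mounts"):
    """Check whether `mount_point` is currently a GlusterFS mount on this node."""
    with open(mounts_path, encoding="utf-8") as fh:
        mounted = glusterfs_mounts(fh.read())
    return os.path.normpath(mount_point) in mounted
```

Calling is_gluster_mounted("/mnt/glusterfs") (replace the path with the
configured mountpoint) on the JT node and on the TT node would show
directly which of them actually has the volume.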

This is a bit annoying to me, given that I was expecting the gluster
volume to get mounted in the TT nodes, which are the ones that actually
need to access the data in the filesystem.

So, is it not possible to run a 1 JT, N TT Hadoop cluster with gluster?
Does it only work with 1 JT+TT?

Or maybe I'm doing something wrong, or maybe I'm misunderstanding
the document at
http://download.gluster.com/pub/gluster/glusterfs/qa-releases/3.3-beta-2/Gluster_Hadoop_Compatible_Storage.pdf
(any information about Hadoop running on gluster is very
welcome).

I'm using Hadoop 0.20.2 and Gluster 3.3beta2. If you need to know any
other information about my setup, don't hesitate to ask for it!

Thanks in advance!

Best regards,

------
Fermín

(*) I refer to nodes in the Hadoop cluster, no matter how many nodes
make up the gluster cluster (the latter are "abstracted" by the
mountpoint, as far as I understand).
