[Gluster-users] (Fixed) Re: Can Hadoop run on gluster in 1 JT, N TT setup or only works for 1 JT+TT?

Venky Shankar vshankar at redhat.com
Wed Feb 1 07:15:00 UTC 2012


> However, after reading your mail, I wonder whether the Hadoop plug-in for 
> gluster implements some location-based job scheduling similar to the 
> one in Hadoop on HDFS. In Hadoop on HDFS the JT coordinates 
> with the NN (which knows where every file block is located within the 
> cluster), so each map task is scheduled on the TT closest to the input 
> it has to process (ideally, collocated). In Hadoop on gluster I 
> understand that there is no NN equivalent, but is there any means by 
> which the JT can know which nodes in the cluster hold the actual data in 
> their respective backend filesystems, so that it tries to schedule each 
> map task on a TT on one of these nodes? If not, how does the JT select 
> the TT for each map task (round-robin, randomly, etc.)?
> Probably my question is very basic, but I haven't found a clear and 
> direct answer in the documentation, sorry...

The JT knows which part of the file is where by calling an API that the 
GlusterFS plug-in implements.

If you look at the plug-in source, it extends the *FileSystem* class. So, 
the JT invokes an API that we implement (*getFileBlockLocations()*), 
and we give the required info (file, offset, length) back to the JT. 
This helps it decide which task to schedule on which TT node. This API 
queries GlusterFS for the pathinfo extended attribute 
(trusted.glusterfs.pathinfo) to get the required info.
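
To make the idea concrete, here is a rough sketch of how block-location info can drive locality-aware placement. It is not the plug-in's code and not the JT's real scheduler (which also considers things like rack awareness and free slots). The names LocalitySketch, Block, blockAt and pickTracker are hypothetical, and Block is only a simplified stand-in for what getFileBlockLocations() reports: a byte range of the file plus the hosts that store it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names come from Hadoop or GlusterFS.
class LocalitySketch {

    // Simplified stand-in for a block location: a byte range and the hosts holding it.
    static class Block {
        final long offset;
        final long length;
        final List<String> hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = Arrays.asList(hosts);
        }
    }

    // Return the block that contains byte position pos, or null if none does.
    static Block blockAt(List<Block> blocks, long pos) {
        for (Block b : blocks) {
            if (pos >= b.offset && pos < b.offset + b.length) {
                return b;
            }
        }
        return null;
    }

    // Prefer a tracker running on a host that stores the block;
    // otherwise fall back to the first available tracker (null if there are none).
    static String pickTracker(Block block, List<String> trackers) {
        for (String tracker : trackers) {
            if (block.hosts.contains(tracker)) {
                return tracker;
            }
        }
        return trackers.isEmpty() ? null : trackers.get(0);
    }

    public static void main(String[] args) {
        // Assumed layout: first 128 bytes on node1, next 128 bytes on node2 and node3.
        List<Block> blocks = Arrays.asList(
                new Block(0, 128, "node1"),
                new Block(128, 128, "node2", "node3"));
        List<String> trackers = Arrays.asList("node4", "node3");

        // Byte 200 lives in the second block, which node3 stores locally.
        System.out.println(pickTracker(blockAt(blocks, 200), trackers)); // node3

        // No tracker runs on node1, so the first block falls back to node4.
        System.out.println(pickTracker(blocks.get(0), trackers)); // node4
    }
}
```

In the real setup the host list would come from the pathinfo attribute via the plug-in, not from a hard-coded table as it does here.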



> Thanks!
> Best regards,
> ------
> Fermín
