[Gluster-devel] Language Bindings and Data Serialization in Gluster

Fri Feb 15 05:46:13 UTC 2013

I'm working on a Hadoop/Gluster integration.  First, whoever wrote the initial map/reduce Gluster DFS layer did a great job.  Same with the FUSE layer: FUSE alone solves a lot of problems.

However, some of the Hadoop architectures require cross-language bindings between Java and Gluster (C). Others I've talked to about cross-language bindings have shown interest in Python and Ruby bindings for Gluster.  From the research I've done, most of a common language interface is simple and the only tricky part is data serialization.

I'm not an expert in data serialization across programming languages, so it's been a lot of research.  There are a few new and old solutions to the problem, but I think the older non-IDL solutions are obvious and not really that interesting (RPC, ASN.1, raw JNI or other language-specific native access).

Moving up to IDL abstractions, there's Thrift, Protobuf and SWIG.  SWIG is really good for simple multi-language bindings but requires manual data serialization.
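
To make "manual data serialization" concrete, here is a minimal sketch in Python of what hand-rolled packing looks like when no IDL generates it for you. The record layout (size, mode, path) and the function names are hypothetical, made up for illustration; this is not a Gluster wire format.

```python
import struct

# Hypothetical layout, for illustration only (not a Gluster format):
# network byte order, u64 size, u32 mode, u16 path length, then UTF-8 path bytes.
HEADER = struct.Struct("!QIH")


def pack_entry(size, mode, path):
    """Serialize one file entry into bytes using the layout above."""
    raw = path.encode("utf-8")
    return HEADER.pack(size, mode, len(raw)) + raw


def unpack_entry(buf):
    """Parse bytes produced by pack_entry back into (size, mode, path)."""
    size, mode, path_len = HEADER.unpack_from(buf, 0)
    start = HEADER.size
    path = buf[start:start + path_len].decode("utf-8")
    return size, mode, path


if __name__ == "__main__":
    wire = pack_entry(4096, 0o644, "/vol/a.txt")
    print(unpack_entry(wire))
```

Every language on the other end (C, Java, Ruby) would need a matching reader and writer kept in sync by hand, which is the part an IDL like Thrift or Protobuf is meant to take over.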

From my research, the common solutions for multi-language serialization today are Protobuf and Thrift.  Thrift originated at Facebook (now under Apache), and Protobuf is Google's open solution.

I'm leaning towards Thrift since it's under Apache.  A few of the Hadoop architectures already support Thrift, so that would simplify things.  What's your opinion?  Is anyone familiar with I/O serialization across programming languages in Gluster, or has anyone already solved it? I'd love some feedback and discussion.

Links:
SWIG: http://www.swig.org/
Protobuf: http://code.google.com/p/protobuf/
Serialization Discussion: http://stackoverflow.com/questions/4633611/what-are-the-key-differences-between-apache-thrift-google-protocol-buffers-mes
Previous Gluster Interface Pitch: https://fedoraproject.org/wiki/Summer_coding_ideas_for_2012#Implement_a_binding_translator_for_GlusterFS

Humor: http://www.quickmeme.com/meme/3t05wa/

Thanks!
bradley childs