[Gluster-devel] GlusterD 2.0 status updates

Aravinda avishwan at redhat.com
Tue Sep 8 06:18:15 UTC 2015


On 09/07/2015 10:38 PM, Vijay Bellur wrote:
> On Monday 07 September 2015 09:38 PM, Jeff Darcy wrote:
>>> Agree here. I think we can use Go as the other non-C language for
>>> components like glusterd, geo-replication etc. and retire the python
>>> implementation of gsyncd that exists in 3.x over time.
>>
>> That sounds like a pretty major effort.  Are you suggesting that it
>> (or even part of it) should be part of 4.0?
>>
>
> Not for the first cut of 4.0 but we can take this up as a goal for a 
> subsequent 4.x release. We do not have a lot of non C code. syncdaemon 
> and glusterfind are probably the only non C pieces that exist today in 
> the codebase (excluding tests, glupy). The python implementation of 
> gsyncd is not one of my favorite pieces of code in the repository. I 
> find it obscure and am slightly concerned about it from a 
> maintainability perspective. A refactor might make it easier to manage 
> over time.

The Geo-replication code base is now a lot better than in previous
releases. It is unfortunate that today the Geo-rep codebase is an example
of how not to write Python; hopefully this will change in the future.

I don't think Geo-replication's performance problems are due to the
language. It was written without any of the advanced features or external
libraries just to support Python 2.4. We have not spent enough
resources/time on improving Geo-replication; the existing codebase was
hacked (during Gluster 3.5, I guess) to make Geo-replication distributed
and to consume the changelog. I think modularizing the codebase will help
improve maintainability.

We faced issues with threading because of the GIL, but we can use
multiprocessing, queues, and other techniques to avoid threading. If we
migrate to Python 3 (not soon), the asyncio library is available in the
standard library for writing better concurrent applications.
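
To illustrate the multiprocessing/queue approach, here is a minimal
sketch. It is not taken from gsyncd: sync_worker and run_workers are
hypothetical names, the "work" is a placeholder that only counts entries,
and the "fork" start method is assumed (Unix only). Each worker is a
separate process with its own interpreter, so the GIL of one does not
block the others.

```python
import multiprocessing as mp


def sync_worker(tasks, results):
    """Consume (index, batch) items until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        index, batch = item
        # Placeholder for real work, e.g. handing the batch to rsync.
        results.put((index, len(batch)))


def run_workers(batches, num_workers=2):
    """Distribute batches across worker processes and collect results."""
    # Assumption: "fork" start method, available on Linux and other Unixes.
    ctx = mp.get_context("fork")
    tasks = ctx.Queue()
    results = ctx.Queue()

    workers = [
        ctx.Process(target=sync_worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()

    for index, batch in enumerate(batches):
        tasks.put((index, batch))
    for _ in workers:
        tasks.put(None)  # one sentinel per worker

    # Drain the result queue before joining so no worker blocks on put().
    done = [results.get() for _ in batches]
    for worker in workers:
        worker.join()
    return sorted(done)
```

Calling run_workers([["a", "b"], ["c"]]) returns one (index, count) pair
per batch, sorted by index, regardless of which worker handled it.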

I think Python is still a good choice if we rewrite Geo-replication
today for the following reasons.

1. Good, stable library support (zerorpc/zeromq, paramiko, pyxattr, etc.)
2. Performance was/is not really a concern since it uses external
    tools like Rsync for sync. Changelog parsing is written in
    C (libgfchangelog)
3. We can expect good community contributions
4. Easy to learn and easy to find resources
5. There is good stuff in the existing codebase which can be
    reused.
6. It is very easy to integrate with libraries written in C. If
    performance is required for some portion, we can write it in C and
    use it from Python, for example the xsync crawl.
7. For performance, we can use Cython (http://cython.org/) to generate
    performant Python extensions.
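
As a rough illustration of point 6, the standard ctypes module can call
into a C library without writing any extension code. This sketch uses
strlen from the system C library purely as a stand-in; it is not the
xsync crawl or libgfchangelog, and c_strlen is a hypothetical wrapper
name. It assumes a POSIX system.

```python
import ctypes
import ctypes.util

# Load the system C library. On POSIX, CDLL(None) also works if
# find_library cannot locate it, since libc is already loaded.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string using C strlen."""
    return libc.strlen(data)
```

The same pattern (load the shared object, declare argtypes/restype, wrap
it in a small Python function) applies to any C library we write for a
hot path.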

I am sure Go will also have the same advantages; we may have to
evaluate once again before we really start rewriting Geo-rep.


>
>>> python will still be present as part of our distaf test infrastructure.
>>> We can possibly port our bash test scripts to python and retire the
>>> usage of BASH with Gluster.next releases.
>>
>> Getting rid of bash scripts sounds great, but we're talking about 420
>> scripts - many of them quite opaque and thus tedious to port to
>> another language/framework.  I suspect that some of these tests will
>> linger in their current form for quite some time to come.
>>
>
> Yes, agree with you. We can look at having new tests in python and do 
> a lazy conversion of existing test units to python.
>
> -Vijay
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel


regards
Aravinda
http://aravindavk.in



