[Gluster-users] Frequent glusterd restarts needed to avoid NFS performance degradation
d.a.bretherton at reading.ac.uk
Wed Apr 18 12:58:35 UTC 2012
On 04/18/2012 01:48 PM, gluster-users-request at gluster.org wrote:
> Date: Tue, 17 Apr 2012 19:06:31 -0500 (CDT)
> From: Gerald Brandt<gbr at majentis.com>
> Subject: Re: [Gluster-users] Frequent glusterd restarts needed to
> avoid NFS performance degradation
> To: Dan Bretherton<d.a.bretherton at reading.ac.uk>
> Cc: gluster-users<gluster-users at gluster.org>
> Message-ID:<22749685.104.1334707572319.JavaMail.gbr at thinkpad>
> Content-Type: text/plain; charset=utf-8
> ----- Original Message -----
>> > Dear All-
>> > I find that I have to restart glusterd every few days on my servers
>> > to stop NFS performance from becoming unbearably slow. When the
>> > problem occurs, volumes can take several minutes to mount and there
>> > are long delays responding to "ls". Mounting from a different server,
>> > i.e. one not normally used for NFS export, results in normal NFS
>> > access speeds. This doesn't seem to have anything to do with load
>> > because it happens whether or not there is anything running on the
>> > compute servers. Even when the system is mostly idle there are often
>> > a lot of glusterfsd processes running, and on several of the servers
>> > I looked at this evening there is a process called glusterfs using
>> > 100% of one CPU. I can't find anything unusual in nfs.log or
>> > etc-glusterfs-glusterd.vol.log on the servers affected. Restarting
>> > glusterd seems to stop this strange behaviour and make NFS access
>> > run smoothly again, but this usually only lasts for a day or two.
>> > This behaviour is not necessarily related to the length of time
>> > since glusterd was started, but has more to do with the amount of
>> > work the GlusterFS processes on each server have to do. I use a
>> > different server to export each of my 8 different volumes, and the
>> > NFS performance degradation seems to affect the most heavily used
>> > volumes more than the others. I really need to find a solution to
>> > this problem; all I can think of doing is setting up a cron job on
>> > each server to restart glusterd every day, but I am worried about
>> > what side effects that might have. I am using GlusterFS version
>> > 3.2.5. All suggestions would be much appreciated.
>> > Regards,
>> > Dan.
> I run GlusterFS 3.2.5 and access is only via NFS. I'm running Citrix
> XenServer with about 23 VMs off of it. I haven't seen any degradation
> at all.
> One thing I don't have is replication or anything else set up. The
> server is ready to replicate, but I'm waiting for 3.3.
Thanks for your comments. I should have mentioned that I do use
replication in my cluster, but I'm not sure that the replication is
causing the problem. Another thing to mention about my system is that
there is a lot of data transfer going on most of the time, including
models and data processing applications running on the compute cluster
and data transfers from other sites. I wouldn't be surprised if the
GlusterFS NFS server handles several terabytes of data before it starts
to grind to a halt. Perhaps this problem hasn't been noticed before
because my usage isn't typical. However, it should be fairly easy to
reproduce if it's just a matter of transferring a large volume of data.
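For reference, a quick way to spot the glusterfs process that is pegging
one CPU, as described in the original message, is something like the
following (a sketch assuming GNU ps on Linux; the exact column choice is
illustrative):

```shell
# List gluster-related processes sorted by CPU usage, highest first.
# Columns: PID, %CPU, elapsed running time, full command line.
# The [g] trick stops grep from matching its own process entry.
ps -eo pid,pcpu,etime,args --sort=-pcpu | grep '[g]luster' | head
```

Comparing the elapsed time of the busy glusterfs process against the
time since the slowdown began can help confirm whether it is the
built-in NFS server process that is misbehaving.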
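The daily cron job mentioned above could be sketched as a script dropped
into /etc/cron.daily/ (untested; the init-script path and name are
distribution-specific assumptions, and since restarting glusterd also
respawns the built-in NFS server process, it is worth trying this on a
non-production volume first):

```shell
#!/bin/sh
# Hypothetical workaround: restart glusterd once a day to clear the
# degraded NFS state. Install as /etc/cron.daily/glusterd-restart
# and mark executable; adjust the init-script path for your distro.
/etc/init.d/glusterd restart >> /var/log/glusterd-restart.log 2>&1
```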