[Gluster-devel] Rebalance mem-leak

Susant Palai spalai at redhat.com
Mon Oct 5 07:35:41 UTC 2015


Hi Mohamed,
  Most of the mem-leaks are fixed now. Backport for 3.7 is available here: http://review.gluster.org/#/c/12296/

Thanks,
Susant

----- Original Message -----
From: "Susant Palai" <spalai at redhat.com>
To: "Mohamed Pakkeer" <Pakkeer.Mohideen at realimage.com>
Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Amit Chaurasia" <achauras at redhat.com>
Sent: Monday, 14 September, 2015 6:14:09 PM
Subject: Re: Rebalance mem-leak

Hi Mohamed,
   Found the issue in a local run of rebalance. Will analyze the problem and update.

Regards,
Susant

----- Original Message -----
From: "Susant Palai" <spalai at redhat.com>
To: "Mohamed Pakkeer" <Pakkeer.Mohideen at realimage.com>
Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Amit Chaurasia" <achauras at redhat.com>
Sent: Monday, 14 September, 2015 4:50:55 PM
Subject: Re: Rebalance mem-leak

Awesome! So the next time you run fix-layout, take a state-dump when you observe mem-usage increasing. I will also try to run rebalance and check for the mem-leak.

Thanks,
Susant

----- Original Message -----
From: "Mohamed Pakkeer" <Pakkeer.Mohideen at realimage.com>
To: "Susant Palai" <spalai at redhat.com>
Cc: "Pranith Kumar Karampuri" <pkarampu at redhat.com>, "Amit Chaurasia" <achauras at redhat.com>
Sent: Monday, 14 September, 2015 4:04:29 PM
Subject: RE: Rebalance mem-leak

Hi Susant,

Thanks for your mail. I am not running fix-layout now, but I am going to extend the cluster from a 72 * (8+2) distributed disperse volume to a 108 * (8+2) distributed disperse volume this coming Friday (18-09-2015). Currently we have 2 PB of video data on the test cluster. I will send the state-dump once we extend the cluster and run the fix-layout command. Meanwhile, I am trying to reproduce the memory-leak issue on a VM cluster. When we run fix-layout, the memory is not released back to the system after the fix-layout process completes; we need to restart or unmount the drives to reclaim the occupied memory. On a system with too little memory, this leads to OOM.

We faced OOM with 16 GB RAM on a system with 36 * 4 TB bricks. When we increased the RAM from 16 GB to 32 GB, fix-layout completed, but it consumed all 32 GB of main memory plus 2 GB of swap. Even after fix-layout completed, the memory was not released back to the system.
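To correlate the growth described above with a statedump, one way is to sample the rebalance daemon's resident memory while fix-layout runs. This is only an illustrative sketch (not from this thread); it assumes a Linux /proc filesystem, and the grep pattern and 60-second interval are my own choices:

```shell
#!/bin/sh
# Hedged sketch: sample the rebalance process's resident memory (VmRSS)
# once per minute until the process exits. Assumes Linux /proc; the
# grep pattern "[r]ebalance" and the interval are illustrative choices.
pid=$(ps aux | grep '[r]ebalance' | awk '{print $2}' | head -n 1)
if [ -n "$pid" ]; then
    while [ -d "/proc/$pid" ]; do
        # VmRSS is the process's resident set size in kB
        echo "$(date +%T) $(grep VmRSS "/proc/$pid/status")"
        sleep 60
    done
else
    echo "No rebalance process found"
fi
```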

Thanks & regards
K.Mohamed Pakkeer



-----Original Message-----
From: Susant Palai [mailto:spalai at redhat.com] 
Sent: Monday, September 14, 2015 12:58 PM
To: Mohamed Pakkeer
Cc: Pranith Kumar Karampuri; Amit Chaurasia
Subject: Rebalance mem-leak

Hi Pakkeer,
    Heard about the mem-leak issue in rebalance. Happy to know that you are stepping up to help out on the issue. :)

So if you are running fix-layout right now, would you be able to send the state-dump here?

Here are the steps to take the state dump.

1. Find your state-dump destination: run "gluster --print-statedumpdir". The state dump will be stored in this location.

2. When you see a rebalance process on any of the servers using high memory, issue the following command:
   "kill -USR1 <pid-of-rebalance-process>".  ---> "ps aux | grep rebalance" should give the rebalance process pid.

The state dump should give some hint about the high mem-usage.
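The two steps above can be sketched as a small script. This is a hedged sketch, not part of the original instructions: it assumes a Linux node with the GlusterFS CLI installed and a rebalance (fix-layout) process running locally, and the "[r]ebalance" grep pattern and the /var/run/gluster fallback are my assumptions:

```shell
#!/bin/sh
# Sketch of the two statedump steps above (assumptions noted in comments).

# Step 1: where will the statedump be written?
# /var/run/gluster is used as a fallback if the query fails (assumption).
dumpdir=$(gluster --print-statedumpdir 2>/dev/null || echo /var/run/gluster)
echo "Statedump destination: $dumpdir"

# Step 2: find the rebalance pid and send SIGUSR1 to trigger a dump.
# The [r] bracket trick stops grep from matching its own command line.
pid=$(ps aux | grep '[r]ebalance' | awk '{print $2}' | head -n 1)
if [ -n "$pid" ]; then
    kill -USR1 "$pid"   # SIGUSR1 requests a statedump, it does not kill
    echo "Requested statedump from rebalance pid $pid; look in $dumpdir"
else
    echo "No rebalance process found on this node"
fi
```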

My IRC nick is "spalai".

Thanks,
Susant

