[Gluster-devel] wrong volume status report

Atin Mukherjee amukherj at redhat.com
Mon Sep 7 04:24:25 UTC 2015



On 09/07/2015 06:12 AM, Emmanuel Dreyfus wrote:
> I wrote a simple nagios plugin in C that calls gluster volume status to
> check that all bricks are online (is it of any interest to anyone other
> than me? What name would you expect for it? Does check_gfbricks look
> sane?)
> 
> The thing periodically reported offline bricks and I did not understand
> why, until I realized that the peers all run the test at the same time,
> and hence may fail to lock the volume because another peer already holds
> the lock.
> 
> It seems that a failed lock acquisition is reported as offline bricks
> for the peer. The simple workaround is to not check at the same time,
> but perhaps the reported data could be improved?
GlusterD doesn't report the status of bricks if it finds that another
transaction on the same volume is in progress on the cluster. You can
verify this by running the volume status command concurrently from
different peers. I would suggest that you check the plugin and see why
nagios is not handling the negative case here, where the command fails
instead of returning brick status.
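
To illustrate what handling the negative case could look like, here is a
minimal sketch in Python (the plugin discussed above is written in C, so
this only mirrors the logic). It assumes that the gluster CLI exits
non-zero when the volume lock cannot be taken, and that
"gluster volume status <volume> detail" prints one "Online : Y/N" line per
brick; both should be checked against the installed version. The names
classify and main are hypothetical. The exit codes are the standard Nagios
plugin ones.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(returncode, output):
    """Map the result of a volume status call to (nagios_code, message)."""
    if returncode != 0:
        # Assumption: a failed lock acquisition makes the CLI exit non-zero.
        # A failed command says nothing about the bricks, so report UNKNOWN
        # instead of treating the bricks as offline.
        text = output.strip()
        reason = text.splitlines()[0] if text else "no output"
        return UNKNOWN, "UNKNOWN: volume status failed: " + reason

    # Assumption: the "detail" output has one "Online : Y" or "Online : N"
    # line per brick.
    states = []
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Online":
            states.append(value.strip())

    if not states:
        return UNKNOWN, "UNKNOWN: no brick status found in output"

    offline = sum(1 for state in states if state != "Y")
    if offline:
        return CRITICAL, "CRITICAL: %d of %d bricks offline" % (offline, len(states))
    return OK, "OK: %d bricks online" % len(states)


def main(volume):
    """Run the status command for one volume and return the Nagios exit code."""
    try:
        proc = subprocess.run(
            ["gluster", "volume", "status", volume, "detail"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        print("UNKNOWN: cannot run gluster: %s" % exc)
        return UNKNOWN
    code, message = classify(proc.returncode, proc.stdout)
    print(message)
    return code
```

A wrapper script would call sys.exit(main(volume_name)) with the volume to
check. With this split, a lock conflict between peers shows up as UNKNOWN
rather than as offline bricks, and staggering the check times across peers
remains a reasonable way to avoid the conflict in the first place.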

~Atin

