[Gluster-users] False notifications

Wed May 14 05:45:15 UTC 2014

On 5/13/2014 10:43 PM, Sahina Bose wrote:
>
> On 05/14/2014 07:42 AM, Miloš Kozák wrote:
>> Hi,
>> I am running a field trial of Gluster 3.5 on two servers. Each 
>> server uses one 10k HDD with XFS as a brick. On top of these 
>> bricks I have one replica 2 volume:
>>
>> [root@nodef01i ~]# gluster volume info ph-fs-0
>>
>> Volume Name: ph-fs-0
>> Type: Replicate
>> Volume ID: 5085e018-7c47-4d4f-8dcb-cd89ec240393
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.11.100.1:/gfs/s3-sata-10k/brick
>> Brick2: 10.11.100.2:/gfs/s3-sata-10k/brick
>> Options Reconfigured:
>> performance.io-thread-count: 12
>> network.ping-timeout: 2
>> performance.cache-max-file-size: 0
>> performance.flush-behind: on
>>
>> Additionally, I am running Nagios to monitor everything, using the plugin at 
>> http://exchange.nagios.org/directory/Plugins/System-Metrics/File-System/GlusterFS-checks/details. 
>> I extended it slightly so that it also monitors the number of split-brain 
>> files, and all of this information goes into the performance data, so 
>> I can plot graphs from it (the graphs are in the attachment).
>>
>> My problem is that I receive quite a lot of false warnings from 
>> Nagios during the day because some files are reported as unsynced (gluster 
>> volume heal XXX info). I don't know whether it is a bug or caused by 
>> my configuration. Either way it is quite disturbing, and I am afraid 
>> that after receiving a lot of false warnings I could overlook an 
>> important one.
>
>
> I think the issue is that "gluster volume heal info" also 
> reports files undergoing I/O in addition to files that need self-heal. 
> See 
> http://supercolony.gluster.org/pipermail/gluster-users/2014-May/040239.html 
> for more information on this. Pranith, please correct me if I am wrong.
>

That's what I've seen as well.
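
One way to cut down the noise, as a rough sketch only: have the check 
sample the heal info output more than once and warn only about entries 
that show up in every sample, since files that are merely undergoing I/O 
should drop out again between samples. The function names below are 
hypothetical, and the parser assumes the output consists of "Brick ..." 
header lines, one entry per line, and "Number of entries: N" summary 
lines; check that against what your version actually prints.

```python
def parse_heal_entries(output):
    """Collect the entries (paths or gfid strings) listed in heal-info text.

    Assumed layout: 'Brick <host>:<path>' headers, one entry per line,
    then a 'Number of entries: N' summary for each brick.
    """
    entries = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, brick headers and per-brick summary lines.
        if not line:
            continue
        if line.startswith("Brick ") or line.startswith("Number of entries"):
            continue
        entries.add(line)
    return entries


def persistent_entries(samples):
    """Return the entries present in every sample (not just transient I/O)."""
    sets = [parse_heal_entries(s) for s in samples]
    if not sets:
        return set()
    return set.intersection(*sets)


def nagios_status(samples, warn=1):
    """Map the persistent entries to a Nagios exit code and status line."""
    stuck = persistent_entries(samples)
    count = len(stuck)
    # Nagios convention: exit code 0 is OK, 1 is WARNING.
    code = 1 if count >= warn else 0
    label = "WARNING" if code else "OK"
    # Text after the '|' is performance data, so it can still be graphed.
    return code, "%s: %d entries pending heal | pending=%d" % (label, count, count)
```

The samples would be the text of two or more runs of the heal info 
command taken some seconds apart. How far apart they need to be depends 
on the workload, so treat the interval as something to tune.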

> On another note, we are also developing Nagios plugins that can be 
> used to monitor the various entities and services in the gluster 
> cluster. The repositories are here -
>
> gluster-nagios-addons - 
> http://review.gluster.org/#/admin/projects/gluster-nagios-addons
> nagios-server-addons - 
> http://review.gluster.org/#/admin/projects/nagios-server-addons
>
> We will be putting together a short doc on these soon. Meanwhile, 
> please feel free to check them out and give us your valuable feedback.
>
>
>
>>
>> network.ping-timeout is set to 2 because I cannot allow VM servers 
>> to hang for 2x42sec when the other node is rebooted (we have some kind of 
>> reboot policy).
>>
>> Thanks for help,
>> Milos
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
