[Gluster-users] Extremely high load after 100% full bricks

Thu Oct 25 15:34:33 UTC 2012

Dear All-
I'm not sure this excessive server load has anything to do with the 
bricks having been full.  I noticed the full bricks while I was 
investigating the excessive load, and assumed the two were related.  
However, despite there now being plenty of room on all the bricks, the load on 
this particular pair of servers has been consistently between 60 and 80 
all week, and this is causing serious problems for users, who are getting 
repeated I/O errors.  The servers are responding so slowly that 
GlusterFS isn't working properly, and CLI commands like "gluster volume 
stop" just time out when issued on any server.  Restarting glusterd on 
all servers has no effect.

Is there any way to limit the load imposed by GlusterFS on a server?  I 
desperately need to reduce it to a level where GlusterFS can work 
properly and talk to the other servers without timing out.
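
To get a rough picture of which clients and bricks the warnings and 
errors refer to, the log excerpts quoted below could be tallied with 
something like the following Python sketch.  The regular expressions 
are assumptions based only on the log lines quoted in this thread, not 
on any official description of the GlusterFS log format, and the names 
tally, LOG_LINE, TRANSLATOR and SAMPLE are made up for illustration.  
It expects one log entry per line, as in the log files themselves 
(the excerpts below are wrapped by the mail client).

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the excerpts quoted in this thread:
# [timestamp] LEVEL [file.c:line:function] <optional backtrace> N-name: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<lineno>\d+):(?P<function>[^\]]+)\] "
    r"(?P<rest>.*)$"
)

# The "0-atmos-client-10: " style prefix that precedes the message text.
TRANSLATOR = re.compile(r"(?:^|\s)(?P<name>\d+-[\w-]*): ")


def tally(lines):
    """Count W and E entries per (level, function, translator name)."""
    counts = Counter()
    for line in lines:
        entry = LOG_LINE.match(line.strip())
        if entry is None or entry.group("level") not in ("W", "E"):
            continue  # skip informational entries and unparseable lines
        prefix = TRANSLATOR.search(entry.group("rest"))
        name = prefix.group("name") if prefix else "?"
        counts[(entry.group("level"), entry.group("function"), name)] += 1
    return counts


# Three entries taken from the excerpts in this thread, unwrapped.
SAMPLE = [
    "[2012-10-22 00:35:52.070364] W "
    "[socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10: "
    "reading from socket failed. Error (Transport endpoint is not "
    "connected), peer (192.171.166.92:24052)",
    "[2012-10-22 01:32:52.827044] W "
    "[client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15: "
    "(00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD",
    "[2012-10-22 00:40:56.246117] I "
    "[server-helpers.c:629:server_connection_destroy] 0-atmos-server: "
    "destroyed connection of bdan10.nerc-essc.ac.uk-13609-2012/10/21-"
    "23:04:53:323865-atmos-client-15-0",
]

for key, count in tally(SAMPLE).most_common():
    print(count, *key)
```

For a real log, tally() can be given an open file object instead of 
SAMPLE, since it only iterates over lines.  Entries that do not match 
the assumed layout are silently skipped, so the counts are a lower 
bound rather than an exact figure.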

-Dan.

On 10/22/2012 02:03 PM, Dan Bretherton wrote:
> Dear All-
> A replicated pair of servers in my GlusterFS 3.3.0 cluster has been 
> experiencing extremely high load for the past few days after a 
> replicated brick pair became 100% full.  The GlusterFS related load on 
> one of the servers was fluctuating at around 60, and this high load 
> would swap to the other server periodically.  When I noticed the full 
> bricks I quickly extended the volume by creating new bricks on another 
> server, and manually moved some data off the full bricks to create 
> space for write operations.  The fix-layout operation seemed to start 
> normally but the load then increased even further.  The server with 
> the high load (then up to about 80) became very slow to respond and I 
> noticed a lot of errors in the VOLNAME-rebalance.log files like the 
> following.
>
> [2012-10-22 00:35:52.070364] W 
> [socket.c:1512:__socket_proto_state_machine] 0-atmos-client-10: 
> reading from socket failed. Error (Transport endpoint is not 
> connected), peer (192.171.166.92:24052)
> [2012-10-22 00:35:52.070446] E [rpc-clnt.c:373:saved_frames_unwind] 
> (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xe7) [0x2b3fd905c547] 
> (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb2) 
> [0x2b3fd905bf42] 
> (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) 
> [0x2b3fd905bbfe]))) 0-atmos-client-10: forced unwinding frame 
> type(GlusterFS 3.1) op(INODELK(29)) called at 2012-10-22 
> 00:35:45.454529 (xid=0x285951x)
>
> There have also been occasional errors like the following, referring 
> to the pair of bricks that became 100% full.
>
> [2012-10-22 01:32:52.827044] W 
> [client3_1-fops.c:5517:client3_1_readdir] 0-atmos-client-15:  
> (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
> [2012-10-22 09:49:21.103066] W 
> [client3_1-fops.c:5628:client3_1_readdirp] 0-atmos-client-14:  
> (00000000-0000-0000-0000-000000000000) remote_fd is -1. EBADFD
>
> The log files from the bricks that were 100% full contain a lot of these 
> errors, from the period after I freed up some space on them.
>
> [2012-10-22 00:40:56.246075] E [server.c:176:server_submit_reply] 
> (-->/usr/lib64/libglusterfs.so.0(default_inodelk_cbk+0xa4) 
> [0x361da23e84] 
> (-->/usr/lib64/glusterfs/3.3.0/xlator/debug/io-stats.so(io_stats_inodelk_cbk+0xd8) 
> [0x2aaaabd74d48] 
> (-->/usr/lib64/glusterfs/3.3.0/xlator/protocol/server.so(server_inodelk_cbk+0x10b) 
> [0x2aaaabf9742b]))) 0-: Reply submission failed
> [2012-10-22 00:40:56.246117] I 
> [server-helpers.c:629:server_connection_destroy] 0-atmos-server: 
> destroyed connection of 
> bdan10.nerc-essc.ac.uk-13609-2012/10/21-23:04:53:323865-atmos-client-15-0
>
> All these errors have only occurred on the replicated pair of servers 
> that had suffered from 100% full bricks.  I don't know if the errors 
> are being caused by the high load (resulting in poor communication 
> with other peers for example) or if the high load is the result of 
> replication and/or distribution errors.  I have tried various things 
> to bring the load down, including un-mounting the volume and stopping 
> the fix-layout operation, but the only thing that works is stopping 
> the volume. Obviously I can't do that for long because people need to 
> use the data, but with the load as high as it is data access is very 
> slow and users are experiencing a lot of temporary I/O errors.   
> Bricks from several volumes are on those servers so everybody in the 
> department is being affected by this problem.  I thought at first that 
> the load was being caused by self-heal operations fixing errors caused 
> by write failures that occurred when the bricks were full, but it is 
> glusterfs threads that are causing the high load, not glustershd.
>
> Can anyone suggest a way to bring the load down so people can access 
> the data properly again?  Also, can I trust GlusterFS to eventually 
> self-heal the errors causing the above error messages?
>
> Regards,
> -Dan.