[Bugs] [Bug 1197185] New: Brick/glusterfsd crash randomly once a day on a replicated volume

bugzilla at redhat.com
Fri Feb 27 17:33:52 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1197185

            Bug ID: 1197185
           Summary: Brick/glusterfsd crash randomly once a day on a
                    replicated volume
           Product: GlusterFS
           Version: 3.5.2
         Component: rpc
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: pauyeung at shopzilla.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
glusterfsd crashes randomly, once a day, on a replicated volume with the error:
E [rpcsvc.c:547:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to
complete successfully

Version-Release number of selected component (if applicable):
3.5.2

How reproducible:


Steps to Reproduce:
1. N/A

Actual results:
[2015-02-27 16:55:01.554964] I
[glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[2015-02-27 17:00:02.340402] I
[glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[2015-02-27 17:02:11.845002] W [socket.c:522:__socket_rwv] 0-management: readv
on /var/run/4f4216db5dffe909b3ed8430b737d9d8.socket failed (No data available)
[2015-02-27 17:02:11.845585] I
[glusterd-handler.c:3713:__glusterd_brick_rpc_notify] 0-management:
Disconnected from glusterprod006.bo.shopzilla.sea:/brick02/gfs
[2015-02-27 17:02:11.854461] W [rpcsvc.c:254:rpcsvc_program_actor]
0-rpc-service: RPC program not available (req 1298437 330)
[2015-02-27 17:02:11.854479] E [rpcsvc.c:547:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2015-02-27 17:02:11.865963] W [rpcsvc.c:254:rpcsvc_program_actor]
0-rpc-service: RPC program not available (req 1298437 330)
[2015-02-27 17:02:11.865987] E [rpcsvc.c:547:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2015-02-27 17:02:36.504901] I [glusterd-pmap.c:271:pmap_registry_remove]
0-pmap: removing brick /brick02/gfs on port 49153
[2015-02-27 17:02:36.523359] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap:
adding brick /brick02/gfs on port 49153
[2015-02-27 17:02:40.298697] E
[glusterd-utils.c:4124:glusterd_nodesvc_unlink_socket_file] 0-management:
Failed to remove /var/run/8f75ea8dc7d25bf6095380ad15310042.socket error:
Permission denied
[2015-02-27 17:02:40.300631] I
[glusterd-utils.c:4158:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3
successfully
[2015-02-27 17:02:40.300943] I
[glusterd-utils.c:4163:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1
successfully
[2015-02-27 17:02:40.301172] I
[glusterd-utils.c:4168:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3
successfully
[2015-02-27 17:02:40.301433] I
[glusterd-utils.c:4173:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v4
successfully
[2015-02-27 17:02:40.301791] I
[glusterd-utils.c:4178:glusterd_nfs_pmap_deregister] 0-: De-registered NLM v1
successfully
[2015-02-27 17:02:40.302149] I
[glusterd-utils.c:4183:glusterd_nfs_pmap_deregister] 0-: De-registered ACL v3
successfully
[2015-02-27 17:02:40.304581] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-02-27 17:02:40.305013] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=588 max=0 total=0
[2015-02-27 17:02:40.305030] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=124 max=0 total=0
[2015-02-27 17:02:40.305615] I [socket.c:3561:socket_init] 0-management: SSL
support is NOT enabled
[2015-02-27 17:02:40.305633] I [socket.c:3576:socket_init] 0-management: using
system polling thread
[2015-02-27 17:02:40.306513] I [socket.c:2238:socket_event_handler]
0-transport: disconnecting now
[2015-02-27 17:02:41.315085] E
[glusterd-utils.c:4124:glusterd_nodesvc_unlink_socket_file] 0-management:
Failed to remove /var/run/f422793f928763c541562cd141488c0c.socket error: No
such file or directory
[2015-02-27 17:02:41.317551] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-02-27 17:02:41.317617] I [socket.c:3561:socket_init] 0-management: SSL
support is NOT enabled
[2015-02-27 17:02:41.317626] I [socket.c:3576:socket_init] 0-management: using
system polling thread
[2015-02-27 17:02:41.318974] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=588 max=1 total=4454
[2015-02-27 17:02:41.319032] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=124 max=1 total=4454
[2015-02-27 17:02:41.319117] W [socket.c:522:__socket_rwv] 0-management: readv
on /var/run/f422793f928763c541562cd141488c0c.socket failed (No data available)
[2015-02-27 17:02:42.319567] E
[glusterd-utils.c:4124:glusterd_nodesvc_unlink_socket_file] 0-management:
Failed to remove /var/run/14294a56444cea1dc097c88aef4d8f1c.socket error: No
such file or directory
[2015-02-27 17:02:42.320171] W [socket.c:522:__socket_rwv] 0-management: readv
on /var/run/f422793f928763c541562cd141488c0c.socket failed (No data available)
[2015-02-27 17:02:42.320409] I [socket.c:2238:socket_event_handler]
0-transport: disconnecting now
[2015-02-27 17:02:42.320454] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=588 max=0 total=0
[2015-02-27 17:02:42.320471] I [mem-pool.c:539:mem_pool_destroy] 0-management:
size=124 max=0 total=0
[2015-02-27 17:02:42.362839] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-02-27 17:02:42.362904] I [socket.c:3561:socket_init] 0-management: SSL
support is NOT enabled
[2015-02-27 17:02:42.362915] I [socket.c:3576:socket_init] 0-management: using
system polling thread
[2015-02-27 17:05:03.252884] I
[glusterd-handler.c:1114:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
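
Because the log above is hard-wrapped by the mail client, each record spans two or more lines. The following is a minimal sketch, not part of the original report, of how the excerpt could be rejoined and filtered down to warning (W) and error (E) records for triage. The function names parse_records and errors_and_warnings are hypothetical, and the record layout ("[timestamp] LEVEL ...") is assumed from the lines shown here rather than from any GlusterFS specification.

```python
import re

# Start of a record as it appears in the excerpt: "[timestamp] LEVEL".
# This layout is inferred from the pasted log, not from GlusterFS documentation.
RECORD_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z])\b"
)


def parse_records(text):
    """Rejoin mail-wrapped lines; return (timestamp, level, message) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            # A new record begins; keep whatever follows the level letter.
            records.append(
                [match.group(1), match.group(2), line[match.end():].strip()]
            )
        elif records:
            # No timestamp: treat the line as a wrapped continuation.
            records[-1][2] = (records[-1][2] + " " + line).strip()
    return [tuple(record) for record in records]


def errors_and_warnings(text):
    """Keep only records whose level is E (error) or W (warning)."""
    return [record for record in parse_records(text) if record[1] in ("E", "W")]


# Three records copied from the excerpt above.
sample = """\
[2015-02-27 17:02:11.854461] W [rpcsvc.c:254:rpcsvc_program_actor]
0-rpc-service: RPC program not available (req 1298437 330)
[2015-02-27 17:02:11.854479] E [rpcsvc.c:547:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2015-02-27 17:02:36.504901] I [glusterd-pmap.c:271:pmap_registry_remove]
0-pmap: removing brick /brick02/gfs on port 49153
"""

for timestamp, level, message in errors_and_warnings(sample):
    print(timestamp, level, message)
```

Run against the full excerpt, a filter like this would isolate the socket readv warnings, the "RPC program not available" warnings, the rpc actor errors, and the failed socket-file removals, which all fall between 17:02:11 and 17:02:42.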

Expected results:
The brick process (glusterfsd) should not crash.

Additional info:


