[Bugs] [Bug 1336376] New: Sequential volume start&stop is failing with SSL enabled setup.

bugzilla at redhat.com bugzilla at redhat.com
Mon May 16 10:07:52 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1336376

            Bug ID: 1336376
           Summary: Sequential volume start&stop  is failing  with SSL
                    enabled setup.
           Product: GlusterFS
           Version: 3.8.0
         Component: rpc
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: kaushal at redhat.com
                CC: bsrirama at redhat.com, bugs at gluster.org,
                    kaushal at redhat.com
        Depends On: 1336371
            Blocks: 1317278 (glusterfs-3.8.0)



+++ This bug was initially created as a clone of Bug #1336371 +++

+++ This bug was initially created as a clone of Bug #1336369 +++

+++ This bug was initially created as a clone of Bug #1336339 +++

Description of problem:
=======================
I ran sequential volume start and stop 400 times; after some time the operations
began failing with these CLI error messages:
"""
volume start: rep: success
volume stop: rep: success
volume start: rep: success
volume stop: rep: failed: Commit failed on localhost. Please check the log file
for more details.
volume start: rep: failed: Pre Validation failed on
dhcp43-215.lab.eng.blr.redhat.com. Volume rep already started
volume stop: rep: failed: Volume rep is not in the started state
volume start: rep: failed: Pre-validation failed on localhost. Please check log
file for details
volume stop: rep: failed: Volume rep is not in the started state
volume start: rep: failed: Pre-validation failed on localhost. Please check lo
"""

glusterd logs are posted in the comment.

Version-Release number of selected component (if applicable):
=============================================================
glusterfs-3.7.9-4.


How reproducible:
=================
Always


Steps to Reproduce:
===================
1. Create a 16 x 2 = 32 volume using two nodes.
2. FUSE-mount and unmount the volume 300 times.
3. Run volume start/stop 400 times (it fails after some number of start/stop
operations).
4. Check the glusterd logs for error messages and warnings.

Actual results:
===============
Sequential volume start and stop fails on an SSL-enabled setup.


Expected results:
=================
Sequential volume start and stop should work on an SSL-enabled setup.



Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-05-16
13:58:53 IST ---

This bug is automatically being proposed for the current z-stream release of
Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change
the proposed release flag.

--- Additional comment from Byreddy on 2016-05-16 13:59:59 IST ---

[2016-05-16 06:52:45.028639] W [socket.c:2719:socket_server_event_handler]
0-socket.management: accept on 10 failed (Too many open files)
[2016-05-16 06:52:45.028679] I [socket.c:459:ssl_setup_connection]
0-socket.management: peer CN = dhcp43-215.lab.eng.blr.redhat.com
[... the warning "accept on 10 failed (Too many open files)" repeats dozens of
times between 06:52:45.028639 and 06:52:45.030909 ...]
[2016-05-16 06:52:45.031075] E [socket.c:2868:socket_server_event_handler]
0-socket.management: could not create pipe
[2016-05-16 06:52:45.041646] I [socket.c:459:ssl_setup_connection]
0-socket.management: peer CN = dhcp43-215.lab.eng.blr.redhat.com
[2016-05-16 06:52:45.043605] E [socket.c:2616:socket_poller]
0-socket.management: error in polling loop
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 2 times between
[2016-05-16 06:52:44.061901] and [2016-05-16 06:52:47.161342]
[2016-05-16 06:52:47.163334] E [MSGID: 106280]
[glusterd-volume-ops.c:1653:glusterd_op_stage_stop_volume] 0-management: Volume
rep is not in the started state
[2016-05-16 06:52:47.163376] E [MSGID: 106301]
[glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation
'Volume Stop' failed on localhost : Volume rep is not in the started state
[2016-05-16 06:52:47.163551] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 2 times between
[2016-05-16 06:52:47.163551] and [2016-05-16 06:52:47.252524]
[2016-05-16 06:52:47.254692] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 06:52:47.254771] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 06:52:47.254784] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 06:52:47.254793] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 06:52:47.254800] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 06:52:47.254809] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
[2016-05-16 06:52:47.255009] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 2 times between
[2016-05-16 06:52:47.255009] and [2016-05-16 06:52:50.350609]
[2016-05-16 06:52:50.352683] E [MSGID: 106280]
[glusterd-volume-ops.c:1653:glusterd_op_stage_stop_volume] 0-management: Volume
rep is not in the started state
[2016-05-16 06:52:50.352756] E [MSGID: 106301]
[glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation
'Volume Stop' failed on localhost : Volume rep is not in the started state
[2016-05-16 06:52:50.352909] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 2 times between
[2016-05-16 06:52:50.352909] and [2016-05-16 06:52:50.448995]
[2016-05-16 06:52:50.451250] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 06:52:50.451338] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 06:52:50.451353] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 06:52:50.451363] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 06:52:50.451371] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 06:52:50.451381] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
[2016-05-16 06:52:50.451505] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 2 times between
[2016-05-16 06:52:50.451505] and [2016-05-16 06:52:53.542390]
[2016-05-16 06:52:53.544425] E [MSGID: 106280]
[glusterd-volume-ops.c:1653:glusterd_op_stage_stop_volume] 0-management: Volume
rep is not in the started state
[2016-05-16 06:52:53.544467] E [MSGID: 106301]
[glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation
'Volume Stop' failed on localhost : Volume rep is not in the star:

[2016-05-16 06:53:00.020201] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 06:53:00.020216] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 06:53:00.020225] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 06:53:00.020233] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 06:53:00.020243] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
[2016-05-16 06:53:00.020359] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
[2016-05-16 06:56:30.764560] I [MSGID: 106487]
[glusterd-handler.c:1472:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[2016-05-16 07:27:02.476621] E [MSGID: 101012] [logging.c:376:gf_log_rotate]
0-logrotate: failed to open logfile [Too many open files]
[2016-05-16 07:27:02.476615] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/distribute.so: cannot open shared
object file: Too many open files
[2016-05-16 07:27:02.476753] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/replicate.so: cannot open shared
object file: Too many open files
[2016-05-16 07:27:02.476902] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/disperse.so: cannot open shared
object file: Too many open files
[2016-05-16 07:27:02.476983] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/stripe.so: cannot open shared object
file: Too many open files
[2016-05-16 07:27:02.477002] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/stripe.so: cannot open shared object
file: Too many open files
[2016-05-16 07:27:02.477010] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so: cannot open shared object
file: Too many open files
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/debug/io-stats.so: cannot open shared object
file: Too many open files" repeated 13 times between [2016-05-16
07:27:02.477010] and [2016-05-16 07:27:02.477115]
[2016-05-16 07:27:02.477124] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-cache.so: cannot open shared
object file: Too many open files
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-cache.so: cannot open shared
object file: Too many open files" repeated 4 times between [2016-05-16
07:27:02.477124] and [2016-05-16 07:27:02.477170]
[2016-05-16 07:27:02.477177] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-threads.so: cannot open shared
object file: Too many open files
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/io-threads.so: cannot open shared
object file: Too many open files" repeated 6 times between [2016-05-16
07:27:02.477177] and [2016-05-16 07:27:02.477235]
[2016-05-16 07:27:02.477242] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/write-behind.so: cannot open
shared object file: Too many open files
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/write-behind.so: cannot open
shared object file: Too many open files" repeated 7 times between [2016-05-16
07:27:02.477242] and [2016-05-16 07:27:02.477312]
[2016-05-16 07:27:02.477319] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/open-behind.so: cannot open
shared object file: Too many open files
[2016-05-16 07:27:02.477343] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/open-behind.so: cannot open
shared object file: Too many open files
[2016-05-16 07:27:02.477350] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/read-ahead.so: cannot open shared
object file: Too many open files
[2016-05-16 07:27:02.477371] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/performance/md-cache.so: cannot open shared
object file: Too many open files
[2016-05-16 07:27:02.477390] W [MSGID: 101095]
[xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/encryption/crypt.so: cannot open shared
object file: Too many open files
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/encryption/crypt.so: cannot open shared
object file: Too many open files" repeated 2 times between [2016-05-16
07:27:02.477390] and [2016-05-16 07:27:02.477412]
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/distribute.so: cannot open shared
object file: Too many open files" repeated 9 times between [2016-05-16
07:27:02.476615] and [2016-05-16 07:27:02.478514]
The message "W [MSGID: 101095] [xlator.c:148:xlator_volopt_dynload] 0-xlator:
/usr/lib64/glusterfs/3.7.9/xlator/cluster/replicate.so: cannot open shared
object file: Too many open files" repeated 2 times between [2016-05-16
07:27:02.478527] and [2016-05-16 07:27:02.478556]
[2016-05-16 08:04:49.794095] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
[2016-05-16 08:04:49.849469] I [MSGID: 106499]
[glusterd-handler.c:4330:__glusterd_handle_status_volume] 0-management:
Received status volume req for volume rep
[2016-05-16 08:04:49.851449] E [MSGID: 106301]
[glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation
'Volume Status' failed on localhost : Volume rep is not started
[2016-05-16 08:04:55.999928] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 08:04:56.000034] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 08:04:56.000059] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 08:04:56.000096] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 08:04:56.000107] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 08:04:56.000117] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
[2016-05-16 08:05:27.990060] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 08:05:27.990090] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 08:05:27.990157] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 08:05:27.990177] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 08:05:27.990246] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 08:05:27.990272] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 10 times between
[2016-05-16 08:04:49.794095] and [2016-05-16 08:05:27.990523]
[2016-05-16 08:13:05.184001] W [MSGID: 101088]
[common-utils.c:3662:gf_backtrace_save] 0-management: Failed to save the
backtrace.
[2016-05-16 08:13:05.186580] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 08:13:05.186639] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 08:13:05.186654] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 08:13:05.186665] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 08:13:05.186675] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 08:13:05.186690] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
[2016-05-16 08:13:06.581765] E [MSGID: 106187]
[glusterd-volume-ops.c:1486:glusterd_op_stage_start_volume] 0-management: Could
not find peer on which brick
dhcp43-214.lab.eng.blr.redhat.com:/bricks/brick0/brr0 resides
[2016-05-16 08:13:06.581767] W [MSGID: 106122]
[glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start
prevalidation failed.
[2016-05-16 08:13:06.581833] E [MSGID: 106122]
[glusterd-mgmt.c:879:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Start on local node
[2016-05-16 08:13:06.581722] E [MSGID: 106408]
[glusterd-peer-utils.c:120:glusterd_peerinfo_find_by_hostname] 0-management:
error in getaddrinfo: System error
 [Unknown error -11]
[2016-05-16 08:13:06.581761] E [MSGID: 101075]
[common-utils.c:3137:gf_is_local_addr] 0-management: error in getaddrinfo:
System error

[2016-05-16 08:13:06.581873] E [MSGID: 106122]
[glusterd-mgmt.c:1991:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre
Validation Failed
The message "W [MSGID: 101088] [common-utils.c:3662:gf_backtrace_save]
0-management: Failed to save the backtrace." repeated 5 times between
[2016-05-16 08:13:05.184001] and [2016-05-16 08:13:06.581980]
[2016-05-16 08:19:40.223288] I [MSGID: 106488]
[glusterd-handler.c:1533:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2016-05-16 08:19:40.239294] I [MSGID: 106488]
[glusterd-handler.c:1533:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req

--- Additional comment from Kaushal on 2016-05-16 15:12:08 IST ---

Each encrypted connection creates a pipe (which uses 2 fds) that is not closed
after disconnection. This is an fd leak, which can lead to resource starvation
and cause subsequent operations to fail.

--- Additional comment from Vijay Bellur on 2016-05-16 15:18:56 IST ---

REVIEW: http://review.gluster.org/14356 (rpc/socket: Close pipe on
disconnection) posted (#1) for review on master by Kaushal M
(kaushal at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1317278
[Bug 1317278] GlusterFS 3.8.0 tracker
https://bugzilla.redhat.com/show_bug.cgi?id=1336371
[Bug 1336371] Sequential volume start&stop  is failing  with SSL  enabled
setup.
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list