[Bugs] [Bug 1363595] New: Node remains in stopped state in pcs status with "/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]" messages in logs.

bugzilla at redhat.com
Wed Aug 3 07:00:08 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1363595

            Bug ID: 1363595
           Summary: Node remains in stopped state in pcs status with
                    "/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line
                    137: [: too many arguments ]" messages in logs.
           Product: GlusterFS
           Version: 3.8.1
         Component: ganesha-nfs
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: sraj at redhat.com
                CC: bugs at gluster.org, jthottan at redhat.com,
                    kkeithle at redhat.com, ndevos at redhat.com,
                    skoduri at redhat.com



Description of problem:

One of the nodes (dhcp41-253 in the output below) remains in the Stopped state
for the nfs-grace clone in pcs status, and
"/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments
]" messages are logged repeatedly.

Version-Release number of selected component (if applicable):

[root@dhcp41-253 ~]# rpm -qa|grep glusterfs
glusterfs-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-cli-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-ganesha-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-libs-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-client-xlators-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-fuse-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-server-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-geo-replication-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
glusterfs-api-3.8.1-0.4.git56fcf39.el7rhgs.x86_64

[root@dhcp41-253 ~]# rpm -qa|grep ganesha
glusterfs-ganesha-3.8.1-0.4.git56fcf39.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.0-0.14dev26.el7.centos.x86_64
nfs-ganesha-2.4.0-0.14dev26.el7.centos.x86_64

How reproducible:

Observed twice

Steps to Reproduce:
1. Create an nfs-ganesha cluster on 4 nodes.
2. Observe that sometimes, after running gluster nfs-ganesha enable, one of the
nodes remains in the Stopped state in pcs status and the messages below appear
in /var/log/messages:

Aug  3 12:22:10 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7257:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:22:25 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7271:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:22:40 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7285:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:22:55 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7326:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:23:10 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7340:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:23:25 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7354:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
Aug  3 12:23:40 dhcp41-253 lrmd[645]:  notice: nfs-mon_monitor_10000:7368:stderr [ /usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments ]
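
For reference, "[: too many arguments" is the message bash's `[` builtin prints
when it is handed more words than a valid test expression can contain, which
commonly happens when an unquoted variable expands to several words. The
contents of line 137 of ganesha_mon are not shown in this report, so the sketch
below is only a hypothetical illustration of that error class; the variable
name attr_value and its value are assumptions, not taken from the script.

```shell
#!/bin/bash
# Hypothetical illustration only; this is NOT the code at ganesha_mon line 137.
export LC_ALL=C            # keep bash's error text untranslated

attr_value="1 1"           # assumed: a query returned two words instead of one

# Unquoted: the shell runs [ 1 1 = 1 ], which '[' rejects with status 2.
unquoted_rc=0
unquoted_err=$( { [ $attr_value = "1" ]; } 2>&1 ) || unquoted_rc=$?

# Quoted: '[' sees one operand on each side, so the test is merely false (1).
quoted_rc=0
[ "$attr_value" = "1" ] || quoted_rc=$?

echo "unquoted: rc=$unquoted_rc msg=$unquoted_err"
echo "quoted:   rc=$quoted_rc"
```

If the failing test in ganesha_mon is of this kind, quoting the expansion would
stop the syntax error, although whatever produced the unexpected extra words
would still need to be investigated separately.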


pcs status output:

4 nodes and 16 resources configured

Online: [ dhcp41-206.lab.eng.blr.redhat.com dhcp41-253.lab.eng.blr.redhat.com dhcp43-133.lab.eng.blr.redhat.com dhcp43-181.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp41-206.lab.eng.blr.redhat.com dhcp41-253.lab.eng.blr.redhat.com dhcp43-133.lab.eng.blr.redhat.com dhcp43-181.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp41-206.lab.eng.blr.redhat.com dhcp41-253.lab.eng.blr.redhat.com dhcp43-133.lab.eng.blr.redhat.com dhcp43-181.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp41-206.lab.eng.blr.redhat.com dhcp43-133.lab.eng.blr.redhat.com dhcp43-181.lab.eng.blr.redhat.com ]
     Stopped: [ dhcp41-253.lab.eng.blr.redhat.com ]
 dhcp43-133.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-133.lab.eng.blr.redhat.com
 dhcp41-206.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp41-206.lab.eng.blr.redhat.com
 dhcp41-253.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp41-206.lab.eng.blr.redhat.com
 dhcp43-181.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-181.lab.eng.blr.redhat.com

Failed Actions:
* nfs-grace_monitor_0 on dhcp41-253.lab.eng.blr.redhat.com 'unknown error' (1):
call=17, status=complete, exitreason='none',
    last-rc-change='Tue Aug  2 17:37:52 2016', queued=0ms, exec=55ms


PCSD Status:
  dhcp43-133.lab.eng.blr.redhat.com: Online
  dhcp41-206.lab.eng.blr.redhat.com: Online
  dhcp41-253.lab.eng.blr.redhat.com: Online
  dhcp43-181.lab.eng.blr.redhat.com: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
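
When checking for this state on other setups, the Stopped members of each clone
set can be pulled out of saved pcs status text. The following is a minimal
sketch, assuming each "Started:"/"Stopped:" list sits on a single line as pcs
prints it; the function name stopped_members and the trimmed sample input are
made up for illustration.

```shell
#!/bin/bash
# Sketch: print "<clone set> <node>" for every node listed as Stopped.
# stopped_members is a hypothetical helper name, not part of pcs.
stopped_members() {
    awk '
        /Clone Set:/  { set = $3 }                 # e.g. nfs-grace-clone
        /Stopped: \[/ {                            # fields: Stopped: [ node... ]
            for (i = 3; i < NF; i++) print set, $i
        }
    '
}

# Trimmed sample based on the pcs status output in this report.
result=$(stopped_members <<'EOF'
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp41-206.lab.eng.blr.redhat.com dhcp41-253.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp41-206.lab.eng.blr.redhat.com ]
     Stopped: [ dhcp41-253.lab.eng.blr.redhat.com ]
EOF
)
echo "$result"
```

On a live cluster the same function could read from a pipe, for example
pcs status | stopped_members.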


Actual results:

One of the nodes remains in the Stopped state in pcs status, and
"/usr/lib/ocf/resource.d/heartbeat/ganesha_mon: line 137: [: too many arguments
]" messages appear in the logs.

Expected results:

The logs should contain no errors, and all the nodes should be up.

Additional info:

sosreports and logs will be attached.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.