[Bugs] [Bug 1672205] New: [GSS] 'gluster get-state' command fails if volume brick doesn't exist.
bugzilla at redhat.com
Mon Feb 4 09:32:44 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1672205
Bug ID: 1672205
Summary: [GSS] 'gluster get-state' command fails if volume
brick doesn't exist.
Product: GlusterFS
Version: mainline
Status: NEW
Component: glusterd
Keywords: Improvement
Severity: medium
Priority: medium
Assignee: bugs at gluster.org
Reporter: srakonde at redhat.com
Depends On: 1669970
Target Milestone: ---
Group: private
Classification: Community
Description of problem:
'gluster get-state' fails when any brick of a volume is not present or has been
deleted. Instead of failing, the command output should report the brick
failure.
When any brick of a volume is unavailable or has been removed, 'gluster
get-state' fails with the following error:
'Failed to get daemon state. Check glusterd log file for more details'
The requirement is that 'gluster get-state' should not fail, and that it should
report the brick's state in the generated output.
For example:
cat /var/run/gluster/glusterd_state_XYZ
...
Volume3.name: v02
Volume3.id: c194e70d-6738-4ba3-9502-ec5603aab679
Volume3.type: Distributed-Replicate
...
## HERE #
Volume3.Brick1.port: N/A or 0 or empty?
Volume3.Brick1.rdma_port: 0
Volume3.Brick1.port_registered: N/A or 0 or empty?
Volume3.Brick1.status: Failed
Volume3.Brick1.spacefree: N/A or 0 or empty?
Volume3.Brick1.spacetotal: N/A or 0 or empty?
...
This situation can occur in production when the local storage on a node is
broken, or when using heketi with Gluster: the volumes are still present, but
their bricks are missing.
How reproducible:
Always
Version-Release number of selected component (if applicable): RHGS 3.X
Steps to Reproduce:
1. Delete a brick
2. Run command 'gluster get-state'
Actual results:
Command fails with the below message
'Failed to get daemon state. Check glusterd log file for more details'
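For illustration, a rough transcript of the failure (the brick path and volume
name below are hypothetical; deleting the brick directory simulates the broken
local storage described above):
# rm -rf /bricks/brick1/v02
# gluster get-state
Failed to get daemon state. Check glusterd log file for more details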
Expected results:
'gluster get-state' should not fail. It should report the faulty brick's state
in the output so one can easily identify what the problem with the volume is,
and it should include a message flagging the faulty brick.
--- Additional comment from Atin Mukherjee on 2019-01-28 15:10:36 IST ---
Root cause:
from glusterd_get_state ()
<snip>
    ret = sys_statvfs(brickinfo->path, &brickstat);
    if (ret) {
        gf_msg(this->name, GF_LOG_ERROR, errno, GD_MSG_FILE_OP_FAILED,
               "statfs error: %s ", strerror(errno));
        goto out;
    }

    memfree = brickstat.f_bfree * brickstat.f_bsize;
    memtotal = brickstat.f_blocks * brickstat.f_bsize;

    fprintf(fp, "Volume%d.Brick%d.spacefree: %" PRIu64 "Bytes\n",
            count_bkp, count, memfree);
    fprintf(fp, "Volume%d.Brick%d.spacetotal: %" PRIu64 "Bytes\n",
            count_bkp, count, memtotal);
</snip>
A statvfs call is made on the brick path of every brick of every volume to
calculate the total vs. free space. A statvfs failure here should not abort
get-state; instead, spacefree and spacetotal should be reported as unavailable
or 0 bytes.
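A minimal sketch of that direction (illustrative only, not a reviewed patch;
whether to log at WARNING and whether to emit 0 vs. "unavailable" is still an
open choice):
<snip>
    ret = sys_statvfs(brickinfo->path, &brickstat);
    if (ret) {
        /* Brick path is gone: log it, but keep generating get-state
         * output and fall back to 0 bytes for the space fields. */
        gf_msg(this->name, GF_LOG_WARNING, errno, GD_MSG_FILE_OP_FAILED,
               "statfs error: %s ", strerror(errno));
        memfree = 0;
        memtotal = 0;
    } else {
        memfree = brickstat.f_bfree * brickstat.f_bsize;
        memtotal = brickstat.f_blocks * brickstat.f_bsize;
    }

    /* Unlike the current code, these lines are now reached even when
     * statvfs fails, so the state dump always completes. */
    fprintf(fp, "Volume%d.Brick%d.spacefree: %" PRIu64 "Bytes\n",
            count_bkp, count, memfree);
    fprintf(fp, "Volume%d.Brick%d.spacetotal: %" PRIu64 "Bytes\n",
            count_bkp, count, memtotal);
</snip>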
--- Additional comment from Atin Mukherjee on 2019-02-04 07:59:34 IST ---
We need test coverage to ensure that the get-state command generates its output
successfully even if underlying brick(s) of volume(s) in the cluster go bad.
--- Additional comment from sankarshan on 2019-02-04 14:48:30 IST ---
(In reply to Atin Mukherjee from comment #4)
> We need test coverage to ensure that the get-state command generates its
> output successfully even if underlying brick(s) of volume(s) in the cluster
> go bad.
The test coverage flag needs to be set
--
You are receiving this mail because:
You are the assignee for the bug.