[Bugs] [Bug 1335721] New: glusterd can't startup while volumes configuration file corrupt
bugzilla at redhat.com
Fri May 13 05:45:29 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1335721
Bug ID: 1335721
Summary: glusterd can't startup while volumes configuration
file corrupt
Product: GlusterFS
Version: 3.6.9
Component: glusterd
Severity: urgent
Assignee: bugs at gluster.org
Reporter: george.lian at nokia.com
Group: nokia
Description of problem:
"glusterd" cannot start when a volume configuration file is corrupted.
Such corruption can result from software faults, hardware faults, or unclean
reboots.
Version-Release number of selected component (if applicable):
How reproducible:
Corruption of the configuration files is hard to reproduce naturally, but it
can be simulated manually.
Steps to Reproduce:
1. Set up a replicated volume across two hosts running glusterd, named hostA
and hostB.
2. Stop all glusterfs processes on hostA.
3. rm -rf $workdir/volume_example/info
4. Start the glusterd process on hostA.
Actual results:
glusterd fails to start and logs the error:
"Unable to restore volume:volume_example"
Expected results:
glusterd should still start up, wait for hostB to start normally, and then
fetch the configuration data from hostB.
Additional info:
In our tests, the code changes below seem to work.
int32_t
glusterd_restore ()
{
        int32_t         ret  = -1;
        xlator_t       *this = NULL;

        this = THIS;

        ret = glusterd_restore_op_version (this);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR,
                        "Failed to restore op_version");
                goto out;
        }

        ret = glusterd_store_retrieve_volumes (this, NULL);
        if (ret)
                goto out;

        ret = glusterd_store_retrieve_peers (this);
        if (ret)
                goto out;

        /* While retrieving snapshots, if the snapshot status
           is not GD_SNAP_STATUS_IN_USE, then the snapshot is
           cleaned up. To do that, the snap volume has to be
           stopped by stopping snapshot volume's bricks. And for
           that the snapshot bricks should be resolved. But without
           retrieving the peers, resolving bricks will fail. So
           do retrieving of snapshots after retrieving peers.
        */
        ret = glusterd_store_retrieve_snaps (this);
        /*
        if (ret)
                goto out;
        */

        ret = glusterd_resolve_all_bricks (this);
        /*
        if (ret)
                goto out;
        */

        ret = glusterd_snap_cleanup (this);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR, "Failed to perform "
                        "a cleanup of the snapshots");
                goto out;
        }

        ret = glusterd_recreate_all_snap_brick_mounts (this);
        if (ret) {
                gf_log (this->name, GF_LOG_ERROR, "Failed to recreate "
                        "all snap brick mounts");
                goto out;
        }
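
The change above amounts to treating some restore steps as non-fatal: a
failure is tolerated and startup continues. The sketch below shows that
general pattern in isolation. It is not glusterd code; every name in it
(restore_step, run_restore_steps, step_ok, step_fail) is hypothetical, and
it adds logging and a count of tolerated failures, which the change above
does not have.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical step callback: returns 0 on success, non-zero on failure. */
typedef int (*restore_step_fn)(void);

/* Hypothetical description of one restore step. */
struct restore_step {
        const char      *name;   /* used only for log messages */
        restore_step_fn  run;    /* the step itself */
        int              fatal;  /* non-zero: a failure aborts the restore */
};

/*
 * Run the steps in order.
 * Returns 0 if no fatal step failed, otherwise the return code of the
 * first fatal step that failed (later steps are not run).
 * If soft_failures is not NULL, it receives the number of non-fatal
 * steps that failed and were tolerated.
 */
static int
run_restore_steps(const struct restore_step *steps, size_t count,
                  int *soft_failures)
{
        int    tolerated = 0;
        int    ret = 0;
        size_t i;

        for (i = 0; i < count; i++) {
                int rc = steps[i].run();

                if (rc == 0)
                        continue;

                if (steps[i].fatal) {
                        fprintf(stderr, "restore: %s failed (%d), aborting\n",
                                steps[i].name, rc);
                        ret = rc;
                        break;
                }

                /* Non-fatal: record the failure and keep going. */
                fprintf(stderr, "restore: %s failed (%d), continuing\n",
                        steps[i].name, rc);
                tolerated++;
        }

        if (soft_failures != NULL)
                *soft_failures = tolerated;

        return ret;
}

/* Stand-in steps for illustration only. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }
```

With a table such as { {"snaps", step_fail, 0}, {"snap_cleanup", step_ok, 1} },
run_restore_steps returns 0 and reports one tolerated failure, whereas any
failing step marked fatal stops the sequence and returns that step's error
code. Unlike silently overwriting ret, this keeps a record of which steps
were skipped.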
--
You are receiving this mail because:
You are the assignee for the bug.