[Bugs] [Bug 1335721] glusterd can't startup while volumes configuration file corrupt
bugzilla at redhat.com
Fri May 13 05:54:11 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1335721
Atin Mukherjee <amukherj at redhat.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |CLOSED
                 CC|                            |amukherj at redhat.com
         Resolution|---                         |NOTABUG
        Last Closed|                            |2016-05-13 01:54:11
--- Comment #1 from Atin Mukherjee <amukherj at redhat.com> ---
(In reply to George from comment #0)
> Description of problem:
>
> "glusterd" can't start up due to corruption of volumes configuration files
> which might come from SW, HW or unclean reboots.
>
>
> Version-Release number of selected component (if applicable):
>
>
> How reproducible:
> Corruption of the configuration file is not easy to reproduce, but we can
> trigger it manually.
>
> Steps to Reproduce:
> 1. set up 2 hosts running glusterd with a replicated volume, named hostA and hostB
> 2. stop all glusterfs processes on hostA
> 3. rm -rf $workdir/volume_example/info
Manual alteration of the configuration files is strictly disallowed. You'd
need to come up with a proven case where the files can be corrupted. One
example could be a disk-full scenario, but at the same time it's recommended that
the disk usage of the partition where /var/lib/glusterd resides be
carefully monitored.
Given this reason, I don't think it's a bug and am closing it. Please feel free to
reopen if you can come up with some other scenarios.
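As an illustration of the monitoring recommended above, here is a minimal sketch in C that checks how much space is left on the filesystem holding a given path, using POSIX statvfs(). The helper names free_space_percent and disk_is_low and the threshold parameter are hypothetical and not part of glusterd; a real deployment would more likely rely on an existing monitoring tool.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/statvfs.h>

/* Hypothetical helper: percentage of blocks available to unprivileged
 * users on the filesystem that holds `path`, or -1.0 on error. */
static double
free_space_percent (const char *path)
{
        struct statvfs vfs;

        assert (path != NULL);

        if (statvfs (path, &vfs) != 0)
                return -1.0;            /* path missing or not accessible */
        if (vfs.f_blocks == 0)
                return -1.0;            /* pseudo filesystem, no size info */

        /* f_bavail and f_blocks are both counted in f_frsize units,
         * so their ratio is the fraction of space still available. */
        return 100.0 * (double) vfs.f_bavail / (double) vfs.f_blocks;
}

/* Hypothetical helper: 1 if free space is below `min_free_percent`,
 * 0 if it is not, -1 if the free space could not be determined. */
static int
disk_is_low (const char *path, double min_free_percent)
{
        double pct = free_space_percent (path);

        if (pct < 0.0)
                return -1;
        return (pct < min_free_percent) ? 1 : 0;
}
```

A periodic job could call disk_is_low ("/var/lib/glusterd", 10.0) and raise an alert when it returns 1; the 10 percent threshold is only an example value.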
> 4. start the glusterd process
>
>
> Actual results:
> the glusterd process failed to start, with this error in the log:
> "Unable to restore volume:volume_example"
>
> Expected results:
> hope glusterd can start up and wait for hostB to start normally, then get
> configuration data from hostB
>
> Additional info:
> when testing with the code changes below, it seems to work.
>
>
>
> int32_t
> glusterd_restore ()
> {
>         int32_t ret = -1;
>         xlator_t *this = NULL;
>
>         this = THIS;
>
>         ret = glusterd_restore_op_version (this);
>         if (ret) {
>                 gf_log (this->name, GF_LOG_ERROR,
>                         "Failed to restore op_version");
>                 goto out;
>         }
>
>         ret = glusterd_store_retrieve_volumes (this, NULL);
>         if (ret)
>                 goto out;
>
>         ret = glusterd_store_retrieve_peers (this);
>         if (ret)
>                 goto out;
>
>         /* While retrieving snapshots, if the snapshot status
>            is not GD_SNAP_STATUS_IN_USE, then the snapshot is
>            cleaned up. To do that, the snap volume has to be
>            stopped by stopping snapshot volume's bricks. And for
>            that the snapshot bricks should be resolved. But without
>            retrieving the peers, resolving bricks will fail. So
>            do retrieving of snapshots after retrieving peers.
>         */
>         ret = glusterd_store_retrieve_snaps (this);
>         /*
>         if (ret)
>                 goto out;
>         */
>
>         ret = glusterd_resolve_all_bricks (this);
>         /*
>         if (ret)
>                 goto out;
>         */
>
>         ret = glusterd_snap_cleanup (this);
>         if (ret) {
>                 gf_log (this->name, GF_LOG_ERROR, "Failed to perform "
>                         "a cleanup of the snapshots");
>                 goto out;
>         }
>
>         ret = glusterd_recreate_all_snap_brick_mounts (this);
>         if (ret) {
>                 gf_log (this->name, GF_LOG_ERROR, "Failed to recreate "
>                         "all snap brick mounts");
>                 goto out;
>         }
--
You are receiving this mail because:
You are the assignee for the bug.