[Gluster-users] Reinstall OS while keeping bricks intact

Thu Jul 30 13:49:41 UTC 2015

Hi Prasun,

This email alias is for upstream GlusterFS users, not for Red Hat Gluster Storage (the downstream product).
So we will not be able to help much with Red Hat Gluster Storage in this public forum. You will need to open a ticket with Red Hat support.

That said, here is what you need to do:
1) Try to stabilize your current 3.0 installation first.
2) If you want to reinstall one of your nodes, I suggest reinstalling the same RHS version that was installed earlier
   and then adding the node back into the cluster. I recommend using the same hostname and IP address as before.
3) You can follow chapter 8.6.2, "Replacing a Host Machine with the Same Hostname", from the document:
https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Replacing_Hosts.html#Replacing_a_Host_Machine_with_the_Same_Hostname
4) Please read through the document before following it. If possible, try the procedure on a test cluster first to gain confidence and minimize mistakes.
5) As you are using a distribute-replicate topology, you can get your data back from the replica brick by running self-heal.
6) If you want to upgrade to 3.1, you can upgrade now.
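
For step 5, a minimal sketch of checking the peer and triggering self-heal once the reinstalled node is back in the cluster. The volume name "myvol" is a placeholder and the run() dry-run wrapper is only an illustration; by default the script prints the gluster commands instead of executing them. Please check the guide linked above for the exact procedure on your version.

```sh
#!/bin/sh
# Placeholder volume name (an assumption); substitute your own.
VOLNAME="${VOLNAME:-myvol}"
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

heal_after_reinstall() {
    # Confirm the reinstalled node is connected to the pool again.
    run gluster peer status
    # Ask for a full self-heal so the replica repopulates the brick.
    run gluster volume heal "$VOLNAME" full
    # List entries that still need healing; repeat until it is empty.
    run gluster volume heal "$VOLNAME" info
}

heal_after_reinstall
```

Run it with DRY_RUN=0 on a node of the cluster to actually execute the commands.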

Thanks,
Bipin Kunal

----- Original Message -----
From: Prasun Gera <prasun.gera at gmail.com>
To: gluster-users at gluster.org
Sent: Thu, 30 Jul 2015 05:17:46 -0400 (EDT)
Subject: Re: [Gluster-users] Reinstall OS while keeping bricks intact

Also, if there is a cleaner way of doing this by removing and adding the
node again through gluster commands that would be preferable.

On Thu, Jul 30, 2015 at 1:58 AM, Prasun Gera <prasun.gera at gmail.com> wrote:

> Hi,
> One of my nodes in an RHS 3.0 3x2 dist+replicated pool is down and not
> likely to recover. The machine doesn't have IPMI and I have limited access.
> Standard steps to recover it didn't work, and at this point the easiest
> option seems to get help in reinstalling the OS. I believe that the brick
> and other config files are intact. From RHS documentation on upgrading from
> an ISO, this is what I got:
>
> 1. Back up (/var/lib/glusterd, /etc/swift, /etc/samba, /etc/ctdb,
> /etc/glusterfs, /var/lib/samba, /var/lib/ctdb). Back up the entire /etc for
> selective restoration.
>
> 2. Stop the volume and all services everywhere. Install the OS on the
> affected node without touching the brick. Stop glusterd on this node too.
>
> 3. Backup /var/lib/glusterd from the newly installed OS.
>
> 4. Copy back /var/lib/glusterd and /etc/glusterfs from step 1. to the
> newly installed OS.
>
> 5. Copy back the latest hooks scripts (from step 3) to
> /var/lib/glusterd/hooks. This is probably not required since the steps were
> written for an upgrade whereas my version is the same. Right ?
>
> 6. glusterd --xlator-option *.upgrade=yes -N. Is this needed in my case ?
> It's not an upgrade.
>
> 7. Restart services and volume.
>
> Do these steps sound all right ? Should I also restore /etc/nagios ? Or
> would nagios have to be reconfigured for the entire cluster ?
>
>
> The reason for this failure was a botched kernel upgrade combined with
> some other factors which I'm not sure about yet. And I wasn't able to generate
> a working initramfs using dracut in recovery. Interestingly, I noticed the
> following line in the new RHS 3.1 documentation. "If dracut packages are
> previously installed, then exclude the dracut packages while updating to
> Red Hat Gluster Storage 3.1 during offline ISO update using the following
> command:
> # yum update -x dracut -x dracut-kernel" . Is there some sort of a known
> issue ?
>
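
The backup in step 1 of the quoted procedure could be sketched roughly as below. The path list is taken from that step; the function name backup_config and the example archive location are hypothetical. The sketch archives only the listed directories that actually exist and never touches the brick itself.

```sh
#!/bin/sh
# Paths listed in step 1, written relative to the filesystem root.
CONFIG_PATHS="var/lib/glusterd etc/swift etc/samba etc/ctdb etc/glusterfs var/lib/samba var/lib/ctdb"

# backup_config ROOT ARCHIVE
# Archive whichever of the listed paths exist under ROOT into ARCHIVE.
backup_config() {
    root="$1"
    archive="$2"
    existing=""
    for p in $CONFIG_PATHS; do
        if [ -e "$root/$p" ]; then
            existing="$existing $p"
        fi
    done
    if [ -z "$existing" ]; then
        echo "nothing to back up under $root" >&2
        return 1
    fi
    # -C stores the paths relative to ROOT; $existing is left unquoted
    # on purpose so that each path becomes a separate argument.
    tar -czf "$archive" -C "$root" $existing
}

# Example call (hypothetical destination):
# backup_config / /root/gluster-config-backup.tar.gz
```

If the old root filesystem is only reachable from a rescue environment, pass its mount point as ROOT instead of /.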