[Gluster-users] Replace corrupted brick

Freer, Eva B. freereb at ornl.gov
Wed Sep 23 02:23:46 UTC 2015


Update: I was able to use the TestDisk program from cgsecurity.org to find and rewrite the partition info for the LVM partition. I was then able to mount the disk and restart the gluster volume to bring the brick back online. To make sure everything was OK, I then rebooted the node with the problem. I also rebooted all the client nodes so they have a nice clean start for the morning.
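For the archive, the recovery steps above correspond roughly to the following commands. This is a sketch only: the mount point (/bricks/data5) and gluster volume name (myvol) are illustrative, not taken from the original report; the device and LVM names come from the thread below.

```shell
# Interactively scan the mis-partitioned disk and rewrite the lost
# partition table (TestDisk, from cgsecurity.org):
testdisk /dev/sdj

# Re-detect the LVM metadata and activate the volume group:
vgscan
vgchange -ay vg_data5

# Mount the brick filesystem again (mount point is hypothetical):
mount /dev/mapper/vg_data5-lv_data5 /bricks/data5

# Restart the gluster volume so the offline brick process is respawned;
# "force" is needed because the other bricks are already running:
gluster volume start myvol force
```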
Regards,
Eva
--
Eva Broadaway Freer
Senior Development Engineer
RF, Communications, and Intelligent Systems Group
Electrical and Electronics Systems Research Division
Oak Ridge National Laboratory
freereb at ornl.gov                (865) 574-6894

From: Eva Freer <freereb at ornl.gov>
Date: Tuesday, September 22, 2015 5:18 PM
To: "gluster-users at gluster.org" <gluster-users at gluster.org>
Cc: Eva Freer <freereb at ornl.gov>, Toby Flynn <flynnth at ornl.gov>
Subject: Replace corrupted brick

Our configuration is a distributed, replicated volume with 7 pairs of bricks on 2 servers. We are in the process of adding additional storage for another brick pair. I placed the new disks in one of the servers late last week and used the LSI storcli command to make a RAID 6 volume of the new disks. We are running RedHat 6.6 and Gluster 3.7.1 on both servers.

Yesterday, I ran 'parted /dev/sdj' to create a partition on the new volume. Unfortunately, /dev/sdj was not the new volume (which is /dev/sdh). I realized the error right away, but the system was operating OK and it was late at night, so I decided to wait until today to try to fix this. This morning, I ran 'parted rescue 0 36.0TB'. This runs, but does not find a partition to restore.

I am using LVM, and the partition is /dev/mapper/vg_data5-lv_data5 with an xfs filesystem on it. The system continued to operate, but I expected that there would be problems on re-boot. I re-booted and indeed, the system can't find the volume at /dev/mapper/vg_data5-lv_data5. Is it possible to recover this volume in place, or do I need to just drop it from the gluster volume, recreate the LVM partition, and then copy the files from its partner brick on the other server? If I need to copy the files, what is the best procedure for doing it?
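If in-place recovery fails, the usual Gluster approach is not to copy files by hand but to rebuild the brick filesystem and let self-heal repopulate it from the replica partner. A hedged sketch for a Gluster 3.7-era setup follows; the volume name (myvol), hostname (server1), and brick paths are illustrative, and replace-brick here moves the brick to a fresh directory, which is the commonly documented way to avoid "brick already part of volume" errors:

```shell
# Recreate the LVM volume and xfs filesystem on the wiped disk:
pvcreate /dev/sdj1
vgcreate vg_data5 /dev/sdj1
lvcreate -l 100%FREE -n lv_data5 vg_data5
mkfs.xfs /dev/mapper/vg_data5-lv_data5
mount /dev/mapper/vg_data5-lv_data5 /bricks/data5

# Swap the dead brick for an empty directory on the rebuilt filesystem;
# "commit force" makes Gluster adopt the new, empty brick immediately:
gluster volume replace-brick myvol \
    server1:/bricks/data5/brick server1:/bricks/data5/brick_new \
    commit force

# Trigger a full self-heal so the partner replica repopulates the brick,
# then watch progress:
gluster volume heal myvol full
gluster volume heal myvol info
```

Letting self-heal do the copy preserves the gfid extended attributes that Gluster relies on; copying files manually onto a brick can leave them invisible to clients.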

TIA,
Eva Freer
Oak Ridge National Laboratory
freereb at ornl.gov

