[Gluster-users] Replace a failed disk in Distributed Disperse Volume

fanghuang.data at yahoo.com fanghuang.data at yahoo.com
Mon Jul 13 16:08:38 UTC 2015


Hi Pakkeer,

Sorry for the formatting that the Yahoo mail server applied to my previous message. I am resending it in plain text.

-------------


I am also interested in this question. In version 3.6, we do not need to mount the new drive at the same mount point. The "volume replace-brick commit force" operation is supported in that version even if the brick being replaced is offline. However, proactive healing is not supported, and during client-triggered healing the xattrs on the new brick still cannot be healed properly. If I remember correctly, some xattr access errors are reported in the log files. So even though the new brick replaces the old one in the volume file, the data is not healed. I once wrote an experimental script that fixes the xattrs of all files and directories on the new brick according to the information on the other bricks. It seems to work.
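
A minimal sketch of the comparison step such a script could start from (this is not the script mentioned above). It reads the extended attributes of the same path on a healthy brick and on the new brick, then lists what is missing or different. The function names and the idea of diffing against a single reference brick are assumptions for illustration. os.listxattr and os.getxattr are Linux-only, reading trusted.* attributes normally requires root, and the sketch assumes both brick directories are reachable from one host. It only reports differences and changes nothing.

```python
import os


def read_xattrs(path):
    """Return {name: value} for every extended attribute on path (Linux only)."""
    return {name: os.getxattr(path, name) for name in os.listxattr(path)}


def diff_xattrs(reference, target):
    """Compare two xattr dicts; return (missing, mismatched) sorted name lists."""
    missing = sorted(name for name in reference if name not in target)
    mismatched = sorted(
        name for name in reference
        if name in target and reference[name] != target[name]
    )
    return missing, mismatched


def compare_on_bricks(healthy_brick, new_brick, relative_path):
    """Diff the xattrs of one file or directory as stored on two bricks."""
    ref = read_xattrs(os.path.join(healthy_brick, relative_path))
    tgt = read_xattrs(os.path.join(new_brick, relative_path))
    return diff_xattrs(ref, tgt)
```

Which attributes are expected to match across bricks is not stated in this thread, so any reported difference needs to be judged by hand before anything is written back.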

In version 3.7, "volume replace-brick commit force" is disabled if the brick being replaced is offline, for both disperse volumes and AFR volumes. So the volume file cannot be regenerated with the new brick.
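
For reference, the operation under discussion has the following shape on the command line. This is a minimal sketch that only assembles and prints the argument list; the volume name and brick paths are placeholders, not values from this thread.

```python
def replace_brick_cmd(volume, old_brick, new_brick):
    """Build the argv for `gluster volume replace-brick ... commit force`."""
    return ["gluster", "volume", "replace-brick",
            volume, old_brick, new_brick, "commit", "force"]


if __name__ == "__main__":
    # Placeholder names; substitute the real volume and HOST:/path bricks.
    cmd = replace_brick_cmd("myvol", "server1:/bricks/old", "server1:/bricks/new")
    print(" ".join(cmd))
```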

So the question could be extended to "How do we replace a failed brick with a new brick mounted at a different mount point?"

Since you mounted the new brick at the same mount point as the old one and the volume is started, proactive healing should work. Could you check the log files and see whether any errors are reported?
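
A small sketch for pulling the error lines out of a log file. It assumes the usual GlusterFS log layout, where a one-letter severity (E for error, C for critical) follows the bracketed timestamp. The function names are hypothetical, and the log path has to be supplied by the caller.

```python
import re

# Assumed layout: "[timestamp] L [source] message", with L a one-letter severity.
LEVEL_RE = re.compile(r"^\[[^\]]+\]\s+([A-Z])\s+\[")


def error_lines(lines, levels=("E", "C")):
    """Return the lines whose severity letter is in levels."""
    hits = []
    for line in lines:
        match = LEVEL_RE.match(line)
        if match and match.group(1) in levels:
            hits.append(line.rstrip("\n"))
    return hits


def scan_log(path, levels=("E", "C")):
    """Scan one log file and return its error/critical lines."""
    with open(path, errors="replace") as handle:
        return error_lines(handle, levels)
```

The logs are typically found under /var/log/glusterfs on each server, though that location depends on how the packages were built and configured.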

 
Best Regards, 
Fang Huang



>On Monday, 13 July 2015, 22:33, Mohamed Pakkeer <mdfakkeer at gmail.com> wrote:
> 
>
>>
>>
>>Hi Gluster Experts,
>>
>>
>>How do I replace a failed disk on the same mount point in a distributed disperse volume? I am following the steps below to replace the disk.
>>
>>
>>1. Kill the brick process (by PID) for the failed disk.
>>2. Unmount the drive.
>>3. Insert, format, and mount the new drive at the failed drive's mount point.
>>4. Start the volume with force.
>>
>>
>>Is this the right approach to replacing the failed drive? After replacing the failed brick using the above steps, proactive self-healing does not work: the missing data chunks are not rebuilt from the available disks onto the new disk. I am using glusterfs version 3.7.2.
>>
>>
>>I am not able to find any proper documentation on replacing a disk on the same mount point. Any help would be greatly appreciated.
>>
>>
>>Thanks...
>>Backer
>>
>>
>>
>>
>>
>
>

