[Gluster-devel] [glusterfs-3.6.0beta3-0.11.gitd01b00a] gluster volume status is running even though the Disk is detached

Niels de Vos ndevos at redhat.com
Mon Oct 27 12:23:10 UTC 2014


On Mon, Oct 27, 2014 at 05:19:13PM +0530, Kiran Patil wrote:
> Hi,
> 
> I created a replicated volume with two bricks on the same node and copied
> some data to it.
> 
> Then I removed the disk that hosts one of the bricks of the volume.
> 
> storage.health-check-interval is set to 30 seconds.
> 
> I can see that the disk is unavailable using the zpool command of ZFS on
> Linux, but gluster volume status still shows the brick process as running,
> even though it should have been shut down by now.
> 
> Is this a bug in 3.6, since brick failure detection is described as a feature at
> https://github.com/gluster/glusterfs/blob/release-3.6/doc/features/brick-failure-detection.md
> or am I making a mistake here?

The initial detection of brick failures did not work for all filesystems,
and it may not work for ZFS either. A fix has been posted, but it has not
been merged into the master branch yet. Once the change has been merged, it
can be backported to 3.6 and 3.5.
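
For background, the check described on that feature page is, roughly
speaking, a periodic stat() of the brick directory performed by the brick
process itself; when the call fails, the brick shuts itself down so that
clients and "gluster volume status" notice the outage. The sketch below is
only a conceptual illustration of that loop (not the actual code), reusing
the brick path and interval from your output. On some filesystems such a
metadata-only probe can keep succeeding even after the underlying device is
gone, which is the gap the patch is meant to close:

    # Conceptual sketch only -- the real check runs inside the brick process.
    BRICK=/zp2/brick2      # brick path from your volume info
    INTERVAL=30            # storage.health-check-interval in seconds
    while sleep "$INTERVAL"; do
        if ! stat "$BRICK" >/dev/null 2>&1; then
            echo "health-check on $BRICK failed; the brick would shut down now"
            break
        fi
    done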

You may want to test with the patch applied, and add your "+1 Verified"
to the change if it fixes the problem for you:
- http://review.gluster.org/8213
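
For reference, the interval itself is set and checked the same way with or
without the patch; assuming the volume name "repvol" from your output:

    gluster volume set repvol storage.health-check-interval 30
    gluster volume info repvol   # shows the option under "Options Reconfigured"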

Cheers,
Niels

> 
> [root@fractal-c92e gluster-3.6]# gluster volume status
> Status of volume: repvol
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick 192.168.1.246:/zp1/brick1                 49154   Y       17671
> Brick 192.168.1.246:/zp2/brick2                 49155   Y       17682
> NFS Server on localhost                         2049    Y       17696
> Self-heal Daemon on localhost                   N/A     Y       17701
> 
> Task Status of Volume repvol
> ------------------------------------------------------------------------------
> There are no active volume tasks
> 
> 
> [root@fractal-c92e gluster-3.6]# gluster volume info
> 
> Volume Name: repvol
> Type: Replicate
> Volume ID: d4f992b1-1393-43b8-9fda-2e2b6e3b5039
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.1.246:/zp1/brick1
> Brick2: 192.168.1.246:/zp2/brick2
> Options Reconfigured:
> storage.health-check-interval: 30
> 
> [root@fractal-c92e gluster-3.6]# zpool status zp2
>   pool: zp2
>  state: UNAVAIL
> status: One or more devices are faulted in response to IO failures.
> action: Make sure the affected devices are connected, then run 'zpool clear'.
>    see: http://zfsonlinux.org/msg/ZFS-8000-HC
>   scan: none requested
> config:
> 
> NAME        STATE     READ WRITE CKSUM
> zp2         UNAVAIL      0     0     0  insufficient replicas
>   sdb       UNAVAIL      0     0     0
> 
> errors: 2 data errors, use '-v' for a list
> 
> 
> Thanks,
> Kiran.

> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel


