[Gluster-users] Issue with Pro active self healing for Erasure coding
Xavier Hernandez
xhernandez at datalab.es
Mon Jun 15 07:56:05 UTC 2015
On 06/15/2015 09:25 AM, Mohamed Pakkeer wrote:
> Hi Xavier,
>
> When can we expect the 3.7.2 release that fixes the I/O error we
> discussed on this mail thread?
As per the latest meeting, held last Wednesday [1], it will be released
this week.
Xavi
[1]
http://meetbot.fedoraproject.org/gluster-meeting/2015-06-10/gluster-meeting.2015-06-10-12.01.html
>
> Thanks
> Backer
>
> On Wed, May 27, 2015 at 8:02 PM, Xavier Hernandez
> <xhernandez at datalab.es> wrote:
>
> Hi again,
>
> In today's Gluster meeting [1] it was decided that 3.7.1 will be
> released urgently to fix a bug in glusterd. All fixes planned for
> 3.7.1 will be moved to 3.7.2, which will be released soon after.
>
> Xavi
>
> [1]
> http://meetbot.fedoraproject.org/gluster-meeting/2015-05-27/gluster-meeting.2015-05-27-12.01.html
>
>
> On 05/27/2015 12:01 PM, Xavier Hernandez wrote:
>
> On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:
>
> Hi Xavier,
>
> Thanks for your reply. When can we expect the 3.7.1 release?
>
>
> AFAIK a beta of 3.7.1 will be released very soon.
>
>
> cheers
> Backer
>
> On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez
> <xhernandez at datalab.es> wrote:
>
> Hi,
>
> Some Input/Output error issues have been identified and fixed. These
> fixes will be available in 3.7.1.
>
> Xavi
>
>
> On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:
>
> Hi Glusterfs Experts,
>
> We are testing the glusterfs 3.7.0 tarball on our 10-node glusterfs
> cluster. Each node has 36 drives; please find the volume info below.
>
> Volume Name: vaulttest5
> Type: Distributed-Disperse
> Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
> Status: Started
> Number of Bricks: 36 x (8 + 2) = 360
> Transport-type: tcp
> Bricks:
> Brick1: 10.1.2.1:/media/disk1
> Brick2: 10.1.2.2:/media/disk1
> Brick3: 10.1.2.3:/media/disk1
> Brick4: 10.1.2.4:/media/disk1
> Brick5: 10.1.2.5:/media/disk1
> Brick6: 10.1.2.6:/media/disk1
> Brick7: 10.1.2.7:/media/disk1
> Brick8: 10.1.2.8:/media/disk1
> Brick9: 10.1.2.9:/media/disk1
> Brick10: 10.1.2.10:/media/disk1
> Brick11: 10.1.2.1:/media/disk2
> Brick12: 10.1.2.2:/media/disk2
> Brick13: 10.1.2.3:/media/disk2
> Brick14: 10.1.2.4:/media/disk2
> Brick15: 10.1.2.5:/media/disk2
> Brick16: 10.1.2.6:/media/disk2
> Brick17: 10.1.2.7:/media/disk2
> Brick18: 10.1.2.8:/media/disk2
> Brick19: 10.1.2.9:/media/disk2
> Brick20: 10.1.2.10:/media/disk2
> ...
> ....
> Brick351: 10.1.2.1:/media/disk36
> Brick352: 10.1.2.2:/media/disk36
> Brick353: 10.1.2.3:/media/disk36
> Brick354: 10.1.2.4:/media/disk36
> Brick355: 10.1.2.5:/media/disk36
> Brick356: 10.1.2.6:/media/disk36
> Brick357: 10.1.2.7:/media/disk36
> Brick358: 10.1.2.8:/media/disk36
> Brick359: 10.1.2.9:/media/disk36
> Brick360: 10.1.2.10:/media/disk36
> Options Reconfigured:
> performance.readdir-ahead: on
>
> We did some performance testing and simulated proactive self-healing
> for erasure coding. The disperse volume has been created across nodes.
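>
> For reference, a volume of this shape (36 disperse sets of 8 + 2) can
> be created with something along these lines; this is only a sketch,
> and the exact command we used may have differed:
>
>     # Build the brick list in the same order as above: disk1 on all
>     # ten hosts, then disk2 on all ten hosts, and so on up to disk36.
>     bricks=$(for d in $(seq 1 36); do
>                for h in $(seq 1 10); do
>                  printf '10.1.2.%d:/media/disk%d ' "$h" "$d"
>                done
>              done)
>     gluster volume create vaulttest5 disperse 10 redundancy 2 $bricks
>     gluster volume start vaulttest5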
>
> _*Description of problem*_
>
> I disconnected the *network of two nodes* and tried to write some
> video files, and *glusterfs wrote the video files to the remaining 8
> nodes perfectly*. I tried to download the uploaded file and it was
> downloaded perfectly. Then I re-enabled the network on the two nodes;
> the proactive self-healing mechanism worked and wrote the missing
> chunks of data to the re-enabled nodes from the other 8 nodes. But
> when I then tried to download the same file, it showed an Input/Output
> error and I couldn't download the file. I think there is an issue in
> proactive self-healing.
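>
> After re-enabling the nodes, the heal state can be checked with the
> standard heal commands (a sketch against this volume name):
>
>     gluster volume heal vaulttest5 info   # entries still pending heal
>     gluster volume heal vaulttest5 full   # force a full self-heal crawl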
>
> We also tried the simulation with a single node's network failure and
> faced the same I/O error while downloading the file.
>
>
> _Error while downloading the file_
>
> root at master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
> sending incremental file list
> file13_AN
>   3,342,355,597 100%    4.87MB/s    0:10:54 (xfr#1, to-chk=0/1)
> rsync: read errors mapping "/mnt/gluster/file13_AN": Input/output error (5)
> WARNING: file13_AN failed verification -- update discarded (will try again).
>
> root at master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
> cp: error reading ‘/mnt/gluster/file13_AN’: Input/output error
> cp: failed to extend ‘./1/file13_AN-3’: Input/output error
>
>
> We can't tell whether the issue lies in glusterfs 3.7.0 itself or in
> our glusterfs configuration.
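>
> In case it helps narrow things down, the erasure-coding metadata of
> the affected file can be compared across bricks (a sketch; it assumes
> the file sits at the volume root, so it appears directly under each
> brick):
>
>     # Run on every node and compare the trusted.ec.* values
>     # (e.g. trusted.ec.version / trusted.ec.size) across bricks.
>     getfattr -d -m . -e hex /media/disk*/file13_AN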
>
> Any help would be greatly appreciated
>
> --
> Cheers
> Backer
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users