[Gluster-users] Issue with Pro active self healing for Erasure coding

Xavier Hernandez xhernandez at datalab.es
Mon Jun 15 07:56:05 UTC 2015


On 06/15/2015 09:25 AM, Mohamed Pakkeer wrote:
> Hi Xavier,
>
> When can we expect the 3.7.2 release that fixes the I/O error we
> discussed on this mail thread?

As per the latest meeting, held last Wednesday [1], it will be released
this week.

Xavi

[1] 
http://meetbot.fedoraproject.org/gluster-meeting/2015-06-10/gluster-meeting.2015-06-10-12.01.html

>
> Thanks
> Backer
>
> On Wed, May 27, 2015 at 8:02 PM, Xavier Hernandez
> <xhernandez at datalab.es> wrote:
>
>     Hi again,
>
>     in today's gluster meeting [1] it was decided that 3.7.1 will be
>     released urgently to solve a bug in glusterd. All fixes planned
>     for 3.7.1 will be moved to 3.7.2, which will be released soon after.
>
>     Xavi
>
>     [1]
>     http://meetbot.fedoraproject.org/gluster-meeting/2015-05-27/gluster-meeting.2015-05-27-12.01.html
>
>
>     On 05/27/2015 12:01 PM, Xavier Hernandez wrote:
>
>         On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:
>
>             Hi Xavier,
>
>             Thanks for your reply. When can we expect the 3.7.1 release?
>
>
>         AFAIK a beta of 3.7.1 will be released very soon.
>
>
>             cheers
>             Backer
>
>             On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez
>             <xhernandez at datalab.es> wrote:
>
>                  Hi,
>
>                  some Input/Output error issues have been identified
>                  and fixed. These fixes will be available in 3.7.1.
>
>                  Xavi
>
>
>                  On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:
>
>                      Hi Glusterfs Experts,
>
>                      We are testing the glusterfs 3.7.0 tarball on our
>                      10-node glusterfs cluster. Each node has 36 drives;
>                      please find the volume info below.
>
>                      Volume Name: vaulttest5
>                      Type: Distributed-Disperse
>                      Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
>                      Status: Started
>                      Number of Bricks: 36 x (8 + 2) = 360
>                      Transport-type: tcp
>                      Bricks:
>                      Brick1: 10.1.2.1:/media/disk1
>                      Brick2: 10.1.2.2:/media/disk1
>                      Brick3: 10.1.2.3:/media/disk1
>                      Brick4: 10.1.2.4:/media/disk1
>                      Brick5: 10.1.2.5:/media/disk1
>                      Brick6: 10.1.2.6:/media/disk1
>                      Brick7: 10.1.2.7:/media/disk1
>                      Brick8: 10.1.2.8:/media/disk1
>                      Brick9: 10.1.2.9:/media/disk1
>                      Brick10: 10.1.2.10:/media/disk1
>                      Brick11: 10.1.2.1:/media/disk2
>                      Brick12: 10.1.2.2:/media/disk2
>                      Brick13: 10.1.2.3:/media/disk2
>                      Brick14: 10.1.2.4:/media/disk2
>                      Brick15: 10.1.2.5:/media/disk2
>                      Brick16: 10.1.2.6:/media/disk2
>                      Brick17: 10.1.2.7:/media/disk2
>                      Brick18: 10.1.2.8:/media/disk2
>                      Brick19: 10.1.2.9:/media/disk2
>                      Brick20: 10.1.2.10:/media/disk2
>                      ...
>                      ....
>                      Brick351: 10.1.2.1:/media/disk36
>                      Brick352: 10.1.2.2:/media/disk36
>                      Brick353: 10.1.2.3:/media/disk36
>                      Brick354: 10.1.2.4:/media/disk36
>                      Brick355: 10.1.2.5:/media/disk36
>                      Brick356: 10.1.2.6:/media/disk36
>                      Brick357: 10.1.2.7:/media/disk36
>                      Brick358: 10.1.2.8:/media/disk36
>                      Brick359: 10.1.2.9:/media/disk36
>                      Brick360: 10.1.2.10:/media/disk36
>                      Options Reconfigured:
>                      performance.readdir-ahead: on
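>
>                      For reference, a volume with this layout can be
>                      created roughly like this (a minimal sketch, not the
>                      exact command we used; the brick ordering below is
>                      an assumption, chosen so that each 8 + 2 disperse
>                      set spans all ten nodes):
>
>                      # Build the brick list so that disk N of every node
>                      # forms one (8 data + 2 redundancy) disperse set
>                      BRICKS=""
>                      for d in $(seq 1 36); do
>                          for h in $(seq 1 10); do
>                              BRICKS="$BRICKS 10.1.2.$h:/media/disk$d"
>                          done
>                      done
>                      gluster volume create vaulttest5 disperse 10 redundancy 2 $BRICKS
>                      gluster volume start vaulttest5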
>
>                      We did some performance testing and simulated
>                      proactive self-healing for erasure coding. The
>                      disperse volume was created across nodes.
>
>                      _*Description of problem*_
>
>                      I disconnected the *network of two nodes* and tried
>                      to write some video files, and *glusterfs wrote the
>                      video files on the remaining 8 nodes perfectly*. I
>                      tried to download the uploaded file and it came back
>                      perfectly. Then I re-enabled the network of the two
>                      nodes; the proactive self-healing mechanism worked
>                      and wrote the missing chunks of data to the newly
>                      re-enabled nodes from the other 8 nodes. But when I
>                      then tried to download the same file, it showed an
>                      Input/Output error and I couldn't download it. I
>                      think there is an issue in proactive self-healing.
>
>                      We also tried the simulation with a single-node
>                      network failure and hit the same I/O error while
>                      downloading the file.
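>
>                      Since each 8 + 2 set spans all ten nodes, taking two
>                      nodes offline leaves exactly the minimum of eight
>                      fragments, which is why the writes still succeeded.
>                      One thing we could check before reading the file
>                      back is whether self-heal has really finished (a
>                      sketch; the exact output differs by version):
>
>                      # Confirm all 360 bricks are back online after the
>                      # network is restored
>                      gluster volume status vaulttest5
>                      # List entries still pending heal; read the file
>                      # back only once this is empty
>                      gluster volume heal vaulttest5 info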
>
>
>                      _Error while downloading file_
>
>                      root@master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
>
>                      sending incremental file list
>                      file13_AN
>                          3,342,355,597 100%    4.87MB/s    0:10:54 (xfr#1, to-chk=0/1)
>
>                      rsync: read errors mapping "/mnt/gluster/file13_AN": Input/output error (5)
>                      WARNING: file13_AN failed verification -- update discarded (will try again).
>
>                      root@master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
>
>                      cp: error reading ‘/mnt/gluster/file13_AN’: Input/output error
>                      cp: failed to extend ‘./1/file13_AN-3’: Input/output error
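>
>                      The failing read can also be reproduced without
>                      rsync/cp, and the client log usually points at the
>                      source of the EIO (a sketch; the log file name below
>                      is an assumption based on the /mnt/gluster mount
>                      point):
>
>                      # Read the whole file through the FUSE mount; dd
>                      # reports the offset where the I/O error occurs
>                      dd if=/mnt/gluster/file13_AN of=/dev/null bs=1M
>                      # Show recent error-level messages from the client
>                      grep ' E ' /var/log/glusterfs/mnt-gluster.log | tail -20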
>
>
>                      We can't tell whether the issue is with glusterfs
>                      3.7.0 itself or with our glusterfs configuration.
>
>                      Any help would be greatly appreciated
>
>                      --
>                      Cheers
>                      Backer
>
>
>
>                      _______________________________________________
>                      Gluster-users mailing list
>             Gluster-users at gluster.org
>             http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>
>
>         _______________________________________________
>         Gluster-users mailing list
>         Gluster-users at gluster.org
>         http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
>

