[Gluster-users] Issue with Pro active self healing for Erasure coding
Mohamed Pakkeer
mdfakkeer at gmail.com
Mon Jun 15 07:25:47 UTC 2015
Hi Xavier,
When can we expect the 3.7.2 release with the fix for the I/O error we
discussed on this mail thread?
Thanks
Backer
On Wed, May 27, 2015 at 8:02 PM, Xavier Hernandez <xhernandez at datalab.es>
wrote:
> Hi again,
>
> In today's gluster meeting [1] it has been decided that 3.7.1 will be
> released urgently to solve a bug in glusterd. All fixes planned for 3.7.1
> will be moved to 3.7.2, which will be released soon after.
>
> Xavi
>
> [1]
> http://meetbot.fedoraproject.org/gluster-meeting/2015-05-27/gluster-meeting.2015-05-27-12.01.html
>
>
> On 05/27/2015 12:01 PM, Xavier Hernandez wrote:
>
>> On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:
>>
>>> Hi Xavier,
>>>
>>> Thanks for your reply. When can we expect the 3.7.1 release?
>>>
>>
>> AFAIK a beta of 3.7.1 will be released very soon.
>>
>>
>>> cheers
>>> Backer
>>>
>>> On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez
>>> <xhernandez at datalab.es> wrote:
>>>
>>> Hi,
>>>
>>> Some Input/Output error issues have been identified and fixed. These
>>> fixes will be available in 3.7.1.
>>>
>>> Xavi
>>>
>>>
>>> On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:
>>>
>>> Hi Glusterfs Experts,
>>>
>>> We are testing the glusterfs 3.7.0 tarball on our 10-node glusterfs
>>> cluster. Each node has 36 drives; please find the volume info below:
>>>
>>> Volume Name: vaulttest5
>>> Type: Distributed-Disperse
>>> Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
>>> Status: Started
>>> Number of Bricks: 36 x (8 + 2) = 360
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.1.2.1:/media/disk1
>>> Brick2: 10.1.2.2:/media/disk1
>>> Brick3: 10.1.2.3:/media/disk1
>>> Brick4: 10.1.2.4:/media/disk1
>>> Brick5: 10.1.2.5:/media/disk1
>>> Brick6: 10.1.2.6:/media/disk1
>>> Brick7: 10.1.2.7:/media/disk1
>>> Brick8: 10.1.2.8:/media/disk1
>>> Brick9: 10.1.2.9:/media/disk1
>>> Brick10: 10.1.2.10:/media/disk1
>>> Brick11: 10.1.2.1:/media/disk2
>>> Brick12: 10.1.2.2:/media/disk2
>>> Brick13: 10.1.2.3:/media/disk2
>>> Brick14: 10.1.2.4:/media/disk2
>>> Brick15: 10.1.2.5:/media/disk2
>>> Brick16: 10.1.2.6:/media/disk2
>>> Brick17: 10.1.2.7:/media/disk2
>>> Brick18: 10.1.2.8:/media/disk2
>>> Brick19: 10.1.2.9:/media/disk2
>>> Brick20: 10.1.2.10:/media/disk2
>>> ...
>>> ....
>>> Brick351: 10.1.2.1:/media/disk36
>>> Brick352: 10.1.2.2:/media/disk36
>>> Brick353: 10.1.2.3:/media/disk36
>>> Brick354: 10.1.2.4:/media/disk36
>>> Brick355: 10.1.2.5:/media/disk36
>>> Brick356: 10.1.2.6:/media/disk36
>>> Brick357: 10.1.2.7:/media/disk36
>>> Brick358: 10.1.2.8:/media/disk36
>>> Brick359: 10.1.2.9:/media/disk36
>>> Brick360: 10.1.2.10:/media/disk36
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>>
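>>> For reference, a volume with this layout can be created roughly as
>>> follows; this is a sketch, not necessarily our exact command, and it
>>> assumes each group of 10 consecutive bricks (one disk index across
>>> all 10 nodes) becomes one (8 + 2) disperse set:
>>>
>>> # build the brick list: disk1 on nodes 1-10, then disk2, ..., disk36
>>> bricks=""
>>> for d in $(seq 1 36); do
>>>   for n in $(seq 1 10); do
>>>     bricks="$bricks 10.1.2.$n:/media/disk$d"
>>>   done
>>> done
>>> # 8 data + 2 redundancy fragments per set, 36 sets in total
>>> gluster volume create vaulttest5 disperse 10 redundancy 2 $bricks
>>> gluster volume start vaulttest5
>>>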
>>> We did some performance testing and simulated proactive self-healing
>>> for erasure coding. The disperse volume was created across the nodes.
>>>
>>> Description of problem
>>>
>>> I disconnected the network on two nodes and wrote some video files;
>>> glusterfs wrote the files to the remaining 8 nodes perfectly. I
>>> downloaded one of the uploaded files and it came back intact. Then I
>>> re-enabled the network on the two nodes, and the proactive
>>> self-healing mechanism reconstructed the missing chunks of data on
>>> the re-enabled nodes from the other 8 nodes. But when I then tried
>>> to download the same file, it failed with an Input/output error and
>>> I couldn't download the file. I think there is an issue in proactive
>>> self-healing.
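>>>
>>> For what it's worth, heal progress can be monitored with something
>>> like the following, which should list any files or fragments still
>>> pending heal (assuming heal info is supported for disperse volumes
>>> in 3.7):
>>>
>>> gluster volume heal vaulttest5 info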
>>>
>>> We also tried the simulation with a single-node network failure and
>>> hit the same I/O error while downloading the file.
>>>
>>>
>>> Error while downloading the file:
>>>
>>> root@master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
>>>
>>> sending incremental file list
>>>
>>> file13_AN
>>>
>>> 3,342,355,597 100% 4.87MB/s 0:10:54 (xfr#1, to-chk=0/1)
>>>
>>> rsync: read errors mapping "/mnt/gluster/file13_AN":
>>> Input/output error (5)
>>>
>>> WARNING: file13_AN failed verification -- update discarded (will
>>> try again).
>>>
>>> root@master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
>>>
>>> cp: error reading ‘/mnt/gluster/file13_AN’: Input/output error
>>>
>>> cp: failed to extend ‘./1/file13_AN-3’: Input/output error
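>>>
>>> If it helps with debugging, the erasure-coding metadata can be dumped
>>> directly from the bricks; a sketch, assuming disperse volumes track
>>> fragment state in trusted.ec.* extended attributes and that the file
>>> keeps the same relative path on each brick:
>>>
>>> # run on each brick; compare the output across nodes
>>> getfattr -d -m . -e hex /media/disk1/file13_AN
>>>
>>> A trusted.ec.version mismatch between the healed bricks and the
>>> others would be consistent with the Input/output error on read.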
>>>
>>>
>>> We can't tell whether the issue lies with glusterfs 3.7.0 itself or
>>> with our glusterfs configuration.
>>>
>>> Any help would be greatly appreciated.
>>>
>>> --
>>> Cheers
>>> Backer
>>>
>>>
>>>
>>
>