[Gluster-users] [URGENT] Add-bricks to a volume corrupted the files

Lindsay Mathieson lindsay.mathieson at gmail.com
Fri Oct 21 02:39:07 UTC 2016


And now that I have it all set up for logging etc., I can't reproduce the error :(

Though I did manage to score a "volume rebalance: teststore1: failed:
Another transaction is in progress for teststore1. Please try again
after sometime" problem. No gluster commands would work after that. I
had to restart the glusterfsd service.
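For anyone hitting the same stuck "Another transaction is in progress" lock, a minimal recovery sketch follows. Assumptions: the stale cluster-wide lock is held by the management daemon (glusterd, not the glusterfsd brick processes), the node uses systemd, and the unit is named glusterd.service (on some distros it's glusterfs-server instead) — adjust to your setup.

```shell
# Sketch only: clear a stale volume-management lock by restarting the
# management daemon on the node that holds it. Service name is an
# assumption (glusterd.service); it varies by distribution.
if command -v systemctl >/dev/null 2>&1 \
   && systemctl list-unit-files 2>/dev/null | grep -q '^glusterd\.service'; then
    systemctl restart glusterd      # management daemon only; brick processes keep serving IO
    gluster volume status           # verify CLI commands respond again
else
    echo "glusterd service not found; skipping restart"
fi
```

Restarting glusterd alone is usually less disruptive than restarting the brick daemons, since the bricks keep serving client IO while the management layer comes back.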

On 20 October 2016 at 21:13, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> Thanks a lot, Lindsay! Appreciate the help.
>
> It would be awesome if you could tell us whether you
> see the issue with FUSE as well, while we get around
> to setting up the environment and running the test ourselves.
>
> -Krutika
>
> On Thu, Oct 20, 2016 at 2:57 AM, Lindsay Mathieson
> <lindsay.mathieson at gmail.com> wrote:
>>
>> On 20/10/2016 7:01 AM, Kevin Lemonnier wrote:
>>>
>>> Yes, you need to add a full replica set at once.
>>> I don't remember, but according to my history it looks like I've used
>>> this:
>>>
>>> gluster volume add-brick VMs host1:/brick host2:/brick host3:/brick force
>>>
>>> (I have the same without force just before that, so I assume force is
>>> needed)
>>
>>
>> Ok, I did a:
>>
>> gluster volume add-brick datastore1
>> vna.proxmox.softlog:/tank/vmdata/datastore1-2
>> vnb.proxmox.softlog:/tank/vmdata/datastore1-2
>> vng.proxmox.softlog:/tank/vmdata/datastore1-2
>>
>> I had added a 2nd windows VM as well.
>>
>> Looked like it was going OK for a while, then blew up. The first Windows
>> VM, which was running DiskMark, died and won't boot. qemu-img check shows
>> the image hopelessly corrupted. The 2nd VM has also crashed and is
>> unbootable, though qemu-img check shows the qcow2 file as OK.
>>
>>
>> I have a sneaking suspicion it's related to active IO. VM1 was doing heavy
>> IO compared to VM2; perhaps that's why its image was corrupted worse.
>>
>>
>> rebalance status looks odd to me:
>>
>> root at vna:~# gluster volume rebalance datastore1 status
>>                                    Node  Rebalanced-files    size  scanned  failures  skipped       status  run time in h:m:s
>>                               ---------  ----------------  ------  -------  --------  -------  -----------  -----------------
>>                               localhost                 0  0Bytes        0         0        0    completed              0:0:1
>>                     vnb.proxmox.softlog                 0  0Bytes        0         0        0    completed              0:0:1
>>                     vng.proxmox.softlog               328  19.2GB     1440         0        0  in progress            0:11:55
>>
>>
>> Don't know why vng is taking so much longer; the nodes are identical. But
>> maybe this is normal?
>>
>>
>> When I get time, I'll try again with:
>>
>> - all VMs shut down (no IO)
>>
>> - all VMs running off the gluster FUSE mount (no gfapi)
>>
>>
>> cheers,
>>
>>
>> --
>> Lindsay Mathieson
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>
>



-- 
Lindsay
