[Gluster-users] 3.7.16 with sharding corrupts VMDK files when adding and removing bricks

Joe Julian joe at julianfamily.org
Mon Nov 14 18:43:42 UTC 2016


Features and stability are not mutually exclusive. 

Sometimes instability is cured by adding a feature. 

Fixing a bug is not something that gets done better just by having more developers work on it.

Sometimes fixing one bug exposes a problem elsewhere. 

Using free open source community projects with your own hardware and system design shifts the responsibility to test more heavily onto you. If that's not a risk you can afford, you might consider contracting with a 3rd party which has "certified" installation parameters. IMHO.

On November 14, 2016 8:29:00 AM PST, Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com> wrote:
>2016-11-14 17:01 GMT+01:00 Vijay Bellur <vbellur at redhat.com>:
>> Accessing sharded data after disabling sharding is something that we
>> did not visualize as a valid use case at any point in time. Also, you
>> could access the contents by enabling sharding again. Given these
>> factors I think this particular problem has not been prioritized by
>> us.
>
>That's not true.
>If you have VMs running on a sharded volume and you disable sharding
>while the VMs are still running, everything crashes and it can lead to
>data loss: the VMs will be unable to find their filesystems, qemu
>corrupts the image, and so on.
>
>If I write to a file that was sharded (for example a log file) and you
>then disable sharding, the application keeps writing to the existing
>file (the one that was the first shard).
>If you re-enable sharding, you lose some data.
>
>Example:
>
>A 128MB file with the shard size set to 64MB. You have 2 chunks: shard1+shard2.
>
>Now you are writing to the file:
>
>AAAA
>BBBB
>CCCC
>DDDD
>
>AAAA+BBBB are placed on shard1, CCCC+DDDD are placed on shard2
>
>If you disable sharding and write some extra data, EEEE, then EEEE
>would be placed after BBBB in shard1 (growing it beyond 64MB)
>and not in a new shard3.
>
>If you re-enable sharding, EEEE is lost, as gluster would expect it to
>be in shard3, and I think gluster will read only the first 64MB from
>shard1. If gluster instead reads the whole file, you'll get something
>like this:
>
>AAAA
>BBBB
>EEEE
>CCCC
>DDDD
>
>In a text file this is bad; in a VM image, this means data
>loss/corruption that is almost impossible to fix.
>
>
>> As with many other projects, we are at a stage today where the number
>> of users and testers far outweighs the number of developers
>> contributing code. In this state it becomes hard for developers to
>> prioritize problems from a long todo list. If valuable community
>> members like you feel strongly about a bug or feature that needs
>> developer attention, please call such issues out on the mailing
>> list. We will be more than happy to help.
>
>That's why I've asked for fewer features and more stability.
>If you have to prioritize, please choose all bugs that could lead to
>data corruption or similar issues.
>_______________________________________________
>Gluster-users mailing list
>Gluster-users at gluster.org
>http://www.gluster.org/mailman/listinfo/gluster-users
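
Gandalf's example above is easy to model. Here is a rough sketch in plain
Python (not Gluster code; it simplifies the shard translator to "the base
file holds the first SHARD_SIZE bytes, numbered shard files hold the rest",
with SHARD_SIZE standing in for the 64MB shard block size) showing why the
appended data either vanishes or comes back interleaved:

SHARD_SIZE = 8  # stand-in for the 64MB shard block size in the example

# After writing AAAA BBBB CCCC DDDD with sharding enabled:
base_file = "AAAABBBB"         # "shard1": first chunk, kept in the base file
shard_files = {1: "CCCCDDDD"}  # "shard2": stored as a separate shard file

def shard_aware_read(base, shards):
    # A shard-aware client takes only SHARD_SIZE bytes from the base file,
    # then appends the numbered shard files in order.
    out = base[:SHARD_SIZE]
    for i in sorted(shards):
        out += shards[i]
    return out

# Sharding is disabled and the application appends EEEE. Without the shard
# translator the client sees only the base file, so the append lands there
# and the base file grows past SHARD_SIZE:
base_file += "EEEE"            # base file is now "AAAABBBBEEEE"

# Re-enable sharding and read the file back:
print(shard_aware_read(base_file, shard_files))  # AAAABBBBCCCCDDDD
# EEEE is silently gone: it sits past the shard boundary in the base file,
# where a shard-aware read never looks.

# If instead the whole base file were read before the shard files, you would
# get the interleaving from the example, which corrupts a VM image:
print(base_file + shard_files[1])                # AAAABBBBEEEECCCCDDDD

Either outcome matches the data loss/corruption Gandalf describes: the
appended bytes are still on disk in the base file, but a shard-aware read
will never return them in the right place.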

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.