[Gluster-users] Reconstructing files from shards
WK
wkmail at bneit.com
Mon Apr 23 19:57:23 UTC 2018
On 4/23/2018 11:46 AM, Jamie Lawrence wrote:
>
> Thanks for that, WK.
>
> Do you know if those images were sparse files? My understanding is that this will not work with files with holes.
We typically use qcow2 images (compat 1.1) with metadata preallocated
(so yes, sparse), so we may be vulnerable to that hole problem. Our
Gluster setups are always simple replication, either rep3 or rep2+arb.
Since you brought it up, I recall being somewhat aware of the 'sparse'
issue at the time.
The way qcow2 images expand, and/or the fact that the images are
typically fully grown after being in production for a while, may have
given us a false positive by passing the md5 check. We literally yanked
a live gluster brick node full of mature VM images and used that to do
an offline reconstruction of the shard images. All passed the md5sum
and/or booted cleanly. We did have one or two where the md5sum was off,
but the file was otherwise fine and passed an fsck.
Upon reflection, that may have been the sparse file issue trying to get
our attention.
In the end, we have always been able to mount the volume (perhaps by
killing quorum if you are down to a single brick node) and copy the VM
files cleanly, so the 'sparse may be a problem' issue was soon forgotten
as mounting is easier and cleaner than shard reconstruction.
I'm glad you brought the issue forward again because "the conventional
wisdom" here is that shard reconstruction worked fine if we ever really
needed it <grin>.
I suppose preallocation=full or preallocation=falloc would be a logical
replacement to cover that issue, but we would have to think through the
cost/benefit of fully allocated images just because there is a very
slight chance we would have to reconstruct the shards instead of simply
copying off the image from a mount.
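
For reference, a minimal sketch of how a fully allocated image could be
created, assuming qemu-img is installed and using its qcow2 creation options
compat and preallocation. The helper names, the path and the size are made up
for illustration; only the command line is built unless create_image() is
called.

```python
import subprocess

# Preallocation modes accepted by qemu-img for qcow2 images.
VALID_MODES = ("off", "metadata", "falloc", "full")


def qemu_img_create_cmd(path, size, preallocation="falloc"):
    """Build the qemu-img command line for a preallocated qcow2 image.

    The function name and defaults are hypothetical, chosen for this sketch.
    """
    if preallocation not in VALID_MODES:
        raise ValueError(f"unknown preallocation mode: {preallocation}")
    options = f"compat=1.1,preallocation={preallocation}"
    return ["qemu-img", "create", "-f", "qcow2", "-o", options, path, size]


def create_image(path, size, preallocation="falloc"):
    """Run qemu-img; raises CalledProcessError if the command fails."""
    subprocess.run(qemu_img_create_cmd(path, size, preallocation), check=True)
```

With falloc or full the image has no holes from day one, at the cost of
consuming the whole virtual size on the bricks up front, which is the
cost/benefit question mentioned above.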
If I can ever find some spare time, I would like to redo the test with
brand new qcow2 files that are preallocated with lots of room to grow.
-wk
>
> Quoting from : http://lists.gluster.org/pipermail/gluster-devel/2017-March/052212.html
>
> - - snip
>
> 1. A non-existent/missing shard anywhere between offset $SHARD_BLOCK_SIZE
> through ceiling ($FILE_SIZE/$SHARD_BLOCK_SIZE)
> indicates a hole. When you reconstruct data from a sharded file of this
> nature, you need to take care to retain this property.
>
> 2. The above is also true for partially filled shards between offset
> $SHARD_BLOCK_SIZE through ceiling ($FILE_SIZE/$SHARD_BLOCK_SIZE).
> What do I mean by partially filled shards? Shards whose sizes are not equal
> to $SHARD_BLOCK_SIZE.
>
> In the above, $FILE_SIZE can be gotten from the
> 'trusted.glusterfs.shard.file-size' extended attribute on the base file
> (the 0th block).
>
> - - snip
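
On reading that attribute: getfattr prints binary values with a "0s" prefix
when they are base64-encoded and a "0x" prefix when hex-encoded, so feeding
the whole string to a base64 decoder fails on the prefix. A minimal sketch
of decoding it follows. It assumes (this is not stated in the quoted post,
so please verify against the shard translator source for your version) that
the first 8 bytes of the value hold the file size as a big-endian unsigned
64-bit integer. The function names are made up for illustration.

```python
import base64
import struct


def decode_getfattr_value(text):
    """Convert a getfattr-style value ('0s...' or '0x...') into raw bytes."""
    if text.startswith("0s"):
        return base64.b64decode(text[2:])
    if text.startswith(("0x", "0X")):
        return bytes.fromhex(text[2:])
    raise ValueError("expected a '0s' (base64) or '0x' (hex) prefix")


def shard_file_size(raw):
    """Return the logical file size stored in the raw xattr bytes.

    ASSUMPTION: the size is the first 8 bytes, big-endian, unsigned.
    """
    if len(raw) < 8:
        raise ValueError("xattr value is shorter than 8 bytes")
    return struct.unpack(">Q", raw[:8])[0]
```

When run as root directly on a brick, os.getxattr(path,
"trusted.glusterfs.shard.file-size") returns the raw bytes, which can be
passed straight to shard_file_size() without any prefix handling.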
>
> So it sounds like (although I am not sure, which is why I was writing in the first place) one would need to use `dd` or similar to read out ( ${trusted.glusterfs.shard.file-size} - ($SHARD_BLOCK_SIZE * count) ) bytes from the partial shard.
>
> Although I also just realized the above quote fails to explain, if a file has a hole less than $SHARD_BLOCK_SIZE in size, how we know which shard(s) are holey, so I'm back to thinking reconstruction is undocumented and unsupported except for reading the files off on a client, blowing away the volume and reconstructing. Which is a problem.
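
For what it is worth, here is a rough sketch of a hole-aware reconstruction
along the lines of the two quoted rules, not a tested or supported tool. It
assumes the usual brick layout in which the base file holds block 0 and
block N lives at .shard/<gfid>.<N>, and that block_size and file_size have
already been obtained (for example from the volume's shard block size and
the file-size xattr). All function and parameter names are hypothetical.

```python
import os
import shutil


def reconstruct(base_path, shard_dir, gfid, out_path, block_size, file_size):
    """Assemble the base file and its shards into out_path.

    Each block is written at offset index * block_size. Missing shards are
    skipped and short shards are not padded, so those ranges remain holes
    in the output. The result is finally truncated (or extended) to
    file_size.
    """
    last_index = (file_size - 1) // block_size if file_size else 0
    with open(out_path, "wb") as out:
        for index in range(last_index + 1):
            if index == 0:
                src = base_path
            else:
                src = os.path.join(shard_dir, f"{gfid}.{index}")
            if not os.path.exists(src):
                continue  # rule 1: a missing shard is a hole
            out.seek(index * block_size)
            with open(src, "rb") as shard:
                shutil.copyfileobj(shard, out)  # rule 2: short shard, hole after it
        out.truncate(file_size)
```

Holes that sit inside a shard file are read back as zeros and written as real
data here, so the content should match but the output may be less sparse than
the original. Detecting those would need something like lseek with SEEK_DATA
on the shard file itself.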
>
> -j
>
>
>> -wk
>>
>>
>> On 4/20/2018 12:44 PM, Jamie Lawrence wrote:
>>> Hello,
>>>
>>> So I have a volume on a gluster install (3.12.5) on which sharding was enabled at some point recently. (Don't know how it happened, it may have been an accidental run of an old script.) So it has been happily sharding behind our backs and it shouldn't have.
>>>
>>> I'd like to turn sharding off and reverse the files back to normal. Some of these are sparse files, so I need to account for holes. There are more than enough that I need to write a tool to do it.
>>>
>>> I saw notes ca. 3.7 saying the only way to do it was to read-off on the client-side, blow away the volume and start over. This would be extremely disruptive for us, and language I've seen reading tickets and old messages to this list make me think that isn't needed anymore, but confirmation of that would be good.
>>>
>>> The only discussion I can find are these videos[1]:
>>> http://opensource-storage.blogspot.com/2016/07/de-mystifying-gluster-shards.html
>>> , and some hints[2] that are old enough that I don't trust them without confirmation that nothing's changed. The video things don't acknowledge the existence of file holes. Also, the hint in [2] mentions using trusted.glusterfs.shard.file-size to get the size of a partly filled hole; that value looks like base64, but when I attempt to decode it, base64 complains about invalid input.
>>>
>>> In short, I can't find sufficient information to reconstruct these. Has anyone written a current, step-by-step guide on reconstructing sharded files? Or has someone written a tool so I don't have to?
>>>
>>> Thanks,
>>>
>>> -j
>>>
>>>
>>> [1] Why one would choose to annoy the crap out of their fellow gluster users by using video to convey about 80 bytes of ASCII-encoded information, I have no idea.
>>> [2]
>>> http://lists.gluster.org/pipermail/gluster-devel/2017-March/052212.html
>>>