<div dir="ltr"><div dir="ltr">Hi Dmitry,</div><div><br></div>my comments below...<div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 29, 2020 at 11:19 AM Dmitry Antipov <<a href="mailto:dmantipov@yandex.ru">dmantipov@yandex.ru</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">For the testing purposes, I've set up a localhost-only setup with 6x16M<br>
ramdisks (formatted as ext4) mounted (with '-o user_xattr') at<br>
/tmp/ram/{0,1,2,3,4,5}, and SHARD_MIN_BLOCK_SIZE lowered to 4K in the source (so that the 64KB shard-block-size below is accepted).<br>
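<br>
Roughly, the bricks and the volume were prepared along these lines (just a sketch; the ramdisk parameters are a guess and the exact commands may have differed):<br>
<br>
# modprobe brd rd_nr=6 rd_size=16384        # 6 ramdisks of 16M (rd_size is in KiB)<br>
# for i in 0 1 2 3 4 5; do mkfs.ext4 -q /dev/ram$i; mkdir -p /tmp/ram/$i; mount -o user_xattr /dev/ram$i /tmp/ram/$i; done<br>
# gluster volume create test replica 3 [local-ip]:/tmp/ram/{0..5} force<br>
# gluster volume set test features.shard on<br>
# gluster volume set test features.shard-block-size 64KB   # accepted only because SHARD_MIN_BLOCK_SIZE was lowered<br>
# gluster volume start test<br>
<br>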
Finally, the volume is:<br>
<br>
Volume Name: test<br>
Type: Distributed-Replicate<br>
Volume ID: 241d6679-7cd7-48b4-bdc5-8bc1c9940ac3<br>
Status: Started<br>
Snapshot Count: 0<br>
Number of Bricks: 2 x 3 = 6<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: [local-ip]:/tmp/ram/0<br>
Brick2: [local-ip]:/tmp/ram/1<br>
Brick3: [local-ip]:/tmp/ram/2<br>
Brick4: [local-ip]:/tmp/ram/3<br>
Brick5: [local-ip]:/tmp/ram/4<br>
Brick6: [local-ip]:/tmp/ram/5<br>
Options Reconfigured:<br>
features.shard-block-size: 64KB<br>
features.shard: on<br>
storage.fips-mode-rchecksum: on<br>
transport.address-family: inet<br>
nfs.disable: on<br>
performance.client-io-threads: off<br>
<br>
Then I mount it under /mnt/test:<br>
<br>
# mount -t glusterfs [local-ip]:/test /mnt/test<br>
<br>
and create a 4M file on it:<br>
<br>
# dd if=/dev/random of=/mnt/test/file0 bs=1M count=4<br>
<br>
This creates 189 shard files of 64K each (63 distinct shards, each replicated 3 times), in /tmp/ram/?/.shard:<br>
<br>
/tmp/ram/0/.shard: 24<br>
/tmp/ram/1/.shard: 24<br>
/tmp/ram/2/.shard: 24<br>
/tmp/ram/3/.shard: 39<br>
/tmp/ram/4/.shard: 39<br>
/tmp/ram/5/.shard: 39<br>
<br>
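For reference, the per-brick counts above can be reproduced with something as simple as:<br>
<br>
# for d in /tmp/ram/?/.shard; do echo "$d: $(ls "$d" | wc -l)"; done<br>
<br>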
To simulate data loss, I just remove two arbitrary .shard directories,<br>
for example:<br>
<br>
# rm -rfv /tmp/ram/0/.shard /tmp/ram/5/.shard<br>
<br>
Finally, I trigger a full heal:<br>
<br>
# gluster volume heal test full<br>
<br>
and all shards under /tmp/ram/{0,5}/.shard successfully come back.<br>
<br>
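For completeness, pending entries can also be checked per brick with:<br>
<br>
# gluster volume heal test info<br>
<br>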
But things seem to go wrong with the following volume:<br>
<br>
Volume Name: test<br>
Type: Distributed-Disperse<br>
Volume ID: aa621c7e-1693-427a-9fd5-d7b38c27035e<br>
Status: Started<br>
Snapshot Count: 0<br>
Number of Bricks: 2 x (2 + 1) = 6<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: [local-ip]:/tmp/ram/0<br>
Brick2: [local-ip]:/tmp/ram/1<br>
Brick3: [local-ip]:/tmp/ram/2<br>
Brick4: [local-ip]:/tmp/ram/3<br>
Brick5: [local-ip]:/tmp/ram/4<br>
Brick6: [local-ip]:/tmp/ram/5<br>
Options Reconfigured:<br>
features.shard: on<br>
features.shard-block-size: 64KB<br>
storage.fips-mode-rchecksum: on<br>
transport.address-family: inet<br>
nfs.disable: on<br>
<br>
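This volume can be created in the same way, only with a dispersed layout; roughly (again a sketch):<br>
<br>
# gluster volume create test disperse 3 redundancy 1 [local-ip]:/tmp/ram/{0..5} force<br>
<br>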
After creating a 4M file as before, I get the same 189 shards<br>
but 32K each.</blockquote><div><br></div><div>This is normal. A dispersed volume writes an encoded fragment of each block to every brick of the subvolume. In this case it's a 2+1 configuration, so each block is divided into 2 data fragments, and a third fragment is generated for redundancy and stored on the third brick. That is why every 64KB shard shows up as a 32KB (64KB / 2) file on each brick.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> After deleting /tmp/ram/{0,5}/.shard and full heal,<br>
I was able to get all shards back. But after deleting<br>
/tmp/ram/{3,4}/.shard and full heal, I've ended up with the following:<br></blockquote><div><br></div><div>This is not right. A disperse 2+1 configuration only supports a single failure. /tmp/ram/{3,4,5} form one disperse subvolume, so wiping .shard on /tmp/ram/3 and /tmp/ram/4 destroys 2 of the 3 fragments of every shard in that subvolume, which makes them unrecoverable. Disperse uses the Reed-Solomon erasure code, which (in a 2+1 configuration) requires at least 2 healthy fragments to recover the data. The earlier test with /tmp/ram/{0,5} worked because those bricks belong to different subvolumes, so each subvolume lost only a single fragment.</div><div><br></div><div>If you want to be able to recover from 2 disk failures within the same subvolume, you need to create a 4+2 configuration.</div><div><br></div><div>To make it clearer: a 2+1 configuration is like a traditional RAID5 with 3 disks. If you lose 2 disks, data is lost. A 4+2 is similar to a RAID6.</div><div><br></div>
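<div>For illustration, a single 4+2 layout on this same 6-brick testbed could be created with something along these lines (an untested sketch, reusing the bricks from above):</div><div><br></div><div># gluster volume create test disperse 6 redundancy 2 [local-ip]:/tmp/ram/{0..5} force</div><div><br></div><div>Such a volume remains readable and healable after losing any 2 of its 6 bricks.</div><div><br></div>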
<div>Regards,</div><div><br></div><div>Xavi</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
/tmp/ram/0/.shard:<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.10<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.11<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.12<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.13<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.14<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.15<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.16<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.17<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.2<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.22<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.23<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.27<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.28<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.3<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.31<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.34<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.35<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.37<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.39<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.4<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.40<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.44<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.45<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.46<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.47<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.53<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.54<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.55<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.57<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.58<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.6<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.63<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.7<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.9<br>
<br>
/tmp/ram/1/.shard:<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.10<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.11<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.12<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.13<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.14<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.15<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.16<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.17<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.2<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.22<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.23<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.27<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.28<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.3<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.31<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.34<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.35<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.37<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.39<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.4<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.40<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.44<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.45<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.46<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.47<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.53<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.54<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.55<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.57<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.58<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.6<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.63<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.7<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.9<br>
<br>
/tmp/ram/2/.shard:<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.10<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.11<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.12<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.13<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.14<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.15<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.16<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.17<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.2<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.22<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.23<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.27<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.28<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.3<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.31<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.34<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.35<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.37<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.39<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.4<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.40<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.44<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.45<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.46<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.47<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.53<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.54<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.55<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.57<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.58<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.6<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.63<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.7<br>
-rw-r--r-- 2 root root 32768 Sep 29 12:01 951d7c52-7230-420b-b8bb-da887fffd41e.9<br>
<br>
So, /tmp/ram/{3,4}/.shard were not recovered. Even worse, /tmp/ram/5/.shard<br>
has disappeared completely. And of course this breaks all I/O on /mnt/test/file0,<br>
for example:<br>
<br>
# dd if=/dev/random of=/mnt/test/file0 bs=1M count=4<br>
dd: error writing '/mnt/test/file0': No such file or directory<br>
dd: closing output file '/mnt/test/file0': No such file or directory<br>
<br>
Any ideas on what's going on here?<br>
<br>
Dmitry<br>
</blockquote></div></div></div>