<div dir="ltr"><br><div>Hi David,</div><div><br></div><div>Do you have any bricks down? Can you please share the output of the following commands, along with the logs from the server and client nodes?</div><div><br></div><div>1) gluster volume info</div><div>2) gluster volume status</div><div>3) gluster volume bitrot &lt;volume name&gt; scrub status</div><div><br></div><div>A few more questions:</div><div><br></div><div>1) How many copies of the file were corrupted? (All, or just one?)</div><div><br></div><div>Two things I am trying to understand:</div><div><br></div><div>A) IIUC, if only one copy is corrupted, then the replication module in the gluster client should serve the data from a remaining good copy.</div><div>B) If all the copies were corrupted (or, say, enough copies to break quorum, which means 2 in the case of 3-way replication), then the application will get an error. But the error reported should be &#39;Input/Output Error&#39;, not &#39;Transport endpoint not connected&#39;. A &#39;Transport endpoint not connected&#39; error usually occurs when a brick to which the operation is directed is not connected to the client.</div><div><br></div><div><br></div><div><br></div><div>Regards,</div><div>Raghavendra</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 4, 2019 at 6:02 AM David Spisla &lt;<a href="mailto:spisla80@gmail.com" target="_blank">spisla80@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Hello Amar,</div><div>Sounds good. So far this patch has only been merged into master. I think it should be part of the next v5.x patch release!</div><div><br></div><div>Regards</div><div>David<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 4, 
2019 at 9:58 AM Amar Tumballi Suryanarayan &lt;<a href="mailto:atumball@redhat.com" target="_blank">atumball@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">Hi David, <div><br></div><div>I guess <a href="https://review.gluster.org/#/c/glusterfs/+/21996/" target="_blank">https://review.gluster.org/#/c/glusterfs/+/21996/</a> should help fix the issue. I will leave it to Raghavendra Bhat to confirm.</div><div><br></div><div>Regards,</div><div>Amar</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 1, 2019 at 8:45 PM David Spisla &lt;<a href="mailto:spisla80@gmail.com" target="_blank">spisla80@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hello Gluster Community,</div><div>I have a 4-node cluster with a replica 4 volume, so each node has a brick with a copy of the file. I tried out the bitrot functionality and corrupted the copy on the brick of node1. After that I ran an on-demand scrub, and the file was correctly marked as corrupted. <br></div><div><br></div><div>Now I try to read that file from FUSE on node1 (with the corrupt copy):</div><div>$ cat file1.txt <br>cat: file1.txt: Transport endpoint is not connected</div>The FUSE log says:</div><div dir="ltr"><br></div><div dir="ltr"><b>[2019-02-01 15:02:19.191984] E [MSGID: 114031] [client-rpc-fops_v2.c:281:client4_0_open_cbk] 0-archive1-client-0: remote operation failed. 
Path: /data/file1.txt (b432c1d6-ece2-42f2-8749-b11e058c4be3) [Input/output error]</b><br>[2019-02-01 15:02:19.192269] W [dict.c:761:dict_ref] (--&gt;/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fc642471329] --&gt;/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fc642682af5] --&gt;/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fc64a78d218] ) 0-dict: dict is NULL [Invalid argument]<br>[2019-02-01 15:02:19.192714] E [MSGID: 108009] [afr-open.c:220:afr_openfd_fix_open_cbk] 0-archive1-replicate-0: Failed to open /data/file1.txt on subvolume archive1-client-0 [Input/output error]<br><b>[2019-02-01 15:02:19.193009] W [fuse-bridge.c:2371:fuse_readv_cbk] 0-glusterfs-fuse: 147733: READ =&gt; -1 gfid=b432c1d6-ece2-42f2-8749-b11e058c4be3 fd=0x7fc60408bbb8 (Transport endpoint is not connected)</b><br>[2019-02-01 15:02:19.193653] W [MSGID: 114028] [client-lk.c:347:delete_granted_locks_owner] 0-archive1-client-0: fdctx not valid [Invalid argument]<br><br></div><div dir="ltr"><div>And from FUSE on node2 (with a healthy copy):</div><div>$ cat file1.txt <br>file1<br></div><div><br></div><div>It seems that node1 tries to read the file from its own brick, but the copy there is broken. Node2 reads the file from its own brick, which has a healthy copy, so the read succeeds.</div><div>What puzzles me is that reading the file from node1, which has the broken copy, sometimes succeeds.</div><div><br></div><div>What is the expected behaviour here? Is it possible to read a file that has a corrupted copy from any client?</div><div><br></div><div>Regards</div><div>David Spisla<br></div><div><br></div><div><br></div></div></div></div></div></div>

_______________________________________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a></blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail-m_1325824890776359779gmail-m_-3012308099005136219gmail-m_1634570276701474471gmail-m_807676680040787958gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Amar Tumballi (amarts)<br></div></div></div></div></div>

</blockquote></div></div>

</blockquote></div>