<div dir="ltr"><div><div>OK, so the log just hints to the following:<br><br>[2017-07-05 15:04:07.178204] E [MSGID: 106123] [glusterd-mgmt.c:1532:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Reset Brick on local node <br>[2017-07-05 15:04:07.178214] E [MSGID: 106123] [glusterd-replace-brick.c:649:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases] 0-management: Commit Op Failed<br><br></div>While going through the code, glusterd_op_reset_brick () failed resulting into these logs. Now I don&#39;t see any error logs generated from glusterd_op_reset_brick () which makes me thing that have we failed from a place where we log the failure in debug mode. Would you be able to restart glusterd service with debug log mode and reran this test and share the log?<br><br></div><div><div><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 5, 2017 at 9:12 PM, Gianluca Cecchi <span dir="ltr">&lt;<a href="mailto:gianluca.cecchi@gmail.com" target="_blank">gianluca.cecchi@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="gmail-">On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee <span dir="ltr">&lt;<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">And what does glusterd log indicate for these failures?<br></div></blockquote><div><br></div><div><br></div></span><div>See here in gzip format</div><div><br></div><div><a href="https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing" target="_blank">https://drive.google.com/file/<wbr>d/<wbr>0BwoPbcrMv8mvYmlRLUgyV0pFN0k/<wbr>view?usp=sharing</a> </div><div><br></div><div>It seems that on each host the peer files have been updated with a new entry &quot;hostname2&quot;:</div><div><br></div><div><div>[root@ovirt01 ~]# cat /var/lib/glusterd/peers/*</div><div>uuid=b89311fe-257f-4e44-8e15-<wbr>9bff6245d689</div><div>state=3</div><div>hostname1=ovirt02.localdomain.<wbr>local</div><div>hostname2=10.10.2.103</div><div>uuid=ec81a04c-a19c-4d31-9d82-<wbr>7543cefe79f3</div><div>state=3</div><div>hostname1=ovirt03.localdomain.<wbr>local</div><div>hostname2=10.10.2.104</div><div>[root@ovirt01 ~]# </div></div><div><br></div><div><div>[root@ovirt02 ~]# cat /var/lib/glusterd/peers/*</div><div>uuid=e9717281-a356-42aa-a579-<wbr>a4647a29a0bc</div><div>state=3</div><div>hostname1=ovirt01.localdomain.<wbr>local</div><div>hostname2=10.10.2.102</div><div>uuid=ec81a04c-a19c-4d31-9d82-<wbr>7543cefe79f3</div><div>state=3</div><div>hostname1=ovirt03.localdomain.<wbr>local</div><div>hostname2=10.10.2.104</div><div>[root@ovirt02 ~]# </div></div><div><br></div><div><div>[root@ovirt03 ~]# cat /var/lib/glusterd/peers/*</div><div>uuid=b89311fe-257f-4e44-8e15-<wbr>9bff6245d689</div><div>state=3</div><div>hostname1=ovirt02.localdomain.<wbr>local</div><div>hostname2=10.10.2.103</div><div>uuid=e9717281-a356-42aa-a579-<wbr>a4647a29a0bc</div><div>state=3</div><div>hostname1=ovirt01.localdomain.<wbr>local</div><div>hostname2=10.10.2.102</div><div>[root@ovirt03 ~]# </div></div><div><br></div><div><br></div><div>But not the gluster info on the second and third node that have lost the ovirt01/gl01 host brick information...</div><div><br></div><div>Eg on 
On Wed, Jul 5, 2017 at 9:12 PM, Gianluca Cecchi <gianluca.cecchi@gmail.com> wrote:
> On Wed, Jul 5, 2017 at 5:22 PM, Atin Mukherjee <amukherj@redhat.com> wrote:
>> And what does glusterd log indicate for these failures?
>
> See here, in gzip format:
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvYmlRLUgyV0pFN0k/view?usp=sharing
>
> It seems that on each host the peer files have been updated with a new entry "hostname2":
>
> [root@ovirt01 ~]# cat /var/lib/glusterd/peers/*
> uuid=b89311fe-257f-4e44-8e15-9bff6245d689
> state=3
> hostname1=ovirt02.localdomain.local
> hostname2=10.10.2.103
> uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
> state=3
> hostname1=ovirt03.localdomain.local
> hostname2=10.10.2.104
> [root@ovirt01 ~]#
>
> [root@ovirt02 ~]# cat /var/lib/glusterd/peers/*
> uuid=e9717281-a356-42aa-a579-a4647a29a0bc
> state=3
> hostname1=ovirt01.localdomain.local
> hostname2=10.10.2.102
> uuid=ec81a04c-a19c-4d31-9d82-7543cefe79f3
> state=3
> hostname1=ovirt03.localdomain.local
> hostname2=10.10.2.104
> [root@ovirt02 ~]#
>
> [root@ovirt03 ~]# cat /var/lib/glusterd/peers/*
> uuid=b89311fe-257f-4e44-8e15-9bff6245d689
> state=3
> hostname1=ovirt02.localdomain.local
> hostname2=10.10.2.103
> uuid=e9717281-a356-42aa-a579-a4647a29a0bc
> state=3
> hostname1=ovirt01.localdomain.local
> hostname2=10.10.2.102
> [root@ovirt03 ~]#
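The same extra addresses should also be visible from the CLI; as a quick cross-check (assuming a GlusterFS release recent enough that peer status prints secondary addresses under "Other names:" -- the exact wording may differ by version):

  # run on any node: each peer should list its second address
  gluster peer status
  # or inspect the on-disk peer records directly
  grep -H hostname /var/lib/glusterd/peers/*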
> The gluster volume info on the second and third nodes, however, has not been updated accordingly: they have lost the ovirt01/gl01 brick information...
>
> E.g. on ovirt02:
>
> [root@ovirt02 peers]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt02.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root@ovirt02 peers]#
>
> And on ovirt03:
>
> [root@ovirt03 ~]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt02.localdomain.local:/gluster/brick3/export
> Brick2: ovirt03.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root@ovirt03 ~]#
>
> While ovirt01 itself seems isolated...
>
> [root@ovirt01 ~]# gluster volume info export
>
> Volume Name: export
> Type: Replicate
> Volume ID: b00e5839-becb-47e7-844f-6ce6ce1b7153
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 0 x (2 + 1) = 1
> Transport-type: tcp
> Bricks:
> Brick1: gl01.localdomain.local:/gluster/brick3/export
> Options Reconfigured:
> transport.address-family: inet
> performance.readdir-ahead: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: off
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard: on
> features.shard-block-size: 512MB
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 6
> network.ping-timeout: 30
> user.cifs: off
> nfs.disable: on
> performance.strict-o-direct: on
> [root@ovirt01 ~]#
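For completeness, a quick way to pin down where the views diverge is to compare glusterd's on-disk definition of the volume across the three nodes. A rough sketch, assuming the standard /var/lib/glusterd state directory layout, passwordless root SSH between the hosts, and the key names found in a typical vols/<volname>/info file (adjust hostnames as needed):

  # the node(s) carrying the stale single-brick view should stand out
  for h in ovirt01 ovirt02 ovirt03; do
      echo "== $h =="
      ssh root@"$h" "grep -E '^(brick-|count=|sub_count=|replica_count=)' /var/lib/glusterd/vols/export/info"
  done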