<div dir="ltr">I'll leave it to others to help debug slow heal...<div><br></div><div>As for 'heal info' taking a long time, you can use `gluster vol heal gv1 info summmary` to just get the counts. That will probably get you the stats you are really interested in (whether heal is progressing).</div><div><br></div><div>-John</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Oct 23, 2018 at 5:31 AM hsafe <<a href="mailto:hsafe@devopt.net">hsafe@devopt.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello all,<br>
-John

On Tue, Oct 23, 2018 at 5:31 AM hsafe <hsafe@devopt.net> wrote:

Hello all,

Can somebody please respond to this? As of now, if I run "gluster volume heal gv1 info", it prints a seemingly endless list of gfids. Usually, in a stable scenario, this ends with some counts and a status, but currently it never finishes. Is that a bad sign? Is it some kind of loop? Is there anything I need to do outside of Gluster?

Appreciate any help...

On 10/21/18 8:05 AM, hsafe wrote:
> Hello all gluster community,
>
> I am in a situation I have not seen in my past year of using GlusterFS: a 2-replica set of glusterfs 3.10.12 servers that serve as the storage backend of my application, which saves small images onto them.
>
> The problem I am facing, for the first time, is this: previously, whenever the replicas got out of sync or one server went down, bringing it back up would start the self-heal and we would eventually see the volume back in sync. Now, however, if I run "gluster volume heal gv1 info", the list of gfids does not finish even after a couple of hours. Looking at the heal log, I can see the process is ongoing, but at a very small scale and speed!
>
> My questions are: when can I expect it to finish, and how can I speed it up?
>
> Here is a bit of info:
>
> Status of volume: gv1
> Gluster process                         TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick IMG-01:/images/storage/brick1     49152     0          Y       4176
> Brick IMG-02:/images/storage/brick1     49152     0          Y       4095
> Self-heal Daemon on localhost           N/A       N/A        Y       4067
> Self-heal Daemon on IMG-01              N/A       N/A        Y       4146
>
> Task Status of Volume gv1
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Status of volume: gv2
> Gluster process                         TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick IMG-01:/data/brick2               49153     0          Y       4185
> Brick IMG-02:/data/brick2               49153     0          Y       4104
> NFS Server on localhost                 N/A       N/A        N       N/A
> Self-heal Daemon on localhost           N/A       N/A        Y       4067
> NFS Server on IMG-01                    N/A       N/A        N       N/A
> Self-heal Daemon on IMG-01              N/A       N/A        Y       4146
>
> Task Status of Volume gv2
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> gluster> peer status
> Number of Peers: 1
>
> Hostname: IMG-01
> Uuid: 5faf60fc-7f5c-4c6e-aa3f-802482391c1b
> State: Peer in Cluster (Connected)
> gluster> exit
> root@NAS02:/var/log/glusterfs# gluster volume gv1 info
> unrecognized word: gv1 (position 1)
> root@NAS02:/var/log/glusterfs# gluster volume info
>
> Volume Name: gv1
> Type: Replicate
> Volume ID: f1c955a1-7a92-4b1b-acb5-8b72b41aaace
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: IMG-01:/images/storage/brick1
> Brick2: IMG-02:/images/storage/brick1
> Options Reconfigured:
> server.event-threads: 4
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> cluster.lookup-optimize: on
> cluster.shd-max-threads: 4
> cluster.readdir-optimize: on
> performance.md-cache-timeout: 30
> cluster.background-self-heal-count: 32
> server.statedump-path: /tmp
> performance.readdir-ahead: on
> nfs.disable: true
> network.inode-lru-limit: 50000
> features.bitrot: off
> features.scrub: Inactive
> performance.cache-max-file-size: 16MB
> client.event-threads: 8
> cluster.eager-lock: on
> cluster.self-heal-daemon: enable
>
> Please do help me out...Thanks
>
--
Hamid Safe
www.devopt.net
+989361491768