<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Oct 2, 2016 at 5:49 AM, Lindsay Mathieson <span dir="ltr"><<a href="mailto:lindsay.mathieson@gmail.com" target="_blank">lindsay.mathieson@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 2/10/2016 12:48 AM, Lindsay Mathieson wrote:<br>
</span><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Only the heal count does not change - it just does not seem to start. It can take hours before it shifts, but once it does, it's quite rapid. Node 1 has restarted and the heal count has been static at 511 shards for 45 minutes now. Nodes 1 & 2 have low CPU load, node 3 has glusterfsd pegged at 800% CPU. <br>
</blockquote>
<br></span>
Ok, I had a try at systematically reproducing it this morning and was actually unable to do so - quite weird. Testing was the same as last night: move all the VMs off a server, reboot it, and wait for the healing to finish. This time I tried it with various different settings.<br>
<br>
<br>
Test 1<br>
------<br>
cluster.granular-entry-heal no<br>
cluster.locking-scheme full<br>
Shards / Min: 350 / 8<br>
<br>
<br>
Test 2<br>
------<br>
cluster.granular-entry-heal yes<br>
cluster.locking-scheme granular<br>
Shards / Min: 391 / 10<br>
<br>
Test 3<br>
------<br>
cluster.granular-entry-heal yes<br>
cluster.locking-scheme granular<br>
heal command issued<br>
Shards / Min: 358 / 11<br>
<br>
Test 4<br>
------<br>
cluster.granular-entry-heal yes<br>
cluster.locking-scheme granular<br>
heal full command issued<br>
Shards / Min: 358 / 27<br>
<br>
<br>
Best results were with cluster.granular-entry-heal=yes, cluster.locking-scheme=granular, but they were all quite good.<br>
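<br>(For reference, a minimal sketch of the commands behind these tests - the volume name "datastore" is a placeholder, not from the original post:)<br>

```shell
# Set the two options compared in the tests above.
# "datastore" is a hypothetical volume name.
gluster volume set datastore cluster.granular-entry-heal yes
gluster volume set datastore cluster.locking-scheme granular

# Trigger an index heal (Test 3) or a full heal (Test 4):
gluster volume heal datastore
gluster volume heal datastore full

# Watch the pending-heal count per brick while it runs:
gluster volume heal datastore statistics heal-count
```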
<br>
<br>
Don't know why it was so much worse last night - I/O load, CPU and memory were the same. However, one thing that is different, and which I can't easily reproduce, is that the cluster had been running for several weeks, whereas last night I rebooted all nodes. Could gluster be developing an issue after running for some time?</blockquote><div><br></div><div>From the algorithm's point of view, the only thing that matters is the amount of data that needs to heal; it doesn't depend on the cluster's age. So whether 100GB of pending heal accumulated quickly or over a few months, the time to heal should be the same.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
<br>
<br>
-- <br>
Lindsay Mathieson<br>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div>
</div></div>