<div dir="ltr"><div><div>Could you please attach the brick logs and glustershd logs?<br></div>Please also share the volume configuration (`gluster volume info`).<br><br></div>-Krutika<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Aug 15, 2016 at 12:19 PM, Lindsay Mathieson <span dir="ltr">&lt;<a href="mailto:lindsay.mathieson@gmail.com" target="_blank">lindsay.mathieson@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Moved to a new subject as it&#39;s now an issue on our cluster.<br>

<br>

As an experiment I killed glusterfsd on one node. System kept<br>

chugging along fine with no hiccups. I ran a few disk-intensive VMs<br>

on that node and others, with no real slowdown. I monitored it with &quot;heal<br>

statistics heal-count&quot;.<br>

<br>

When heal-count got up to approx. 2500 shards, I restarted glusterfsd by<br>

restarting the gluster service (glusterd).<br>

<br>

heal-count stopped rising, but what is concerning is that it doesn&#39;t<br>

seem to be going back down. 45 minutes later it is stable at 2439 files<br>

needing heal, and glusterfsd is thrashing the CPUs on that node<br>

(1000%!)<br>

<br>

The glfsheal log has no entries at all.<br>

<br>

Previously (3.7.x), when I did this test, heals kicked in very rapidly.<br>

<br>

<br>

Three hours later, there is still no progress in the heal at all. VMs on other<br>

nodes are getting occasional read timeouts.<br>

<br>

heal-count = 2550, and not changing.<br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Lindsay<br>

______________________________<wbr>_________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>

<a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br>

</font></span></blockquote></div><br></div>