<div dir="ltr"><br><div class="gmail_extra">

<br><div class="gmail_quote">On Thu, Mar 10, 2016 at 4:52 PM, Lindsay Mathieson <span dir="ltr">&lt;<a href="mailto:lindsay.mathieson@gmail.com" target="_blank">lindsay.mathieson@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 11/03/2016 2:24 AM, David Gossage wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  Healing is file-based, not block-based, so it saw multi-GB files that it had to recopy in full.  It had to halt all writes to those files while that happened, or it would be a never-ending cycle of re-copying the large images.  So the fact that most VMs went haywire isn&#39;t that odd.  Based on the timing of the alerts, it does look like the 2 bricks that were up kept serving images until the 3rd brick came back.  It did heal all images just fine.<br>

<br>

</blockquote>

<br>

What version are you running?  3.7.x has sharding (breaks large files into chunks) to allow much finer-grained healing, which speeds up heals a *lot*. However, it can&#39;t be applied retroactively: you have to enable sharding and then copy the VM over :(<br>

<br>

<a href="http://blog.gluster.org/2015/12/introducing-shard-translator/" rel="noreferrer" target="_blank">http://blog.gluster.org/2015/12/introducing-shard-translator/</a></blockquote><div><br></div><div>Yes, I was planning on testing that out soon at the office; I am on 3.7.  The plan is to attach an NFS mount and move disks off and back on until all are sharded.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

<br>

Regarding rolling reboots: it can be done with replicated storage, and gluster will transparently hand over client reads/writes, but for each VM image only one copy at a time can be healing; otherwise access will be blocked, as you saw.<br>

<br>

So recommended procedure:<br>

- Enable sharding<br>

- Copy VMs over<br>

- When rebooting, wait for heals to complete before rebooting the next node<br></blockquote><div><br></div><div>The odd thing is that I only rebooted the one node, so I was expecting only one copy to need healing (the one on the node I had rebooted) while the other 2 kept handling writes during the heal.  However, that was not what happened.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

NB: I thoroughly recommend 3-way replication, as you have done; it saves a lot of headaches with quorums and split brain.<span class="HOEnZb"><font color="#888888"><br>

<br>

-- <br>

Lindsay Mathieson<br>

<br>

</font></span></blockquote></div><br></div></div>
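<div>For the "wait for heals to complete before rebooting the next node" step, here is a rough sketch of how that check might be scripted. It is only an illustration, not something from the thread: it assumes the text comes from <code>gluster volume heal VOLNAME info</code> and that each brick section starts with a "Brick ..." line and ends with a "Number of entries: N" line. The function names and the sample text (host names, brick paths, image name) are made up for the example.</div>

```python
import re

# Assumed line formats, modeled on `gluster volume heal VOLNAME info` output.
_BRICK = re.compile(r"^Brick\s+\S+", re.MULTILINE)
_ENTRIES = re.compile(r"^Number of entries:\s*(\d+)\s*$", re.MULTILINE)


def pending_heal_entries(heal_info_text):
    """Sum the numeric per-brick 'Number of entries' counts."""
    return sum(int(n) for n in _ENTRIES.findall(heal_info_text))


def safe_to_reboot_next_node(heal_info_text):
    """True only if every listed brick reports a numeric count of zero.

    A brick without a numeric count (for example, one that is not
    reachable) makes the check fail, which is the conservative choice.
    """
    bricks = _BRICK.findall(heal_info_text)
    counts = _ENTRIES.findall(heal_info_text)
    return (
        bool(bricks)
        and len(counts) == len(bricks)
        and all(int(n) == 0 for n in counts)
    )


# Hypothetical sample text: one brick clean, one with a pending heal.
SAMPLE_HEALING = """\
Brick node1:/bricks/vmstore
Status: Connected
Number of entries: 0

Brick node2:/bricks/vmstore
/images/vm-101.qcow2
Status: Connected
Number of entries: 1
"""

# Hypothetical sample text: both bricks clean.
SAMPLE_CLEAN = """\
Brick node1:/bricks/vmstore
Status: Connected
Number of entries: 0

Brick node2:/bricks/vmstore
Status: Connected
Number of entries: 0
"""
```

<div>In practice the text would presumably be captured with something like <code>subprocess.run(["gluster", "volume", "heal", volname, "info"], capture_output=True, text=True).stdout</code> and polled until <code>safe_to_reboot_next_node</code> returns True. Check the output format on your own version before relying on it.</div>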