<div dir="ltr">Well, it&#39;s not magic: the algorithm is documented, and it is trivial to script the recreation of the file from the shards if gluster were truly unavailable:<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><br>#!/bin/bash<br>#<br># quick and dirty reconstruct file from shards<br># takes brick path and file name as arguments<br># Copyright May 20th 2016 A. Neil<br>#<br>brick=$1<br>filen=$2<br># the file on the brick holds the first block of data<br>file=$(find &quot;$brick&quot; -name &quot;$filen&quot;)<br>inode=$(ls -i &quot;$file&quot; | cut -d&#39; &#39; -f1)<br># the gfid is the name of the hard link under .glusterfs with the same inode<br>pushd &quot;$brick/.glusterfs&quot;<br>gfid=$(find . -inum &quot;$inode&quot; | cut -d&#39;/&#39; -f4)<br>popd<br># count the shards; assumes they are numbered 1..N with no gaps<br>nshard=$(ls -1 &quot;$brick/.shard/${gfid}&quot;.* | wc -l)<br>cp &quot;$file&quot; &quot;./${filen}.restored&quot;<br># append the shards in numeric order<br>for i in $(seq 1 &quot;$nshard&quot;); do cat &quot;$brick/.shard/${gfid}.$i&quot; &gt;&gt; &quot;./${filen}.restored&quot;; done</blockquote><div><div><br></div><div>Admittedly this is not as easy as pulling the image from the brick file system, but the advantages are pretty big.<div><br></div><div>The point is that each shard is small, so healing them is fast. Most of the time when you need to heal a VM, only a few blocks have changed, yet without sharding you might have to heal 10, 20 or 100GB. In my experience, if you have 30 or 40 VMs it can take hours to heal. In the limited testing I have done, I have found that yes, some VMs will experience IO timeouts, freeze, and then need to be restarted. However, at least you don&#39;t need to wait hours before you can do that.
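<div><br></div><div>One caveat on the reconstruction script above: plain concatenation only gives the right result if every shard from 1 to N is present and full-sized. A sparse image may well have missing or short shards (treat that as an assumption, it has not been checked here against every gluster version). A rough sketch of an offset-based variant follows; the function name, its arguments, and the requirement that you pass in the volume&#39;s shard block size in bytes are hypothetical choices for illustration, not anything gluster provides:</div>

```shell
#!/bin/bash
# Sketch only: rebuild a file from its base file plus shards, writing each
# shard at offset (index * block size) so that missing or short shards
# leave holes instead of shifting the data that follows them.
# Assumptions (not from the original script): the function name, its
# argument order, and that the caller knows the shard block size in bytes.
rebuild_from_shards() {
    local base=$1 shard_dir=$2 gfid=$3 block_size=$4 out=$5
    local shard idx

    # Block 0 is the base file itself.
    cp -- "$base" "$out" || return 1

    for shard in "$shard_dir/$gfid".*; do
        [ -e "$shard" ] || continue      # glob matched nothing: no shards
        idx=${shard##*.}                 # suffix after the last dot
        case $idx in
            ''|*[!0-9]*) continue ;;     # skip anything without a numeric suffix
        esac
        # seek is counted in units of bs, so the write starts at
        # idx * block_size bytes; conv=notrunc keeps what is already in $out.
        dd if="$shard" of="$out" bs="$block_size" seek="$idx" \
           conv=notrunc 2>/dev/null || return 1
    done
}
```

<div>If the volume&#39;s shard block size were 64MB, for instance, the call would look like: rebuild_from_shards &quot;$file&quot; &quot;$brick/.shard&quot; &quot;$gfid&quot; 67108864 &quot;./${filen}.restored&quot;. Because every shard is written at its own offset, the order in which the shards are visited does not matter.</div>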
</div><div><br></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 20 May 2016 at 15:20, Gandalf Corvotempesta <span dir="ltr">&lt;<a href="mailto:gandalf.corvotempesta@gmail.com" target="_blank">gandalf.corvotempesta@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><p dir="ltr">On 20 May 2016 20:14, &quot;Alastair Neil&quot; &lt;<a href="mailto:ajneil.tech@gmail.com" target="_blank">ajneil.tech@gmail.com</a>&gt; wrote:<br>

&gt;<br>

&gt; I think you are confused about what sharding does. In a sharded replica 3 volume, all the shards exist on all the replicas, so there is no distribution. Might you be confusing it with erasure coding? The upshot of sharding is that after a failure, instead of healing multi-gigabyte VM files, for example, you only heal the shards that have changed. This generally shortens the heal time dramatically.<br><br></p>

</span><p dir="ltr">I know what sharding is.<br>

It splits each file into multiple smaller chunks.</p>

<p dir="ltr">But if everything goes bad, how can I reconstruct a file from its shards without gluster? It would be a pain.<br>

Imagine tens of terabytes of shards having to be reconstructed manually ...</p>

<p dir="ltr">Anyway, how is it possible to keep a VM up and running while healing is happening on a shard? That part of the disk image is not accessible, so the VM could run into filesystem issues.</p>

</blockquote></div><br></div>