<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 24/04/2016 11:12 AM, Lindsay
Mathieson wrote:<br>
</div>
<blockquote cite="mid:571C1D6A.50304@gmail.com" type="cite">Yesterday
I stopped the volume and ran a md5sum on all the shards to compare
the 3 replicas. All 15 VM images were identical except for one
(vm-307). It has 2048 shards of which 8 differed.
<br>
<br>
volume heal info lists <b class="moz-txt-star"><span
class="moz-txt-tag">*</span>no<span class="moz-txt-tag">*</span></b>
files needing to be healed.
<br>
<br>
Two things concern me:
<br>
<br>
1. How did this happen? Trust in gluster either keeping replicas
sync'd or knowing when they are not is crucial.
<br>
<br>
2. How do I force a heal of an individual file? I can find no
documentation as to this process or even if it is possible.
<br>
<br>
I do have one possible solution - delete the vm image and restore
from backup. Not ideal.
<br>
<br>
<br>
Notes:
<br>
- I did have a hard disk failure on a brick while testing. ZFS
recovered it with no errors.
<br>
<br>
- My testing was reasonably severe - server reboots and killing of
the gluster processes. All things that will happen in a cluster's
lifetime. I was pleased with how well gluster handled them.
<br>
</blockquote>
<br>
<br>
Duplicating from a separate msg how I resolved the immediate issue:<br>
<br>
I used diff3 to compare the checksums of the shards and it revealed
that seven of the shards were the same on two bricks (vna &amp; vng)
and one of the shards was the same on two other bricks (vna &amp;
vnb). Fortunately none were different on all 3 bricks :)<br>
<br>
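For illustration, the comparison step can be sketched in Python. This is a rough sketch, not what I actually ran (that was md5sum plus diff3): the function names and shard names are hypothetical, and it assumes one saved md5sum listing per brick.<br>
<br>

```python
from typing import Dict, List


def parse_md5sum(text: str) -> Dict[str, str]:
    """Parse md5sum output (one "HASH  NAME" pair per line) into {name: hash}."""
    checksums = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(None, 1)
        # In binary mode md5sum prefixes the file name with '*'; drop it.
        if name.startswith("*"):
            name = name[1:]
        checksums[name] = digest
    return checksums


def differing_shards(per_brick: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the shard names whose checksum is not identical on every brick.

    per_brick maps a brick name to its {shard name: checksum} listing.
    A shard missing from a brick is reported as differing too.
    """
    names = set()
    for listing in per_brick.values():
        names.update(listing)
    return sorted(
        name
        for name in names
        if len({listing.get(name) for listing in per_brick.values()}) != 1
    )
```

Feeding it the three listings (keyed by brick, e.g. vna, vnb, vng) gives the list of shards that need a closer look.<br>
<br>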
Using the checksums as a quorum, I deleted all the singleton shards (7
on vnb, 1 on vng), touched the file owner and issued a "heal full".
All 8 shards were restored with checksums matching the other two
bricks. A recheck of the entire set of shards for the VM showed all
3 copies as identical, and the VM itself is functioning normally.<br>
<br>
It's one way to manually heal shard mismatches which gluster
hasn't detected, if somewhat tedious. It's a method which lends
itself to automation, though.<br>
<br>
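As a starting point for that automation, the quorum decision could look roughly like the Python sketch below. It is only an assumption of how one might do it: the names are hypothetical, it only reports which copies are the odd ones out, and the actual deletion and the "heal full" are left as manual steps.<br>
<br>

```python
from collections import Counter
from typing import Dict, List


def singleton_bricks(checksums: Dict[str, str]) -> List[str]:
    """Return the bricks whose checksum disagrees with the strict majority.

    checksums maps a brick name to the checksum of one shard on that brick.
    Returns [] when all bricks agree. Raises ValueError when no checksum is
    held by more than half of the bricks, since there is then no quorum.
    """
    counts = Counter(checksums.values())
    majority, votes = counts.most_common(1)[0]
    if len(checksums) >= votes * 2:
        raise ValueError("no majority checksum; cannot pick a good copy")
    return sorted(b for b, digest in checksums.items() if digest != majority)


def deletion_plan(shards: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Build {brick: [shards to delete]} from {shard: {brick: checksum}}."""
    plan: Dict[str, List[str]] = {}
    for shard in sorted(shards):
        for brick in singleton_bricks(shards[shard]):
            plan.setdefault(brick, []).append(shard)
    return plan
```

With 3 replicas, a shard that differs on all three bricks has no majority and raises an error instead of guessing, so that case still needs a human decision.<br>
<br>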
<br>
<pre class="moz-signature" cols="72">--
Lindsay Mathieson</pre>
</body>
</html>