<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 27/03/2016 12:33 AM, Lindsay
Mathieson wrote:<br>
</div>
<blockquote cite="mid:56F69DA3.5080409@gmail.com" type="cite">On
26/03/2016 11:58 PM, Pranith Kumar Karampuri wrote:
<br>
<blockquote type="cite" style="color: #000000;">
<blockquote type="cite" style="color: #000000;">Is that the same
issue I posted earlier re "gluster volume heal info" appearing
to block I/O?
<br>
<br>
</blockquote>
I don't think it is heal info that is blocking I/O. I think it
is the client triggering heal and blocking the fop until heal
completes that results in this pattern. Disabling data-heal
should get you out of this problem. </blockquote>
<br>
<br>
I tried it earlier and it didn't seem to help.
<br>
<br>
Does anything need to be restarted after cluster.data-self-heal is
set off?
<br>
</blockquote>
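For reference, this is how I had set the option (volume name datastore2; just showing the exact commands I ran in case the syntax matters):

```shell
# disable client-side data self-heal on the volume
gluster volume set datastore2 cluster.data-self-heal off

# confirm the value actually took
gluster volume get datastore2 cluster.data-self-heal
```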
<br>
<br>
Tried again this morning. I can 100% replicate the behaviour I noted in:<br>
<br>
<blockquote type="cite">After testing the heal process by killing
glusterfsd on a node I noticed the following.
<br>
<br>
- I/O continued at normal speed while glusterfsd was down.
<br>
<br>
- After restarting glusterfsd, I/O still continued as normal
<br>
<br>
- performing a "gluster volume heal datastore2 info" would show
some info then hang.
<br>
<br>
- I/O on the cluster would cease. E.g. in a VM where I was running
a command-line build of a large project, the build just stopped.
The VM itself was mostly responsive but anything that involved
accessing the disk hung.
<br>
<br>
- if I killed the "gluster volume heal datastore2 info" command
then I/O in the VMs resumed at a normal pace.
<br>
<br>
- if I then reissued the "gluster volume heal datastore2 info"
command I/O would continue for a short while (seconds - minutes)
before hanging again.
<br>
<br>
- killing the heal info command would resume I/O again.
<br>
</blockquote>
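Condensed, the sequence above is roughly (node and volume names as elsewhere in this thread; I'm assuming pkill is an acceptable stand-in for however you stop the brick process):

```shell
# on one node: kill the brick process - I/O in the VMs continues normally
pkill -f glusterfsd

# bring the dead brick back up; I/O still continues normally
gluster volume start datastore2 force

# this prints some entries, then hangs - and VM I/O stalls along with it
gluster volume heal datastore2 info

# killing the hung "heal info" command resumes I/O in the VMs
```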
<br>
<br>
iowait and cpu are under 4% on all three nodes.<br>
<br>
Even after I shut down all VMs on datastore2, "gluster volume heal
datastore2 info" hung indefinitely with no output.<br>
<br>
I had to stop/start the datastore2 volume before the info command
would work; it then returned very quickly with:<br>
<tt><br>
</tt>
<blockquote><tt>Brick vnb.proxmox.softlog:/tank/vmdata/datastore2</tt><br>
<tt>Number of entries: 0</tt><br>
<br>
<tt>Brick vng.proxmox.softlog:/tank/vmdata/datastore2</tt><br>
<tt>/.shard - Possibly undergoing heal</tt><br>
<br>
<tt>Number of entries: 1</tt><br>
<br>
<tt>Brick vna.proxmox.softlog:/tank/vmdata/datastore2</tt><br>
<tt>/.shard - Possibly undergoing heal</tt><br>
<br>
<tt>Number of entries: 1</tt></blockquote>
<br>
Unfortunately it's stayed that way for 10 minutes now.<br>
<br>
<br>
I'd like to recheck this behaviour under 3.7.7 - can I just revert
to that (Debian packages) without recreating the datastore?<br>
<br>
thanks,<br>
<br>
<br>
<br>
<pre class="moz-signature" cols="72">--
Lindsay Mathieson</pre>
</body>
</html>