<div dir="ltr"><div>Could you share the client logs, along with the approximate date and time when you saw this issue?<br><br></div>-Krutika<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Apr 16, 2016 at 12:57 AM, Kevin Lemonnier <span dir="ltr">&lt;<a href="mailto:lemonnierk@ulrar.net" target="_blank">lemonnierk@ulrar.net</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>

<br>

We have a small GlusterFS 3.7.6 cluster with 3 nodes, with Proxmox VMs running on it. I set up the recommended options, such as those from the virt group, but<br>

by hand since it&#39;s on Debian. The shards are 256MB, if that matters.<br>

<br>

This morning the second node crashed, and when it came back up it started a heal, but that basically froze all the VMs running on that volume. Since we really<br>

can&#39;t have 40 minutes of downtime in the middle of the day, I just removed the node from the network, which stopped the heal and allowed the VMs to access<br>

their disks again. The plan was to reconnect the node in a couple of hours to let it heal at night.<br>

But a VM has now crashed, and it can&#39;t boot up again: it seems to freeze trying to access the disks.<br>

<br>

Looking at the heal info for the volume, the number of entries has gone way up since this morning; it looks like the VMs aren&#39;t writing to both nodes, just the one they are on.<br>

That seems pretty bad. We have 2 of 3 nodes up, so I would expect the volume to work just fine since it has quorum. What am I missing?<br>

<br>

It is still too early to start the heal. Is there a way to start the VM anyway right now? It was running a moment ago, so the data is there; the volume just needs<br>

to let the VM access it.<br>

<br>

<br>

<br>

Volume Name: vm-storage<br>

Type: Replicate<br>

Volume ID: a5b19324-f032-4136-aaac-5e9a4c88aaef<br>

Status: Started<br>

Number of Bricks: 1 x 3 = 3<br>

Transport-type: tcp<br>

Bricks:<br>

Brick1: first_node:/mnt/vg1-storage<br>

Brick2: second_node:/mnt/vg1-storage<br>

Brick3: third_node:/mnt/vg1-storage<br>

Options Reconfigured:<br>

cluster.quorum-type: auto<br>

cluster.server-quorum-type: server<br>

network.remote-dio: enable<br>

cluster.eager-lock: enable<br>

performance.readdir-ahead: on<br>

performance.quick-read: off<br>

performance.read-ahead: off<br>

performance.io-cache: off<br>

performance.stat-prefetch: off<br>

features.shard: on<br>

features.shard-block-size: 256MB<br>

cluster.server-quorum-ratio: 51%<br>

<br>

<br>

Thanks for your help<br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Kevin Lemonnier<br>

PGP Fingerprint : 89A5 2283 04A0 E6E9 0111<br>

</font></span><br></blockquote></div><br></div>