<div dir="ltr">Thank you for responding, Heiko. In the process of seeing the differences between our two scripts. First thing I noticed was that the notes states "<span style="color:rgb(0,0,0);white-space:pre-wrap">need to be defined in the /etc/hosts". Would using the IP address directly be a problem?</span></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 21, 2016 at 2:10 PM, Heiko L. <span dir="ltr"><<a href="mailto:heikol@fh-lausitz.de" target="_blank">heikol@fh-lausitz.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Am Di, 21.06.2016, 19:22 schrieb Danny Lee:<br>

On Tue, Jun 21, 2016 at 2:10 PM, Heiko L. <heikol@fh-lausitz.de> wrote:

On Tue, 21.06.2016 at 19:22, Danny Lee wrote:
> Hello,
>
>
> We are currently figuring out how to add GlusterFS to our system to make
> our systems highly available using scripts. We are using Gluster 3.7.11.
>
> Problem:
> We are trying to migrate from a non-clustered system to a 3-node GlusterFS
> replicated cluster using scripts. We have tried various things to make this
> work, but it sometimes leaves us in an undesirable state where calling
> "gluster volume heal <volname> full" returns the error message, "Launching
> heal operation to perform full self heal on volume <volname> has been
> unsuccessful on bricks that are down. Please check if all brick processes
> are running." All the brick processes are running according to
> "gluster volume status volname".
>
> Things we have tried (in order of preference):
> 1. Create the volume with 3 filesystems holding the same data
> 2. Create the volume with 2 empty filesystems and one with the data
> 3. Create the volume with only one filesystem with data, then use the
>    "add-brick" command to add the other two empty filesystems
> 4. Create the volume with one empty filesystem, mount it, copy the data
>    over to it, and finally use the "add-brick" command to add the other
>    two empty filesystems
- should be working
- read each file on /mnt/gvol, to trigger replication [2]
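A minimal sketch of such a read pass (assuming the volume is mounted at /mnt/gvol):

  # read every file once through the client mount so it gets healed to the other bricks
  find /mnt/gvol -type f -exec cat {} + > /dev/null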

> 5. Create the volume with 3 empty filesystems, mount it, and then copy
>    the data over
</span>- my favorite<br>

>
> Other things to note:
> A few minutes after the volume is created and started successfully, our
> application server starts up against it, so reads and writes may happen
> pretty quickly after the volume has started. But there is only about 50MB
> of data.
>
> Steps to reproduce (all in a script):
> # This is run by the primary node with the IP address <server-ip-1>,
> # which has the data
> systemctl restart glusterd
> gluster peer probe <server-ip-2>
> gluster peer probe <server-ip-3>
> # Wait for "gluster peer status" to report all peers in the
> # "Peer in Cluster" state
> gluster volume create <volname> replica 3 transport tcp ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
> gluster volume set <volname> nfs.disable true
> gluster volume start <volname>
> mkdir -p $MOUNT_POINT
> mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
> find $MOUNT_POINT | xargs stat

I have written a script for 2 nodes [1], but it should be at least 3 nodes.


I hope it helps you.
Regards, Heiko

>
> Note that, when we added sleeps around the gluster commands, there was a
> higher probability of success, but not 100%.
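Rather than fixed sleeps, a polling loop may be more robust. A rough sketch (the timeout and the grep pattern are guesses and would need checking against your gluster version's output):

  # wait up to ~60s for both probed peers to reach "Peer in Cluster"
  for i in $(seq 1 60); do
      [ "$(gluster peer status | grep -c 'Peer in Cluster')" -ge 2 ] && break
      sleep 1
  done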
>
> # Once the volume is started, all the clients/servers will mount the
> # gluster filesystem by polling "mountpoint -q $MOUNT_POINT":
> mkdir -p $MOUNT_POINT
> mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
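That polling might look roughly like this (a sketch; the retry interval is arbitrary):

  # keep retrying the mount until mountpoint reports the path as mounted
  mkdir -p "$MOUNT_POINT"
  until mountpoint -q "$MOUNT_POINT"; do
      mount -t glusterfs <server-ip-1>:/volname "$MOUNT_POINT" || sleep 2
  done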
>
>
> Logs:
> *etc-glusterfs-glusterd.vol.log* in *server-ip-1*
>
>
> [2016-06-21 14:10:38.285234] I [MSGID: 106533]
> [glusterd-volume-ops.c:857:__glusterd_handle_cli_heal_volume] 0-management:
> Received heal vol req for volume volname
> [2016-06-21 14:10:38.296801] E [MSGID: 106153]
> [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Commit failed on
> <server-ip-2>. Please check log file for details.
>
>
>
> *usr-local-volname-data-mirrored-data.log* in *server-ip-1*
>
>
> [2016-06-21 14:14:39.233366] E [MSGID: 114058]
> [client-handshake.c:1524:client_query_portmap_cbk] 0-volname-client-0:
> failed to get the port number for remote subvolume. Please run 'gluster
> volume status' on server to see if brick process is running.
> *I think this is caused by the self heal daemon*
>
>
> *cmd_history.log* in *server-ip-1*
>
>
> [2016-06-21 14:10:38.298800] : volume heal volname full : FAILED : Commit
> failed on <server-ip-2>. Please check log file for details.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

[1] http://www2.fh-lausitz.de/launic/comp/net/glusterfs/130620.glusterfs.create_brick_vol.howto.txt
    - old, limited to 2 nodes
<span class="HOEnZb"><font color="#888888"><br>
<br>
--<br>
<br>
<br>
</font></span></blockquote></div><br></div>