<div dir="ltr">Thank you for responding, Heiko. In the process of seeing the differences between our two scripts. First thing I noticed was that the notes states "<span style="color:rgb(0,0,0);white-space:pre-wrap">need to be defined in the /etc/hosts". Would using the IP address directly be a problem?</span></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 21, 2016 at 2:10 PM, Heiko L. <span dir="ltr"><<a href="mailto:heikol@fh-lausitz.de" target="_blank">heikol@fh-lausitz.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Am Di, 21.06.2016, 19:22 schrieb Danny Lee:<br>

On Tue, Jun 21, 2016 at 2:10 PM, Heiko L. <heikol@fh-lausitz.de> wrote:

On Tue, 21.06.2016 at 19:22, Danny Lee wrote:
> Hello,
>
>
> We are currently figuring out how to add GlusterFS to our system to make
> our systems highly available using scripts. We are using Gluster 3.7.11.
>
> Problem:
> We are trying to migrate from a non-clustered system to a 3-node GlusterFS
> replicated cluster using scripts. We have tried various things to make this
> work, but it sometimes leaves us in an undesirable state where calling
> "gluster volume heal <volname> full" returns the error message, "Launching
> heal operation to perform full self heal on volume <volname> has been
> unsuccessful on bricks that are down. Please check if all brick processes
> are running." All the brick processes are running according to
> "gluster volume status volname".
>
> Things we have tried (in order of preference):
> 1. Create the volume with 3 filesystems holding the same data
> 2. Create the volume with 2 empty filesystems and one with the data
> 3. Create the volume with only one filesystem with data, then use the
>    "add-brick" command to add the other two empty filesystems
> 4. Create the volume with one empty filesystem, mount it, copy the data
>    over to it, and finally use the "add-brick" command to add the other
>    two empty filesystems
- should be working
- read each file on /mnt/gvol, to trigger replication [2]
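A minimal sketch of such a read pass (assuming the volume is mounted at /mnt/gvol):

  # read every file once through the client mount so it gets healed to the other bricks
  find /mnt/gvol -type f -exec cat {} + > /dev/null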

> 5. Create the volume with 3 empty filesystems, mount it, and then copy
>    the data over
</span>- my favorite<br>

>
> Other things to note:
> A few minutes after the volume is created and started successfully, our
> application server starts up against it, so reads and writes may happen
> pretty quickly after the volume has started. But there is only about 50MB
> of data.
>
> Steps to reproduce (all in a script):
> # This is run by the primary node with the IP address <server-ip-1>,
> # which has the data
> systemctl restart glusterd
> gluster peer probe <server-ip-2>
> gluster peer probe <server-ip-3>
> # Wait for "gluster peer status" to report all peers in the
> # "Peer in Cluster" state
> gluster volume create <volname> replica 3 transport tcp ${BRICKS[0]} ${BRICKS[1]} ${BRICKS[2]} force
> gluster volume set <volname> nfs.disable true
> gluster volume start <volname>
> mkdir -p $MOUNT_POINT
> mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
> find $MOUNT_POINT | xargs stat

I have written a script for 2 nodes [1], but it should be at least 3 nodes.


I hope it helps you.
Regards, Heiko

>
> Note that, when we added sleeps around the gluster commands, there was a
> higher probability of success, but not 100%.
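Rather than fixed sleeps, a polling loop may be more robust. A rough sketch (the timeout and the grep pattern are guesses and would need checking against your gluster version's output):

  # wait up to ~60s for both probed peers to reach "Peer in Cluster"
  for i in $(seq 1 60); do
      [ "$(gluster peer status | grep -c 'Peer in Cluster')" -ge 2 ] && break
      sleep 1
  done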
>
> # Once the volume is started, all the clients/servers will mount the
> # gluster filesystem by polling "mountpoint -q $MOUNT_POINT":
> mkdir -p $MOUNT_POINT
> mount -t glusterfs <server-ip-1>:/volname $MOUNT_POINT
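That polling might look roughly like this (a sketch; the retry interval is arbitrary):

  # keep retrying the mount until mountpoint reports the path as mounted
  mkdir -p "$MOUNT_POINT"
  until mountpoint -q "$MOUNT_POINT"; do
      mount -t glusterfs <server-ip-1>:/volname "$MOUNT_POINT" || sleep 2
  done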
>
>
> Logs:
> *etc-glusterfs-glusterd.vol.log* in *server-ip-1*
>
>
> [2016-06-21 14:10:38.285234] I [MSGID: 106533]
> [glusterd-volume-ops.c:857:__glusterd_handle_cli_heal_volume] 0-management:
> Received heal vol req for volume volname
> [2016-06-21 14:10:38.296801] E [MSGID: 106153]
> [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Commit failed on
> <server-ip-2>. Please check log file for details.
>
>
>
> *usr-local-volname-data-mirrored-data.log* in *server-ip-1*
>
>
> [2016-06-21 14:14:39.233366] E [MSGID: 114058]
> [client-handshake.c:1524:client_query_portmap_cbk] 0-volname-client-0:
> failed to get the port number for remote subvolume. Please run 'gluster
> volume status' on server to see if brick process is running.
> *I think this is caused by the self heal daemon*
>
>
> *cmd_history.log* in *server-ip-1*
>
>
> [2016-06-21 14:10:38.298800] : volume heal volname full : FAILED : Commit
> failed on <server-ip-2>. Please check log file for details.
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

[1] http://www2.fh-lausitz.de/launic/comp/net/glusterfs/130620.glusterfs.create_brick_vol.howto.txt
    - old, limited to 2 nodes
<span class="HOEnZb"><font color="#888888"><br>
<br>
--<br>
<br>
<br>
</font></span></blockquote></div><br></div>