<div dir="ltr"><div class="gmail_default" style="font-family:tahoma,sans-serif">Any update here? Can I hope to see a fix incorporated into the 3.6.3 release?<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 31, 2015 at 10:53 AM, Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<br>
<div>On 03/31/2015 10:47 PM, Rumen Telbizov
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif">Pranith
and Atin,<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Thank
you for looking into this and confirming it's a bug. Please
log the bug yourself since I am not familiar with the
project's bug-tracking system.<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Given
its severity, and the fact that it effectively stops the
cluster from functioning properly after boot, what do you
think the timeline for fixing this issue would be? Which
version do you expect to see it fixed in?<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">In
the meantime, is there another workaround you might
suggest besides running a second mount attempt after boot
has finished?<br>
</div>
</div>
</blockquote></span>
Adding glusterd maintainers to the thread: +kaushal, +krishnan<br>
I will let them answer your questions.<span class="HOEnZb"><font color="#888888"><br>
<br>
Pranith</font></span><div><div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Thank
you again for your help,<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Rumen
Telbizov<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
<br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Mar 31, 2015 at 2:53 AM,
Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>
On 03/31/2015 01:55 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
On 03/31/2015 01:03 PM, Pranith Kumar Karampuri wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 03/31/2015 12:53 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 03/31/2015 12:27 PM, Pranith Kumar Karampuri
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Atin,<br>
Could it be because bricks are started
with PROC_START_NO_WAIT?<br>
</blockquote>
That's the correct analysis, Pranith. The mount was
attempted before the<br>
bricks were started. If we introduce a lag of a few
seconds between<br>
volume start and mount, the problem will go away.<br>
</blockquote>
Atin,<br>
I think one way to solve this issue is to
start the bricks with<br>
NO_WAIT so that we can handle pmap-signin, but wait for
the pmap-signins<br>
to complete before responding to cli/completing
'init'?<br>
</blockquote>
Logically it should solve the problem. We need to think
about it more<br>
from the existing design perspective.<br>
</blockquote>
</span>
Rumen,<br>
Feel free to log a bug. This should be fixed in a later
release. We can raise the bug and work on it as well, if you
prefer it that way.<span><font color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
~Atin<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Pranith<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Pranith<br>
On 03/31/2015 04:41 AM, Rumen Telbizov wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello everyone,<br>
<br>
I have a problem that I am trying to resolve
and am not sure which way to<br>
go, so I am asking for your advice.<br>
<br>
What it comes down to is that upon initial
boot of all my GlusterFS<br>
machines the shared volume doesn't get
mounted. Nevertheless, the<br>
volume is successfully created and started, and
further attempts to mount<br>
it manually succeed. I suspect what's
happening is that gluster<br>
processes/bricks/etc haven't fully started at
the time the /etc/fstab<br>
entry is read and the initial mount attempt is
being made. Again, by<br>
the time I log in and run mount -a, the
volume mounts without any<br>
issues.<br>
<br>
_Details from the logs:_<br>
<br>
[2015-03-30 22:29:04.381918] I [MSGID: 100030]<br>
[glusterfsd.c:2018:main]
0-/usr/sbin/glusterfs: Started running<br>
/usr/sbin/glusterfs version 3.6.2 (args:
/usr/sbin/glusterfs<br>
--log-file=/var/log/glusterfs/glusterfs.log
--attribute-timeout=0<br>
--entry-timeout=0 --volfile-server=localhost<br>
--volfile-server=10.12.130.21
--volfile-server=10.12.130.22<br>
--volfile-server=10.12.130.23
--volfile-id=/myvolume /opt/shared)<br>
[2015-03-30 22:29:04.394913] E
[socket.c:2267:socket_connect_finish]<br>
0-glusterfs: connection to <a href="http://127.0.0.1:24007" target="_blank">127.0.0.1:24007</a> <<a href="http://127.0.0.1:24007" target="_blank">http://127.0.0.1:24007</a>><br>
failed (Connection refused)<br>
[2015-03-30 22:29:04.394950] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: localhost (Transport
endpoint is not<br>
connected)<br>
[2015-03-30 22:29:04.394964] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.21<br>
[2015-03-30 22:29:08.390687] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: 10.12.130.21
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:08.390720] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.22<br>
[2015-03-30 22:29:11.392015] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: 10.12.130.22
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:11.392050] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.23<br>
[2015-03-30 22:29:14.406429] I
[dht-shared.c:337:dht_init_regex]<br>
0-brain-dht: using regex rsync-hash-regex =
^\.(.+)\.[^.]+$<br>
[2015-03-30 22:29:14.408964] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-2: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409183] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-1: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409388] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-0: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409430] I
[client.c:2280:notify] 0-host-client-0:<br>
parent translators are ready, attempting
connect on transport<br>
[2015-03-30 22:29:14.409658] I
[client.c:2280:notify] 0-host-client-1:<br>
parent translators are ready, attempting
connect on transport<br>
[2015-03-30 22:29:14.409844] I
[client.c:2280:notify] 0-host-client-2:<br>
parent translators are ready, attempting
connect on transport<br>
Final graph:<br>
<br>
....<br>
<br>
[2015-03-30 22:29:14.411045] I
[client.c:2215:client_rpc_notify]<br>
0-host-client-2: disconnected from
host-client-2. Client process will<br>
keep trying to connect to glusterd until
brick's port is available<br>
*[2015-03-30 22:29:14.411063] E [MSGID:
108006]<br>
[afr-common.c:3591:afr_notify]
0-myvolume-replicate-0: All subvolumes<br>
are down. Going offline until atleast one of
them comes back up.<br>
*[2015-03-30 22:29:14.414871] I
[fuse-bridge.c:5080:fuse_graph_setup]<br>
0-fuse: switched to graph 0<br>
[2015-03-30 22:29:14.415003] I
[fuse-bridge.c:4009:fuse_init]<br>
0-glusterfs-fuse: FUSE inited with protocol
versions: glusterfs 7.22<br>
kernel 7.17<br>
[2015-03-30 22:29:14.415101] I
[afr-common.c:3722:afr_local_init]<br>
0-myvolume-replicate-0: no subvolumes up<br>
[2015-03-30 22:29:14.415215] I
[afr-common.c:3722:afr_local_init]<br>
0-myvolume-replicate-0: no subvolumes up<br>
[2015-03-30 22:29:14.415236] W
[fuse-bridge.c:779:fuse_attr_cbk]<br>
0-glusterfs-fuse: 2: LOOKUP() / => -1
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:14.419007] I
[fuse-bridge.c:4921:fuse_thread_proc]<br>
0-fuse: unmounting /opt/shared<br>
*[2015-03-30 22:29:14.420176] W
[glusterfsd.c:1194:cleanup_and_exit]<br>
(--> 0-: received signum (15), shutting
down*<br>
[2015-03-30 22:29:14.420192] I
[fuse-bridge.c:5599:fini] 0-fuse:<br>
Unmounting '/opt/shared'.<br>
<br>
<br>
_Relevant /etc/fstab entries are:_<br>
<br>
/dev/xvdb /opt/local xfs
defaults,noatime,nodiratime 0 0<br>
<br>
localhost:/myvolume /opt/shared glusterfs<br>
defaults,_netdev,attribute-timeout=0,entry-timeout=0,log-file=/var/log/glusterfs/glusterfs.log,backup-volfile-servers=10.12.130.21:10.12.130.22:10.12.130.23<br>
<br>
0 0<br>
<br>
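The very first error in the log above is "connection to 127.0.0.1:24007 failed (Connection refused)", i.e. the fstab mount is attempted before glusterd is even listening on its management port. A minimal, non-authoritative sketch of a boot-time check that polls that port before mounting (assumptions: it relies on bash's /dev/tcp redirection, and the helper names port_open/wait_for_port are made up for illustration):<br>

```shell
#!/bin/bash
# Sketch: wait until glusterd accepts connections on its management
# port (24007, per the log above) before attempting the mount.
# Relies on bash's /dev/tcp special files; helper names are hypothetical.

port_open() {
    # Succeeds once something is accepting connections on host:port.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

wait_for_port() {
    host="$1"; port="$2"; tries="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        port_open "$host" "$port" && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Example use at boot (assumption: fstab entry as above): wait up to
# 60 seconds for glusterd, then mount the volume.
# wait_for_port localhost 24007 60 && mount /opt/shared
```

Note that this only confirms glusterd is up, not that the bricks have signed in yet, so it may still need to be combined with a mount retry.<br>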
<br>
_Volume configuration is:_<br>
<br>
Volume Name: myvolume<br>
Type: Replicate<br>
Volume ID: xxxx<br>
Status: Started<br>
Number of Bricks: 1 x 3 = 3<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: host1:/opt/local/brick<br>
Brick2: host2:/opt/local/brick<br>
Brick3: host3:/opt/local/brick<br>
Options Reconfigured:<br>
storage.health-check-interval: 5<br>
network.ping-timeout: 5<br>
nfs.disable: on<br>
auth.allow: 10.12.130.21,10.12.130.22,10.12.130.23<br>
cluster.quorum-type: auto<br>
network.frame-timeout: 60<br>
<br>
<br>
I run Debian 7 with GlusterFS
version 3.6.2-2.<br>
<br>
While I could put together some rc.local-type
script which retries<br>
mounting the volume for a while until it succeeds
or times out, I was<br>
wondering if there's a better way to solve
this problem?<br>
<br>
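For reference, the rc.local-style workaround mentioned above can be sketched as a small retry loop; this is illustrative only (the retry_until helper name and the timings are made up, not part of any GlusterFS tooling):<br>

```shell
#!/bin/sh
# Sketch of an rc.local-style workaround: keep retrying the mount
# until it succeeds or a timeout expires. retry_until is a
# hypothetical helper; the timings are illustrative.

retry_until() {
    cmd="$1"       # command to retry
    timeout="$2"   # give up after this many seconds
    interval="$3"  # seconds to wait between attempts
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if $cmd; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Example use from rc.local (assumption: fstab entry as above):
# retry_until "mount /opt/shared" 120 5
```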
Thank you for your help.<br>
<br>
Regards,<br>
-- <br>
Rumen Telbizov<br>
Unix Systems Administrator <<a href="http://telbizov.com" target="_blank">http://telbizov.com</a>><br>
<br>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</blockquote>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
<div>
<div dir="ltr">
<div><span style="font-family:tahoma,sans-serif">Rumen
Telbizov</span>
<div><span style="font-family:tahoma,sans-serif"><a href="http://telbizov.com" target="_blank">Unix Systems Administrator</a></span></div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div><span style="font-family:tahoma,sans-serif">Rumen Telbizov</span><div><span style="font-family:tahoma,sans-serif"><a href="http://telbizov.com" target="_blank">Unix Systems Administrator</a></span></div></div></div></div>
</div>