<div dir="ltr"><div class="gmail_default" style="font-family:tahoma,sans-serif">Any update here? Can I hope to see a fix incorporated into the 3.6.3 release?<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 31, 2015 at 10:53 AM, Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><span class="">
<br>
<div>On 03/31/2015 10:47 PM, Rumen Telbizov
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif">Pranith
and Atin,<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Thank
you for looking into this and confirming it's a bug. Please
log the bug yourself since I am not familiar with the
project's bug-tracking system.<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Given
its severity, and the fact that it effectively stops the
cluster from functioning properly after boot, what do you
think the timeline for fixing this issue would be? Which
version do you expect to see it fixed in?<br>
<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">In
the meantime, is there another workaround you might
suggest besides running a second mount attempt after boot
has finished?<br>
</div>
</div>
</blockquote></span>
Adding glusterd maintainers to the thread: +kaushal, +krishnan<br>
I will let them answer your questions.<span class="HOEnZb"><font color="#888888"><br>
<br>
Pranith</font></span><div><div class="h5"><br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Thank
you again for your help,<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Rumen
Telbizov<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
<br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Mar 31, 2015 at 2:53 AM,
Pranith Kumar Karampuri <span dir="ltr"><<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>
On 03/31/2015 01:55 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
On 03/31/2015 01:03 PM, Pranith Kumar Karampuri wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 03/31/2015 12:53 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 03/31/2015 12:27 PM, Pranith Kumar Karampuri
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Atin,<br>
Could it be because bricks are started
with PROC_START_NO_WAIT?<br>
</blockquote>
That's the correct analysis, Pranith. The mount was
attempted before the<br>
bricks were started. If we introduce a lag of a few
seconds between<br>
volume start and mount, the problem will go away.<br>
</blockquote>
Atin,<br>
I think one way to solve this issue is to
start the bricks with<br>
NO_WAIT so that we can handle pmap-signin, but wait for
the pmap-signins<br>
to complete before responding to cli/completing
'init'?<br>
</blockquote>
Logically it should solve the problem. We need to think
about it more<br>
from the existing design perspective.<br>
</blockquote>
</span>
Rumen,<br>
Feel free to log a bug. This should be fixed in a later
release. We can raise the bug and work on it as well, if you
prefer it that way.<span><font color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
~Atin<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Pranith<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Pranith<br>
On 03/31/2015 04:41 AM, Rumen Telbizov wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello everyone,<br>
<br>
I have a problem that I am trying to resolve
and am not sure which way to<br>
go, so I am asking for your advice.<br>
<br>
What it comes down to is that upon initial
boot of all my GlusterFS<br>
machines the shared volume doesn't get
mounted. Nevertheless, the<br>
volume is successfully created and started, and
further attempts to mount<br>
it manually succeed. I suspect what's
happening is that gluster<br>
processes/bricks/etc haven't fully started at
the time the /etc/fstab<br>
entry is read and the initial mount attempt is
being made. Again, by<br>
the time I log in and run mount -a, the
volume mounts without any<br>
issues.<br>
<br>
_Details from the logs:_<br>
<br>
[2015-03-30 22:29:04.381918] I [MSGID: 100030]<br>
[glusterfsd.c:2018:main]
0-/usr/sbin/glusterfs: Started running<br>
/usr/sbin/glusterfs version 3.6.2 (args:
/usr/sbin/glusterfs<br>
--log-file=/var/log/glusterfs/glusterfs.log
--attribute-timeout=0<br>
--entry-timeout=0 --volfile-server=localhost<br>
--volfile-server=10.12.130.21
--volfile-server=10.12.130.22<br>
--volfile-server=10.12.130.23
--volfile-id=/myvolume /opt/shared)<br>
[2015-03-30 22:29:04.394913] E
[socket.c:2267:socket_connect_finish]<br>
0-glusterfs: connection to <a href="http://127.0.0.1:24007" target="_blank">127.0.0.1:24007</a> <<a href="http://127.0.0.1:24007" target="_blank">http://127.0.0.1:24007</a>><br>
failed (Connection refused)<br>
[2015-03-30 22:29:04.394950] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: localhost (Transport
endpoint is not<br>
connected)<br>
[2015-03-30 22:29:04.394964] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.21<br>
[2015-03-30 22:29:08.390687] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: 10.12.130.21
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:08.390720] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.22<br>
[2015-03-30 22:29:11.392015] E<br>
[glusterfsd-mgmt.c:1811:mgmt_rpc_notify]
0-glusterfsd-mgmt: failed to<br>
connect with remote-host: 10.12.130.22
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:11.392050] I<br>
[glusterfsd-mgmt.c:1838:mgmt_rpc_notify]
0-glusterfsd-mgmt: connecting<br>
to next volfile server 10.12.130.23<br>
[2015-03-30 22:29:14.406429] I
[dht-shared.c:337:dht_init_regex]<br>
0-brain-dht: using regex rsync-hash-regex =
^\.(.+)\.[^.]+$<br>
[2015-03-30 22:29:14.408964] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-2: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409183] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-1: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409388] I<br>
[rpc-clnt.c:969:rpc_clnt_connection_init]
0-host-client-0: setting<br>
frame-timeout to 60<br>
[2015-03-30 22:29:14.409430] I
[client.c:2280:notify] 0-host-client-0:<br>
parent translators are ready, attempting
connect on transport<br>
[2015-03-30 22:29:14.409658] I
[client.c:2280:notify] 0-host-client-1:<br>
parent translators are ready, attempting
connect on transport<br>
[2015-03-30 22:29:14.409844] I
[client.c:2280:notify] 0-host-client-2:<br>
parent translators are ready, attempting
connect on transport<br>
Final graph:<br>
<br>
....<br>
<br>
[2015-03-30 22:29:14.411045] I
[client.c:2215:client_rpc_notify]<br>
0-host-client-2: disconnected from
host-client-2. Client process will<br>
keep trying to connect to glusterd until
brick's port is available<br>
*[2015-03-30 22:29:14.411063] E [MSGID:
108006]<br>
[afr-common.c:3591:afr_notify]
0-myvolume-replicate-0: All subvolumes<br>
are down. Going offline until atleast one of
them comes back up.<br>
*[2015-03-30 22:29:14.414871] I
[fuse-bridge.c:5080:fuse_graph_setup]<br>
0-fuse: switched to graph 0<br>
[2015-03-30 22:29:14.415003] I
[fuse-bridge.c:4009:fuse_init]<br>
0-glusterfs-fuse: FUSE inited with protocol
versions: glusterfs 7.22<br>
kernel 7.17<br>
[2015-03-30 22:29:14.415101] I
[afr-common.c:3722:afr_local_init]<br>
0-myvolume-replicate-0: no subvolumes up<br>
[2015-03-30 22:29:14.415215] I
[afr-common.c:3722:afr_local_init]<br>
0-myvolume-replicate-0: no subvolumes up<br>
[2015-03-30 22:29:14.415236] W
[fuse-bridge.c:779:fuse_attr_cbk]<br>
0-glusterfs-fuse: 2: LOOKUP() / => -1
(Transport endpoint is not<br>
connected)<br>
[2015-03-30 22:29:14.419007] I
[fuse-bridge.c:4921:fuse_thread_proc]<br>
0-fuse: unmounting /opt/shared<br>
*[2015-03-30 22:29:14.420176] W
[glusterfsd.c:1194:cleanup_and_exit]<br>
(--> 0-: received signum (15), shutting
down*<br>
[2015-03-30 22:29:14.420192] I
[fuse-bridge.c:5599:fini] 0-fuse:<br>
Unmounting '/opt/shared'.<br>
<br>
<br>
_Relevant /etc/fstab entries are:_<br>
<br>
/dev/xvdb /opt/local xfs
defaults,noatime,nodiratime 0 0<br>
<br>
localhost:/myvolume /opt/shared glusterfs<br>
defaults,_netdev,attribute-timeout=0,entry-timeout=0,log-file=/var/log/glusterfs/glusterfs.log,backup-volfile-servers=10.12.130.21:10.12.130.22:10.12.130.23<br>
<br>
0 0<br>
<br>
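The very first error in the log above is "connection to 127.0.0.1:24007 failed (Connection refused)", i.e. the fstab mount is attempted before glusterd is even listening on its management port. A minimal, non-authoritative sketch of a boot-time check that polls that port before mounting (assumptions: it relies on bash's /dev/tcp redirection, and the helper names port_open/wait_for_port are made up for illustration):<br>

```shell
#!/bin/bash
# Sketch: wait until glusterd accepts connections on its management
# port (24007, per the log above) before attempting the mount.
# Relies on bash's /dev/tcp special files; helper names are hypothetical.

port_open() {
    # Succeeds once something is accepting connections on host:port.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

wait_for_port() {
    host="$1"; port="$2"; tries="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        port_open "$host" "$port" && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Example use at boot (assumption: fstab entry as above): wait up to
# 60 seconds for glusterd, then mount the volume.
# wait_for_port localhost 24007 60 && mount /opt/shared
```

Note that this only confirms glusterd is up, not that the bricks have signed in yet, so it may still need to be combined with a mount retry.<br>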
<br>
_Volume configuration is:_<br>
<br>
Volume Name: myvolume<br>
Type: Replicate<br>
Volume ID: xxxx<br>
Status: Started<br>
Number of Bricks: 1 x 3 = 3<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: host1:/opt/local/brick<br>
Brick2: host2:/opt/local/brick<br>
Brick3: host3:/opt/local/brick<br>
Options Reconfigured:<br>
storage.health-check-interval: 5<br>
network.ping-timeout: 5<br>
nfs.disable: on<br>
auth.allow: 10.12.130.21,10.12.130.22,10.12.130.23<br>
cluster.quorum-type: auto<br>
network.frame-timeout: 60<br>
<br>
<br>
I run Debian 7 with GlusterFS
version 3.6.2-2.<br>
<br>
While I could put together some rc.local-type
script which retries<br>
mounting the volume for a while until it succeeds
or times out, I was<br>
wondering if there's a better way to solve
this problem?<br>
<br>
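For reference, the rc.local-style workaround mentioned above can be sketched as a small retry loop; this is illustrative only (the retry_until helper name and the timings are made up, not part of any GlusterFS tooling):<br>

```shell
#!/bin/sh
# Sketch of an rc.local-style workaround: keep retrying the mount
# until it succeeds or a timeout expires. retry_until is a
# hypothetical helper; the timings are illustrative.

retry_until() {
    cmd="$1"       # command to retry
    timeout="$2"   # give up after this many seconds
    interval="$3"  # seconds to wait between attempts
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if $cmd; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1
}

# Example use from rc.local (assumption: fstab entry as above):
# retry_until "mount /opt/shared" 120 5
```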
Thank you for your help.<br>
<br>
Regards,<br>
-- <br>
Rumen Telbizov<br>
Unix Systems Administrator <<a href="http://telbizov.com" target="_blank">http://telbizov.com</a>><br>
<br>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</blockquote>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
<div>
<div dir="ltr">
<div><span style="font-family:tahoma,sans-serif">Rumen
Telbizov</span>
<div><span style="font-family:tahoma,sans-serif"><a href="http://telbizov.com" target="_blank">Unix Systems Administrator</a></span></div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr"><div><span style="font-family:tahoma,sans-serif">Rumen Telbizov</span><div><span style="font-family:tahoma,sans-serif"><a href="http://telbizov.com" target="_blank">Unix Systems Administrator</a></span></div></div></div></div>
</div>