<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Hi Paf1,<br>

      <br>

          It looks like glusterd does not start on one of the nodes after
      a reboot, so that node gets disconnected from the other nodes (that
      is what I see in the logs). After a reboot, once your systems are up
      and running, can you check whether glusterd is running on all the
      nodes? Can you also let me know which build of gluster you are
      using?<br>

      <br>

          For more information, please read:

      <a class="moz-txt-link-freetext" href="http://www.gluster.org/pipermail/gluster-users.old/2015-June/022377.html">http://www.gluster.org/pipermail/gluster-users.old/2015-June/022377.html</a><br>

      <br>

      Thanks<br>

      kasturi<br>

      <br>

      On 11/27/2015 10:52 AM, Sahina Bose wrote:<br>

    </div>

    <blockquote cite="mid:5657E879.7030604@redhat.com" type="cite">


      [+ gluster-users]<br>

      <br>

      <div class="moz-cite-prefix">On 11/26/2015 08:37 PM, <a

          moz-do-not-send="true" class="moz-txt-link-abbreviated"

          href="mailto:paf1@email.cz">paf1@email.cz</a> wrote:<br>

      </div>

      <blockquote cite="mid:56572042.1070503@email.cz" type="cite">


        Hello,<br>
        can anybody help me with these timeouts?<br>
        The volumes are not active yet (bricks down).<br>
        <br>
        A description of the gluster setup is below.<br>

        <br>

        <b>/var/log/glusterfs/</b><b>etc-glusterfs-glusterd.vol.log</b><br>

        [2015-11-26 14:44:47.174221] I [MSGID: 106004]

        [glusterd-handler.c:5065:__glusterd_peer_rpc_notify]

        0-management: Peer &lt;1hp1-SAN&gt;

        (&lt;87fc7db8-aba8-41f2-a1cd-b77e83b17436&gt;), in state

        &lt;Peer in Cluster&gt;, has disconnected from glusterd.<br>

        [2015-11-26 14:44:47.174354] W

        [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]

        (--&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)


        [0x7fb7039d44dc]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)


        [0x7fb7039de542]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a)


        [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P1 not held<br>

        [2015-11-26 14:44:47.174444] W

        [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]

        (--&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)


        [0x7fb7039d44dc]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)


        [0x7fb7039de542]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a)


        [0x7fb703a79b4a] ) 0-management: Lock for vol 1HP12-P3 not held<br>

        [2015-11-26 14:44:47.174521] W

        [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]

        (--&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)


        [0x7fb7039d44dc]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)


        [0x7fb7039de542]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a)


        [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P1 not held<br>

        [2015-11-26 14:44:47.174662] W

        [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]

        (--&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c)


        [0x7fb7039d44dc]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162)


        [0x7fb7039de542]

        --&gt;/usr/lib64/glusterfs/3.7.6/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x58a)


        [0x7fb703a79b4a] ) 0-management: Lock for vol 2HP12-P3 not held<br>

        [2015-11-26 14:44:47.174532] W [MSGID: 106118]

        [glusterd-handler.c:5087:__glusterd_peer_rpc_notify]

        0-management: Lock not released for 2HP12-P1<br>

        [2015-11-26 14:44:47.174675] W [MSGID: 106118]

        [glusterd-handler.c:5087:__glusterd_peer_rpc_notify]

        0-management: Lock not released for 2HP12-P3<br>

        [2015-11-26 14:44:49.423334] I [MSGID: 106488]

        [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume]

        0-glusterd: Received get vol req<br>

        The message "I [MSGID: 106488]

        [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume]

        0-glusterd: Received get vol req" repeated 4 times between

        [2015-11-26 14:44:49.423334] and [2015-11-26 14:44:49.429781]<br>

        [2015-11-26 14:44:51.148711] I [MSGID: 106163]

        [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack]

        0-management: using the op-version 30702<br>

        [2015-11-26 14:44:52.177266] W [socket.c:869:__socket_keepalive]

        0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 12,

        Invalid argument<br>

        [2015-11-26 14:44:52.177291] E [socket.c:2965:socket_connect]

        0-management: Failed to set keep-alive: Invalid argument<br>

        [2015-11-26 14:44:53.180426] W [socket.c:869:__socket_keepalive]

        0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 17,

        Invalid argument<br>

        [2015-11-26 14:44:53.180447] E [socket.c:2965:socket_connect]

        0-management: Failed to set keep-alive: Invalid argument<br>

        [2015-11-26 14:44:52.395468] I [MSGID: 106163]

        [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack]

        0-management: using the op-version 30702<br>

        [2015-11-26 14:44:54.851958] I [MSGID: 106488]

        [glusterd-handler.c:1472:__glusterd_handle_cli_get_volume]

        0-glusterd: Received get vol req<br>

        [2015-11-26 14:44:57.183969] W [socket.c:869:__socket_keepalive]

        0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 19,

        Invalid argument<br>

        [2015-11-26 14:44:57.183990] E [socket.c:2965:socket_connect]

        0-management: Failed to set keep-alive: Invalid argument<br>

        <br>
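        To pull only the peer disconnects out of a long glusterd log, a small
        filter can help. This is a minimal Python sketch, assuming each log
        entry sits on a single line in the file (the entries above are wrapped
        by the mail client) and that the disconnect message keeps the wording
        shown above. The name find_disconnects is a hypothetical helper, not
        part of gluster.<br>
        <pre wrap="">
```python
import re

# Matches the peer-disconnect entry (MSGID 106004) as worded in the log above.
# \W stands in for the angle brackets that surround the host, UUID and state.
DISCONNECT_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\].*?"
    r"Peer \W([^\s()]+?)\W \(\W([0-9a-f-]+)\W\), "
    r"in state \W([^,]+?)\W, has disconnected from glusterd"
)


def find_disconnects(log_text):
    """Return (timestamp, host, uuid, state) tuples, one per disconnect entry."""
    events = []
    for line in log_text.splitlines():
        match = DISCONNECT_RE.search(line)
        if match:
            events.append(match.groups())
    return events
```
</pre>
        Reading /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on each node
        and passing its text to find_disconnects should show which peer
        dropped and when, which can then be compared with the reboot times.<br>
        <br>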

        After the volumes were created everything worked fine (volumes up),
        but then, after several reboots (yum updates), the volumes failed
        due to timeouts.<br>

        <br>

        Gluster description:<br>

        <br>

        4 nodes with 4 volumes, replica 2<br>
        oVirt 3.6 - the latest<br>
        gluster 3.7.6 - the latest<br>
        vdsm 4.17.999 - from the git repo<br>
        oVirt - mgmt nodes: 172.16.0.0<br>
        oVirt - bricks: 16.0.0.0 ("SAN" - defined as the "gluster" network)<br>
        The network works fine, no lost packets<br>

        <br>

        # gluster volume status <br>

        Staging failed on 2hp1-SAN. Please check log file for details.<br>

        Staging failed on 1hp2-SAN. Please check log file for details.<br>

        Staging failed on 2hp2-SAN. Please check log file for details.<br>

        <br>

        # gluster volume info<br>

        <br>

        Volume Name: 1HP12-P1<br>

        Type: Replicate<br>

        Volume ID: 6991e82c-9745-4203-9b0a-df202060f455<br>

        Status: Started<br>

        Number of Bricks: 1 x 2 = 2<br>

        Transport-type: tcp<br>

        Bricks:<br>

        Brick1: 1hp1-SAN:/STORAGE/p1/G<br>

        Brick2: 1hp2-SAN:/STORAGE/p1/G<br>

        Options Reconfigured:<br>

        performance.readdir-ahead: on<br>

        <br>

        Volume Name: 1HP12-P3<br>

        Type: Replicate<br>

        Volume ID: 8bbdf0cb-f9b9-4733-8388-90487aa70b30<br>

        Status: Started<br>

        Number of Bricks: 1 x 2 = 2<br>

        Transport-type: tcp<br>

        Bricks:<br>

        Brick1: 1hp1-SAN:/STORAGE/p3/G<br>

        Brick2: 1hp2-SAN:/STORAGE/p3/G<br>

        Options Reconfigured:<br>

        performance.readdir-ahead: on<br>

        <br>

        Volume Name: 2HP12-P1<br>

        Type: Replicate<br>

        Volume ID: e2cd5559-f789-4636-b06a-683e43e0d6bb<br>

        Status: Started<br>

        Number of Bricks: 1 x 2 = 2<br>

        Transport-type: tcp<br>

        Bricks:<br>

        Brick1: 2hp1-SAN:/STORAGE/p1/G<br>

        Brick2: 2hp2-SAN:/STORAGE/p1/G<br>

        Options Reconfigured:<br>

        performance.readdir-ahead: on<br>

        <br>

        Volume Name: 2HP12-P3<br>

        Type: Replicate<br>

        Volume ID: b5300c68-10b3-4ebe-9f29-805d3a641702<br>

        Status: Started<br>

        Number of Bricks: 1 x 2 = 2<br>

        Transport-type: tcp<br>

        Bricks:<br>

        Brick1: 2hp1-SAN:/STORAGE/p3/G<br>

        Brick2: 2hp2-SAN:/STORAGE/p3/G<br>

        Options Reconfigured:<br>

        performance.readdir-ahead: on<br>

        <br>
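        To see at a glance which volumes depend on which node, the volume
        info output can be grouped by brick host. This is a minimal Python
        sketch, assuming the plain-text layout shown above; parse_volume_info
        and bricks_by_host are hypothetical helper names, not gluster
        tools.<br>
        <pre wrap="">
```python
def parse_volume_info(text):
    """Parse `gluster volume info` text into a list of per-volume dicts."""
    volumes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Volume Name":
            # A new volume section starts here.
            current = {"name": value, "bricks": [], "fields": {}}
            volumes.append(current)
        elif current is None:
            continue
        elif key.startswith("Brick") and key[5:].isdigit():
            # "Brick1: host:/path" becomes ("host", "/path").
            host, _, path = value.partition(":")
            current["bricks"].append((host, path))
        elif value:
            # Section headers such as "Bricks:" have no value and are skipped.
            current["fields"][key] = value
    return volumes


def bricks_by_host(volumes):
    """Map each brick host to the names of the volumes it holds a brick for."""
    hosts = {}
    for volume in volumes:
        for host, _path in volume["bricks"]:
            hosts.setdefault(host, []).append(volume["name"])
    return hosts
```
</pre>
        For the output above, bricks_by_host would map 1hp1-SAN and 1hp2-SAN
        to 1HP12-P1 and 1HP12-P3, and 2hp1-SAN and 2hp2-SAN to 2HP12-P1 and
        2HP12-P3, so a node whose glusterd is down takes one brick of two
        volumes with it.<br>
        <br>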

        Thanks for any hints,<br>

        Paf1<br>

        <br>

        <fieldset class="mimeAttachmentHeader"></fieldset>

        <br>

        <pre wrap="">_______________________________________________

Users mailing list

<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Users@ovirt.org">Users@ovirt.org</a>

<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.ovirt.org/mailman/listinfo/users">http://lists.ovirt.org/mailman/listinfo/users</a>

</pre>

      </blockquote>

      <br>

      <br>


    </blockquote>

    <br>

  </body>

</html>