<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <br>
    <br>
    <div class="moz-cite-prefix">On 10/09/2015 05:20 PM, Adrian
      Gruntkowski wrote:<br>
    </div>
    <blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">Hello everyone,
        <div><br>
        </div>
        <div>
          <div>I'm trying to set up quorum on my cluster and hit an
            issue where taking down one node blocks writes </div>
          <div>on the affected volume. The thing is, I have 3 servers
            where 2 volumes are set up in a cross-over manner, </div>
          <div>like this: </div>
          <div><br>
          </div>
          <div>[Server1: vol1]&lt;---&gt;[Server2: vol1
            vol2]&lt;---&gt;[Server3: vol2]. </div>
          <div><br>
          </div>
          <div>The trusted pool contains 3 servers, so AFAIK taking down,
            for example, "Server3" shouldn't take down "vol2", </div>
          <div>but it does, with a "quorum not met" message in the logs:</div>
          <div><br>
          </div>
          <div>
            <div>[2015-10-09 11:12:55.386736] C
              [rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired]
              0-system_mail1-client-1: server <a moz-do-not-send="true"
                href="http://172.16.11.112:49152">172.16.11.112:49152</a>
              has not responded in the last 42 seconds, disconnecting.</div>
            <div>[2015-10-09 11:12:55.387213] E
              [rpc-clnt.c:362:saved_frames_unwind] (--&gt;
              /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
              ))))) 0-system_mail1-client-1: forced unwinding frame
              type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-10-09
              11:12:00.087425 (xid=0x517)</div>
            <div>[2015-10-09 11:12:55.387238] W [MSGID: 114031]
              [client-rpc-fops.c:2971:client3_3_lookup_cbk]
              0-system_mail1-client-1: remote operation failed. Path: /
              (00000000-0000-0000-0000-000000000001) [Transport endpoint
              is not connected]</div>
            <div>[2015-10-09 11:12:55.387429] E
              [rpc-clnt.c:362:saved_frames_unwind] (--&gt;
              /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
              ))))) 0-system_mail1-client-1: forced unwinding frame
              type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-10-09
              11:12:07.374032 (xid=0x518)</div>
            <div>[2015-10-09 11:12:55.387591] E
              [rpc-clnt.c:362:saved_frames_unwind] (--&gt;
              /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
              (--&gt;
              /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
              ))))) 0-system_mail1-client-1: forced unwinding frame
              type(GF-DUMP) op(NULL(2)) called at 2015-10-09
              11:12:13.381487 (xid=0x519)</div>
            <div>[2015-10-09 11:12:55.387614] W
              [rpc-clnt-ping.c:204:rpc_clnt_ping_cbk]
              0-system_mail1-client-1: socket disconnected</div>
            <div>[2015-10-09 11:12:55.387624] I [MSGID: 114018]
              [client.c:2042:client_rpc_notify] 0-system_mail1-client-1:
              disconnected from system_mail1-client-1. Client process
              will keep trying to connect to glusterd until brick's port
              is available</div>
            <div>[2015-10-09 11:12:55.387635] W [MSGID: 108001]
              [afr-common.c:4043:afr_notify] 0-system_mail1-replicate-0:
              Client-quorum is not met</div>
            <div>[2015-10-09 11:12:55.387959] I
              [socket.c:3362:socket_submit_request]
              0-system_mail1-client-1: not connected (priv-&gt;connected
              = 0)</div>
          </div>
          <div>
            <div>[2015-10-09 11:12:55.387972] W
              [rpc-clnt.c:1571:rpc_clnt_submit] 0-system_mail1-client-1:
              failed to submit rpc-request (XID: 0x51a Program:
              GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport
              (system_mail1-client-1)</div>
            <div>[2015-10-09 11:12:55.387982] W [MSGID: 114031]
              [client-rpc-fops.c:2971:client3_3_lookup_cbk]
              0-system_mail1-client-1: remote operation failed. Path:
              /images (a63d0ff2-cb42-4cee-9df7-477459539788) [Transport
              endpoint is not connected]</div>
            <div>[2015-10-09 11:12:55.388653] W [MSGID: 114031]
              [client-rpc-fops.c:2971:client3_3_lookup_cbk]
              0-system_mail1-client-1: remote operation failed. Path:
              (null) (00000000-0000-0000-0000-000000000000) [Transport
              endpoint is not connected]</div>
            <div>[2015-10-09 11:13:03.245909] I [MSGID: 108031]
              [afr-common.c:1745:afr_local_discovery_cbk]
              0-system_mail1-replicate-0: selecting local read_child
              system_mail1-client-0</div>
            <div>[2015-10-09 11:13:04.734547] W
              [fuse-bridge.c:1937:fuse_create_cbk] 0-glusterfs-fuse:
              1253: /x =&gt; -1 (Read-only file system)</div>
            <div>[2015-10-09 11:13:10.419069] E
              [socket.c:2332:socket_connect_finish]
              0-system_mail1-client-1: connection to <a
                moz-do-not-send="true" href="http://172.16.11.112:24007">172.16.11.112:24007</a>
              failed (Connection timed out)</div>
            <div>[2015-10-09 11:12:55.387447] W [MSGID: 114031]
              [client-rpc-fops.c:2971:client3_3_lookup_cbk]
              0-system_mail1-client-1: remote operation failed. Path: /
              (00000000-0000-0000-0000-000000000001) [Transport endpoint
              is not connected]</div>
          </div>
          <div><br>
          </div>
          <div>(Another weird thing is that the glusterfs version reported
            in the logs is 3.3, while the Debian packages installed </div>
          <div>on my system are for 3.7.3 - I don't know if it's meant to
            be that way.)</div>
          <div><br>
          </div>
          <div>Below is the output of "gluster volume info" on one of
            the servers (there are 4 volumes in my actual setup):</div>
          <div><br>
          </div>
          <div>
            <div>Volume Name: data_mail1</div>
            <div>Type: Replicate</div>
            <div>Volume ID: c2833dbe-aaa5-49d0-91d3-5abb44efb48c</div>
            <div>Status: Started</div>
            <div>Number of Bricks: 1 x 2 = 2</div>
            <div>Transport-type: tcp</div>
            <div>Bricks:</div>
            <div>Brick1: cluster-rep:/GFS/data/mail1</div>
            <div>Brick2: mail-rep:/GFS/data/mail1</div>
            <div>Options Reconfigured:</div>
            <div>cluster.quorum-count: 2</div>
            <div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
            <div>performance.readdir-ahead: on</div>
            <div>cluster.server-quorum-type: server</div>
            <div>cluster.quorum-type: fixed</div>
            <div>cluster.server-quorum-ratio: 51%</div>
            <div> </div>
            <div>Volume Name: data_www1</div>
            <div>Type: Replicate</div>
            <div>Volume ID: 385a7052-3ab5-42c2-93bc-6e10c4e7c0f1</div>
            <div>Status: Started</div>
            <div>Number of Bricks: 1 x 2 = 2</div>
            <div>Transport-type: tcp</div>
            <div>Bricks:</div>
            <div>Brick1: cluster-rep:/GFS/data/www1</div>
            <div>Brick2: web-rep:/GFS/data/www1</div>
            <div>Options Reconfigured:</div>
            <div>cluster.quorum-count: 2</div>
            <div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
            <div>performance.readdir-ahead: on</div>
            <div>cluster.server-quorum-type: server</div>
            <div>cluster.quorum-type: fixed</div>
            <div>cluster.server-quorum-ratio: 51%</div>
            <div> </div>
            <div>Volume Name: system_mail1</div>
            <div>Type: Replicate</div>
            <div>Volume ID: 82dc0617-d855-4bf0-b5e5-c4147ca15779</div>
            <div>Status: Started</div>
            <div>Number of Bricks: 1 x 2 = 2</div>
            <div>Transport-type: tcp</div>
            <div>Bricks:</div>
            <div>Brick1: cluster-rep:/GFS/system/mail1</div>
            <div>Brick2: mail-rep:/GFS/system/mail1</div>
            <div>Options Reconfigured:</div>
            <div>cluster.quorum-count: 2</div>
            <div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
            <div>performance.readdir-ahead: on</div>
            <div>cluster.server-quorum-type: server</div>
            <div>cluster.quorum-type: fixed</div>
            <div>cluster.server-quorum-ratio: 51%</div>
            <div> </div>
            <div>Volume Name: system_www1</div>
            <div>Type: Replicate</div>
            <div>Volume ID: 83868eb5-7b32-4e80-882c-e83361b267b9</div>
            <div>Status: Started</div>
            <div>Number of Bricks: 1 x 2 = 2</div>
            <div>Transport-type: tcp</div>
            <div>Bricks:</div>
            <div>Brick1: cluster-rep:/GFS/system/www1</div>
            <div>Brick2: web-rep:/GFS/system/www1</div>
            <div>Options Reconfigured:</div>
            <div>cluster.quorum-count: 2</div>
            <div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
            <div>performance.readdir-ahead: on</div>
            <div>cluster.server-quorum-type: server</div>
            <div>cluster.quorum-type: fixed</div>
            <div>cluster.server-quorum-ratio: 51%</div>
          </div>
          <div><br>
          </div>
          <div>I have also tried switching "cluster.quorum-type"
            between fixed and auto, with the same result.</div>
          <div><br>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    The client-quorum options (cluster.quorum-type and
    cluster.quorum-count) apply to AFR (the replicate translator).<br>
    You have set the type to 'fixed' with a count of '2', which means both
    bricks of the volume need to be online to meet quorum. If only
    one brick is up, the volume becomes read-only. For replica 2
    volumes there is no way to enable client-quorum without
    sacrificing high availability (i.e. the volume staying writable when
    only one brick is up).<br>
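    If you decide to stay on replica 2 and prefer availability over
    split-brain protection, the client-quorum can be relaxed per volume
    with "gluster volume set". A sketch using system_mail1 from your
    output (not a recommendation, just the trade-off spelled out):<br>

```shell
# Disable client-quorum entirely: the volume stays writable with a single
# brick up, at the cost of possible split-brains if the bricks diverge.
gluster volume set system_mail1 cluster.quorum-type none

# Or use 'auto': quorum requires more than half the bricks, with the tie
# broken in favour of the first brick. For replica 2 that means writes
# continue while brick1 is up, but stop if only brick2 survives.
gluster volume set system_mail1 cluster.quorum-type auto
```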
    <br>
    <br>
    <blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>What am I missing? Is there a way to add a "fake brick"
            to meet the quorum requirements without </div>
          <div>holding a 3rd replica of the data?</div>
        </div>
      </div>
    </blockquote>
    You can try arbiter volumes [1], which are a good compromise
    between replica 2 and regular replica 3 volumes.<br>
    Also, server-quorum (cluster.server-quorum-type and
    cluster.server-quorum-ratio) doesn't help much in avoiding data
    split-brains, so IMO you might as well disable it.<br>
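    For illustration, an arbiter brick is specified at volume-create
    time. The brick hosts below come from your volume info; the volume
    name and the arbiter path are hypothetical placeholders:<br>

```shell
# Create a replica-3 volume whose third brick is an arbiter: it stores
# only file names and metadata, not data, so it can break quorum ties
# without holding a full third copy.
gluster volume create mail1_arb replica 3 arbiter 1 \
    cluster-rep:/GFS/system/mail1 \
    mail-rep:/GFS/system/mail1 \
    web-rep:/GFS/arbiter/mail1

# Disabling server-quorum on an existing volume, as suggested above:
gluster volume set system_mail1 cluster.server-quorum-type none
```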
    -Ravi<br>
    <br>
    [1]
<a class="moz-txt-link-freetext" href="https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-arbiter-volumes.md">https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-arbiter-volumes.md</a><br>
    <br>
    <blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>--</div>
        <div>Regards,</div>
        <div>Adrian</div>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
    </blockquote>
    <br>
  </body>
</html>