<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 10/09/2015 05:20 PM, Adrian
Gruntkowski wrote:<br>
</div>
<blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
type="cite">
<div dir="ltr">Hello everyone,
<div><br>
</div>
<div>
<div>I'm trying to set up quorum on my cluster and have hit
an issue where taking down one node blocks writes on the
affected volume. The thing is, I have 3 servers where 2
volumes are set up in a cross-over manner, like this:</div>
<div><br>
</div>
<div>[Server1: vol1]&lt;---&gt;[Server2: vol1
vol2]&lt;---&gt;[Server3: vol2].</div>
<div><br>
</div>
<div>The trusted pool contains 3 servers, so AFAIK taking
down, for example, "Server3" shouldn't take down "vol2" -
but it does, with a "quorum not met" message in the
logs:</div>
<div><br>
</div>
<div>
<div>[2015-10-09 11:12:55.386736] C
[rpc-clnt-ping.c:161:rpc_clnt_ping_timer_expired]
0-system_mail1-client-1: server <a moz-do-not-send="true"
href="http://172.16.11.112:49152">172.16.11.112:49152</a>
has not responded in the last 42 seconds, disconnecting.</div>
<div>[2015-10-09 11:12:55.387213] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
))))) 0-system_mail1-client-1: forced unwinding frame
type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-10-09
11:12:00.087425 (xid=0x517)</div>
<div>[2015-10-09 11:12:55.387238] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk]
0-system_mail1-client-1: remote operation failed. Path: /
(00000000-0000-0000-0000-000000000001) [Transport endpoint
is not connected]</div>
<div>[2015-10-09 11:12:55.387429] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
))))) 0-system_mail1-client-1: forced unwinding frame
type(GlusterFS 3.3) op(LOOKUP(27)) called at 2015-10-09
11:12:07.374032 (xid=0x518)</div>
<div>[2015-10-09 11:12:55.387591] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x19a)[0x7f950b98340a]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f950b74e4df]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f950b74e5fe]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x9c)[0x7f950b74fdcc]
(-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x48)[0x7f950b750578]
))))) 0-system_mail1-client-1: forced unwinding frame
type(GF-DUMP) op(NULL(2)) called at 2015-10-09
11:12:13.381487 (xid=0x519)</div>
<div>[2015-10-09 11:12:55.387614] W
[rpc-clnt-ping.c:204:rpc_clnt_ping_cbk]
0-system_mail1-client-1: socket disconnected</div>
<div>[2015-10-09 11:12:55.387624] I [MSGID: 114018]
[client.c:2042:client_rpc_notify] 0-system_mail1-client-1:
disconnected from system_mail1-client-1. Client process
will keep trying to connect to glusterd until brick's port
is available</div>
<div>[2015-10-09 11:12:55.387635] W [MSGID: 108001]
[afr-common.c:4043:afr_notify] 0-system_mail1-replicate-0:
Client-quorum is not met</div>
<div>[2015-10-09 11:12:55.387959] I
[socket.c:3362:socket_submit_request]
0-system_mail1-client-1: not connected (priv->connected
= 0)</div>
</div>
<div>
<div>[2015-10-09 11:12:55.387972] W
[rpc-clnt.c:1571:rpc_clnt_submit] 0-system_mail1-client-1:
failed to submit rpc-request (XID: 0x51a Program:
GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport
(system_mail1-client-1)</div>
<div>[2015-10-09 11:12:55.387982] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk]
0-system_mail1-client-1: remote operation failed. Path:
/images (a63d0ff2-cb42-4cee-9df7-477459539788) [Transport
endpoint is not connected]</div>
<div>[2015-10-09 11:12:55.388653] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk]
0-system_mail1-client-1: remote operation failed. Path:
(null) (00000000-0000-0000-0000-000000000000) [Transport
endpoint is not connected]</div>
<div>[2015-10-09 11:13:03.245909] I [MSGID: 108031]
[afr-common.c:1745:afr_local_discovery_cbk]
0-system_mail1-replicate-0: selecting local read_child
system_mail1-client-0</div>
<div>[2015-10-09 11:13:04.734547] W
[fuse-bridge.c:1937:fuse_create_cbk] 0-glusterfs-fuse:
1253: /x => -1 (Read-only file system)</div>
<div>[2015-10-09 11:13:10.419069] E
[socket.c:2332:socket_connect_finish]
0-system_mail1-client-1: connection to <a
moz-do-not-send="true" href="http://172.16.11.112:24007">172.16.11.112:24007</a>
failed (Connection timed out)</div>
<div>[2015-10-09 11:12:55.387447] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk]
0-system_mail1-client-1: remote operation failed. Path: /
(00000000-0000-0000-0000-000000000001) [Transport endpoint
is not connected]</div>
</div>
<div><br>
</div>
<div>(Another weird thing is that the glusterfs version
reported in the logs is 3.7.3, when the Debian packages
installed on my system are for 3.7.3 - I don't know if it's
meant to be that way.)</div>
<div><br>
</div>
<div>Below is the output of "gluster volume info" on one of
the servers (there are 4 volumes in my actual setup):</div>
<div><br>
</div>
<div>
<div>Volume Name: data_mail1</div>
<div>Type: Replicate</div>
<div>Volume ID: c2833dbe-aaa5-49d0-91d3-5abb44efb48c</div>
<div>Status: Started</div>
<div>Number of Bricks: 1 x 2 = 2</div>
<div>Transport-type: tcp</div>
<div>Bricks:</div>
<div>Brick1: cluster-rep:/GFS/data/mail1</div>
<div>Brick2: mail-rep:/GFS/data/mail1</div>
<div>Options Reconfigured:</div>
<div>cluster.quorum-count: 2</div>
<div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
<div>performance.readdir-ahead: on</div>
<div>cluster.server-quorum-type: server</div>
<div>cluster.quorum-type: fixed</div>
<div>cluster.server-quorum-ratio: 51%</div>
<div> </div>
<div>Volume Name: data_www1</div>
<div>Type: Replicate</div>
<div>Volume ID: 385a7052-3ab5-42c2-93bc-6e10c4e7c0f1</div>
<div>Status: Started</div>
<div>Number of Bricks: 1 x 2 = 2</div>
<div>Transport-type: tcp</div>
<div>Bricks:</div>
<div>Brick1: cluster-rep:/GFS/data/www1</div>
<div>Brick2: web-rep:/GFS/data/www1</div>
<div>Options Reconfigured:</div>
<div>cluster.quorum-count: 2</div>
<div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
<div>performance.readdir-ahead: on</div>
<div>cluster.server-quorum-type: server</div>
<div>cluster.quorum-type: fixed</div>
<div>cluster.server-quorum-ratio: 51%</div>
<div> </div>
<div>Volume Name: system_mail1</div>
<div>Type: Replicate</div>
<div>Volume ID: 82dc0617-d855-4bf0-b5e5-c4147ca15779</div>
<div>Status: Started</div>
<div>Number of Bricks: 1 x 2 = 2</div>
<div>Transport-type: tcp</div>
<div>Bricks:</div>
<div>Brick1: cluster-rep:/GFS/system/mail1</div>
<div>Brick2: mail-rep:/GFS/system/mail1</div>
<div>Options Reconfigured:</div>
<div>cluster.quorum-count: 2</div>
<div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
<div>performance.readdir-ahead: on</div>
<div>cluster.server-quorum-type: server</div>
<div>cluster.quorum-type: fixed</div>
<div>cluster.server-quorum-ratio: 51%</div>
<div> </div>
<div>Volume Name: system_www1</div>
<div>Type: Replicate</div>
<div>Volume ID: 83868eb5-7b32-4e80-882c-e83361b267b9</div>
<div>Status: Started</div>
<div>Number of Bricks: 1 x 2 = 2</div>
<div>Transport-type: tcp</div>
<div>Bricks:</div>
<div>Brick1: cluster-rep:/GFS/system/www1</div>
<div>Brick2: web-rep:/GFS/system/www1</div>
<div>Options Reconfigured:</div>
<div>cluster.quorum-count: 2</div>
<div>auth.allow: 127.0.0.1,172.16.11.*,172.16.12.*</div>
<div>performance.readdir-ahead: on</div>
<div>cluster.server-quorum-type: server</div>
<div>cluster.quorum-type: fixed</div>
<div>cluster.server-quorum-ratio: 51%</div>
</div>
<div><br>
</div>
<div>I have also tried switching "cluster.quorum-type"
between fixed and auto, with the same result.</div>
<div><br>
</div>
</div>
</div>
</blockquote>
<br>
Client-quorum (the cluster.quorum-type and cluster.quorum-count
options) applies to AFR (the replicate translator).<br>
You have set it to 'fixed' with a count of '2', which means both
bricks of the volume(s) need to be online to meet quorum. If only
one brick is up, the volume becomes read-only. For replica 2
volumes, it doesn't make much sense to enable client-quorum: you
lose high availability (i.e., the volume is no longer writable
when only one brick is up).<br>
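For example, if write availability matters more to you than
split-brain protection on these replica-2 volumes, client-quorum
can be relaxed per volume (a sketch using one of your volume
names; adjust to taste):<br>
<pre wrap="">
# Disable client-quorum entirely: writes succeed as long as one brick is up.
gluster volume set system_mail1 cluster.quorum-type none

# Or use 'auto': on a replica 2 volume quorum includes the first-listed
# brick, so the volume stays writable only while that brick is up.
gluster volume set system_mail1 cluster.quorum-type auto
</pre>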
<br>
<br>
<blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>What am I missing? Is there a way to add a "fake brick"
to meet the quorum requirements without </div>
<div>holding 3rd replica of the data?</div>
</div>
</div>
</blockquote>
You can try arbiter volumes [1], which are a good compromise
between replica-2 and full replica-3 volumes.<br>
Also, server-quorum (cluster.server-quorum-type and
cluster.server-quorum-ratio) doesn't help much in avoiding data
split-brains, so IMO you might as well disable it.<br>
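A sketch of what those two suggestions might look like on the
command line (the volume name "newvol" and the arbiter brick path
below are illustrative; arbiter volumes need glusterfs &gt;= 3.7
and are specified at volume-creation time):<br>
<pre wrap="">
# Replica-3 volume whose third brick is a metadata-only arbiter:
gluster volume create newvol replica 3 arbiter 1 \
    cluster-rep:/GFS/data/newvol mail-rep:/GFS/data/newvol \
    web-rep:/GFS/arbiter/newvol

# Disable server-quorum on an existing volume:
gluster volume set system_mail1 cluster.server-quorum-type none
</pre>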
-Ravi<br>
<br>
[1]
<a class="moz-txt-link-freetext" href="https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-arbiter-volumes.md">https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-arbiter-volumes.md</a><br>
<br>
<blockquote
cite="mid:CAE_wqnPUtXdHzs5s3JSJarqBHeyvMyJbwXa4TDAWFNQ6oH=TLQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>--</div>
<div>Regards,</div>
<div>Adrian</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
</blockquote>
<br>
</body>
</html>