<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <br>
    <div class="moz-cite-prefix">On 10/28/2015 04:27 PM, Adrian
      Gruntkowski wrote:<br>
    </div>
    <blockquote
cite="mid:CAE_wqnMnjf4zw3o+eOgOo7PO_eRDxM6n29dONpgc30-_1fgCew@mail.gmail.com"
      type="cite">
      <div dir="ltr">Hello Pranith,
        <div><br>
        </div>
        <div>Thank you for the prompt response. I didn't get back to
          this until now because I had other problems to deal with.</div>
        <div><br>
        </div>
        <div>Is there a chance that it will get released this month or
          next? If not, I will probably have to resort to compiling it
          on my own.</div>
      </div>
    </blockquote>
    I am planning to get this in for 3.7.6, which is due to be released
    by the end of this month, in 4-5 days I guess :-). I will keep you
    updated.<br>
    <br>
    Pranith<br>
    <blockquote
cite="mid:CAE_wqnMnjf4zw3o+eOgOo7PO_eRDxM6n29dONpgc30-_1fgCew@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>Regards,</div>
        <div>Adrian</div>
        <div><br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">2015-10-26 12:37 GMT+01:00 Pranith
          Kumar Karampuri <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000">
              <div>
                <div class="h5"> <br>
                  <br>
                  <div>On 10/23/2015 10:10 AM, Ravishankar N wrote:<br>
                  </div>
                  <blockquote type="cite"> <br>
                    <br>
                    <div>On 10/21/2015 05:55 PM, Adrian Gruntkowski
                      wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">Hello,<br>
                        <br>
                        I'm trying to track down a problem with my setup
                        (version 3.7.3 on Debian stable).<br>
                        <br>
                        I have a couple of volumes set up in a 3-node
                        configuration with 1 brick acting as an arbiter
                        for each.<br>
                        <br>
                        There are 4 volumes set up in a cross-over
                        layout across 3 physical servers, like this:<br>
                        <br>
                        <br>
<pre>
     -----------------------------------&gt;[ GigabitEthernet switch ]&lt;------------------------------------
     |                                               ^                                                 |
     |                                               |                                                 |
     V                                               V                                                 V
/---------------------------\           /---------------------------\           /---------------------------\
| web-rep                   |           | cluster-rep               |           | mail-rep                  |
|                           |           |                           |           |                           |
| vols:                     |           | vols:                     |           | vols:                     |
| system_www1               |           | system_www1               |           | system_www1(arbiter)      |
| data_www1                 |           | data_www1                 |           | data_www1(arbiter)        |
| system_mail1(arbiter)     |           | system_mail1              |           | system_mail1              |
| data_mail1(arbiter)       |           | data_mail1                |           | data_mail1                |
\---------------------------/           \---------------------------/           \---------------------------/
</pre>
                        <br>
                        <br>
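                        Each volume is a replica 3 with the third brick
                        acting as the arbiter. For reference, they were
                        created along these lines (I'm reconstructing
                        the command from memory; the hosts and brick
                        paths are as in the "gluster volume info" output
                        further down):<br>
                        <pre>
# example for one of the volumes; the last brick listed becomes the arbiter
gluster volume create system_www1 replica 3 arbiter 1 \
    cluster-rep:/GFS/system/www1 web-rep:/GFS/system/www1 mail-rep:/GFS/system/www1
</pre>
                        <br>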
                        Now, after a fresh boot-up, everything seems to
                        be running fine.<br>
                        Then I start copying big files (KVM disk images)
                        from the local disk to the gluster mounts.<br>
                        In the beginning it seems to run fine (although
                        iowait seems to go so high that it clogs up I/O
                        operations at some moments, but that's an issue
                        for later). After some time the transfer
                        freezes; then, after some (long) time, it
                        advances in a short burst only to freeze again.
                        Another interesting thing is that I see a
                        constant flow of network traffic on the
                        interfaces dedicated to gluster, even when
                        there's a "freeze".<br>
                        <br>
                        I have done "gluster volume statedump" at that
                        time of transfer (file is copied from local disk
                        on cluster-rep<br>
                        onto local mount of "system_www1" volume). I've
                        observer a following section in the dump for
                        cluster-rep node:<br>
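                        The exact invocation and the default dump
                        directory below are my best recollection rather
                        than a transcript:<br>
                        <pre>
# ask all bricks of the volume to dump their state
gluster volume statedump system_www1

# the dump files usually land in the default statedump directory on each brick node
ls /var/run/gluster/*.dump.*
</pre>
                        And the section itself:<br>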
                        <br>
                        [xlator.features.locks.system_www1-locks.inode]<br>
                        path=/images/101/vm-101-disk-1.qcow2<br>
                        mandatory=0<br>
                        inodelk-count=12<br>
lock-dump.domain.domain=system_www1-replicate-0:self-heal<br>
                        inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
                        start=0, len=0, pid = 18446744073709551610,
                        owner=c811600cd67f0000, client=0x7fbe100df280,
                        connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,


                        granted at 2015-10-21 11:36:22<br>
                        lock-dump.domain.domain=system_www1-replicate-0<br>
                        inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0,
                        start=2195849216, len=131072, pid =
                        18446744073709551610, owner=c811600cd67f0000,
                        client=0x7fbe100df280,
                        connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,


                        granted at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[1](ACTIVE)=type=WRITE, whence=0,
                        start=9223372036854775805, len=1, pid =
                        18446744073709551610, owner=c811600cd67f0000,
                        client=0x7fbe100df280,
                        connection-id=cluster-vm-3603-2015/10/21-10:35:54:596929-system_www1-client-0-0-0,


                        granted at 2015-10-21 11:36:22<br>
                      </div>
                    </blockquote>
                    <br>
                    From the statedump, it looks like the self-heal
                    daemon had taken locks to heal the file, due to
                    which the locks attempted by the client (mount) are
                    in the blocked state.<br>
                    In arbiter volumes the client (mount) takes full
                    locks (start=0, len=0) for every write(), as opposed
                    to normal replica volumes, which take range locks
                    (i.e. the appropriate start,len values) for that
                    write(). This is done to avoid network split-brains.<br>
                    So in normal replica volumes, clients can still
                    write to a file while a heal is going on, as long as
                    the offsets don't overlap. This is not the case with
                    arbiter volumes.<br>
                    You can look at the client or glustershd logs to see
                    if there are messages that indicate healing of a
                    file, something along the lines of "Completed data
                    selfheal on xxx".<br>
                  </blockquote>
                </div>
              </div>
              hi Adrian,<br>
                    Thanks for taking the time to send this mail. I have
              raised this as a bug at <a moz-do-not-send="true"
                href="https://bugzilla.redhat.com/show_bug.cgi?id=1275247"
                target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1275247</a>,
              and the fix is posted for review at <a moz-do-not-send="true"
                href="http://review.gluster.com/#/c/12426/"
                target="_blank">http://review.gluster.com/#/c/12426/</a>.<span
                class="HOEnZb"><font color="#888888"><br>
                  <br>
                  Pranith</font></span>
              <div>
                <div class="h5"><br>
                  <blockquote type="cite"> <br>
                    <blockquote type="cite">
                      <div dir="ltr">inodelk.inodelk[2](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=c4fd2d78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[3](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=dc752e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[4](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=34832e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[5](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=d44d2e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[6](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=306f2e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[7](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=8c902e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[8](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=782c2e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[9](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=1c0b2e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        inodelk.inodelk[10](BLOCKED)=type=WRITE,
                        whence=0, start=0, len=0, pid = 0,
                        owner=24332e78487f0000, client=0x7fbe100e1380,
                        connection-id=cluster-vm-3846-2015/10/21-10:36:03:123909-system_www1-client-0-0-0,


                        blocked at 2015-10-21 11:37:45<br>
                        <br>
                        There seem to be multiple locks in the BLOCKED
                        state, which doesn't look normal to me. The
                        other 2 nodes have only 2 ACTIVE locks at the
                        same time.<br>
                        <br>
                        Below is the "gluster volume info" output.<br>
                        <br>
                        # gluster volume info<br>
                         <br>
                        Volume Name: data_mail1<br>
                        Type: Replicate<br>
                        Volume ID: fc3259a1-ddcf-46e9-ae77-299aaad93b7c<br>
                        Status: Started<br>
                        Number of Bricks: 1 x 3 = 3<br>
                        Transport-type: tcp<br>
                        Bricks:<br>
                        Brick1: cluster-rep:/GFS/data/mail1<br>
                        Brick2: mail-rep:/GFS/data/mail1<br>
                        Brick3: web-rep:/GFS/data/mail1<br>
                        Options Reconfigured:<br>
                        performance.readdir-ahead: on<br>
                        cluster.quorum-count: 2<br>
                        cluster.quorum-type: fixed<br>
                        cluster.server-quorum-ratio: 51%<br>
                         <br>
                        Volume Name: data_www1<br>
                        Type: Replicate<br>
                        Volume ID: 0c37a337-dbe5-4e75-8010-94e068c02026<br>
                        Status: Started<br>
                        Number of Bricks: 1 x 3 = 3<br>
                        Transport-type: tcp<br>
                        Bricks:<br>
                        Brick1: cluster-rep:/GFS/data/www1<br>
                        Brick2: web-rep:/GFS/data/www1<br>
                        Brick3: mail-rep:/GFS/data/www1<br>
                        Options Reconfigured:<br>
                        performance.readdir-ahead: on<br>
                        cluster.quorum-type: fixed<br>
                        cluster.quorum-count: 2<br>
                        cluster.server-quorum-ratio: 51%<br>
                         <br>
                        Volume Name: system_mail1<br>
                        Type: Replicate<br>
                        Volume ID: 0568d985-9fa7-40a7-bead-298310622cb5<br>
                        Status: Started<br>
                        Number of Bricks: 1 x 3 = 3<br>
                        Transport-type: tcp<br>
                        Bricks:<br>
                        Brick1: cluster-rep:/GFS/system/mail1<br>
                        Brick2: mail-rep:/GFS/system/mail1<br>
                        Brick3: web-rep:/GFS/system/mail1<br>
                        Options Reconfigured:<br>
                        performance.readdir-ahead: on<br>
                        cluster.quorum-type: none<br>
                        cluster.quorum-count: 2<br>
                        cluster.server-quorum-ratio: 51%<br>
                         <br>
                        Volume Name: system_www1<br>
                        Type: Replicate<br>
                        Volume ID: 147636a2-5c15-4d9a-93c8-44d51252b124<br>
                        Status: Started<br>
                        Number of Bricks: 1 x 3 = 3<br>
                        Transport-type: tcp<br>
                        Bricks:<br>
                        Brick1: cluster-rep:/GFS/system/www1<br>
                        Brick2: web-rep:/GFS/system/www1<br>
                        Brick3: mail-rep:/GFS/system/www1<br>
                        Options Reconfigured:<br>
                        performance.readdir-ahead: on<br>
                        cluster.quorum-type: none<br>
                        cluster.quorum-count: 2<br>
                        cluster.server-quorum-ratio: 51%<br>
                        <br>
                        The issue does not occur when I get rid of the
                        3rd (arbiter) brick.<br>
                      </div>
                    </blockquote>
                    <br>
                    What do you mean by 'getting rid of'? Killing the
                    3rd brick process of the volume?<br>
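                    For instance, something along these lines, just so
                    we mean the same thing (the PID would be whatever
                    "gluster volume status" reports for that brick):<br>
                    <pre>
gluster volume status system_www1    # note the PID of the arbiter brick
kill &lt;PID-of-that-brick&gt;             # stop only that brick process
</pre>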
                    <br>
                    Regards,<br>
                    Ravi<br>
                    <blockquote type="cite">
                      <div dir="ltr"><br>
                        If there's any additional information I could
                        provide, please let me know.<br>
                        <br>
                        Regards,<br>
                        Adrian</div>
                      <br>
                      <fieldset></fieldset>
                      <br>
                      <pre>_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
                    </blockquote>
                    <br>
                    <br>
                    <fieldset></fieldset>
                    <br>
                    <pre>_______________________________________________
Gluster-users mailing list
<a moz-do-not-send="true" href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a moz-do-not-send="true" href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>