<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 03/02/2016 04:03 PM, Atin Mukherjee

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAGkR8FNgxX9rL5cL7FvHw1VRyr7J8YZSUq6ULpn-rVjeg=RpYw@mail.gmail.com"

      type="cite">

      <p dir="ltr">-Atin<br>


        On 02-Mar-2016 3:41 pm, "Avra Sengupta" &lt;<a

          moz-do-not-send="true" href="mailto:asengupt@redhat.com">asengupt@redhat.com</a>&gt;

        wrote:<br>

        &gt;<br>

        &gt; On 03/02/2016 02:55 PM, Venky Shankar wrote:<br>

        &gt;&gt;<br>

        &gt;&gt; On Wed, Mar 02, 2016 at 02:29:26PM +0530, Avra Sengupta

        wrote:<br>

        &gt;&gt;&gt;<br>

        &gt;&gt;&gt; On 03/02/2016 02:02 PM, Venky Shankar wrote:<br>

        &gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt; On Wed, Mar 02, 2016 at 01:40:08PM +0530, Avra

        Sengupta wrote:<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; Hi,<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; All fops in NSR follow a specific workflow

        as described in this UML(<a moz-do-not-send="true"

href="https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing">https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing</a>).<br>

        &gt;&gt;&gt;&gt;&gt; However all locking fops will follow a

        slightly different workflow as<br>

        &gt;&gt;&gt;&gt;&gt; described below. This is a first proposed

        draft for handling locks, and we<br>

        &gt;&gt;&gt;&gt;&gt; would like to hear your concerns and

        queries regarding the same.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; 1. On receiving the lock request, the leader will

        journal the lock itself, and then<br>

        &gt;&gt;&gt;&gt;&gt; try to actually acquire the lock. At this

        point in time, if it fails to<br>

        &gt;&gt;&gt;&gt;&gt; acquire the lock, then it will invalidate

        the journal entry, and return a<br>

        &gt;&gt;&gt;&gt;&gt; -ve ack back to the client. However, if it

        is successful in acquiring the<br>

        &gt;&gt;&gt;&gt;&gt; lock, it will mark the journal entry as

        complete, and forward the fop to the<br>

        &gt;&gt;&gt;&gt;&gt; followers.<br>

        &gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt; So, does a contending non-blocking lock

        operation check only on the leader<br>

        &gt;&gt;&gt;&gt; since the followers might not yet have ack'd an

        earlier lock operation?<br>

        &gt;&gt;&gt;<br>

        &gt;&gt;&gt; A non-blocking lock follows the same workflow, and

        thereby checks on the<br>

        &gt;&gt;&gt; leader first. In this case, it would be blocked on

        the leader, till the<br>

        &gt;&gt;&gt; leader releases the lock. Then it will follow the

        same workflow.<br>

        &gt;&gt;<br>

        &gt;&gt; A non-blocking lock should ideally return EAGAIN if the

        region is already locked.<br>

        &gt;&gt; Checking just on the leader (posix/locks on the leader

        server stack) and returning<br>

        &gt;&gt; EAGAIN is kind of incomplete as the earlier lock

        request might not have been granted<br>

        &gt;&gt; (due to failure on followers).<br>

        &gt;&gt;<br>

        &gt;&gt; or does it even matter if we return EAGAIN during the

        transient state?<br>

        &gt;&gt;<br>

        &gt;&gt; We could block the lock on the leader until an earlier

        lock request is satisfied<br>

        &gt;&gt; (in which case return EAGAIN) or in case of failure try

        to satisfy the lock request.<br>

        &gt;<br>

        &gt; That is what I said, it will be blocked on the leader till

        the leader releases the already held lock.<br>

        &gt;<br>

        &gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; 2. The followers on receiving the fop, will

        journal it, and then try to<br>

        &gt;&gt;&gt;&gt;&gt; actually acquire the lock. If it fails to

        acquire the lock, then it will<br>

        &gt;&gt;&gt;&gt;&gt; invalidate the journal entry, and return a

        -ve ack back to the leader. If it<br>

        &gt;&gt;&gt;&gt;&gt; is successful in acquiring the lock, it

        will mark the journal entry as<br>

        &gt;&gt;&gt;&gt;&gt; complete, and send a +ve ack to the leader.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; 3. The leader on receiving all acks, will

        perform a quorum check. If quorum<br>

        &gt;&gt;&gt;&gt;&gt; is met, it will send a +ve ack to the

        client. If the quorum fails, it will<br>

        &gt;&gt;&gt;&gt;&gt; send a rollback to the followers.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; 4. The followers on receiving the rollback,

        will journal it first, and then<br>

        &gt;&gt;&gt;&gt;&gt; release the acquired lock. It will update

        the rollback entry in the journal<br>

        &gt;&gt;&gt;&gt;&gt; as complete and send an ack to the leader.<br>

        &gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt; What happens if the rollback fails for whatever

        reason?<br>

        &gt;&gt;&gt;<br>

        &gt;&gt;&gt; The leader receives a -ve rollback ack, but there's

        little it can do about<br>

        &gt;&gt;&gt; it. Depending on the failure, it will be resolved

        during reconciliation.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; 5. The leader on receiving the rollback

        acks, will journal its own<br>

        &gt;&gt;&gt;&gt;&gt; rollback, and then release the acquired

        lock. It will update the rollback<br>

        &gt;&gt;&gt;&gt;&gt; entry in the journal, and send a -ve ack to

        the client.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; A few things to note about the above

        workflow:<br>

        &gt;&gt;&gt;&gt;&gt; 1. It will be a synchronous operation,

        across the replica volume.<br>

        &gt;<br>

        &gt; Atin, I am not sure how AFR handles it.<br>

        If AFR/EC handle them asynchronously, do you see any performance

        bottleneck with NSR for this case?<br>

      </p>

    </blockquote>

    Well, it's not synchronous to the point that the followers would

    perform it one after the other. AFR/EC clients also have to wait

    for acks from a quorum of servers before they can ack the client.

    The same is true of the NSR leader, which has to wait until it

    gets acks from a quorum of followers.<br>
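    <br>
    To illustrate what I mean, here is a minimal Python sketch (not NSR
    code). The names collect_acks and quorum_met are made up for
    illustration, each follower is modelled as a plain callable that
    returns True for a +ve ack and False for a -ve ack, and I am
    assuming the quorum is counted over follower acks only. The leader
    sends the fop to all followers at once, and checks the quorum after
    the acks have come back:<br>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def collect_acks(followers, lock_request):
    """Send lock_request to every follower in parallel and gather the acks.

    `followers` is a list of callables (hypothetical stand-ins for the
    follower bricks); each returns True (+ve ack) or False (-ve ack).
    The acks are returned in the same order as `followers`.
    """
    with ThreadPoolExecutor(max_workers=max(len(followers), 1)) as pool:
        # All requests are submitted before any result is awaited, so the
        # followers work concurrently rather than one after the other.
        futures = [pool.submit(follower, lock_request) for follower in followers]
        return [future.result() for future in futures]


def quorum_met(acks, quorum):
    """Return True if at least `quorum` of the acks are positive."""
    return sum(1 for ack in acks if ack) >= quorum
```
</pre>
    With three followers of which one fails, quorum_met(acks, 2) is
    true and the client gets a +ve ack; with a quorum of 3 it is false
    and the leader would start the rollback.<br>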

    <blockquote

cite="mid:CAGkR8FNgxX9rL5cL7FvHw1VRyr7J8YZSUq6ULpn-rVjeg=RpYw@mail.gmail.com"

      type="cite">

      <p dir="ltr">

        &gt;<br>

        &gt;&gt;&gt;&gt;&gt; 2. Reconciliation will take care of nodes

        that have missed the locks.<br>

        &gt;&gt;&gt;&gt;&gt; 3. On a client disconnect, there will be a

        lock-timeout on whose expiration<br>

        &gt;&gt;&gt;&gt;&gt; all locks held by that particular client

        will be released.<br>

        &gt;&gt;&gt;&gt;&gt;<br>

        &gt;&gt;&gt;&gt;&gt; Regards,<br>

        &gt;&gt;&gt;&gt;&gt; Avra<br>

        &gt;&gt;&gt;&gt;&gt;

        _______________________________________________<br>

        &gt;&gt;&gt;&gt;&gt; Gluster-devel mailing list<br>

        &gt;&gt;&gt;&gt;&gt; <a moz-do-not-send="true"

          href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a><br>

        &gt;&gt;&gt;&gt;&gt; <a moz-do-not-send="true"

          href="http://www.gluster.org/mailman/listinfo/gluster-devel">http://www.gluster.org/mailman/listinfo/gluster-devel</a><br>


      </p>

    </blockquote>

    <br>
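    For reference, the five steps of the proposed lock workflow can be
    sketched roughly as below. This is only a toy model in Python, not
    NSR code: the Node class, its fail_lock flag, the journal layout
    and the leader_lock function are all hypothetical, the followers
    are called in a loop purely for brevity, and I am assuming that the
    quorum counts follower acks only and that the rollback goes only to
    followers that actually acquired the lock.<br>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    held: set = field(default_factory=set)        # regions currently locked
    journal: list = field(default_factory=list)   # [op, region, state] entries
    fail_lock: bool = False                       # hypothetical fault injection

    def lock(self, region):
        """Journal the lock, try to acquire it, then update the entry."""
        entry = ["lock", region, "pending"]
        self.journal.append(entry)
        if self.fail_lock or region in self.held:
            entry[2] = "invalid"      # acquisition failed: invalidate the entry
            return False              # -ve ack
        self.held.add(region)
        entry[2] = "complete"
        return True                   # +ve ack

    def rollback(self, region):
        """Journal the rollback, release the lock, mark the entry complete."""
        entry = ["rollback", region, "pending"]
        self.journal.append(entry)
        self.held.discard(region)
        entry[2] = "complete"
        return True


def leader_lock(leader, followers, region, quorum):
    """Return True for a +ve ack to the client, False for a -ve ack."""
    # Step 1: the leader journals and acquires the lock itself.
    if not leader.lock(region):
        return False
    # Step 2: forward the fop; each follower journals and tries the lock.
    acks = [follower.lock(region) for follower in followers]
    # Step 3: quorum check on the follower acks.
    if sum(acks) >= quorum:
        return True
    # Step 4: roll back the followers that did acquire the lock.
    for follower, acquired in zip(followers, acks):
        if acquired:
            follower.rollback(region)
    # Step 5: the leader rolls back its own lock and nacks the client.
    leader.rollback(region)
    return False
```
</pre>
    A failed rollback is not modelled here; as discussed above, that
    case is left to reconciliation.<br>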

  </body>

</html>