<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 03/02/2016 04:03 PM, Atin Mukherjee
wrote:<br>
</div>
<blockquote
cite="mid:CAGkR8FNgxX9rL5cL7FvHw1VRyr7J8YZSUq6ULpn-rVjeg=RpYw@mail.gmail.com"
type="cite">
<p dir="ltr">-Atin<br>
On 02-Mar-2016 3:41 pm, "Avra Sengupta" &lt;<a
moz-do-not-send="true" href="mailto:asengupt@redhat.com">asengupt@redhat.com</a>&gt;
wrote:<br>
><br>
> On 03/02/2016 02:55 PM, Venky Shankar wrote:<br>
>><br>
>> On Wed, Mar 02, 2016 at 02:29:26PM +0530, Avra Sengupta
wrote:<br>
>>><br>
>>> On 03/02/2016 02:02 PM, Venky Shankar wrote:<br>
>>>><br>
>>>> On Wed, Mar 02, 2016 at 01:40:08PM +0530, Avra
Sengupta wrote:<br>
>>>>><br>
>>>>> Hi,<br>
>>>>><br>
>>>>> All fops in NSR follow a specific workflow,
as described in this UML diagram (<a moz-do-not-send="true"
href="https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing">https://docs.google.com/presentation/d/1lxwox72n6ovfOwzmdlNCZBJ5vQcCaONvZva0aLWKUqk/edit?usp=sharing</a>).<br>
>>>>> However, all locking fops will follow a
slightly different workflow, as<br>
>>>>> described below. This is a first
draft for handling locks, and we<br>
>>>>> would like to hear your concerns and
questions about it.<br>
>>>>><br>
>>>>> 1. On receiving the lock request, the leader will
journal the lock itself, and then<br>
>>>>> try to actually acquire the lock. At this
point in time, if it fails to<br>
>>>>> acquire the lock, then it will invalidate
the journal entry, and return a<br>
>>>>> -ve ack back to the client. However, if it
is successful in acquiring the<br>
>>>>> lock, it will mark the journal entry as
complete, and forward the fop to the<br>
>>>>> followers.<br>
>>>><br>
>>>> So, does a contending non-blocking lock
operation check only on the leader<br>
>>>> since the followers might not yet have ack'd an
earlier lock operation?<br>
>>><br>
>>> A non-blocking lock follows the same workflow, and
thereby checks on the<br>
>>> leader first. In this case, it would be blocked on
the leader, till the<br>
>>> leader releases the lock. Then it will follow the
same workflow.<br>
>><br>
>> A non-blocking lock should ideally return EAGAIN if the
region is already locked.<br>
>> Checking just on the leader (posix/locks on the leader
server stack) and returning<br>
>> EAGAIN is kind of incomplete as the earlier lock
request might not have been granted<br>
>> (due to failure on followers).<br>
>><br>
>> Or does it even matter if we return EAGAIN during the
transient state?<br>
>><br>
>> We could block the lock on the leader until an earlier
lock request is satisfied<br>
>> (in which case return EAGAIN) or in case of failure try
to satisfy the lock request.<br>
><br>
> That is what I said: it will be blocked on the leader till
the leader releases the already held lock.<br>
><br>
>><br>
>>>>> 2. Each follower, on receiving the fop, will
journal it, and then try to<br>
>>>>> actually acquire the lock. If it fails to
acquire the lock, then it will<br>
>>>>> invalidate the journal entry, and return a
-ve ack back to the leader. If it<br>
>>>>> is successful in acquiring the lock, it
will mark the journal entry as<br>
>>>>> complete, and send a +ve ack to the leader.<br>
>>>>><br>
>>>>> 3. The leader, on receiving all acks, will
perform a quorum check. If quorum<br>
>>>>> is met, it will send a +ve ack to the
client. If quorum is not met, it will<br>
>>>>> send a rollback to the followers.<br>
>>>>><br>
>>>>> 4. Each follower, on receiving the rollback,
will journal it first, and then<br>
>>>>> release the acquired lock. It will then mark
the rollback entry in the journal<br>
>>>>> as complete and send an ack to the leader.<br>
>>>><br>
>>>> What happens if the rollback fails for whatever
reason?<br>
>>><br>
>>> The leader receives a -ve rollback ack, but there's
little it can do about<br>
>>> it. Depending on the failure, it will be resolved
during reconciliation.<br>
>>>>><br>
>>>>> 5. The leader, on receiving the rollback
acks, will journal its own<br>
>>>>> rollback, and then release the acquired
lock. It will update the rollback<br>
>>>>> entry in the journal, and send a -ve ack to
the client.<br>
>>>>><br>
>>>>> A few things to note in the above
workflow:<br>
>>>>> 1. It will be a synchronous operation
across the replica volume.<br>
><br>
> Atin, I am not sure how AFR handles it.<br>
If AFR/EC handle them asynchronously, do you see any performance
bottleneck with NSR for this case?<br>
</p>
</blockquote>
Well, it's not synchronous to the point that the followers would
perform it one after the other. AFR/EC clients also have to
wait for acks from a quorum of servers before they can ack the client.
The same is true of the NSR leader, which has to wait until it
gets acks from a quorum of followers.<br>
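To make the point about waiting concrete, here is a rough Python sketch
of the idea, not actual NSR code. The names leader_collect_acks and
quorum_met are made up for illustration, each follower is modelled as
a plain callable that returns True for a +ve ack, and I am assuming
that quorum means a strict majority of the replica set with the
leader's own lock counted as one vote. The fop goes to all followers
in parallel, and the quorum check happens once the acks are in.<br>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def leader_collect_acks(followers, request):
    """Send the request to every follower in parallel and gather the acks.

    Each follower is a callable (hypothetical stand-in for a follower
    brick) that returns True for a +ve ack and False for a -ve ack.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(followers))) as pool:
        # pool.map preserves the order of `followers` in its results.
        return list(pool.map(lambda follower: follower(request), followers))


def quorum_met(follower_acks, leader_ok=True):
    """Assumption: quorum is a strict majority of leader plus followers."""
    replica_count = len(follower_acks) + 1
    positive = sum(1 for ok in follower_acks if ok) + (1 if leader_ok else 0)
    return 2 * positive > replica_count
```
</pre>
With two followers where one acks and one fails, the leader plus the
one follower still form a majority of three, so the client would get
a +ve ack.<br>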
<blockquote
cite="mid:CAGkR8FNgxX9rL5cL7FvHw1VRyr7J8YZSUq6ULpn-rVjeg=RpYw@mail.gmail.com"
type="cite">
<p dir="ltr">
><br>
>>>>> 2. Reconciliation will take care of nodes
that have missed the locks.<br>
>>>>> 3. On a client disconnect, there will be a
lock-timeout, on whose expiration<br>
>>>>> all locks held by that particular client
will be released.<br>
>>>>><br>
>>>>> Regards,<br>
>>>>> Avra<br>
>>>>>
_______________________________________________<br>
>>>>> Gluster-devel mailing list<br>
>>>>> <a moz-do-not-send="true"
href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a><br>
>>>>> <a moz-do-not-send="true"
href="http://www.gluster.org/mailman/listinfo/gluster-devel">http://www.gluster.org/mailman/listinfo/gluster-devel</a><br>
</p>
</blockquote>
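For reference, steps 1 to 5 of the proposal quoted above can be put
together in a small Python model. This is only a sketch under my own
assumptions and not the NSR implementation: the Node class, its
journal layout and leader_lock are hypothetical names, locks carry no
owner, quorum is taken to be a strict majority including the leader,
the rollback is sent only to followers that actually acquired the
lock, and the followers are called one after the other purely to keep
the example short.<br>
<pre>
```python
class Node:
    """A replica with a journal and a set of locked regions (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.journal = []    # entries are [op, region, state]
        self.locked = set()  # regions currently locked on this node

    def lock(self, region):
        """Journal the lock, try to acquire it, then update the entry."""
        entry = ["lock", region, "pending"]
        self.journal.append(entry)
        if region in self.locked:
            entry[2] = "invalid"   # could not acquire: invalidate the entry
            return False           # -ve ack
        self.locked.add(region)
        entry[2] = "complete"
        return True                # +ve ack

    def rollback(self, region):
        """Journal the rollback first, then release the lock."""
        entry = ["rollback", region, "pending"]
        self.journal.append(entry)
        self.locked.discard(region)
        entry[2] = "complete"
        return True


def leader_lock(leader, followers, region):
    """Return True for a +ve ack to the client, False for a -ve ack."""
    if not leader.lock(region):                        # step 1
        return False
    acks = [f.lock(region) for f in followers]         # step 2
    positive = 1 + sum(1 for ok in acks if ok)
    if 2 * positive > len(followers) + 1:              # step 3
        return True
    for follower, ok in zip(followers, acks):          # step 4
        if ok:  # assumption: only undo locks this request acquired
            follower.rollback(region)
    leader.rollback(region)                            # step 5
    return False
```
</pre>
A failed rollback is not modelled here; as discussed above, that case
is left to reconciliation.<br>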
<br>
</body>
</html>