<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
TL;DR: We need to come up with a fix for AFR data self-heal from
clients (mounts).<br>
<br>
<i>data-self-heal.t</i> creates a 1x2 volume, sets AFR changelog
xattrs directly on the files in the backend bricks, then runs a full
heal to heal the files.<br>
<br>
The test fails intermittently when run in a loop. Data self-heal
takes non-blocking locks before healing, so the two heal threads
(one per brick) can try to acquire the lock at the same time, and
both can fail.<br>
<br>
In afr-v1, only one thread is spawned if both bricks are on the same
node. We cannot do this in afr-v2 because, unlike v1, v2 has no
conservative merge in afr_opendir_cbk(). We are not sure that adding
conservative merge to v2 is a good idea: it involves (multiple)
readdirs on both bricks and computing a checksum on the entries to
detect a mismatch, which can be costly when done from clients.<br>
<br>
Making the locks blocking could cause one heal thread to block while
the other thread holds the lock, instead of moving on to heal other
files. One approach is to do what ec does: use a virtual xattr and
handle it in the getxattr FOP to trigger data heals from clients.
More thought needs to be given to this.<br>
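<br>
To illustrate the lock contention described above, here is a toy
model in Python. It is only a sketch, not AFR code. It assumes each
heal thread sends a non-blocking lock request to both bricks and
heals only if every brick grants it; the function name
healers_that_win and the healer labels "A" and "B" are made up for
the example.<br>
<pre>
```python
import threading


def healers_that_win(arrival_order):
    """Toy model of two healers racing for non-blocking brick locks.

    arrival_order[i] lists the healers in the order their lock
    requests reach brick i. Assumption: a healer proceeds only if
    every brick granted its request.
    """
    bricks = [threading.Lock() for _ in arrival_order]
    replies = {}  # healer name mapped to the per-brick grant results
    for brick, order in zip(bricks, arrival_order):
        for healer in order:
            # Non-blocking attempt: True if granted, False if already held.
            granted = brick.acquire(blocking=False)
            replies.setdefault(healer, []).append(granted)
    return {healer for healer, got in replies.items() if all(got)}


# Requests reach the two bricks in opposite orders.
racy = healers_that_win([["A", "B"], ["B", "A"]])
# Requests reach both bricks in the same order.
ordered = healers_that_win([["A", "B"], ["A", "B"]])
print(racy, ordered)
```
</pre>
In this model, when the requests reach the two bricks in opposite
orders, each healer is granted one lock and refused the other, so
neither heals the file. When they arrive in the same order, exactly
one healer proceeds.<br>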
<br>
Regards,<br>
Ravi<br>
<br>
<br>
</body>
</html>