<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 6, 2016 at 12:24 AM, Shyam <span dir="ltr">&lt;<a href="mailto:srangana@redhat.com" target="_blank">srangana@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 07/01/2016 01:45 AM, B.K.Raghuram wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I have not gone through this implementation nor the new iscsi<br>

implementation being worked on for 3.9 but I thought I&#39;d share the<br>

design behind a distributed iscsi implementation that we&#39;d worked on<br>

some time back based on the istgt code with a libgfapi hook.<br>

<br>

The implementation used the idea of using one file to represent one<br>

block (of a chosen size) thus allowing us to use gluster as the backend<br>

to store these files while presenting a single block device of possibly<br>

infinite size. We used a fixed file naming convention based on the block<br>

number which allows the system to determine which file(s) needs to be<br>

operated on for the requested byte offset. This gave us the advantage of<br>

automatically accessing all of gluster&#39;s file based functionality<br>

underneath to provide a fully distributed iscsi implementation.<br>
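
<br>

The mapping from a byte offset to the backing file(s) described above could look roughly like the sketch below. This is an illustration only, not the istgt-based code: the block size, the "block-NNN" file name format and the function names are all assumptions.<br>

<pre>
```python
from typing import List, Tuple

# Assumed block (file) size in bytes; the design only says "of a chosen size".
BLOCK_SIZE = 4 * 1024 * 1024


def block_file_name(block_number: int) -> str:
    """Hypothetical fixed naming convention: zero-padded block number."""
    return "block-{:016d}".format(block_number)


def files_for_range(offset: int, length: int,
                    block_size: int = BLOCK_SIZE) -> List[Tuple[str, int, int]]:
    """Return (file name, offset inside that file, byte count) per file touched.

    Offsets are assumed to be non-negative; a negative length is treated
    as an empty request.
    """
    pieces = []
    remaining = max(length, 0)
    while remaining != 0:
        # Which block file holds this offset, and where inside it.
        block_number, inner = divmod(offset, block_size)
        # Never read or write past the end of the current block file.
        count = min(remaining, block_size - inner)
        pieces.append((block_file_name(block_number), inner, count))
        offset += count
        remaining -= count
    return pieces
```
</pre>

With a 4-byte block size, for instance, files_for_range(6, 4, 4) yields two pieces: 2 bytes at offset 2 of the file for block 1, then 2 bytes at offset 0 of the file for block 2.<br>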

<br>

Would this be similar to the new iscsi implementation that&#39;s being worked<br>

on for 3.9?<br>

</blockquote>

<br></span>

&lt;will let others correct me here, but...&gt;<br>

<br>

Ultimately the idea would be to use sharding, as a part of the gluster volume graph, to distribute the blocks (or rather shard the blocks) instead of keeping the whole disk image on one distribute subvolume, so that disk sizes can scale to the size of the cluster. Further, sharding should work well here, as this is a single client access case (or are we past that hurdle already?).<br></blockquote><div><br></div><div>Not yet; we need a common transaction frame in place to reduce the latency of synchronization.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

What this achieves is similar to the iSCSI implementation that you talk about, but with gluster doing the block splitting, and hence the distribution, rather than the iSCSI implementation (istgt).<br>

<br>

&lt; I did a cursory check on the blog post, but did not find a shard reference, so maybe others could pitch in here, if they know about the direction&gt;<br></blockquote><div><br></div><div>There are two directions which will eventually converge.<br></div><div>1) A granular data self-heal implementation, so that taking a snapshot becomes as simple as a reflink.<br></div><div>2) Bring in snapshots of files with shards - this is a bit more involved than the solution above.<br><br></div><div>Once 2) is also complete, we will have 1) and 2) combined, so that data self-heal will heal the exact blocks inside each shard.<br><br></div><div>If the users are not worried about snapshots, 2) is the best option.<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Further, in your original proposal, how do you maintain device properties, such as the size of the device and the used/free blocks? I ask about used and free because that is either an overhead to compute, if each block is maintained as a separate file by itself, or it is difficult to keep the size and block updates consistent (as they are separate operations). Just curious.<br>

</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div>

</div></div>