<html><body><div style="font-family: garamond,new york,times,serif; font-size: 12pt; color: #000000"><div><br></div><div><br></div><hr id="zwchr"><blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"><b>From: </b>"Shyam" <srangana@redhat.com><br><b>To: </b>"Krutika Dhananjay" <kdhananj@redhat.com><br><b>Cc: </b>"Aravinda" <avishwan@redhat.com>, "Gluster Devel" <gluster-devel@gluster.org><br><b>Sent: </b>Wednesday, September 2, 2015 11:13:55 PM<br><b>Subject: </b>Re: [Gluster-devel] Gluster Sharding and Geo-replication<br><div><br></div>On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:<br>><br>><br>> ------------------------------------------------------------------------<br>><br>> *From: *"Shyam" <srangana@redhat.com><br>> *To: *"Aravinda" <avishwan@redhat.com>, "Gluster Devel"<br>> <gluster-devel@gluster.org><br>> *Sent: *Wednesday, September 2, 2015 8:09:55 PM<br>> *Subject: *Re: [Gluster-devel] Gluster Sharding and Geo-replication<br>><br>> On 09/02/2015 03:12 AM, Aravinda wrote:<br>> > The Geo-replication and Sharding teams today discussed the approach<br>> > to making Geo-replication shard-aware. Details are below.<br>> ><br>> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur<br>> ><br>> > - Both Master and Slave Volumes should be Sharded Volumes with the same<br>> > configuration.<br>><br>> If I am not mistaken, geo-rep supports replicating to a non-gluster<br>> local FS at the slave end. Is this correct? If so, would this<br>> limitation<br>> not make that problematic?<br>><br>> When you state *same configuration*, I assume you mean the sharding<br>> configuration, not the volume graph, right?<br>><br>> That is correct. 
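(Editor's aside: as an illustration of what the *same configuration* requirement amounts to, the chunk layout implied by a given shard block size can be sketched as below. This is a hypothetical helper written for this thread, not gsyncd or shard-translator code; the function name and layout derivation are illustrative only, based on the f1/G1 example given further down.)

```python
import math

def shard_layout(gfid, file_size, block_size):
    """Return the expected backend paths for a sharded file.

    The first block lives in the main file itself; subsequent
    blocks are stored as .shards/<gfid>.<index>, starting at 1.
    """
    nblocks = max(1, math.ceil(file_size / block_size))
    paths = [gfid]  # the main file holds block 0
    paths += [".shards/%s.%d" % (gfid, i) for i in range(1, nblocks)]
    return paths
```

With a 4 MB block size, a 20 MB file maps to the main file plus four entries under .shards, matching the five-chunk example later in this thread.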
The only requirement is for the slave to have the shard<br>> translator (since someone needs to present the aggregated view of the file<br>> to readers on the slave).<br>> Also, the shard-block-size needs to be kept the same between master and<br>> slave. The rest of the configuration (like the number of subvols of DHT/AFR)<br>> can vary across master and slave.<br><div><br></div>Do we need the shard block size to be the same? I assume the file <br>carries an xattr that records the size it is sharded with <br>(trusted.glusterfs.shard.block-size), so if this is synced across, that <br>should suffice. If this is true, what it would mean is that "a sharded volume <br>needs a shard-supporting slave to geo-rep to".<br></blockquote><div>Yep. I too feel it should probably not be necessary to enforce same-shard-size-everywhere, as long as the shard translator on the slave takes care not to further "shard" the individual shards gsyncd would write to on the slave volume.<br></div><div>This is especially true if different files/images/vdisks on the master volume are associated with different block sizes.<br></div><div>This logic has to be built into the shard translator based on parameters (client-pid, parent directory of the file being written to).<br></div><div>What this means is that the shard-block-size attribute on the slave would essentially be a don't-care parameter. I need to give all this some more thought, though.<br></div><div><br></div><div>-Krutika<br></div><div><br></div><blockquote style="border-left:2px solid #1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;">><br>> -Krutika<br>><br>><br>><br>> > - In the Changelog, record changes related to sharded files also. 
Just like<br>> > any regular files.<br>> > - Sharding should allow Geo-rep to list/read/write sharding-internal<br>> > xattrs if the Client PID is gsyncd's (-1)<br>> > - Sharding should allow read/write of sharded files (that is, in the<br>> > .shards directory) if the Client PID is gsyncd's<br>> > - Sharding should return the actual file instead of returning the<br>> > aggregated content when the main file is requested (Client PID is<br>> > gsyncd's)<br>> ><br>> > For example, a file f1 is created with GFID G1.<br>> ><br>> > When the file grows, it gets sharded into chunks (say, 5 chunks).<br>> ><br>> > f1 G1<br>> > .shards/G1.1 G2<br>> > .shards/G1.2 G3<br>> > .shards/G1.3 G4<br>> > .shards/G1.4 G5<br>> ><br>> > In the Changelog, this is recorded as 5 different files, as below:<br>> ><br>> > CREATE G1 f1<br>> > DATA G1<br>> > META G1<br>> > CREATE G2 PGS/G1.1<br>> > DATA G2<br>> > META G1<br>> > CREATE G3 PGS/G1.2<br>> > DATA G3<br>> > META G1<br>> > CREATE G4 PGS/G1.3<br>> > DATA G4<br>> > META G1<br>> > CREATE G5 PGS/G1.4<br>> > DATA G5<br>> > META G1<br>> ><br>> > where PGS is the GFID of the .shards directory.<br>> ><br>> > Geo-rep will create these files independently in the Slave Volume and<br>> > sync the xattrs of G1. Data can be read fully only when all the chunks<br>> > are synced to the Slave Volume, and partially if the main/first file<br>> > and some of the chunks are synced to the Slave.<br>> ><br>> > Please add if I missed anything. 
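(Editor's aside: the read-availability rule just described can be sketched as a small check. This is illustrative logic for the discussion only, not shard or gsyncd code; "synced" here is simply the set of GFIDs already present on the slave.)

```python
def sync_state(main_gfid, chunk_gfids, synced):
    """Classify readability of a sharded file on the slave.

    Per the rule above: fully readable only once the main file and
    all chunks are synced; partially readable once the main/first
    file and only some of the chunks have landed.
    """
    if main_gfid not in synced:
        return "unreadable"
    if all(g in synced for g in chunk_gfids):
        return "fully-readable"
    return "partially-readable"
```

So in the G1 example, with only G1 and G2 synced the file would be partially readable; once G2 through G5 land, it becomes fully readable.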
Comments & suggestions welcome.<br>> ><br>> > regards<br>> > Aravinda<br>> ><br>> > On 08/11/2015 04:36 PM, Aravinda wrote:<br>> >> Hi,<br>> >><br>> >> We are considering different approaches to adding support in<br>> >> Geo-replication for sharded Gluster volumes [1].<br>> >><br>> >> *Approach 1: Geo-rep: Sync full file*<br>> >> - In the Changelog, only record the main file's details, in the same<br>> >> brick where it is created<br>> >> - Record a DATA entry in the Changelog whenever there is any<br>> >> addition/change to the sharded file<br>> >> - Geo-rep rsync will checksum the full file from the mount and<br>> >> sync it as a new file<br>> >> - Slave-side sharding is managed by the Slave Volume<br>> >> *Approach 2: Geo-rep: Sync sharded files separately*<br>> >> - Geo-rep rsync will do checksums for sharded files only<br>> >> - Geo-rep syncs each sharded file independently as a new file<br>> >> - [UNKNOWN] Sync internal xattrs (file size and block count) in the<br>> >> main sharded file to the Slave Volume to maintain the same state as<br>> >> in the Master<br>> >> - The Sharding translator has to allow file creation under the .shards<br>> >> dir for gsyncd, that is, where the parent GFID is the .shards directory<br>> >> - If sharded files are modified during a Geo-rep run, we may end up<br>> >> with stale data in the Slave<br>> >> - Files on the Slave Volume may not be readable unless all sharded<br>> >> files sync to the Slave (each brick in the Master independently syncs<br>> >> files to the slave)<br>> >><br>> >> The first approach looks cleaner, but we have to analyze the rsync<br>> >> checksum performance on big files (sharded on the backend, accessed<br>> >> as one big file by rsync).<br>> >><br>> >> Let us know your thoughts. 
Thanks<br>> >><br>> >> Ref:<br>> >> [1]<br>> >> http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator<br>> >> --<br>> >> regards<br>> >> Aravinda<br>> >><br>> >><br>> >> _______________________________________________<br>> >> Gluster-devel mailing list<br>> >> Gluster-devel@gluster.org<br>> >> http://www.gluster.org/mailman/listinfo/gluster-devel<br>> ><br>><br></blockquote><div><br></div></div></body></html>