[Gluster-devel] State of the 4.0 World

Jeff Darcy jdarcy at redhat.com
Tue May 3 22:00:38 UTC 2016


> Great summary Jeff - thanks!

You're welcome.  Also, it seems like I somehow managed to leave out the
piece that I'm personally most involved in - Journal Based Replication
(formerly New Style Replication).  I might as well correct that now.

What we (Avra and I) have so far is a working I/O path, which consists
of the following.

 * A client translator which handles finding the current leader, routing
   requests to it, retrying if a new leader is elected, etc.  (There's a
   toy sketch of this retry loop right after the list.)

 * A server translator which handles the sequencing of requests from
   leader to followers, to the journal, and to the main store.  This
   relies on other components to deal with leader (re)election; for now
   that part is just a stub to make testing possible.

 * A full data-logging translator which manages the journal files, both
   in the I/O path and in response to special requests related to
   reconciliation (the JBR equivalent of AFR or EC self-heal).
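
To make the leader-routing part concrete, here's a toy sketch of the
retry loop.  This is not the real xlator code - the RPC and
leader-lookup functions are made-up stand-ins - but the shape is the
same: send to whoever we think the leader is, and if we guessed wrong,
re-learn who the leader is and try again.

    /* Toy model of JBR client-side leader routing.  Every name here
     * is hypothetical; the real translator does this through the
     * glusterfs stack, not plain function calls. */
    #include <stdio.h>
    #include <errno.h>

    #define MAX_RETRIES 5

    /* Stub RPC: replica 1 happens to be the leader in this example.
     * A non-leader answers ESTALE, meaning "go find the leader". */
    static int send_to_replica(int replica, const char *req)
    {
        if (replica != 1)
            return ESTALE;
        printf("replica %d sequenced: %s\n", replica, req);
        return 0;
    }

    /* Stub leader lookup, standing in for the election machinery. */
    static int fetch_leader(void)
    {
        return 1;
    }

    static int route_to_leader(const char *req)
    {
        int leader = 0;  /* possibly stale cached leader */
        for (int try = 0; try < MAX_RETRIES; try++) {
            if (send_to_replica(leader, req) == 0)
                return 0;            /* leader accepted it */
            leader = fetch_leader(); /* re-learn, then retry */
        }
        return -1;
    }

    int main(void)
    {
        return route_to_leader("WRITE fd=7 off=0 len=4096");
    }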

This path has even been tuned/optimized somewhat, so performance isn't
totally lousy.  A lot more work still needs to be done here, of course.
In particular, there's a whole subsystem still to be implemented for
doing *reads* efficiently when the most recent data is still only in the
journal.  Still, it's nice to have tests run in time comparable to what
we're already used to with EC or AFR.
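
The core of that read problem, in toy form (none of these names are
real): a write can be durable in the journal before it reaches the main
store, so a correct read has to check the journal index before falling
back to the backend.

    /* Hypothetical sketch of a journal-first read path. */
    #include <stdio.h>
    #include <string.h>

    struct journal_entry {
        long offset;
        long len;
        char data[64];
    };

    /* One-entry "journal index" standing in for the real structure. */
    static struct journal_entry journal = { 0, 5, "fresh" };

    /* The main store still has the old bytes. */
    static long read_backend(char *buf, long off, long len)
    {
        (void)off;
        memcpy(buf, "stale", len < 5 ? len : 5);
        return len < 5 ? len : 5;
    }

    static long jbr_read(char *buf, long off, long len)
    {
        /* Serve from the journal if it holds newer bytes here. */
        if (off >= journal.offset &&
            off + len <= journal.offset + journal.len) {
            memcpy(buf, journal.data + (off - journal.offset), len);
            return len;
        }
        return read_backend(buf, off, len);
    }

    int main(void)
    {
        char buf[8] = { 0 };
        jbr_read(buf, 0, 5);
        printf("read: %.5s\n", buf);  /* "fresh", not "stale" */
        return 0;
    }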

What's missing?  Mostly the real guts of reconciliation.  This was
always going to be the most complicated part, so it's really no surprise
that it's the last to become usable, but here's some basic status.

 * Code to trigger reconciliation on cold start or leadership change is
   *not* done yet.  Ditto for code to generate the volfiles that the
   reconciliation process will use.  I'm currently doing these parts by
   hand, and the real thing probably won't happen until we get into
   integrating with etcd/glusterd.

 * Code to query the replicas about what terms exist, what states
   requests are in, etc. *does* exist.  So does code to replay an
   individual request on a particular replica.

 * Code to use the above facilities and make intelligent decisions about
   what to replay where does *not* exist.  This is the real heart of
   reconciliation, and I'm just now gearing up to start on it.
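
To give a feel for what that logic has to do, here's a toy version -
hypothetical types and all, not code from the tree.  For each request
in a term, look at what state each replica recorded for it; anything a
majority completed must survive, so we replay it to the replicas that
missed it.

    /* Toy reconciliation decision loop; all names hypothetical. */
    #include <stdio.h>

    #define N_REPLICAS 3
    #define N_REQUESTS 4

    enum req_state { MISSING = 0, PREPARED, COMPLETED };

    /* state[r][i]: replica r's journal record for request i. */
    static enum req_state state[N_REPLICAS][N_REQUESTS] = {
        { COMPLETED, COMPLETED, PREPARED,  MISSING },
        { COMPLETED, COMPLETED, COMPLETED, MISSING },
        { COMPLETED, MISSING,   COMPLETED, MISSING },
    };

    static void replay(int replica, int req)
    {
        printf("replay request %d on replica %d\n", req, replica);
    }

    int main(void)
    {
        for (int i = 0; i < N_REQUESTS; i++) {
            int done = 0;
            for (int r = 0; r < N_REPLICAS; r++)
                if (state[r][i] == COMPLETED)
                    done++;
            /* A majority completed it, so it must survive: bring
             * the stragglers up to date.  Otherwise it was never
             * durable and the replicas may discard it. */
            if (done > N_REPLICAS / 2)
                for (int r = 0; r < N_REPLICAS; r++)
                    if (state[r][i] != COMPLETED)
                        replay(r, i);
        }
        return 0;
    }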

There are sure to be lots of other smaller bits too, but that's most of
it.

