[Gluster-devel] State of the 4.0 World

Jeff Darcy jdarcy at redhat.com
Tue May 3 15:50:30 UTC 2016


One of my recurring action items at community meetings is to report to
the list on how 4.0 is going.  So, here we go.

The executive summary is that 4.0 is on life support.  Many features
were proposed - some quite ambitious.  Many of those *never* had anyone
available to work on them.  Of those that did, many have either been
pulled forward into 3.8 (which is great) or lost what resources they had
(which is bad).  Downstream priorities have been the biggest cause of
those resource losses, though other factors such as attrition have also
played a part.  Net result is that, with the singular exception of
GlusterD 2.0, progress on 4.0 has all but stopped.  I'll provide more
details below.  Meanwhile, I'd like to issue a bit of a call to action
here, in two parts.

 * Many of the 4.0 sub-projects are still unstaffed.  Some of them are
   in areas of code where our combined expertise is thin.  For example,
   "glusterfsd" is where we need to make many brick- and
   daemon-management changes for 4.0, but it has no specific maintainer
   other than the project architects so nobody touches it.  Over the
   past year it has been touched by fewer than two patches per month,
   mostly side effects of patches which were primarily focused elsewhere
   (fewer than 400 lines changed).  It can be challenging to dive into
   such a "fallow" area, but it can also be an opportunity to make a big
   difference, show off one's skill, and not have to worry much about
   conflicts with other developers' changes.  Taking on projects like
   these is how people get from contributing to leading (FWIW it's how I
   did), so I encourage people to make the leap.

 * I've been told that some people have asked how 4.0 is going to affect
   existing components for which they are responsible.  Please note that
   only two components are being replaced - GlusterD and DHT.  The DHT2
   changes are going to affect storage/posix a lot, so that *might* be
   considered a third replacement.  JBR (formerly NSR) is *not* going to
   replace AFR or EC any time soon.  In fact, I'm making significant
   efforts to create common infrastructure that will also support
   running AFR/EC on the server side, with many potential benefits to
   them and their developers.  However, just about every other component
   is going to be affected to some degree, if only to use the 4.0
   CLI/volgen plugin interfaces instead of being hard-coded into their
   current equivalents.  4.0 tests are also expected to be based on
   Distaf rather than TAP (the .t infrastructure) so there's a lot of
   catch-up to be done there.  In other cases there are deeper issues to
   be resolved, and many of those discussions - e.g. regarding quota or
   georep - have already been ongoing.  There will eventually be a
   Gluster 4.0, even if it happens after I'm retired and looks nothing
   like what I describe below.  If you're responsible for any part of
   GlusterFS, you're also responsible for understanding how 4.0 will
   affect that part.

With all that said, I'm going to give item-by-item details of where we
stand.  I'll use

http://www.gluster.org/community/documentation/index.php/Planning40

as a starting point, even though (as you'll see) in some ways it's out
of date.

 * GlusterD 2 is still making good progress, under Atin's and Kaushal's
   leadership.  There are designs for most of the important pieces, and
   a significant amount of code which we should be able to demo soon.

 * DHT2 had been making good progress for a while, but has been stalled
   recently as its lead developer (Shyam) has been unavailable.
   Hopefully we'll get him back soon, and progress will accelerate
   again.

 * Sharding got pulled forward because of its importance for other
   efforts, so it's no longer a 4.0 feature.

 * Client-side caching has been dropped for now, though it could still
   return with a new design based on the lease infrastructure.

 * Data classification (beyond just tiering) has been dropped.

 * Multiple-network support and network QoS are very much still part of
   the 4.0 plan as far as I'm concerned, but there's still nobody
   available to work on them.

 * "Better brick management" is also still an un-resourced part of the
   4.0 plan.  A lot of the higher-level logic will go into Heketi, but
   exporting multiple bricks through a single daemon (and port) can't
   be handled there.

 * Compression/dedup have been dropped.

 * Composite operations are already being implemented, either as part of
   3.8 or as part of the Samba/Ganesha efforts depending on how you look
   at it, so that's not a 4.0 feature any more.

 * Stat/xattr caching (on the server) is a bit of a question mark.  On
   the one hand, it should be pretty simple to implement.  On the other
   hand, nobody has made even a minimal effort to do so.  Recent events
   have also raised the issue of needing to do this for correctness
   (especially around maintaining ctime across replicas) as well as
   performance.  This would be a *great* opportunity for a currently
   junior/novice Gluster contributor to make their mark.

 * Code generation already exists, and is actively being used to
   implement other 4.0 features.  My only other comment here is that, in
   many cases, people should start using it instead of continuing to use
   macros.  Every macro we add is another little nugget of technical
   debt, causing all sorts of headaches for anyone who has to edit or
   debug the code later.  Please do your part to stamp out macro abuse.

 * Management plugins are part of the GlusterD 2 plan.

 * Performance monitoring etc. (last item on list) has been dropped, for
   lack of a well-defined scope or requirements.
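
To illustrate the code-generation item above, here is a minimal sketch of
the general idea: emitting one plain C function per operation from a
template, rather than hiding the repetition behind a preprocessor macro.
This is not the actual GlusterFS generator.  The template, the function
naming scheme, the `void *ctx` signature, and the `fallback_*` helpers are
all hypothetical, chosen only for illustration.

```python
from string import Template

# Hypothetical C stub template; names and signature are illustrative only
# and are not taken from the GlusterFS source tree.
STUB = Template("""\
int
${prefix}_${op} (void *ctx)
{
        /* generated code: edit the template, not this output */
        return fallback_${op} (ctx);
}
""")


def generate_stubs(prefix, ops):
    """Return C source text containing one stub per operation name."""
    return "\n".join(STUB.substitute(prefix=prefix, op=op) for op in ops)


if __name__ == "__main__":
    # Print stubs for two example operation names.
    print(generate_stubs("example", ["stat", "getxattr"]))
```

The output is ordinary C that a debugger can step through and that shows
up by name in a backtrace, which is the main practical difference from
expanding the same boilerplate with a macro.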

