The Gluster Blog


Trying Out MooseFS

Gluster
2012-11-07

In addition to watching the election coverage last night, I spent some time giving MooseFS another try. It’s a project I once had high hopes for; I even considered it as an alternative to GlusterFS as a basis for what was then CloudFS, but I was put off by several things – a single metadata server, a non-modular design, an inactive/hostile community. When I tested it a couple of years ago it couldn’t even survive a simple test (write ten files concurrently and then read them all back concurrently) without hanging, so I pretty much forgot about it. When somebody recently said it was “better than GlusterFS” I decided it was time to put that claim to the test. Here are some results. First, let’s look at performance.
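As an aside, the simple concurrency test mentioned above is easy to reproduce. Here’s a rough sketch of the idea in Python – the `stress_test` helper and its parameters are my own construction for illustration, not the original test script:

```python
import concurrent.futures
import os

def stress_test(directory, n_files=10, size=1 << 20):
    """Write n_files files concurrently, then read them all back
    concurrently; return True only if every byte survives."""
    payloads = {i: bytes([65 + i]) * size for i in range(n_files)}

    def write_one(i):
        with open(os.path.join(directory, "f%d" % i), "wb") as f:
            f.write(payloads[i])
            f.flush()
            os.fsync(f.fileno())  # include the flush-to-disk cost

    def read_one(i):
        with open(os.path.join(directory, "f%d" % i), "rb") as f:
            return f.read() == payloads[i]

    with concurrent.futures.ThreadPoolExecutor(max_workers=n_files) as pool:
        list(pool.map(write_one, range(n_files)))       # concurrent writes
        return all(pool.map(read_one, range(n_files)))  # concurrent reads
```

Point it at a directory on the mounted filesystem; a distributed filesystem that hangs or corrupts data under even this mild level of concurrency isn’t ready for real workloads.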
[Chart: GlusterFS vs. MooseFS performance]
OK, so it looks way faster than GlusterFS. That’s a bit misleading, though. For one thing, this is with replication turned on. GlusterFS replication is client to both servers; MooseFS is client to one server, then that server to others (so the first server can use both halves of a full-duplex link). That means GlusterFS should scale better as the client:server ratio increases, but also that single-client performance will be half what MooseFS achieves. That’s what the “limit” lines in the chart above show, but those lines show something even more interesting. The MooseFS numbers are higher than is actually possible on the single GigE link that client had. These numbers are supposed to include fsync time, and the GlusterFS numbers reflect that, but the MooseFS numbers keep climbing as data continues to be buffered in memory. That’s neither correct nor sustainable.

The way that MooseFS ignores fsync reminds me of another thing I noticed a while ago: it ignores O_SYNC too. I verified this by looking at the code and seeing where O_SYNC got stripped out, and now my tests show the same effect. Whereas the GlusterFS IOPS numbers with O_SYNC take the expected huge hit relative to those above, the MooseFS numbers actually improve somewhat. POSIX compliant, eh? Nope, not even trying. As a storage guy who cares about user data, I find that totally unacceptable and sufficient to disqualify MooseFS for serious use. The question is: how hard is it to fix? Honoring O_SYNC isn’t just a matter of passing it to the server, which would be easy. It’s also a matter of fixing the fsync behavior to make sure the O_SYNC request actually gets to the server, and – a little more arguably – to all servers that are supposed to hold the data. Those parts might be more difficult.
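A quick way to check whether a filesystem honors O_SYNC from the client side, without reading its code, is to time a run of small synchronous writes against a run of buffered ones. This is a sketch of that technique; the `timed_writes` helper is mine, and `/mnt/moosefs` is a hypothetical mount point (the fallback to a local temp directory just keeps the example runnable anywhere):

```python
import os
import tempfile
import time

def timed_writes(path, extra_flags=0, count=50, size=4096):
    """Time `count` writes of `size` bytes to a file opened with the
    given extra open(2) flags; return elapsed seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags, 0o644)
    buf = b"x" * size
    start = time.monotonic()
    for _ in range(count):
        os.write(fd, buf)
    elapsed = time.monotonic() - start
    os.close(fd)
    os.unlink(path)
    return elapsed

# "/mnt/moosefs" is a placeholder for wherever the client is mounted.
target = "/mnt/moosefs" if os.path.isdir("/mnt/moosefs") else tempfile.gettempdir()
plain = timed_writes(os.path.join(target, "plain.tmp"))
synced = timed_writes(os.path.join(target, "synced.tmp"), os.O_SYNC)
print("buffered: %.4fs  O_SYNC: %.4fs" % (plain, synced))
```

On a filesystem that actually honors O_SYNC, the second run should be dramatically slower, since every write has to reach stable storage before returning; near-identical (or better) times for the synchronous run are a strong hint the flag is being silently stripped.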

In any case, let’s take a more detailed look at how MooseFS compares to GlusterFS.

  • GOOD: performance. Despite everything I say above, it looks like MooseFS really does perform better than GlusterFS, at least for some important use cases. I’ll give them credit for that.
  • GOOD: per-object replication levels (“goals”). This is a feature I personally plan to add to GlusterFS some day, but it’s some day in the far future.
  • GOOD: snapshots. This one’s actually in the GlusterFS road map, but it’s not there today.
  • MIXED: self-heal and rebalance are more automatic, but also more opaque. I couldn’t find even the sketchiest documentation of this on their website. Apparently the only way to know what they’re doing, and if it’s the right thing, is to read the code. Besides being poor form for an open-source project (if dumping code from a private repository into a public one twice a year even qualifies), that also means it’s unlikely that the user will ever have significant control over when or how these things are done.
  • BAD: clunky configuration. Just a bunch of files in /etc/mfs with very little documentation. This isn’t just cosmetic; if you want to build a really big system, this approach just won’t scale. I don’t think GlusterFS’s approach to managing configuration data scales as well as it should either, but it’s still way better than this.
  • BAD: no online reconfiguration/upgrade. AFAICT, chunk servers can be added or removed on the fly, but anything else requires disruptive restarts. GlusterFS can already do most forms of reconfiguration online, and soon upgrades will be possible that way too.
  • BAD: no other protocols. No built-in NFS (requires re-export so the performance advantage would shift to GlusterFS), no UFO, no Hadoop integration, no qemu integration.
  • BAD: no geo-replication.
  • BAD: not readily extensible, as with GlusterFS translators.

Basically, if performance matters to you more than data integrity (e.g. the data exists elsewhere already), and/or if you really really need snapshots right now, then I don’t see MooseFS as an invalid choice. Go ahead, knock yourself out. You can even post your success stories here if you want. On the other hand, if you have any other needs whatsoever, I’d warn you to be careful. You might get yourself in deep trouble, and I’ve never seen anyone say the developers or other community were any help at all.
