Should I use Stripe on GlusterFS?

Gluster

2012-04-17

Frequently I have new users come into #gluster with their first ever GlusterFS volume being a stripe volume. Why? Because they’re sure that’s the right way to get better performance.

That ain’t necessarily so. The stripe translator was designed to allow a file to exceed the size of a single brick. That is what it was built for, not parallel reads and writes.

The Expectation

In a RAID0 array, you use striping to allow each drive to operate in parallel. This improves both read and write performance, especially if you tune the stripe size to your workload. Your bottleneck is still (most likely) going to be drive speed, because every other piece along the way is faster than the drives. Operations typically come from a small handful of applications, and seeks are likely kept to a minimum. So why would striping across multiple computers be any different?

Reality

Multiple Clients

Networked filesystems typically serve a myriad of clients, each with its own task. This can produce a wild array of file requests that the server tries to satisfy as quickly as possible, which means reading and writing data all over the disk. If only a few files are typically accessed, this may not be a problem, but more often than not it is.

With a distributed volume, file requests will usually end up spread evenly among your servers, so each server does fewer disk seeks.

Load Balance

With a striped volume, your load is going to be spread over several of those servers, but not necessarily equally. If your typical file lines up with the full stripe width (stripe block size × number of bricks), the load should be pretty equal, but since most files don't, the first server in each stripe set will actually carry a higher load. This is because offset 0 of any file is always on the first subvolume.
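To make the imbalance concrete, here is a minimal sketch that counts how many bytes of a file land on each brick. It assumes plain round-robin striping (block k goes to brick k mod N) and a 128 KB stripe block; the function name, the block size, and the brick count are illustrative assumptions, not GlusterFS internals.

```python
def bytes_per_brick(file_size, stripe_block=128 * 1024, bricks=4):
    """Return the number of bytes of one file stored on each brick.

    Assumption: simple round-robin striping, so offset 0 (block 0)
    always lands on brick 0, block 1 on brick 1, and so on.
    """
    totals = [0] * bricks
    offset = 0
    block_index = 0
    while offset < file_size:
        # The last block of a file may be shorter than a full stripe block.
        chunk = min(stripe_block, file_size - offset)
        totals[block_index % bricks] += chunk
        offset += chunk
        block_index += 1
    return totals


if __name__ == "__main__":
    # Hypothetical file sizes: small, exactly one stripe width, and one block over.
    for size_kb in (64, 512, 640):
        print(size_kb, "KB ->", bytes_per_brick(size_kb * 1024))
```

Under these assumptions a 64 KB file touches only the first brick, a 512 KB file is perfectly even across four bricks, and a 640 KB file puts twice as much on the first brick as on any other.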

With a distributed volume, the files are already spread among the servers, so file I/O is spread fairly evenly among them as well, probably giving you the benefit you expected from stripe.
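The idea of whole files being spread by name can be sketched like this. Note the assumption: GlusterFS's distribute translator uses its own hash and per-directory layout ranges, so MD5 modulo the brick count is only a stand-in here, and `pick_brick` and `placement_counts` are hypothetical helper names.

```python
import hashlib
from collections import Counter


def pick_brick(filename, bricks):
    """Map a file name to a brick index.

    Stand-in hash (MD5 mod N), NOT the real GlusterFS placement algorithm.
    """
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % bricks


def placement_counts(filenames, bricks):
    """Count how many whole files land on each brick."""
    counts = Counter(pick_brick(name, bricks) for name in filenames)
    return [counts.get(brick, 0) for brick in range(bricks)]


if __name__ == "__main__":
    names = [f"file-{i}.dat" for i in range(10000)]
    print(placement_counts(names, 4))
```

With enough files, each brick ends up with roughly the same share, which is where the even I/O spread comes from.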

Network

Obviously, if your network is slower than your disks, it doesn’t really matter how many disks you have working on a file; you’re still stuck at network speed.
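As a back-of-the-envelope sketch, the ceiling for a single client is the smaller of the network rate and the combined disk rate. The function name and the throughput figures below are hypothetical, and the model ignores protocol overhead and seeks.

```python
def max_client_throughput(network_mb_s, disk_mb_s, disks):
    """Upper bound on one client's throughput in MB/s.

    Assumption: disks scale perfectly in parallel and nothing but the
    network link and the disks themselves limits throughput.
    """
    return min(network_mb_s, disk_mb_s * disks)


if __name__ == "__main__":
    # Hypothetical numbers: ~117 MB/s usable on a gigabit link, 150 MB/s per disk.
    for disks in (1, 2, 4):
        print(disks, "disk(s):", max_client_throughput(117, 150, disks), "MB/s")
```

With these example figures, adding disks changes nothing: the link is already the bottleneck with one disk.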

Data Integrity

Striped files are going to be lost. When a hard drive fails, striped files are gone. The more disks you add to a stripe, the higher the likelihood of failure. If you decide that you are going to use stripe, have a backup plan.
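A rough way to see how the risk grows with stripe width: if each brick fails independently with probability p over some period, the chance that at least one brick in the stripe set fails is 1 - (1 - p)^n. The independence assumption and the 5% figure are illustrative only, and `stripe_loss_probability` is a hypothetical helper.

```python
def stripe_loss_probability(p_brick_failure, bricks):
    """Probability that at least one of `bricks` bricks fails.

    Assumption: brick failures are independent and equally likely.
    Per the text above, any single brick failure loses the striped files.
    """
    return 1 - (1 - p_brick_failure) ** bricks


if __name__ == "__main__":
    # Hypothetical 5% failure chance per brick over the period of interest.
    for n in (1, 2, 4, 8):
        print(n, "brick(s):", round(stripe_loss_probability(0.05, n), 4))
```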

Distribute, when used alone, is still susceptible to data loss due to disk failure, but only for the files that are actually on that disk. As files are stored whole, disaster recovery is also still possible.

Summary

Distribute

Distribute is the default volume configuration of choice. It stores whole files and distributes those files among your bricks. When many clients access many disparate files, this will provide the greatest load distribution. Overloaded clusters can be expanded by adding more servers. Each of the following translators will combine with Distribute when the brick count is a multiple of the number of bricks the translator is configured for (4, 6, or any 2n bricks in a stripe 2 volume; 6, 9, 12… bricks in a replica 3 volume; 12 bricks in a replica 2 stripe 3 volume; etc.).
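The brick-count rule above is simple arithmetic, sketched here. The helper names are hypothetical and this only mirrors the multiples described in the text; it is not how the gluster CLI validates a volume.

```python
def is_valid_brick_count(bricks, stripe=1, replica=1):
    """True if `bricks` is a whole multiple of stripe * replica."""
    group = stripe * replica
    return bricks >= group and bricks % group == 0


def distribute_subvolumes(bricks, stripe=1, replica=1):
    """Number of stripe/replica sets that Distribute spreads files across.

    A result greater than 1 means Distribute is layered on top.
    """
    if not is_valid_brick_count(bricks, stripe, replica):
        raise ValueError("brick count must be a multiple of stripe * replica")
    return bricks // (stripe * replica)


if __name__ == "__main__":
    print(distribute_subvolumes(6, stripe=2))              # 3 stripe sets
    print(distribute_subvolumes(9, replica=3))             # 3 replica sets
    print(distribute_subvolumes(12, stripe=3, replica=2))  # 2 sets of 6 bricks
```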

Stripe

When using files that exceed the size of your bricks, or a small number of large files that see I/O at random seek locations (such as huge ISAM files), stripe may be a good fit. Files stored on a stripe volume should be throwaway, as there’s no data integrity and a single brick failure will lose all the data.

Stripe + Replicate

New in 3.3, stripe + replicate offers improved read performance over stripe alone, as well as system redundancy and data integrity. This should still only be used with over-brick-sized files, or large files with random I/O.

Replicate ( + Distribute )

The most common configuration, replicate offers read load sharing, data integrity, and redundancy. Replicate + Distribute is most likely the volume configuration you should actually be using.

If you have a configuration where a striped volume actually tests better for your real use case, please write a whitepaper about it and let me know. I’d be happy to reference it.

Tags
glusterfs
