The Gluster Blog

The Importance of Staying Sequential

Gluster
2012-11-02

One of the most important aspects of disk performance is the difference between seek latency and rotational latency. To put it simply, moving the head between distant tracks generally takes longer than waiting for the platter to spin the right sector under it, and both are already orders of magnitude slower than just about anything else that happens inside a computer. Thus, even though a disk provides random access, not all random accesses are equal; depending on where the disk head is after one operation, the time for the next can vary a great deal.

Memory doesn’t have this behavior. Solid-state disks don’t have this behavior. Even networks don’t have this behavior; if it takes you X milliseconds to send one packet, it will probably take close to X milliseconds to send the next. Disks are special this way, and likely to become more rather than less special.

Because of all this, operating systems and even some applications have evolved to generate sequential I/O (which doesn’t require disk seeks) instead of random I/O (which does). What a lot of people tend to miss, even though it’s pretty obvious in retrospect, is that sequential I/O tends to become pathological I/O when there’s a journal on the same disk. That clicking you hear on a busy disk might well be the head seeking from the journal at one end of the disk to the actual data at the other, over and over again. How bad is it? Here’s a very simple graph.
[Figure: sequential vs. random I/O]
That’s from a couple of my test machines. The disks are pretty old (147 GB is almost laughable nowadays) and it’s my usual worst-case test: synchronous, small, random writes.

Wait, random? Weren’t we supposed to be comparing random to sequential? Well, yes. This is random I/O from the user’s perspective, but only for the red “single disk” line is it still random when it hits the disk. The filesystem actually does a pretty good job of de-randomizing the placement of those data blocks to avoid seeks, and the block I/O scheduler helps too. The journal I/O is inherently sequential, so what’s left on a single disk is the data/journal thrashing I mentioned.

The green “separate journal” line shows what happens when we move the journal to a separate disk: now we have two sequential streams on two separate devices instead of two streams contending for one. Notice how the green line is more than twice as high as the red line. This isn’t just a matter of having twice as much hardware. It’s even more a matter of using that hardware more efficiently.
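For readers who want to try a similar workload, here is a minimal sketch of a "synchronous, small, random writes" test in Python. It is not the tool used for the graph above, which the post does not name; the function name `sync_random_writes`, the 4 KiB block size, the file size, and the write count are all assumptions chosen for illustration. It relies on `O_SYNC` and `os.pwrite`, so it is meant for Linux or another Unix-like system.

```python
import os
import random
import tempfile
import time


def sync_random_writes(path, file_size=16 * 1024 * 1024, block_size=4096,
                       count=200, seed=0):
    """Issue `count` synchronous block-sized writes at random aligned offsets.

    Returns the achieved writes per second. All sizes are illustrative
    defaults, not values taken from the original test.
    """
    rng = random.Random(seed)
    blocks = file_size // block_size
    payload = b"\xa5" * block_size

    # O_SYNC asks the kernel to complete each write only after the data
    # has been handed to stable storage.
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        os.ftruncate(fd, file_size)  # set the file size (file may be sparse)
        start = time.perf_counter()
        for _ in range(count):
            offset = rng.randrange(blocks) * block_size
            os.pwrite(fd, payload, offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rate = sync_random_writes(os.path.join(tmp, "testfile"))
        print(f"{rate:.0f} synchronous 4 KiB writes/s")
```

To compare configurations, point `path` at a file on the filesystem under test rather than a temporary directory, which may live on a RAM-backed filesystem and hide the effect entirely.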

That brings us to SSDs. I don’t happen to have any SSDs handy, but it should be pretty clear how they might fit into this picture. Maybe you can afford to go all-SSD for your data and maybe you can’t. If you can, then you don’t need to worry about any of this stuff, so why did you even read this far? If you can’t, then there’s still something you can do. Because SSDs don’t have the “seek chasm” that spinning disks do, a single SSD can handle the journals for many spinning disks. The fact that SSDs are smaller shouldn’t matter either, because journals are small too. Thus, a bunch of disks plus a single SSD might give you better overall performance than more or faster disks without an SSD. Implications regarding the types and costs of block storage in public clouds are left as an exercise for the reader.
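The argument can be put into rough numbers with a toy model. This is a back-of-the-envelope sketch, not a measurement: the function names and every figure in it (seek time, write time, SSD write rate) are hypothetical, and it assumes each logical write costs exactly one journal write plus one data write. On a single disk, the head seeks before each of the two writes. With a separate journal device, both streams are treated as seek-free and fully overlapped.

```python
def single_disk_rate(seek_ms, write_ms):
    """Logical writes/s when journal and data share one spinning disk.

    Each logical write costs a journal write plus a data write, with a
    head seek before each because the two regions are far apart.
    """
    return 1000.0 / (2 * (seek_ms + write_ms))


def separate_journal_rate(write_ms):
    """Logical writes/s when journal and data sit on separate disks.

    Idealized: both streams are sequential, so neither disk seeks, and the
    two writes overlap, so one write time bounds each logical write.
    """
    return 1000.0 / write_ms


def disks_per_ssd(ssd_journal_writes_per_s, disk_rate):
    """How many data disks one SSD could journal for before it saturates."""
    return int(ssd_journal_writes_per_s // disk_rate)


# Hypothetical figures, chosen only to make the arithmetic concrete.
SEEK_MS = 8.0                # journal <-> data head movement
WRITE_MS = 4.0               # one small synchronous write, no seek
SSD_WRITES_PER_S = 20_000.0  # assumed SSD journal write rate

shared = single_disk_rate(SEEK_MS, WRITE_MS)
split = separate_journal_rate(WRITE_MS)
print(f"single disk:      {shared:6.1f} writes/s")
print(f"separate journal: {split:6.1f} writes/s ({split / shared:.1f}x)")
print(f"disks per SSD:    {disks_per_ssd(SSD_WRITES_PER_S, split)}")
```

In this model the speedup from splitting off the journal is 2 + 2 × seek/write, which exceeds 2 for any nonzero seek time. That is the sense in which the gain comes from using the hardware more efficiently and not only from having twice as much of it. Real filesystems batch and reorder journal writes, so actual ratios will differ.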
