Efficiency Still Matters

Gluster

2012-10-12

One of the most common knocks against GlusterFS is that it eats too many CPU cycles. For example:

Really not liking GlusterFS now. Performance is quite poor and CPU usage way too high for what it does.

I’ve talked about performance issues and expectations many times. With regard to CPU usage, I can’t resist saying that most people who say “too high for what it does” don’t actually understand what it does, but Frank still has a good point. I’ll be the first to admit that the GlusterFS code is pretty inefficient. Over on my other blog I obliquely mention one example, which I discuss as a correctness issue but which also affects performance. Every time you write a variable you actually don’t need, you don’t just use up a few CPU cycles. You might also generate more cache-coherency traffic or push something else out of cache/TLB. The effects one time are minor, but for code that executes millions of times a second these things add up. If an idiom that drives more memory access is repeated across thousands of functions, then it has a measurable effect on overall cache efficiency. You owe it to yourself, your colleagues, and your users to avoid sins like the following (all pet peeves in the GlusterFS code).

Overuse of macros and inlines that hide their true cost from profilers etc. I particularly despise macro abuse, because macros are also not type-safe and are annoying to debug through.
Too much locking and unlocking, often hidden (see above) and/or forced by bad calling conventions. Lock cycles are expensive.
Too many memory-allocator trips per request for stack frames, translator-local structures, dictionaries, etc. Also, custom allocators that don’t implement thread affinity.
Linear searches in dictionaries and fd/inode context lists.
Repeating arguments through every stage of a deep translator chain, instead of using a request-block idiom that keeps arguments in one place.
Calling functions/macros on every iteration of a loop instead of just once (compilers can catch these some of the time but not always).
Functions that are only called from one place, but defined in a separate module to maximize linkage difficulty and cache effects.
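
To make two of these concrete, here is a minimal sketch in C. It is not taken from the GlusterFS source; `struct request`, `path_length`, `lookups`, and both `count_slashes_*` functions are hypothetical names invented for illustration. It shows a request block that keeps arguments in one place, and the difference between calling a helper on every loop iteration and calling it once.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request block: the arguments live in one place, and each
 * stage receives a single pointer instead of its own copy of every field. */
struct request {
    const char *path;
    size_t      offset;
    size_t      length;
    int         flags;
};

/* Counter used only to make the number of helper calls visible. */
static unsigned long lookups = 0;

/* Hypothetical stand-in for a helper whose cost is hidden at the call site. */
static size_t path_length(const struct request *req)
{
    lookups++;
    return strlen(req->path);
}

/* Calls path_length() every time the loop condition is evaluated. */
size_t count_slashes_slow(const struct request *req)
{
    size_t n = 0;
    for (size_t i = 0; i < path_length(req); i++)
        if (req->path[i] == '/')
            n++;
    return n;
}

/* Calls path_length() once, before the loop starts. */
size_t count_slashes_fast(const struct request *req)
{
    size_t n = 0;
    const size_t len = path_length(req);
    for (size_t i = 0; i < len; i++)
        if (req->path[i] == '/')
            n++;
    return n;
}
```

Both functions return the same count, but for a path of length N the first makes N + 1 helper calls and the second makes one. Because the helper here has a visible side effect (the counter), a compiler is not allowed to hoist it out of the loop for you, which is the "not always" case mentioned above.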

As I said, these things add up. For some of these I can point to one specific place where an improvement could be made, but most of these issues don’t live in a particular piece of code. They’re aspects/idioms that appear everywhere, in every piece of code, not causing a distinct pause but just slowing down the “clock rate” at which every line executes. In addition to all of the other things I’m doing with specific code, one of my long-term goals on the project is to start changing habits away from these inefficient calling conventions and coding style to something a bit “tighter” as befits system code. Let’s see how that goes.
