In a very unscientific test, I was curious about how much of an effect GlusterFS’ self-heal check has on lstat. I wrote probably the first C program I’ve written in 20 years to find out.
I looped lstat calls for 60 seconds against three targets: a file on my local disk (which is not the same type or speed as my bricks, although that shouldn’t matter since this should all be handled in cache anyway), a raw image from within a KVM instance, and a file on a FUSE-mounted Gluster volume. This was the result:
| Iterations | Calculated Latency | Store |
|---|---|---|
| 90330916 | 0.66 microseconds | Local |
| 56497255 | 1.06 microseconds | Raw VM Image |
| 32860989 | 1.83 microseconds | GlusterFS |
Again, this is probably the worst test I could do. It’s not at all scientific, has way too many differences between the tests, was performed on a replica 3 volume with one replica down, and was run on 3.1.7 (for which afr should perform the same as 3.2.6). It’s just overall a waste of blog space, imho, but who knows: someone else might at least get inspired to do a real test.
As you can see, it’s pretty significant. The GlusterFS mount completed almost 64% fewer lstat calls than local disk in this dumb test (about 2.75 times the per-call latency), which, really, should be expected considering we’re adding network latency on top of everything. The 41% drop from the VM image to the GlusterFS mount probably represents the latency hit for the self-heal checks a smidgeon more accurately.
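For anyone who wants to recheck the arithmetic, here is a minimal sketch that derives the per-call latency and the percentage drops from the raw iteration counts in the table. The function names (`latency_us`, `drop_percent`, `print_summary`) are made up for illustration; only the three counts and the 60-second duration come from the test itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Average latency in microseconds for `iterations` calls made over `seconds`. */
double latency_us(uint64_t iterations, double seconds) {
    assert(iterations > 0);
    return seconds * 1e6 / (double)iterations;
}

/* Percentage drop in call count going from `baseline` to `measured`. */
double drop_percent(uint64_t baseline, uint64_t measured) {
    assert(baseline > 0);
    return 100.0 * (1.0 - (double)measured / (double)baseline);
}

/* Print the derived figures for the three 60-second runs in the table. */
void print_summary(void) {
    const uint64_t local = 90330916, vm = 56497255, gluster = 32860989;

    printf("Local:        %.2f us per call\n", latency_us(local, 60.0));
    printf("Raw VM Image: %.2f us per call\n", latency_us(vm, 60.0));
    printf("GlusterFS:    %.2f us per call\n", latency_us(gluster, 60.0));
    printf("Local -> GlusterFS drop:    %.1f%%\n", drop_percent(local, gluster));
    printf("VM image -> GlusterFS drop: %.1f%%\n", drop_percent(vm, gluster));
}
```

Calling `print_summary()` from a `main` is enough to reproduce the figures. Note that the percentages describe how many fewer calls completed in the same 60 seconds, which is not the same thing as the percentage increase in per-call latency.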
Here’s the C source:
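    #include <sys/types.h>
    #include <sys/stat.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int
    main (int argc, char *argv[]) {
        struct stat sb;
        time_t seconds;
        uint64_t count;

        if (argc != 2) {
            fprintf(stderr, "Usage: %s <pathname>\n", argv[0]);
            exit(EXIT_FAILURE);
        }

        /* Make sure the path can be stat'ed before starting the timed loop. */
        if (lstat(argv[1], &sb) == -1) {
            perror("lstat");
            exit(EXIT_FAILURE);
        }

        /* Call lstat() repeatedly for 60 seconds of wall-clock time.
           time() is also called once per iteration, so its cost is
           included in the calculated latency. */
        seconds = time(NULL);
        count = 0;
        while ( seconds + 60 > time(NULL) ) {
            lstat(argv[1], &sb);
            count++;
        }

        /* PRIu64 is the portable printf format for uint64_t. */
        fprintf(stdout, "Performed %" PRIu64 " lstat() calls in 60 seconds.\n", count);
        return EXIT_SUCCESS;
    }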
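If someone does get inspired to do a real test, one obvious tweak is to stop calling `time()` on every pass through the loop. The following is only a sketch of that idea, not what produced the numbers above: it times a fixed number of `lstat()` calls with `clock_gettime(CLOCK_MONOTONIC, ...)`. The helper name `time_lstat_us` and its return convention are my own assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: average lstat() latency in microseconds over a fixed
   number of calls, or -1.0 if the path cannot be stat'ed or the clock fails. */
double time_lstat_us(const char *path, uint64_t iterations) {
    struct stat sb;
    struct timespec begin, end;

    assert(path != NULL && iterations > 0);

    /* Fail early if the path is not usable at all. */
    if (lstat(path, &sb) == -1)
        return -1.0;

    if (clock_gettime(CLOCK_MONOTONIC, &begin) == -1)
        return -1.0;

    /* Only lstat() and the loop counter sit inside the timed region. */
    for (uint64_t i = 0; i < iterations; i++)
        lstat(path, &sb);

    if (clock_gettime(CLOCK_MONOTONIC, &end) == -1)
        return -1.0;

    double elapsed_ns = (double)(end.tv_sec - begin.tv_sec) * 1e9
                      + (double)(end.tv_nsec - begin.tv_nsec);
    return elapsed_ns / 1e3 / (double)iterations;
}
```

Something like `time_lstat_us("/mnt/gluster/somefile", 10000000)` (the path is just a placeholder) would then give an average per-call figure without the clock lookups mixed in. It still shares every other weakness of the original test.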
My first time playing with heroku was very cool, but mystifying – it wasn’t clear how or why it was that I needed to run “git init”, and why I was “pushing” code to heroku. As a java developer, I’m used to setting up a tomcat server, dropping a …
Vagrant can build, and destroy, your entire dev setup in a matter of minutes. It’s a powerful tool for achieving a cleanroom engineering deployment setup. Vagrant allows you to set up a personalized VM on any machine in a matter of minute…
More often than I would like, someone with twenty or more web servers servicing tens of thousands of page hits per hour comes into #gluster asking how to get the highest performance out of their storage system. They’ve only just now come to the realiza…
Your code is only as good as its worst library. The lamest thing in the world is getting a “NoSuchMethodException” because you deploy an executable which puts the wrong version of the right libraries on the classpath… Or alternatively, becaus…
In case anyone’s interested, I have some puppet modules I’ve created.
Almost every other puppet module out there is ubuntu-centric and there are very few geared toward RHEL/CentOS/Fedora. Mine are, though I’ve tried to add structure to allow other dist…
Since GlusterFS is fuse based, it can be mounted as a standard user without too much difficulty.
On a server:
gluster volume set $VOLUME allow-insecure on
On the client as root:
echo user_allow_other >> /etc/fuse.conf
To mount the volume, you…
Frequently I have new users come into #gluster with their first ever GlusterFS volume being a stripe volume. Why? Because they’re sure that’s the right way to get better performance.
That ain’t necessarily so. The stripe translator was designed to allo…
I’ve been using other people’s maven repos for years. Emailing jars around, pushing them into drop boxes, checking out source code just to build binaries, etc. etc. etc…. And this was all AFTER maven existed. Why? Because I never realized HOW EA…
Every once in a while, hadoop goes totally haywire when I play with it in pseudo-distributed mode. Problems include: 1) Data not being replicated to nodes (i.e. you do a namenode format, and the data nodes are now out of sync). 2) No connection ava…
My recent work on High Speed Replication is not the only thing I’ve done to improve GlusterFS performance recently. In addition to that 2x improvement in synchronous/replicated write performance, here are some of the other changes in the pipeline. Patch 3005 is a more reliable way to make sure we use a local copy of […]
In my last post, I promised to talk a bit about some emergent properties of the current replication approach (AFR), and some directions for the future. The biggest issue is latency. If you will recall, there are five basic steps to doing a write (or other modifying operation): lock, increment changelog, write, decrement changelog, unlock. […]
When you’re editing files on the fly, you need a tool like VIM. I use VIM almost exclusively for clojure and python. However, to really be efficient, you can’t rely on the arrow keys -> you need to know the shortcuts. So here are my favorites: Inside…
Okay so… in the last post, I tried to build a non-trivial map/r from scratch, and ran it on my machine. I ran into some issues involving the “glue” that held my map/reduce jobs together. For example, configuring the classes declared…
Today I’m writing a new map/r job, from scratch, trying to minimally copy code from other jobs. The principles behind hadoop are simple : you separately map records into key->value[] arrays, and then you convert those keys to integers, and dis…
Replication is the most necessarily complex part of GlusterFS – even more than distribution, which would probably be the most common guess. It’s also one of the things that sets GlusterFS apart from most of its obvious competitors. Many of them simply require that you implement RAID (to protect against disk failures) and heartbeat/failover (to […]
My Fedora 17 Beefy Miracle alpha1 ARM system does not have any contents in /var/log/messages. This is very impractical for troubleshooting. The command systemd-journalctl --no-tail shows that rsyslog.service fails to start correctly. Bummer! Starting the dae…
A lot of people seem to be curious about how GlusterFS works, not just in the sense of effects but in terms of internal algorithms etc. as well. Here’s an example from this morning. The documentation at this level really is kind of sparse, so I might as well start filling some of the gaps. […]
This is a simple one. I just renewed my love for regular expressions… Sometimes I forget about them and I get lost in a world of if statements. A quick refresher: getting the “FOO” out of FOO_BAR. The regex for this is “.*(?=_)”. But what if…
I’m sitting here at the hotel where FAST’12 was just held, because my flight home isn’t until this evening and I didn’t schedule anything to do. Somewhere along the way I caught a cold, so going out and “seeing the sights” doesn’t appeal to me very much. I might as well write down my thoughts […]