Last month, I received two new servers to replace two of our three (replica 3) GlusterFS servers. My first inclination was to just down the server, move the hard drives into the new server, re-install the OS (moving from 32 bit to 64 bit), and voila, d…
One of the most common knocks against GlusterFS is that it eats too many CPU cycles. For example: “Really not liking GlusterFS now. Performance is quite poor and CPU usage way too high for what it does.” I’ve talked about performance issues and expectations many times. With regard to CPU usage, I can’t resist saying […]
A little while back, I tested out the Unified File and Object feature in Gluster 3.3, which taps OpenStack’s Swift component to handle the object half of the file and object combo. It took me kind of a long time to get it all running, so I was pleased to find this blog post promising a […]
This has come up several times in the last week. “I have 2n servers with 2 or 4 bricks each and I want to add 1 more server. How do I ensure the new server isn’t a replica of itself?”
This isn’t a simple thing to do. When you add bricks, replicas are a…
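The ordering problem can be sketched in a few lines. Gluster forms replica sets from consecutive bricks in the order they are listed, so the listing order alone decides which bricks mirror each other. The server and brick names below are illustrative, and the grouping function is a simplified model of that rule, not Gluster code:

```python
def replica_sets(bricks, replica):
    """Group an ordered brick list into replica sets of size `replica`,
    mimicking how Gluster pairs consecutive bricks on the command line."""
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

# Naive ordering: both of the new server's bricks fall into the same set,
# so s3 would replicate to itself.
bad = replica_sets(
    ["s1:/b1", "s2:/b1", "s1:/b2", "s2:/b2", "s3:/b1", "s3:/b2"], 2)

# Interleaved ordering keeps every replica set on two different servers.
good = replica_sets(
    ["s1:/b1", "s2:/b1", "s2:/b2", "s3:/b1", "s3:/b2", "s1:/b2"], 2)
```

With the naive list, the last set is `["s3:/b1", "s3:/b2"]` — the new server mirroring itself — while the interleaved list spreads each pair across two machines.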
This concept is thrown around a lot. People frequently say that “GlusterFS is slow with small files”, or “how can I increase small file performance” without really understanding what they mean by “small files” or even “slow”.
“Small files” is sort of a…
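A back-of-the-envelope model shows why small files read as “slow”: with a fixed per-file cost (lookups, round trips), tiny files spend nearly all their time on overhead rather than moving bytes. The numbers below are illustrative assumptions, not measurements of any particular deployment:

```python
def effective_mbps(file_kb, per_file_ms, wire_mbps):
    """Observed throughput (MB/s) for files of `file_kb` KB, given a fixed
    per-file overhead in milliseconds and raw wire bandwidth in MB/s."""
    file_mb = file_kb / 1024.0
    transfer_s = file_mb / wire_mbps               # time spent moving bytes
    total_s = per_file_ms / 1000.0 + transfer_s    # plus per-file overhead
    return file_mb / total_s

# 4 KB files with 5 ms of per-file overhead on a 100 MB/s link:
small = effective_mbps(4, 5, 100)         # under 1 MB/s
# 100 MB files with the same overhead barely notice it:
big = effective_mbps(100 * 1024, 5, 100)  # close to 100 MB/s
```

The wire speed barely matters for the small case; the per-file overhead dominates, which is why “faster disks” rarely fix a small-file workload.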
Version 3.3 introduced a new structure to the bricks, the .glusterfs directory. So what is it?
The GFID
As you’re probably aware, GlusterFS stores metadata info in extended attributes. One of these bits of metadata is the “trusted.gfid”. This is, for a…
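The .glusterfs directory indexes every file on the brick by that GFID: each file gets a hard link at `.glusterfs/<first 2 hex chars>/<next 2>/<full gfid>`. A small sketch of that mapping (the GFID value here is made up for illustration):

```python
def gfid_backend_path(gfid):
    """Map a file's GFID (a UUID string) to its hard link under the
    brick's .glusterfs directory: .glusterfs/ab/cd/abcd...-gfid."""
    g = gfid.lower()
    return ".glusterfs/{}/{}/{}".format(g[0:2], g[2:4], g)

gfid_backend_path("6de68d46-a0cf-4a2c-9b8e-3f5a2a3e1d10")
# → ".glusterfs/6d/e6/6de68d46-a0cf-4a2c-9b8e-3f5a2a3e1d10"
```

The two-level fan-out keeps any single directory from accumulating millions of entries, and the hard link lets Gluster resolve a GFID to file contents without knowing the file's path.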
This post is about parsimonious planning, back-of-the-envelope engineering, and extreme productivity. By back of the envelope, we don’t mean flow charting. Rather, we mean real world consideration of the quantitative aspects of all of the major…
One of the cooler new features in oVirt 3.1 is the platform’s support for creating and managing Gluster volumes. oVirt’s web admin console now includes a graphical tool for configuring these volumes, and vdsm, the service responsible for controlling oVirt’s virtualization nodes, has a new sibling, vdsm-gluster, for handling the back end work. Gluster and […]
GlusterFS spreads load using a distribute hash translation (DHT) of filenames to its subvolumes. Those subvolumes are usually replicated to provide fault tolerance as well as some load handling. The advanced file replication translator (AFR) departs f…
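The placement idea can be sketched simply: each subvolume owns a slice of a 32-bit hash space, and a file lands on whichever subvolume’s range contains the hash of its name. Gluster uses its own hash function and per-directory layout ranges; the md5-based hash and equal-width ranges below are stand-ins for illustration:

```python
import hashlib

def dht_subvolume(name, n_subvols):
    """Simplified DHT-style placement: hash the filename into 32 bits and
    return the index of the subvolume whose equal-width range contains it."""
    h = int(hashlib.md5(name.encode()).hexdigest()[:8], 16)  # 32-bit hash
    width = (2 ** 32) // n_subvols          # size of each subvolume's range
    return min(h // width, n_subvols - 1)   # clamp the final partial range

# Placement depends only on the name, so renaming a file can move it
# to a different subvolume:
dht_subvolume("report.txt", 4)
dht_subvolume("report-final.txt", 4)
```

Because the hash is taken over the filename alone, lookups need no central index, but it also means a rename can change where a file “belongs” — one reason DHT keeps link-to pointers on the old subvolume.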
Back at the end of 2003 I was toying with Gnutella’s distributed peer-to-peer network and wanted to host a peer lookup server myself. I tried several cache servers but most of them didn’t support the newer protocol and would frequently fail. I wrote my…
I’ve been working on a puppet module for gluster. Both this, my puppet-gfs2 module, and other puppet clustering modules all share a common problem: How does one make sure that only certain operations happen on one node at a time? … Continue reading →
On Sunday, March 18th, Fan Yong committed a patch against ext4 to “return 32/64-bit dir name hash according to usage type”. Prior to that, ext2/3/4 would return a 32-bit hash value from telldir()/seekdir() as NFSv2 wasn’t designed to accommodate anything…
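The idea behind the patch can be sketched as follows. ext4 identifies a directory position by a pair of 32-bit hashes (major, minor); a 64-bit-capable caller can take both halves packed together, while a 32-bit caller such as an NFSv2 server gets only the major half. This is a simplified illustration of that cookie shape, not kernel code:

```python
def dir_cookie(major32, minor32, caller_is_64bit):
    """Return a directory-seek cookie from a (major, minor) 32-bit hash
    pair: both halves packed into 64 bits for capable callers, or just
    the major half for 32-bit-only callers (e.g. NFSv2)."""
    if caller_is_64bit:
        return (major32 << 32) | minor32
    return major32
```

Returning only the major hash to 32-bit callers raises the chance of cookie collisions in huge directories, which is exactly the trade-off the “according to usage type” patch lets the kernel make per caller.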
I’m trying to use OpenStack for my 2 VM hosts. I think that this will puppetize better than how I’m doing it now. Primarily, I think OpenStack will offer more flexibility when I need to schedule hardware maintenance as it will handle which compute node…
Over the weekend – and, obviously, a little bit today – I’ve been working on one of those projects that has been baking in the back of my mind for a long time. I always have a bunch of these queued up. Sometimes I use them as warmups or breaks when I’m feeling a bit […]
A GlusterFS user from IRC asked me about my puppet management of KVM in RHEL/CentOS and how it works. I started to write this post two weeks ago and had to stop because although it works great, I figured that wasn’t the answer he was looking for. I loo…
One of the key points I tried to make in my Red Hat Summit talk about GlusterFS last month is that GlusterFS quite deliberately does not trade away data safety or consistency for performance. That’s a painful choice, because everyone always wants to be the speed king and they’ll be sharply critical of anyone they […]
Scalable code is not enough… you need to know how your big data platform works on the inside. This post is about how to switch paradigms. It assumes that you’ve seen one or two of the thousands of big-data sales pitch videos. Maybe you’ve even …
The thoughtful bodepd has been kind enough to help me get my puppet-gluster module off the ground and publicized a bit too. My first few commits have been all clean up to get my initial hacking up to snuff with … Continue reading →
Apparently, someone in Hadoop-land is getting worried about alternatives to HDFS, and has decided to address that fear via social media instead of code. Two days ago we had Daniel Abadi casting aspersions on Hadoop adapters. Today we have Charles Zedlewski explaining why Cloudera uses HDFS. He mentions a recent GigaOm article listing eight alternatives, […]
It seems that the maintainer of the wireshark package in Fedora has updated to version 1.8.1 in the current Fedora Rawhide, which will become Fedora 18. The schedule tells us that Fedora 18 is planned to be released on 2012-11-06 (the latest schedule m…