Over the weekend – and, obviously, a little bit today – I’ve been working on one of those projects that has been baking in the back of my mind for a long time. I always have a bunch of these queued up. Sometimes I use them as warmups or breaks when I’m feeling a bit […]
DRAFT! (2012-08-09) Purpose: This document is intended to give you hands-on experience with Gluster by guiding you, step by step, through setting it up for the first time. If you are looking to get right into things, you can take a look at our quick start guide. After you deploy Gluster by following …
A GlusterFS user from IRC asked me about my puppet management of KVM in RHEL/CentOS and how it works. I started to write this post two weeks ago and had to stop because although it works great, I figured that wasn’t the answer he was looking for. I loo…
One of the key points I tried to make in my Red Hat Summit talk about GlusterFS last month is that GlusterFS quite deliberately does not trade away data safety or consistency for performance. That’s a painful choice, because everyone always wants to be the speed king and they’ll be sharply critical of anyone they […]
Scalable code is not enough… you need to know how your big data platform works on the inside. This post is about how to switch paradigms. It assumes that you’ve seen one or two of the thousands of big-data sales pitch videos. Maybe you’ve even …
The thoughtful bodepd has been kind enough to help me get my puppet-gluster module off the ground and publicized a bit too. My first few commits have been all cleanup to get my initial hacking up to snuff with …
Apparently, someone in Hadoop-land is getting worried about alternatives to HDFS, and has decided to address that fear via social media instead of code. Two days ago we had Daniel Abadi casting aspersions on Hadoop adapters. Today we have Charles Zedlewski explaining why Cloudera uses HDFS. He mentions a recent GigaOm article listing eight alternatives, […]
It seems that the maintainer of the wireshark package in Fedora has updated to version 1.8.1 in the current Fedora Rawhide, which will become Fedora 18. The schedule tells us that Fedora 18 is planned to be released on 2012-11-06 (the latest schedule m…
I am an avid cobbler+puppet user. This allows me to rely on my cobbler server and puppet manifests to describe how servers/workstations are set up. I only back up my configs and data, and I regenerate failed machines PRN. I’ll be publishing …
(This was originally posted on our Q&A site at community.gluster.org) Problem: VERY slow performance when using ‘bedtools’ and other apps that write zillions of small output chunks. If this was a self-written app or an infrequently used one, I wouldn’t bother writing this up, but ‘bedtools’ is a fairly popular genomics app and since many …
Many thanks to johnmark in #gluster for syndicating my “gluster” tagged blog posts on http://www.gluster.org/blog/ I aim to keep these posts technical and informative, aimed mostly at other sysadmins and gluster users. Please don’t be shy to comment on my …
(This is a guest post from Red Hat engineering manager, Vidya Sakar, originally at The Fifth Elephant blog) In this digital universe where data is growing at a fast pace, the infrastructure to store, manage and retrieve data is of paramount importance. Just about everyone in this universe is generating data at a pace never …
Daniel Abadi described his blog entry about Hadoop connectors as a “Stonebraker-style rant” and then delivered on the threat. Like everything Stonebraker has written in the last five years, it’s based on a fundamentally flawed premise, which is that HDFS stores unstructured data. This assumption is not clearly stated, but it’s pretty clear from context, […]
When I first read about reduce-side joins in Hadoop, I spent some time walking through a bunch of examples from this whitepaper by Jairam Chandar on Hadoop join algorithms. In the beginning, everything seemed simple enough – because I was focusing on jo…
Using RAW Devices In VirtualBox VMs
Usually, VirtualBox creates its virtual machines in disk images (.vdi, .vmdk, etc.). This tutorial explains how you can use RAW devices from the host (partitions, LVM volumes, etc.) and create a VirtualBox VM in …
I’ve been having some strange networking issues with gluster. “Eco__” from #gluster suggested I try an up-to-date Intel NIC driver. Here are the steps I followed to make that happen. No news yet on whether that solved the …
For the last two or so years I’ve played with and tested gluster on and off, and hung out in the awesome #gluster channel on Freenode. In case you haven’t heard, gluster was acquired by Red Hat back in October 2011. This post …
With the addition of automated self-heal in GlusterFS 3.3, a new hidden directory structure was added to each brick: “.glusterfs”. This complicates split-brain resolution as you now not only have to remove the “bad” file from the brick, but its counte…
The new hi1.4xlarge instances in EC2 are pretty exciting, not only because they’re equipped with SSDs but because they’re also equipped with 10GbE and placement groups allow you to create server clusters that are closely colocated with full bandwidth among them. I was about ready to do another round of GlusterFS testing to see the […]