Replacing a GlusterFS Server: Best Practice

Gluster

2012-10-17

Last month, I received two new servers to replace two of our three (replica 3) GlusterFS servers. My first inclination was to just take the old server down, move its hard drives into the new server, reinstall the OS (moving from 32-bit to 64-bit), and voilà, done deal. That probably would have been okay if I hadn’t used a kickstart file that formatted all the drives. Oops. Since the drives were now blank, I decided to just put the new server in place, using the same gfid, and let self-heal copy everything back over.

This idea sucked. I have 15 volumes, each with 4 bricks per server. Self-healing 60 bricks brought the remaining 32-bit server to its knees (and I filed multiple bugs against 3.3.0, including that the self-heal load doesn’t balance between sane servers). After a day of having everyone in the company mad at me (luckily I don’t have that much data), the heal completed and I was a bit wiser.

Today I installed the other new server. I installed CentOS 6.3, created the LVs (I use LVM to partition the disks, which makes resizing volumes easier should the need arise and lets me take snapshots before I make any major changes), and added one new hard drive. (My drives aren’t that old. No need to replace them all.)
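As a rough illustration of that LVM layout, here is a minimal sketch of carving out one LV per brick and taking a snapshot before a risky change. The volume group `vg_gluster`, the LV name `home_a`, the sizes, and the choice of XFS are all assumptions made for the example, not details from this setup. By default it only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: one logical volume per brick, mounted at the brick path.
# vg_gluster, home_a, the sizes, and XFS are hypothetical choices.
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create, format, and mount an LV to be used as a brick.
make_brick() {
    vg=$1 lv=$2 size=$3 mountpoint=$4
    run lvcreate -L "$size" -n "$lv" "$vg"
    run mkfs.xfs -i size=512 "/dev/$vg/$lv"
    run mkdir -p "$mountpoint"
    run mount "/dev/$vg/$lv" "$mountpoint"
}

# Take an LVM snapshot of a brick LV before a major change.
snapshot_brick() {
    vg=$1 lv=$2 size=$3
    run lvcreate -s -L "$size" -n "${lv}_snap" "/dev/$vg/$lv"
}

make_brick vg_gluster home_a 100G /data/glusterfs/home/a
snapshot_brick vg_gluster home_a 10G
```

Keeping each brick on its own LV is what makes the later resize and snapshot operations independent of the other bricks on the same disks.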

I then added the new server to the trusted pool and used replace-brick to migrate one brick at a time to the new server. I also changed my brick placement to fit our newer best practices.

volname=home    # assumption: the volume this brick belongs to; substitute your own
oldserver=ewcs4
newserver=ewcs10
oldbrickpath=/var/spool/glusterfs/a_home
newbrickpath=/data/glusterfs/home/a
gluster peer probe $newserver
gluster volume replace-brick ${volname} ${oldserver}:${oldbrickpath} ${newserver}:${newbrickpath} start

I monitored the migration.

watch gluster volume replace-brick ${volname} ${oldserver}:${oldbrickpath} ${newserver}:${newbrickpath} status
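For anyone who would rather script the wait than sit on `watch`, a polling loop along these lines could work. It is only a sketch: it assumes the status output contains the phrase "Migration complete" once the brick has finished, which is worth checking against the wording your release prints. `GLUSTER` and `POLL_SECONDS` are hypothetical knobs added for the example so the function can be tried against a stub.

```shell
#!/bin/sh
# Sketch: block until a replace-brick migration reports completion.
# Assumption: the status output contains "Migration complete" when done.
GLUSTER=${GLUSTER:-gluster}          # command to call; overridable for testing
POLL_SECONDS=${POLL_SECONDS:-30}     # delay between status checks

wait_for_migration() {
    volname=$1 oldbrick=$2 newbrick=$3
    while :; do
        status=$($GLUSTER volume replace-brick "$volname" "$oldbrick" "$newbrick" status 2>&1)
        echo "$status"
        case $status in
            *"Migration complete"*) return 0 ;;
        esac
        sleep "$POLL_SECONDS"
    done
}

# Usage, with the variables from the commands above:
# wait_for_migration "$volname" "$oldserver:$oldbrickpath" "$newserver:$newbrickpath"
```

The loop never gives up on its own, so a stalled migration still needs a human to notice it.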

Then I committed the change after all the files had finished moving.

gluster volume replace-brick ${volname} ${oldserver}:${oldbrickpath} ${newserver}:${newbrickpath} commit

Repeat as necessary.
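With 15 volumes it is easy to lose track of which bricks still live on the old server. A small filter over `gluster volume info` can list them. This sketch assumes the output has `Volume Name: <name>` lines followed by `BrickN: <server>:<path>` lines; the function name `bricks_on_server` is made up for the example.

```shell
#!/bin/sh
# Sketch: print "volume brickpath" for every brick still on a given server.
# Assumes `gluster volume info` prints "Volume Name: <name>" and
# "BrickN: <server>:<path>" lines.
bricks_on_server() {
    server=$1
    awk -v server="$server" '
        /^Volume Name:/ { vol = $3 }
        /^Brick[0-9]+:/ {
            split($2, parts, ":")       # parts[1] = server, parts[2] = path
            if (parts[1] == server) print vol, parts[2]
        }
    '
}

# Usage: gluster volume info | bricks_on_server ewcs4
```

When the filter prints nothing for the old server, every brick has been moved off it.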

As for performance, it met my requirements: nobody called or emailed me to say that anything wasn’t working or was too slow. My VMs continued without interruption, as did MySQL, both hosted on their own volumes. As long as nobody noticed, I’m happy.

Tags
glusterfs
