Setting up AFR on two servers with pre-existing data
FIXME: NOTE: May 5, 2008 - This tutorial may not work, looking into it for suggestions
Introduction
NOTE: Before beginning, BE SURE TO HAVE A BACKUP OF ALL YOUR DATA! Always keep a backup handy when you are working on live/important data. Things that CAN go wrong have a way of going wrong!
In this tutorial we will walk through setting up an AFR (Automatic File Replication) cluster using two servers and a single client. The tutorial shows how to take pre-existing data on one server and have AFR mirror it to a new, empty directory on the other server. This differs from setting up AFR with a brand new empty directory on both servers and then copying any existing data into the new glusterfs volume.
The Scenario
Here we assume we are working with three machines: two servers and one client. Server1 and server2 both have RAID arrays mounted under /mnt/raid. Each exports a glusterfs volume directory named web (/mnt/raid/web), which the client will mount as /mnt/web. We will be using client-side AFR.
In this case, server1 may have been using NFS to export the /mnt/raid/web directory to multiple web servers, so it might contain 20 GB of data. This being a new setup, you would have just created an empty directory on server2 (mkdir /mnt/raid/web).
What we want to do next is migrate this to a glusterfs setup and have the data that exists on server1 get replicated over to the empty /mnt/raid/web directory on server2.
Server1
- IP address : 192.168.0.1
- Export volume directory : /mnt/raid/web
- Pre-existing data : YES
Server2
- IP address : 192.168.0.2
- Export volume directory : /mnt/raid/web
- Pre-existing data : NO
Client
- IP address : 192.168.0.3
- Server volume mount point : /mnt/web
Preparing the directories
What we need to do now is set the trusted.glusterfs.version extended file attribute for /mnt/raid/web on server1 to a HIGHER value than on server2. The values are compared as numbers, so 3 is a higher value than 1. This tells GlusterFS that we want to trust the data on server1 over the empty directory on server2. Run these commands as root, since ordinary users cannot modify trusted.* attributes.
On server1, set trusted.glusterfs.version to a value of 2 (higher) on the export directory and everything beneath it:
$ find /mnt/raid/web -execdir setfattr -n trusted.glusterfs.version -v 2 {} \;
On server2, remove the extended attributes if they happen to exist:
$ setfattr -x trusted.glusterfs.version /mnt/raid/web
$ setfattr -x trusted.glusterfs.createtime /mnt/raid/web
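Before mounting anything, it can be worth confirming that the attributes ended up the way the steps above intend. The following is a minimal sketch, assuming the getfattr tool (normally shipped in the same attr package as setfattr) is installed and that you run it as root, since the trusted.* namespace is not readable by ordinary users. The function name show_gluster_attrs is made up for this example.

```shell
# Hypothetical helper (not part of GlusterFS): dump any
# trusted.glusterfs.* extended attributes set on a path.
show_gluster_attrs() {
    path=$1
    # Refuse to continue if the path does not exist.
    [ -e "$path" ] || { echo "no such path: $path" >&2; return 1; }
    # Bail out early if getfattr is not available on this machine.
    command -v getfattr >/dev/null 2>&1 || { echo "getfattr not installed" >&2; return 127; }
    # -d dumps values, -m limits the dump to attribute names matching
    # the pattern, -e text prints values as text instead of hex/base64.
    getfattr -d -m 'trusted\.glusterfs\.' -e text "$path"
}

# Usage (as root), on each server:
#   show_gluster_attrs /mnt/raid/web
```

On server1 the output should list trusted.glusterfs.version with the value 2. On server2 no trusted.glusterfs attributes should be listed.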
NOTE: PAY ATTENTION to which server you are on. If you are working on live/important data this can quickly turn into a complete loss of data if you reverse which servers you run these commands on. If you were to set the trusted.glusterfs.version to 2 on server2 instead of server1, and then removed the extended attributes on server1 instead of server2, when you mount the volume on the client machine glusterfs will think the EMPTY DIRECTORY is the most trusted version, and will happily delete your PRE-EXISTING data on server1.
BE SURE TO HAVE A BACKUP! This is just good practice!
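One way to lower the risk of running the commands on the wrong server is to check what the export directory looks like before touching its attributes. This is only a sketch under the assumptions of this tutorial: the "source" (server1) directory already contains data and the "target" (server2) directory is empty. The helper name check_role and the role names are invented for this example and are not part of GlusterFS.

```shell
# Hypothetical guard: succeed only if the directory's contents match the
# role you believe this server has.
#   source -> directory must contain at least one entry (pre-existing data)
#   target -> directory must exist and be empty
check_role() {
    dir=$1
    role=$2
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 2; }
    # ls -A lists every entry except . and .., including dotfiles.
    if [ -n "$(ls -A "$dir")" ]; then
        state=nonempty
    else
        state=empty
    fi
    case $role in
        source) [ "$state" = nonempty ] ;;
        target) [ "$state" = empty ] ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
}

# Usage on server1:
#   check_role /mnt/raid/web source && echo "looks like server1, OK to set the version"
# Usage on server2:
#   check_role /mnt/raid/web target && echo "looks like server2, OK to remove the attributes"
```

The check is deliberately simple. It cannot tell which machine you are logged into, only whether the directory is empty or not, so it complements a backup rather than replacing one.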
Configuring the volume spec files
The following volume spec files are very basic and only provided to illustrate getting AFR to work.
Server volume spec file
Both servers will use the same volume spec file, exporting the /mnt/raid/web directory. This example does not use/include any performance translators. Please refer to GlusterFS Volume Specification Examples for more ideas on how to configure spec files.
Create or edit the server volume specification file (/etc/glusterfs/glusterfs-server.vol) to match the example below.
volume brick
  type storage/posix
  option directory /mnt/raid/web
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes brick
  option auth.ip.brick.allow 192.168.0.* # Allow access to brick
end-volume
Client volume spec file
Create or edit the client volume specification file (/etc/glusterfs/glusterfs-client.vol) to match the example below.
volume brick1
  type protocol/client
  option transport-type tcp/client # for TCP/IP transport
  option remote-host 192.168.0.1 # IP address of server1
  option remote-subvolume brick # name of the remote volume on server1
end-volume

volume brick2
  type protocol/client
  option transport-type tcp/client # for TCP/IP transport
  option remote-host 192.168.0.2 # IP address of server2
  option remote-subvolume brick # name of the remote volume on server2
end-volume

volume afr
  type cluster/afr
  subvolumes brick1 brick2
end-volume
Start the GlusterFS servers
Start the GlusterFS server (glusterfsd) on both servers. Then tail the log on each one to make sure there were no errors.
NOTE: The location of your log files may be different if you specified a --prefix during installation.
$ glusterfsd -f /etc/glusterfs/glusterfs-server.vol
$ tail /var/log/glusterfsd.log
Mount the server volume
Now, on the client machine, mount the server volume. You will then need to use the find command to initiate AFR replication to server2: the command below reads the first line of every file through the mount point, which makes the client access each file and copy it to server2.
This may take a while depending on how much pre-existing data you had, as the client will need to read all the files/directories and copy them to server2 via AFR.
$ glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/web
$ cd /mnt/web
$ find /mnt/web -type f -exec head -n 1 {} \; >/dev/null
$ ls -l
You should see all the pre-existing data that was already on server1. Now let's check the servers.
Verify AFR replication
Now go to server2 and take a look at your /mnt/raid/web directory. You should see all your data on server2 now!
$ cd /mnt/raid/web
$ ls -al
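Eyeballing a directory listing only goes so far with a large tree. As a rough sketch of a more thorough check, the function below prints a sorted manifest (checksum, size and relative path) of every regular file under a directory, so that the manifests from the two servers can be compared with diff. The function name tree_manifest and the /tmp file names in the usage comments are illustrative choices, not anything GlusterFS provides.

```shell
# Hypothetical helper: print "checksum size ./relative/path" for every
# regular file under the given directory, sorted by path.
tree_manifest() {
    (
        cd "$1" || exit 1
        # cksum prints: CRC, byte count, file name.
        # LC_ALL=C keeps the sort order identical on both servers.
        find . -type f -exec cksum {} + | LC_ALL=C sort -k 3
    )
}

# Usage:
#   on server1:  tree_manifest /mnt/raid/web > /tmp/server1.manifest
#   on server2:  tree_manifest /mnt/raid/web > /tmp/server2.manifest
#   then copy one file to the other machine and run:
#   diff /tmp/server1.manifest /tmp/server2.manifest
```

An empty diff suggests the file contents match. The manifest covers regular files only, so empty directories, ownership, permissions and extended attributes are not compared.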
Conclusion
You should now have a working AFR setup, with two servers mirroring the same data.
Things to keep in mind / gotchas
Refer to AFR (Automatic File Replication) - Things to keep in mind and gotchas
Credits
- Author : Brandon Lamb
- Email : brandonlamb@gmail.com
- GlusterFS rules!

