Setting up AFR on two servers with pre-existing data

FIXME (May 5, 2008): This tutorial may not work as written; it is being looked into.

Introduction

NOTE: BEFORE BEGINNING, BE SURE TO HAVE A BACKUP OF ALL YOUR DATA! Always keep a backup handy when working on live or important data. Things that CAN go wrong have a way of going wrong!

In this tutorial we will walk through setting up an AFR cluster using two servers and a single client. It shows how to take pre-existing data on one server and have AFR mirror it to a new, empty directory on the other server. This differs from setting up an AFR config with a brand new empty directory on both servers and then copying any existing data into the new glusterfs volume.

The Scenario

Here we assume we are working with three machines: two servers and one client. Server1 and server2 each have a RAID array mounted under /mnt/raid, and each exports the directory /mnt/raid/web as a glusterfs volume, which the client will mount as /mnt/web. We will be using client-side AFR.

In this case, server1 may have been using NFS to export the /mnt/raid/web directory to multiple web servers, so it might contain 20 GB of data. Server2 is the new machine, so you would have just created an empty directory on it (mkdir /mnt/raid/web).

What we want to do next is migrate this to a glusterfs setup and have the data that exists on server1 get replicated over to the empty /mnt/raid/web directory on server2.

Server1

  • IP address : 192.168.0.1
  • Export volume directory : /mnt/raid/web
  • Pre-existing data : YES

Server2

  • IP address : 192.168.0.2
  • Export volume directory : /mnt/raid/web
  • Pre-existing data : NO

Client

  • IP address : 192.168.0.3
  • Server volume mount point : /mnt/web

Preparing the directories

What we need to do now is set the trusted.glusterfs.version extended file attribute for /mnt/raid/web on server1 to a HIGHER value than on server2. Values are compared as numbers, so 3 is higher than 1. This tells GlusterFS to trust the data on server1 over the empty directory on server2.
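The rule can be sketched as a plain comparison. This only illustrates the behaviour described above and is not GlusterFS code: the function name afr_source is hypothetical, and treating a missing attribute as version 0 is an assumption made for the example.

```shell
# Illustrative only: mimics the "higher version wins" rule described above.
# afr_source is a hypothetical helper, not part of GlusterFS.
# Assumption: a missing attribute (empty string) counts as version 0.
afr_source() {
    v1=${1:-0}   # trusted.glusterfs.version on server1
    v2=${2:-0}   # trusted.glusterfs.version on server2
    if [ "$v1" -gt "$v2" ]; then
        echo server1
    elif [ "$v2" -gt "$v1" ]; then
        echo server2
    else
        echo tie
    fi
}

afr_source 2 ""   # prints "server1": version 2 beats a missing attribute
```

With the values used in this tutorial (2 on server1, nothing on server2), server1 is the side whose data would be kept.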

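On server1, set trusted.glusterfs.version to 2 (the higher value) on the export directory and everything beneath it. Run this as root, since attributes in the trusted.* namespace can only be set by a privileged user: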

$ find /mnt/raid/web -execdir setfattr -n trusted.glusterfs.version -v 2 {} \;

On server2, remove the extended attributes if they happen to exist (setfattr reports an error for an attribute that is not set, which can be ignored here):

$ setfattr -x trusted.glusterfs.version /mnt/raid/web
$ setfattr -x trusted.glusterfs.createtime /mnt/raid/web
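Before going further, it may help to look at what is actually set on each export directory. The sketch below wraps getfattr (from the same attr package as setfattr) in a small helper; the name show_gluster_attrs is hypothetical. It needs to run as root, because trusted.* attributes are hidden from unprivileged users.

```shell
# show_gluster_attrs is a hypothetical helper name.
# Dumps every extended attribute whose name matches trusted.glusterfs
# on the given directory. Run as root: the trusted.* namespace is
# not visible to unprivileged users.
show_gluster_attrs() {
    getfattr --absolute-names -d -m 'trusted.glusterfs' "$1"
}

# Usage (on either server):
#   show_gluster_attrs /mnt/raid/web
```

On server1 you would expect trusted.glusterfs.version to be listed; on server2 the command should print nothing.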

NOTE: PAY ATTENTION to which server you are on. If you are working on live/important data, running these commands on the wrong servers can quickly turn into a complete loss of data. If you set trusted.glusterfs.version to 2 on server2 instead of server1, and remove the extended attributes on server1 instead of server2, then when you mount the volume on the client machine glusterfs will think the EMPTY DIRECTORY is the most trusted version, and will happily delete your PRE-EXISTING data on server1.

BE SURE TO HAVE A BACKUP! This is just good practice!
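One way to reduce the risk of mixing up the servers is a small guard that checks the export directory looks the way you expect before you touch any attributes. This is a minimal sketch, not a GlusterFS tool: check_role is a hypothetical helper, and "source must contain data, target must be empty" is simply the assumption of this tutorial's scenario.

```shell
# check_role is a hypothetical helper, not a GlusterFS tool.
# "source" must already contain data; "target" must be empty.
check_role() {
    role=$1
    dir=$2
    if [ ! -d "$dir" ]; then
        echo "$dir is not a directory" >&2
        return 2
    fi
    contents=$(ls -A "$dir")
    case $role in
        source) [ -n "$contents" ] ;;   # succeeds only if data is present
        target) [ -z "$contents" ] ;;   # succeeds only if directory is empty
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
}

# Usage on server1, before setting the version attribute:
#   check_role source /mnt/raid/web || echo "STOP: wrong server?"
# Usage on server2, before removing the attributes:
#   check_role target /mnt/raid/web || echo "STOP: wrong server?"
```

The check is deliberately crude. It does not replace a backup; it only catches the case where the commands are about to run against the wrong machine.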

Configuring the volume spec files

The following volume spec files are very basic and only provided to illustrate getting AFR to work.

Server volume spec file

Both servers will use the same volume spec file, exporting the /mnt/raid/web directory. This example does not use/include any performance translators. Please refer to GlusterFS Volume Specification Examples for more ideas on how to configure spec files.

Create or edit the server volume specification file (/etc/glusterfs/glusterfs-server.vol) to match the example below.

volume brick
    type storage/posix
    option directory /mnt/raid/web
end-volume

volume server
    type protocol/server
    option transport-type tcp/server
    subvolumes brick
    option auth.ip.brick.allow 192.168.0.* # Allow access to brick
end-volume

Client volume spec file

Create or edit the client volume specification file (/etc/glusterfs/glusterfs-client.vol) to match the example below.

volume brick1
    type protocol/client
    option transport-type tcp/client # for TCP/IP transport
    option remote-host 192.168.0.1   # IP address of server1
    option remote-subvolume brick    # name of the remote volume on server1
end-volume

volume brick2
    type protocol/client
    option transport-type tcp/client # for TCP/IP transport
    option remote-host 192.168.0.2   # IP address of server2
    option remote-subvolume brick    # name of the remote volume on server2
end-volume

volume afr
    type cluster/afr
    subvolumes brick1 brick2
end-volume
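A missing end-volume line is an easy mistake when editing these files by hand. The helper below is a rough sanity check that counts opening and closing lines in a spec file. The name check_vol_file is hypothetical, and the check only counts lines; it does not parse the file the way glusterfs does.

```shell
# check_vol_file is a hypothetical helper; it only counts lines and
# does not validate options or translator names.
check_vol_file() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    # Lines that open a block: "volume <name>" (not "subvolumes ...").
    opens=$(grep -c '^[[:space:]]*volume[[:space:]]' "$1" || true)
    # Lines that close a block: "end-volume".
    closes=$(grep -c '^[[:space:]]*end-volume' "$1" || true)
    echo "volume: $opens, end-volume: $closes"
    [ "$opens" -gt 0 ] && [ "$opens" -eq "$closes" ]
}

# Usage:
#   check_vol_file /etc/glusterfs/glusterfs-server.vol
#   check_vol_file /etc/glusterfs/glusterfs-client.vol
```

For the example files in this tutorial, the server spec should report two blocks and the client spec three.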

Start the GlusterFS servers

Start the GlusterFS server (glusterfsd) on both servers, then tail the log to make sure there were no errors.

NOTE: The location of your log files may be different if you specified a --prefix during installation.

$ glusterfsd -f /etc/glusterfs/glusterfs-server.vol
$ tail /var/log/glusterfsd.log
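Reading the tail by eye works for a short log. For a longer one, a simple filter can help. The sketch below assumes that problems show up as lines containing "error" or "failed"; the real log format may differ, so treat a clean result as a hint rather than proof. The name scan_log is hypothetical.

```shell
# scan_log is a hypothetical helper. Assumption: problems show up as
# lines containing "error" or "failed" (case-insensitive).
scan_log() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 2
    fi
    # Print matching lines with their line numbers.
    if grep -inE 'error|failed' "$1"; then
        return 1    # at least one suspicious line was printed
    fi
    return 0        # nothing matched
}

# Usage on each server:
#   scan_log /var/log/glusterfsd.log && echo "log looks clean"
```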

Mount the server volume

Now mount the server volume on the client machine. You will then need to use the find command to initiate AFR replication to server2.

This may take a while, depending on how much pre-existing data you have, as the client needs to read all the files and directories and copy them to server2 via AFR.

$ glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/web
$ cd /mnt/web
$ find /mnt/web -type f -exec head -n 1 {} \; >/dev/null
$ ls -l

You should see all the pre-existing data from server1. Now let's check the servers.
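If the listing on the client is empty, one possible cause is that the mount did not succeed and you are looking at the plain local directory. A minimal check, assuming a Linux client where /proc/mounts lists active mounts, is sketched below. The name is_mounted is hypothetical, and the optional second argument exists only so the function can be pointed at another mounts table.

```shell
# is_mounted is a hypothetical helper. It looks for the mount point in
# the second column of a mounts table (default: /proc/mounts on Linux).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Usage on the client:
#   is_mounted /mnt/web && echo "volume is mounted"
```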

Verify AFR replication

Now go to server2 and take a look at your /mnt/raid/web directory. You should now see all your data on server2!

$ cd /mnt/raid/web
$ ls -al
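A directory listing shows that files exist but not that their contents match. For a stronger check, you could compute one fingerprint per server and compare the two by hand. The sketch below uses only find, sort and cksum; tree_fingerprint is a hypothetical helper, and it assumes the export directories contain nothing but your own regular files.

```shell
# tree_fingerprint is a hypothetical helper. It checksums every regular
# file under a directory, sorts the list by path, and checksums that
# list, giving one line that can be compared between the two servers.
tree_fingerprint() {
    (
        cd "$1" || exit 1
        find . -type f -exec cksum {} + | LC_ALL=C sort -k3 | cksum
    )
}

# Usage, on server1 and then on server2:
#   tree_fingerprint /mnt/raid/web
```

If both servers print the same line, the file names, sizes and contents agree. Reading 20 GB of data this way takes time, so it is best run once replication has finished.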

Conclusion

You should now have an operational AFR setup, with data mirrored across two servers.

Things to keep in mind / gotchas

Refer to AFR (Automatic File Replication) - Things to keep in mind and gotchas

Credits

  • Author : Brandon Lamb
  • Email  : brandonlamb@gmail.com
  • GlusterFS rules!