AFR (Automatic File Replication) - Things to keep in mind and gotchas

From GlusterDocumentation

Applies to server side

  • Using AFR on the server side means the servers talk directly to each other.
  • Each client connects to only one server. To spread clients across servers you need some form of load balancing, such as round-robin DNS or LVS, or you must manually specify a server for each client.
  • You could configure all of your clients to connect to a single server, or manually spread them out between available servers.
  • When you create a new file or do a write on a client machine, the client sends the write to the server it is configured to connect to; that server then sends the write to the other server.
  • Client-side AFR is recommended over server-side AFR because it provides high availability if a server goes down. Since the clients know about all of the servers, they keep working and write to the server(s) that are still up.
  • If you have client1 connected to server1 and client2 connected to server2, and then server2 goes down, so does client2. The cluster also becomes unavailable.
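
The server-side failure case can be modelled in a few lines. This is a minimal Python sketch, not GlusterFS code: the function names `assign_round_robin` and `reachable_clients` and the client and server labels are made up for illustration, and the model assumes each client is pinned to exactly one server as described in the list above.

```python
from itertools import cycle


def assign_round_robin(clients, servers):
    """Pin each client to exactly one server, cycling through the servers."""
    server_cycle = cycle(servers)
    return {client: next(server_cycle) for client in clients}


def reachable_clients(client_to_server, servers_up):
    """Return the clients whose single configured server is still up."""
    return sorted(
        client
        for client, server in client_to_server.items()
        if server in servers_up
    )


# Hypothetical two-client, two-server layout matching the example above.
mapping = assign_round_robin(["client1", "client2"], ["server1", "server2"])
print(mapping)

# server2 goes down: only the client pinned to server1 keeps working.
print(reachable_clients(mapping, servers_up={"server1"}))
```

Spreading clients out limits how many of them a single server failure takes down, but in this model it never lets a client survive the loss of its own server.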

Applies to client side

  • Using AFR on the client side means the servers do not know about each other. Each of the client machines handles the file replication when a file is opened. You should make sure that the clients are all using the same client volume spec file.
  • When you create a new file or do a write on a client machine, the client must send the write to both servers. The servers do not talk to each other.
  • Client-side AFR is recommended over server-side AFR because it provides high availability if a server goes down. Since the clients know about all of the servers, they keep working and write to the server(s) that are still up.
  • If a server goes down, the client machines still work as long as they can talk to the other server(s). In a server-side setup the cluster becomes unavailable.
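
The client-side write path can be sketched the same way. This is a toy Python model under stated assumptions, not the AFR implementation: `replicated_write` is a hypothetical name, each server's storage is stood in for by a plain dict, and the only behaviour modelled is the one described above, namely that the client itself sends the write to every server it can reach.

```python
def replicated_write(path, data, servers, servers_up):
    """Write data to every server that is up; return the names written to.

    `servers` maps a server name to a dict standing in for its storage.
    """
    written = []
    for name, storage in servers.items():
        if name in servers_up:
            storage[path] = data
            written.append(name)
    if not written:
        # No replica could be reached, so the write cannot succeed.
        raise OSError("no server reachable, write failed")
    return written


# Hypothetical two-server setup.
servers = {"server1": {}, "server2": {}}

# Both servers up: the client sends the write to both.
print(replicated_write("/a.txt", b"hello", servers, {"server1", "server2"}))

# server2 down: the client still works and writes to server1 only.
print(replicated_write("/b.txt", b"world", servers, {"server1"}))
```

In the second call the file exists on only one server, which is why a server outage can leave you with fewer copies than the number of servers suggests.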

Applies to both

  • The "option replicate *:3" setting is no longer part of AFR. The number of copies is determined by the number of servers. If you are doing AFR with 2 servers, you will have 2 copies. If you add a third server, you will have 3 copies of your data. Changing this behaviour requires a more involved volume spec file configuration.
  • Make sure that the underlying server filesystems have equal available disk space. If one runs out, files will still be written to the other server(s), but you may end up with fewer copies of your data than you think you have.
  • When doing a "df -h" on a client, the available disk space displayed is that of the first AFR subvolume defined in the spec file. So if you have two servers with 50 GB and 100 GB of free disk space, and the server with 100 GB is listed first, you will see 100 GB available even though one server only has 50 GB free.
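
The copy count and the "df -h" behaviour described above can be illustrated with a small Python sketch. It only encodes the rules stated in this list and does not query GlusterFS: the function names and the `(name, free_gb)` tuples are hypothetical, and the subvolume order is assumed to match the order in the spec file.

```python
def copies_per_file(subvolumes):
    """One copy per AFR subvolume, as described above."""
    return len(subvolumes)


def reported_available_gb(subvolumes):
    """Free space as a client would see it: that of the first subvolume."""
    return subvolumes[0][1]


def fully_replicated_available_gb(subvolumes):
    """Space that can still be written while keeping a copy on every server."""
    return min(free_gb for _, free_gb in subvolumes)


# Hypothetical layout: the server with 100 GB free is listed first.
subvolumes = [("server_a", 100), ("server_b", 50)]

print(copies_per_file(subvolumes))
print(reported_available_gb(subvolumes))
print(fully_replicated_available_gb(subvolumes))
```

Comparing the last two values shows how far the figure a client displays can overstate the space that is still fully replicated, which is the reason for keeping the underlying filesystems the same size.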