[Gluster-users] Server Side AFR and RoundRobin DNS

Benjamin Long Benjamin.Long at longbros.com
Wed Oct 7 14:03:39 UTC 2009


On Monday 05 October 2009 06:06:07 pm Benjamin Long wrote:
> Greetings All,
> 
> 	I've been fighting with a very strange problem after following the
> instructions at:
> http://www.gluster.com/community/documentation/index.php/High-availability_storage_using_server-side_AFR
> 
> I'm using Gluster 2.0.7 on Debian Lenny servers and a Kubuntu Karmic guest.
> Here are the symptoms:
> Start Server1
> Start Server2
> Mount on client at /export
>   * Server1 says:
> [server-protocol.c:7065:mop_setvolume] server: accepted client from
> 10.10.3.33:1023
>   * Server2 says:
> [server-protocol.c:7065:mop_setvolume] server: accepted client from
> 10.10.3.33:1021
> 
> cd to /export
> run 'ls'
>   * Server 1 says:
> [2009-10-05 17:54:54] E [server-protocol.c:5020:server_releasedir] server:
>  fd - 0: unresolved fd
>   * Server 2 says nothing.
> 
> run 'touch testfile'
>   errors with:
> touch: closing `testfile': Invalid argument
>   * Client log says:
> W [fuse-bridge.c:882:fuse_err_cbk] glusterfs-fuse: 87: FLUSH() ERR => -1
> (Invalid argument)
> W [fuse-bridge.c:882:fuse_err_cbk] glusterfs-fuse: 89: FLUSH() ERR => -1
> (Invalid argument)
>   * Server 1 says:
> E [server-protocol.c:4119:server_flush] afr: invalid argument: state->fd
> E [server-protocol.c:4119:server_flush] afr: invalid argument: state->fd
>   * Server 2 says nothing
> 
> 
> Now, the strange thing is that I can restart EITHER server and this problem
>  goes away. Note that this is a restart, not a stop. The server comes back up
>  and replicates as I expect it to. I do not have this issue with
>  client-side replication, but that is much slower.
> 
> I've posted my complete config and more log details at:
> http://pastebin.ca/1595892
> 

Since I haven't gotten any replies yet, I'll add my own hypothesis to the 
discussion in the hopes it will help someone help me.

I think that when the client first connects, it's connecting to BOTH servers at 
the same time and trying to do the same file operation on both. Once one of 
them restarts and the client has only one active connection, the problem goes 
away.
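
To illustrate what I'm basing this on, here is a rough Python sketch that reads the two "accepted client" lines quoted above (re-joined onto one line each) and reports any client address that shows up on more than one server. The function names are made up for this example and are not part of GlusterFS. It only shows that both servers accepted a connection from the same client; it does not prove that this is what causes the FLUSH error.

```python
import re

# The "accepted client" lines from the two server logs quoted above,
# re-joined onto a single line each.
SERVER_LOGS = {
    "Server1": "[server-protocol.c:7065:mop_setvolume] server: accepted client from 10.10.3.33:1023",
    "Server2": "[server-protocol.c:7065:mop_setvolume] server: accepted client from 10.10.3.33:1021",
}

# Matches the client IP and source port at the end of an "accepted client" line.
ACCEPT_RE = re.compile(r"accepted client from (\d+\.\d+\.\d+\.\d+):(\d+)")


def accepted_clients(logs):
    """Map each client IP to the (server, source port) pairs that accepted it."""
    seen = {}
    for server, line in logs.items():
        match = ACCEPT_RE.search(line)
        if match:
            ip, port = match.group(1), int(match.group(2))
            seen.setdefault(ip, []).append((server, port))
    return seen


def clients_on_multiple_servers(logs):
    """Keep only the clients that hold connections to more than one server."""
    return {ip: conns for ip, conns in accepted_clients(logs).items() if len(conns) > 1}


if __name__ == "__main__":
    for ip, conns in clients_on_multiple_servers(SERVER_LOGS).items():
        print(ip, conns)
```

With the log lines above, the one client (10.10.3.33) is listed against both Server1 and Server2, each with a different source port, which is what makes me suspect two simultaneous connections.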

Any ideas?

-- 
Benjamin Long


