[Gluster-users] Data consistency with Gluster 3.2.5

Bryan Whitehead driver at megahappy.net
Tue Mar 13 16:47:39 UTC 2012


Are all the clocks in sync on the servers?

You should probably configure memcached as the banner cache (a quick search
for "OpenX banner cache" shows that is an option). You can't have four clients
all opening and writing to the same file at the same time.
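
To see which files are affected, the quoted log excerpt can be summarized per
path. What follows is only a minimal Python sketch, not anything that ships
with GlusterFS: the function names (normalize, split_records, summarize) are
made up for illustration, and it assumes that the affected paths end in .php
and that any "> " quote prefixes, stray "**" markers and wrapped lines left by
mail archiving should be undone first. It counts "split brain found" and
"gfid differs" records for each file.

```python
import re
from collections import Counter

# Start of a log record, e.g. "[2012-02-25 18:53:04.198326] E [".
RECORD_START = re.compile(r"\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] [A-Z] \[")
# Assumption: the paths of interest are absolute and end in ".php".
PATH = re.compile(r"/\S+?\.php")


def normalize(raw_log):
    """Strip quote prefixes, rejoin wrapped lines, and drop '**' markers."""
    parts = []
    for line in raw_log.splitlines():
        line = re.sub(r"^\s*>\s?", "", line).strip()
        if not line:
            continue
        # A line ending in "**" was wrapped mid-token: join without a space.
        parts.append(line if line.endswith("**") else line + " ")
    return "".join(parts).replace("**", "")


def split_records(text):
    """Cut the normalized text into one string per log record."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, ends)]


def summarize(raw_log):
    """Return (split_brain, gfid_mismatch) counters keyed by file path."""
    split_brain = Counter()
    gfid_mismatch = Counter()
    for record in split_records(normalize(raw_log)):
        match = PATH.search(record)
        if match is None:
            continue
        path = match.group(0)
        if "split brain found" in record:
            split_brain[path] += 1
        elif "gfid differs" in record:
            gfid_mismatch[path] += 1
    return split_brain, gfid_mismatch
```

Calling summarize() on the text of the client log (or on the excerpt quoted
in this thread) returns two counters. For the excerpt, both contain only the
two deliverycache_*.php files, which fits the idea that the shared banner
cache is what the clients are fighting over.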

On Mon, Mar 12, 2012 at 6:55 AM, Sean Fulton <sean at gcnpublishing.com> wrote:

> I have set up a replicated, four-node Gluster config for a web farm. The
> idea is that each web node is its own server and will have its own copy of
> the entire web root locally. It then serves the cluster to itself. We're
> running it over bonded dual GigE NICs.
>
> The problem I am having is when we switch live traffic to nodes in the
> cluster, they almost immediately get out of sync. The issue seems to be
> with cache files that are read/written a lot. Here is an excerpt pointing
> to issues with our OpenX banner cache:
>
> [2012-02-25 18:53:04.198326] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal failed on /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php
> [2012-02-25 18:53:04.199191] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php: gfid differs on subvolume 0 (53fa373a-3830-4c5e-aa22-6ed35c947d97, c12e0cdd-9b6c-4988-b793-819db0472780)
> [2012-02-25 18:53:04.199210] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php: gfid differs on subvolume 0 (53fa373a-3830-4c5e-aa22-6ed35c947d97, c12e0cdd-9b6c-4988-b793-819db0472780)
> [2012-02-25 18:53:04.199219] W [afr-common.c:882:afr_detect_self_heal_by_iatt] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php: gfid different on subvolume
> [2012-02-25 18:53:04.199236] I [afr-common.c:1038:afr_launch_self_heal] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal triggered. path: /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php
> [2012-02-25 18:53:04.200752] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php: gfid differs on subvolume 0 (53fa373a-3830-4c5e-aa22-6ed35c947d97, c12e0cdd-9b6c-4988-b793-819db0472780)
> [2012-02-25 18:53:04.200971] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-web-pub-replicate-0: split brain found, aborting selfheal of /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php
> [2012-02-25 18:53:04.200986] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal failed on /cust/site1/www/openx/var/cache/deliverycache_f8e7a8862cb80b4933c58acdf65aaef5.php
> [2012-02-25 18:53:04.202159] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 1 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.202178] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 1 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.202188] W [afr-common.c:882:afr_detect_self_heal_by_iatt] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid different on subvolume
> [2012-02-25 18:53:04.202204] I [afr-common.c:1038:afr_launch_self_heal] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal triggered. path: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.203463] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.203678] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-web-pub-replicate-0: split brain found, aborting selfheal of /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.203693] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal failed on /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.204759] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.204781] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.204800] W [afr-common.c:882:afr_detect_self_heal_by_iatt] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid different on subvolume
> [2012-02-25 18:53:04.204818] I [afr-common.c:1038:afr_launch_self_heal] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal triggered. path: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.206150] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.206384] I [afr-self-heal-common.c:963:afr_sh_missing_entries_done] 0-web-pub-replicate-0: split brain found, aborting selfheal of /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.206400] E [afr-self-heal-common.c:2074:afr_self_heal_completion_cbk] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal failed on /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.207725] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.207746] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
> [2012-02-25 18:53:04.207756] W [afr-common.c:882:afr_detect_self_heal_by_iatt] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid different on subvolume
> [2012-02-25 18:53:04.207772] I [afr-common.c:1038:afr_launch_self_heal] 0-web-pub-replicate-0: background  meta-data data missing-entry self-heal triggered. path: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php
> [2012-02-25 18:53:04.209217] W [afr-common.c:1121:afr_conflicting_iattrs] 0-web-pub-replicate-0: /cust/site1/www/openx/var/cache/deliverycache_f901ff39b456df599289c590ed89b19d.php: gfid differs on subvolume 0 (375e1754-0420-4e26-9176-bb2128c6596b, 3e9eca35-3351-450e-b8ab-c62785968953)
>
> Nodes and network are fine. I have tried mounting the volumes using both
> the Gluster native client and the Gluster NFS client, but I get the same
> results. It's killing performance.
>
> Here is the config:
>
> volume web-pub-client-0
>     type protocol/client
>     option remote-host web-web1
>     option remote-subvolume /glusterfs/pub
>     option transport-type tcp
> end-volume
>
> volume web-pub-client-1
>     type protocol/client
>     option remote-host web-web2
>     option remote-subvolume /glusterfs/pub
>     option transport-type tcp
> end-volume
>
> volume web-pub-client-2
>     type protocol/client
>     option remote-host web-web3
>     option remote-subvolume /glusterfs/pub
>     option transport-type tcp
> end-volume
>
> volume web-pub-client-3
>     type protocol/client
>     option remote-host web-web4
>     option remote-subvolume /glusterfs/pub
>     option transport-type tcp
> end-volume
>
> volume web-pub-replicate-0
>     type cluster/replicate
>     subvolumes web-pub-client-0 web-pub-client-1 web-pub-client-2 web-pub-client-3
> end-volume
>
> volume web-pub-write-behind
>     type performance/write-behind
>     subvolumes web-pub-replicate-0
> end-volume
>
> volume web-pub-read-ahead
>     type performance/read-ahead
>     subvolumes web-pub-write-behind
> end-volume
>
> volume web-pub-io-cache
>     type performance/io-cache
>     option cache-size 256MB
>     subvolumes web-pub-read-ahead
> end-volume
>
> volume web-pub-quick-read
>     type performance/quick-read
>     option cache-size 256MB
>     subvolumes web-pub-io-cache
> end-volume
>
> volume web-pub
>     type debug/io-stats
>     option latency-measurement off
>     option count-fop-hits off
>     subvolumes web-pub-quick-read
> end-volume
>
> volume nfs-server
>     type nfs/server
>     option nfs.dynamic-volumes on
>     option rpc-auth.addr.web-pub.allow *
>     option nfs3.web-pub.volume-id ac556d2e-e8a9-4857-bd17-cab603820fcb
>     subvolumes web-pub
> end-volume
>
>
> Any ideas or help would be greatly appreciated.
>
> sean
>
> --
> Sean Fulton
> GCN Publishing, Inc.
> Internet Design, Development and Consulting For Today's Media Companies
> http://www.gcnpublishing.com
> (203) 665-6211, x203