<div dir="ltr">Thanks for your input, Anirban.<div><br></div><div>I ran the commands on both servers, with the following results:</div><div><br></div><div><div><br></div><div>root@web3:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png</div><div><br></div><div>real<span class="" style="white-space:pre">        </span>0m34.524s</div><div>user<span class="" style="white-space:pre">        </span>0m0.004s</div><div>sys<span class="" style="white-space:pre">        </span>0m0.000s</div></div><div><br></div><div><br></div><div><div>root@web4:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png</div><div>getfattr: templates/assets/prod/temporary/13/user_1339200.png: Input/output error</div><div><br></div><div>real<span class="" style="white-space:pre">        </span>0m11.315s</div><div>user<span class="" style="white-space:pre">        </span>0m0.001s</div><div>sys<span class="" style="white-space:pre">        </span>0m0.003s</div><div>root@web4:/var/www/site-images# ls templates/assets/prod/temporary/13/user_1339200.png</div><div>ls: cannot access templates/assets/prod/temporary/13/user_1339200.png: Input/output error</div></div><div><br></div><div><br></div><div>I&#39;m not sure whether this elucidates the issue.</div><div><br></div><div><br></div><div>Also, I saw a zillion entries like these in /var/log/gluster.log:</div><div><br></div><div><div>[2015-01-26 17:35:39.973268] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/9616964 (00000000-0000-0000-0000-000000000000)</div><div>[2015-01-26 17:35:39.973435] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. 
Path: /templates/apache/template/prod/facebook/9594915 (00000000-0000-0000-0000-000000000000)</div><div>[2015-01-26 17:35:39.973571] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/9681971 (00000000-0000-0000-0000-000000000000)</div><div>[2015-01-26 17:35:39.973686] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/19615 (00000000-0000-0000-0000-000000000000)</div><div>[2015-01-26 17:35:39.973802] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/130392 (00000000-0000-0000-0000-000000000000)</div></div><div><br></div><div><br></div><div>I talked with some people on #gluster who suggested it could be a network issue. I&#39;m still looking into it, but since the problem also happens locally (within the same server), would that still be a valid explanation?</div><div><br></div><div><br></div><div>Also, less often, I see entries like these:</div><div><br></div><div><div>[2015-01-26 17:41:25.956418] E [afr-self-heal-common.c:1615:afr_sh_common_lookup_cbk] 0-site-images-replicate-0: Conflicting entries for /webhost/sites/clipart/assets/apache/images/graphics/215126/image1.png</div><div>[2015-01-26 17:41:26.588753] E [afr-self-heal-common.c:1615:afr_sh_common_lookup_cbk] 0-site-images-replicate-0: Conflicting entries for /webhost/sites/clipart/assets/apache/images/graphics/215126/image1.png</div></div><div><br></div><div><br></div><div>Are those a definitive indication of a split-brain? 
Or is this just normal behavior until self-heal takes care of recently updated files?</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 26, 2015 at 2:25 PM, A Ghoshal <span dir="ltr">&lt;<a href="mailto:a.ghoshal@tcs.com" target="_blank">a.ghoshal@tcs.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> I am plagued with something of this sort, too!<br>
<br>
What I mostly see when I explore these things is that<br>
<br>
A) it&#39;s a split-brain.<br>
B) the split-brain is because the gfid&#39;s on the two replicas are at odds.<br>
<br>
You could check that out by<br>
1. On each server, first &#39;cd&#39; to where your brick is mounted.<br>
2. getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png<br>
<br>
You will see a trusted.gfid kind of extended attribute. If it&#39;s not the same on both servers, there&#39;s a problem.<br>
<br>
Thanks,<br>
Anirban<br><br></blockquote></div><br clear="all"><div><br></div><div>Regards,</div>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><font color="#444444"><b>Tiago Santos</b></font><div><div><font color="#ff0000">MustHaveMenus.com</font></div></div></div></div></div></div>
</div></div>
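<div><br></div><div>The gfid comparison suggested in the quoted message could be scripted roughly as follows. This is only a minimal sketch: it assumes the output of the getfattr command has been captured as text on each server (from the brick directory, per the quoted steps) and that the trusted.gfid value is printed as a 16-byte hex string. The function names extract_gfid and compare_gfids are hypothetical helpers, not part of GlusterFS.</div>

```python
import re

# Matches a line such as "trusted.gfid=0x0123...", as printed by
# "getfattr -m . -d -e hex". A gfid is assumed to be 16 bytes (32 hex digits).
GFID_LINE = re.compile(r"^trusted\.gfid=0x([0-9a-fA-F]{32})\s*$", re.MULTILINE)


def extract_gfid(getfattr_output):
    """Return the gfid as lowercase hex, or None if no trusted.gfid line is present."""
    match = GFID_LINE.search(getfattr_output)
    return match.group(1).lower() if match else None


def compare_gfids(output_a, output_b):
    """Classify two captured getfattr outputs as 'match', 'mismatch', or 'missing'."""
    gfid_a = extract_gfid(output_a)
    gfid_b = extract_gfid(output_b)
    if gfid_a is None or gfid_b is None:
        # One side printed no trusted.gfid at all (for example, the command
        # failed or returned nothing), so no comparison is possible.
        return "missing"
    return "match" if gfid_a == gfid_b else "mismatch"
```

<div>A result of "mismatch" would correspond to the situation described in the quoted message, where the gfids on the two replicas are at odds. A result of "missing" only says that no comparison could be made, as with the empty output and the Input/output error shown above.</div>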