<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>I caught one of the nodes transitioning into faulty mode; the log output is below.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div> In master nodes, look for log messages. Let us know if you feel any issue in log messages. (/var/log/glusterfs/geo-replication/)</div></span></div></div></div></blockquote><div>When one of the nodes drops into &quot;faulty&quot;, which happens periodically, this is the kind of output that appears in the log:</div><div><br></div><div>[root@gfs-a-1 ~]# tail /usr/local/var/log/glusterfs/geo-replication/shares/ssh%3A%2F%2Froot%4010.XX.XXX.X%3Agluster%3A%2F%2F127.0.0.1%3Abkpshares.log</div><div>[2015-05-05 09:22:58.140913] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/065c09f9-4502-4a2c-81fa-5e8fcaf22712 [errcode: 23]</div><div>[2015-05-05 09:22:58.152951] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/28a237a4-4346-48c5-bd1c-713273f591c7 [errcode: 23]</div><div>[2015-05-05 09:22:58.327603] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/5755db3e-e9d8-42d2-b415-890842b086ae [errcode: 23]</div><div>[2015-05-05 09:22:58.336714] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/0b7fc219-1e31-4e66-865f-5ae1c26d5e54 [errcode: 23]</div><div>[2015-05-05 09:22:58.360308] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/955cd0e4-dd06-4db6-9391-34dbf72c9b06 [errcode: 23]</div><div>[2015-05-05 09:22:58.367522] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/1d455725-c3e1-4111-92e5-335610d3f513 [errcode: 23]</div><div>[2015-05-05 09:22:58.368226] W 
[master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/7ce881ae-3491-4e21-b38b-0a27fb620c74 [errcode: 23]</div><div>[2015-05-05 09:22:58.368959] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/056732c1-1537-4925-a30c-b905c110a5b2 [errcode: 23]</div><div>[2015-05-05 09:22:58.369635] W [master(/mnt/a-1-shares-brick-2/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/8c58d6c5-9975-43c6-8f4c-2a92337f7350 [errcode: 23]</div><div>[2015-05-05 09:22:58.369790] W [master(/mnt/a-1-shares-brick-2/brick):877:process] _GMaster: incomplete sync, retrying changelogs: XSYNC-CHANGELOG.1430830891 </div><div><br></div><div>When the node is in &quot;active&quot; mode, I get a lot of log output that resembles this:</div><div><div>[2015-05-05 09:23:54.735502] W [master(/mnt/a-1-shares-brick-3/brick):877:process] _GMaster: incomplete sync, retrying changelogs: XSYNC-CHANGELOG.1430832227</div><div>[2015-05-05 09:23:55.449265] W [master(/mnt/a-1-shares-brick-3/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/0665be16-04e9-4cbe-a2c9-a633caa8c79d [errcode: 23]</div><div>[2015-05-05 09:23:55.449491] W [master(/mnt/a-1-shares-brick-3/brick):877:process] _GMaster: incomplete sync, retrying changelogs: XSYNC-CHANGELOG.1430832227</div><div>[2015-05-05 09:23:56.277033] W [master(/mnt/a-1-shares-brick-3/brick):250:regjob] &lt;top&gt;: Rsync: .gfid/0665be16-04e9-4cbe-a2c9-a633caa8c79d [errcode: 23]</div><div>[2015-05-05 09:23:56.277259] W [master(/mnt/a-1-shares-brick-3/brick):860:process] _GMaster: changelogs XSYNC-CHANGELOG.1430832227 could not be processed - moving on...</div><div>[2015-05-05 09:23:56.294038] W [master(/mnt/a-1-shares-brick-3/brick):862:process] _GMaster: SKIPPED GFID =</div><div>[2015-05-05 09:23:56.381592] I [master(/mnt/a-1-shares-brick-3/brick):1130:crawl] _GMaster: finished hybrid crawl syncing</div><div>[2015-05-05 09:24:24.404884] I [master(/mnt/a-1-shares-brick-4/brick):445:crawlwrap] _GMaster: 1 crawls, 1 
turns</div><div>[2015-05-05 09:24:24.437452] I [master(/mnt/a-1-shares-brick-4/brick):1124:crawl] _GMaster: starting hybrid crawl...</div><div>[2015-05-05 09:24:24.588865] I [master(/mnt/a-1-shares-brick-1/brick):1133:crawl] _GMaster: processing xsync changelog /usr/local/var/run/gluster/shares/ssh%3A%2F%2Froot%4010.XX.XXX.X%3Agluster%3A%2F%2F127.0.0.1%3Abkpshares/9d9a72f468c582609e97e8929e58b9ff/xsync/XSYNC-CHANGELOG.1430832135</div></div><div><br></div><div>This raises a couple of questions for me:</div><div><ol><li>Do these &quot;errcode: 23&quot; warnings correspond to files that were deleted or renamed after the changelog was created?</li><li>Is it expected for a node to drop into &quot;faulty&quot; and then recover to &quot;active&quot; on its own periodically?</li></ol></div><div>Thank you again for your assistance!</div><div>Dave</div></div></div></div>