<div dir="ltr"><div>Re:bug. I have a feeling some of these issues may have been caused by the gluster network adapters on the nodes having different MTU values during the initial peer, and also when attempting a rebalance/fix-layout.<br></div><div><br></div>Thanks, after some stupidity on my part, gluster03 is up and running again.<div><br></div><div>I have several replica 3 volumes on gluster01/02/03 for ovirt. Unfortunately I took gluster01 and gluster03 offline while trying to fix this issue, but somehow not all of my VM&#39;s in ovirt crashed. The output of &#39;gluster volume heal vm-storage info&#39; was:</div><div><div><br></div><div># gluster volume heal vm-storage info</div><div>Brick 10.0.231.50:/mnt/lv-vm-storage/vm-storage</div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/573c218f-1cc9-450d-a141-23d1d968d32c/033bd358-3d24-4bd3-963a-f789e54c131b - Possibly undergoing heal</div><div><br></div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/55bd9b2c-83c5-4d0c-9b2a-f9d49badc9cb/94e9cfcb-0e98-4b7f-8d99-181e635c2d12 </div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/aa98e418-dcea-431f-b94f-d323fbdf732f/c5c3417c-ca0b-426b-8589-39145ed4a866 - Possibly undergoing heal</div><div><br></div><div>/__DIRECT_IO_TEST__ </div><div>Number of entries: 4</div><div><br></div><div>Brick 10.0.231.51:/mnt/lv-vm-storage/vm-storage</div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/aa98e418-dcea-431f-b94f-d323fbdf732f/c5c3417c-ca0b-426b-8589-39145ed4a866 - Possibly undergoing heal</div><div><br></div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/573c218f-1cc9-450d-a141-23d1d968d32c/033bd358-3d24-4bd3-963a-f789e54c131b - Possibly undergoing heal</div><div><br></div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/55bd9b2c-83c5-4d0c-9b2a-f9d49badc9cb/94e9cfcb-0e98-4b7f-8d99-181e635c2d12 </div><div>/__DIRECT_IO_TEST__ </div><div>Number of entries: 4</div><div><br></div><div>Brick 10.0.231.52:/mnt/lv-vm-storage/vm-storage</div><div>&lt;gfid:41d37ab1-eda6-4e72-9edc-359bc5823431&gt; - Possibly undergoing heal</div><div><br></div><div>/a5a83df1-47e2-4927-9add-079199ca7ef8/images/aa98e418-dcea-431f-b94f-d323fbdf732f/c5c3417c-ca0b-426b-8589-39145ed4a866 - Possibly undergoing heal</div><div><br></div><div>&lt;gfid:56909e28-97b6-487a-a6c3-eabf09d19a1e&gt; </div><div>&lt;gfid:70b2c75a-561d-4aa0-b5f7-23b90e684caf&gt; </div><div>Number of entries: 4</div><div><br></div><div>I then tried to run heal which says it was unsuccessful:</div><div><br></div><div><div># gluster volume heal vm-storage full</div><div>Launching heal operation to perform full self heal on volume vm-storage has been unsuccessful</div></div><div><br></div><div><div># gluster volume heal vm-storage</div><div>Commit failed on 10.0.231.54. Please check log file for details.</div><div>Commit failed on 10.0.231.55. Please check log file for details.</div><div>Commit failed on 10.0.231.53. Please check log file for details.</div></div><div><br></div><div>glfheal-vm-storage.log:</div><div><div>[2016-02-29 21:03:47.754355] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1</div><div>[2016-02-29 21:13:32.583649] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1</div></div><div><br></div><div><br></div><div>Now if I run heal info again, I see no split-brains:</div></div><div><div># gluster volume heal vm-storage info</div><div>Brick 10.0.231.50:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries: 0</div><div><br></div><div>Brick 10.0.231.51:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries: 0</div><div><br></div><div>Brick 10.0.231.52:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries: 0</div></div><div><br></div><div><div># gluster volume heal vm-storage info split-brain</div><div>Brick 10.0.231.50:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries in split-brain: 0</div><div><br></div><div>Brick 10.0.231.51:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries in split-brain: 0</div><div><br></div><div>Brick 10.0.231.52:/mnt/lv-vm-storage/vm-storage</div><div>Number of entries in split-brain: 0</div></div><div><br></div><div>But I really don&#39;t trust this, as some of the VM&#39;s continued to run, even when only 1 of 3 vm-storage bricks were available. Although there may not have been any IO.</div><div><br></div><div>So how do I fix:</div><div><div><div># gluster volume heal vm-storage full</div><div>Launching heal operation to perform full self heal on volume vm-storage has been unsuccessful</div></div></div><div><div><br></div><div>Is this the normal output, if the the nodes below don&#39;t have a brick for this volume?</div><div># gluster volume heal vm-storage</div><div>Commit failed on 10.0.231.54. Please check log file for details.</div><div>Commit failed on 10.0.231.55. Please check log file for details.</div><div>Commit failed on 10.0.231.53. Please check log file for details.</div></div><div><br></div><div>And how else can I verify integrity of the replica 3 volumes? Stop all the VM&#39;s and run md5sum on the brick files?</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Feb 29, 2016 at 12:12 PM, Rafi Kavungal Chundattu Parambil <span dir="ltr">&lt;<a href="mailto:rkavunga@redhat.com" target="_blank">rkavunga@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">You have to figure out the difference in volinfo in the peers . and rectify it. Or simply you can reduce version in vol info by one in node3 and restating the glusterd will solve the problem.</p>
<p dir="ltr">But I would be more interested to figure out why glusterd crashed.</p>
<p dir="ltr">1) Can you paste back trace of the core generated.<br>
2) Can you paste the op-version for all the nodes.<br>
3) Can you mention steps you did that lead to crash? Seems like you added a brick .<br>
4) If possible can you recollect the order in which you added the peers and the version. Also the upgrade sequence.<br></p>
<p dir="ltr">May be you can race a bug in bugzilla with the information.</p>
<p dir="ltr">Regards<br>
Rafi KC</p><div class="HOEnZb"><div class="h5">
<div>On 1 Mar 2016 12:58 am, Steve Dainard &lt;<a href="mailto:sdainard@spd1.com" target="_blank">sdainard@spd1.com</a>&gt; wrote:<br type="attribution"><blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I changed quota-version=1 on the two new nodes, and was able to join the cluster. I also rebooted the two new nodes and everything came up correctly.<div><br></div><div>Then I triggered a rebalance fix-layout and one of the original cluster members (node gluster03) glusterd crashed. I restarted glusterd and was connected but after a few minutes I&#39;m left with:</div><div><br></div><div><div># gluster peer status</div><div>Number of Peers: 5</div><div><br></div><div>Hostname: 10.0.231.51</div><div>Uuid: b01de59a-4428-486b-af49-cb486ab44a07</div><div>State: Peer in Cluster (Connected)</div><div><br></div><div>Hostname: 10.0.231.52</div><div>Uuid: 75143760-52a3-4583-82bb-a9920b283dac</div><div><b>State: Peer Rejected (Connected)</b></div><div><br></div><div>Hostname: 10.0.231.53</div><div>Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411</div><div>State: Peer in Cluster (Connected)</div><div><br></div><div>Hostname: 10.0.231.54</div><div>Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c</div><div>State: Peer in Cluster (Connected)</div><div><br></div><div>Hostname: 10.0.231.55</div><div>Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3</div><div>State: Peer in Cluster (Connected)</div></div><div><br></div><div>I see in the logs (attached) there is now a cksum error:</div><div><br></div><div><div>[2016-02-29 19:16:42.082256] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums storage differ. local cksum = 50348222, remote cksum = 50348735 on peer 10.0.231.55</div><div>[2016-02-29 19:16:42.082298] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.0.231.55 (0), ret: 0</div><div>[2016-02-29 19:16:42.092535] I [MSGID: 106493] [glusterd-rpc-ops.c:480:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411, host: 10.0.231.53, port: 0</div><div>[2016-02-29 19:16:42.096036] I [MSGID: 106143] [glusterd-pmap.c:229:pmap_registry_bind] 0-pmap: adding brick /mnt/lv-export-domain-storage/export-domain-storage on port 49153</div><div>[2016-02-29 19:16:42.097296] I [MSGID: 106143] [glusterd-pmap.c:229:pmap_registry_bind] 0-pmap: adding brick /mnt/lv-vm-storage/vm-storage on port 49155</div><div>[2016-02-29 19:16:42.100727] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30700</div><div>[2016-02-29 19:16:42.108495] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411</div><div>[2016-02-29 19:16:42.109295] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums storage differ. local cksum = 50348222, remote cksum = 50348735 on peer 10.0.231.53</div><div>[2016-02-29 19:16:42.109338] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.0.231.53 (0), ret: 0</div><div>[2016-02-29 19:16:42.119521] I [MSGID: 106143] [glusterd-pmap.c:229:pmap_registry_bind] 0-pmap: adding brick /mnt/lv-env-modules/env-modules on port 49157</div><div>[2016-02-29 19:16:42.122856] I [MSGID: 106143] [glusterd-pmap.c:229:pmap_registry_bind] 0-pmap: adding brick /mnt/raid6-storage/storage on port 49156</div><div>[2016-02-29 19:16:42.508104] I [MSGID: 106493] [glusterd-rpc-ops.c:480:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: b01de59a-4428-486b-af49-cb486ab44a07, host: 10.0.231.51, port: 0</div><div>[2016-02-29 19:16:42.519403] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30700</div><div>[2016-02-29 19:16:42.524353] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: b01de59a-4428-486b-af49-cb486ab44a07</div><div>[2016-02-29 19:16:42.524999] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums storage differ. local cksum = 50348222, remote cksum = 50348735 on peer 10.0.231.51</div><div>[2016-02-29 19:16:42.525038] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.0.231.51 (0), ret: 0</div><div>[2016-02-29 19:16:42.592523] I [MSGID: 106493] [glusterd-rpc-ops.c:480:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c, host: 10.0.231.54, port: 0</div><div>[2016-02-29 19:16:42.599518] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30700</div><div>[2016-02-29 19:16:42.604821] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c</div><div>[2016-02-29 19:16:42.605458] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums storage differ. local cksum = 50348222, remote cksum = 50348735 on peer 10.0.231.54</div><div>[2016-02-29 19:16:42.605492] I [MSGID: 106493] [glusterd-handler.c:3780:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.0.231.54 (0), ret: 0</div><div>[2016-02-29 19:16:42.621943] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30700</div><div>[2016-02-29 19:16:42.628443] I [MSGID: 106490] [glusterd-handler.c:2539:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: a965e782-39e2-41cc-a0d1-b32ecccdcd2f</div><div>[2016-02-29 19:16:42.629079] E [MSGID: 106010] [glusterd-utils.c:2717:glusterd_compare_friend_volume] 0-management: Version of Cksums storage differ. local cksum = 50348222, remote cksum = 50348735 on peer 10.0.231.50</div></div><div><br></div><div>On gluster01/02/04/05</div><div>/var/lib/glusterd/vols/storage/cksum info=998305000<br></div><div><br></div><div>On gluster03</div><div><div>/var/lib/glusterd/vols/storage/cksum info=998305001</div></div><div><br></div><div>How do I recover from this? Can I just stop glusterd on gluster03 and change the cksum value?</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Feb 25, 2016 at 12:49 PM, Mohammed Rafi K C <span dir="ltr">&lt;<a href="mailto:rkavunga@redhat.com" target="_blank">rkavunga@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000"><div><div>
    <br>
    <br>
    <div>On 02/26/2016 01:53 AM, Mohammed Rafi K
      C wrote:<br>
    </div>
    <blockquote type="cite">
      
      <br>
      <br>
      <div>On 02/26/2016 01:32 AM, Steve Dainard
        wrote:<br>
      </div>
      <blockquote type="cite">
        <div dir="ltr">
          <div>I haven&#39;t done anything more than peer thus far, so I&#39;m a
            bit confused as to how the volume info fits in, can you
            expand on this a bit?<br>
          </div>
          <div><br>
          </div>
          <div>Failed commits? Is this split brain on the replica
            volumes? I don&#39;t get any return from &#39;gluster volume heal
            &lt;volname&gt; info&#39; on all the replica volumes, but if I
            try a gluster volume heal &lt;volname&gt; full I get:
            &#39;Launching heal operation to perform full self heal on
            volume &lt;volname&gt; has been unsuccessful&#39;.</div>
        </div>
      </blockquote>
      <br>
      forget about this. it is not for metadata selfheal .<br>
      <br>
      <blockquote type="cite">
        <div dir="ltr">
          <div><br>
          </div>
          <div>I have 5 volumes total.</div>
          <div><br>
          </div>
          <div>&#39;Replica 3&#39; volumes running on gluster01/02/03:</div>
          <div>vm-storage</div>
          <div>iso-storage</div>
          <div>export-domain-storage</div>
          <div>env-modules</div>
          <div><br>
          </div>
          <div>And one distributed only volume &#39;storage&#39; info shown
            below:<br>
          </div>
          <div>
            <div><br>
            </div>
            <div><b>From existing host gluster01/02:</b></div>
            <div>
              <div>type=0</div>
              <div>count=4</div>
              <div>status=1</div>
              <div>sub_count=0</div>
              <div>stripe_count=1</div>
              <div>replica_count=1</div>
              <div>disperse_count=0</div>
              <div>redundancy_count=0</div>
              <div>version=25</div>
              <div>transport-type=0</div>
              <div>volume-id=26d355cb-c486-481f-ac16-e25390e73775</div>
              <div>username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c</div>
              <div>password=</div>
              <div>op-version=3</div>
              <div>client-op-version=3</div>
              <div>quota-version=1</div>
              <div>parent_volname=N/A</div>
              <div>restored_from_snap=00000000-0000-0000-0000-000000000000</div>
              <div>snap-max-hard-limit=256</div>
              <div>features.quota-deem-statfs=on</div>
              <div>features.inode-quota=on</div>
              <div>diagnostics.brick-log-level=WARNING</div>
              <div>features.quota=on</div>
              <div>performance.readdir-ahead=on</div>
              <div>performance.cache-size=1GB</div>
              <div>performance.stat-prefetch=on</div>
              <div>brick-0=10.0.231.50:-mnt-raid6-storage-storage</div>
              <div>brick-1=10.0.231.51:-mnt-raid6-storage-storage</div>
              <div>brick-2=10.0.231.52:-mnt-raid6-storage-storage</div>
              <div>brick-3=10.0.231.53:-mnt-raid6-storage-storage</div>
            </div>
            <div><br>
            </div>
            <div>
              <div><b>From existing host gluster03/04:</b><br>
              </div>
              <div>
                <div>type=0</div>
                <div>count=4</div>
                <div>status=1</div>
                <div>sub_count=0</div>
                <div>stripe_count=1</div>
                <div>replica_count=1</div>
                <div>disperse_count=0</div>
                <div>redundancy_count=0</div>
                <div>version=25</div>
                <div>transport-type=0</div>
                <div>volume-id=26d355cb-c486-481f-ac16-e25390e73775</div>
                <div>username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c</div>
                <div>password=</div>
                <div>op-version=3</div>
                <div>client-op-version=3</div>
                <div>quota-version=1</div>
                <div>parent_volname=N/A</div>
                <div>restored_from_snap=00000000-0000-0000-0000-000000000000</div>
                <div>snap-max-hard-limit=256</div>
                <div>features.quota-deem-statfs=on</div>
                <div>features.inode-quota=on</div>
                <div>performance.stat-prefetch=on</div>
                <div>performance.cache-size=1GB</div>
                <div>performance.readdir-ahead=on</div>
                <div>features.quota=on</div>
                <div>diagnostics.brick-log-level=WARNING</div>
                <div>brick-0=10.0.231.50:-mnt-raid6-storage-storage</div>
                <div>brick-1=10.0.231.51:-mnt-raid6-storage-storage</div>
                <div>brick-2=10.0.231.52:-mnt-raid6-storage-storage</div>
                <div>brick-3=10.0.231.53:-mnt-raid6-storage-storage</div>
              </div>
              <div><br>
              </div>
              <div>So far between gluster01/02 and gluster03/04 the
                configs are the same, although the ordering is different
                for some of the features.</div>
              <div><br>
              </div>
              <div>On gluster05/06 the ordering is different again, and
                the quota-version=0 instead of 1.</div>
            </div>
          </div>
        </div>
      </blockquote>
      <br>
      This is why the peer shows as rejected. Can you check the
      op-version of all the glusterd including the one which is in
      reject state. you can find out the op-version here in 
      /var/lib/glusterd/<a href="http://glusterd.info" target="_blank">glusterd.info</a> <br>
    </blockquote>
    <br></div></div>
    If all the op-version are same and 3.7.6, then to work-around the
    issue, you can manually make it quota-version=1, and restarting the
    glusterd will solve the problem, But I would strongly recommend you
    to figure out the RCA. May be you can file a bug for this.<span><font color="#888888"><br>
    <br>
    Rafi</font></span><div><div><br>
    <br>
    <blockquote type="cite"> <br>
      Rafi KC<br>
      <br>
      <blockquote type="cite">
        <div dir="ltr">
          <div>
            <div>
              <div><br>
              </div>
              <div><b>From new hosts gluster05/gluster06:</b></div>
              <div>type=0</div>
              <div>count=4</div>
              <div>status=1</div>
              <div>sub_count=0</div>
              <div>stripe_count=1</div>
              <div>replica_count=1</div>
              <div>disperse_count=0</div>
              <div>redundancy_count=0</div>
              <div>version=25</div>
              <div>transport-type=0</div>
              <div>volume-id=26d355cb-c486-481f-ac16-e25390e73775</div>
              <div>username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c</div>
              <div>password=</div>
              <div>op-version=3</div>
              <div>client-op-version=3</div>
              <div>quota-version=0</div>
              <div>parent_volname=N/A</div>
              <div>restored_from_snap=00000000-0000-0000-0000-000000000000</div>
              <div>snap-max-hard-limit=256</div>
              <div>performance.stat-prefetch=on</div>
              <div>performance.cache-size=1GB</div>
              <div>performance.readdir-ahead=on</div>
              <div>features.quota=on</div>
              <div>diagnostics.brick-log-level=WARNING</div>
              <div>features.inode-quota=on</div>
              <div>features.quota-deem-statfs=on</div>
              <div>brick-0=10.0.231.50:-mnt-raid6-storage-storage</div>
              <div>brick-1=10.0.231.51:-mnt-raid6-storage-storage</div>
              <div>brick-2=10.0.231.52:-mnt-raid6-storage-storage</div>
              <div>brick-3=10.0.231.53:-mnt-raid6-storage-storage</div>
            </div>
            <div><br>
            </div>
          </div>
          <div>Also, I forgot to mention that when I initially peer&#39;d
            the two new hosts, glusterd crashed on gluster03 and had to
            be restarted (log attached) but has been fine since.</div>
          <div><br>
          </div>
          <div>Thanks,</div>
          <div>Steve</div>
        </div>
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Thu, Feb 25, 2016 at 11:27 AM,
            Mohammed Rafi K C <span dir="ltr">&lt;<a href="mailto:rkavunga@redhat.com" target="_blank">rkavunga@redhat.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"><span> <br>
                  <br>
                  <div>On 02/25/2016 11:45 PM, Steve Dainard wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr">Hello,<br>
                      <br>
                      I upgraded from 3.6.6 to 3.7.6 a couple weeks ago.
                      I just peered 2 new nodes to a 4 node cluster and
                      gluster peer status is:<br>
                      <br>
                      # gluster peer status <b>&lt;-- from node
                        gluster01</b><br>
                      Number of Peers: 5<br>
                      <br>
                      Hostname: 10.0.231.51<br>
                      Uuid: b01de59a-4428-486b-af49-cb486ab44a07<br>
                      State: Peer in Cluster (Connected)<br>
                      <br>
                      Hostname: 10.0.231.52<br>
                      Uuid: 75143760-52a3-4583-82bb-a9920b283dac<br>
                      State: Peer in Cluster (Connected)<br>
                      <br>
                      Hostname: 10.0.231.53<br>
                      Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411<br>
                      State: Peer in Cluster (Connected)<br>
                      <br>
                      Hostname: 10.0.231.54 <b>&lt;-- new node
                        gluster05</b><br>
                      Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c<br>
                      <b>State: Peer Rejected (Connected)</b><br>
                      <br>
                      Hostname: 10.0.231.55 <b>&lt;-- new node gluster06</b><br>
                      Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3<br>
                      <b>State: Peer Rejected (Connected)</b><br>
                    </div>
                  </blockquote>
                  <br>
                </span> Looks like your configuration files are
                mismatching, ie the checksum calculation differs on this
                two node than the others,<br>
                <br>
                Did you had any failed commit ?<br>
                <br>
                Compare your /var/lib/glusterd/&lt;volname&gt;/info of
                the failed node against good one, mostly you could see
                some difference.<br>
                <br>
                can you paste the /var/lib/glusterd/&lt;volname&gt;/info
                ?<br>
                <br>
                Regards<br>
                Rafi KC<br>
                <br>
                <br>
                <blockquote type="cite"><span>
                    <div dir="ltr">
                      <div><b><br>
                        </b></div>
                      <div>I followed the write-up here: <a href="http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected" target="_blank">http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected</a>
                        and the two new nodes peer&#39;d properly but after
                        a reboot of the two new nodes I&#39;m seeing the
                        same Peer Rejected (Connected) State.</div>
                      <div><br>
                      </div>
                      <div>I&#39;ve attached logs from an existing node, and
                        the two new nodes.</div>
                      <div><br>
                      </div>
                      <div>Thanks for any suggestions,</div>
                      <div>Steve</div>
                      <div><br>
                      </div>
                      <div>
                        <div><br>
                        </div>
                      </div>
                    </div>
                    <br>
                    <fieldset></fieldset>
                    <br>
                  </span>
                  <pre>_______________________________________________
Gluster-users mailing list
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
                </blockquote>
                <br>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div>
</blockquote></div></div></div></blockquote></div><br></div>