<div dir="ltr"><div>Milos,<br><br></div>I just managed to take a look into a similar issue and my analysis is at [1]. I remember you mentioning about some incorrect /etc/hosts entries which lead to this same problem in earlier case, do you mind to recheck the same?<br><br>[1]  <a href="http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html">http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html</a> </div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI <span dir="ltr">&lt;<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi All,<br>
<br>
Moving forward with my issue, sorry for the late reply!<br>
<br>
I had some issues with the storage2 server (original volume), then decided to use 3.9.0, so I have the latest version.<br>
<br>
For that, I manually synced all the files to the storage server. I installed gluster 3.9.0 there, started it, created a new volume called storage, and all seems to work OK.<br>
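<br>
For reference, the volume creation step looked roughly like this (a sketch;<br>
the brick path is the one used below, and a plain single-brick volume is<br>
assumed):<br>
<br>
    # single-brick volume on the storage server (force may be needed<br>
    # depending on where the brick directory lives)<br>
    sudo gluster volume create storage storage:/data/data-cluster<br>
    sudo gluster volume start storage<br>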
<br>
Now I need to create my replicated volume (add a new brick on the storage2 server). Almost all the files are there. So, on the storage server, I was running:<br>
<br>
* sudo gluster peer probe storage2<br>
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force<br>
<br>
But I am receiving &quot;volume add-brick: failed: Host storage2 is not in &#39;Peer in Cluster&#39; state&quot;.<br>
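<br>
The kind of verification that seems relevant before retrying (a sketch; it<br>
assumes both hostnames are meant to resolve to the peers&#39; real addresses):<br>
<br>
    # on storage: storage2 should show State: Peer in Cluster (Connected)<br>
    sudo gluster peer status<br>
<br>
    # check that the /etc/hosts entry resolves to the right machine<br>
    getent hosts storage2<br>
    ping -c 1 storage2<br>
<br>
    # then retry the add-brick<br>
    sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force<br>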
<br>
Any idea?<span class="im HOEnZb"><br>
<br>
- Kindest regards,<br>
<br>
Milos Cuculovic<br>
IT Manager<br>
<br>
---<br>
MDPI AG<br>
Postfach, CH-4020 Basel, Switzerland<br>
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
Tel. +41 61 683 77 35<br>
Fax +41 61 302 89 18<br>
Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
Skype: milos.cuculovic.mdpi<br>
<br></span><div class="HOEnZb"><div class="h5">
On 08.12.2016 17:52, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I was able to fix the sync by rsync-ing all the directories, then the<br>
heal started. The next problem :): as soon as there are files on the<br>
new brick, the gluster mount also serves reads from it, but the new<br>
brick is not ready yet, as the sync is not done, so this results in<br>
missing files on the client side. I temporarily removed the new brick;<br>
now I am running a manual rsync and will add the brick again, hoping<br>
this will work.<br>
<br>
What mechanism manages this? I guess there is something built in to<br>
make a replica brick available only once the data is completely<br>
synced.<br>
</blockquote>
This mechanism was introduced in 3.7.9 or 3.7.10<br>
(<a href="http://review.gluster.org/#/c/13806/" rel="noreferrer" target="_blank">http://review.gluster.org/#/c/13806/</a>). Before that version, you<br>
needed to manually set some xattrs on the bricks so that healing could<br>
happen in parallel while reads were still served from the original<br>
brick. I can&#39;t find the link to the doc which describes these steps<br>
for setting xattrs. :-(<br>
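<br>
A rough sketch of how to confirm whether the automatic mechanism applies and<br>
to watch healing progress (standard CLI commands, assuming the volume is<br>
named storage; these are not the xattr steps from the missing doc):<br>
<br>
    # version on both nodes; &gt;= 3.7.9/3.7.10 has the automatic marking<br>
    gluster --version<br>
<br>
    # entries still pending heal, per brick<br>
    gluster volume heal storage info<br>
<br>
    # summary count of pending heals<br>
    gluster volume heal storage statistics heal-count<br>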
<br>
Calling it a day,<br>
Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
- Kindest regards,<br>
<br>
Milos Cuculovic<br>
IT Manager<br>
<br>
---<br>
MDPI AG<br>
Postfach, CH-4020 Basel, Switzerland<br>
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
Tel. +41 61 683 77 35<br>
Fax +41 61 302 89 18<br>
Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
Skype: milos.cuculovic.mdpi<br>
<br>
On 08.12.2016 16:17, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI<br>
&lt;<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> &lt;mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>&gt;&gt; wrote:<br>
<br>
    Ah, damn! I found the issue. On the storage server, the storage2<br>
    IP address was wrong; I had transposed two digits in the /etc/hosts<br>
    file, sorry for that :(<br>
<br>
    I was able to add the brick now and started the heal, but still no<br>
    data transfer is visible.<br>
<br>
</blockquote>
1. Are the files getting created on the new brick though?<br>
2. Can you provide the output of `getfattr -d -m . -e hex<br>
/data/data-cluster` on both bricks?<br>
3. Is it possible to attach gdb to the self-heal daemon on the original<br>
(old) brick and get a backtrace?<br>
    `gdb -p &lt;pid of self-heal daemon on the original brick&gt;`<br>
    `thread apply all bt`  --&gt; share this output<br>
    quit gdb.<br>
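<br>
A minimal sketch of how to collect that in one go (assuming the volume is<br>
named storage; the output file name is just an example):<br>
<br>
    # find the self-heal daemon (glustershd) PID on the original brick&#39;s node<br>
    gluster volume status storage | grep -i self-heal<br>
<br>
    # one-shot, non-interactive backtrace of all threads<br>
    gdb -p &lt;glustershd pid&gt; -batch -ex &quot;thread apply all bt&quot; &gt; shd-backtrace.txt<br>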
<br>
<br>
-Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
@Ravi/Pranith - can you help here?<br>
<br>
<br>
<br>
    By doing gluster volume status, I have<br>
<br>
    Status of volume: storage<br>
    Gluster process                        TCP Port  RDMA Port  Online  Pid<br>
    ------------------------------------------------------------------------------<br>
    Brick storage2:/data/data-cluster      49152     0          Y       23101<br>
    Brick storage:/data/data-cluster       49152     0          Y       30773<br>
    Self-heal Daemon on localhost          N/A       N/A        Y       30050<br>
    Self-heal Daemon on storage            N/A       N/A        Y       30792<br>
<br>
<br>
    Any idea?<br>
<br>
    On storage I have:<br>
    Number of Peers: 1<br>
<br>
    Hostname: 195.65.194.217<br>
    Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0<br>
    State: Peer in Cluster (Connected)<br>
<br>
<br>
    - Kindest regards,<br>
<br>
    Milos Cuculovic<br>
    IT Manager<br>
<br>
    ---<br>
    MDPI AG<br>
    Postfach, CH-4020 Basel, Switzerland<br>
    Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
    Tel. +41 61 683 77 35<br>
    Fax +41 61 302 89 18<br>
    Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
    Skype: milos.cuculovic.mdpi<br>
<br>
    On 08.12.2016 13:55, Atin Mukherjee wrote:<br>
<br>
        Can you resend the attachment as a zip? I am unable to extract<br>
        the content. We shouldn&#39;t have a 0-byte info file. What does<br>
        gluster peer status output say?<br>
<br>
        On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI<br>
        &lt;<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> &lt;mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>&gt;<br>
        &lt;mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a> &lt;mailto:<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>&gt;&gt;&gt; wrote:<br>
<br>
            I hope you received my last email, Atin. Thank you!<br>
<br>
            - Kindest regards,<br>
<br>
            Milos Cuculovic<br>
            IT Manager<br>
<br>
            ---<br>
            MDPI AG<br>
            Postfach, CH-4020 Basel, Switzerland<br>
            Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
            Tel. +41 61 683 77 35<br>
            Fax +41 61 302 89 18<br>
            Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
            Skype: milos.cuculovic.mdpi<br>
<br>
            On 08.12.2016 10:28, Atin Mukherjee wrote:<br>
<br>
<br>
                ---------- Forwarded message ----------<br>
                From: *Atin Mukherjee* &lt;<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>&gt;<br>
                Date: Thu, Dec 8, 2016 at 11:56 AM<br>
                Subject: Re: [Gluster-users] Replica brick not working<br>
                To: Ravishankar N &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt;<br>
                Cc: Miloš Čučulović - MDPI &lt;<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>&gt;,<br>
                Pranith Kumar Karampuri &lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;,<br>
                gluster-users &lt;<a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>&gt;<br>
<br>
<br>
<br>
<br>
                On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N<br>
                &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt; wrote:<br>
<br>
                    On 12/08/2016 10:43 AM, Atin Mukherjee wrote:<br>
<br>
                        From the log snippet:<br>
<br>
                        [2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req<br>
                        [2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2<br>
                        [2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:<br>
<br>
                        The last log entry indicates that we hit the code path in<br>
                        gd_addbr_validate_replica_count():<br>
<br>
                        if (replica_count == volinfo-&gt;replica_count) {<br>
                                if (!(total_bricks % volinfo-&gt;dist_leaf_count)) {<br>
                                        ret = 1;<br>
                                        goto out;<br>
                                }<br>
                        }<br>
<br>
<br>
                    It seems unlikely that this snippet was hit, because we print<br>
                    the E [MSGID: 106291] in the above message only if ret == -1.<br>
                    gd_addbr_validate_replica_count() returns -1 without populating<br>
                    err_str only when volinfo-&gt;type doesn&#39;t match any of the known<br>
                    volume types, so perhaps volinfo-&gt;type is corrupted?<br>
<br>
<br>
                You are right, I missed that ret is set to 1 here in the above snippet.<br>
<br>
                @Milos - Can you please provide us the volume info file from<br>
                /var/lib/glusterd/vols/&lt;volname&gt;/ from all three nodes to<br>
                continue the analysis?<br>
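<br>
                For example, something like this on each node would do (a sketch;<br>
                it assumes the volume name is storage and the archive name is just<br>
                an example):<br>
<br>
                    # run on every node and attach the output<br>
                    cat /var/lib/glusterd/vols/storage/info<br>
<br>
                    # or bundle the whole volume directory<br>
                    tar czf /tmp/$(hostname)-storage-vol.tar.gz /var/lib/glusterd/vols/storage<br>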
<br>
<br>
<br>
                    -Ravi<br>
<br>
                        @Pranith, Ravi - Milos was trying to convert a dist (1 x 1)<br>
                        volume to a replicate (1 x 2) using add-brick and hit this<br>
                        issue where add-brick failed. The cluster is operating with<br>
                        3.7.6. Could you help with which scenario can hit this code<br>
                        path? One straightforward issue I see here is the missing<br>
                        err_str in this path.<br>
<br>
<br>
<br>
<br>
<br>
<br>
--<br>
<br>
~ Atin (atinm)<br>
</blockquote>
<br>
<br>
</blockquote></blockquote>
<br>
<br>
</blockquote>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><br></div><div>~ Atin (atinm)<br></div></div></div></div>
</div>