<div dir="ltr"><div>Milos,<br><br></div>I just managed to look into a similar issue, and my analysis is at [1]. I remember you mentioning some incorrect /etc/hosts entries that led to this same problem in an earlier case; would you mind rechecking them?<br><br>[1] <a href="http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html">http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html</a> </div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI <span dir="ltr"><<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi All,<br>
<br>
Moving forward with my issue, sorry for the late reply!<br>
<br>
I had some issues with the storage2 server (original volume), then decided to use 3.9.0, so I have the latest version.<br>
<br>
For that, I manually synced all the files to the storage server. I installed gluster 3.9.0 there, started it, created a new volume called storage, and all seems to work OK.<br>
<br>
Now I need to create my replicated volume (add a new brick on the storage2 server). Almost all the files are there. So, on the storage server, I ran:<br>
<br>
* sudo gluster peer probe storage2<br>
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force<br>
<br>
But then I receive "volume add-brick: failed: Host storage2 is not in 'Peer in Cluster' state"<br>
<br>
Any idea?<span class="im HOEnZb"><br>
<br>
- Kindest regards,<br>
<br>
Milos Cuculovic<br>
IT Manager<br>
<br>
---<br>
MDPI AG<br>
Postfach, CH-4020 Basel, Switzerland<br>
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland<br>
Tel. +41 61 683 77 35<br>
Fax +41 61 302 89 18<br>
Email: <a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a><br>
Skype: milos.cuculovic.mdpi<br>
<br></span><div class="HOEnZb"><div class="h5">
On 08.12.2016 17:52, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I was able to fix the sync by rsync-ing all the directories, then the<br>
heal started. The next problem :) is that as soon as there are files on the<br>
new brick, the gluster mount will also use this brick to serve clients, and<br>
the new brick is not ready yet, as the sync is not yet done, so it<br>
results in missing files on the client side. I temporarily removed the new<br>
brick; now I am running a manual rsync and will add the brick again,<br>
hope this works.<br>
<br>
What mechanism handles this issue? I guess there is something<br>
pre-built to make a replica brick available only once the data is<br>
completely synced.<br>
</blockquote>
This mechanism was introduced in 3.7.9 or 3.7.10<br>
(<a href="http://review.gluster.org/#/c/13806/" rel="noreferrer" target="_blank">http://review.gluster.org/#/c<wbr>/13806/</a>). Before that version, you<br>
needed to manually set some xattrs on the bricks so that healing could<br>
happen in parallel while the client would still serve reads from the<br>
original brick. I can't find the link to the doc which describes these<br>
steps for setting xattrs. :-(<br>
<br>
Calling it a day,<br>
Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On 08.12.2016 16:17, Ravishankar N wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI<br>
<<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>> wrote:<br>
<br>
Ah, damn! I found the issue. On the storage server, the storage2<br>
IP address was wrong; I had transposed two digits in the /etc/hosts<br>
file, sorry for that :(<br>
<br>
I was able to add the brick now, I started the heal, but still no<br>
data transfer visible.<br>
<br>
</blockquote>
1. Are the files getting created on the new brick though?<br>
2. Can you provide the output of `getfattr -d -m . -e hex<br>
/data/data-cluster` on both bricks?<br>
3. Is it possible to attach gdb to the self-heal daemon on the original<br>
(old) brick and get a backtrace?<br>
`gdb -p &lt;pid of self-heal daemon on the original brick&gt;`<br>
thread apply all bt -->share this output<br>
quit gdb.<br>
<br>
<br>
-Ravi<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
@Ravi/Pranith - can you help here?<br>
<br>
<br>
<br>
By doing gluster volume status, I have<br>
<br>
Status of volume: storage<br>
Gluster process TCP Port RDMA Port Online Pid<br>
------------------------------------------------------------------------------<br>
Brick storage2:/data/data-cluster 49152 0 Y 23101<br>
Brick storage:/data/data-cluster 49152 0 Y 30773<br>
Self-heal Daemon on localhost N/A N/A Y 30050<br>
Self-heal Daemon on storage N/A N/A Y 30792<br>
<br>
<br>
Any idea?<br>
<br>
On storage I have:<br>
Number of Peers: 1<br>
<br>
Hostname: 195.65.194.217<br>
Uuid: 7c988af2-9f76-4843-8e6f-d94866<wbr>d57bb0<br>
State: Peer in Cluster (Connected)<br>
<br>
<br>
<br>
On 08.12.2016 13:55, Atin Mukherjee wrote:<br>
<br>
Can you resend the attachment as zip? I am unable to extract the<br>
content. We shouldn't have 0 info file. What does the gluster peer<br>
status output say?<br>
<br>
On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI<br>
<<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>> wrote:<br>
<br>
I hope you received my last email Atin, thank you!<br>
<br>
<br>
On 08.12.2016 10:28, Atin Mukherjee wrote:<br>
<br>
<br>
---------- Forwarded message ----------<br>
From: *Atin Mukherjee* <<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>><br>
Date: Thu, Dec 8, 2016 at 11:56 AM<br>
Subject: Re: [Gluster-users] Replica brick not working<br>
To: Ravishankar N <<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>><br>
Cc: Miloš Čučulović - MDPI <<a href="mailto:cuculovic@mdpi.com" target="_blank">cuculovic@mdpi.com</a>>,<br>
Pranith Kumar Karampuri <<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>>,<br>
gluster-users <<a href="mailto:gluster-users@gluster.org" target="_blank">gluster-users@gluster.org</a>><br>
<br>
<br>
<br>
<br>
On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N <<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>> wrote:<br>
<br>
On 12/08/2016 10:43 AM, Atin Mukherjee wrote:<br>
<br>
From the log snippet:<br>
<br>
[2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req<br>
[2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2<br>
[2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:<br>
<br>
The last log entry indicates that we hit the code path in<br>
gd_addbr_validate_replica_count ()<br>
<br>
if (replica_count == volinfo->replica_count) {<br>
if (!(total_bricks % volinfo->dist_leaf_count)) {<br>
ret = 1;<br>
goto out;<br>
}<br>
}<br>
<br>
<br>
It seems unlikely that this snippet was hit, because we print the E<br>
[MSGID: 106291] in the above message only if ret==-1.<br>
gd_addbr_validate_replica_count() returns -1 without populating<br>
err_str only when volinfo->type doesn't match any of the known<br>
volume types, so perhaps volinfo->type is corrupted?<br>
<br>
<br>
You are right, I missed that ret is set to 1 in the above snippet.<br>
<br>
@Milos - Can you please provide us the volume info file from<br>
/var/lib/glusterd/vols/&lt;volname&gt;/ from all three nodes to continue<br>
the analysis?<br>
<br>
<br>
<br>
-Ravi<br>
<br>
@Pranith, Ravi - Milos was trying to convert a dist (1 X 1)<br>
volume to a replicate (1 X 2) using add-brick and hit this issue,<br>
where add-brick failed. The cluster is operating with 3.7.6.<br>
Could you help identify in what scenario this code path can be hit? One<br>
straightforward issue I see here is the missing err_str in this path.<br>
<br>
<br>
<br>
<br>
<br>
<br>
--<br>
<br>
~ Atin (atinm)<br>
<br>
<br>
<br>
</blockquote>
<br>
<br>
</blockquote></blockquote>
<br>
<br>
</blockquote>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><br></div><div>~ Atin (atinm)<br></div></div></div></div>
</div>