<div dir="ltr">I&#39;d tried that some time back but ran into some merge conflicts and was not sure who to turn to :) May I come to you for help with that?!<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 17, 2016 at 3:29 PM, Atin Mukherjee <span dir="ltr">&lt;<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
<br>
On 06/17/2016 03:21 PM, B.K.Raghuram wrote:<br>
&gt; Thanks a ton Atin. That fixed cherry-pick. Will build it and let you<br>
&gt; know how it goes. Does it make sense to try and merge the whole upstream<br>
&gt; glusterfs repo for the 3.6 branch in order to get all the other bug<br>
&gt; fixes? That may bring in many more merge conflicts though..<br>
<br>
</span>Yup, I&#39;d not recommend that. Applying your local changes on the latest<br>
version is a much easier option :)<br>
<span class="im HOEnZb"><br>
&gt;<br>
&gt; On Fri, Jun 17, 2016 at 3:07 PM, Atin Mukherjee<br>
</span><span class="im HOEnZb">&gt; &lt;<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt; wrote:<br>
&gt;<br>
&gt;     I&#39;ve resolved the merge conflicts and files are attached. Copy these<br>
&gt;     files and follow the instructions from the cherry pick command which<br>
&gt;     failed.<br>
&gt;<br>
&gt;     ~Atin<br>
&gt;<br>
&gt;     On 06/17/2016 02:55 PM, B.K.Raghuram wrote:<br>
&gt;     &gt;<br>
&gt;     &gt; Thanks Atin, I had three merge conflicts in the third patch.. I&#39;ve<br>
&gt;     &gt; attached the files with the conflicts. Would any of the intervening<br>
&gt;     &gt; commits be needed as well?<br>
&gt;     &gt;<br>
&gt;     &gt; The conflicts were in :<br>
&gt;     &gt;<br>
&gt;     &gt;     both modified:      libglusterfs/src/mem-types.h<br>
&gt;     &gt;     both modified:      xlators/mgmt/glusterd/src/glusterd-utils.c<br>
&gt;     &gt;     both modified:      xlators/mgmt/glusterd/src/glusterd-utils.h<br>
&gt;     &gt;<br>
&gt;     &gt;<br>
&gt;     &gt; On Fri, Jun 17, 2016 at 2:17 PM, Atin Mukherjee<br>
</span><span class="im HOEnZb">&gt;     &gt; &lt;<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt; wrote:<br>
&gt;     &gt;<br>
&gt;     &gt;<br>
&gt;     &gt;<br>
&gt;     &gt;     On 06/17/2016 12:44 PM, B.K.Raghuram wrote:<br>
&gt;     &gt;     &gt; Thanks Atin.. I&#39;m not familiar with pulling patches from the review system<br>
&gt;     &gt;     &gt; but will try :)<br>
&gt;     &gt;<br>
&gt;     &gt;     It&#39;s not that difficult. Open the gerrit review link, go to the download<br>
&gt;     &gt;     drop-down at the top right corner, click on it and then you will see a<br>
&gt;     &gt;     cherry pick option. Copy that command and run it in the source code repo<br>
&gt;     &gt;     you host. If there are no merge conflicts, it should apply automatically;<br>
&gt;     &gt;     otherwise you&#39;d need to fix them manually.<br>
&gt;     &gt;<br>
&gt;     &gt;     HTH.<br>
&gt;     &gt;     Atin<br>
&gt;     &gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt; On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee<br>
</span><div class="HOEnZb"><div class="h5">&gt;     &gt;     &gt; &lt;<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt; wrote:<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     On 06/16/2016 06:17 PM, Atin Mukherjee wrote:<br>
&gt;     &gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt; On 06/16/2016 01:32 PM, B.K.Raghuram wrote:<br>
&gt;     &gt;     &gt;     &gt;&gt; Thanks a lot Atin,<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt; The problem is that we are using a forked version of 3.6.1 which has<br>
&gt;     &gt;     &gt;     &gt;&gt; been modified to work with ZFS (for snapshots) but we do not have the<br>
&gt;     &gt;     &gt;     &gt;&gt; resources to port that over to the later versions of gluster.<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt; Would you know of anyone who would be willing to take this on?!<br>
&gt;     &gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt; If you can cherry pick the patches, apply them on your source and<br>
&gt;     &gt;     &gt;     &gt; rebuild it, I can point you to the patches, but you&#39;d need to give<br>
&gt;     &gt;     &gt;     &gt; me a day as I have some other items to clear off my plate.<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     Here is the list of patches that need to be applied, in the following<br>
&gt;     &gt;     &gt;     order:<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     <a href="http://review.gluster.org/9328" rel="noreferrer" target="_blank">http://review.gluster.org/9328</a><br>
&gt;     &gt;     &gt;     <a href="http://review.gluster.org/9393" rel="noreferrer" target="_blank">http://review.gluster.org/9393</a><br>
&gt;     &gt;     &gt;     <a href="http://review.gluster.org/10023" rel="noreferrer" target="_blank">http://review.gluster.org/10023</a><br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt; ~Atin<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt; Regards,<br>
&gt;     &gt;     &gt;     &gt;&gt; -Ram<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt; On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee<br>
</div></div><div class="HOEnZb"><div class="h5">&gt;     &gt;     &gt;     &gt;&gt; &lt;<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt; wrote:<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     On 06/16/2016 10:49 AM, B.K.Raghuram wrote:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; &lt;<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt; wrote:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     On 06/15/2016 04:24 PM, B.K.Raghuram wrote:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; Hi,<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; We&#39;re using gluster 3.6.1 and we periodically find that gluster commands<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; fail saying that it could not get the lock on one of the brick machines.<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; The logs on that machine then say something like:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; [2016-06-15 08:17:03.076119] E<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; acquire lock for vol2<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     This is possible if concurrent volume operations are run. Do you<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     have any script which checks volume status at an interval from all<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     the nodes? If so, then this is expected behavior.<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; Yes, I do have a couple of scripts that check on volume and quota<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; status.. Given this, I do get an &quot;Another transaction is in progress..&quot;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; message which is ok. The problem is that sometimes I get the volume lock<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; held message which never goes away. This sometimes results in glusterd<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; consuming a lot of memory and CPU and the problem can only be fixed with<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; a reboot. The log files are huge so I&#39;m not sure if it&#39;s ok to attach<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; them to an email.<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     Ok, so this is known. We have fixed lots of stale lock issues in the 3.7<br>
&gt;     &gt;     &gt;     &gt;&gt;     branch and some of them, if not all, were also backported to the 3.6 branch.<br>
&gt;     &gt;     &gt;     &gt;&gt;     The issue is you are using 3.6.1, which is quite old. If you can upgrade<br>
&gt;     &gt;     &gt;     &gt;&gt;     to the latest version of 3.7, or at worst of 3.6, I am confident that this<br>
&gt;     &gt;     &gt;     &gt;&gt;     will go away.<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     ~Atin<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; After some time, glusterd then seems to give up and die..<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     Do you mean glusterd shuts down or segfaults? If so, I am more interested<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     in analyzing this part. Could you provide us the glusterd log and<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     cmd_history log file, along with the core (in case of SEGV), from all the<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     nodes for further analysis?<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; There is no segfault. glusterd just shuts down. As I said above,<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; sometimes this happens and sometimes it just continues to hog a lot of<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; memory and CPU..<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; Interestingly, I also find the following line at the beginning of<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; etc-glusterfs-glusterd.vol.log and I don&#39;t know if this has any<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; significance to the issue:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; [2016-06-14 06:48:57.282290] I<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt; Detected new install. Setting op-version to maximum : 30600<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;     &gt; What does this line signify?<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;     &gt;&gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;     &gt;<br>
&gt;     &gt;<br>
&gt;     &gt;<br>
&gt;<br>
&gt;<br>
</div></div></blockquote></div><br></div>