<div dir="ltr">Thanks, Atin. I'm not familiar with pulling patches from the review system, but will try :)<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee <span dir="ltr"><<a href="mailto:amukherj@redhat.com" target="_blank">amukherj@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
<br>
On 06/16/2016 06:17 PM, Atin Mukherjee wrote:<br>
><br>
><br>
> On 06/16/2016 01:32 PM, B.K.Raghuram wrote:<br>
>> Thanks a lot Atin,<br>
>><br>
>> The problem is that we are using a forked version of 3.6.1 which has<br>
>> been modified to work with ZFS (for snapshots) but we do not have the<br>
>> resources to port that over to the later versions of gluster.<br>
>><br>
>> Would you know of anyone who would be willing to take this on?!<br>
><br>
> If you can cherry-pick the patches, apply them to your source, and<br>
> rebuild it, I can point you to the patches, but you'd need to give me a<br>
> day's time as I have some other items on my plate to finish.<br>
<br>
<br>
</span>Here is the list of patches that need to be applied in the following order:<br>
<br>
<a href="http://review.gluster.org/9328" rel="noreferrer" target="_blank">http://review.gluster.org/9328</a><br>
<a href="http://review.gluster.org/9393" rel="noreferrer" target="_blank">http://review.gluster.org/9393</a><br>
<a href="http://review.gluster.org/10023" rel="noreferrer" target="_blank">http://review.gluster.org/10023</a><br>
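review.gluster.org runs Gerrit, which publishes every change under a ref of the form refs/changes/&lt;last two digits of change number&gt;/&lt;change number&gt;/&lt;patchset&gt;. A rough sketch of building the fetch-and-cherry-pick commands for the three changes above, in order, might look like this (the patchset number 1 here is an assumption — check each review page for the latest patchset before applying):

```shell
#!/bin/sh
# Sketch: derive the Gerrit ref for each change and print the command that
# would fetch and cherry-pick it onto the current branch, in listed order.
# Assumption: patchset 1; the real patchset number is on each review page.
for change in 9328 9393 10023; do
  suffix=$(printf '%s' "$change" | tail -c 2)   # last two digits of the change number
  ref="refs/changes/${suffix}/${change}/1"
  echo "git fetch https://review.gluster.org/glusterfs ${ref} && git cherry-pick FETCH_HEAD"
done
```

Printing the commands first (rather than running them blind) makes it easy to verify the refs against the review pages before touching the 3.6.1 fork.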
<div class="HOEnZb"><div class="h5"><br>
><br>
> ~Atin<br>
>><br>
>> Regards,<br>
>> -Ram<br>
>><br>
>> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee <<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a><br>
>> <mailto:<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>>> wrote:<br>
>><br>
>><br>
>><br>
>> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:<br>
>> ><br>
>> ><br>
>> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee <<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a> <mailto:<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>><br>
>> > <mailto:<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a> <mailto:<a href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>>>> wrote:<br>
>> ><br>
>> ><br>
>> ><br>
>> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:<br>
>> > > Hi,<br>
>> > ><br>
>> > > We're using gluster 3.6.1 and we periodically find that gluster commands<br>
>> > > fail, saying that it could not get the lock on one of the brick machines.<br>
>> > > The logs on that machine then say something like :<br>
>> > ><br>
>> > > [2016-06-15 08:17:03.076119] E<br>
>> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to<br>
>> > > acquire lock for vol2<br>
>> ><br>
>> > This is a possible case if concurrent volume operations are run. Do you<br>
>> > have any script which checks volume status at an interval from all the<br>
>> > nodes? If so, then this is expected behavior.<br>
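Since each gluster volume operation takes a cluster-wide lock, interval-based monitoring scripts can collide with each other and with admin commands. One way to at least keep the checks on a single node from overlapping is to serialize them through a local file lock before invoking the CLI — a minimal sketch, assuming Linux flock(1) and a made-up lock-file path:

```shell
#!/bin/sh
# run_serialized CMD...: hold an exclusive file lock while CMD runs, so two
# monitoring scripts on this node never issue gluster commands concurrently.
# The lock-file path and the 30s timeout are assumptions, not gluster defaults.
LOCKFILE="${LOCKFILE:-/tmp/gluster-monitor.lock}"
run_serialized() {
  (
    flock -w 30 9 || exit 1   # give up if another check holds the lock too long
    "$@"
  ) 9>"$LOCKFILE"
}

# In a real monitoring script the command would be e.g.:
#   run_serialized gluster volume status vol2
run_serialized echo "volume status check ran under the lock"
```

Note that this only serializes commands on one node; checks fired from multiple nodes still contend for glusterd's cluster lock, so staggering their intervals may also help.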
>> ><br>
>> ><br>
>> > Yes, I do have a couple of scripts that check on volume and quota<br>
>> > status. Given this, I do get an "Another transaction is in progress.."<br>
>> > message, which is OK. The problem is that sometimes I get the<br>
>> > volume-lock-held message, which never goes away. This sometimes results<br>
>> > in glusterd consuming a lot of memory and CPU, and the problem can only<br>
>> > be fixed with a reboot. The log files are huge, so I'm not sure if it's<br>
>> > OK to attach them to an email.<br>
>><br>
>> OK, so this is known. We have fixed lots of stale-lock issues in the 3.7<br>
>> branch, and some of them, if not all, were also backported to the 3.6<br>
>> branch. The issue is that you are using 3.6.1, which is quite old. If you<br>
>> can upgrade to the latest version of 3.7, or at worst of 3.6, I am<br>
>> confident that this will go away.<br>
>><br>
>> ~Atin<br>
>> ><br>
>> > ><br>
>> > > After some time, glusterd then seems to give up and die..<br>
>> ><br>
>> > Do you mean glusterd shuts down or segfaults? If so, I am more<br>
>> > interested in analyzing this part. Could you provide us the glusterd<br>
>> > log and cmd_history log file, along with the core (in case of SEGV),<br>
>> > from all the nodes for further analysis?<br>
>> ><br>
>> ><br>
>> > There is no segfault. glusterd just shuts down. As I said above,<br>
>> > sometimes this happens and sometimes it just continues to hog a lot of<br>
>> > memory and CPU..<br>
>> ><br>
>> ><br>
>> > ><br>
>> > > Interestingly, I also find the following line at the beginning of<br>
>> > > etc-glusterfs-glusterd.vol.log, and I don't know if this has any<br>
>> > > significance to the issue:<br>
>> > ><br>
>> > > [2016-06-14 06:48:57.282290] I<br>
>> > > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:<br>
>> > > Detected new install. Setting op-version to maximum : 30600<br>
>> > ><br>
>> ><br>
>> ><br>
>> > What does this line signify?<br>
>><br>
>><br>
</div></div></blockquote></div><br></div>