<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Oct 24, 2016 at 12:39 PM, Xavier Hernandez <span dir="ltr"><<a href="mailto:xhernandez@datalab.es" target="_blank">xhernandez@datalab.es</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Soumya,<div><div class="m_-9158205786550862485gmail-h5"><br>
<br>
On 21/10/16 16:15, Soumya Koduri wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
<br>
On 10/21/2016 06:35 PM, Soumya Koduri wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Hi Xavi,<br>
<br>
On 10/21/2016 12:57 PM, Xavier Hernandez wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Looking at the code, I think that the added fd_unref() should only be<br>
called if the fop preparation fails. Otherwise the callback already<br>
unreferences the fd.<br>
<br>
Code flow:<br>
<br>
* glfs_fsync_async_common() takes an fd ref and calls STACK_WIND passing<br>
that fd.<br>
* Just after that a ref is released.<br>
* When glfs_io_async_cbk() is called another ref is released.<br>
<br>
Note that if fop preparation fails, a single fd_unref() is called, but<br>
on success two fd_unref() are called.<br>
</blockquote>
<br>
Sorry for the inconvenience caused. I think its patch#15573 hasn't<br>
caused the problem but has highlighted another ref leak in the code.<br>
<br>
>From the code I see that glfs_io_async_cbk() does fd_unref (glfd->fd)<br>
but not the fd passed in STACK_WIND_COOKIE() of the fop.<br>
<br>
If I take any fop, for eg.,<br>
glfs_fsync_common() {<br>
<br>
fd = glfs_resolve_fd (glfd->fs, subvol, glfd);<br>
<br>
<br>
}<br>
<br>
Here in glfs_resolve_fd ()<br>
<br>
fd_t *<br>
__glfs_resolve_fd (struct glfs *fs, xlator_t *subvol, struct glfs_fd<br>
*glfd)<br>
{<br>
fd_t *fd = NULL;<br>
<br>
if (glfd->fd->inode->table->xl == subvol)<br>
return fd_ref (glfd->fd);<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Here we can see that we are taking extra ref additional to the<br>
</blockquote></blockquote></blockquote></blockquote></blockquote></blockquote></blockquote></blockquote>
ref already taken for glfd->fd. That means the caller of this function<br>
needs to fd_unref(fd) irrespective of subsequent fd_unref (glfd->fd).<br>
<br>
fd = __glfs_migrate_fd (fs, subvol, glfd);<br>
if (!fd)<br>
return NULL;<br>
<br>
<br>
if (subvol == fs->active_subvol) {<br>
fd_unref (glfd->fd);<br>
glfd->fd = fd_ref (fd);<br>
}<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
I think the issue is here during graph_switch(). You have<br>
</blockquote></blockquote></blockquote></blockquote></blockquote></blockquote></blockquote></blockquote>
mentioned as well that the crash happens post graph_switch. Maybe here<br>
we are missing an extra ref to be taken for fd additional to glfd->fd. I<br>
need to look through __glfs_migrate_fd() to confirm that. But these are<br>
my initial thoughts.<br>
</blockquote>
<br>
Looking into this, I think we should fix glfs_io_async_cbk() not to<br>
fd_unref(glfd->fd). glfd->fd should be active though out the lifetime of<br>
glfd (i.e, until it is closed). Thoughts?<br>
</blockquote>
<br></div></div>
I don't know gfapi internals in deep, but at first sight I think this would be the right think to do. Assuming that glfd will keep a reference to the fd until it's destroyed, and that a glfd reference is taken during the lifetime of each request that needs it, the fd_unref() in glfd_io_async_cbk() seems unnecessary. I think it was there just to release the fd acquired if glfs_resolve_fd(), but it's better to place it where it's now.<br>
<br>
Another question is if we really need to take an additional reference in glfs_resolve_fd() ?<br></blockquote><div><br></div><div>This answers the first question too, we don't need the additional ref in glfs_resolve_fd() now that we have ref accounting in glfd. This confusion is because earlier there was no ref accounting for glfd and only refs were on fd_t.</div><div>No, we don't need this additional reference.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Can an fd returned by this function live more time than the associated glfd in some circumstances ?<span class="m_-9158205786550862485gmail-"><br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Also could you please check if it is the second/subsequent fsync_async()<br>
call which results in crash?<br>
</blockquote>
<br></span>
I'll try to test it as soon as possible, but this is on a server that we need to put in production very soon and we have decided to go with fuse for now. We'll have a lot of work to do this week. Once I have some free time I'll build a test environment to check it, probably next week.<br>
<br></blockquote><div><br></div><div>I have not been able to test this out completely. Theoretically, I don't see any possibility where fd can outlive a glfd that is pointing to it. </div><div>I have sent a patch[1] that</div><div>a. fixes the crash</div><div>b. handles the unref in failure cases</div><div>c. but still has the duplicate refs with some explanation in commit message.</div><div>d. adds a simple test for the same.</div><div><br></div><div>I request that we take this patch in and make a release soon as it is affecting many community users and do the cleanup in another patch.</div><div><br></div><div>[1] <a href="http://review.gluster.org/#/c/15768/" target="_blank">http://review.gluster.org/<wbr>#/c/15768/</a> </div><div><br></div><div>Thanks,</div><div>Raghavendra Talur </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Xavi<div class="m_-9158205786550862485gmail-HOEnZb"><div class="m_-9158205786550862485gmail-h5"><br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Thanks,<br>
Soumya<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Please let me know your comments.<br>
<br>
Thanks,<br>
Soumya<br>
<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Xavi<br>
<br>
On 21/10/16 09:03, Xavier Hernandez wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Hi,<br>
<br>
I've just tried Gluster 3.8.5 with Proxmox using gfapi and I<br>
consistently see a crash each time an attempt to connect to the volume<br>
is made.<br>
<br>
The backtrace of the crash shows this:<br>
<br>
#0 pthread_spin_lock () at<br>
../nptl/sysdeps/x86_64/pthread<wbr>_spin_lock.S:24<br>
#1 0x00007fe5345776a5 in fd_unref (fd=0x7fe523f7205c) at fd.c:553<br>
#2 0x00007fe53482ba18 in glfs_io_async_cbk (op_ret=<optimized out>,<br>
op_errno=0, frame=<optimized out>, cookie=0x7fe526c67040,<br>
iovec=iovec@entry=0x0, count=count@entry=0)<br>
at glfs-fops.c:839<br>
#3 0x00007fe53482beed in glfs_fsync_async_cbk (frame=<optimized out>,<br>
cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,<br>
op_errno=<optimized out>,<br>
prebuf=<optimized out>, postbuf=0x7fe5217fe890, xdata=0x0) at<br>
glfs-fops.c:1382<br>
#4 0x00007fe520be2eb7 in ?? () from<br>
/usr/lib/x86_64-linux-gnu/glus<wbr>terfs/3.8.5/xlator/debug/io-st<wbr>ats.so<br>
#5 0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef3ac,<br>
cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,<br>
postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508<br>
#6 0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef204,<br>
cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,<br>
postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508<br>
#7 0x00007fe525f78219 in dht_fsync_cbk (frame=0x7fe52ceef2d8,<br>
cookie=0x560ef95398e8, this=0x0, op_ret=0, op_errno=0,<br>
prebuf=0x7fe5217fe820, postbuf=0x7fe5217fe890, xdata=0x0)<br>
at dht-inode-read.c:873<br>
#8 0x00007fe5261bbc7f in client3_3_fsync_cbk (req=0x7fe525f78030<br>
<dht_fsync_cbk>, iov=0x7fe526c61040, count=8,<br>
myframe=0x7fe52ceef130) at<br>
client-rpc-fops.c:975<br>
#9 0x00007fe5343201f0 in rpc_clnt_handle_reply (clnt=0x18,<br>
clnt@entry=0x7fe526fafac0, pollin=0x7fe526c3a1c0) at rpc-clnt.c:791<br>
#10 0x00007fe53432056c in rpc_clnt_notify (trans=<optimized out>,<br>
mydata=0x7fe526fafaf0, event=<optimized out>, data=0x7fe526c3a1c0) at<br>
rpc-clnt.c:962<br>
#11 0x00007fe53431c8a3 in rpc_transport_notify (this=<optimized out>,<br>
event=<optimized out>, data=<optimized out>) at rpc-transport.c:541<br>
#12 0x00007fe5283e8d96 in socket_event_poll_in (this=0x7fe526c69440) at<br>
socket.c:2267<br>
#13 0x00007fe5283eaf37 in socket_event_handler (fd=<optimized out>,<br>
idx=5, data=0x7fe526c69440, poll_in=1, poll_out=0, poll_err=0) at<br>
socket.c:2397<br>
#14 0x00007fe5345ab3f6 in event_dispatch_epoll_handler<br>
(event=0x7fe5217fecc0, event_pool=0x7fe526ca2040) at event-epoll.c:571<br>
#15 event_dispatch_epoll_worker (data=0x7fe527c0f0c0) at<br>
event-epoll.c:674<br>
#16 0x00007fe5324140a4 in start_thread (arg=0x7fe5217ff700) at<br>
pthread_create.c:309<br>
#17 0x00007fe53214962d in clone () at<br>
../sysdeps/unix/sysv/linux/x86<wbr>_64/clone.S:111<br>
<br>
The fd being unreferenced contains this:<br>
<br>
(gdb) print *fd<br>
$6 = {<br>
pid = 97649,<br>
flags = 2,<br>
refcount = 0,<br>
inode_list = {<br>
next = 0x7fe523f7206c,<br>
prev = 0x7fe523f7206c<br>
},<br>
inode = 0x0,<br>
lock = {<br>
spinlock = 1,<br>
mutex = {<br>
__data = {<br>
__lock = 1,<br>
__count = 0,<br>
__owner = 0,<br>
__nusers = 0,<br>
__kind = 0,<br>
__spins = 0,<br>
__elision = 0,<br>
__list = {<br>
__prev = 0x0,<br>
__next = 0x0<br>
}<br>
},<br>
__size = "\001", '\000' <repeats 38 times>,<br>
__align = 1<br>
}<br>
},<br>
_ctx = 0x7fe52ec31c40,<br>
xl_count = 11,<br>
lk_ctx = 0x7fe526c126a0,<br>
anonymous = _gf_false<br>
}<br>
<br>
fd->inode is NULL, explaining the cause of the crash. We also see that<br>
fd->refcount is already 0. So I'm wondering if this couldn't be an<br>
extra<br>
fd_unref() introduced by that patch.<br>
<br>
The crash seems to happen immediately after a graph switch.<br>
<br>
Xavi<br>
______________________________<wbr>_________________<br>
Gluster-devel mailing list<br>
<a href="mailto:Gluster-devel@gluster.org" target="_blank">Gluster-devel@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-devel" rel="noreferrer" target="_blank">http://www.gluster.org/mailman<wbr>/listinfo/gluster-devel</a><br>
</blockquote></blockquote>
______________________________<wbr>_________________<br>
Gluster-devel mailing list<br>
<a href="mailto:Gluster-devel@gluster.org" target="_blank">Gluster-devel@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-devel" rel="noreferrer" target="_blank">http://www.gluster.org/mailman<wbr>/listinfo/gluster-devel</a><br>
</blockquote></blockquote>
______________________________<wbr>_________________<br>
Gluster-devel mailing list<br>
<a href="mailto:Gluster-devel@gluster.org" target="_blank">Gluster-devel@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-devel" rel="noreferrer" target="_blank">http://www.gluster.org/mailman<wbr>/listinfo/gluster-devel</a><br>
</div></div></blockquote></div><br></div></div>