<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 05/12/2015 07:36 PM, Poornima
Gurusiddaiah wrote:<br>
</div>
<blockquote
cite="mid:145596718.13847710.1431439596741.JavaMail.zimbra@redhat.com"
type="cite">
<div style="font-family: times new roman,new york,times,serif;
font-size: 12pt; color: #000000">
<div>Hi,<br>
<br>
We recently uncovered an issue with THIS and libgfapi; it
can be generalized to any process that has multiple glusterfs_ctxs.<br>
</div>
<div><br>
</div>
<div>Before the master xlator (fuse/libgfapi) is created, all
the code that accesses THIS uses the global_xlator object,<br>
which is defined globally for the whole process.<br>
</div>
<div>The problem arises when multiple threads modify THIS
and overwrite the global_xlator's ctx, e.g. in glfs_new:<br>
glfs_new () {<br>
...<br>
ctx = glusterfs_ctx_new();<br>
glusterfs_globals_init();<br>
THIS = NULL; /* implies THIS = &amp;global_xlator */<br>
THIS->ctx = ctx;<br>
...<br>
}<br>
The issue is more severe than it appears: the other threads,
such as epoll, timer, and sigwaiter, when not executing in<br>
fop context, always refer to global_xlator and
global_xlator->ctx. Because of the probable race condition</div>
<div>described above, we may end up referring to a stale ctx,
which can lead to crashes.<br>
<br>
Probable solution:<br>
Currently THIS is thread-specific, but the global xlator
object it modifies is shared by all threads!<br>
The obvious association would be to have a global_xlator per ctx
instead of per process.<br>
The changes would be as follows:<br>
- Have a new global_xlator object in glusterfs_ctx.<br>
- After every creation of a new ctx, assign:<br>
&lt;store THIS&gt;<br>
THIS = new_ctx->global_xlator<br>
&lt;restore THIS&gt;<br>
- But how do we set THIS in every thread (epoll, timer, etc.)
that gets created as part of that ctx?<br>
Replace all the pthread_create calls for the ctx threads with
gf_pthread_create:<br>
</div>
<div> gf_pthread_create (fn, ..., ctx) {<br>
...<br>
ret = pthread_create (&amp;thr_id, attr, global_thread_init, {fn, ctx, args});<br>
...<br>
}<br>
<br>
global_thread_init (fn, ctx, args) {<br>
THIS = ctx->global_xlator;<br>
fn(args);<br>
}<br>
<br>
The other solution would be to not associate the threads with a
ctx, but instead share them among ctxs.</div>
<div><br>
Please let me know your thoughts on the same.<br>
<br>
</div>
<div>Regards,<br>
</div>
<div>Poornima<br>
</div>
</div>
<br>
</blockquote>
Hi Poornima,<br>
<br>
Recently, with glusterfs-3.7 beta1 RPMs, while creating a VM image
using qemu-img, I see the following errors:<br>
<br>
[2015-05-08 09:04:14.358896] E
[rpc-transport.c:512:rpc_transport_unref] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516]
(-->
/lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc]
(--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (-->
/lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] )))))
0-rpc_transport: invalid argument: this
<br>
[2015-05-08 09:04:14.359085] E
[rpc-transport.c:512:rpc_transport_unref] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516]
(-->
/lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc]
(--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (-->
/lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] )))))
0-rpc_transport: invalid argument: this
<br>
[2015-05-08 09:04:14.359241] E
[rpc-transport.c:512:rpc_transport_unref] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x186)[0x7f51f6bb6516]
(-->
/lib64/libgfrpc.so.0(rpc_transport_unref+0xa3)[0x7f51f965e493]
(--> /lib64/libgfrpc.so.0(rpc_clnt_unref+0x5c)[0x7f51f96617dc]
(--> /lib64/libglusterfs.so.0(+0x1edc1)[0x7f51f6bb2dc1] (-->
/lib64/libglusterfs.so.0(+0x1ed55)[0x7f51f6bb2d55] )))))
0-rpc_transport: invalid argument: this
<br>
<br>
Is this a consequence of the issue that you are talking about?<br>
<br>
<br>
-- Satheesaran<br>
</body>
</html>