<div dir="ltr">
<p class="">glibc-2.17-106.el7 is the latest glibc available on CentOS 7 at the moment. I tried the one-liner on older versions as well, and those are also reported as "likely buggy" by the test.</p><p class="">I found this related CentOS issue: <a href="https://bugs.centos.org/view.php?id=10426">https://bugs.centos.org/view.php?id=10426</a></p><p class=""># rpm -qa | grep glibc<br>glibc-2.17-106.el7_2.4.x86_64<br>glibc-common-2.17-106.el7_2.4.x86_64</p><p class=""># objdump -r -d /lib64/libc.so.6 | grep -C 20 _int_free | grep -C 10 cmpxchg | head -21 | grep -A 3 cmpxchg | tail -1 | (grep '%r' && echo "Your libc is likely buggy." || echo "Your libc looks OK.")<br>7ca3e:  48 85 c9  test %rcx,%rcx<br>Your libc is likely buggy.</p><p class="">Kind regards,<br>Fredrik Widlund</p></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Feb 23, 2016 at 4:27 PM, Raghavendra Gowdappa <span dir="ltr"><<a href="mailto:rgowdapp@redhat.com" target="_blank">rgowdapp@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We came across a glibc bug that could have caused some of these corruptions. Searching for possible causes, we found an issue (<a href="https://bugzilla.redhat.com/show_bug.cgi?id=1305406" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=1305406</a>) fixed in glibc-2.17-121.el7. The bug report gives the following test to determine whether an installed glibc is affected, which we ran on our local setup:<br>
<br>
----------------<br>
# objdump -r -d /lib64/libc.so.6 | grep -C 20 _int_free | grep -C 10 cmpxchg | head -21 | grep -A 3 cmpxchg | tail -1 | (grep '%r' && echo "Your libc is likely buggy." || echo "Your libc looks OK.")<br>
<br>
7cc36: 48 85 c9 test %rcx,%rcx<br>
Your libc is likely buggy.<br>
----------------<br>
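For anyone unpicking what that one-liner actually checks, the same filter chain can be unrolled one stage per line and wrapped in a function that reads disassembly on stdin, so it can be fed either real `objdump -r -d /lib64/libc.so.6` output or canned text. This is my own sketch, not from the bug report, and the disassembly snippet below is fabricated purely to illustrate the "buggy" shape (a register still referenced right after the cmpxchg):

```shell
#!/bin/sh
# Same filter chain as the one-liner above, one stage per line,
# reading disassembly text on stdin.
check_int_free() {
  grep -C 20 _int_free |   # keep disassembly near _int_free
  grep -C 10 cmpxchg   |   # narrow to the lock cmpxchg site
  head -21             |   # first candidate site only
  grep -A 3 cmpxchg    |   # cmpxchg plus the 3 instructions after it
  tail -1              |   # the last of those instructions
  ( grep '%r' && echo "Your libc is likely buggy." \
              || echo "Your libc looks OK." )
}

# Fabricated sample in the buggy shape; on a real system, pipe in
# `objdump -r -d /lib64/libc.so.6` instead.
printf '%s\n' \
  '000000000007c9f0 <_int_free>:' \
  '   7ca30: f0 48 0f b1 0a  lock cmpxchg %rcx,(%rdx)' \
  '   7ca3a: 48 8b 4e 08     mov    0x8(%rsi),%rcx' \
  '   7ca3e: 48 85 c9        test   %rcx,%rcx' \
  | check_int_free
```

The heuristic is only a signature match on the compiled code around `_int_free`, so treat its verdict as a hint, not proof.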
<br>
Could you check whether the above command on your setup also reports "Your libc is likely buggy."?<br>
<br>
Thanks to Nithya, Krutika and Pranith for working on this.<br>
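Since the fix is tied to a specific build, the package-version side of the check can be scripted as well. A sketch of my own (not from the bug report): it compares an installed version-release string against the fixed glibc-2.17-121.el7 using `sort -V`, which approximates, but is not identical to, rpm's own version ordering.

```shell
#!/bin/sh
# Compare a glibc version-release string against the release that
# carries the fix (glibc-2.17-121.el7, per the Red Hat bug above).
# sort -V approximates rpm's ordering; rpmdev-vercmp is authoritative.
fixed="2.17-121.el7"
# On a live system: current=$(rpm -q --qf '%{VERSION}-%{RELEASE}' glibc)
current="2.17-106.el7_2.4"   # the version reported earlier in this thread
lowest=$(printf '%s\n%s\n' "$current" "$fixed" | sort -V | head -1)
if [ "$current" != "$fixed" ] && [ "$lowest" = "$current" ]; then
  echo "glibc $current predates the fix ($fixed): likely affected"
else
  echo "glibc $current is at or past the fix"
fi
```

The objdump signature above remains the more direct check, since it inspects the code actually installed rather than inferring from version numbers.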
<span class="im HOEnZb"><br>
----- Original Message -----<br>
> From: "Fredrik Widlund" <<a href="mailto:fredrik.widlund@gmail.com">fredrik.widlund@gmail.com</a>><br>
> To: <a href="mailto:gluster@deej.net">gluster@deej.net</a><br>
> Cc: <a href="mailto:gluster-users@gluster.org">gluster-users@gluster.org</a><br>
> Sent: Tuesday, February 23, 2016 5:51:37 PM<br>
</span><span class="im HOEnZb">> Subject: Re: [Gluster-users] glusterfs client crashes<br>
><br>
</span><div class="HOEnZb"><div class="h5">> Hi,<br>
><br>
> I have experienced what looks like a very similar crash. Gluster 3.7.6 on<br>
> CentOS 7. No errors on the bricks or on the other clients mounted at the<br>
> time. Load was relatively high at the time.<br>
><br>
> Remounting the filesystem brought it back online.<br>
><br>
><br>
> pending frames:<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(STAT)<br>
> frame : type(1) op(STAT)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(1) op(READ)<br>
> frame : type(0) op(0)<br>
> patchset: git://<a href="http://git.gluster.com/glusterfs.git" rel="noreferrer" target="_blank">git.gluster.com/glusterfs.git</a><br>
> signal received: 6<br>
> time of crash:<br>
> 2016-02-22 10:28:45<br>
> configuration details:<br>
> argp 1<br>
> backtrace 1<br>
> dlfcn 1<br>
> libpthread 1<br>
> llistxattr 1<br>
> setfsid 1<br>
> spinlock 1<br>
> epoll.h 1<br>
> xattr.h 1<br>
> st_atim.tv_nsec 1<br>
> package-string: glusterfs 3.7.6<br>
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xc2)[0x7f83387f7012]<br>
> /lib64/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f83388134dd]<br>
> /lib64/libc.so.6(+0x35670)[0x7f8336ee5670]<br>
> /lib64/libc.so.6(gsignal+0x37)[0x7f8336ee55f7]<br>
> /lib64/libc.so.6(abort+0x148)[0x7f8336ee6ce8]<br>
> /lib64/libc.so.6(+0x75317)[0x7f8336f25317]<br>
> /lib64/libc.so.6(+0x7cfe1)[0x7f8336f2cfe1]<br>
> /lib64/libglusterfs.so.0(loc_wipe+0x27)[0x7f83387f4d47]<br>
> /usr/lib64/glusterfs/3.7.6/xlator/performance/md-cache.so(mdc_local_wipe+0x11)[0x7f8329c8e5f1]<br>
> /usr/lib64/glusterfs/3.7.6/xlator/performance/md-cache.so(mdc_stat_cbk+0x10c)[0x7f8329c8f4fc]<br>
> /lib64/libglusterfs.so.0(default_stat_cbk+0xac)[0x7f83387fcc5c]<br>
> /usr/lib64/glusterfs/3.7.6/xlator/cluster/distribute.so(dht_file_attr_cbk+0x149)[0x7f832ab2a409]<br>
> /usr/lib64/glusterfs/3.7.6/xlator/protocol/client.so(client3_3_stat_cbk+0x3c6)[0x7f832ad6d266]<br>
> /lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0x90)[0x7f83385c5b80]<br>
> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x1bf)[0x7f83385c5e3f]<br>
> /lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7f83385c1983]<br>
> /usr/lib64/glusterfs/3.7.6/rpc-transport/socket.so(+0x9506)[0x7f832d261506]<br>
> /usr/lib64/glusterfs/3.7.6/rpc-transport/socket.so(+0xc3f4)[0x7f832d2643f4]<br>
> /lib64/libglusterfs.so.0(+0x878ea)[0x7f83388588ea]<br>
> /lib64/libpthread.so.0(+0x7dc5)[0x7f833765fdc5]<br>
> /lib64/libc.so.6(clone+0x6d)[0x7f8336fa621d]<br>
><br>
><br>
><br>
> Kind regards,<br>
> Fredrik Widlund<br>
><br>
> On Tue, Feb 23, 2016 at 1:00 PM, <<a href="mailto:gluster-users-request@gluster.org">gluster-users-request@gluster.org</a>> wrote:<br>
><br>
><br>
> Date: Mon, 22 Feb 2016 15:08:47 -0500<br>
> From: Dj Merrill <<a href="mailto:gluster@deej.net">gluster@deej.net</a>><br>
> To: Gaurav Garg <<a href="mailto:ggarg@redhat.com">ggarg@redhat.com</a>><br>
> Cc: <a href="mailto:gluster-users@gluster.org">gluster-users@gluster.org</a><br>
> Subject: Re: [Gluster-users] glusterfs client crashes<br>
> Message-ID: <<a href="mailto:56CB6ACF.5080408@deej.net">56CB6ACF.5080408@deej.net</a>><br>
> Content-Type: text/plain; charset=utf-8; format=flowed<br>
><br>
> On 2/21/2016 2:23 PM, Dj Merrill wrote:<br>
> > Very interesting. They were reporting both bricks offline, but the<br>
> > processes on both servers were still running. Restarting glusterfsd on<br>
> > one of the servers brought them both back online.<br>
><br>
> I realize I wasn't clear in my comments yesterday and would like to<br>
> elaborate on this a bit further. The "very interesting" comment was<br>
> sparked because when we were running 3.7.6, the bricks were not<br>
> reporting as offline when a client was having an issue, so this is new<br>
> behaviour now that we are running 3.7.8 (or a different issue entirely).<br>
><br>
> The other point that I was not clear on is that we may have one client<br>
> reporting the "Transport endpoint is not connected" error, but the other<br>
> 40+ clients all continue to work properly. This is the case with both<br>
> 3.7.6 and 3.7.8.<br>
><br>
> Curious, how can the other clients continue to work fine if both Gluster<br>
> 3.7.8 servers are reporting the bricks as offline?<br>
><br>
> What does "offline" mean in this context?<br>
><br>
><br>
> Re: the server logs, here is what I've found so far listed on both<br>
> gluster servers (glusterfs1 and glusterfs2):<br>
><br>
> [2016-02-21 08:06:02.785788] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:20.677010] W [socket.c:588:__socket_rwv]<br>
> 0-gv0-client-1: readv on (sanitized IP of glusterfs2):49152 failed (No<br>
> data available)<br>
> [2016-02-21 18:48:20.677096] I [MSGID: 114018]<br>
> [client.c:2030:client_rpc_notify] 0-gv0-client-1: disconnected from<br>
> gv0-client-1. Client process will keep trying to connect to glusterd<br>
> until brick's port is available<br>
> [2016-02-21 18:48:31.148564] E [MSGID: 114058]<br>
> [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-1:<br>
> failed to get the port number for remote subvolume. Please run 'gluster<br>
> volume status' on server to see if brick process is running.<br>
> [2016-02-21 18:48:40.941715] W [socket.c:588:__socket_rwv] 0-glusterfs:<br>
> readv on (sanitized IP of glusterfs2):24007 failed (No data available)<br>
> [2016-02-21 18:48:51.184424] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:51.972068] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec]<br>
> 0-mgmt: Volume file changed<br>
> [2016-02-21 18:48:51.980210] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec]<br>
> 0-mgmt: Volume file changed<br>
> [2016-02-21 18:48:51.985211] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec]<br>
> 0-mgmt: Volume file changed<br>
> [2016-02-21 18:48:51.995002] I [glusterfsd-mgmt.c:58:mgmt_cbk_spec]<br>
> 0-mgmt: Volume file changed<br>
> [2016-02-21 18:48:53.006079] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:53.018104] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:53.024060] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:53.035170] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:48:53.045637] I [rpc-clnt.c:1847:rpc_clnt_reconfig]<br>
> 0-gv0-client-1: changing port to 49152 (from 0)<br>
> [2016-02-21 18:48:53.051991] I [MSGID: 114057]<br>
> [client-handshake.c:1437:select_server_supported_programs]<br>
> 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)<br>
> [2016-02-21 18:48:53.052439] I [MSGID: 114046]<br>
> [client-handshake.c:1213:client_setvolume_cbk] 0-gv0-client-1: Connected<br>
> to gv0-client-1, attached to remote volume '/export/brick1/sdb1'.<br>
> [2016-02-21 18:48:53.052486] I [MSGID: 114047]<br>
> [client-handshake.c:1224:client_setvolume_cbk] 0-gv0-client-1: Server<br>
> and Client lk-version numbers are not same, reopening the fds<br>
> [2016-02-21 18:48:53.052668] I [MSGID: 114035]<br>
> [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-1:<br>
> Server lk version = 1<br>
> [2016-02-21 18:48:31.148706] I [MSGID: 114018]<br>
> [client.c:2030:client_rpc_notify] 0-gv0-client-1: disconnected from<br>
> gv0-client-1. Client process will keep trying to connect to glusterd<br>
> until brick's port is available<br>
> [2016-02-21 18:49:12.271865] W [socket.c:588:__socket_rwv] 0-glusterfs:<br>
> readv on (sanitized IP of glusterfs2):24007 failed (No data available)<br>
> [2016-02-21 18:49:15.637745] W [socket.c:588:__socket_rwv]<br>
> 0-gv0-client-1: readv on (sanitized IP of glusterfs2):49152 failed (No<br>
> data available)<br>
> [2016-02-21 18:49:15.637824] I [MSGID: 114018]<br>
> [client.c:2030:client_rpc_notify] 0-gv0-client-1: disconnected from<br>
> gv0-client-1. Client process will keep trying to connect to glusterd<br>
> until brick's port is available<br>
> [2016-02-21 18:49:24.198431] E [socket.c:2278:socket_connect_finish]<br>
> 0-glusterfs: connection to (sanitized IP of glusterfs2):24007 failed<br>
> (Connection refused)<br>
> [2016-02-21 18:49:26.204811] E [socket.c:2278:socket_connect_finish]<br>
> 0-gv0-client-1: connection to (sanitized IP of glusterfs2):24007 failed<br>
> (Connection refused)<br>
> [2016-02-21 18:49:38.366559] I [MSGID: 108031]<br>
> [afr-common.c:1883:afr_local_discovery_cbk] 0-gv0-replicate-0: selecting<br>
> local read_child gv0-client-0<br>
> [2016-02-21 18:50:54.605535] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk]<br>
> 0-glusterfs: No change in volfile, continuing<br>
> [2016-02-21 18:50:54.605639] E [MSGID: 114058]<br>
> [client-handshake.c:1524:client_query_portmap_cbk] 0-gv0-client-1:<br>
> failed to get the port number for remote subvolume. Please run 'gluster<br>
> volume status' on server to see if brick process is running.<br>
><br>
><br>
</div></div><div class="HOEnZb"><div class="h5">> _______________________________________________<br>
> Gluster-users mailing list<br>
> <a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
> <a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</div></div></blockquote></div><br></div>