<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 22/07/2016 21:12, Yannick Perret
wrote:<br>
</div>
<blockquote cite="mid:5792701F.3070704@liris.cnrs.fr" type="cite">On
22/07/2016 17:47, Mykola Ulianytskyi wrote:
<br>
<blockquote type="cite">Hi
<br>
<br>
<blockquote type="cite"> 3.7 clients are not compatible with
3.6 servers
<br>
</blockquote>
Can you provide more info?
<br>
<br>
I use some 3.7 clients with 3.6 servers and don't see issues.
<br>
</blockquote>
Well,
<br>
with a 3.7.13 client compiled on the same machine, when I try the
same mount I get:
<br>
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
<br>
Mount failed. Please check the log file for more details.
<br>
<br>
Checking the logs (/var/log/glusterfs/zog.log) I have:
<br>
[2016-07-22 19:05:40.249143] I [MSGID: 100030]
[glusterfsd.c:2338:main] 0-/usr/local/sbin/glusterfs: Started
running /usr/local/sbin/glusterfs version 3.7.13 (args:
/usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
--volfile-id=BACKUP-ADMIN-DATA /zog)
<br>
[2016-07-22 19:05:40.258437] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
<br>
[2016-07-22 19:05:40.259480] W [socket.c:701:__socket_rwv]
0-glusterfs: readv on <the-IP>:24007 failed (Aucune donnée
disponible)
<br>
[2016-07-22 19:05:40.259859] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x175)[0x7fad7d039335]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b3)[0x7fad7ce04e73]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fad7ce04f6e]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fad7ce065ee]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fad7ce06de8]
))))) 0-glusterfs: forced unwinding frame type(GlusterFS
Handshake) op(GETSPEC(2)) called at 2016-07-22 19:05:40.258858
(xid=0x1)
<br>
[2016-07-22 19:05:40.259894] E
[glusterfsd-mgmt.c:1690:mgmt_getspec_cbk] 0-mgmt: failed to fetch
volume file (key:BACKUP-ADMIN-DATA)
<br>
[2016-07-22 19:05:40.259939] W
[glusterfsd.c:1251:cleanup_and_exit]
(-->/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1de)
[0x7fad7ce04e9e]
-->/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x454) [0x40d564]
-->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b) [0x407eab]
) 0-: received signum (0), shutting down
<br>
[2016-07-22 19:05:40.259965] I [fuse-bridge.c:5720:fini] 0-fuse:
Unmounting '/zog'.
<br>
[2016-07-22 19:05:40.260913] W
[glusterfsd.c:1251:cleanup_and_exit]
(-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4)
[0x7fad7c0a30a4]
-->/usr/local/sbin/glusterfs(glusterfs_sigwaiter+0xc5)
[0x408015] -->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b)
[0x407eab] ) 0-: received signum (15), shutting down
<br>
<br>
</blockquote>
Hmmm… I just noticed that the logs are (partly) localized, which can
make them harder to understand for non-French speakers.<br>
"Aucune donnée disponible" means: no data available<br>
<br>
BTW, if I could get 3.7 clients to work with my servers, and if the
memory leak doesn't exist in 3.7, that would be fine for me.<br>
<br>
--<br>
Y.<br>
<br>
<blockquote cite="mid:5792701F.3070704@liris.cnrs.fr" type="cite">I
did not investigate further, as I simply presumed that the 3.7
series was not compatible with 3.6 servers, but it may be something
else. In any case it is the same client, the same server(s) and the
same volume here.
<br>
<br>
The build has the following features enabled (built with "configure
--disable-tiering" as I don't have the dependencies installed for that):
<br>
FUSE client : yes
<br>
Infiniband verbs : no
<br>
epoll IO multiplex : yes
<br>
argp-standalone : no
<br>
fusermount : yes
<br>
readline : yes
<br>
georeplication : yes
<br>
Linux-AIO : no
<br>
Enable Debug : no
<br>
Block Device xlator : no
<br>
glupy : yes
<br>
Use syslog : yes
<br>
XML output : yes
<br>
QEMU Block formats : no
<br>
Encryption xlator : yes
<br>
Unit Tests : no
<br>
POSIX ACLs : yes
<br>
Data Classification : no
<br>
firewalld-config : no
<br>
<br>
Regards,
<br>
--
<br>
Y.
<br>
<br>
<br>
<blockquote type="cite">Thank you
<br>
<br>
--
<br>
With best regards,
<br>
Mykola
<br>
<br>
<br>
On Fri, Jul 22, 2016 at 4:31 PM, Yannick Perret
<br>
<a class="moz-txt-link-rfc2396E" href="mailto:yannick.perret@liris.cnrs.fr"><yannick.perret@liris.cnrs.fr></a> wrote:
<br>
<blockquote type="cite">Note: I have a dev client machine, so I
can perform tests or recompile the
<br>
glusterfs client if that can help gather data about this.
<br>
<br>
I did not test this problem against the 3.7.x series as my 2
servers are in use
<br>
and I can't upgrade them at this time, and 3.7 clients are not
compatible
<br>
with 3.6 servers (as far as I can see from my tests).
<br>
<br>
--
<br>
Y.
<br>
<br>
<br>
On 22/07/2016 14:06, Yannick Perret wrote:
<br>
<br>
Hello,
<br>
some time ago I posted about a memory leak in the client process,
but it was on
<br>
a very old 32-bit machine (both kernel and OS) and I found no
evidence
<br>
of a similar problem on our recent machines.
<br>
But I performed more tests and I see the same problem.
<br>
<br>
Clients are 64-bit Debian 8.2 machines. The glusterfs client on
these machines is
<br>
compiled from source with the following features enabled:
<br>
FUSE client : yes
<br>
Infiniband verbs : no
<br>
epoll IO multiplex : yes
<br>
argp-standalone : no
<br>
fusermount : yes
<br>
readline : yes
<br>
georeplication : yes
<br>
Linux-AIO : no
<br>
Enable Debug : no
<br>
systemtap : no
<br>
Block Device xlator : no
<br>
glupy : no
<br>
Use syslog : yes
<br>
XML output : yes
<br>
QEMU Block formats : no
<br>
Encryption xlator : yes
<br>
Erasure Code xlator : yes
<br>
<br>
I tested both the 3.6.7 and 3.6.9 versions on the client (3.6.7 is
the one installed
<br>
on our machines, including servers; 3.6.9 is for testing with the
last 3.6
<br>
version).
<br>
<br>
Here are the operations on the client (the 3.6.7 version gave
similar results):
<br>
# /usr/local/sbin/glusterfs --version
<br>
glusterfs 3.6.9 built on Jul 22 2016 13:27:42
<br>
(…)
<br>
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
<br>
# cd /usr/
<br>
# cp -Rp * /zog/TEMP/
<br>
Then I monitored the memory used by the glusterfs process while
'cp' was running
<br>
(VSZ and RSS from 'ps', respectively):
<br>
284740 70232
<br>
284740 70232
<br>
284876 71704
<br>
285000 72684
<br>
285136 74008
<br>
285416 75940
<br>
(…)
<br>
368684 151980
<br>
369324 153768
<br>
369836 155576
<br>
370092 156192
<br>
370092 156192
<br>
Here both sizes are stable and correspond to the end of the 'cp'
command.
<br>
If I restart another 'cp' (even on the same directories) the size
starts
<br>
increasing again.
<br>
If I perform an 'ls -lR' in the directory the size also increases:
<br>
370756 192488
<br>
389964 212148
<br>
390948 213232
<br>
(here I ^C the 'ls')
<br>
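This kind of sampling can be scripted; here is a minimal sketch
(the function name 'sample_mem' and the pgrep pattern are just
illustrative, adapt them to the actual mount command line):
<br>
```shell
# Print VSZ and RSS (in KiB) of a process every few seconds until it
# exits. 'sample_mem' is an illustrative name; pass it the PID of the
# glusterfs client process, e.g.:
#   sample_mem "$(pgrep -f 'glusterfs --volfile-server' | head -n1)" 5
sample_mem() {
    pid=$1
    interval=${2:-5}
    # kill -0 sends no signal; it only tests that the process exists
    while kill -0 "$pid" 2>/dev/null; do
        ps -o vsz=,rss= -p "$pid"
        sleep "$interval"
    done
}
```
The loop stops by itself once the client process terminates (for
example after an unmount).
<br>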
<br>
When doing nothing the size doesn't increase, but it never
decreases (calling
<br>
'sync' doesn't change the situation).
<br>
Sending a HUP signal to the glusterfs process also increases
memory (390948
<br>
213324 → 456484 213320).
<br>
Changing the volume configuration (changing the
diagnostics.client-sys-log-level
<br>
value) doesn't change anything.
<br>
<br>
Here is the current ps output:
<br>
root 17041 4.9 5.2 456484 213320 ? Ssl 13:29
1:21
<br>
/usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
<br>
--volfile-id=BACKUP-ADMIN-DATA /zog
<br>
<br>
Of course unmounting/remounting falls back to the "start" size:
<br>
# umount /zog
<br>
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
<br>
→ root 28741 0.3 0.7 273320 30484 ? Ssl 13:57
0:00
<br>
/usr/local/sbin/glusterfs --volfile-server=sto1.my.domain
<br>
--volfile-id=BACKUP-ADMIN-DATA /zog
<br>
<br>
<br>
I didn't see this before because most of our volumes are
mounted "on demand"
<br>
for some storage activities, or are permanently mounted but
with very little
<br>
activity.
<br>
But clearly this memory usage drift is a long-term problem. On
the old 32-bit
<br>
machine I had this problem ("solved" by using NFS mounts while
waiting
<br>
for this old machine to be replaced) and it led to glusterfs
being killed
<br>
by the OS when it ran out of free memory. It was faster than what
I describe here, but
<br>
it's just a question of time.
<br>
<br>
<br>
Thanks for any help about that.
<br>
<br>
Regards,
<br>
--
<br>
Y.
<br>
<br>
<br>
The corresponding volume on the servers is (in case it helps):
<br>
Volume Name: BACKUP-ADMIN-DATA
<br>
Type: Replicate
<br>
Volume ID: 306d57f3-fb30-4bcc-8687-08bf0a3d7878
<br>
Status: Started
<br>
Number of Bricks: 1 x 2 = 2
<br>
Transport-type: tcp
<br>
Bricks:
<br>
Brick1: sto1.my.domain:/glusterfs/backup-admin/data
<br>
Brick2: sto2.my.domain:/glusterfs/backup-admin/data
<br>
Options Reconfigured:
<br>
diagnostics.client-sys-log-level: WARNING
<br>
<br>
<br>
<br>
<br>
<br>
<br>
_______________________________________________
<br>
Gluster-users mailing list
<br>
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<br>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a>
<br>
<br>
<br>
<br>
</blockquote>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
</body>
</html>