[Gluster-users] quick-read hang / lock up

Andre Felipe Machado andremachado at techforce.com.br
Tue Oct 13 18:25:56 UTC 2009


Hello,
I am trying to use the newly documented quick-read translator, but the cluster
now locks up.
I replaced the read-ahead translator with this new one.
Previously I was using the Debian 2.0.4 package; today I packaged 2.0.7 here
(unofficial packages).
When using quick-read for bursts, glusterfs locks up. I need to kill the
glusterfs server, then umount, then restart the nodes.
For a simple "touch filename", the log files show:

# tail -50 /var/log/glusterfs/-etc-glusterfs-glusterfsd.vol.log 


 72: # option transport.ib-verbs.work-request-recv-size  131072
 73: # option transport.ib-verbs.work-request-recv-count 64
 74: 
 75: # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
 76:   subvolumes brick
 77: # NOTE: Access to any volume through protocol/server is denied by
 78: # default. You need to explicitly grant access through # "auth"
 79: # option.
 80:   option auth.addr.brick.allow * # Allow access to "brick" volume
 81: end-volume

+------------------------------------------------------------------------------+
[2009-10-13 15:00:47] N [glusterfsd.c:1315:main] glusterfs: Successfully started
[2009-10-13 15:01:02] N [server-protocol.c:7065:mop_setvolume] server: accepted
client from 10.200.113.170:1023
[2009-10-13 15:01:02] N [server-protocol.c:7065:mop_setvolume] server: accepted
client from 10.200.113.170:1022
pending frames:
frame : type(1) op(LOOKUP)

patchset: v2.0.7
signal received: 11
time of crash: 2009-10-13 15:01:39
configuration details:
argp 1
backtrace 1
bdb->cursor->get 1
db.h 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.7
/lib/libc.so.6[0x7f47e8317f60]
/lib/libc.so.6(memcpy+0x2b0)[0x7f47e8363160]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-cache.so(ioc_lookup_cbk+0x2a3)[0x7f47e70ab573]
/usr/lib/libglusterfs.so.0[0x7f47e8a6de54]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-threads.so(iot_lookup_cbk+0x34)[0x7f47e74bed14]
/usr/lib/libglusterfs.so.0[0x7f47e8a6de54]
/usr/lib/glusterfs/2.0.7/xlator/storage/posix.so(posix_lookup+0x2fe)[0x7f47e78e185e]
/usr/lib/libglusterfs.so.0(default_lookup+0xb1)[0x7f47e8a71371]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-threads.so(iot_lookup_wrapper+0xb1)[0x7f47e74c2291]
/usr/lib/libglusterfs.so.0(call_resume+0x1cb)[0x7f47e8a7792b]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-threads.so(iot_worker_unordered+0x18)[0x7f47e74bfe48]
/lib/libpthread.so.0[0x7f47e863ffc7]
/lib/libc.so.6(clone+0x6d)[0x7f47e83b55ad]
---------
debian459140:~# 
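
To make the crash path in the backtrace above easier to read, a small script
can pull out only the translator frames. This is a rough sketch in Python, not
part of glusterfs: the function name translator_frames and the regular
expression are my own assumptions about the frame format shown in this log,
and the sample lines are copied from the backtrace above.

```python
import re

# Matches translator frames such as:
#   /usr/lib/glusterfs/2.0.7/xlator/performance/io-cache.so(ioc_lookup_cbk+0x2a3)[0x7f47e70ab573]
# Frames without a symbol name, or outside /xlator/, are ignored.
FRAME_RE = re.compile(
    r"/xlator/(?P<kind>[\w-]+)/(?P<name>[\w-]+)\.so\((?P<func>\w+)\+0x[0-9a-f]+\)"
)


def translator_frames(log_text):
    """Return (translator, function) pairs in log order (innermost frame first)."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m["kind"] + "/" + m["name"], m["func"]))
    return frames


# A few lines copied from the server backtrace in this message.
SAMPLE = """\
/lib/libc.so.6(memcpy+0x2b0)[0x7f47e8363160]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-cache.so(ioc_lookup_cbk+0x2a3)[0x7f47e70ab573]
/usr/lib/libglusterfs.so.0[0x7f47e8a6de54]
/usr/lib/glusterfs/2.0.7/xlator/performance/io-threads.so(iot_lookup_cbk+0x34)[0x7f47e74bed14]
/usr/lib/glusterfs/2.0.7/xlator/storage/posix.so(posix_lookup+0x2fe)[0x7f47e78e185e]
"""

if __name__ == "__main__":
    for xlator, func in translator_frames(SAMPLE):
        print(xlator, func)
```

Read top to bottom, the first translator frame is the one closest to the
signal 11; in this log that is ioc_lookup_cbk in performance/io-cache, called
just before memcpy.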



# tail -25 /var/log/glusterfs/php_sessions-.log 


105: ### Add writeback feature
106: volume writeback
107:   type performance/write-behind
108: #  option aggregate-size 2MB 	# deprecated option
109:   option cache-size 500MB 	# default is equal to aggregate-size
110:   option flush-behind off  	# default is 'off'
111: 				# too aggressive and slow background flush!
112: 				# do not enable for php sessions behaviour
113:   subvolumes iocache   
114: end-volume

+------------------------------------------------------------------------------+
[2009-10-13 15:01:02] N [glusterfsd.c:1315:main] glusterfs: Successfully started
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote1:
Connected to 10.200.113.170:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote1:
Connected to 10.200.113.170:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote2:
Connected to 10.200.113.171:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote2:
Connected to 10.200.113.171:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote3:
Connected to 10.200.113.172:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote3:
Connected to 10.200.113.172:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote4:
Connected to 10.200.113.173:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:02] N [client-protocol.c:5730:client_setvolume_cbk] remote4:
Connected to 10.200.113.173:6996, attached to remote volume 'brick'.
[2009-10-13 15:01:39] E [saved-frames.c:165:saved_frames_unwind] remote1: forced
unwinding frame type(1) op(LOOKUP)
[2009-10-13 15:01:39] N [client-protocol.c:6435:notify] remote1: disconnected
[2009-10-13 15:01:42] E [socket.c:745:socket_connect_finish] remote1: connection
to 10.200.113.170:6996 failed (Connection refused)
[2009-10-13 15:01:42] E [socket.c:745:socket_connect_finish] remote1: connection
to 10.200.113.170:6996 failed (Connection refused)
debian459140:~# 

Please, what am I doing wrong?
Is the translator order correct?

The configuration files used are attached.
Regards.
Andre Felipe Machado


