<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.16.3">
</HEAD>
<BODY>
On Mon, 2015-08-10 at 21:39 +0530, Atin Mukherjee wrote:<BR>
<BLOCKQUOTE TYPE=CITE>
<FONT COLOR="#000000">-Atin</FONT><BR>
<FONT COLOR="#000000">Sent from one plus one</FONT><BR>
<FONT COLOR="#000000">On Aug 10, 2015 9:37 PM, "Kingsley" &lt;<A HREF="mailto:gluster@gluster.dogwind.com">gluster@gluster.dogwind.com</A>&gt; wrote:</FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> On Mon, 2015-08-10 at 21:34 +0530, Atin Mukherjee wrote:</FONT><BR>
<FONT COLOR="#000000">> > -Atin</FONT><BR>
<FONT COLOR="#000000">> > Sent from one plus one</FONT><BR>
<FONT COLOR="#000000">> > On Aug 10, 2015 7:19 PM, "Kingsley" &lt;<A HREF="mailto:gluster@gluster.dogwind.com">gluster@gluster.dogwind.com</A>&gt; wrote:</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Further to this, the volume doesn't seem overly healthy. Any idea how I</FONT><BR>
<FONT COLOR="#000000">> > > can get it back into a working state?</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Trying to access one particular directory on the clients just hangs. If</FONT><BR>
<FONT COLOR="#000000">> > > I query heal info, that directory appears in the output as possibly</FONT><BR>
<FONT COLOR="#000000">> > > undergoing heal (actual directory name changed as it's private info):</FONT><BR>
<FONT COLOR="#000000">> > Can you execute strace and see which call is stuck? That would help us</FONT><BR>
<FONT COLOR="#000000">> > to get to the exact component which we would need to look at.</FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> Hi,</FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> I've never used strace before. Could you give me the command line to</FONT><BR>
<FONT COLOR="#000000">> type?</FONT><BR>
<FONT COLOR="#000000">Just type strace followed by the command</FONT><BR>
</BLOCKQUOTE>
<BR>
Is this what you meant? (I renamed the broken directory so that I could create another and let the system continue to work with a freshly created one.) The strace ran very quickly and returned me to the command prompt, but when I then "cd"d into that directory and ran a plain "ls", it hung, i.e.:<BR>
<BR>
--8&lt;--<BR>
[root@voicemail1b-1 14391.broken]# ls<BR>
^Z<BR>
<BR>
<BR>
<BR>
fg<BR>
<BR>
--8&lt;--<BR>
<BR>
Anyway, the strace:<BR>
<BR>
[root@voicemail1b-1 834723]# strace ls 14391.broken<BR>
execve("/usr/bin/ls", ["ls", "14391.broken"], [/* 27 vars */]) = 0<BR>
brk(0) = 0x158c000<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d2494c000<BR>
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)<BR>
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3<BR>
fstat(3, {st_mode=S_IFREG|0644, st_size=31874, ...}) = 0<BR>
mmap(NULL, 31874, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7d24944000<BR>
close(3) = 0<BR>
open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240d\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=147120, ...}) = 0<BR>
mmap(NULL, 2246784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d24509000<BR>
mprotect(0x7f7d2452a000, 2097152, PROT_NONE) = 0<BR>
mmap(0x7f7d2472a000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x21000) = 0x7f7d2472a000<BR>
mmap(0x7f7d2472c000, 6272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7d2472c000<BR>
close(3) = 0<BR>
open("/lib64/libcap.so.2", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \26\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=20024, ...}) = 0<BR>
mmap(NULL, 2114112, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d24304000<BR>
mprotect(0x7f7d24308000, 2093056, PROT_NONE) = 0<BR>
mmap(0x7f7d24507000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f7d24507000<BR>
close(3) = 0<BR>
open("/lib64/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\37\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=37056, ...}) = 0<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d24943000<BR>
mmap(NULL, 2130560, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d240fb000<BR>
mprotect(0x7f7d24102000, 2097152, PROT_NONE) = 0<BR>
mmap(0x7f7d24302000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7f7d24302000<BR>
close(3) = 0<BR>
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\34\2\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=2107760, ...}) = 0<BR>
mmap(NULL, 3932736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d23d3a000<BR>
mprotect(0x7f7d23ef0000, 2097152, PROT_NONE) = 0<BR>
mmap(0x7f7d240f0000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f7d240f0000<BR>
mmap(0x7f7d240f6000, 16960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7d240f6000<BR>
close(3) = 0<BR>
open("/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=398272, ...}) = 0<BR>
mmap(NULL, 2490888, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d23ad9000<BR>
mprotect(0x7f7d23b38000, 2097152, PROT_NONE) = 0<BR>
mmap(0x7f7d23d38000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5f000) = 0x7f7d23d38000<BR>
close(3) = 0<BR>
open("/lib64/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000/\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=153184, ...}) = 0<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d24942000<BR>
mmap(NULL, 2245240, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d238b4000<BR>
mprotect(0x7f7d238d8000, 2093056, PROT_NONE) = 0<BR>
mmap(0x7f7d23ad7000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x23000) = 0x7f7d23ad7000<BR>
close(3) = 0<BR>
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\16\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=19512, ...}) = 0<BR>
mmap(NULL, 2109744, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d236b0000<BR>
mprotect(0x7f7d236b3000, 2093056, PROT_NONE) = 0<BR>
mmap(0x7f7d238b2000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f7d238b2000<BR>
close(3) = 0<BR>
open("/lib64/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\23\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=19888, ...}) = 0<BR>
mmap(NULL, 2113904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d234ab000<BR>
mprotect(0x7f7d234af000, 2093056, PROT_NONE) = 0<BR>
mmap(0x7f7d236ae000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f7d236ae000<BR>
close(3) = 0<BR>
open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3<BR>
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240l\0\0\0\0\0\0"..., 832) = 832<BR>
fstat(3, {st_mode=S_IFREG|0755, st_size=141616, ...}) = 0<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d24941000<BR>
mmap(NULL, 2208864, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7d2328f000<BR>
mprotect(0x7f7d232a5000, 2097152, PROT_NONE) = 0<BR>
mmap(0x7f7d234a5000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7f7d234a5000<BR>
mmap(0x7f7d234a7000, 13408, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7d234a7000<BR>
close(3) = 0<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d24940000<BR>
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d2493e000<BR>
arch_prctl(ARCH_SET_FS, 0x7f7d2493e800) = 0<BR>
mprotect(0x7f7d240f0000, 16384, PROT_READ) = 0<BR>
mprotect(0x7f7d234a5000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d236ae000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d238b2000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d23ad7000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d23d38000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d24302000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d24507000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d2472a000, 4096, PROT_READ) = 0<BR>
mprotect(0x61a000, 4096, PROT_READ) = 0<BR>
mprotect(0x7f7d2494f000, 4096, PROT_READ) = 0<BR>
munmap(0x7f7d24944000, 31874) = 0<BR>
set_tid_address(0x7f7d2493ead0) = 17906<BR>
set_robust_list(0x7f7d2493eae0, 24) = 0<BR>
rt_sigaction(SIGRTMIN, {0x7f7d23295780, [], SA_RESTORER|SA_SIGINFO, 0x7f7d2329e130}, NULL, 8) = 0<BR>
rt_sigaction(SIGRT_1, {0x7f7d23295810, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f7d2329e130}, NULL, 8) = 0<BR>
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0<BR>
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0<BR>
statfs("/sys/fs/selinux", {f_type=0xf97cff8c, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0<BR>
statfs("/sys/fs/selinux", {f_type=0xf97cff8c, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0<BR>
stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0<BR>
brk(0) = 0x158c000<BR>
brk(0x15ad000) = 0x15ad000<BR>
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0<BR>
ioctl(1, TIOCGWINSZ, {ws_row=41, ws_col=202, ws_xpixel=0, ws_ypixel=0}) = 0<BR>
stat("14391.broken", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0<BR>
openat(AT_FDCWD, "14391.broken", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 23 entries */, 32768) = 552<BR>
getdents(3, /* 19 entries */, 32768) = 456<BR>
getdents(3, /* 0 entries */, 32768) = 0<BR>
close(3) = 0<BR>
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0<BR>
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7d2494b000<BR>
write(1, "012 033 046 063 076 087 09"..., 195012 033 046 063 076 087 096 104 112 120 128 136 144 152 160 172 180 195 209 225 235 246 258 279 298 313 343 389 628 900 908 918 926 934 942 950 958 968 980 994<BR>
) = 195<BR>
write(1, "013 034 049 065 079 088 09"..., 195013 034 049 065 079 088 097 105 113 121 129 137 145 153 161 173 184 196 212 226 236 247 266 281 299 314 348 394 843 901 910 919 927 935 943 951 959 970 981 996<BR>
) = 195<BR>
write(1, "014 035 050 066 080 089 09"..., 195014 035 050 066 080 089 098 106 114 122 130 138 146 154 162 174 185 197 215 227 237 248 267 288 301 317 349 396 869 902 911 920 928 936 944 952 960 972 982 997<BR>
) = 195<BR>
write(1, "016 039 052 071 081 090 09"..., 195016 039 052 071 081 090 099 107 115 123 131 139 147 155 163 175 186 198 219 229 238 250 269 291 305 321 350 405 882 903 912 921 929 937 945 953 961 973 984 998<BR>
) = 195<BR>
write(1, "018 041 055 072 082 091 10"..., 190018 041 055 072 082 091 100 108 116 124 132 140 148 156 164 176 187 203 221 230 239 251 270 292 306 328 354 407 890 904 914 922 930 938 946 954 962 974 985<BR>
) = 190<BR>
write(1, "019 042 057 073 084 092 10"..., 190019 042 057 073 084 092 101 109 117 125 133 141 149 157 165 177 190 204 222 231 240 253 272 293 308 336 357 413 892 905 915 923 931 939 947 955 965 976 988<BR>
) = 190<BR>
write(1, "024 043 059 074 085 093 10"..., 190024 043 059 074 085 093 102 110 118 126 134 142 150 158 166 178 193 206 223 232 241 255 274 294 309 339 370 470 895 906 916 924 932 940 948 956 966 977 989<BR>
) = 190<BR>
write(1, "031 044 060 075 086 095 10"..., 190031 044 060 075 086 095 103 111 119 127 135 143 151 159 167 179 194 207 224 234 243 257 275 296 310 342 386 517 899 907 917 925 933 941 949 957 967 978 993<BR>
) = 190<BR>
close(1) = 0<BR>
munmap(0x7f7d2494b000, 4096) = 0<BR>
close(2) = 0<BR>
exit_group(0) = ?<BR>
+++ exited with 0 +++<BR>
<BR>
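Since the trace above finished cleanly while the plain "ls" inside the directory hung, the hang may only show up when each entry is stat-ed (an interactive "ls" is often aliased to "ls --color=auto", which does that; this is an assumption, not something confirmed here). A minimal sketch for capturing it: run the trace into a file from inside the directory, then look for the final call that has no result. The helper name and the output path below are made up for illustration.<BR>
<BR>
<PRE>
```shell
# Hypothetical helper: print the last system call in a strace log that has
# no " = result" part, which usually means the call never returned.
# Heuristic only: exit-status ("+++") and signal ("--- SIG") lines are
# skipped, and a call whose arguments contain " = " would be missed.
last_unfinished_call() {
    grep -v -e ' = ' -e '+++' -e '--- SIG' "$1" | tail -n 1
}

# Example, run from inside the directory that hangs (path is arbitrary):
#   strace -tt -T -o /tmp/ls.strace ls -l
# then, from a second terminal while the listing is still stuck:
#   last_unfinished_call /tmp/ls.strace
```
</PRE>
Here -tt adds timestamps, -T adds the time spent in each call, and -o writes the trace to a file so it does not get mixed in with the listing itself.<BR>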
<BR>
<BR>
<BLOCKQUOTE TYPE=CITE>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> Then ... do I need to run something on one of the bricks while strace is</FONT><BR>
<FONT COLOR="#000000">> running?</FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> Cheers,</FONT><BR>
<FONT COLOR="#000000">> Kingsley.</FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > [root@gluster1b-1 ~]# gluster volume heal callrec info</FONT><BR>
<FONT COLOR="#000000">> > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:164f888f-2049-49e6-ad26-c758ee091863&gt;</FONT><BR>
<FONT COLOR="#000000">> > > /recordings/834723/14391 - Possibly undergoing heal</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:650efeca-b45c-413b-acc3-f0a5853ccebd&gt;</FONT><BR>
<FONT COLOR="#000000">> > > Number of entries: 7</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > Number of entries: 0</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:e280b40c-d8b7-43c5-9da7-4737054d7a7f&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:164f888f-2049-49e6-ad26-c758ee091863&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:650efeca-b45c-413b-acc3-f0a5853ccebd&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:b1fbda4a-732f-4f5d-b5a1-8355d786073e&gt;</FONT><BR>
<FONT COLOR="#000000">> > > /recordings/834723/14391 - Possibly undergoing heal</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:edb74524-b4b7-4190-85e7-4aad002f6e7c&gt;</FONT><BR>
<FONT COLOR="#000000">> > > &lt;gfid:9b8b8446-1e27-4113-93c2-6727b1f457eb&gt;</FONT><BR>
<FONT COLOR="#000000">> > > Number of entries: 7</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > Number of entries: 0</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > If I query each brick directly for the number of files/directories</FONT><BR>
<FONT COLOR="#000000">> > > within that, I get 1731 on gluster1a-1 and gluster2a-1, but 1737 on the</FONT><BR>
<FONT COLOR="#000000">> > > other two, using this command:</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > # find /data/brick/callrec/recordings/834723/14391 -print | wc -l</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > Cheers,</FONT><BR>
<FONT COLOR="#000000">> > > Kingsley.</FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> > > On Mon, 2015-08-10 at 11:05 +0100, Kingsley wrote:</FONT><BR>
<FONT COLOR="#000000">> > > > Sorry for the blind panic - restarting the volume seems to have fixed</FONT><BR>
<FONT COLOR="#000000">> > > > it.</FONT><BR>
<FONT COLOR="#000000">> > > ></FONT><BR>
<FONT COLOR="#000000">> > > > But then my next question - why is this necessary? Surely it undermines</FONT><BR>
<FONT COLOR="#000000">> > > > the whole point of a high availability system?</FONT><BR>
<FONT COLOR="#000000">> > > ></FONT><BR>
<FONT COLOR="#000000">> > > > Cheers,</FONT><BR>
<FONT COLOR="#000000">> > > > Kingsley.</FONT><BR>
<FONT COLOR="#000000">> > > ></FONT><BR>
<FONT COLOR="#000000">> > > > On Mon, 2015-08-10 at 10:53 +0100, Kingsley wrote:</FONT><BR>
<FONT COLOR="#000000">> > > > > Hi,</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > We have a 4 way replicated volume using gluster 3.6.3 on CentOS 7.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Over the weekend I did a yum update on each of the bricks in turn, but</FONT><BR>
<FONT COLOR="#000000">> > > > > now when clients (using fuse mounts) try to access the volume, it hangs.</FONT><BR>
<FONT COLOR="#000000">> > > > > Gluster itself wasn't updated (we've disabled that repo so that we keep</FONT><BR>
<FONT COLOR="#000000">> > > > > to 3.6.3 for now).</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > This was what I did:</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > * on first brick, "yum update"</FONT><BR>
<FONT COLOR="#000000">> > > > > * reboot brick</FONT><BR>
<FONT COLOR="#000000">> > > > > * watch "gluster volume status" on another brick and wait for it</FONT><BR>
<FONT COLOR="#000000">> > > > > to say all 4 bricks are online before proceeding to update the</FONT><BR>
<FONT COLOR="#000000">> > > > > next brick</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > I was expecting the clients might pause 30 seconds while they notice a</FONT><BR>
<FONT COLOR="#000000">> > > > > brick is offline, but then recover.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > I've tried re-mounting clients, but that hasn't helped.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > I can't see much data in any of the log files.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > I've tried "gluster volume heal callrec" but it doesn't seem to have</FONT><BR>
<FONT COLOR="#000000">> > > > > helped.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > What shall I do next?</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > I've pasted some stuff below in case any of it helps.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Cheers,</FONT><BR>
<FONT COLOR="#000000">> > > > > Kingsley.</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]# gluster volume info callrec</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Volume Name: callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Type: Replicate</FONT><BR>
<FONT COLOR="#000000">> > > > > Volume ID: a39830b7-eddb-4061-b381-39411274131a</FONT><BR>
<FONT COLOR="#000000">> > > > > Status: Started</FONT><BR>
<FONT COLOR="#000000">> > > > > Number of Bricks: 1 x 4 = 4</FONT><BR>
<FONT COLOR="#000000">> > > > > Transport-type: tcp</FONT><BR>
<FONT COLOR="#000000">> > > > > Bricks:</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick1: gluster1a-1:/data/brick/callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick2: gluster1b-1:/data/brick/callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick3: gluster2a-1:/data/brick/callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick4: gluster2b-1:/data/brick/callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Options Reconfigured:</FONT><BR>
<FONT COLOR="#000000">> > > > > performance.flush-behind: off</FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]#</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]# gluster volume status callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Status of volume: callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > Gluster process Port Online Pid</FONT><BR>
<FONT COLOR="#000000">> > > > > ------------------------------------------------------------------------------</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster1a-1:/data/brick/callrec 49153 Y 6803</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster1b-1:/data/brick/callrec 49153 Y 2614</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster2a-1:/data/brick/callrec 49153 Y 2645</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster2b-1:/data/brick/callrec 49153 Y 4325</FONT><BR>
<FONT COLOR="#000000">> > > > > NFS Server on localhost 2049 Y 2769</FONT><BR>
<FONT COLOR="#000000">> > > > > Self-heal Daemon on localhost N/A Y 2789</FONT><BR>
<FONT COLOR="#000000">> > > > > NFS Server on gluster2a-1 2049 Y 2857</FONT><BR>
<FONT COLOR="#000000">> > > > > Self-heal Daemon on gluster2a-1 N/A Y 2814</FONT><BR>
<FONT COLOR="#000000">> > > > > NFS Server on 88.151.41.100 2049 Y 6833</FONT><BR>
<FONT COLOR="#000000">> > > > > Self-heal Daemon on 88.151.41.100 N/A Y 6824</FONT><BR>
<FONT COLOR="#000000">> > > > > NFS Server on gluster2b-1 2049 Y 4428</FONT><BR>
<FONT COLOR="#000000">> > > > > Self-heal Daemon on gluster2b-1 N/A Y 4387</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Task Status of Volume callrec</FONT><BR>
<FONT COLOR="#000000">> > > > > ------------------------------------------------------------------------------</FONT><BR>
<FONT COLOR="#000000">> > > > > There are no active volume tasks</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]#</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]# gluster volume heal callrec info</FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > > > /to_process - Possibly undergoing heal</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Number of entries: 1</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > > > Number of entries: 0</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > > > /to_process - Possibly undergoing heal</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Number of entries: 1</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/</FONT><BR>
<FONT COLOR="#000000">> > > > > Number of entries: 0</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > [root@gluster1b-1 ~]#</FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > > > _______________________________________________</FONT><BR>
<FONT COLOR="#000000">> > > > > Gluster-users mailing list</FONT><BR>
<FONT COLOR="#000000">> > > > > <A HREF="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</A></FONT><BR>
<FONT COLOR="#000000">> > > > > <A HREF="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</A></FONT><BR>
<FONT COLOR="#000000">> > > > ></FONT><BR>
<FONT COLOR="#000000">> > > ></FONT><BR>
<FONT COLOR="#000000">> > > ></FONT><BR>
<FONT COLOR="#000000">> > ></FONT><BR>
<FONT COLOR="#000000">> ></FONT><BR>
<FONT COLOR="#000000">> ></FONT><BR>
<FONT COLOR="#000000">></FONT><BR>
<BR>
<BR>
</BLOCKQUOTE>
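<BR>
To see which entries account for the 1731 versus 1737 count difference between bricks mentioned in the quoted text, one option is to save a listing from each brick and compare them. This is a rough sketch, assuming each listing was produced by running "find . -print" inside /data/brick/callrec/recordings/834723/14391 on a brick and then copied to one host; the function name and the listing file names are made up for illustration.<BR>
<BR>
<PRE>
```shell
# Hypothetical helper: show entries that appear in only one of two saved
# brick listings. Each listing is assumed to hold one path per line.
diff_listings() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -o "$tmp_a" "$1"
    LC_ALL=C sort -o "$tmp_b" "$2"
    # -3 hides the lines common to both listings.
    LC_ALL=C comm -3 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Example (file names are placeholders):
#   diff_listings gluster1a-1.list gluster1b-1.list
```
</PRE>
Lines printed without indentation exist only in the first listing; lines indented by a tab exist only in the second.<BR>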
</BODY>
</HTML>