This is possibly another instance of the earlier threads (below). This occurs
with 3.6.6-1 and 3.6.2-1.

http://www.gluster.org/pipermail/gluster-users/2014-June/017635.html
http://www.gluster.org/pipermail/gluster-users/2012-March/009942.html

Synopsis:

A standard user builds Git successfully, but is then unable to delete a
relatively small number of files in the build tree: the removals fail with
'No data available'. The bricks do contain the associated entries, and
getfattr on them looks clean, yet at least one brick-side entry for each
affected file has mode 1000 (sticky/text bit only). The logs are also being
spammed again with iobref_unref and iobuf_unref messages, which may or may
not be related. The brick logs start with xattr set errors, later show
get/modify errors, and end with unlink errors once the deletion attempts
arrive.
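
If raw counts of the log noise are useful, something along these lines should
pull them (assuming the default brick log location, /var/log/glusterfs/bricks/;
the grep patterns may need adjusting to the exact message text):

[root@phoenix-smc users]# pdsh -g glfs 'grep -c -e iobref_unref -e iobuf_unref /var/log/glusterfs/bricks/*.log'
[root@phoenix-smc users]# pdsh -g glfs 'grep -ci -e setxattr -e getxattr -e unlink /var/log/glusterfs/bricks/*.log'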

I'm mainly hoping it's *not* a case of the second thread above (i.e., "use
ext4 instead of xfs for the backing storage"), because backing up the
healthy side of 80TB before rebuilding the underlying bricks' LUNs will
be ... interesting.

Environment:
RHEL 6.7, kernel 2.6.32-573.7.1.el6.x86_64

Gluster locations/packages/versions:

servers: "service{4..7,10..13}":
glusterfs-server-3.6.6-1.el6.x86_64
glusterfs-api-3.6.6-1.el6.x86_64
glusterfs-debuginfo-3.6.6-1.el6.x86_64
glusterfs-3.6.6-1.el6.x86_64
glusterfs-fuse-3.6.6-1.el6.x86_64
glusterfs-rdma-3.6.6-1.el6.x86_64
glusterfs-libs-3.6.6-1.el6.x86_64
glusterfs-devel-3.6.6-1.el6.x86_64
glusterfs-api-devel-3.6.6-1.el6.x86_64
glusterfs-extra-xlators-3.6.6-1.el6.x86_64
glusterfs-cli-3.6.6-1.el6.x86_64

clients: "service1" aka "phoenix01":
glusterfs-3.6.6-1.el6.x86_64
glusterfs-api-devel-3.6.6-1.el6.x86_64
glusterfs-libs-3.6.6-1.el6.x86_64
glusterfs-devel-3.6.6-1.el6.x86_64
glusterfs-cli-3.6.6-1.el6.x86_64
glusterfs-extra-xlators-3.6.6-1.el6.x86_64
glusterfs-fuse-3.6.6-1.el6.x86_64
glusterfs-rdma-3.6.6-1.el6.x86_64
glusterfs-api-3.6.6-1.el6.x86_64
glusterfs-debuginfo-3.6.6-1.el6.x86_64

volume info:
Volume Name: home
Type: Distribute
Volume ID: f03fcaf0-3889-45ac-a06a-a4d60d5a673d
Status: Started
Number of Bricks: 28
Transport-type: rdma
Bricks:
Brick1: service4-ib1:/mnt/l1_s4_ost0000_0000/brick
Brick2: service4-ib1:/mnt/l1_s4_ost0001_0001/brick
Brick3: service4-ib1:/mnt/l1_s4_ost0002_0002/brick
Brick4: service5-ib1:/mnt/l1_s5_ost0003_0003/brick
Brick5: service5-ib1:/mnt/l1_s5_ost0004_0004/brick
Brick6: service5-ib1:/mnt/l1_s5_ost0005_0005/brick
Brick7: service5-ib1:/mnt/l1_s5_ost0006_0006/brick
Brick8: service6-ib1:/mnt/l1_s6_ost0007_0007/brick
Brick9: service6-ib1:/mnt/l1_s6_ost0008_0008/brick
Brick10: service6-ib1:/mnt/l1_s6_ost0009_0009/brick
Brick11: service7-ib1:/mnt/l1_s7_ost000a_0010/brick
Brick12: service7-ib1:/mnt/l1_s7_ost000b_0011/brick
Brick13: service7-ib1:/mnt/l1_s7_ost000c_0012/brick
Brick14: service7-ib1:/mnt/l1_s7_ost000d_0013/brick
Brick15: service10-ib1:/mnt/l1_s10_ost000e_0014/brick
Brick16: service10-ib1:/mnt/l1_s10_ost000f_0015/brick
Brick17: service10-ib1:/mnt/l1_s10_ost0010_0016/brick
Brick18: service11-ib1:/mnt/l1_s11_ost0011_0017/brick
Brick19: service11-ib1:/mnt/l1_s11_ost0012_0018/brick
Brick20: service11-ib1:/mnt/l1_s11_ost0013_0019/brick
Brick21: service11-ib1:/mnt/l1_s11_ost0014_0020/brick
Brick22: service12-ib1:/mnt/l1_s12_ost0015_0021/brick
Brick23: service12-ib1:/mnt/l1_s12_ost0016_0022/brick
Brick24: service12-ib1:/mnt/l1_s12_ost0017_0023/brick
Brick25: service13-ib1:/mnt/l1_s13_ost0018_0024/brick
Brick26: service13-ib1:/mnt/l1_s13_ost0019_0025/brick
Brick27: service13-ib1:/mnt/l1_s13_ost001a_0026/brick
Brick28: service13-ib1:/mnt/l1_s13_ost001b_0027/brick
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
storage.build-pgfid: on
performance.stat-prefetch: off

volume status:
Status of volume: home
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick service4-ib1:/mnt/l1_s4_ost0000_0000/brick 49156 Y 7513
Brick service4-ib1:/mnt/l1_s4_ost0001_0001/brick 49157 Y 7525
Brick service4-ib1:/mnt/l1_s4_ost0002_0002/brick 49158 Y 7537
Brick service5-ib1:/mnt/l1_s5_ost0003_0003/brick 49163 Y 7449
Brick service5-ib1:/mnt/l1_s5_ost0004_0004/brick 49164 Y 7461
Brick service5-ib1:/mnt/l1_s5_ost0005_0005/brick 49165 Y 7473
Brick service5-ib1:/mnt/l1_s5_ost0006_0006/brick 49166 Y 7485
Brick service6-ib1:/mnt/l1_s6_ost0007_0007/brick 49155 Y 7583
Brick service6-ib1:/mnt/l1_s6_ost0008_0008/brick 49156 Y 7595
Brick service6-ib1:/mnt/l1_s6_ost0009_0009/brick 49157 Y 7607
Brick service7-ib1:/mnt/l1_s7_ost000a_0010/brick 49160 Y 7490
Brick service7-ib1:/mnt/l1_s7_ost000b_0011/brick 49161 Y 7502
Brick service7-ib1:/mnt/l1_s7_ost000c_0012/brick 49162 Y 7514
Brick service7-ib1:/mnt/l1_s7_ost000d_0013/brick 49163 Y 7526
Brick service10-ib1:/mnt/l1_s10_ost000e_0014/brick 49155 Y 8136
Brick service10-ib1:/mnt/l1_s10_ost000f_0015/brick 49156 Y 8148
Brick service10-ib1:/mnt/l1_s10_ost0010_0016/brick 49157 Y 8160
Brick service11-ib1:/mnt/l1_s11_ost0011_0017/brick 49160 Y 7453
Brick service11-ib1:/mnt/l1_s11_ost0012_0018/brick 49161 Y 7465
Brick service11-ib1:/mnt/l1_s11_ost0013_0019/brick 49162 Y 7477
Brick service11-ib1:/mnt/l1_s11_ost0014_0020/brick 49163 Y 7489
Brick service12-ib1:/mnt/l1_s12_ost0015_0021/brick 49155 Y 7457
Brick service12-ib1:/mnt/l1_s12_ost0016_0022/brick 49156 Y 7469
Brick service12-ib1:/mnt/l1_s12_ost0017_0023/brick 49157 Y 7481
Brick service13-ib1:/mnt/l1_s13_ost0018_0024/brick 49156 Y 7536
Brick service13-ib1:/mnt/l1_s13_ost0019_0025/brick 49157 Y 7548
Brick service13-ib1:/mnt/l1_s13_ost001a_0026/brick 49158 Y 7560
Brick service13-ib1:/mnt/l1_s13_ost001b_0027/brick 49159 Y 7572
NFS Server on localhost 2049 Y 7553
NFS Server on service6-ib1 2049 Y 7625
NFS Server on service13-ib1 2049 Y 7589
NFS Server on service11-ib1 2049 Y 7507
NFS Server on service12-ib1 2049 Y 7498
NFS Server on service10-ib1 2049 Y 8179
NFS Server on service5-ib1 2049 Y 7502
NFS Server on service7-ib1 2049 Y 7543

Task Status of Volume home
------------------------------------------------------------------------------
Task : Rebalance
ID : f3ad27ce-7bcf-4fab-92c1-b40af75d4300
Status : completed

Reproduction: As a standard user, clone the latest git source into
~/build_tests/, then...

test 0: dup source tree, delete original

        success, test0.script

test 1: copy dupe to new, cd into new, make configure, cd out,
        delete new

        success, test1.script

test 2: mkdir $WORKDIR/temp/, copy dupe to new, cd into it, make
        configure, ./configure --prefix $WORKDIR/temp, cd out,
        delete new, delete $WORKDIR/temp/

        success, test2.script

test 3: mkdir $WORKDIR/temp/, copy dupe to new, cd into it, make
        configure, ./configure --prefix $WORKDIR/temp/, make all
        doc, cd out, delete new, delete $WORKDIR/temp/

        failure on attempt to remove the working tree (a condensed sketch
        of this sequence follows below)
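
For reference, test 3 boils down to roughly the following (a condensed sketch,
not the exact script: 'dupe' is assumed to be the name of the duplicated tree
from test 0, $WORKDIR is a scratch directory on the same gluster mount, and
the exact commands are in the captured 'script' sessions mentioned at the end):

        cd ~/build_tests
        mkdir $WORKDIR/temp/
        cp -a dupe new
        cd new
        make configure
        ./configure --prefix $WORKDIR/temp/
        make all doc
        cd ..
        rm -rf new                  # this is the step that fails with ENODATA
        rm -rf $WORKDIR/temp/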

as root, trying to remove a sample file (the file owner gets the same result):

[root@phoenix-smc users]# ssh service1 rm /home/olagarde/build_tests/new/git-diff
rm: cannot remove `/home/olagarde/build_tests/new/git-diff': No data available
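
'No data available' is ENODATA; if it helps, the exact syscall returning it can
be captured with something like:

[root@phoenix-smc users]# ssh service1 strace -f -e trace=unlink,unlinkat rm /home/olagarde/build_tests/new/git-diff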

the file is homed on backing servers 4 and 13 (what happened on 13?):

[root@phoenix-smc users]# pdsh -g glfs ls -l /mnt/*/brick/olagarde/build_tests/new/git-diff
service10: ...
service11: ...
service6: ...
service12: ...
service7: ...
service5: ...
service13: ---------T 5 500 206 0 Nov 9 23:28 /mnt/l1_s13_ost001a_0026/brick/olagarde/build_tests/new/git-diff
service4: -rwxr----- 81 500 206 7960411 Nov 9 23:28 /mnt/l1_s4_ost0000_0000/brick/olagarde/build_tests/new/git-diff
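
The zero-length mode-1000 entry on service13 looks like a DHT link file. If a
full list of the affected entries would help, something like this should find
them (assuming they all look like the one above: sticky bit only, zero size):

[root@phoenix-smc users]# pdsh -g glfs 'find /mnt/*/brick/olagarde/build_tests/new -type f -perm 1000 -size 0'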

the xattrs on both backing instances appear healthy:

[root@phoenix-smc users]# pdsh -w service4 -w service13 -f 1 'getfattr -m . -d -e hex /mnt/*/brick/olagarde/build_tests/new/git-diff'
service4: getfattr: Removing leading '/' from absolute path names
service4: # file: mnt/l1_s4_ost0000_0000/brick/olagarde/build_tests/new/git-diff
service4: trusted.gfid=0xa4daceb603b0485eab77df659ea3d34c
service4: trusted.pgfid.8bfecb0a-bae2-48e9-9992-ddce2ff8e4c7=0x00000050
service4:
service13: getfattr: Removing leading '/' from absolute path names
service13: # file: mnt/l1_s13_ost001a_0026/brick/olagarde/build_tests/new/git-diff
service13: trusted.gfid=0xa4daceb603b0485eab77df659ea3d34c
service13: trusted.glusterfs.dht.linkto=0x686f6d652d636c69656e742d3000
service13:
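
For what it's worth, the linkto value is just "home-client-0" plus a trailing
NUL, e.g.:

[root@phoenix-smc users]# echo 686f6d652d636c69656e742d3000 | xxd -r -p
home-client-0

so (if I'm reading the subvolume naming right) the link file on service13
points at Brick1 on service4, which is where the data file actually lives,
consistent with the ls output above.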

Profile output (vol profile home info incremental, 60s snaps) is available if that helps.
Logs are also available but I have to review/sanitize them before they leave the site.
Output of 'script' sessions around the above tests is also available, if it helps.

##END