<div dir="ltr">Hi guys,<div>I&#39;ve got a strange problem involving this timeline (matches the &quot;Log fragment 1&quot; excerpt)</div><div>19:56:50: disk is detached from my system. This disk is actually the brick of the volume V.</div><div>19:56:50: LVM sees the disk as unreachable and starts its maintenance procedures</div><div>19:56:50: LVM umounts my thin provisioned volumes</div><div>19:57:02: Health check on specific bricks fails thus moving the brick to a down state<br></div><div>19:57:32: XFS filesystem umounts<br></div><div><br></div><div>At this point, the brick filesystem is no longer mounted. The underlying filesystems is empty (misses the brick directory too). My assumption is that gluster would stop itself in such conditions: it is not.</div><div>Gluster slowly fills my entire root partition, creating its full tree.</div><div><br></div><div>My only warning point is the disk that starts to fill its inodes to 100%.</div><div><br></div><div>I&#39;ve read release notes for every version subsequent mine (3.7.14, 3.7.15) without finding relevant fixes and at this point i&#39;m pretty sure is some bug undocumented.</div><div>Servers were made symmetric.</div><div><br></div><div>Could you please help me understand how to avoid that gluster coninues write on an unmounted filesystem? Thanks.</div><div><br></div><div>I&#39;m running a 3 node replica on 3 azure vms. 
This is the configuration:</div><div><br></div><div>MD (yes, I use md to aggregate 4 disks into a single 4 TB volume):</div><div><div>/dev/md128:</div><div>        Version : 1.2</div><div>  Creation Time : Mon Aug 29 18:10:45 2016</div><div>     Raid Level : raid0</div><div>     Array Size : 4290248704 (4091.50 GiB 4393.21 GB)</div><div>   Raid Devices : 4</div><div>  Total Devices : 4</div><div>    Persistence : Superblock is persistent</div><div><br></div><div>    Update Time : Mon Aug 29 18:10:45 2016</div><div>          State : clean </div><div> Active Devices : 4</div><div>Working Devices : 4</div><div> Failed Devices : 0</div><div>  Spare Devices : 0</div><div><br></div><div>     Chunk Size : 512K</div><div><br></div><div>           Name : 128</div><div>           UUID : d5c51214:43e48da9:49086616:c1371514</div><div>         Events : 0</div><div><br></div><div>    Number   Major   Minor   RaidDevice State</div><div>       0       8       80        0      active sync   /dev/sdf</div><div>       1       8       96        1      active sync   /dev/sdg</div><div>       2       8      112        2      active sync   /dev/sdh</div><div>       3       8      128        3      active sync   /dev/sdi</div></div><div><br></div><div>PV, VG, LV status:</div><div><div>  PV         VG      Fmt  Attr PSize PFree DevSize PV UUID                               <br></div><div>  /dev/md127 VGdata  lvm2 a--  2.00t 2.00t   2.00t Kxb6C0-FLIH-4rB1-DKyf-IQuR-bbPE-jm2mu0</div><div>  /dev/md128 gluster lvm2 a--  4.00t 1.07t   4.00t lDazuw-zBPf-Duis-ZDg1-3zfg-53Ba-2ZF34m</div></div><div><div> </div><div> VG      Attr   Ext   #PV #LV #SN VSize VFree VG UUID                                VProfile</div><div>  VGdata  wz--n- 4.00m   1   0   0 2.00t 2.00t XI2V2X-hdxU-0Jrn-TN7f-GSEk-7aNs-GCdTtn         </div><div>  gluster wz--n- 4.00m   1   6   0 4.00t 1.07t ztxX4f-vTgN-IKop-XePU-OwqW-T9k6-A6uDk0  </div></div><div><br></div><div><div> LV                  VG      #Seg Attr       LSize   Maj 
Min KMaj KMin Pool     Origin Data%  Meta%  Move Cpy%Sync Log Convert LV UUID                                LProfile</div><div>  apps-data           gluster    1 Vwi-aotz--  50.00g  -1  -1  253   12 thinpool        0.08                                    znUMbm-ax1N-R7aj-dxLc-gtif-WOvk-9QC8tq         </div><div>  feed                gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   14 thinpool        0.08                                    hZ4Isk-dELG-lgFs-2hJ6-aYid-8VKg-3jJko9         </div><div>  homes               gluster    1 Vwi-aotz--   1.46t  -1  -1  253   11 thinpool        58.58                                   salIPF-XvsA-kMnm-etjf-Uaqy-2vA9-9WHPkH         </div><div>  search-data         gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   13 thinpool        16.41                                   Z5hoa3-yI8D-dk5Q-2jWH-N5R2-ge09-RSjPpQ         </div><div>  thinpool            gluster    1 twi-aotz--   2.93t  -1  -1  253    9                 29.85  60.00                            oHTbgW-tiPh-yDfj-dNOm-vqsF-fBNH-o1izx2         </div><div>  video-asset-manager gluster    1 Vwi-aotz-- 100.00g  -1  -1  253   15 thinpool        0.07                                    4dOXga-96Wa-u3mh-HMmE-iX1I-o7ov-dtJ8lZ  </div></div><div><br></div><div>Gluster volume configuration (all volumes use the same exact configuration, listing them all would be redundant)</div><div><div>Volume Name: vol-homes</div><div>Type: Replicate</div><div>Volume ID: 0c8fa62e-dd7e-429c-a19a-479404b5e9c6</div><div>Status: Started</div><div>Number of Bricks: 1 x 3 = 3</div><div>Transport-type: tcp</div><div>Bricks:</div><div>Brick1: glu01.prd.azr:/bricks/vol-homes/brick1</div><div>Brick2: glu02.prd.azr:/bricks/vol-homes/brick1</div><div>Brick3: glu03.prd.azr:/bricks/vol-homes/brick1</div><div>Options Reconfigured:</div><div>performance.readdir-ahead: on</div><div>cluster.server-quorum-type: server</div><div>nfs.disable: disable</div><div>cluster.lookup-unhashed: 
auto</div><div>performance.nfs.quick-read: on</div><div>performance.nfs.read-ahead: on</div><div>performance.cache-size: 4096MB</div><div>cluster.self-heal-daemon: enable</div><div>diagnostics.brick-log-level: ERROR</div><div>diagnostics.client-log-level: ERROR</div><div>nfs.rpc-auth-unix: off</div><div>nfs.acl: off</div><div>performance.nfs.io-cache: on</div><div>performance.client-io-threads: on</div><div>performance.nfs.stat-prefetch: on</div><div>performance.nfs.io-threads: on</div><div>diagnostics.latency-measurement: on</div><div>diagnostics.count-fop-hits: on</div><div>performance.md-cache-timeout: 1</div><div>performance.cache-refresh-timeout: 1</div><div>performance.io-thread-count: 16</div><div>performance.high-prio-threads: 16</div><div>performance.normal-prio-threads: 16</div><div>performance.low-prio-threads: 16</div><div>performance.least-prio-threads: 1</div><div>cluster.server-quorum-ratio: 60</div></div><div><br></div><div>fstab:</div><div>/dev/gluster/homes                              /bricks/vol-homes                   xfs defaults,noatime,nobarrier,nofail 0 2<br></div><div><br></div><div>Software:</div><div>CentOS Linux release 7.1.1503 (Core) </div><div><div>glusterfs-api-3.7.13-1.el7.x86_64</div><div>glusterfs-libs-3.7.13-1.el7.x86_64</div><div>glusterfs-3.7.13-1.el7.x86_64</div><div>glusterfs-fuse-3.7.13-1.el7.x86_64</div><div>glusterfs-server-3.7.13-1.el7.x86_64</div><div>glusterfs-client-xlators-3.7.13-1.el7.x86_64</div><div>glusterfs-cli-3.7.13-1.el7.x86_64</div></div><div><br></div><div><br></div><div>Log fragment 1:</div><div><div>Sep 22 19:56:50 glu03 lvm[868]: WARNING: Device for PV lDazuw-zBPf-Duis-ZDg1-3zfg-53Ba-2ZF34m not found or rejected by a filter.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Cannot change VG gluster while PVs are missing.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Consider vgreduce --removemissing.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Failed to extend thin metadata gluster-thinpool-tpool.</div><div>Sep 22 
19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-homes.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-search-data.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-apps-data.</div><div>Sep 22 19:56:50 glu03 lvm[868]: Unmounting thin volume gluster-thinpool-tpool from /bricks/vol-video-asset-manager.</div></div><div><div>Sep 22 19:57:02 glu03 bricks-vol-video-asset-manager-brick1[45162]: [2016-09-22 17:57:02.713428] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-video-asset-manager-posix: health-check failed, going down</div><div>Sep 22 19:57:05 glu03 bricks-vol-apps-data-brick1[44536]: [2016-09-22 17:57:05.186146] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-apps-data-posix: health-check failed, going down</div><div>Sep 22 19:57:18 glu03 bricks-vol-search-data-brick1[40928]: [2016-09-22 17:57:18.674279] M [MSGID: 113075] [posix-helpers.c:1844:posix_health_check_thread_proc] 0-vol-search-data-posix: health-check failed, going down</div><div>Sep 22 19:57:32 glu03 bricks-vol-video-asset-manager-brick1[45162]: [2016-09-22 17:57:32.714461] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-video-asset-manager-posix: still alive! -&gt; SIGTERM</div><div>Sep 22 19:57:32 glu03 kernel: XFS (dm-15): Unmounting Filesystem</div><div>Sep 22 19:57:35 glu03 bricks-vol-apps-data-brick1[44536]: [2016-09-22 17:57:35.186352] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-apps-data-posix: still alive! -&gt; SIGTERM</div><div>Sep 22 19:57:35 glu03 kernel: XFS (dm-12): Unmounting Filesystem</div><div>Sep 22 19:57:48 glu03 bricks-vol-search-data-brick1[40928]: [2016-09-22 17:57:48.674444] M [MSGID: 113075] [posix-helpers.c:1850:posix_health_check_thread_proc] 0-vol-search-data-posix: still alive! 
-&gt; SIGTERM</div><div>Sep 22 19:57:48 glu03 kernel: XFS (dm-13): Unmounting Filesystem</div></div></div>