<div dir="ltr">We do need to treat this as a bug and fix full self-heal so that it examines both bricks and detects files missing from either one. We won&#39;t do this check on the mounts, though, because it would slow down performance. Be very careful about deleting files directly from a brick. It is always recommended that you take a backup of the good file before attempting a heal.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 17, 2016 at 4:28 PM, Дмитрий Глушенок <span dir="ltr">&lt;<a href="mailto:glush@jet.msk.su" target="_blank">glush@jet.msk.su</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>You are right, stat triggers self-heal. Thank you!</div><span class=""><br><div>
<div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div style="color:rgb(0,0,0);letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;word-wrap:break-word"><div>--</div><div><div style="word-wrap:break-word"><div style="word-wrap:break-word"><div style="word-wrap:break-word"><div>Dmitry Glushenok</div><div>Jet Infosystems</div></div></div></div></div></div></div></div></div></div></div></div>
</div>

<br></span><div><blockquote type="cite"><div>On 17 Aug 2016, at 13:38, Ravishankar N &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt; wrote:</div><div><div class="h5"><br><div>
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div>On 08/17/2016 03:48 PM, Дмитрий
      Глушенок wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div>Unfortunately not:</div>
      <div><br>
      </div>
      <div>Remount the file system, then access the test file from the second
        client:</div>
      <div><br>
      </div>
      <div>
        <div>[root@srv02 ~]# umount /mnt</div>
        <div>[root@srv02 ~]# mount -t glusterfs srv01:/test01
          /mnt</div>
        <div>[root@srv02 ~]# ls -l /mnt/passwd </div>
        <div>-rw-r--r--. 1 root root 1505 Aug 16 19:59
          /mnt/passwd</div>
        <div>[root@srv02 ~]# ls -l /R1/test01/</div>
        <div>total 4</div>
        <div>-rw-r--r--. 2 root root 1505 Aug 16 19:59 passwd</div>
        <div>[root@srv02 ~]# </div>
        <div><br>
        </div>
        <div>Then remount the file system on the first node and check whether accessing the file
          from the second node triggered self-heal on the first node:</div>
        <div><br>
        </div>
        <div>[root@srv01 ~]# umount /mnt</div>
        <div>[root@srv01 ~]# mount -t glusterfs srv01:/test01
          /mnt</div>
        <div>[root@srv01 ~]# ls -l /mnt</div>
      </div>
    </blockquote>
    <br>
    Can you try `stat /mnt/passwd` from this node after remounting? You
    need to look up the file explicitly; `ls -l /mnt` only
    triggers a readdir on the parent directory.<br>
    If that doesn&#39;t work, is this mount connected to both bricks? That is,
    if you create a new file from here, does it get replicated to both
    bricks?<br>
    <br>
    -Ravi<br>
    <br>
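    <div>The difference between listing a directory and looking up a file by name can be sketched in a few lines of Python. This is only an illustration: the helper name force_lookup and the scratch directory used in demo() are assumptions, not part of Gluster. The file names are passed in explicitly, because a directory listing may not show a file that is missing from one brick.</div>
<pre>
```python
import os
import tempfile


def force_lookup(mount_root, relative_paths):
    """Stat each named path under the mount and report whether it resolved.

    os.stat() on a full path looks the name up explicitly, which is the
    per-file operation that `stat /mnt/passwd` performs. Listing the parent
    directory is deliberately not used to discover names.
    """
    results = {}
    for rel in relative_paths:
        full = os.path.join(mount_root, rel)
        try:
            os.stat(full)
            results[rel] = True
        except FileNotFoundError:
            results[rel] = False
    return results


def demo():
    """Exercise force_lookup() against a scratch directory, not a real mount."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "passwd"), "w") as handle:
            handle.write("example\n")
        return force_lookup(root, ["passwd", "missing"])


if __name__ == "__main__":
    print(demo())
```
</pre>
    <div>On a real mount this would be called as force_lookup("/mnt", ["passwd"]), with the list of names taken from a source other than the mount itself.</div>
    <br>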
    <blockquote type="cite">
      <div>
        <div>total 0</div>
        <div>[root@srv01 ~]# ls -l /R1/test01/</div>
        <div>total 0</div>
        <div>[root@srv01 ~]#</div>
      </div>
      <div><br>
      </div>
      <div>Nothing appeared.</div>
      <div><br>
      </div>
      <div>
        <div>[root@srv01 ~]# gluster volume info test01</div>
        <div> </div>
        <div>Volume Name: test01</div>
        <div>Type: Replicate</div>
        <div>Volume ID: 2c227085-0b06-4804-805c-<wbr>ea9c1bb11d8b</div>
        <div>Status: Started</div>
        <div>Number of Bricks: 1 x 2 = 2</div>
        <div>Transport-type: tcp</div>
        <div>Bricks:</div>
        <div>Brick1: srv01:/R1/test01</div>
        <div>Brick2: srv02:/R1/test01</div>
        <div>Options Reconfigured:</div>
        <div>features.scrub-freq: hourly</div>
        <div>features.scrub: Active</div>
        <div>features.bitrot: on</div>
        <div>transport.address-family: inet</div>
        <div>performance.readdir-ahead: on</div>
        <div>nfs.disable: on</div>
        <div>[root@srv01 ~]# </div>
      </div>
      <div><br>
      </div>
      <div>
        <div>[root@srv01 ~]# gluster volume get test01 all |
          grep heal</div>
        <div>cluster.background-self-heal-count      8</div>
        <div>cluster.metadata-self-heal              on</div>
        <div>cluster.data-self-heal                  on</div>
        <div>cluster.entry-self-heal                 on</div>
        <div>cluster.self-heal-daemon                on</div>
        <div>cluster.heal-timeout                    600</div>
        <div>cluster.self-heal-window-size           1</div>
        <div>cluster.data-self-heal-algorithm        (null)</div>
        <div>cluster.self-heal-readdir-size          1KB</div>
        <div>cluster.heal-wait-queue-length          128</div>
        <div>features.lock-heal                      off</div>
        <div>features.lock-heal                      off</div>
        <div>storage.health-check-interval           30</div>
        <div>features.ctr_lookupheal_link_timeout    300</div>
        <div>features.ctr_lookupheal_inode_timeout   300</div>
        <div>cluster.disperse-self-heal-daemon       enable</div>
        <div>disperse.background-heals               8</div>
        <div>disperse.heal-wait-qlength              128</div>
        <div>cluster.heal-timeout                    600</div>
        <div>cluster.granular-entry-heal             no</div>
        <div>[root@srv01 ~]#</div>
      </div>
      <br>
      <br>
      <div>
        <blockquote type="cite">
          <div>On 17 Aug 2016, at 11:30, Ravishankar N &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt;
            wrote:</div>
          <br>
          <div>
            
            <div bgcolor="#FFFFFF" text="#000000">
              <div>On 08/17/2016 01:48 PM,
                Дмитрий Глушенок wrote:<br>
              </div>
              <blockquote type="cite">
                
                <div>Hello Ravi,</div>
                <div><br>
                </div>
                <div>Thank you for the reply. I found the bug number (for
                  those who find this email through a search): <a href="https://bugzilla.redhat.com/show_bug.cgi?id=1112158" target="_blank">https://bugzilla.redhat.com/<wbr>show_bug.cgi?id=1112158</a></div>
                <div><br>
                </div>
                <div>Accessing the removed file from the
                  mount point does not always work, because we have to
                  find a particular client that DHT points to the
                  brick with the removed file. Otherwise the file is
                  served from the good brick and self-heal does not
                  happen (I just verified this). Or by "accessing" did you mean
                  something like touch?</div>
              </blockquote>
              <br>
              Sorry, I should have been more explicit. I meant triggering a
              lookup on that file with `stat filename`. I don&#39;t think
              you need a special client. DHT sends the lookup to AFR,
              which in turn sends it to all its children. When one of them
              returns ENOENT (because you removed the file from the brick),
              AFR automatically triggers a heal. I&#39;m guessing it does
              not always work in your case because of caching at various
              levels, so the lookup never reaches AFR. If you do it
              from a fresh mount, it should always work.<br>
              -Ravi<br>
              <br>
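              <div>The lookup path described above (the lookup is sent to every child, and an ENOENT reply from one of them triggers a heal) can be modelled with a small Python sketch. This is a toy simulation and not Gluster code: the Brick class, replicated_lookup() and the rule of copying from the first brick that has the file are all assumptions made for illustration, and the sketch ignores everything else the real translator takes into account.</div>
<pre>
```python
import errno


class Brick:
    """A stand-in for one replica brick: a mapping of file name to contents."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})

    def lookup(self, filename):
        """Return 0 if the file exists on this brick, else errno.ENOENT."""
        return 0 if filename in self.files else errno.ENOENT


def replicated_lookup(bricks, filename):
    """Send the lookup to every brick and heal bricks that answered ENOENT.

    Returns the names of the bricks that were healed. If no brick has the
    file there is nothing to copy from, so nothing is healed.
    """
    replies = {brick.name: brick.lookup(filename) for brick in bricks}
    sources = [b for b in bricks if replies[b.name] == 0]
    if not sources:
        return []
    healed = []
    for brick in bricks:
        if replies[brick.name] == errno.ENOENT:
            # Assumption: copy from the first brick that still has the file.
            brick.files[filename] = sources[0].files[filename]
            healed.append(brick.name)
    return healed


def demo():
    """Model the thread's setup: the file was removed from the first brick."""
    srv01 = Brick("srv01:/R1/test01")
    srv02 = Brick("srv02:/R1/test01", {"passwd": "contents"})
    healed = replicated_lookup([srv01, srv02], "passwd")
    return healed, "passwd" in srv01.files


if __name__ == "__main__":
    print(demo())
```
</pre>
              <div>In this model the heal happens only if the lookup actually reaches replicated_lookup(), which mirrors the point about caching: a lookup answered from a cache never gets that far.</div>
              <br>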
              <blockquote type="cite">
                <br>
                <div>
                  <blockquote type="cite">
                    <div>On 17 Aug 2016, at 4:24, Ravishankar N
                      &lt;<a href="mailto:ravishankar@redhat.com" target="_blank">ravishankar@redhat.com</a>&gt;
                      wrote:</div>
                    <br>
                    <div><span style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">On
                        08/16/2016 10:44 PM, Дмитрий Глушенок wrote:</span><br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <blockquote type="cite" style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">Hello,<br>
                        <br>
                        While testing healing after a bitrot error, I
                        found that self-heal cannot heal files that
                        were manually deleted from a brick. Gluster 3.8.1:<br>
                        <br>
                        - Create a volume, mount it locally, and copy a test
                        file to it<br>
                        [root@srv01 ~]# gluster volume create test01
                        replica 2  srv01:/R1/test01 srv02:/R1/test01<br>
                        volume create: test01: success: please start the
                        volume to access data<br>
                        [root@srv01 ~]# gluster volume start test01<br>
                        volume start: test01: success<br>
                        [root@srv01 ~]# mount -t glusterfs srv01:/test01
                        /mnt<br>
                        [root@srv01 ~]# cp /etc/passwd /mnt<br>
                        [root@srv01 ~]# ls -l /mnt<br>
                        total 2<br>
                        -rw-r--r--. 1 root root 1505 Aug 16 19:59 passwd<br>
                        <br>
                        - Then remove the test file from the first brick, as we
                        would have to do in case of a bitrot error in the file<br>
                      </blockquote>
                      <br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <span style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">You also
                        need to remove all hard-links to the corrupted
                        file from the brick, including the one in the
                        .glusterfs folder.</span><br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <span style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">There is a
                        bug in heal-full that prevents it from crawling
                        all bricks of the replica. The right way to heal
                        the corrupted files as of now is to access them
                        from the mount-point like you did after removing
                        the hard-links. The list of files that are
                        corrupted can be obtained with the scrub status
                        command (`gluster volume bitrot test01 scrub status` for this volume).</span><br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <span style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">Hope this
                        helps,</span><br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <span style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important">Ravi</span><br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
                      <br style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
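                      <div>The link count of 2 in the brick listings in this thread is consistent with a second hard link to the same file living under .glusterfs. One way to find every hard link to a file before removing it is to walk the brick and compare inode numbers, as in the Python sketch below. The function name find_hard_links and the directory layout built in demo() are hypothetical; real names under .glusterfs differ.</div>
<pre>
```python
import os
import tempfile


def find_hard_links(brick_root, target):
    """Return every path under brick_root that shares target's inode.

    The walk includes hidden directories such as .glusterfs, because
    os.walk() does not skip dot-directories.
    """
    wanted = os.stat(target)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if (info.st_dev, info.st_ino) == (wanted.st_dev, wanted.st_ino):
                matches.append(path)
    return sorted(matches)


def demo():
    """Build a scratch directory that imitates a brick with one extra link."""
    with tempfile.TemporaryDirectory() as brick:
        data = os.path.join(brick, "passwd")
        with open(data, "w") as handle:
            handle.write("example\n")
        # Hypothetical link location; real names under .glusterfs differ.
        hidden_dir = os.path.join(brick, ".glusterfs", "aa", "bb")
        os.makedirs(hidden_dir)
        os.link(data, os.path.join(hidden_dir, "example-link"))
        links = find_hard_links(brick, data)
        names = [os.path.basename(p) for p in links]
        return names, os.stat(data).st_nlink


if __name__ == "__main__":
    print(demo())
```
</pre>
                      <div>On a real brick the call would look like find_hard_links("/R1/test01", "/R1/test01/passwd"). The sketch only lists paths; removing them, and taking a backup first, is left to the operator.</div>
                      <br>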
                      <blockquote type="cite" style="font-family:Menlo-Regular;font-size:13px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">[root@srv01
                        ~]# rm /R1/test01/passwd<br>
                        [root@srv01 ~]# ls -l /mnt<br>
                        total 0<br>
                        [root@srv01 ~]#<br>
                        <br>
                        - Issue full self heal<br>
                        [root@srv01 ~]# gluster volume heal test01 full<br>
                        Launching heal operation to perform full self
                        heal on volume test01 has been successful<br>
                        Use heal info commands to check status<br>
                        [root@srv01 ~]# tail -2
                        /var/log/glusterfs/glustershd.<wbr>log<br>
                        [2016-08-16 16:59:56.483767] I [MSGID: 108026]
                        [afr-self-heald.c:611:afr_shd_<wbr>full_healer]
                        0-test01-replicate-0: starting full sweep on
                        subvol test01-client-0<br>
                        [2016-08-16 16:59:56.486560] I [MSGID: 108026]
                        [afr-self-heald.c:621:afr_shd_<wbr>full_healer]
                        0-test01-replicate-0: finished full sweep on
                        subvol test01-client-0<br>
                        <br>
                        - We still see no files in the mount point (it
                        became empty right after the file was removed from the
                        brick)<br>
                        [root@srv01 ~]# ls -l /mnt<br>
                        total 0<br>
                        [root@srv01 ~]#<br>
                        <br>
                        - Then try to access the file by its full name
                        (lookup-optimize and readdir-optimize are turned
                        off by default). Now glusterfs shows the file!<br>
                        [root@srv01 ~]# ls -l /mnt/passwd<br>
                        -rw-r--r--. 1 root root 1505 Aug 16 19:59
                        /mnt/passwd<br>
                        <br>
                        - And it reappeared in the brick<br>
                        [root@srv01 ~]# ls -l /R1/test01/<br>
                        total 4<br>
                        -rw-r--r--. 2 root root 1505 Aug 16 19:59 passwd<br>
                        [root@srv01 ~]#<br>
                        <br>
                        Is this a bug, or can we tell self-heal to scan all
                        files on all bricks in the volume?<br>
                        <br>
                        --<br>
                        Dmitry Glushenok<br>
                        Jet Infosystems<br>
                        <br>
                        ______________________________<wbr>_________________<br>
                        Gluster-users mailing list<br>
                        <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
                        <a href="http://www.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://www.gluster.org/<wbr>mailman/listinfo/gluster-users</a></blockquote>
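                      <div>The glustershd log in the report shows the full sweep running only on subvol test01-client-0, which matches the explanation that heal-full does not crawl all bricks of the replica. Why that misses the deleted file can be shown with a toy Python model. It is not how the self-heal daemon is implemented: full_sweep(), the dict-based bricks and the use of index 0 to stand for the brick that lost the file are assumptions for illustration only.</div>
<pre>
```python
def full_sweep(bricks, crawl_indexes):
    """Copy missing files between bricks, discovering names only on crawled bricks.

    bricks is a list of dicts mapping file name to contents. crawl_indexes
    selects which bricks are listed to discover names. Returns the sorted
    names that were copied to at least one brick.
    """
    discovered = set()
    for index in crawl_indexes:
        discovered.update(bricks[index])
    healed = set()
    for name in discovered:
        # Every discovered name exists on at least one brick by construction.
        source = next(b for b in bricks if name in b)
        for brick in bricks:
            if name not in brick:
                brick[name] = source[name]
                healed.add(name)
    return sorted(healed)


def demo():
    """Compare a one-sided crawl with a crawl of both bricks."""
    # Brick 0 lost the file; brick 1 still has it.
    one_sided = [{}, {"passwd": "contents"}]
    both_sides = [{}, {"passwd": "contents"}]
    return full_sweep(one_sided, [0]), full_sweep(both_sides, [0, 1])


if __name__ == "__main__":
    print(demo())
```
</pre>
                      <div>When only the brick that lost the file is crawled, the file name is never discovered and nothing is healed. Crawling both bricks finds the name on the good brick and restores the file.</div>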
                    </div>
                  </blockquote>
                </div>
                <br>
              </blockquote><p><br>
              </p>
            </div>
          </div>
        </blockquote>
      </div>
      <br>
    </blockquote><p><br>
    </p>
  </div>

</div></div></div></blockquote></div><br></div><br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Pranith<br></div></div>
</div>