<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <br>
    <div class="moz-cite-prefix">On 01/22/2016 07:19 AM, Pranith Kumar
      Karampuri wrote:<br>
    </div>
    <blockquote cite="mid:56A18AA6.8010701@redhat.com" type="cite">
      <meta content="text/html; charset=windows-1252"
        http-equiv="Content-Type">
      <br>
      <br>
      <div class="moz-cite-prefix">On 01/22/2016 07:13 AM, Glomski,
        Patrick wrote:<br>
      </div>
      <blockquote
cite="mid:CALkMjdDxd0zCGM4tn9PTXGEgUR+Z7cF0vhbd+d4TCJkun2tEfg@mail.gmail.com"
        type="cite">
        <div dir="ltr">We use the samba glusterfs virtual filesystem
          (the current version provided on <a moz-do-not-send="true"
            href="http://download.gluster.org">download.gluster.org</a>),

          but no windows clients connecting directly.<br>
        </div>
      </blockquote>
      <br>
      Hmm.. Is there a way to disable this and check whether the CPU%
      still increases? A getxattr of "glusterfs.get_real_filename
      &lt;filename&gt;" scans the entire directory, running
      strcasecmp(&lt;filename&gt;, &lt;scanned-filename&gt;) against each
      entry; if anything matches, it returns the &lt;scanned-filename&gt;.
      The problem is that this scan is costly, so I wonder whether it is
      the reason for the CPU spikes.<br>
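      <br>
      One way to check whether these scans line up with the spikes (just a
      sketch; it assumes briefly enabling profiling is acceptable on this
      production volume) would be to watch the GETXATTR/READDIR counts on
      the bricks while a spike is happening:<br>
      <br>
      gluster volume profile homegfs start<br>
      # wait for the next CPU spike, then:<br>
      gluster volume profile homegfs info | grep -iE 'getxattr|readdir'<br>
      gluster volume profile homegfs stop<br>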
    </blockquote>
    +Raghavendra Talur, +Poornima<br>
    <br>
    Raghavendra, Poornima,<br>
                When are these getxattrs triggered? Did you guys see any
    brick CPU spikes before? I initially thought it could be because of
    big directory heals. But this is happening even when no self-heals
    are required. So I had to move away from that theory.<br>
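    <br>
    A quick way to double-check that during the next spike (a rough sketch;
    the pgrep pattern and brick path are guesses based on the brick layout
    shown later in this thread) would be to confirm that heals really are
    idle and see which glusterfsd threads are busy:<br>
    <br>
    gluster volume heal homegfs info<br>
    # hypothetical pattern; adjust to match the busy brick's command line<br>
    top -H -b -n 1 -p "$(pgrep -f 'glusterfsd.*brick01a' | head -1)" | head -25<br>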
    <br>
    Pranith<br>
    <blockquote cite="mid:56A18AA6.8010701@redhat.com" type="cite"> <br>
      Pranith<br>
      <blockquote
cite="mid:CALkMjdDxd0zCGM4tn9PTXGEgUR+Z7cF0vhbd+d4TCJkun2tEfg@mail.gmail.com"
        type="cite">
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Thu, Jan 21, 2016 at 8:37 PM,
            Pranith Kumar Karampuri <span dir="ltr">&lt;<a
                moz-do-not-send="true" href="mailto:pkarampu@redhat.com"
                target="_blank">pkarampu@redhat.com</a>&gt;</span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0
              .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000"> Do you have any
                windows clients? I see a lot of getxattr calls for
                "glusterfs.get_real_filename" which lead to full
                readdirs of the directories on the brick.<span
                  class="HOEnZb"><font color="#888888"><br>
                    <br>
                    Pranith</font></span><span class=""><br>
                  <br>
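                  (For reference, a quick way to see whether any Samba share
                  on these nodes goes through the gluster VFS module,
                  assuming Samba's usual tools are installed:)<br>
                  <br>
                  testparm -s 2&gt;/dev/null | grep -iE 'vfs objects|glusterfs:'<br>
                  <br>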
                  <div>On 01/22/2016 12:51 AM, Glomski, Patrick wrote:<br>
                  </div>
                </span>
                <div>
                  <div class="h5">
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div>Pranith, could this kind of behavior be
                          self-inflicted by us deleting files directly
                          from the bricks? We have done that in the past
                          to clean up issues where gluster wouldn't
                          allow us to delete from the mount.<br>
                          <br>
                          If so, is it feasible to clean them up by
                          running a search on the .glusterfs directories
                          directly and removing files with a link
                          count of 1 that are non-zero size (or directly
                          checking the xattrs to be sure that a file is not a
                          DHT link)? <br>
                          <br>
                          find /data/brick01a/homegfs/.glusterfs -type f
                          -not -empty -links -2 -exec rm -f "{}" \;<br>
                          <br>
                        </div>
                        Is there anything I'm inherently missing with
                        that approach that will further corrupt the
                        system?<br>
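                        <br>
                        If it helps, something like the following (a sketch
                        only; it assumes getfattr from the attr tools is
                        installed on the brick nodes, and the output file name
                        is just an example) could dump the xattrs of the
                        candidate files first, so anything carrying the
                        trusted.glusterfs.dht.linkto xattr can be excluded
                        before running any rm:<br>
                        <br>
                        # review this report before deleting anything<br>
                        find /data/brick01a/homegfs/.glusterfs -type f -not -empty -links -2 -print0 | xargs -0 -r getfattr -d -m . -e hex --absolute-names &gt; /tmp/orphan-xattrs.txt<br>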
                        <div><br>
                        </div>
                      </div>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Thu, Jan 21, 2016 at
                          1:02 PM, Glomski, Patrick <span dir="ltr">&lt;<a
                              moz-do-not-send="true"
                              href="mailto:patrick.glomski@corvidtec.com"
                              target="_blank">patrick.glomski@corvidtec.com</a>&gt;</span>
                          wrote:<br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            <div dir="ltr">
                              <div>
                                <div>Load spiked again: ~1200% CPU on
                                  gfs02a for glusterfsd. A crawl has been
                                  running on one of the bricks on gfs02b
                                  for 25 minutes or so and users cannot
                                  access the volume.<br>
                                  <br>
                                  I re-listed the xattrop directories as
                                  well as a 'top' entry and heal
                                  statistics. Then I restarted the
                                  gluster services on gfs02a. <br>
                                  <br>
                                  =================== top ===================<br>
                                  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND<br>
                                  8969 root 20 0 2815m 204m 3588 S 1181.0 0.6 591:06.93 glusterfsd<br>
                                  <br>
                                  =================== xattrop
                                  ===================<br>
/data/brick01a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-41f19453-91e4-437c-afa9-3b25614de210 
xattrop-9b815879-2f4d-402b-867c-a6d65087788c<br>
                                  <br>
/data/brick02a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-70131855-3cfb-49af-abce-9d23f57fb393 
xattrop-dfb77848-a39d-4417-a725-9beca75d78c6<br>
                                  <br>
/data/brick01b/homegfs/.glusterfs/indices/xattrop:<br>
e6e47ed9-309b-42a7-8c44-28c29b9a20f8         
xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125<br>
xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934 
xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0<br>
                                  <br>
/data/brick02b/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc 
xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413<br>
                                  <br>
/data/brick01a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531<br>
                                  <br>
/data/brick02a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-7e20fdb1-5224-4b9a-be06-568708526d70<br>
                                  <br>
/data/brick01b/homegfs/.glusterfs/indices/xattrop:<br>
                                  8034bc06-92cd-4fa5-8aaf-09039e79d2c8 
                                  c9ce22ed-6d8b-471b-a111-b39e57f0b512<br>
                                  94fa1d60-45ad-4341-b69c-315936b51e8d 
xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7<br>
                                  <br>
/data/brick02b/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d<br>
                                  <br>
                                  <br>
                                  =================== heal stats
                                  ===================<br>
                                   <br>
                                  homegfs [b0-gfsib01a] : Starting time
                                  of crawl       : Thu Jan 21 12:36:45
                                  2016<br>
                                  homegfs [b0-gfsib01a] : Ending time of
                                  crawl         : Thu Jan 21 12:36:45
                                  2016<br>
                                  homegfs [b0-gfsib01a] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b0-gfsib01a] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b0-gfsib01a] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b0-gfsib01a] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b1-gfsib01b] : Starting time
                                  of crawl       : Thu Jan 21 12:36:19
                                  2016<br>
                                  homegfs [b1-gfsib01b] : Ending time of
                                  crawl         : Thu Jan 21 12:36:19
                                  2016<br>
                                  homegfs [b1-gfsib01b] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b1-gfsib01b] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b1-gfsib01b] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b1-gfsib01b] : No. of heal
                                  failed entries   : 1<br>
                                   <br>
                                  homegfs [b2-gfsib01a] : Starting time
                                  of crawl       : Thu Jan 21 12:36:48
                                  2016<br>
                                  homegfs [b2-gfsib01a] : Ending time of
                                  crawl         : Thu Jan 21 12:36:48
                                  2016<br>
                                  homegfs [b2-gfsib01a] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b2-gfsib01a] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b2-gfsib01a] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b2-gfsib01a] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b3-gfsib01b] : Starting time
                                  of crawl       : Thu Jan 21 12:36:47
                                  2016<br>
                                  homegfs [b3-gfsib01b] : Ending time of
                                  crawl         : Thu Jan 21 12:36:47
                                  2016<br>
                                  homegfs [b3-gfsib01b] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b3-gfsib01b] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b3-gfsib01b] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b3-gfsib01b] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b4-gfsib02a] : Starting time
                                  of crawl       : Thu Jan 21 12:36:06
                                  2016<br>
                                  homegfs [b4-gfsib02a] : Ending time of
                                  crawl         : Thu Jan 21 12:36:06
                                  2016<br>
                                  homegfs [b4-gfsib02a] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b4-gfsib02a] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b4-gfsib02a] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b4-gfsib02a] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b5-gfsib02b] : Starting time
                                  of crawl       : Thu Jan 21 12:13:40
                                  2016<br>
                                  homegfs [b5-gfsib02b]
                                  :                                ***
                                  Crawl is in progress ***<br>
                                  homegfs [b5-gfsib02b] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b5-gfsib02b] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b5-gfsib02b] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b5-gfsib02b] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b6-gfsib02a] : Starting time
                                  of crawl       : Thu Jan 21 12:36:58
                                  2016<br>
                                  homegfs [b6-gfsib02a] : Ending time of
                                  crawl         : Thu Jan 21 12:36:58
                                  2016<br>
                                  homegfs [b6-gfsib02a] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b6-gfsib02a] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b6-gfsib02a] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b6-gfsib02a] : No. of heal
                                  failed entries   : 0<br>
                                   <br>
                                  homegfs [b7-gfsib02b] : Starting time
                                  of crawl       : Thu Jan 21 12:36:50
                                  2016<br>
                                  homegfs [b7-gfsib02b] : Ending time of
                                  crawl         : Thu Jan 21 12:36:50
                                  2016<br>
                                  homegfs [b7-gfsib02b] : Type of crawl:
                                  INDEX<br>
                                  homegfs [b7-gfsib02b] : No. of entries
                                  healed        : 0<br>
                                  homegfs [b7-gfsib02b] : No. of entries
                                  in split-brain: 0<br>
                                  homegfs [b7-gfsib02b] : No. of heal
                                  failed entries   : 0<br>
                                  <br>
                                  <br>
========================================================================================<br>
                                </div>
                                I waited a few minutes for the heals to
                                finish and ran the heal statistics and
                                info again. One entry is in split-brain.
                                Aside from the split-brain, the load on
                                all systems is down now and they are
                                behaving normally. glustershd.log is
                                attached. What is going on??? <br>
                                <br>
                                Thu Jan 21 12:53:50 EST 2016<br>
                                 <br>
                                =================== homegfs
                                ===================<br>
                                 <br>
                                homegfs [b0-gfsib01a] : Starting time of
                                crawl       : Thu Jan 21 12:53:02 2016<br>
                                homegfs [b0-gfsib01a] : Ending time of
                                crawl         : Thu Jan 21 12:53:02 2016<br>
                                homegfs [b0-gfsib01a] : Type of crawl:
                                INDEX<br>
                                homegfs [b0-gfsib01a] : No. of entries
                                healed        : 0<br>
                                homegfs [b0-gfsib01a] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b0-gfsib01a] : No. of heal
                                failed entries   : 0<br>
                                 <br>
                                homegfs [b1-gfsib01b] : Starting time of
                                crawl       : Thu Jan 21 12:53:38 2016<br>
                                homegfs [b1-gfsib01b] : Ending time of
                                crawl         : Thu Jan 21 12:53:38 2016<br>
                                homegfs [b1-gfsib01b] : Type of crawl:
                                INDEX<br>
                                homegfs [b1-gfsib01b] : No. of entries
                                healed        : 0<br>
                                homegfs [b1-gfsib01b] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b1-gfsib01b] : No. of heal
                                failed entries   : 1<br>
                                 <br>
                                homegfs [b2-gfsib01a] : Starting time of
                                crawl       : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b2-gfsib01a] : Ending time of
                                crawl         : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b2-gfsib01a] : Type of crawl:
                                INDEX<br>
                                homegfs [b2-gfsib01a] : No. of entries
                                healed        : 0<br>
                                homegfs [b2-gfsib01a] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b2-gfsib01a] : No. of heal
                                failed entries   : 0<br>
                                 <br>
                                homegfs [b3-gfsib01b] : Starting time of
                                crawl       : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b3-gfsib01b] : Ending time of
                                crawl         : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b3-gfsib01b] : Type of crawl:
                                INDEX<br>
                                homegfs [b3-gfsib01b] : No. of entries
                                healed        : 0<br>
                                homegfs [b3-gfsib01b] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b3-gfsib01b] : No. of heal
                                failed entries   : 0<br>
                                 <br>
                                homegfs [b4-gfsib02a] : Starting time of
                                crawl       : Thu Jan 21 12:53:33 2016<br>
                                homegfs [b4-gfsib02a] : Ending time of
                                crawl         : Thu Jan 21 12:53:33 2016<br>
                                homegfs [b4-gfsib02a] : Type of crawl:
                                INDEX<br>
                                homegfs [b4-gfsib02a] : No. of entries
                                healed        : 0<br>
                                homegfs [b4-gfsib02a] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b4-gfsib02a] : No. of heal
                                failed entries   : 1<br>
                                 <br>
                                homegfs [b5-gfsib02b] : Starting time of
                                crawl       : Thu Jan 21 12:53:14 2016<br>
                                homegfs [b5-gfsib02b] : Ending time of
                                crawl         : Thu Jan 21 12:53:15 2016<br>
                                homegfs [b5-gfsib02b] : Type of crawl:
                                INDEX<br>
                                homegfs [b5-gfsib02b] : No. of entries
                                healed        : 0<br>
                                homegfs [b5-gfsib02b] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b5-gfsib02b] : No. of heal
                                failed entries   : 3<br>
                                 <br>
                                homegfs [b6-gfsib02a] : Starting time of
                                crawl       : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b6-gfsib02a] : Ending time of
                                crawl         : Thu Jan 21 12:53:04 2016<br>
                                homegfs [b6-gfsib02a] : Type of crawl:
                                INDEX<br>
                                homegfs [b6-gfsib02a] : No. of entries
                                healed        : 0<br>
                                homegfs [b6-gfsib02a] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b6-gfsib02a] : No. of heal
                                failed entries   : 0<br>
                                 <br>
                                homegfs [b7-gfsib02b] : Starting time of
                                crawl       : Thu Jan 21 12:53:09 2016<br>
                                homegfs [b7-gfsib02b] : Ending time of
                                crawl         : Thu Jan 21 12:53:09 2016<br>
                                homegfs [b7-gfsib02b] : Type of crawl:
                                INDEX<br>
                                homegfs [b7-gfsib02b] : No. of entries
                                healed        : 0<br>
                                homegfs [b7-gfsib02b] : No. of entries
                                in split-brain: 0<br>
                                homegfs [b7-gfsib02b] : No. of heal
                                failed entries   : 0<br>
                                 <br>
                                *** gluster bug in 'gluster volume heal
                                homegfs statistics'   ***<br>
                                *** Use 'gluster volume heal homegfs
                                info' until bug is fixed ***<span><br>
                                   <br>
                                  Brick
                                  gfs01a.corvidtec.com:/data/brick01a/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                  Brick
                                  gfs01b.corvidtec.com:/data/brick01b/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                  Brick
                                  gfs01a.corvidtec.com:/data/brick02a/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                  Brick
                                  gfs01b.corvidtec.com:/data/brick02b/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                  Brick
                                  gfs02a.corvidtec.com:/data/brick01a/homegfs/<br>
                                </span>/users/bangell/.gconfd - Is in
                                split-brain<br>
                                <br>
                                Number of entries: 1<br>
                                <br>
                                Brick
                                gfs02b.corvidtec.com:/data/brick01b/homegfs/<br>
                                /users/bangell/.gconfd - Is in
                                split-brain<br>
                                <br>
                                /users/bangell/.gconfd/saved_state <br>
                                Number of entries: 2<span><br>
                                  <br>
                                  Brick
                                  gfs02a.corvidtec.com:/data/brick02a/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                  Brick
                                  gfs02b.corvidtec.com:/data/brick02b/homegfs/<br>
                                  Number of entries: 0<br>
                                  <br>
                                </span></div>
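                              In case it is useful for resolving that
                              split-brain entry (a sketch only; on 3.6 the
                              resolution is still manual, and the brick paths
                              below are taken from the heal info output
                              above), the AFR changelog xattrs of the two
                              copies can be compared directly on the bricks:<br>
                              <br>
                              # on gfs02a:<br>
                              getfattr -d -m trusted.afr -e hex /data/brick01a/homegfs/users/bangell/.gconfd<br>
                              # on gfs02b:<br>
                              getfattr -d -m trusted.afr -e hex /data/brick01b/homegfs/users/bangell/.gconfd<br>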
                              <div><br>
                                <br>
                              </div>
                            </div>
                            <div>
                              <div>
                                <div class="gmail_extra"><br>
                                  <div class="gmail_quote">On Thu, Jan
                                    21, 2016 at 11:10 AM, Pranith Kumar
                                    Karampuri <span dir="ltr">&lt;<a
                                        moz-do-not-send="true"
                                        href="mailto:pkarampu@redhat.com"
                                        target="_blank">pkarampu@redhat.com</a>&gt;</span>
                                    wrote:<br>
                                    <blockquote class="gmail_quote"
                                      style="margin:0 0 0
                                      .8ex;border-left:1px #ccc
                                      solid;padding-left:1ex">
                                      <div bgcolor="#FFFFFF"
                                        text="#000000"><span> <br>
                                          <br>
                                          <div>On 01/21/2016 09:26 PM,
                                            Glomski, Patrick wrote:<br>
                                          </div>
                                          <blockquote type="cite">
                                            <div dir="ltr">
                                              <div>I should mention that
                                                the problem is not
                                                currently occurring and
                                                there are no heals
                                                (output appended). By
                                                restarting the gluster
                                                services, we can stop
                                                the crawl, which lowers
                                                the load for a while.
                                                Subsequent crawls seem
                                                to finish properly. For
                                                what it's worth,
                                                files/folders that show
                                                 up in the 'heal info'
                                                output during a hung
                                                crawl don't seem to be
                                                anything out of the
                                                ordinary. <br>
                                                <br>
                                                Over the past four days,
                                                the typical time before
                                                the problem recurs after
                                                suppressing it in this
                                                manner is an hour. Last
                                                night when we reached
                                                out to you was the last
                                                time it happened and the
                                                load has been low since
                                                (a relief).  David
                                                believes that
                                                recursively listing the
                                                files (ls -alR or
                                                similar) from a client
                                                mount can force the
                                                issue to happen, but
                                                obviously I'd rather not
                                                unless we have some
                                                precise thing we're
                                                looking for. Let me know
                                                if you'd like me to
                                                attempt to drive the
                                                system unstable like
                                                that and what I should
                                                look for. As it's a
                                                production system, I'd
                                                rather not leave it in
                                                this state for long.<br>
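                                                 <br>
                                                 (For what it's worth, the
                                                 reproduction David describes
                                                 would look something like the
                                                 following from a FUSE client;
                                                 /mnt/homegfs is a placeholder
                                                 for wherever the volume is
                                                 mounted, and since it walks
                                                 the whole tree it is only
                                                 something to run in an agreed
                                                 test window:)<br>
                                                 <br>
                                                 time ls -alR /mnt/homegfs &gt; /dev/null<br>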
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
                                        </span> Will it be possible to
                                        send the glustershd and mount logs from
                                        the past 4 days? I would like to
                                        see whether this is because of
                                        directory self-heal going wild
                                        (Ravi is working on a throttling
                                        feature for 3.8, which will let us
                                        put the brakes on self-heal
                                        traffic).<span><font
                                            color="#888888"><br>
                                            <br>
                                            Pranith</font></span>
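                                        <br>
                                        (Assuming the default log locations,
                                        something along these lines should
                                        gather them on each node; the archive
                                        name is just an example, and on the
                                        clients the matching mount log also
                                        lives under /var/log/glusterfs/:)<br>
                                        <br>
                                        tar czf /tmp/gluster-logs-$(hostname)-$(date +%F).tar.gz /var/log/glusterfs/glustershd.log* /var/log/glusterfs/*.log*<br>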
                                        <div>
                                          <div><br>
                                            <blockquote type="cite">
                                              <div dir="ltr">
                                                <div><br>
                                                </div>
                                                <div>[root@gfs01a
                                                  xattrop]# gluster
                                                  volume heal homegfs
                                                  info<br>
                                                  Brick
                                                  gfs01a.corvidtec.com:/data/brick01a/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs01b.corvidtec.com:/data/brick01b/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs01a.corvidtec.com:/data/brick02a/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs01b.corvidtec.com:/data/brick02b/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs02a.corvidtec.com:/data/brick01a/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs02b.corvidtec.com:/data/brick01b/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs02a.corvidtec.com:/data/brick02a/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  Brick
                                                  gfs02b.corvidtec.com:/data/brick02b/homegfs/<br>
                                                  Number of entries: 0<br>
                                                  <br>
                                                  <br>
                                                  <br>
                                                </div>
                                              </div>
                                              <div class="gmail_extra"><br>
                                                <div class="gmail_quote">On

                                                  Thu, Jan 21, 2016 at
                                                  10:40 AM, Pranith
                                                  Kumar Karampuri <span
                                                    dir="ltr">&lt;<a
                                                      moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>
                                                  wrote:<br>
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0 0 0
                                                    .8ex;border-left:1px
                                                    #ccc
                                                    solid;padding-left:1ex">
                                                    <div
                                                      bgcolor="#FFFFFF"
                                                      text="#000000"><span>
                                                        <br>
                                                        <br>
                                                        <div>On
                                                          01/21/2016
                                                          08:25 PM,
                                                          Glomski,
                                                          Patrick wrote:<br>
                                                        </div>
                                                        <blockquote
                                                          type="cite">
                                                          <div dir="ltr">
                                                          <div>Hello,
                                                          Pranith. The typical behavior is that
                                                          the %cpu on a glusterfsd process jumps
                                                          to the number of processor cores
                                                          available (800% or 1200%, depending on
                                                          the pair of nodes involved) and the load
                                                          average on the machine goes very high
                                                          (~20). The volume's heal statistics
                                                          output shows that it is crawling one of
                                                          the bricks and trying to heal, but this
                                                          crawl hangs and never seems to
                                                          finish.<br>
                                                          </div>
                                                          </div>
                                                        </blockquote>
                                                        <blockquote
                                                          type="cite">
                                                          <div dir="ltr">
                                                          <div><br>
                                                          </div>
                                                          The number of
                                                          files in the xattrop directory varies
                                                          over time, so I ran a wc -l as you
                                                          requested periodically for some time and
                                                          then started including a datestamped
                                                          list of the files that were in the
                                                          xattrop directory on each brick to see
                                                          which entries were persistent. All
                                                          bricks had files in the xattrop folder,
                                                          so all results are attached.<br>
                                                          </div>
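                                                        (The periodic capture
                                                        mentioned above would
                                                        have looked roughly
                                                        like this; the
                                                        interval and output
                                                        file are arbitrary:)<br>
                                                        <br>
                                                        while true; do date; for d in /data/brick0*/homegfs/.glusterfs/indices/xattrop; do echo "$d"; ls "$d" | wc -l; ls "$d"; done; sleep 300; done &gt;&gt; /tmp/xattrop-watch.log<br>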
                                                        </blockquote>
                                      </span> Thanks, this info is
                                      helpful. I don't see a lot of
                                      files. Could you give the output of
                                      "gluster volume heal
                                      &lt;volname&gt; info"? Is there
                                      any directory in there which is
                                      LARGE?<span><font
color="#888888"><br>
                                                          <br>
                                                          Pranith</font></span>
                                                      <div>
                                                        <div><br>
                                                          <blockquote
                                                          type="cite">
                                                          <div dir="ltr">
                                                          <div><br>
                                                          </div>
                                                          <div>Please
                                                          let me know if
                                                          there is
                                                          anything else
                                                          I can provide.<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          <div>Patrick<br>
                                                          </div>
                                                          <div><br>
                                                          </div>
                                                          </div>
                                                          <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">On


                                                          Thu, Jan 21,
                                                          2016 at 12:01
                                                          AM, Pranith
                                                          Kumar
                                                          Karampuri <span
                                                          dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>
                                                          wrote:<br>
                                                          <blockquote
                                                          class="gmail_quote"
                                                          style="margin:0

                                                          0 0
                                                          .8ex;border-left:1px
                                                          #ccc
                                                          solid;padding-left:1ex">
                                                          <div
                                                          bgcolor="#FFFFFF"
                                                          text="#000000">
                                                          hey,<br>
                                                                 Which
                                                          process is
                                                          consuming so
                                                          much cpu? I
                                                          went through
                                                          the logs you
                                                          gave me. I see
                                                          that the
                                                          following
                                                          files are in
                                                          gfid mismatch
                                                          state:<br>
                                                          <br>
&lt;066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup&gt;,<br>
&lt;1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak&gt;,<br>
&lt;ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg&gt;,<br>
                                                          <br>
                                                          Could you give
                                                          me the output of "ls
                                                          &lt;brick-path&gt;/indices/xattrop
                                                          | wc -l" on all the bricks
                                                          which are acting this
                                                          way? This will tell us the
                                                          number of pending
                                                          self-heals on the system.<br>
                                                          <br>
                                                          Pranith
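                                                          <br>
                                                          (On each storage
                                                          node that is acting
                                                          up, a loop like this
                                                          should give those
                                                          counts; the glob
                                                          matches the brick
                                                          paths listed in the
                                                          volume info below:)<br>
                                                          <br>
                                                          for d in /data/brick0*/homegfs/.glusterfs/indices/xattrop; do echo -n "$d: "; ls "$d" | wc -l; done<br>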
                                                          <div>
                                                          <div><br>
                                                          <br>
                                                          <div>On
                                                          01/20/2016
                                                          09:26 PM,
                                                          David Robinson
                                                          wrote:<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <blockquote
                                                          type="cite">
                                                          <div>
                                                          <div>
                                                          <div>resending
                                                          with parsed
                                                          logs... </div>
                                                          <div> </div>
                                                          <div>
                                                          <blockquote
                                                          cite="http://em5ee26b0e-002a-4230-bdec-3020b98cff3c@dfrobins-vaio"
                                                          type="cite">
                                                          <div> </div>
                                                          <div> </div>
                                                          <div>
                                                          <blockquote
                                                          cite="http://eme3b2cb80-8be2-4fa5-9d08-4710955e237c@dfrobins-vaio"
                                                          type="cite">
                                                          <div>I am
                                                          having issues with 3.6.6 where the load
                                                          will spike up to 800% for one of the
                                                          glusterfsd processes and the users can
                                                          no longer access the system.  If I
                                                          reboot the node, the heal will finish
                                                          normally after a few minutes and the
                                                          system will be responsive, but a few
                                                          hours later the issue will start
                                                          again.  It looks like it is hanging in a
                                                          heal and spinning up the load on one of
                                                          the bricks.  The heal gets stuck, says
                                                          it is crawling, and never returns.
                                                          After a few minutes of the heal saying
                                                          it is crawling, the load spikes up and
                                                          the mounts become unresponsive.</div>
                                                          <div> </div>
                                                          <div>Any
                                                          suggestions on how to fix this?  It has
                                                          us stopped cold, as users can no longer
                                                          access the systems when the load
                                                          spikes... Logs attached.</div>
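                                                          <div>(One thing that
                                                          may help pin down
                                                          what the busy brick
                                                          is doing during a
                                                          hang, if statedumps
                                                          are acceptable here:
                                                          take a statedump
                                                          while the load is
                                                          spiked and look at
                                                          the dump files it
                                                          leaves, by default
                                                          under
                                                          /var/run/gluster:)<br>
                                                          <br>
                                                          gluster volume statedump homegfs<br>
                                                          ls -lt /var/run/gluster | head<br>
                                                          </div>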
                                                          <div> </div>
<div>System setup info is: </div>
<div> </div>
<div>[root@gfs01a ~]# gluster volume info homegfs<br>
 <br>
Volume Name: homegfs<br>
Type: Distributed-Replicate<br>
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071<br>
Status: Started<br>
Number of Bricks: 4 x 2 = 8<br>
Transport-type: tcp<br>
Bricks:<br>
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs<br>
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs<br>
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs<br>
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs<br>
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs<br>
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs<br>
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs<br>
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs<br>
Options Reconfigured:<br>
performance.io-thread-count: 32<br>
performance.cache-size: 128MB<br>
performance.write-behind-window-size: 128MB<br>
server.allow-insecure: on<br>
network.ping-timeout: 42<br>
storage.owner-gid: 100<br>
geo-replication.indexing: off<br>
geo-replication.ignore-pid-check: on<br>
changelog.changelog: off<br>
changelog.fsync-interval: 3<br>
changelog.rollover-time: 15<br>
server.manage-gids: on<br>
diagnostics.client-log-level: WARNING</div>
                                                          <div> </div>
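<div>For reference, the entries under "Options Reconfigured" above would normally have been applied one key at a time with the gluster CLI; a minimal sketch, with the values simply mirroring the listing above:</div>
<pre># sketch: how the reconfigured options listed above are typically applied
gluster volume set homegfs performance.io-thread-count 32
gluster volume set homegfs performance.cache-size 128MB
gluster volume set homegfs performance.write-behind-window-size 128MB
gluster volume set homegfs diagnostics.client-log-level WARNING

# re-check the effective settings afterwards
gluster volume info homegfs</pre>
<div> </div>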
<div>[root@gfs01a ~]# rpm -qa | grep gluster<br>
gluster-nagios-common-0.1.1-0.el6.noarch<br>
glusterfs-fuse-3.6.6-1.el6.x86_64<br>
glusterfs-debuginfo-3.6.6-1.el6.x86_64<br>
glusterfs-libs-3.6.6-1.el6.x86_64<br>
glusterfs-geo-replication-3.6.6-1.el6.x86_64<br>
glusterfs-api-3.6.6-1.el6.x86_64<br>
glusterfs-devel-3.6.6-1.el6.x86_64<br>
glusterfs-api-devel-3.6.6-1.el6.x86_64<br>
glusterfs-3.6.6-1.el6.x86_64<br>
glusterfs-cli-3.6.6-1.el6.x86_64<br>
glusterfs-rdma-3.6.6-1.el6.x86_64<br>
samba-vfs-glusterfs-4.1.11-2.el6.x86_64<br>
glusterfs-server-3.6.6-1.el6.x86_64<br>
glusterfs-extra-xlators-3.6.6-1.el6.x86_64<br>
                                                          </div>
                                                          <div> </div>
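<div>A minimal diagnostic sketch, assuming the gluster CLI profiling and statedump commands are available in this release (the pgrep pattern and brick path below are examples only), for narrowing down which file operations are driving the brick CPU while the load is high:</div>
<pre># enable per-brick, per-FOP accounting, reproduce the spike, then inspect
gluster volume profile homegfs start
# ... wait for the load spike to occur ...
gluster volume profile homegfs info      # call counts and latencies per file operation, per brick
gluster volume profile homegfs stop

# thread-level CPU view of one brick process (brick path in the pattern is only an example)
top -H -p "$(pgrep -f 'glusterfsd.*brick01a' | head -n 1)"

# dump internal state of the brick processes (files typically land under /var/run/gluster)
gluster volume statedump homegfs</pre>
<div> </div>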
                                                          </blockquote>
                                                          </div>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          <fieldset></fieldset>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          <br>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                </div>
                                                <br>
                                              </div>
                                            </blockquote>
                                            <br>
                                          </div>
                                        </div>
                                      </div>
                                    </blockquote>
                                  </div>
                                  <br>
                                </div>
                              </div>
                            </div>
                          </blockquote>
                        </div>
                        <br>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </blockquote>
      <br>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Gluster-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-devel">http://www.gluster.org/mailman/listinfo/gluster-devel</a></pre>
    </blockquote>
    <br>
  </body>
</html>