<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<br>
<div class="moz-cite-prefix">On 01/22/2016 07:19 AM, Pranith Kumar
Karampuri wrote:<br>
</div>
<blockquote cite="mid:56A18AA6.8010701@redhat.com" type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<br>
<br>
<div class="moz-cite-prefix">On 01/22/2016 07:13 AM, Glomski,
Patrick wrote:<br>
</div>
<blockquote
cite="mid:CALkMjdDxd0zCGM4tn9PTXGEgUR+Z7cF0vhbd+d4TCJkun2tEfg@mail.gmail.com"
type="cite">
<div dir="ltr">We use the samba glusterfs virtual filesystem
(the current version provided on <a moz-do-not-send="true"
href="http://download.gluster.org">download.gluster.org</a>),
but no Windows clients connect directly.<br>
</div>
</blockquote>
<br>
Hmm.. Is there a way to disable this and check whether the CPU%
still increases? A getxattr of "glusterfs.get_real_filename
&lt;filename&gt;" scans the entire directory, running
strcasecmp(&lt;filename&gt;, &lt;scanned-filename&gt;) against
each entry; the first match is returned as the
&lt;scanned-filename&gt;. The problem is that this scan is
costly, so I wonder if it is the reason for the CPU spikes.<br>
</blockquote>
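The cost of that lookup can be pictured with a plain shell emulation (illustration only; the real lookup runs inside the brick process, and the file names here are invented for the demo). A case-insensitive match cannot stop early on a sorted index; it must compare the target against every entry until one matches:

```shell
# Emulate a glusterfs.get_real_filename-style scan: strcasecmp each
# directory entry against the requested name. Demo files are made up.
dir=$(mktemp -d)
touch "$dir/Report.TXT" "$dir/notes.txt"
target="report.txt"
found=""
for f in "$dir"/*; do
  name=${f##*/}
  # strcasecmp-style comparison: lowercase both sides before comparing
  if [ "$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')" = "$target" ]; then
    found=$name
    break
  fi
done
echo "found: $found"   # a directory of N entries costs O(N) per lookup
```

In a directory with tens of thousands of entries, every such getxattr repeats that full walk, which is consistent with a brick process pinning its CPUs.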
+Raghavendra Talur, +Poornima<br>
<br>
Raghavendra, Poornima,<br>
When are these getxattrs triggered? Did you guys see any
brick CPU spikes before? I initially thought it could be because of
big directory heals. But this is happening even when no self-heals
are required. So I had to move away from that theory.<br>
<br>
Pranith<br>
<blockquote cite="mid:56A18AA6.8010701@redhat.com" type="cite"> <br>
Pranith<br>
<blockquote
cite="mid:CALkMjdDxd0zCGM4tn9PTXGEgUR+Z7cF0vhbd+d4TCJkun2tEfg@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jan 21, 2016 at 8:37 PM,
Pranith Kumar Karampuri <span dir="ltr"><<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Do you have any
Windows clients? I see a lot of getxattr calls for
"glusterfs.get_real_filename" which lead to full
readdirs of the directories on the brick.<span
class="HOEnZb"><font color="#888888"><br>
<br>
Pranith</font></span><span class=""><br>
<br>
<div>On 01/22/2016 12:51 AM, Glomski, Patrick wrote:<br>
</div>
</span>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">
<div>Pranith, could this kind of behavior be
self-inflicted by us deleting files directly
from the bricks? We have done that in the past
to clean up issues where gluster wouldn't
allow us to delete from the mount.<br>
<br>
If so, is it feasible to clean them up by
running a search on the .glusterfs directories
directly and removing non-empty files with a
link count of 1 (or directly checking the
xattrs to be sure a file is not a DHT link)?<br>
<br>
find /data/brick01a/homegfs/.glusterfs -type f
-not -empty -links -2 -exec rm -f "{}" \;<br>
<br>
</div>
Is there anything I'm inherently missing with
that approach that could further corrupt the
system?<br>
<div><br>
</div>
</div>
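A more cautious variant of that find (a sketch, not something verified against a live brick; it assumes GNU find and the getfattr tool from the attr package are installed on the brick host) checks the xattr instead of trusting size alone, and lists candidates rather than deleting so the output can be reviewed before any rm:

```shell
# Sketch: list orphan candidates under a brick's .glusterfs without
# deleting anything. DHT link files carry the
# trusted.glusterfs.dht.linkto xattr; skip them so a valid pointer
# file is never removed. Brick path is an argument,
# e.g. /data/brick01a/homegfs.
list_cleanup_candidates() {
  find "$1/.glusterfs" -type f -links 1 ! -empty -print0 |
  while IFS= read -r -d '' f; do
    if getfattr -n trusted.glusterfs.dht.linkto --only-values "$f" \
         >/dev/null 2>&1; then
      echo "skip (DHT link file): $f"
    else
      echo "candidate: $f"
    fi
  done
}
```

Only after reviewing the candidate list would one pipe it into rm -f.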
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jan 21, 2016 at
1:02 PM, Glomski, Patrick <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:patrick.glomski@corvidtec.com"
target="_blank">patrick.glomski@corvidtec.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>Load spiked again: ~1200% CPU on
gfs02a for glusterfsd. A crawl has been
running on one of the bricks on gfs02b
for 25 min or so and users cannot
access the volume.<br>
<br>
I re-listed the xattrop directories as
well as a 'top' entry and heal
statistics. Then I restarted the
gluster services on gfs02a. <br>
<br>
=================== top
===================<br>
PID USER PR NI VIRT RES SHR S
%CPU %MEM TIME+
COMMAND
<br>
8969 root 20 0 2815m 204m 3588
S 1181.0 0.6 591:06.93
glusterfsd <br>
<br>
=================== xattrop
===================<br>
/data/brick01a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-41f19453-91e4-437c-afa9-3b25614de210
xattrop-9b815879-2f4d-402b-867c-a6d65087788c<br>
<br>
/data/brick02a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-70131855-3cfb-49af-abce-9d23f57fb393
xattrop-dfb77848-a39d-4417-a725-9beca75d78c6<br>
<br>
/data/brick01b/homegfs/.glusterfs/indices/xattrop:<br>
e6e47ed9-309b-42a7-8c44-28c29b9a20f8
xattrop-5c797a64-bde7-4eac-b4fc-0befc632e125<br>
xattrop-38ec65a1-00b5-4544-8a6c-bf0f531a1934
xattrop-ef0980ad-f074-4163-979f-16d5ef85b0a0<br>
<br>
/data/brick02b/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-7402438d-0ee7-4fcf-b9bb-b561236f99bc
xattrop-8ffbf5f7-ace3-497d-944e-93ac85241413<br>
<br>
/data/brick01a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-0115acd0-caae-4dfd-b3b4-7cc42a0ff531<br>
<br>
/data/brick02a/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-7e20fdb1-5224-4b9a-be06-568708526d70<br>
<br>
/data/brick01b/homegfs/.glusterfs/indices/xattrop:<br>
8034bc06-92cd-4fa5-8aaf-09039e79d2c8
c9ce22ed-6d8b-471b-a111-b39e57f0b512<br>
94fa1d60-45ad-4341-b69c-315936b51e8d
xattrop-9c04623a-64ce-4f66-8b23-dbaba49119c7<br>
<br>
/data/brick02b/homegfs/.glusterfs/indices/xattrop:<br>
xattrop-b8c8f024-d038-49a2-9a53-c54ead09111d<br>
<br>
<br>
=================== heal stats
===================<br>
<br>
homegfs [b0-gfsib01a] : Starting time
of crawl : Thu Jan 21 12:36:45
2016<br>
homegfs [b0-gfsib01a] : Ending time of
crawl : Thu Jan 21 12:36:45
2016<br>
homegfs [b0-gfsib01a] : Type of crawl:
INDEX<br>
homegfs [b0-gfsib01a] : No. of entries
healed : 0<br>
homegfs [b0-gfsib01a] : No. of entries
in split-brain: 0<br>
homegfs [b0-gfsib01a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b1-gfsib01b] : Starting time
of crawl : Thu Jan 21 12:36:19
2016<br>
homegfs [b1-gfsib01b] : Ending time of
crawl : Thu Jan 21 12:36:19
2016<br>
homegfs [b1-gfsib01b] : Type of crawl:
INDEX<br>
homegfs [b1-gfsib01b] : No. of entries
healed : 0<br>
homegfs [b1-gfsib01b] : No. of entries
in split-brain: 0<br>
homegfs [b1-gfsib01b] : No. of heal
failed entries : 1<br>
<br>
homegfs [b2-gfsib01a] : Starting time
of crawl : Thu Jan 21 12:36:48
2016<br>
homegfs [b2-gfsib01a] : Ending time of
crawl : Thu Jan 21 12:36:48
2016<br>
homegfs [b2-gfsib01a] : Type of crawl:
INDEX<br>
homegfs [b2-gfsib01a] : No. of entries
healed : 0<br>
homegfs [b2-gfsib01a] : No. of entries
in split-brain: 0<br>
homegfs [b2-gfsib01a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b3-gfsib01b] : Starting time
of crawl : Thu Jan 21 12:36:47
2016<br>
homegfs [b3-gfsib01b] : Ending time of
crawl : Thu Jan 21 12:36:47
2016<br>
homegfs [b3-gfsib01b] : Type of crawl:
INDEX<br>
homegfs [b3-gfsib01b] : No. of entries
healed : 0<br>
homegfs [b3-gfsib01b] : No. of entries
in split-brain: 0<br>
homegfs [b3-gfsib01b] : No. of heal
failed entries : 0<br>
<br>
homegfs [b4-gfsib02a] : Starting time
of crawl : Thu Jan 21 12:36:06
2016<br>
homegfs [b4-gfsib02a] : Ending time of
crawl : Thu Jan 21 12:36:06
2016<br>
homegfs [b4-gfsib02a] : Type of crawl:
INDEX<br>
homegfs [b4-gfsib02a] : No. of entries
healed : 0<br>
homegfs [b4-gfsib02a] : No. of entries
in split-brain: 0<br>
homegfs [b4-gfsib02a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b5-gfsib02b] : Starting time
of crawl : Thu Jan 21 12:13:40
2016<br>
homegfs [b5-gfsib02b]
: ***
Crawl is in progress ***<br>
homegfs [b5-gfsib02b] : Type of crawl:
INDEX<br>
homegfs [b5-gfsib02b] : No. of entries
healed : 0<br>
homegfs [b5-gfsib02b] : No. of entries
in split-brain: 0<br>
homegfs [b5-gfsib02b] : No. of heal
failed entries : 0<br>
<br>
homegfs [b6-gfsib02a] : Starting time
of crawl : Thu Jan 21 12:36:58
2016<br>
homegfs [b6-gfsib02a] : Ending time of
crawl : Thu Jan 21 12:36:58
2016<br>
homegfs [b6-gfsib02a] : Type of crawl:
INDEX<br>
homegfs [b6-gfsib02a] : No. of entries
healed : 0<br>
homegfs [b6-gfsib02a] : No. of entries
in split-brain: 0<br>
homegfs [b6-gfsib02a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b7-gfsib02b] : Starting time
of crawl : Thu Jan 21 12:36:50
2016<br>
homegfs [b7-gfsib02b] : Ending time of
crawl : Thu Jan 21 12:36:50
2016<br>
homegfs [b7-gfsib02b] : Type of crawl:
INDEX<br>
homegfs [b7-gfsib02b] : No. of entries
healed : 0<br>
homegfs [b7-gfsib02b] : No. of entries
in split-brain: 0<br>
homegfs [b7-gfsib02b] : No. of heal
failed entries : 0<br>
<br>
<br>
========================================================================================<br>
</div>
I waited a few minutes for the heals to
finish and ran the heal statistics and
info again. One file is in split-brain.
Aside from the split-brain, the load on
all systems is down now and they are
behaving normally. glustershd.log is
attached. What is going on?<br>
<br>
Thu Jan 21 12:53:50 EST 2016<br>
<br>
=================== homegfs
===================<br>
<br>
homegfs [b0-gfsib01a] : Starting time of
crawl : Thu Jan 21 12:53:02 2016<br>
homegfs [b0-gfsib01a] : Ending time of
crawl : Thu Jan 21 12:53:02 2016<br>
homegfs [b0-gfsib01a] : Type of crawl:
INDEX<br>
homegfs [b0-gfsib01a] : No. of entries
healed : 0<br>
homegfs [b0-gfsib01a] : No. of entries
in split-brain: 0<br>
homegfs [b0-gfsib01a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b1-gfsib01b] : Starting time of
crawl : Thu Jan 21 12:53:38 2016<br>
homegfs [b1-gfsib01b] : Ending time of
crawl : Thu Jan 21 12:53:38 2016<br>
homegfs [b1-gfsib01b] : Type of crawl:
INDEX<br>
homegfs [b1-gfsib01b] : No. of entries
healed : 0<br>
homegfs [b1-gfsib01b] : No. of entries
in split-brain: 0<br>
homegfs [b1-gfsib01b] : No. of heal
failed entries : 1<br>
<br>
homegfs [b2-gfsib01a] : Starting time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b2-gfsib01a] : Ending time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b2-gfsib01a] : Type of crawl:
INDEX<br>
homegfs [b2-gfsib01a] : No. of entries
healed : 0<br>
homegfs [b2-gfsib01a] : No. of entries
in split-brain: 0<br>
homegfs [b2-gfsib01a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b3-gfsib01b] : Starting time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b3-gfsib01b] : Ending time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b3-gfsib01b] : Type of crawl:
INDEX<br>
homegfs [b3-gfsib01b] : No. of entries
healed : 0<br>
homegfs [b3-gfsib01b] : No. of entries
in split-brain: 0<br>
homegfs [b3-gfsib01b] : No. of heal
failed entries : 0<br>
<br>
homegfs [b4-gfsib02a] : Starting time of
crawl : Thu Jan 21 12:53:33 2016<br>
homegfs [b4-gfsib02a] : Ending time of
crawl : Thu Jan 21 12:53:33 2016<br>
homegfs [b4-gfsib02a] : Type of crawl:
INDEX<br>
homegfs [b4-gfsib02a] : No. of entries
healed : 0<br>
homegfs [b4-gfsib02a] : No. of entries
in split-brain: 0<br>
homegfs [b4-gfsib02a] : No. of heal
failed entries : 1<br>
<br>
homegfs [b5-gfsib02b] : Starting time of
crawl : Thu Jan 21 12:53:14 2016<br>
homegfs [b5-gfsib02b] : Ending time of
crawl : Thu Jan 21 12:53:15 2016<br>
homegfs [b5-gfsib02b] : Type of crawl:
INDEX<br>
homegfs [b5-gfsib02b] : No. of entries
healed : 0<br>
homegfs [b5-gfsib02b] : No. of entries
in split-brain: 0<br>
homegfs [b5-gfsib02b] : No. of heal
failed entries : 3<br>
<br>
homegfs [b6-gfsib02a] : Starting time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b6-gfsib02a] : Ending time of
crawl : Thu Jan 21 12:53:04 2016<br>
homegfs [b6-gfsib02a] : Type of crawl:
INDEX<br>
homegfs [b6-gfsib02a] : No. of entries
healed : 0<br>
homegfs [b6-gfsib02a] : No. of entries
in split-brain: 0<br>
homegfs [b6-gfsib02a] : No. of heal
failed entries : 0<br>
<br>
homegfs [b7-gfsib02b] : Starting time of
crawl : Thu Jan 21 12:53:09 2016<br>
homegfs [b7-gfsib02b] : Ending time of
crawl : Thu Jan 21 12:53:09 2016<br>
homegfs [b7-gfsib02b] : Type of crawl:
INDEX<br>
homegfs [b7-gfsib02b] : No. of entries
healed : 0<br>
homegfs [b7-gfsib02b] : No. of entries
in split-brain: 0<br>
homegfs [b7-gfsib02b] : No. of heal
failed entries : 0<br>
<br>
*** gluster bug in 'gluster volume heal
homegfs statistics' ***<br>
*** Use 'gluster volume heal homegfs
info' until bug is fixed ***<span><br>
<br>
Brick
gfs01a.corvidtec.com:/data/brick01a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01b.corvidtec.com:/data/brick01b/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01a.corvidtec.com:/data/brick02a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01b.corvidtec.com:/data/brick02b/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02a.corvidtec.com:/data/brick01a/homegfs/<br>
</span>/users/bangell/.gconfd - Is in
split-brain<br>
<br>
Number of entries: 1<br>
<br>
Brick
gfs02b.corvidtec.com:/data/brick01b/homegfs/<br>
/users/bangell/.gconfd - Is in
split-brain<br>
<br>
/users/bangell/.gconfd/saved_state <br>
Number of entries: 2<span><br>
<br>
Brick
gfs02a.corvidtec.com:/data/brick02a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02b.corvidtec.com:/data/brick02b/homegfs/<br>
Number of entries: 0<br>
<br>
</span></div>
<div><br>
<br>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jan
21, 2016 at 11:10 AM, Pranith Kumar
Karampuri <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com"
target="_blank">pkarampu@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000"><span> <br>
<br>
<div>On 01/21/2016 09:26 PM,
Glomski, Patrick wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I should mention that
the problem is not
currently occurring and
there are no heals
(output appended). By
restarting the gluster
services, we can stop
the crawl, which lowers
the load for a while.
Subsequent crawls seem
to finish properly. For
what it's worth,
files/folders that show
up in the 'volume info'
output during a hung
crawl don't seem to be
anything out of the
ordinary. <br>
<br>
Over the past four days,
the typical time before
the problem recurs after
suppressing it in this
manner is an hour. Last
night when we reached
out to you was the last
time it happened and the
load has been low since
(a relief). David
believes that
recursively listing the
files (ls -alR or
similar) from a client
mount can force the
issue to happen, but
obviously I'd rather not
unless we have some
precise thing we're
looking for. Let me know
if you'd like me to
attempt to drive the
system unstable like
that and what I should
look for. As it's a
production system, I'd
rather not leave it in
this state for long.<br>
</div>
</div>
</blockquote>
<br>
</span> Would it be possible to
send the glustershd and mount logs
from the past 4 days? I would like
to see if this is because of
directory self-heal going wild
(Ravi is working on a throttling
feature for 3.8, which will allow
putting the brakes on self-heal
traffic)<span><font
color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>[root@gfs01a
xattrop]# gluster
volume heal homegfs
info<br>
Brick
gfs01a.corvidtec.com:/data/brick01a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01b.corvidtec.com:/data/brick01b/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01a.corvidtec.com:/data/brick02a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs01b.corvidtec.com:/data/brick02b/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02a.corvidtec.com:/data/brick01a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02b.corvidtec.com:/data/brick01b/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02a.corvidtec.com:/data/brick02a/homegfs/<br>
Number of entries: 0<br>
<br>
Brick
gfs02b.corvidtec.com:/data/brick02b/homegfs/<br>
Number of entries: 0<br>
<br>
<br>
<br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On
Thu, Jan 21, 2016 at
10:40 AM, Pranith
Kumar Karampuri <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
bgcolor="#FFFFFF"
text="#000000"><span>
<br>
<br>
<div>On
01/21/2016
08:25 PM,
Glomski,
Patrick wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">
<div>Hello,
Pranith. The
typical
behavior is
that the %cpu
on a
glusterfsd
process jumps
to the number
of processor
cores
available
(800% or
1200%,
depending on
the pair of
nodes
involved) and
the load
average on the
machine goes
very high
(~20). The
volume's heal
statistics
output shows
that it is
crawling one
of the bricks
and trying to
heal, but this
crawl hangs
and never
seems to
finish.<br>
</div>
</div>
</blockquote>
<blockquote
type="cite">
<div dir="ltr">
<div><br>
</div>
The number of
files in the
xattrop
directory
varies over
time, so I ran
a wc -l as you
requested
periodically
for some time
and then
started
including a
datestamped
list of the
files that
were in the
xattrop
directory on
each brick to
see which were
persistent.
All bricks had
files in the
xattrop
folder, so all
results are
attached.<br>
</div>
</blockquote>
</span> Thanks,
this info is
helpful. I don't
see a lot of
files. Could you
give output of
"gluster volume
heal
&lt;volname&gt;
info"? Is there
any directory in
there which is
LARGE?<span><font
color="#888888"><br>
<br>
Pranith</font></span>
<div>
<div><br>
<blockquote
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Please
let me know if
there is
anything else
I can provide.<br>
</div>
<div><br>
</div>
<div>Patrick<br>
</div>
<div><br>
</div>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Thu, Jan 21,
2016 at 12:01
AM, Pranith
Kumar
Karampuri <span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
bgcolor="#FFFFFF"
text="#000000">
hey,<br>
Which
process is
consuming so
much CPU? I
went through
the logs you
gave me. I see
that the
following
files are in
gfid mismatch
state:<br>
<br>
&lt;066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup&gt;,<br>
&lt;1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak&gt;,<br>
&lt;ddc92637-303a-4059-9c56-ab23b1bb6ae9/patch0008.cnvrg&gt;,<br>
<br>
Could you give
me the output
of "ls
<brick-path>/indices/xattrop
| wc -l"
output on all
the bricks
which are
acting this
way? This will
tell us the
number of
pending
self-heals on
the system.<br>
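That count can be gathered in one pass with a small loop per node (a sketch; the brick paths passed in would be the ones 'gluster volume info homegfs' lists for that host):

```shell
# Sketch: print the pending-heal index entry count for each brick path
# given as an argument, by counting files in its xattrop index directory.
count_pending_heals() {
  for brick in "$@"; do
    idx="$brick/.glusterfs/indices/xattrop"
    [ -d "$idx" ] || continue
    printf '%s %s\n' "$brick" "$(ls -A "$idx" | wc -l | tr -d ' ')"
  done
}
# e.g. count_pending_heals /data/brick01a/homegfs /data/brick02a/homegfs
```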
<br>
Pranith
<div>
<div><br>
<br>
<div>On
01/20/2016
09:26 PM,
David Robinson
wrote:<br>
</div>
</div>
</div>
<blockquote
type="cite">
<div>
<div>
<div>resending
with parsed
logs... </div>
<div> </div>
<div>
<blockquote
cite="http://em5ee26b0e-002a-4230-bdec-3020b98cff3c@dfrobins-vaio"
type="cite">
<div> </div>
<div> </div>
<div>
<blockquote
cite="http://eme3b2cb80-8be2-4fa5-9d08-4710955e237c@dfrobins-vaio"
type="cite">
<div>I am
having issues
with 3.6.6
where the load
will spike up
to 800% for
one of the
glusterfsd
processes and
the users can
no longer
access the
system. If I
reboot the
node, the heal
will finish
normally after
a few minutes
and the system
will be
responsive,
but a few
hours later
the issue will
start again.
It looks like
it is hanging
in a heal and
spinning up
the load on
one of the
bricks. The
heal gets
stuck and says
it is crawling
and never
returns.
After a few
minutes of the
heal saying it
is crawling,
the load
spikes up and
the mounts
become
unresponsive.</div>
<div> </div>
<div>Any
suggestions on
how to fix
this? It has
us stopped
cold as the
user can no
longer access
the systems
when the load
spikes... Logs
attached.</div>
<div> </div>
<div>System
setup info is:
</div>
<div> </div>
<div>[root@gfs01a
~]# gluster
volume info
homegfs<br>
<br>
Volume Name:
homegfs<br>
Type:
Distributed-Replicate<br>
Volume ID:
1e32672a-f1b7-4b58-ba94-58c085e59071<br>
Status:
Started<br>
Number of
Bricks: 4 x 2
= 8<br>
Transport-type:
tcp<br>
Bricks:<br>
Brick1:
gfsib01a.corvidtec.com:/data/brick01a/homegfs<br>
Brick2:
gfsib01b.corvidtec.com:/data/brick01b/homegfs<br>
Brick3:
gfsib01a.corvidtec.com:/data/brick02a/homegfs<br>
Brick4:
gfsib01b.corvidtec.com:/data/brick02b/homegfs<br>
Brick5:
gfsib02a.corvidtec.com:/data/brick01a/homegfs<br>
Brick6:
gfsib02b.corvidtec.com:/data/brick01b/homegfs<br>
Brick7:
gfsib02a.corvidtec.com:/data/brick02a/homegfs<br>
Brick8:
gfsib02b.corvidtec.com:/data/brick02b/homegfs<br>
Options
Reconfigured:<br>
performance.io-thread-count:
32<br>
performance.cache-size:
128MB<br>
performance.write-behind-window-size:
128MB<br>
server.allow-insecure:
on<br>
network.ping-timeout:
42<br>
storage.owner-gid:
100<br>
geo-replication.indexing:
off<br>
geo-replication.ignore-pid-check:
on<br>
changelog.changelog:
off<br>
changelog.fsync-interval:
3<br>
changelog.rollover-time:
15<br>
server.manage-gids:
on<br>
diagnostics.client-log-level:
WARNING</div>
<div> </div>
<div>[root@gfs01a
~]# rpm -qa |
grep gluster<br>
gluster-nagios-common-0.1.1-0.el6.noarch<br>
glusterfs-fuse-3.6.6-1.el6.x86_64<br>
glusterfs-debuginfo-3.6.6-1.el6.x86_64<br>
glusterfs-libs-3.6.6-1.el6.x86_64<br>
glusterfs-geo-replication-3.6.6-1.el6.x86_64<br>
glusterfs-api-3.6.6-1.el6.x86_64<br>
glusterfs-devel-3.6.6-1.el6.x86_64<br>
glusterfs-api-devel-3.6.6-1.el6.x86_64<br>
glusterfs-3.6.6-1.el6.x86_64<br>
glusterfs-cli-3.6.6-1.el6.x86_64<br>
glusterfs-rdma-3.6.6-1.el6.x86_64<br>
samba-vfs-glusterfs-4.1.11-2.el6.x86_64<br>
glusterfs-server-3.6.6-1.el6.x86_64<br>
glusterfs-extra-xlators-3.6.6-1.el6.x86_64<br>
</div>
<div> </div>
</blockquote>
</div>
</blockquote>
</div>
<br>
<fieldset></fieldset>
<br>
</div>
</div>
<pre>_______________________________________________
Gluster-devel mailing list
<a moz-do-not-send="true" href="mailto:Gluster-devel@gluster.org" target="_blank">Gluster-devel@gluster.org</a>
<a moz-do-not-send="true" href="http://www.gluster.org/mailman/listinfo/gluster-devel" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-devel</a></pre>
</blockquote>
<br>
</div>
<br>
_______________________________________________<br>
Gluster-users
mailing list<br>
<a
moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>
<a
moz-do-not-send="true"
href="http://www.gluster.org/mailman/listinfo/gluster-users"
rel="noreferrer"
target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
Gluster-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@gluster.org">Gluster-devel@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-devel">http://www.gluster.org/mailman/listinfo/gluster-devel</a></pre>
</blockquote>
<br>
</body>
</html>