<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi,<br>
      <br>
      the log does not show anything like that. Also, I'm using ext4 on
      the bricks.<br>
      <br>
      The log only contains entries like these:<br>
      <br>
      [Fri Nov 25 14:23:27 2016] INFO: task gpu_graphene_bv:4476 blocked
      for more than 120 seconds.<br>
      [Fri Nov 25 14:23:27 2016]       Tainted: P           OE 
      3.19.0-25-generic #26~14.04.1-Ubuntu<br>
      [Fri Nov 25 14:23:27 2016] "echo 0 &gt;
      /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>
      [Fri Nov 25 14:23:27 2016] gpu_graphene_bv D ffff8804aa39be08    
      0  4476   4461 0x00000000<br>
      [Fri Nov 25 14:23:27 2016]  ffff8804aa39be08 ffff8804ad0febf0
      0000000000013e80 ffff8804aa39bfd8<br>
      [Fri Nov 25 14:23:27 2016]  0000000000013e80 ffff8804ad403110
      ffff8804ad0febf0 ffff8804aa39be18<br>
      [Fri Nov 25 14:23:27 2016]  ffff8804aa2c87d0 ffff88049df2e000
      ffff8804aa39be30 ffff8804aa2c88a0<br>
      [Fri Nov 25 14:23:27 2016] Call Trace:<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff817b22e9&gt;]
      schedule+0x29/0x70<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff812dc06d&gt;]
      __fuse_request_send+0x11d/0x290<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff810b4e10&gt;] ?
      prepare_to_wait_event+0x110/0x110<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff812dc1f2&gt;]
      fuse_request_send+0x12/0x20<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff812e576d&gt;]
      fuse_flush+0x12d/0x180<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff811e9973&gt;]
      filp_close+0x33/0x80<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff8120a152&gt;]
      __close_fd+0x82/0xa0<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff811e99e3&gt;]
      SyS_close+0x23/0x50<br>
      [Fri Nov 25 14:23:27 2016]  [&lt;ffffffff817b668d&gt;]
      system_call_fastpath+0x16/0x1b<br>
      <br>
      Which is due to the file system not responding, I guess.<br>
      Since I switched the mounts from FUSE to NFS, occasionally I also
      see:<br>
      <br>
      [Wed Dec 14 23:42:47 2016] nfs: server giant2 not responding,
      still trying<br>
      [Wed Dec 14 23:43:12 2016] nfs: server giant2 not responding,
      still trying<br>
      [Wed Dec 14 23:45:04 2016] nfs: server giant2 OK<br>
      [Wed Dec 14 23:45:04 2016] nfs: server giant2 OK<br>
      <br>
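      To put a number on these gaps, here is a minimal sketch that pairs each "not responding" message with the next "OK" for the same server. It assumes the lines come from dmesg -T in exactly the format shown above; the function name nfs_outages and the regular expression are my own and not part of any gluster or NFS tooling:<br>
      <pre>
```python
import re
from datetime import datetime

# Matches only the two kernel message shapes quoted above; anything else is skipped.
LINE_RE = re.compile(r"\[(.+?)\] nfs: server (\S+) (not responding|OK)")


def nfs_outages(lines):
    """Return (server, seconds) for each not-responding to OK interval."""
    started = {}   # server name to timestamp of the first unanswered message
    outages = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        server, state = match.group(2), match.group(3)
        if state == "not responding":
            # Keep the earliest timestamp if the message repeats.
            started.setdefault(server, stamp)
        elif server in started:
            delta = stamp - started.pop(server)
            outages.append((server, int(delta.total_seconds())))
    return outages
```
</pre>
      Repeated "not responding" lines are folded into one interval, and an "OK" without a preceding "not responding" is ignored.<br>
      <br>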
      In another post you asked for logfiles at the TRACE log level; I'll
      provide them shortly.<br>
      <br>
      Best regards and thanks,<br>
      Micha<br>
      <br>
      On 19.12.2016 at 16:09, Mohammed Rafi K C wrote:<br>
    </div>
    <blockquote
      cite="mid:b14c62b3-3381-5298-eb7f-f77be32cef99@redhat.com"
      type="cite">
      <p>Hi Micha,</p>
      <p>Can you please also check whether there are any error messages in
        dmesg? Basically, I'm trying to see whether you're hitting the issue
        described in <a moz-do-not-send="true"
          class="moz-txt-link-freetext"
          href="https://bugzilla.kernel.org/show_bug.cgi?id=73831">https://bugzilla.kernel.org/show_bug.cgi?id=73831</a>
        .</p>
      <p><br>
      </p>
      <p>Regards</p>
      <p>Rafi KC</p>
      <p><br>
      </p>
      <div class="moz-cite-prefix">On 12/19/2016 11:58 AM, Mohammed Rafi
        K C wrote:<br>
      </div>
      <blockquote
        cite="mid:86231d60-3363-0e68-48d3-818cd73c62e9@redhat.com"
        type="cite">
        <p>Hi Micha,</p>
        <p>Sorry for the late reply. I was busy with some other things.</p>
        <p>If you still have the setup available, can you enable the TRACE
          log level [1],[2] and see whether you can find any log entries
          from when the network starts disconnecting? Basically, I'm trying
          to find out whether any disconnection occurred other than the
          ping timer expiry issue.</p>
        <p><br>
        </p>
        <p><br>
        </p>
        <p>[1] : gluster volume set &lt;volname&gt;
          diagnostics.brick-log-level TRACE</p>
        <p>[2] : gluster volume set &lt;volname&gt;
          diagnostics.client-log-level TRACE<br>
        </p>
        <p><br>
        </p>
        <p>Regards</p>
        <p>Rafi KC<br>
        </p>
        <br>
        <div class="moz-cite-prefix">On 12/08/2016 07:59 PM, Atin
          Mukherjee wrote:<br>
        </div>
        <blockquote
cite="mid:CAGNCGH3Rjy8B7wz+gTQqc35FLpQ4gn9u+bMaDRM0hkaGitUaGw@mail.gmail.com"
          type="cite">
          <div dir="ltr"><br>
            <div class="gmail_extra"><br>
              <div class="gmail_quote">On Thu, Dec 8, 2016 at 4:37 PM,
                Micha Ober <span dir="ltr">&lt;<a
                    moz-do-not-send="true"
                    href="mailto:micha2k@gmail.com" target="_blank">micha2k@gmail.com</a>&gt;</span>
                wrote:<br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div bgcolor="#FFFFFF" text="#000000">
                    <div class="m_4766802258719003127moz-cite-prefix">Hi
                      Rafi,<br>
                      <br>
                      thank you for your support. It is greatly
                      appreciated.<br>
                      <br>
                      Just some more thoughts from my side:<br>
                      <br>
                      There have been no reports from other users in
                      *this* thread until now, but I have found at least
                      one user with a very similar problem in an older
                      thread:<br>
                      <br>
                      <a moz-do-not-send="true"
                        class="m_4766802258719003127moz-txt-link-freetext"
href="https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html"
                        target="_blank">https://www.gluster.org/<wbr>pipermail/gluster-users/2014-<wbr>November/019637.html</a><br>
                      <br>
                      He is also reporting disconnects with no apparent
                      reason, although his setup is a bit more
                      complicated, also involving a firewall. In our
                      setup, all servers/clients are connected via 1 GbE
                      with no firewall or anything that might
                      block/throttle traffic. Also, we are using exactly
                      the same software versions on all nodes.<br>
                      <br>
                      <br>
                      I can also find some reports in the bugtracker
                      when searching for "rpc_client_ping_timer_<wbr>expired"
                      and "rpc_clnt_ping_timer_expired" (it looks like the
                      spelling changed between versions).<br>
                      <br>
                      <a moz-do-not-send="true"
                        class="m_4766802258719003127moz-txt-link-freetext"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1096729"
                        target="_blank">https://bugzilla.redhat.com/<wbr>show_bug.cgi?id=1096729</a></div>
                  </div>
                </blockquote>
                <div><br>
                </div>
                <div>Just FYI, this is a different issue: here GlusterD
                  fails to handle the volume of incoming requests in
                  time, since MT-epoll is not enabled.<br>
                   <br>
                </div>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  <div bgcolor="#FFFFFF" text="#000000">
                    <div class="m_4766802258719003127moz-cite-prefix"><br>
                      <a moz-do-not-send="true"
                        class="m_4766802258719003127moz-txt-link-freetext"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1370683"
                        target="_blank">https://bugzilla.redhat.com/<wbr>show_bug.cgi?id=1370683</a><br>
                      <br>
                      But both reports involve heavy traffic/load on the
                      bricks/disks, which is not the case for our setup.<br>
                      To give a ballpark figure: Over three days, 30 GiB
                      were written. And the data was not written at
                      once, but continuously over the whole time.<br>
                      <br>
                      <br>
                      Just to be sure, I have just checked the logfiles of
                      one of the other clusters, which is
                      sitting in the same building, in the same rack,
                      even on the same switch, running the same jobs,
                      but with glusterfs 3.4.2, and I can see no
                      disconnects in the logfiles. So I can definitely
                      rule out our infrastructure as the problem.<br>
                      <br>
                      Regards,<br>
                      Micha
                      <div>
                        <div class="h5"><br>
                          <br>
                          <br>
                          On 07.12.2016 at 18:08, Mohammed Rafi K
                          C wrote:<br>
                        </div>
                      </div>
                    </div>
                    <div>
                      <div class="h5">
                        <blockquote type="cite">
                          <p>Hi Micha,</p>
                          <p>This is great. I will provide you with a debug
                            build that has two fixes for what I suspect
                            may be causing the frequent disconnects,
                            though I don't have much data to validate my
                            theory. So I will take one more day to dig
                            into that.</p>
                          <p>Thanks for your support, and opensource++ 
                          </p>
                          <p>Regards</p>
                          <p>Rafi KC<br>
                          </p>
                          <div
                            class="m_4766802258719003127moz-cite-prefix">On
                            12/07/2016 05:02 AM, Micha Ober wrote:<br>
                          </div>
                          <blockquote type="cite">
                            <div
                              class="m_4766802258719003127moz-cite-prefix">Hi,<br>
                              <br>
                              thank you for your answer and even more
                              for the question!<br>
                              Until now, I was using FUSE. Today I
                              changed all mounts to NFS using the same
                              3.7.17 version.<br>
                              <br>
                              But: The problem is still the same. Now,
                              the NFS logfile contains lines like these:<br>
                              <br>
                              [2016-12-06 15:12:29.006325] C
                              [rpc-clnt-ping.c:165:rpc_clnt_<wbr>ping_timer_expired]
                              0-gv0-client-7: server X.X.18.62:49153 has
                              not responded in the last 42 seconds,
                              disconnecting.<br>
                              <br>
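                              To see which connections are hit most often, these entries can be tallied per client translator and server. This is only a rough sketch that assumes the message wording shown above; the function name count_ping_timeouts and the pattern are my own:<br>
                              <pre>
```python
import re
from collections import Counter

# Assumed wording of the ping-timeout entry, taken from the log line quoted above.
PING_RE = re.compile(
    r"rpc_clnt_ping_timer_expired\] (\S+): server (\S+) has not responded"
)


def count_ping_timeouts(lines):
    """Count ping-timer disconnects per (client translator, server address)."""
    counts = Counter()
    for line in lines:
        match = PING_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```
</pre>
                              Calling counts.most_common() on the result lists the most affected pairs first.<br>
                              <br>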
                              Interestingly enough, the IP address
                              X.X.18.62 belongs to the same machine! As I wrote
                              earlier, each node serves both as a server
                              and a client, as each node contributes
                              bricks to the volume. Every server is
                              connecting to itself via its hostname. For
                              example, the fstab on the node "giant2"
                              looks like:<br>
                              <br>
                              #giant2:/gv0    /shared_data   
                              glusterfs       defaults,noauto 0       0<br>
                              #giant2:/gv2    /shared_slurm  
                              glusterfs       defaults,noauto 0       0<br>
                              <br>
                              giant2:/gv0     /shared_data   
                              nfs             defaults,_netdev,vers=3
                              0       0<br>
                              giant2:/gv2     /shared_slurm  
                              nfs             defaults,_netdev,vers=3
                              0       0<br>
                              <br>
                              So I understand the disconnects even less.
                              <br>
                              <br>
                              I don't know if it's possible to create a
                              dummy cluster which exhibits the same
                              behaviour, because the disconnects only
                              happen when there are compute jobs running
                              on those nodes - and they are GPU compute
                              jobs, so that's something which cannot be
                              easily emulated in a VM.<br>
                              <br>
                              As we have more clusters (which are
                              running fine with an ancient 3.4 version
                              :-)) and we are currently not dependent on
                              this particular cluster (which may stay
                              like this for this month, I think) I
                              should be able to deploy the debug build
                              on the "real" cluster, if you can provide
                              a debug build.<br>
                              <br>
                              Regards and thanks,<br>
                              Micha<br>
                              <br>
                              <br>
                              <br>
                              On 06.12.2016 at 08:15, Mohammed
                              Rafi K C wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <p><br>
                              </p>
                              <br>
                              <div
                                class="m_4766802258719003127moz-cite-prefix">On
                                12/03/2016 12:56 AM, Micha Ober wrote:<br>
                              </div>
                              <blockquote type="cite">
                                <div
                                  class="m_4766802258719003127moz-cite-prefix"><tt>**
                                    Update: ** I have downgraded from
                                    3.8.6 to 3.7.17 now, but the problem
                                    still exists.</tt><tt><br>
                                  </tt></div>
                              </blockquote>
                              <blockquote type="cite">
                                <div
                                  class="m_4766802258719003127moz-cite-prefix"><tt>
                                  </tt><tt><br>
                                  </tt><tt>Client log: <a
                                      moz-do-not-send="true"
                                      class="moz-txt-link-freetext"
                                      href="http://paste.ubuntu.com/23569065/">http://paste.ubuntu.com/23569065/</a></tt><tt><br>
                                  </tt><tt>Brick log: <a
                                      moz-do-not-send="true"
                                      class="moz-txt-link-freetext"
                                      href="http://paste.ubuntu.com/23569067/">http://paste.ubuntu.com/23569067/</a></tt><tt><br>
                                  </tt><tt><br>
                                  </tt><tt>Please note that each server
                                    has two bricks.</tt><tt><br>
                                  </tt><tt>However, according to the
                                    logs, one brick loses the connection
                                    to all other hosts:</tt><tt><br>
                                  </tt>
                                  <pre style="color:rgb(0,0,0);font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px">[2016-12-02 18:38:53.703301] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.219:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703381] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.62:49118 failed (Broken pipe)
[2016-12-02 18:38:53.703380] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.107:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703424] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.206:49120 failed (Broken pipe)
[2016-12-02 18:38:53.703359] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.58:49121 failed (Broken pipe)

The SECOND brick on the SAME host is NOT affected, i.e. no disconnects!
As I said, the network connection is fine and the disks are idle.
The CPU always has 2 free cores.

It looks like I have to downgrade to 3.4 now in order for the disconnects to stop.</pre>
                                </div>
                              </blockquote>
                              <br>
                              Hi Micha,<br>
                              <br>
                              Thanks for the update, and sorry for what
                              happened with the newer gluster versions. I
                              can understand the need to downgrade, as
                              it is a production setup.<br>
                              <br>
                              Can you tell me which clients are used here:
                              FUSE, NFS, NFS-Ganesha, SMB,
                              or libgfapi?<br>
                              <br>
                              Since I'm not able to reproduce the issue
                              (I have been trying for the last 3 days) and
                              the logs are not very helpful here (we
                              don't log much in the socket layer),
                              could you please create a dummy cluster
                              and try to reproduce the issue? If so, we
                              can experiment with that volume, and I could
                              provide a debug build which we can use
                              for further debugging.<br>
                              <br>
                              If you don't have bandwidth for this,
                              please leave it ;).<br>
                              <br>
                              Regards<br>
                              Rafi KC<br>
                              <br>
                              <blockquote type="cite">
                                <div
                                  class="m_4766802258719003127moz-cite-prefix">
                                  <pre style="color:rgb(0,0,0);font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px">- Micha
</pre>
                                  <br>
                                  On 30.11.2016 at 06:57,
                                  Mohammed Rafi K C wrote:<br>
                                </div>
                                <blockquote type="cite">
                                  <p>Hi Micha,</p>
                                  <p>I have changed the thread and
                                    subject so that your original thread
                                    stays focused on your query. Let's
                                    try to fix the problem you
                                    observed with 3.8.4; I have
                                    started a new thread to discuss the
                                    frequent disconnect problem.</p>
                                  <p><b>If anyone else has experienced
                                      the same problem, please respond
                                      to the mail.</b><br>
                                  </p>
                                  <p>It would be very helpful if you
                                    could give us some more logs from
                                    clients and bricks. Also, any
                                    reproduction steps will surely help
                                    to chase the problem further.</p>
                                  <p>Regards</p>
                                  <p>Rafi KC<br>
                                  </p>
                                  <div
                                    class="m_4766802258719003127moz-cite-prefix">On
                                    11/30/2016 04:44 AM, Micha Ober
                                    wrote:<br>
                                  </div>
                                  <blockquote type="cite">
                                    <div dir="ltr">
                                      <div>
                                        <div><font face="monospace,
                                            monospace">I had opened
                                            another thread on this
                                            mailing list (Subject:
                                            "After upgrade from 3.4.2 to
                                            3.8.5 - High CPU usage
                                            resulting in disconnects and
                                            split-brain").</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">The title may be
                                            a bit misleading now, as I
                                            am no longer observing high
                                            CPU usage after upgrading to
                                            3.8.6, but the disconnects
                                            are still happening and the
                                            number of files in
                                            split-brain is growing.</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">Setup: 6 compute
                                            nodes, each serving as a
                                            glusterfs server and client,
                                            Ubuntu 14.04, two bricks per
                                            node, distribute-replicate</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">I have two
                                            gluster volumes set up (one
                                            for scratch data, one for
                                            the slurm scheduler). Only
                                            the scratch data volume
                                            shows critical errors "[...]
                                            has not responded in the
                                            last 42 seconds,
                                            disconnecting.". So I can
                                            rule out network problems;
                                            the gigabit link between the
                                            nodes is not saturated at
                                            all. The disks are almost
                                            idle (&lt;10%).</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">I have glusterfs
                                            3.4.2 on Ubuntu 12.04 on
                                            another compute cluster,
                                            running fine since it was
                                            deployed.</font></div>
                                        <div><font face="monospace,
                                            monospace">I had glusterfs
                                            3.4.2 on Ubuntu 14.04 on
                                            this cluster, running fine
                                            for almost a year.</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">After upgrading
                                            to 3.8.5, the problems (as
                                            described) started. I would
                                            like to use some of the new
                                            features of the newer
                                            versions (like bitrot), but
                                            the users can't run their
                                            compute jobs right now
                                            because the result files are
                                            garbled.</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">There also seems
                                            to be a bug report with a
                                            similar problem (but no
                                            progress):</font></div>
                                        <div><font face="monospace,
                                            monospace"><a
                                              moz-do-not-send="true"
                                              class="moz-txt-link-freetext"
href="https://bugzilla.redhat.com/show_bug.cgi?id=1370683">https://bugzilla.redhat.com/show_bug.cgi?id=1370683</a></font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">For me, ALL
                                            servers are affected (not
                                            isolated to one or two
                                            servers)</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">I also see
                                            messages like "INFO:
                                              task gpu_graphene_bv:4476
                                              blocked for more than 120
                                              seconds." in the
                                            syslog.</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">For completeness
                                            (gv0 is the scratch volume,
                                            gv2 the slurm volume):</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">[root@giant2: ~]#
                                            gluster v info</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">Volume Name: gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Type:
                                            Distributed-Replicate</font></div>
                                        <div><font face="monospace,
                                            monospace">Volume ID:
                                            993ec7c9-e4bc-44d0-b7c4-<wbr>2d977e622e86</font></div>
                                        <div><font face="monospace,
                                            monospace">Status: Started</font></div>
                                        <div><font face="monospace,
                                            monospace">Snapshot Count: 0</font></div>
                                        <div><font face="monospace,
                                            monospace">Number of Bricks:
                                            6 x 2 = 12</font></div>
                                        <div><font face="monospace,
                                            monospace">Transport-type:
                                            tcp</font></div>
                                        <div><font face="monospace,
                                            monospace">Bricks:</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick1:
                                            giant1:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick2:
                                            giant2:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick3:
                                            giant3:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick4:
                                            giant4:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick5:
                                            giant5:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick6:
                                            giant6:/gluster/sdc/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick7:
                                            giant1:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick8:
                                            giant2:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick9:
                                            giant3:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick10:
                                            giant4:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick11:
                                            giant5:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick12:
                                            giant6:/gluster/sdd/gv0</font></div>
                                        <div><font face="monospace,
                                            monospace">Options
                                            Reconfigured:</font></div>
                                        <div><font face="monospace,
                                            monospace">auth.allow:
                                            X.X.X.*,127.0.0.1</font></div>
                                        <div><font face="monospace,
                                            monospace">nfs.disable: on</font></div>
                                        <div><font face="monospace,
                                            monospace"><br>
                                          </font></div>
                                        <div><font face="monospace,
                                            monospace">Volume Name: gv2</font></div>
                                        <div><font face="monospace,
                                            monospace">Type: Replicate</font></div>
                                        <div><font face="monospace,
                                            monospace">Volume ID:
                                            30c78928-5f2c-4671-becc-<wbr>8deaee1a7a8d</font></div>
                                        <div><font face="monospace,
                                            monospace">Status: Started</font></div>
                                        <div><font face="monospace,
                                            monospace">Snapshot Count: 0</font></div>
                                        <div><font face="monospace,
                                            monospace">Number of Bricks:
                                            1 x 2 = 2</font></div>
                                        <div><font face="monospace,
                                            monospace">Transport-type:
                                            tcp</font></div>
                                        <div><font face="monospace,
                                            monospace">Bricks:</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick1:
                                            giant1:/gluster/sdd/gv2</font></div>
                                        <div><font face="monospace,
                                            monospace">Brick2:
                                            giant2:/gluster/sdd/gv2</font></div>
                                        <div><font face="monospace,
                                            monospace">Options
                                            Reconfigured:</font></div>
                                        <div><font face="monospace,
                                            monospace">auth.allow:
                                            X.X.X.*,127.0.0.1</font></div>
                                        <div><font face="monospace,
                                            monospace">cluster.granular-entry-heal:
                                            on</font></div>
                                        <div><font face="monospace,
                                            monospace">cluster.locking-scheme:
                                            granular</font></div>
                                        <div><font face="monospace,
                                            monospace">nfs.disable: on</font></div>
                                        <div
                                          style="font-family:monospace,monospace"><br>
                                        </div>
                                      </div>
                                    </div>
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote">2016-11-30
                                        0:10 GMT+01:00 Micha Ober <span
                                          dir="ltr">&lt;<a
                                            moz-do-not-send="true"
                                            class="moz-txt-link-abbreviated"
href="mailto:micha2k@gmail.com">micha2k@gmail.com</a>&gt;</span>:<br>
                                        <blockquote class="gmail_quote"
                                          style="margin:0 0 0
                                          .8ex;border-left:1px #ccc
                                          solid;padding-left:1ex">
                                          <div dir="ltr">
                                            <div
                                              style="font-family:monospace,monospace">There
                                              also seems to be a bug
                                              report with a similar
                                              problem (but no progress):</div>
                                            <div><font face="monospace,
                                                monospace"><a
                                                  moz-do-not-send="true"
class="moz-txt-link-freetext" href="https://bugzilla.redhat.com/show_bug.cgi?id=1370683">https://bugzilla.redhat.com/show_bug.cgi?id=1370683</a></font><br>
                                            </div>
                                            <div><font face="monospace,
                                                monospace"><br>
                                              </font></div>
                                            <div><font face="monospace,
                                                monospace">For me, ALL
                                                servers are affected
                                                (not isolated to one or
                                                two servers)</font></div>
                                            <div><font face="monospace,
                                                monospace"><br>
                                              </font></div>
                                            <div><font face="monospace,
                                                monospace">I also see
                                                messages like "INFO: task
                                                gpu_graphene_bv:4476
                                                blocked for more than
                                                120 seconds." in
                                                the syslog.</font></div>
                                            <div><font face="monospace,
                                                monospace"><br>
                                              </font></div>
                                            <div><font face="monospace,
                                                monospace">For
                                                completeness (gv0 is the
                                                scratch volume, gv2 the
                                                slurm volume):</font></div>
                                            <div><font face="monospace,
                                                monospace"><br>
                                              </font></div>
                                            <div><font face="monospace,
                                                monospace">
                                                <div>[root@giant2: ~]#
                                                  gluster v info</div>
                                                <div><br>
                                                </div>
                                                <div>Volume Name: gv0</div>
                                                <div>Type:
                                                  Distributed-Replicate</div>
                                                <div>Volume ID:
                                                  993ec7c9-e4bc-44d0-b7c4-2d977e<wbr>622e86</div>
                                                <div>Status: Started</div>
                                                <div>Snapshot Count: 0</div>
                                                <div>Number of Bricks: 6
                                                  x 2 = 12</div>
                                                <div>Transport-type: tcp</div>
                                                <div>Bricks:</div>
                                                <div>Brick1:
                                                  giant1:/gluster/sdc/gv0</div>
                                                <div>Brick2:
                                                  giant2:/gluster/sdc/gv0</div>
                                                <div>Brick3:
                                                  giant3:/gluster/sdc/gv0</div>
                                                <div>Brick4:
                                                  giant4:/gluster/sdc/gv0</div>
                                                <div>Brick5:
                                                  giant5:/gluster/sdc/gv0</div>
                                                <div>Brick6:
                                                  giant6:/gluster/sdc/gv0</div>
                                                <div>Brick7:
                                                  giant1:/gluster/sdd/gv0</div>
                                                <div>Brick8:
                                                  giant2:/gluster/sdd/gv0</div>
                                                <div>Brick9:
                                                  giant3:/gluster/sdd/gv0</div>
                                                <div>Brick10:
                                                  giant4:/gluster/sdd/gv0</div>
                                                <div>Brick11:
                                                  giant5:/gluster/sdd/gv0</div>
                                                <div>Brick12:
                                                  giant6:/gluster/sdd/gv0</div>
                                                <div>Options
                                                  Reconfigured:</div>
                                                <div>auth.allow:
                                                  X.X.X.*,127.0.0.1</div>
                                                <div>nfs.disable: on</div>
                                                <div><br>
                                                </div>
                                                <div>Volume Name: gv2</div>
                                                <div>Type: Replicate</div>
                                                <div>Volume ID:
                                                  30c78928-5f2c-4671-becc-8deaee<wbr>1a7a8d</div>
                                                <div>Status: Started</div>
                                                <div>Snapshot Count: 0</div>
                                                <div>Number of Bricks: 1
                                                  x 2 = 2</div>
                                                <div>Transport-type: tcp</div>
                                                <div>Bricks:</div>
                                                <div>Brick1:
                                                  giant1:/gluster/sdd/gv2</div>
                                                <div>Brick2:
                                                  giant2:/gluster/sdd/gv2</div>
                                                <div>Options
                                                  Reconfigured:</div>
                                                <div>auth.allow:
                                                  X.X.X.*,127.0.0.1</div>
                                                <div>cluster.granular-entry-heal:
                                                  on</div>
                                                <div>cluster.locking-scheme:
                                                  granular</div>
                                                <div>nfs.disable: on</div>
                                                <div><br>
                                                </div>
                                              </font></div>
                                          </div>
                                          <div
                                            class="m_4766802258719003127HOEnZb">
                                            <div
                                              class="m_4766802258719003127h5">
                                              <div class="gmail_extra"><br>
                                                <div class="gmail_quote">2016-11-29
                                                  19:21 GMT+01:00 Micha
                                                  Ober <span dir="ltr">&lt;<a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
                                                      href="mailto:micha2k@gmail.com">micha2k@gmail.com</a>&gt;</span>:<br>
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0 0 0
                                                    .8ex;border-left:1px
                                                    #ccc
                                                    solid;padding-left:1ex">
                                                    <div dir="ltr">
                                                      <div
                                                        style="font-family:monospace,monospace">I
                                                        had opened
                                                        another thread
                                                        on this mailing
                                                        list (Subject:
                                                        "After upgrade
                                                        from 3.4.2 to
                                                        3.8.5 - High CPU
                                                        usage resulting
                                                        in disconnects
                                                        and
                                                        split-brain").</div>
                                                      <div
                                                        style="font-family:monospace,monospace"><br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace">The
                                                        title may be a
                                                        bit misleading
                                                        now, as I am no
                                                        longer observing
                                                        high CPU usage
                                                        after upgrading
                                                        to 3.8.6, but
                                                        the disconnects
                                                        are still
                                                        happening and
                                                        the number of
                                                        files in
                                                        split-brain is
                                                        growing.<br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace"><br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace">Setup:
                                                        6 compute nodes,
                                                        each serving as
                                                        a glusterfs
                                                        server and
                                                        client, Ubuntu
                                                        14.04, two
                                                        bricks per node,
distribute-replicate</div>
                                                      <div
                                                        style="font-family:monospace,monospace"><br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace">I
                                                        have two gluster
                                                        volumes set up
                                                        (one for scratch
                                                        data, one for
                                                        the slurm
                                                        scheduler). Only
                                                        the scratch data
                                                        volume shows
                                                        critical errors
                                                        "[...] has not
                                                        responded in the
                                                        last 42 seconds,
                                                        disconnecting.".
                                                        So I can rule
                                                        out network
                                                        problems: the
                                                        gigabit link
                                                        between the
                                                        nodes is not
                                                        saturated at
                                                        all. The disks
                                                        are almost idle
                                                        (&lt;10%).</div>
                                                      <div
                                                        style="font-family:monospace,monospace"><br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace">I
                                                        have glusterfs
                                                        3.4.2 on Ubuntu
                                                        12.04 on
                                                        another compute
                                                        cluster, running
                                                        fine since it
                                                        was deployed.</div>
                                                      <div
                                                        style="font-family:monospace,monospace">I
                                                        had glusterfs
                                                        3.4.2 on Ubuntu
                                                        14.04 on this
                                                        cluster, running
                                                        fine for almost
                                                        a year.</div>
                                                      <div
                                                        style="font-family:monospace,monospace"><br>
                                                      </div>
                                                      <div
                                                        style="font-family:monospace,monospace">After
                                                        upgrading to
                                                        3.8.5, the
                                                        problems (as
                                                        described)
                                                        started. I would
                                                        like to use some
                                                        of the new
                                                        features of the
                                                        newer versions
                                                        (like bitrot),
                                                        but the users
                                                        can't run their
                                                        compute jobs
                                                        right now
                                                        because the
                                                        result files are
                                                        garbled.</div>
                                                    </div>
                                                    <div
                                                      class="m_4766802258719003127m_-1578094958703753071HOEnZb">
                                                      <div
                                                        class="m_4766802258719003127m_-1578094958703753071h5">
                                                        <div
                                                          class="gmail_extra"><br>
                                                          <div
                                                          class="gmail_quote">2016-11-29
                                                          18:53
                                                          GMT+01:00 Atin
                                                          Mukherjee <span
                                                          dir="ltr">&lt;<a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
                                                          href="mailto:amukherj@redhat.com">amukherj@redhat.com</a>&gt;</span>:<br>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div style="white-space:pre-wrap">Would you be able to share what is not working for you in 3.8.x (please mention the exact version)? 3.4 is quite old, and falling back to an unsupported version doesn't look like a feasible option.</div>
                                                          <br>
                                                          <div
                                                          class="gmail_quote">
                                                          <div>
                                                          <div
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209h5">
                                                          <div dir="ltr">On
                                                          Tue, 29 Nov
                                                          2016 at 17:01,
                                                          Micha Ober
                                                          &lt;<a
                                                          moz-do-not-send="true"
class="moz-txt-link-abbreviated" href="mailto:micha2k@gmail.com">micha2k@gmail.com</a>&gt;
                                                          wrote:<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <blockquote
                                                          class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div>
                                                          <div
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209h5">
                                                          <div dir="ltr"
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">Hi,</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace"><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">I was using gluster 3.4 and
                                                          upgraded to
                                                          3.8, but that
                                                          version proved
                                                          to be unusable
                                                          for me. I now
                                                          need to
                                                          downgrade.</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace"><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">I'm running Ubuntu 14.04. As
                                                          upgrades of
                                                          the op version
are irreversible, I guess I have to delete all gluster volumes and
                                                          re-create them
                                                          with the
                                                          downgraded
                                                          version. </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace"><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">0. Back up data</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">1. Unmount all gluster volumes</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">2. apt-get purge
                                                          glusterfs-server
glusterfs-client</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">3. Remove PPA for 3.8</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">4. Add PPA for older version</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">5. apt-get install
                                                          glusterfs-server
glusterfs-client</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">6. Create volumes</div>
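<div style="font-family:monospace,monospace">As a rough sketch, steps 1 to 5 above could look like the following. This is a dry run that only prints the commands; the mount point and both PPA names are placeholders I am assuming, not values confirmed anywhere in this thread, and the backup (step 0) and volume creation (step 6) are left out on purpose:</div>
<pre>
```shell
#!/bin/sh
# Dry-run sketch: prints the commands for steps 1-5 instead of running them.
# All three values below are assumptions (placeholders); adjust before use.
MOUNT_POINT="${MOUNT_POINT:-/mnt/glustervol}"    # hypothetical client mount point
OLD_PPA="${OLD_PPA:-ppa:gluster/glusterfs-3.8}"  # assumed name of the 3.8 PPA
NEW_PPA="${NEW_PPA:-ppa:gluster/glusterfs-3.7}"  # assumed PPA for the older version

plan() {
    echo "umount $MOUNT_POINT"                                # step 1
    echo "apt-get purge glusterfs-server glusterfs-client"    # step 2
    echo "add-apt-repository --remove $OLD_PPA"               # step 3
    echo "add-apt-repository $NEW_PPA"                        # step 4
    echo "apt-get update"                                     # refresh package lists
    echo "apt-get install glusterfs-server glusterfs-client"  # step 5
}

plan
```
</pre>
<div style="font-family:monospace,monospace">Printing the plan first makes it easy to review the exact commands on each node before running any of them as root.</div>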
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace"><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">Is "purge" enough to delete all
configuration files of the currently installed version, or do I need to
manually clear any leftovers before installing an older version?</div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace"><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          </div>
                                                          <div
                                                          class="m_4766802258719003127gmail_default
m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg"
style="font-family:monospace,monospace">Thanks.</div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <span>
                                                          ______________________________<wbr>_________________<br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          Gluster-users
                                                          mailing list<br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          <a
                                                          moz-do-not-send="true"
class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209m_-2705140003504720857gmail_msg">
                                                          <a
                                                          moz-do-not-send="true"
class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></span></blockquote>
                                                          </div>
                                                          <span
class="m_4766802258719003127m_-1578094958703753071m_-2811647508981727209HOEnZb"><font
color="#888888">
                                                          <div dir="ltr">--
                                                          <br>
                                                          </div>
                                                          <div
                                                          data-smartmail="gmail_signature">-
                                                          Atin (atinm)</div>
                                                          </font></span></blockquote>
                                                          </div>
                                                          <br>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                </div>
                                                <br>
                                              </div>
                                            </div>
                                          </div>
                                        </blockquote>
                                      </div>
                                      <br>
                                    </div>
                                    <br>
                                    <fieldset
                                      class="m_4766802258719003127mimeAttachmentHeader"></fieldset>
                                    <br>
              </blockquote>
              

            </blockquote>
            <p>

            </p>
          </blockquote>
          

        </blockquote>
        <p>

        </p>
      </blockquote>
      

    </blockquote>
    <p>

    </p>
  </div></div></div>

</blockquote></div>


-- 
<div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr">
</div><div>~ Atin (atinm)
</div></div></div></div>
</div></div>



</blockquote>



</blockquote>



</blockquote><p>
</p></body></html>