<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <div class="moz-cite-prefix">On 03/23/2015 11:28 AM, Jonathan Heese
      wrote:<br>
    </div>
    <blockquote
      cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div>On Mar 23, 2015, at 1:20 AM, "Mohammed Rafi K C" &lt;<a
          moz-do-not-send="true" href="mailto:rkavunga@redhat.com">rkavunga@redhat.com</a>&gt;
        wrote:<br>
        <br>
      </div>
      <blockquote type="cite">
        <div><br>
          <div class="moz-cite-prefix">On 03/21/2015 07:49 PM, Jonathan
            Heese wrote:<br>
          </div>
          <blockquote
            cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
            type="cite">
            <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
              <p>Mohamed,</p>
              <p><br>
              </p>
              <p>I have completed the steps you suggested (unmount all,
                stop the volume, set the config.transport to tcp, start
                the volume, mount, etc.), and the behavior has indeed
                changed.</p>
              <p><br>
              </p>
              <pre>
[root@duke ~]# gluster volume info

Volume Name: gluster_disk
Type: Replicate
Volume ID: 2307a5a8-641e-44f4-8eaf-7cc2b704aafd
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: duke-ib:/bricks/brick1
Brick2: duchess-ib:/bricks/brick1
Options Reconfigured:
config.transport: tcp
</pre>
              <p><br>
              </p>
              <pre>
[root@duke ~]# gluster volume status
Status of volume: gluster_disk
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick duke-ib:/bricks/brick1                            49152   Y       16362
Brick duchess-ib:/bricks/brick1                         49152   Y       14155
NFS Server on localhost                                 2049    Y       16374
Self-heal Daemon on localhost                           N/A     Y       16381
NFS Server on duchess-ib                                2049    Y       14167
Self-heal Daemon on duchess-ib                          N/A     Y       14174

Task Status of Volume gluster_disk
------------------------------------------------------------------------------
There are no active volume tasks
</pre>
              <p>I am no longer seeing the I/O errors during prolonged
                periods of write I/O that I was seeing when the
                transport was set to rdma. However, I am seeing this
                message on both nodes every 3 seconds (almost exactly):</p>
              <p><br>
              </p>
              <pre>
==&gt; /var/log/glusterfs/nfs.log &lt;==
[2015-03-21 14:17:40.379719] W [rdma.c:1076:gf_rdma_cm_event_handler] 0-gluster_disk-client-1: cma event RDMA_CM_EVENT_REJECTED, error 8 (me:10.10.10.1:1023 peer:10.10.10.2:49152)
</pre>
              <p><br>
              </p>
              <p>Is this something to worry about? </p>
            </div>
          </blockquote>
          If you are not using NFS to export the volumes, there is
          nothing to worry about. <br>
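          To double-check whether the gluster NFS server is even exporting
          anything your clients could be using, showmount against either node
          will list its exports (a quick sanity check, assuming the standard
          nfs-utils client tools are installed):
          <pre>
# showmount -e localhost    # lists volumes exported by the gluster NFS server
</pre>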
        </div>
      </blockquote>
      <div><br>
      </div>
      I'm using the native glusterfs FUSE component to mount the volume
      locally on both servers -- I assume that you're referring to the
      standard NFS protocol stuff, which I'm not using here.
      <div><br>
      </div>
      <div>Incidentally, I would like to keep my logs from filling up
        with junk if possible.  Is there something I can do to get rid
        of these (useless?) error messages?<br>
      </div>
    </blockquote>
    <br>
    If I understand correctly, you are now seeing this flood of log
    messages in the nfs log only, and all the other logs are fine,
    right? If that is the case, and you are not using NFS to export
    the volume at all, then as a workaround you can disable NFS for
    your volume (gluster volume set &lt;volname&gt; nfs.disable on).
    This will turn off the gluster NFS server, and you will no longer
    get those log messages.<br>
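    For example, with the volume name from your output (a sketch; run
    on either node):
    <pre>
# gluster volume set gluster_disk nfs.disable on
# gluster volume status gluster_disk    # the "NFS Server" lines should no longer be listed
</pre>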
    <br>
    <br>
    <blockquote
      cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
      type="cite">
      <div>
        <div>
          <blockquote type="cite">
            <div>
              <blockquote
                cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
                type="cite">
                <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
                  <p>Any idea why there are rdma pieces in play when
                    I've set my transport to tcp?</p>
                </div>
              </blockquote>
              <br>
              There should not be any rdma pieces left. If possible, can
              you paste the volfile for the nfs server? You can find the
              volfile at /var/lib/glusterd/nfs/nfs-server.vol or
              /usr/local/var/lib/glusterd/nfs/nfs-server.vol<br>
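              A quick way to spot any leftover rdma configuration in that
              volfile (a sketch, assuming the first path; adjust if your
              install uses the other):
              <pre>
# grep -n -i "transport" /var/lib/glusterd/nfs/nfs-server.vol
</pre>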
            </div>
          </blockquote>
          <div><br>
          </div>
          <div>I will get this for you when I can.  Thanks.</div>
        </div>
      </div>
    </blockquote>
    <br>
    If you can get that, it would be a great help in understanding the
    problem.<br>
    <br>
    <br>
    Rafi KC<br>
    <br>
    <blockquote
      cite="mid:3BBF89B7-2F55-48C8-A93B-CA6BE22AFD12@inetu.net"
      type="cite">
      <div>
        <div>
          <div><br>
          </div>
          <div>Regards,</div>
          <div>Jon Heese</div>
          <br>
          <blockquote type="cite">
            <div>Rafi KC<br>
              <blockquote
                cite="mid:9db8a1f4e38b4ba8abc485483ef76696@int-exch6.int.inetu.net"
                type="cite">
                <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
                  <p>The actual I/O appears to be handled properly and
                    I've seen no further errors in the testing I've done
                    so far.</p>
                  <p><br>
                  </p>
                  <p>Thanks.<br>
                  </p>
                  <p><br>
                  </p>
                  <p>Regards,</p>
                  <p>Jon Heese</p>
                  <p><br>
                  </p>
                  <div style="color: rgb(40, 40, 40);" dir="auto">
                    <hr tabindex="-1" style="display:inline-block;
                      width:98%">
                    <div id="divRplyFwdMsg" dir="ltr"><font
                        style="font-size:11pt" color="#000000"
                        face="Calibri, sans-serif"><b>From:</b>
                        <a moz-do-not-send="true"
                          class="moz-txt-link-abbreviated"
                          href="mailto:gluster-users-bounces@gluster.org">
                          gluster-users-bounces@gluster.org</a> <a
                          moz-do-not-send="true"
                          class="moz-txt-link-rfc2396E"
                          href="mailto:gluster-users-bounces@gluster.org">
                          &lt;gluster-users-bounces@gluster.org&gt;</a>
                        on behalf of Jonathan Heese <a
                          moz-do-not-send="true"
                          class="moz-txt-link-rfc2396E"
                          href="mailto:jheese@inetu.net">
                          &lt;jheese@inetu.net&gt;</a><br>
                        <b>Sent:</b> Friday, March 20, 2015 7:04 AM<br>
                        <b>To:</b> Mohammed Rafi K C<br>
                        <b>Cc:</b> gluster-users<br>
                        <b>Subject:</b> Re: [Gluster-users] I/O error on
                        replicated volume</font>
                      <div> </div>
                    </div>
                    <div>
                      <div>Mohammed,</div>
                      <div><br>
                      </div>
                      <div>Thanks very much for the reply.  I will try
                        that and report back.<br>
                        <br>
                        Regards,
                        <div>Jon Heese</div>
                      </div>
                      <div><br>
                        On Mar 20, 2015, at 3:26 AM, "Mohammed Rafi K C"
                        &lt;<a moz-do-not-send="true"
                          href="mailto:rkavunga@redhat.com">rkavunga@redhat.com</a>&gt;
                        wrote:<br>
                        <br>
                      </div>
                      <blockquote type="cite">
                        <div><br>
                          <div class="moz-cite-prefix">On 03/19/2015
                            10:16 PM, Jonathan Heese wrote:<br>
                          </div>
                          <blockquote type="cite">
                            <div class="WordSection1">
                              <p class="MsoNormal"><a
                                  moz-do-not-send="true"
                                  name="_MailEndCompose"><span
                                    style="color:#1F497D">Hello all,</span></a></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Does anyone else
                                  have any further suggestions for
                                  troubleshooting this?</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">To sum up: I
                                  have a 2 node 2 brick replicated
                                  volume, which holds a handful of iSCSI
                                  image files which are mounted and
                                  served up by tgtd (CentOS 6) to a
                                  handful of devices on a dedicated
                                  iSCSI network.  The most important
                                  iSCSI clients (initiators) are four
                                  VMware ESXi 5.5 hosts that use the
                                  iSCSI volumes as backing for their
                                  datastores for virtual machine
                                  storage.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">After a few
                                  minutes of sustained writing to the
                                  volume, I am seeing a massive flood
                                  (over 1500 per second at times) of
                                  this error in
                                  /var/log/glusterfs/mnt-gluster-disk.log:</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  02:24:07.582801] W
                                  [fuse-bridge.c:2242:fuse_writev_cbk]
                                  0-glusterfs-fuse: 635358: WRITE =&gt;
                                  -1 (Input/output error)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">When this
                                  happens, the ESXi box fails its write
                                  operation and returns an error to the
                                  effect of “Unable to write data to
                                  datastore”.  I don’t see anything else
                                  in the supporting logs to explain the
                                  root cause of the i/o errors.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Any and all
                                  suggestions are appreciated.  Thanks.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                            </div>
                          </blockquote>
                          <br>
                          From the mount logs, I assume that your volume
                          transport type is rdma. There are some known
                          issues with rdma in 3.5.3, and the patches to
                          address those issues have already been sent
                          upstream [1]. From the logs alone, it is hard to
                          tell whether this problem is related to the rdma
                          transport or not. To confirm whether the tcp
                          transport works well in this scenario, can you
                          try to reproduce the same problem using a
                          tcp-type volume, if possible? You can change the
                          transport type of a volume with the following
                          steps (not recommended in normal use):<br>
                          <br>
                          1) unmount every client<br>
                          2) stop the volume<br>
                          3) run gluster volume set volname
                          config.transport tcp<br>
                          4) start the volume again<br>
                          5) mount the clients<br>
                          <br>
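                          As a concrete sketch of those steps, using the
                          volume name from your setup (the mount point and
                          mount server here are assumptions; adjust them
                          to match your clients):<br>
                          <pre>
# on every client:
umount /mnt/gluster-disk

# on one gluster node:
gluster volume stop gluster_disk
gluster volume set gluster_disk config.transport tcp
gluster volume start gluster_disk

# on every client again:
mount -t glusterfs duke-ib:/gluster_disk /mnt/gluster-disk
</pre>
                          <br>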
                          [1] : <a moz-do-not-send="true"
                            class="moz-txt-link-freetext"
                            href="http://goo.gl/2PTL61">
                            http://goo.gl/2PTL61</a><br>
                          <br>
                          Regards<br>
                          Rafi KC<br>
                          <br>
                          <blockquote type="cite">
                            <div class="WordSection1">
                              <div>
                                <p class="MsoNormal" style=""><i><span
                                      style="font-size:16.0pt;
                                      font-family:&quot;Georgia&quot;,serif;
                                      color:#0F5789">Jon Heese</span></i><span
                                    style=""><br>
                                  </span><i><span style="color:#333333">Systems
                                      Engineer</span></i><span style=""><br>
                                  </span><b><span style="color:#333333">INetU
                                      Managed Hosting</span></b><span
                                    style=""><br>
                                  </span><span style="color:#333333">P:
                                    610.266.7441 x 261</span><span
                                    style=""><br>
                                  </span><span style="color:#333333">F:
                                    610.266.7434</span><span style=""><br>
                                  </span><a moz-do-not-send="true"
                                    href="https://www.inetu.net/"><span
                                      style="color:blue">www.inetu.net</span></a><span
                                    style=""></span></p>
                                <p class="MsoNormal"><i><span
                                      style="font-size:8.0pt;
                                      color:#333333">** This message
                                      contains confidential information,
                                      which also may be privileged, and
                                      is intended only for the person(s)
                                      addressed above. Any unauthorized
                                      use, distribution, copying or
                                      disclosure of confidential and/or
                                      privileged information is strictly
                                      prohibited. If you have received
                                      this communication in error,
                                      please erase all copies of the
                                      message and its attachments and
                                      notify the sender immediately via
                                      reply e-mail. **</span></i><span
                                    style="color:#1F497D"></span></p>
                              </div>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <div>
                                <div style="border:none;
                                  border-top:solid #E1E1E1 1.0pt;
                                  padding:3.0pt 0in 0in 0in">
                                  <p class="MsoNormal"><b><span
                                        style="color:windowtext">From:</span></b><span
                                      style="color:windowtext"> Jonathan
                                      Heese
                                      <br>
                                      <b>Sent:</b> Tuesday, March 17,
                                      2015 12:36 PM<br>
                                      <b>To:</b> 'Ravishankar N'; <a
                                        moz-do-not-send="true"
                                        class="moz-txt-link-abbreviated"
href="mailto:gluster-users@gluster.org">
                                        gluster-users@gluster.org</a><br>
                                      <b>Subject:</b> RE:
                                      [Gluster-users] I/O error on
                                      replicated volume</span></p>
                                </div>
                              </div>
                              <p class="MsoNormal"> </p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Ravi,</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">The last lines
                                  in the mount log before the massive
                                  vomit of I/O errors are from 22
                                  minutes prior, and seem innocuous to
                                  me:</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:07.126340] E
                                  [client-handshake.c:1760:client_query_portmap_cbk]
                                  0-gluster_disk-client-0: failed to get
                                  the port number for remote subvolume.
                                  Please run 'gluster volume status' on
                                  server to see if brick process is
                                  running.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:07.126587] W
                                  [rdma.c:4273:gf_rdma_disconnect]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
                                  [0x7fd9c557bccf]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
                                  [0x7fd9c557a995]
                                  (--&gt;/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)

                                  [0x7fd9c0d8fb9a])))
                                  0-gluster_disk-client-0: disconnect
                                  called (peer:10.10.10.1:24008)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:07.126687] E
                                  [client-handshake.c:1760:client_query_portmap_cbk]
                                  0-gluster_disk-client-1: failed to get
                                  the port number for remote subvolume.
                                  Please run 'gluster volume status' on
                                  server to see if brick process is
                                  running.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:07.126737] W
                                  [rdma.c:4273:gf_rdma_disconnect]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
                                  [0x7fd9c557bccf]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
                                  [0x7fd9c557a995]
                                  (--&gt;/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)

                                  [0x7fd9c0d8fb9a])))
                                  0-gluster_disk-client-1: disconnect
                                  called (peer:10.10.10.2:24008)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.730165] I
                                  [rpc-clnt.c:1729:rpc_clnt_reconfig]
                                  0-gluster_disk-client-0: changing port
                                  to 49152 (from 0)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.730276] W
                                  [rdma.c:4273:gf_rdma_disconnect]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
                                  [0x7fd9c557bccf]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
                                  [0x7fd9c557a995]
                                  (--&gt;/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)

                                  [0x7fd9c0d8fb9a])))
                                  0-gluster_disk-client-0: disconnect
                                  called (peer:10.10.10.1:24008)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.739500] I
                                  [rpc-clnt.c:1729:rpc_clnt_reconfig]
                                  0-gluster_disk-client-1: changing port
                                  to 49152 (from 0)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.739560] W
                                  [rdma.c:4273:gf_rdma_disconnect]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)
                                  [0x7fd9c557bccf]
                                  (--&gt;/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
                                  [0x7fd9c557a995]
                                  (--&gt;/usr/lib64/glusterfs/3.5.3/xlator/protocol/client.so(client_query_portmap_cbk+0x1ea)

                                  [0x7fd9c0d8fb9a])))
                                  0-gluster_disk-client-1: disconnect
                                  called (peer:10.10.10.2:24008)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.741883] I
                                  [client-handshake.c:1677:select_server_supported_programs]
                                  0-gluster_disk-client-0: Using Program
                                  GlusterFS 3.3, Num (1298437), Version
                                  (330)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.744524] I
                                  [client-handshake.c:1462:client_setvolume_cbk]
                                  0-gluster_disk-client-0: Connected to
                                  10.10.10.1:49152, attached to remote
                                  volume '/bricks/brick1'.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.744537] I
                                  [client-handshake.c:1474:client_setvolume_cbk]
                                  0-gluster_disk-client-0: Server and
                                  Client lk-version numbers are not
                                  same, reopening the fds</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.744566] I
                                  [afr-common.c:4267:afr_notify]
                                  0-gluster_disk-replicate-0: Subvolume
                                  'gluster_disk-client-0' came back up;
                                  going online.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.744627] I
                                  [client-handshake.c:450:client_set_lk_version_cbk]
                                  0-gluster_disk-client-0: Server lk
                                  version = 1</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.753037] I
                                  [client-handshake.c:1677:select_server_supported_programs]
                                  0-gluster_disk-client-1: Using Program
                                  GlusterFS 3.3, Num (1298437), Version
                                  (330)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.755657] I
                                  [client-handshake.c:1462:client_setvolume_cbk]
                                  0-gluster_disk-client-1: Connected to
                                  10.10.10.2:49152, attached to remote
                                  volume '/bricks/brick1'.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.755676] I
                                  [client-handshake.c:1474:client_setvolume_cbk]
                                  0-gluster_disk-client-1: Server and
                                  Client lk-version numbers are not
                                  same, reopening the fds</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.761945] I
                                  [fuse-bridge.c:5016:fuse_graph_setup]
                                  0-fuse: switched to graph 0</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[2015-03-16
                                  01:37:10.762144] I
                                  [client-handshake.c:450:client_set_lk_version_cbk]
                                  0-gluster_disk-client-1: Server lk
                                  version = 1</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[<b>2015-03-16
                                    01:37:10.762279</b>] I
                                  [fuse-bridge.c:3953:fuse_init]
                                  0-glusterfs-fuse: FUSE inited with
                                  protocol versions: glusterfs 7.22
                                  kernel 7.14</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[<b>2015-03-16
                                    01:59:26.098670</b>] W
                                  [fuse-bridge.c:2242:fuse_writev_cbk]
                                  0-glusterfs-fuse: 292084: WRITE =&gt;
                                  -1 (Input/output error)</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">…</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">I’ve seen no
                                  indication of split-brain on any files
                                  at any point in this (ever since
                                  downgrading from 3.6.2 to 3.5.3, which
                                  is when this particular issue
                                  started):</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">[root@duke
                                  gfapi-module-for-linux-target-driver-]#
                                  gluster v heal gluster_disk info</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Brick
                                  duke.jonheese.local:/bricks/brick1/</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Number of
                                  entries: 0</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Brick
                                  duchess.jonheese.local:/bricks/brick1/</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Number of
                                  entries: 0</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D">Thanks.</span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <p class="MsoNormal"><span
                                  style="color:#1F497D"> </span></p>
                              <div>
                                <div style="border:none;
                                  border-top:solid #E1E1E1 1.0pt;
                                  padding:3.0pt 0in 0in 0in">
                                  <p class="MsoNormal"><b><span
                                        style="color:windowtext">From:</span></b><span
                                      style="color:windowtext">
                                      Ravishankar N [</span><a
                                      moz-do-not-send="true"
                                      href="mailto:ravishankar@redhat.com">mailto:ravishankar@redhat.com</a><span
                                      style="color:windowtext">]
                                      <br>
                                      <b>Sent:</b> Tuesday, March 17,
                                      2015 12:35 AM<br>
                                      <b>To:</b> Jonathan Heese; </span><a
                                      moz-do-not-send="true"
                                      href="mailto:gluster-users@gluster.org">gluster-users@gluster.org</a><span
                                      style="color:windowtext"><br>
                                      <b>Subject:</b> Re:
                                      [Gluster-users] I/O error on
                                      replicated volume</span></p>
                                </div>
                              </div>
                              <p class="MsoNormal"> </p>
                              <p class="MsoNormal"><span
                                  style="font-size:12.0pt"> </span></p>
                              <div>
                                <p class="MsoNormal">On 03/17/2015 02:14
                                  AM, Jonathan Heese wrote:</p>
                              </div>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <div>
                                  <div>
                                    <p class="MsoNormal"
                                      style="background:white"><span
                                        style="font-size:12.0pt">Hello,<br>
                                        <br>
                                        So I resolved my previous issue
                                        with split-brains and the lack
                                        of self-healing by dropping my
                                        installed glusterfs* packages
                                        from 3.6.2 to 3.5.3, but now
                                        I've picked up a new issue,
                                        which actually makes normal use
                                        of the volume practically
                                        impossible.<br>
                                        <br>
                                        A little background for those
                                        not already paying close
                                        attention:<br>
                                        I have a 2 node 2 brick
                                        replicating volume whose purpose
                                        in life is to hold iSCSI target
                                        files, primarily for use to
                                        provide datastores to a VMware
                                        ESXi cluster.  The plan is to
                                        put a handful of image files on
                                        the Gluster volume, mount them
                                        locally on both Gluster nodes,
                                        and run tgtd on both, pointed to
                                        the image files on the mounted
                                        gluster volume. Then the ESXi
                                        boxes will use multipath
                                        (active/passive) iSCSI to
                                        connect to the nodes, with
                                        automatic failover in case of
                                        planned or unplanned downtime of
                                        the Gluster nodes.<br>
                                        <br>
                                        In my most recent round of
                                        testing with 3.5.3, I'm seeing a
                                        massive failure to write data to
                                        the volume after about 5-10
                                        minutes, so I've simplified the
                                        scenario a bit (to minimize the
                                        variables) to: both Gluster
                                        nodes up, only one node (duke)
                                        mounted and running tgtd, and
                                        just regular (single path) iSCSI
                                        from a single ESXi server.<br>
                                        <br>
                                        About 5-10 minutes into
                                        migrating a VM onto the test
                                        datastore, /var/log/messages on
                                        duke gets blasted with a ton of
                                        messages exactly like this:</span></p>
                                    <p class="MsoNormal"
                                      style="background:white">Mar 15
                                      22:24:06 duke tgtd:
                                      bs_rdwr_request(180) io error
                                      0x1781e00 2a -1 512 22971904,
                                      Input/output error</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                    <p class="MsoNormal"
                                      style="background:white">And
                                      /var/log/glusterfs/mnt-gluster_disk.log
                                      gets blasted with a ton of messages
                                      exactly like this:</p>
                                    <p class="MsoNormal"
                                      style="background:white">[2015-03-16
                                      02:24:07.572279] W
                                      [fuse-bridge.c:2242:fuse_writev_cbk]
                                      0-glusterfs-fuse: 635299: WRITE
                                      =&gt; -1 (Input/output error)</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                  </div>
                                </div>
                              </blockquote>
                              <p class="MsoNormal"
                                style="margin-bottom:12.0pt"><span
                                  style=""><br>
                                  Are there any messages in the mount
                                  log from AFR about split-brain just
                                  before the above line appears?<br>
                                  Does `gluster v heal &lt;VOLNAME&gt;
                                  info` show any files? Performing I/O
                                  on files that are in split-brain fails
                                  with EIO.<br>
                                  <br>
                                  -Ravi<br>
                                  <br>
                                </span></p>
                              <blockquote style="margin-top:5.0pt;
                                margin-bottom:5.0pt">
                                <div>
                                  <div>
                                    <p class="MsoNormal"
                                      style="background:white">And the
                                      write operation from VMware's side
                                      fails as soon as these messages
                                      start.</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                    <p class="MsoNormal"
                                      style="background:white">I don't
                                      see any other errors (in the log
                                      files I know of) indicating the
                                      root cause of these i/o errors. 
                                      I'm sure that this is not enough
                                      information to tell what's going
                                      on, but can anyone help me figure
                                      out what to look at next to figure
                                      this out?</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                    <p class="MsoNormal"
                                      style="background:white">I've also
                                      considered using Dan Lambright's
                                      libgfapi gluster module for tgtd
                                      (or something similar) to avoid
                                      going through FUSE, but I'm not
                                      sure whether that would be
                                      irrelevant to this problem, since
                                      I'm not 100% sure if it lies in
                                      FUSE or elsewhere.</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                    <p class="MsoNormal"
                                      style="background:white">Thanks!</p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                    <p class="MsoNormal"
                                      style="background:white"><i><span
                                          style="font-size:16.0pt;
                                          font-family:&quot;Georgia&quot;,serif;
                                          color:#0F5789">Jon Heese</span></i><span
                                        style=""><br>
                                      </span><i><span
                                          style="color:#333333">Systems
                                          Engineer</span></i><span
                                        style=""><br>
                                      </span><b><span
                                          style="color:#333333">INetU
                                          Managed Hosting</span></b><span
                                        style=""><br>
                                      </span><span style="color:#333333">P:
                                        610.266.7441 x 261</span><span
                                        style=""><br>
                                      </span><span style="color:#333333">F:
                                        610.266.7434</span><span
                                        style=""><br>
                                      </span><a moz-do-not-send="true"
                                        href="https://www.inetu.net/"><span
                                          style="color:blue">www.inetu.net</span></a></p>
                                    <p class="MsoNormal"
                                      style="background:white"><i><span
                                          style="font-size:8.0pt;
                                          color:#333333">** This message
                                          contains confidential
                                          information, which also may be
                                          privileged, and is intended
                                          only for the person(s)
                                          addressed above. Any
                                          unauthorized use,
                                          distribution, copying or
                                          disclosure of confidential
                                          and/or privileged information
                                          is strictly prohibited. If you
                                          have received this
                                          communication in error, please
                                          erase all copies of the
                                          message and its attachments
                                          and notify the sender
                                          immediately via reply e-mail.
                                          **</span></i></p>
                                    <p class="MsoNormal"
                                      style="background:white"> </p>
                                  </div>
                                </div>
                                <p class="MsoNormal"
                                  style="margin-bottom:12.0pt"><span
                                    style=""><br>
                                    <br>
                                  </span></p>
                              </blockquote>
                              <p class="MsoNormal"><span style=""> </span></p>
                            </div>
                            <br>
                          </blockquote>
                          <br>
                        </div>
                      </blockquote>
                    </div>
                  </div>
                </div>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>