<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    On 03/15/2015 11:16 AM, Jonathan Heese wrote:<br>
    <blockquote
      cite="mid:4e8f7c2c6f784e708e98afe090a7ccb5@int-exch6.int.inetu.net"
      type="cite">
      <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
        <p>Hello all,</p>
        <p><br>
        </p>
        <p>I have a 2 node 2 brick replicate gluster volume that I'm
          having trouble making fault tolerant (a seemingly basic
          feature!) under CentOS 6.6 using EPEL packages.</p>
        <p><br>
        </p>
        <p>Both nodes are as close to identical hardware and software as
          possible, and I'm running the following packages:</p>
        <p>glusterfs-rdma-3.6.2-1.el6.x86_64<br>
          glusterfs-fuse-3.6.2-1.el6.x86_64<br>
          glusterfs-libs-3.6.2-1.el6.x86_64<br>
          glusterfs-cli-3.6.2-1.el6.x86_64<br>
          glusterfs-api-3.6.2-1.el6.x86_64<br>
          glusterfs-server-3.6.2-1.el6.x86_64<br>
          glusterfs-3.6.2-1.el6.x86_64<br>
        </p>
      </div>
    </blockquote>
    3.6.2 is not considered production stable. Based on your expressed
    concern, you should probably be running 3.5.3.<br>
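    If you do go that route, it's worth confirming first what you're
    actually running and what your repos still offer; a rough sketch,
    assuming the repo you installed from still carries the 3.5.x builds:<br>
    <pre>
# what is installed and running right now
rpm -qa 'glusterfs*'
gluster --version

# which versions the configured repos offer (pick the 3.5.3 builds,
# downgrade both nodes in lockstep, then restart glusterd)
yum --showduplicates list glusterfs-server
</pre>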
    <blockquote
      cite="mid:4e8f7c2c6f784e708e98afe090a7ccb5@int-exch6.int.inetu.net"
      type="cite">
      <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
        <p><br>
        </p>
        <p>They both have dual-port Mellanox 20Gbps InfiniBand cards
          with a straight (i.e. "crossover") cable and opensm to
          facilitate the RDMA transport between them.</p>
        <p><br>
        </p>
        <p>Here are some data dumps to set the stage (and yes, the
          output of these commands looks the same on both nodes):<br>
        </p>
        <p><br>
        </p>
        <p>[root@duchess ~]# gluster volume info<br>
          <br>
          Volume Name: gluster_disk<br>
          Type: Replicate<br>
          Volume ID: b1279e22-8589-407b-8671-3760f42e93e4<br>
          Status: Started<br>
          Number of Bricks: 1 x 2 = 2<br>
          Transport-type: rdma<br>
          Bricks:<br>
          Brick1: duke-ib:/bricks/brick1<br>
          Brick2: duchess-ib:/bricks/brick1<br>
        </p>
        <p><br>
        </p>
        <pre>
[root@duchess ~]# gluster volume status
Status of volume: gluster_disk
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick duke-ib:/bricks/brick1                            49153   Y       9594
Brick duchess-ib:/bricks/brick1                         49153   Y       9583
NFS Server on localhost                                 2049    Y       9590
Self-heal Daemon on localhost                           N/A     Y       9597
NFS Server on 10.10.10.1                                2049    Y       9607
Self-heal Daemon on 10.10.10.1                          N/A     Y       9614

Task Status of Volume gluster_disk
------------------------------------------------------------------------------
There are no active volume tasks
</pre>
        <p><br>
        </p>
        <p>[root@duchess ~]# gluster peer status<br>
          Number of Peers: 1<br>
          <br>
          Hostname: 10.10.10.1<br>
          Uuid: aca56ec5-94bb-4bb0-8a9e-b3d134bbfe7b<br>
          State: Peer in Cluster (Connected)</p>
        <p><br>
        </p>
        <p>So before putting any real data on these guys (the data will
          eventually be a handful of large image files backing an iSCSI
          target via tgtd for ESXi datastores), I wanted to simulate the
          failure of one of the nodes. So I stopped glusterfsd and
          glusterd on duchess, waited about 5 minutes, then started them
          back up again, tail'ing /var/log/glusterfs/* and
          /var/log/messages. I'm not sure exactly what I'm looking for,
          but the logs quieted down after just a minute or so of
          restarting the daemons. I didn't see much indicating that
          self-healing was going on.<br>
        </p>
        <p><br>
        </p>
        <p>Every now and then (and seemingly more often than not), when
          I run "gluster volume heal gluster_disk info", I get no output
          from the command, and the following dumps into my
          /var/log/messages:<br>
        </p>
        <p><br>
        </p>
        <p>Mar 15 13:59:16 duchess kernel: glfsheal[10365]: segfault at
          7ff56068d020 ip 00007ff54f366d80 sp 00007ff54e22adf8 error 6
          in libmthca-rdmav2.so[7ff54f365000+7000]<br>
        </p>
      </div>
    </blockquote>
    This is a segfault in the Mellanox driver (libmthca, per the log line
    above). Please report it to the driver developers.<br>
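    To give them something useful to work with, you could grab the exact
    library builds and a backtrace from the core. A rough sketch for
    CentOS 6 follows; abrt threw your core away because it considers the
    package unsigned, so that check has to be relaxed first (adjust the
    paths and config to your setup):<br>
    <pre>
# record the driver/library builds for the bug report
rpm -q libmthca libibverbs librdmacm glusterfs

# let abrt keep cores from packages it considers unsigned
sed -i 's/OpenGPGCheck = yes/OpenGPGCheck = no/' \
    /etc/abrt/abrt-action-save-package-data.conf
service abrtd restart

# reproduce the crash, then pull a backtrace out of the saved core
gluster volume heal gluster_disk info
gdb /usr/sbin/glfsheal /var/spool/abrt/ccpp-*/coredump -ex 'bt' -ex 'quit'
</pre>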
    <blockquote
      cite="mid:4e8f7c2c6f784e708e98afe090a7ccb5@int-exch6.int.inetu.net"
      type="cite">
      <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
        <p>
          Mar 15 13:59:17 duchess abrtd: Directory
          'ccpp-2015-03-15-13:59:16-10359' creation detected<br>
          Mar 15 13:59:17 duchess abrt[10368]: Saved core dump of pid
          10359 (/usr/sbin/glfsheal) to
          /var/spool/abrt/ccpp-2015-03-15-13:59:16-10359 (225595392
          bytes)<br>
          Mar 15 13:59:25 duchess abrtd: Package 'glusterfs-server'
          isn't signed with proper key<br>
          Mar 15 13:59:25 duchess abrtd: 'post-create' on
          '/var/spool/abrt/ccpp-2015-03-15-13:59:16-10359' exited with 1<br>
          Mar 15 13:59:25 duchess abrtd: Deleting problem directory
          '/var/spool/abrt/ccpp-2015-03-15-13:59:16-10359'<br>
          <br>
        </p>
        <p>Other times, when I'm lucky, I get messages from the "heal
          info" command indicating that datastore1.img (the file that I
          intentionally changed while duchess was offline) is in need of
          healing:</p>
        <p><br>
        </p>
        <p>[root@duke ~]# gluster volume heal gluster_disk info<br>
          Brick duke.jonheese.local:/bricks/brick1/<br>
          /datastore1.img - Possibly undergoing heal<br>
          <br>
          Number of entries: 1<br>
          <br>
          Brick duchess.jonheese.local:/bricks/brick1/<br>
          /datastore1.img - Possibly undergoing heal<br>
          <br>
          Number of entries: 1<br>
        </p>
        <p><br>
        </p>
        <p>But watching df on the bricks and tailing glustershd.log
          doesn't seem to indicate that anything is actually happening
          -- and df indicates that the brick on duke *is* different in
          file size from the brick on duchess. It's been over an hour
          now, and I'm not confident that the self-heal functionality is
          even working at all... Nor do I know how to do anything about
          it!</p>
      </div>
    </blockquote>
    File sizes are not necessarily a reliable indicator. If the changes you
    made were nulls, the change may be sparse. du --apparent-size is a
    slightly better indicator. Comparing hashes would be even better.<br>
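    For example, run directly against the brick path on each node (not
    through the mounted volume), while nothing is writing to the file:<br>
    <pre>
# on duke, then again on duchess
du --apparent-size -h /bricks/brick1/datastore1.img
md5sum /bricks/brick1/datastore1.img
</pre>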
    <br>
    The extended attributes on the file itself, on the bricks, can tell
    you the heal state. Look at "getfattr -m . -d -e hex $file". The
    trusted.afr attributes, if non-zero, show pending changes destined
    for the other server.<br>
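    For example, against the brick path on each node (the client-N
    indices depend on the order the bricks were defined, so treat the
    attribute names below as illustrative):<br>
    <pre>
getfattr -m . -d -e hex /bricks/brick1/datastore1.img

# a fully healed copy shows all-zero AFR counters, e.g.
#   trusted.afr.gluster_disk-client-0=0x000000000000000000000000
#   trusted.afr.gluster_disk-client-1=0x000000000000000000000000
# non-zero counters mean changes still pending for the other brick;
# you can also kick off a full sweep with:
gluster volume heal gluster_disk full
</pre>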
    <blockquote
      cite="mid:4e8f7c2c6f784e708e98afe090a7ccb5@int-exch6.int.inetu.net"
      type="cite">
      <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
        <p><br>
        </p>
        <p>Also, I find it a little bit troubling that I'm using the
          aliases (in /etc/hosts on both servers) duke-ib and duchess-ib
          for the gluster node configuration, but the "heal info"
          command refers to my nodes with their internal FQDNs, which
          resolve to their 1Gbps interface IPs... That doesn't mean that
          they're trying to communicate over those interfaces (the
          volume is configured with "transport rdma", as you can see
          above), does it?<br>
        </p>
      </div>
    </blockquote>
    <br>
    I'd call that a bug. It should report the hostnames as they're
    listed in the volume info.<br>
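    In the meantime you can confirm that the bricks are registered under
    the -ib names and that those names resolve where you expect (what
    "heal info" prints doesn't change which addresses the bricks were
    defined with):<br>
    <pre>
gluster volume info gluster_disk | grep -E 'Brick|Transport'
getent hosts duke-ib duchess-ib

# and check that the IB link itself is up (from infiniband-diags)
ibstat | grep -E 'State|Rate'
</pre>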
    <blockquote
      cite="mid:4e8f7c2c6f784e708e98afe090a7ccb5@int-exch6.int.inetu.net"
      type="cite">
      <div
style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
        <p><br>
        </p>
        <p>Can anyone throw out any ideas on how I can:</p>
        <p>1. Determine whether this is intentional behavior (or a
          bug?),</p>
        <p>2. Determine whether my data has been properly resync'd
          across the bricks, and</p>
        <p>3. Make it work correctly if not.</p>
        <p><br>
        </p>
        <p>Thanks in advance!</p>
        <p><br>
        </p>
        <p>Regards,</p>
        <p>Jon Heese<br>
        </p>
      </div>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
      <pre wrap="">_______________________________________________
Gluster-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a>
<a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></pre>
    </blockquote>
    <br>
  </body>
</html>