<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Le 04/03/2016 10:56, Krutika Dhananjay
      a écrit :<br>
    </div>
    <blockquote
      cite="mid:898345658.15935171.1457085397319.JavaMail.zimbra@redhat.com"
      type="cite">
      <div style="font-family: garamond,new york,times,serif; font-size:
        12pt; color: #000000">
        <div>Hi,<br>
        </div>
        <div><br>
        </div>
        <div>So up until in 3.5.x, there was a read child selection mode
          called 'first responder' where the brick that responds first
          for a particular client becomes the read child.<br>
        </div>
        <div>After the replication module was rewritten for the most
          part from 3.6.0, this mode was removed.<br>
        </div>
        <div><br>
        </div>
        <div>There exists a workaround, though. Could you share the
          output of `gluster volume info &lt;VOL&gt;`?<br>
        </div>
        <div><br>
        </div>
      </div>
    </blockquote>
    It gives (note: this volume is only used for testing, so I can
    "play" with it):<br>
    Volume Name: HOME<br>
    Type: Replicate<br>
    Volume ID: ea90bcaf-990d-436a-b4fa-8fa20d67f924<br>
    Status: Started<br>
    Number of Bricks: 1 x 2 = 2<br>
    Transport-type: tcp<br>
    Bricks:<br>
    Brick1: sto1.liris.cnrs.fr:/glusterfs/home/data<br>
    Brick2: sto2.liris.cnrs.fr:/glusterfs/home/data<br>
    Options Reconfigured:<br>
    performance.quick-read: on<br>
    cluster.metadata-self-heal: on<br>
    cluster.data-self-heal: on<br>
    cluster.entry-self-heal: on<br>
    cluster.consistent-metadata: true<br>
    auth.ssl-allow:
    sto1.liris.cnrs.fr,sto2.liris.cnrs.fr,connect.liris.cnrs.fr<br>
    server.ssl: off<br>
    client.ssl: off<br>
    diagnostics.latency-measurement: off<br>
    diagnostics.count-fop-hits: off<br>
    <br>
    <br>
    BTW a first-responding option could be nice, but it is not exactly
    the same thing: under heavy load on one server you could fall back
    onto the other one. Our purpose is really to reduce bandwidth between
    buildings when possible, not to use the fastest server :)<br>
    <br>
    --<br>
    Y.<br>
    <br>
    <br>
    <br>
    <br>
    <blockquote
      cite="mid:898345658.15935171.1457085397319.JavaMail.zimbra@redhat.com"
      type="cite">
      <div style="font-family: garamond,new york,times,serif; font-size:
        12pt; color: #000000">
        <div>-Krutika<br>
        </div>
        <div><br>
        </div>
        <hr id="zwchr">
        <blockquote style="border-left:2px solid
#1010FF;margin-left:5px;padding-left:5px;color:#000;font-weight:normal;font-style:normal;text-decoration:none;font-family:Helvetica,Arial,sans-serif;font-size:12pt;"
          data-mce-style="border-left: 2px solid #1010FF; margin-left:
          5px; padding-left: 5px; color: #000; font-weight: normal;
          font-style: normal; text-decoration: none; font-family:
          Helvetica,Arial,sans-serif; font-size: 12pt;"><b>From: </b>"Yannick
          Perret" <a class="moz-txt-link-rfc2396E" href="mailto:yannick.perret@liris.cnrs.fr">&lt;yannick.perret@liris.cnrs.fr&gt;</a><br>
          <b>To: </b>"Saravanakumar Arumugam"
          <a class="moz-txt-link-rfc2396E" href="mailto:sarumuga@redhat.com">&lt;sarumuga@redhat.com&gt;</a>, <a class="moz-txt-link-abbreviated" href="mailto:gluster-users@gluster.org">gluster-users@gluster.org</a><br>
          <b>Sent: </b>Friday, March 4, 2016 2:43:16 PM<br>
          <b>Subject: </b>Re: [Gluster-users] Per-client prefered
          server?<br>
          <div><br>
          </div>
          <div class="moz-cite-prefix">Le 03/03/2016 15:17,
            Saravanakumar Arumugam a écrit :<br>
          </div>
          <blockquote cite="mid:56D8476E.4010900@redhat.com"><br>
            <br>
            <div class="moz-cite-prefix">On 03/03/2016 05:38 PM, Yannick
              Perret wrote:<br>
            </div>
            <blockquote cite="mid:56D82941.2050707@liris.cnrs.fr">Hello,
              <br>
              <br>
              I can't find whether it is possible to set a preferred
              server on a per-client basis for replica volumes, so I'm
              asking the question here. <br>
              <br>
              The context: we have 2 storage servers, one in each
              building. We also have several virtual machines in each
              building, and they can migrate from one building to
              another (depending on load, maintenance…). <br>
              <br>
              So (for testing at this time) I set up a x2 replica volume,
              one replica on each storage server of course. As most of
              our volumes are "many reads - few writes", it would be
              better for bandwidth if each client used the "nearest"
              storage server (local building switch) - for reading, of
              course. The 2 buildings have a good netlink but we prefer
              to minimize - when not needed - data transfers between
              them (this link is shared). <br>
              <br>
              Can you see a solution for this kind of tuning? As far as
              I understand, geo-replication is not really what I need, no? <br>
            </blockquote>
            <br>
            Yes, geo-replication "cannot" be used, as you wish to carry
            out "write" operations on the Slave side.<br>
            <br>
          </blockquote>
          Ok, thanks. I was pretty sure that was the case but I preferred
          to ask.<br>
          <blockquote cite="mid:56D8476E.4010900@redhat.com">
            <blockquote cite="mid:56D82941.2050707@liris.cnrs.fr"><br>
              The "cluster.read-subvolume" option exists of course, but
              we can have clients in both buildings, so a per-volume
              option is not what we need. A per-client equivalent of
              this option would be nice. <br>
              <br>
              I tested a small patch myself to perform this - and it
              seems to work fine as far as I can see - but 1. before
              continuing in this way I wanted to first check whether
              another way exists, and 2. I'm not familiar with the whole
              code so I'm not sure my changes follow the
              "state-of-the-art" for glusterfs. <br>
              <br>
            </blockquote>
            maybe you should share that interesting patch :) and get
            better feedback about your test case.<br>
          </blockquote>
          <br>
          My "patch" is quite simple: I added in
          afr_read_subvol_select_by_policy()
          (xlators/cluster/afr/src/afr-common.c) a target selection
          similar to the one managed by the "read_child" configuration
          (see patches at the end).<br>
          <br>
          Of course I also added the definition of this "forced_child"
          in afr.h, in the same way favorite_child or read_child is
          defined.<br>
          <br>
          <br>
          My real problem here is how to tell the client to change its
          "forced-child" value.<br>
          <br>
          I did this by reading it from a local file
          (/var/lib/glusterd/forced-child) in init() and reconfigure()
          (in xlators/cluster/afr/src/afr.c). This is fine at startup
          and when the volume configuration changes, but I found that
          sending a SIGHUP is not enough, because the client detects
          that no change occurred and does not call reconfigure(). So
          for my tests I modified glusterfsd/src/glusterfsd-mgmt.c so
          that if /var/lib/glusterd/forced-child exists, the client
          behaves as if a configuration change occurred (and so calls
          reconfigure(), which reloads the forced-child value).<br>
          <br>
          <br>
          At this point it works as I expected, but I think it should be
          possible to handle a new forced-child value without going
          through the whole reconfigure().<br>
          Moreover, this file currently contains a raw child index; it
          would be better to use a server name, which would then be
          converted into an index.<br>
          <br>
          If possible, would it be better to send the new value using
          the 'gluster' command on the client? I.e. something like
          'gluster volume set client.prefered-server SERVER-NAME'?<br>
          <br>
          <br>
          Any advice is welcome.<br>
          <br>
          <br>
          Regards,<br>
          --<br>
          Y.<br>
          <br>
          <br>
          &lt;&lt;&lt;<br>
          --- glusterfs-3.6.7/xlators/cluster/afr/src/afr-common.c   
          2015-11-25 12:55:58.000000000 +0100<br>
          +++ glusterfs-3.6.7b/xlators/cluster/afr/src/afr-common.c   
          2015-12-10 14:59:18.898580772 +0100<br>
          @@ -764,10 +764,18 @@<br>
               int             i           = 0;<br>
               int             read_subvol = -1;<br>
               afr_private_t  *priv        = NULL;<br>
          -        afr_read_subvol_args_t local_args = {0,};<br>
          +    afr_read_subvol_args_t local_args = {0,};<br>
           <br>
               priv = this-&gt;private;<br>
           <br>
          +<br>
          +    /* if forced-child use it */<br>
          +    if ((priv-&gt;forced_child &gt;= 0)<br>
          +        &amp;&amp; (priv-&gt;forced_child &lt;
          priv-&gt;child_count)<br>
          +        &amp;&amp; (readable[priv-&gt;forced_child])) {<br>
          +        return priv-&gt;forced_child;<br>
          +    }<br>
          +<br>
               /* first preference - explicitly specified or local
          subvolume */<br>
               if (priv-&gt;read_child &gt;= 0 &amp;&amp;
          readable[priv-&gt;read_child])<br>
                           return priv-&gt;read_child;<br>
          &gt;&gt;&gt;<br>
          <br>
          &lt;&lt;&lt;<br>
          @@ -83,6 +83,7 @@<br>
                   unsigned int hash_mode;       /* for when read_child
          is not set */<br>
                   int favorite_child;  /* subvolume to be preferred in
          resolving<br>
                                                    split-brain cases */<br>
          +        int forced_child;    /* child to use (if possible) */<br>
           <br>
                   gf_boolean_t inodelk_trace;<br>
                   gf_boolean_t entrylk_trace;<br>
          &gt;&gt;&gt;<br>
          <br>
          &lt;&lt;&lt;<br>
          --- glusterfs-3.6.7/xlators/cluster/afr/src/afr.c   
          2015-11-25 12:55:58.000000000 +0100<br>
          +++ glusterfs-3.6.7b/xlators/cluster/afr/src/afr.c   
          2015-12-10 16:34:55.530790442 +0100<br>
          @@ -23,6 +23,7 @@<br>
           <br>
           struct volume_options options[];<br>
           <br>
          +<br>
           int32_t<br>
           notify (xlator_t *this, int32_t event,<br>
                   void *data, ...)<br>
          @@ -106,9 +107,26 @@<br>
                   int            ret         = -1;<br>
                   int            index       = -1;<br>
                   char          *qtype       = NULL;<br>
          +    FILE          *prefer      = NULL;<br>
          +        int            i           = -1;<br>
           <br>
                   priv = this-&gt;private;<br>
           <br>
          +        /* if /var/lib/glusterd/forced-child exists read the
          content<br>
          +           and use it as prefered target for read */<br>
          +        priv-&gt;forced_child = -1;<br>
          +        prefer = fopen("/var/lib/glusterd/forced-child",
          "r");<br>
          +        if (prefer) {<br>
          +                if (fscanf(prefer, "%d", &amp;i) == 1) {<br>
          +                        if ((i &gt;= 0) &amp;&amp; (i &lt;
          priv-&gt;child_count)) {<br>
          +                                priv-&gt;forced_child = i;<br>
          +                gf_log (this-&gt;name, GF_LOG_INFO,<br>
          +                        "using %d as forced-child", i);<br>
          +                        }<br>
          +                }<br>
          +                fclose(prefer);<br>
          +        }<br>
          +<br>
               GF_OPTION_RECONF ("afr-dirty-xattr",<br>
                         priv-&gt;afr_dirty, options, str,<br>
                         out);<br>
          @@ -234,6 +252,7 @@<br>
                   int            read_subvol_index = -1;<br>
                   xlator_t      *fav_child   = NULL;<br>
                   char          *qtype       = NULL;<br>
          +    FILE          *prefer      = NULL;<br>
           <br>
                   if (!this-&gt;children) {<br>
                           gf_log (this-&gt;name, GF_LOG_ERROR,<br>
          @@ -261,6 +280,21 @@<br>
           <br>
                   priv-&gt;read_child = -1;<br>
           <br>
          +    /* if /var/lib/glusterd/forced-child exists read the
          content<br>
          +           and use it as prefered target for read */<br>
          +        priv-&gt;forced_child = -1;<br>
          +        prefer = fopen("/var/lib/glusterd/forced-child",
          "r");<br>
          +        if (prefer) {<br>
          +        if (fscanf(prefer, "%d", &amp;i) == 1) {<br>
          +            if ((i &gt;= 0) &amp;&amp; (i &lt;
          priv-&gt;child_count)) {<br>
          +                priv-&gt;forced_child = i;<br>
          +                gf_log (this-&gt;name, GF_LOG_INFO,<br>
          +                                        "using %d as
          forced-child", i);<br>
          +            }<br>
          +        }<br>
          +        fclose(prefer);<br>
          +    }<br>
          +<br>
               GF_OPTION_INIT ("afr-dirty-xattr", priv-&gt;afr_dirty,
          str, out);<br>
           <br>
               GF_OPTION_INIT ("metadata-splitbrain-forced-heal",<br>
          &gt;&gt;&gt;<br>
          <br>
          &lt;&lt;&lt;<br>
          --- glusterfs-3.6.7/glusterfsd/src/glusterfsd-mgmt.c   
          2015-11-25 12:55:58.000000000 +0100<br>
          +++ glusterfs-3.6.7b/glusterfsd/src/glusterfsd-mgmt.c   
          2015-12-10 16:34:20.530789162 +0100<br>
          @@ -1502,7 +1502,9 @@<br>
                   if (size == oldvollen &amp;&amp; (memcmp (oldvolfile,
          rsp.spec, size) == 0)) {<br>
                           gf_log (frame-&gt;this-&gt;name, GF_LOG_INFO,<br>
                                   "No change in volfile, continuing");<br>
          -                goto out;<br>
          +        if (access("/var/lib/glusterd/forced-child", R_OK) !=
          0) {<br>
          +                    goto out; /* don't skip if exists to
          re-read forced-child */<br>
          +        }<br>
                   }<br>
           <br>
                   tmpfp = tmpfile ();<br>
          &gt;&gt;&gt;<br>
          <br>
          --<br>
          Y.<br>
          <br>
          _______________________________________________<br>
          Gluster-users mailing list<br>
          <a class="moz-txt-link-abbreviated" href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
          <a class="moz-txt-link-freetext" href="http://www.gluster.org/mailman/listinfo/gluster-users">http://www.gluster.org/mailman/listinfo/gluster-users</a></blockquote>
        <div><br>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>