<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hrm, we have an 80TB volume now, composed of several two-disk RAID0
    stripes; each drive is a 5TB SATA disk.<br>
    <br>
    That's on the commercial third-party Direct Connect instance, which
    is not to say that we couldn't test something using a slew of EBS
    volumes in some odd configuration. We're open to whatever at this
    point.<br>
    <br>
    Amazon is offering EFS, but I'm not convinced yet that this will get
    us the performance we need.<br>
    <br>
    Wouldn't FUSE somewhere in this configuration introduce a performance
    hit? I've been warned to stay away from FUSE, but I admit I don't
    have all the facts yet.<br>
    <br>
    <br>
    Thank you.<br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 7/14/15 4:29 PM, Mathieu Chateau
      wrote:<br>
    </div>
    <blockquote
cite="mid:CACpSnaJ58qX+Sy7ptoK5shpE4AHms_aOQhR3OpfrY=K0oqLyqA@mail.gmail.com"
      type="cite">
      <div dir="ltr">Hello,
        <div><br>
        </div>
        <div>OK, you can stick with NFS; you will just have to manage
          failover if needed.</div>
        <div><br>
        </div>
        <div>So they use 4TB hard drives (80TB / 20 disks).</div>
        <div>Each disk can provide, let's say, 150 IOPS max. That means
          3,000 IOPS max, before RAID overhead and the like.</div>
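        <div><br>
        </div>
        <div>A minimal Python sketch of that back-of-the-envelope
          arithmetic (the per-disk figures are assumptions, not
          measurements):</div>
        <pre>
# Rough IOPS estimate for the setup described above (all figures are assumptions)
total_capacity_tb = 80        # reported volume size
disk_size_tb = 4              # assumed per-disk capacity
iops_per_disk = 150           # rough figure for a 7200 RPM SATA disk

disk_count = total_capacity_tb // disk_size_tb      # 20 disks
aggregate_iops = disk_count * iops_per_disk         # 3000 IOPS, before RAID overhead
print(disk_count, aggregate_iops)                   # 20 3000
</pre>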
        <div><br>
        </div>
        <div>From your explanation, I guess you have many workloads
          running in parallel, so 20 disks may not be enough anyway.</div>
        <div><br>
        </div>
        <div>You first must be sure that the storage can physically meet
          your needs in terms of capacity and performance. </div>
        <div><br>
        </div>
        <div>Then you can choose the solution that best fits your needs.</div>
        <div><br>
        </div>
        <div>Just my 2 cents.</div>
      </div>
      <div class="gmail_extra"><br clear="all">
        <div>
          <div class="gmail_signature">Cordialement,<br>
            Mathieu CHATEAU<br>
            <a moz-do-not-send="true" href="http://www.lotp.fr"
              target="_blank">http://www.lotp.fr</a></div>
        </div>
        <br>
        <div class="gmail_quote">2015-07-14 22:21 GMT+02:00 Forrest
          Aldrich <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:forrie@gmail.com" target="_blank">forrie@gmail.com</a>&gt;</span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000"> The instances we use
              via Direct Connect (a third party company) have upwards 20
              disks and a total of 80T.   That part is covered.<br>
              <br>
              If we were to experiment with EBS, that would be a
              different case as we'd need to stripe them.<br>
              <br>
              Our present model requires a single namespace via NFS.
              The Instances are running CentOS 6.x. The Instances
              mount the Direct Connect disk space via NFS; the only
              other alternative we'd have is iSCSI, which wouldn't work
              for the level of sharing we need.<span class=""><br>
                <br>
                <br>
                <br>
                <br>
                <div>On 7/14/15 4:18 PM, Mathieu Chateau wrote:<br>
                </div>
              </span>
              <blockquote type="cite"><span class="">
                  <div dir="ltr">by NFS i think you just mean "all
                    servers seeing and changing sames files" ? That can
                    be done with fuse, without nfs.
                    <div>NFS is harder for failover while automatic with
                      fuse (no need for dynamic dns or virtual IP).</div>
                    <div><br>
                    </div>
                    <div>For redundancy, I mean: what failures do you
                      want to survive?</div>
                    <div>
                      <ul>
                        <li>Losing a disk</li>
                        <li>Filesystem corruption</li>
                        <li>Server lost or in maintenance</li>
                        <li>Whole region down</li>
                      </ul>
                      <div>Depending on your needs, you may have to
                        replicate data across Gluster bricks or even use
                        a geo-dispersed brick.</div>
                    </div>
                    <div><br>
                    </div>
                    <div>Will the network between your servers and nodes
                      be able to handle that traffic (380 MB/s = 3,040 Mb/s)?</div>
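                    <div><br>
                    </div>
                    <div>A quick Python sanity check on that conversion
                      (the 10 GbE figure is just an assumed link speed,
                      for comparison):</div>
                    <pre>
# Storage throughput expressed in network units (figures from the thread)
mb_per_s = 380                # reported peak throughput, MB/s
mbit_per_s = mb_per_s * 8     # 3040 Mb/s on the wire, ignoring protocol overhead
ten_gbe_mbit = 10000          # assumed 10 GbE link, for scale
print(mbit_per_s, round(mbit_per_s / ten_gbe_mbit, 2))   # 3040 0.3
</pre>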
                    <div><br>
                    </div>
                    <div>I guess Gluster can handle that load; you are
                      using big files, and this is where Gluster delivers
                      its highest throughput. Nevertheless, you will need
                      many disks to provide that I/O, even more if using
                      replicated bricks.</div>
                    <div><br>
                    </div>
                  </div>
                </span>
                <div class="gmail_extra"><span class=""><br clear="all">
                    <div>
                      <div>Regards,<br>
                        Mathieu CHATEAU<br>
                        <a moz-do-not-send="true"
                          href="http://www.lotp.fr" target="_blank">http://www.lotp.fr</a></div>
                    </div>
                    <br>
                  </span>
                  <div class="gmail_quote"><span class="">2015-07-14
                      21:15 GMT+02:00 Forrest Aldrich <span dir="ltr">&lt;<a
                          moz-do-not-send="true"
                          href="mailto:forrie@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:forrie@gmail.com">forrie@gmail.com</a></a>&gt;</span>:<br>
                    </span>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000"> Sorry, I
                        should have noted that.  380MB is both read and
                        write (I confirmed this with a developer).<br>
                        <br>
                        We do need the NFS stack, as that's how all the
                        code and the many Instances work -- we have
                        several "workers" that chop up video on the same
                        namespace. It's not efficient, but that's how
                        it has to be for now.<br>
                        <br>
                        Redundancy, in terms of the server? We have
                        RAIDed volumes, if that's what you're referring
                        to.<span class=""><br>
                          <br>
                          Here's a basic outline of the flow (as I
                          understand it):<br>
                          <br>
                          <br>
                          Video Capture Agent sends in a large video
                          file (30GB +/-)<br>
                          <br>
                          Administrative host receives it and writes to NFS<br>
                          <br>
                          A process copies this over to another point in
                          the namespace<br>
                          <br>
                          Another Instance picks up the file, reads it,
                          starts processing, and writes (FFMPEG is
                          involved) -- a rough sketch of this loop
                          follows below<br>
                          <br>
                          <br>
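                          A very rough Python sketch of that loop, purely
                          for illustration; the directory layout and the
                          ffmpeg invocation here are hypothetical, not
                          our actual code:<br>
                          <pre>
import shutil
import subprocess
from pathlib import Path

# Hypothetical locations on the shared NFS namespace
INCOMING = Path("/mnt/nfs/incoming")   # where the capture agent drops files
WORKAREA = Path("/mnt/nfs/workarea")   # "copy to another point in the namespace"
OUTPUT = Path("/mnt/nfs/output")       # where processed results land

def process_one(capture):
    """Copy a captured file within the namespace, then transcode it."""
    staged = WORKAREA / capture.name
    shutil.copy2(capture, staged)      # heavy read + write on the same mount
    result = OUTPUT / (capture.stem + ".mp4")
    # FFMPEG step; the real options depend on the actual workflow
    subprocess.run(["ffmpeg", "-i", str(staged), str(result)], check=True)

for f in sorted(INCOMING.glob("*.mov")):   # assumed capture format
    process_one(f)
</pre>
                          <br>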
                        </span> Something like that -- I may not have
                        all the steps, but essentially there's a ton of
                        I/O going on.   I know our code model is not
                        efficient, but it's complicated and can't just
                        be changed (it's based on an open source product
                        and there's some code baggage).<span class=""><br>
                          <br>
                          We looked into another product that allegedly
                          scaled out using multiple NFS heads with
                          massive local caches (AWS instances) sharing
                          the same space, but it was horrible and just
                          didn't work for us.<br>
                          <br>
                          <br>
                          <br>
                          Thank you. </span>
                        <div>
                          <div><span class=""><br>
                              <br>
                              <br>
                              <br>
                              <div>On 7/14/15 3:06 PM, Mathieu Chateau
                                wrote:<br>
                              </div>
                            </span>
                            <blockquote type="cite"><span class="">
                                <div dir="ltr">Hello,
                                  <div><br>
                                  </div>
                                  <div>Is it 380 MB/s in read or write?
                                    What level of redundancy do you
                                    need?</div>
                                  <div>Do you really need the NFS stack,
                                    or just a mount point (and so be able
                                    to use the native Gluster protocol)?</div>
                                  <div><br>
                                  </div>
                                  <div>Gluster load is mostly put on the
                                    clients, not the server (clients do
                                    the synchronous writes to all replicas
                                    and do the memory caching).</div>
                                  <div><br>
                                  </div>
                                </div>
                              </span>
                              <div class="gmail_extra"><span class=""><br
                                    clear="all">
                                  <div>
                                    <div>Regards,<br>
                                      Mathieu CHATEAU<br>
                                      <a moz-do-not-send="true"
                                        href="http://www.lotp.fr"
                                        target="_blank">http://www.lotp.fr</a></div>
                                  </div>
                                  <br>
                                </span>
                                <div class="gmail_quote"><span class="">2015-07-14
                                    20:49 GMT+02:00 Forrest Aldrich <span
                                      dir="ltr">&lt;<a
                                        moz-do-not-send="true"
                                        href="mailto:forrie@gmail.com"
                                        target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:forrie@gmail.com">forrie@gmail.com</a></a>&gt;</span>:<br>
                                  </span>
                                  <blockquote class="gmail_quote"
                                    style="margin:0 0 0
                                    .8ex;border-left:1px #ccc
                                    solid;padding-left:1ex">I'm
                                    exploring solutions to help us
                                    achieve high throughput and
                                    scalability within the AWS
                                    environment.  Â Specifically, I
                                    work in a department where we handle
                                    and produce video content that
                                    results in very large files (30GB
                                    etc) that must be written to NFS,
                                    chopped up and copied over on the
                                    same mount (there are some odd
                                    limits to the code we use, but
                                    that's outside the scope of this
                                    question).<br>
                                    <br>
                                    Currently, we're using a commercial
                                    vendor with AWS, with dedicated
                                    Direct Connect instances as the back
                                    end to our production. We're
                                    maxing out at 350 to 380 MB/s, which
                                    is not enough. We expect our
                                    capacity will double or even triple
                                    when we bring on more classes or
                                    even other entities, and we need to
                                    find a way to squeeze out as much
                                    I/O as we can.<span class=""><br>
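                                      <br>
                                      A quick Python sketch of what that
                                      growth would imply, using the
                                      figures above (an estimate, not a
                                      measured target):<br>
                                      <pre>
# Rough throughput targets if the current peak doubles or triples
current_mb_s = 380                               # current peak, MB/s
targets = [current_mb_s * 2, current_mb_s * 3]   # [760, 1140] MB/s
print(targets)
</pre>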
                                      <br>
                                      Our software model depends on NFS;
                                      there's no way around that
                                      presently.<br>
                                      <br>
                                    </span> Since GlusterFS uses FUSE,
                                    I'm concerned about performance,
                                    which is a key issue. Sounds
                                    like a striped volume would be
                                    appropriate.<br>
                                    <br>
                                    My basic understanding of Gluster is
                                    that it can combine several "bricks,"
                                    which could be multiple dedicated
                                    EBS volumes or even multiple
                                    instances from the above commercial
                                    vendor served up via NFS, and
                                    present them transparently as a
                                    single namespace to client
                                    connections. The I/O could be
                                    distributed in this manner.<br>
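                                    <br>
                                    A toy Python illustration of that
                                    distribution idea (Gluster's real
                                    DHT uses its own elastic hashing on
                                    directory extended attributes; this
                                    sketch, with made-up brick names,
                                    only shows the general
                                    hash-a-file-to-one-brick placement):<br>
                                    <pre>
import hashlib

# Hypothetical bricks: EBS-backed or vendor-backed storage on several servers
bricks = ["server1:/export/brick1", "server2:/export/brick2", "server3:/export/brick3"]

def pick_brick(filename):
    """Map a file name to one brick so files, and their I/O, spread out."""
    digest = hashlib.md5(filename.encode()).hexdigest()
    return bricks[int(digest, 16) % len(bricks)]

for name in ["capture-001.mov", "capture-002.mov", "capture-003.mov"]:
    print(name, pick_brick(name))
</pre>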
                                    <br>
                                    I wonder if someone here with more
                                    experience with the above might
                                    elaborate on whether GlusterFS could
                                    be used in this scenario,
                                    specifically regarding I/O
                                    performance. We'd really like to
                                    gain as much as possible, on the
                                    order of 700 MB/s to
                                    1 GB/s and up if possible.<span
                                      class=""><br>
                                      <br>
                                      <br>
                                      <br>
                                      Thanks in advance.<br>
                                      <br>
                                      <br>
                                      <br>
                                      <br>
                                      <br>
                                    </span></blockquote>
                                </div>
                                <br>
                              </div>
                            </blockquote>
                            <br>
                          </div>
                        </div>
                      </div>
                      <span class=""> <br>
                      </span></blockquote>
                  </div>
                  <br>
                </div>
              </blockquote>
              <br>
            </div>
            <br>
            _______________________________________________<br>
            Gluster-users mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
            <a moz-do-not-send="true"
              href="http://www.gluster.org/mailman/listinfo/gluster-users"
              rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>