<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hrm, we have an 80TB volume now, made up of several 2-disk RAID0
stripes; each drive is a 5TB SATA disk.<br>
<br>
That's on the commercial third-party Direct Connect instance, which
is not to say that we couldn't test something using a slew of EBS
volumes in some odd configuration. We're open to whatever at this
point.<br>
<br>
Amazon is offering EFS, but I'm not convinced yet that this will get
us the performance we need.<br>
<br>
Wouldn't FUSE somewhere in this configuration introduce a performance
hit? I've been warned to stay away from FUSE, but I admit I don't
have all the facts yet.<br>
<br>
<br>
Thank you.<br>
<br>
<br>
<div class="moz-cite-prefix">On 7/14/15 4:29 PM, Mathieu Chateau
wrote:<br>
</div>
<blockquote
cite="mid:CACpSnaJ58qX+Sy7ptoK5shpE4AHms_aOQhR3OpfrY=K0oqLyqA@mail.gmail.com"
type="cite">
<div dir="ltr">Hello,
<div><br>
</div>
<div>OK, you can stick with NFS; you will just have to manage
failover if needed.</div>
<div><br>
</div>
<div>So they use 4TB hard drives (80TB / 20 disks).</div>
<div>Each disk can provide, let's say, 150 IOPS max; that means
3,000 IOPS max in aggregate, before RAID overhead and the like.</div>
<div><br>
</div>
<div>From your explanation, I guess you have many workloads
running in parallel, so 20 disks may not be enough anyway.</div>
<div><br>
</div>
<div>You first must be sure that the storage can physically meet
your needs in terms of capacity and performance.</div>
<div><br>
</div>
<div>Then you can choose the solution that best fits your needs.</div>
<div><br>
</div>
<div>Just my 2 cents.</div>
</div>
<div class="gmail_extra"><br clear="all">
<div>
<div class="gmail_signature">Cordialement,<br>
Mathieu CHATEAU<br>
<a moz-do-not-send="true" href="http://www.lotp.fr"
target="_blank">http://www.lotp.fr</a></div>
</div>
<br>
<div class="gmail_quote">2015-07-14 22:21 GMT+02:00 Forrest
Aldrich <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:forrie@gmail.com" target="_blank">forrie@gmail.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> The instances we use
via Direct Connect (a third party company) have upwards 20
disks and a total of 80T. That part is covered.<br>
<br>
If we were to experiment with EBS, that would be a
different case as we'd need to stripe them.<br>
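<br>
(If we do go that route: as I understand it, striping EBS volumes is
typically done with mdadm RAID0. A minimal sketch, with hypothetical
device names and volume count; actual devices vary per instance:)<br>
<pre>
# Assemble four attached EBS volumes into one RAID0 stripe
# (illustrative only; device names and count are assumptions):
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
mkfs.xfs /dev/md0
mount /dev/md0 /export/brick1    # could then serve as a Gluster brick
</pre>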
<br>
Our present model requires a single namespace via NFS.
The Instances are running CentOS 6.x and mount the
Direct Connect disk space via NFS; the only other
alternative we'd have is iSCSI, which wouldn't work
for the level of sharing we need.<span class=""><br>
<br>
<br>
<br>
<br>
<div>On 7/14/15 4:18 PM, Mathieu Chateau wrote:<br>
</div>
</span>
<blockquote type="cite"><span class="">
<div dir="ltr">by NFS i think you just mean "all
servers seeing and changing sames files" ? That can
be done with fuse, without nfs.
<div>NFS is harder for failover while automatic with
fuse (no need for dynamic dns or virtual IP).</div>
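<div><br>
</div>
<div>(A minimal sketch of such a FUSE mount, with placeholder server
and volume names; the backupvolfile-server option covers losing the
server the client fetches the volume layout from:)</div>
<pre>
# Native Gluster (FUSE) mount: the client connects to the bricks
# directly, so there is no single NFS head to fail over.
mount -t glusterfs -o backupvolfile-server=gluster2 \
      gluster1:/myvol /mnt/gluster
</pre>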
<div><br>
</div>
<div>For redundancy, I mean: what failures do you
want to survive?</div>
<div>
<ul>
<li>Losing a disk</li>
<li>Filesystem corruption</li>
<li>Server lost or in maintenance</li>
<li>Whole region down</li>
</ul>
<div>Depending on your needs, you may have to
replicate data across Gluster bricks or even use
geo-dispersed bricks; a rough sketch follows below.</div>
</div>
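<div><br>
</div>
<div>(A rough sketch with placeholder host and volume names: a 2-way
replicated volume survives losing a disk or a whole server, and
asynchronous geo-replication covers losing a region:)</div>
<pre>
# Two-way replica across two servers (names are placeholders):
gluster volume create myvol replica 2 \
        server1:/export/brick1 server2:/export/brick1
gluster volume start myvol

# Geo-replication to a remote site for region-level failures:
gluster volume geo-replication myvol remote1::myvol-dr create push-pem
gluster volume geo-replication myvol remote1::myvol-dr start
</pre>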
<div><br>
</div>
<div>Will the network between your servers and the storage
nodes be able to handle that traffic (380 MB/s = 3,040 Mb/s)?</div>
<div><br>
</div>
<div>I guess Gluster can handle that load; you are
using big files, and that is where Gluster delivers
its highest throughput. Nevertheless, you will need
many disks to provide that I/O, even more if using
replicated bricks.</div>
<div><br>
</div>
</div>
</span>
<div class="gmail_extra"><span class=""><br clear="all">
<div>
<div>Regards,<br>
Mathieu CHATEAU<br>
<a moz-do-not-send="true"
href="http://www.lotp.fr" target="_blank">http://www.lotp.fr</a></div>
</div>
<br>
</span>
<div class="gmail_quote"><span class="">2015-07-14
21:15 GMT+02:00 Forrest Aldrich <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:forrie@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:forrie@gmail.com">forrie@gmail.com</a></a>></span>:<br>
</span>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Sorry, I
should have noted that. 380MB is both read and
write (I confirmed this with a developer).<br>
<br>
We do need the NFS stack, as that's how all the
code and the many Instances work -- we have
several "workers" that chop up video on the same
namespace. It's not efficient, but that's how
it has to be for now.<br>
<br>
Redundancy, in terms of the server? We have
RAIDed volumes, if that's what you're referring
to.<span class=""><br>
<br>
Here's a basic outline of the flow (as I
understand it):<br>
<br>
<br>
Video Capture Agent sends in a large video
file (30GB +/-)<br>
<br>
Administrative host receives and writes to NFS<br>
<br>
A process copies this over to another point in
the namespace<br>
<br>
Another Instance picks up the file, reads it,
processes it, and writes the output (FFmpeg is
involved)<br>
<br>
<br>
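(Purely illustrative, not our actual code -- a shell sketch of the
shape of that flow, with made-up paths, to show why the I/O adds up:)<br>
<pre>
# 1. Capture agent drops a ~30GB file into the NFS namespace:
#      /mnt/nfs/incoming/lecture.mp4
# 2. A process copies it to another point on the same mount:
cp /mnt/nfs/incoming/lecture.mp4 /mnt/nfs/staging/
# 3. A worker reads it back, transcodes with FFmpeg, writes the output:
ffmpeg -i /mnt/nfs/staging/lecture.mp4 -c:v libx264 -c:a copy \
       /mnt/nfs/output/lecture.mp4
# Every step moves the full file across NFS at least once.
</pre>
<br>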
</span> Something like that outline -- I may
not have all the steps, but essentially there's
a ton of I/O going on. I know our code model is
not efficient, but it's complicated and can't
just be changed (it's based on an open source
product and there's some code baggage).<span class=""><br>
<br>
We looked into another product that allegedly
scaled out using multiple NFS heads with
massive local cache (AWS instances) and
sharing the same space, but it was horrible
and just didn't work for us.<br>
<br>
<br>
<br>
Thank you. </span>
<div>
<div><span class=""><br>
<br>
<br>
<br>
<div>On 7/14/15 3:06 PM, Mathieu Chateau
wrote:<br>
</div>
</span>
<blockquote type="cite"><span class="">
<div dir="ltr">Hello,
<div><br>
</div>
<div>Is it 380 MB/s in read or write?
What level of redundancy do you
need?</div>
<div>Do you really need the NFS stack, or
just a mount point (and so be able to
use the native Gluster protocol)?</div>
<div><br>
</div>
<div>Gluster load is mostly put on the
clients, not the server (clients do the
synchronous writes to all replicas, and
do the memory caching).</div>
<div><br>
</div>
</div>
</span>
<div class="gmail_extra"><span class=""><br
clear="all">
<div>
<div>Regards,<br>
Mathieu CHATEAU<br>
<a moz-do-not-send="true"
href="http://www.lotp.fr"
target="_blank">http://www.lotp.fr</a></div>
</div>
<br>
</span>
<div class="gmail_quote"><span class="">2015-07-14
20:49 GMT+02:00 Forrest Aldrich <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:forrie@gmail.com"
target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:forrie@gmail.com">forrie@gmail.com</a></a>></span>:<br>
</span>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">I'm
exploring solutions to help us
achieve high throughput and
scalability within the AWS
environment. Specifically, I
work in a department where we handle
and produce video content that
results in very large files (30GB
etc) that must be written to NFS,
chopped up and copied over on the
same mount (there are some odd
limits to the code we use, but
that's outside the scope of this
question).<br>
<br>
Currently, we're using a commercial
vendor with AWS, with dedicated
Direct Connect instances as the back
end to our production. We're
maxing out at 350 to 380 MB/s, which
is not enough. We expect our
capacity will double or even triple
when we bring on more classes or
even other entities and we need to
find a way to squeeze out as much
I/O as we can.<span class=""><br>
<br>
Our software model depends on NFS,
there's no way around that
presently.<br>
<br>
</span> Since GlusterFS uses FUSE,
I'm concerned about performance,
which is a key issue. Sounds like
a striped volume would be appropriate.<br>
<br>
My basic understanding of Gluster is
that it can combine several "bricks"
(which could be multiple dedicated
EBS volumes, or even multiple
instances from the above commercial
vendor) and serve them up over NFS
as what is transparently a single
namespace to client connections.
The I/O could be distributed in
this manner, as sketched below.<br>
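<br>
(A rough sketch of what I mean, with made-up server, brick, and
volume names: files distributed across several bricks, mounted by
clients over NFS as one namespace:)<br>
<pre>
# Spread files across four bricks on two servers (names are made up):
gluster volume create videovol \
        server1:/bricks/b1 server1:/bricks/b2 \
        server2:/bricks/b1 server2:/bricks/b2
gluster volume start videovol

# Clients see a single namespace via Gluster's built-in NFSv3 server:
mount -t nfs -o vers=3 server1:/videovol /mnt/video
</pre>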
<br>
I wonder if someone here with more
experience might elaborate on
whether GlusterFS could be used in
the above scenario, specifically
regarding I/O performance. We'd
really like to gain as much as
possible, like 700 MB/s to 1 GB/s
and up if possible.<span class=""><br>
<br>
<br>
<br>
Thanks in advance.<br>
</span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
<span class=""> <br>
</span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
<br>
_______________________________________________<br>
Gluster-users mailing list<br>
<a moz-do-not-send="true"
href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a moz-do-not-send="true"
href="http://www.gluster.org/mailman/listinfo/gluster-users"
rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>