<div dir="ltr">Hello,<div><br></div><div>Ok you can stick with NFS, will just have to manage failover if needed.</div><div><br></div><div>So they use 4TB hard drive (80TB/20 disks).</div><div>each disk can provide let's say 150 io/s max. that means 3000 io/s max, without raid cost & co.</div><div><br></div><div>From your explaination, I guess you have many workloads running in parallel, and so 20 disks may not be enough anyway.</div><div><br></div><div>You first must be sure that storage can physically provide your needs in terms or capacity and performance. </div><div><br></div><div>Then you can choose solution that fit best your needs.</div><div><br></div><div>just my 2cts</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature">Cordialement,<br>Mathieu CHATEAU<br><a href="http://www.lotp.fr" target="_blank">http://www.lotp.fr</a></div></div>
<br><div class="gmail_quote">2015-07-14 22:21 GMT+02:00 Forrest Aldrich <span dir="ltr"><<a href="mailto:forrie@gmail.com" target="_blank">forrie@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
The instances we use via Direct Connect (a third-party company) have
upwards of 20 disks and a total of 80TB. That part is covered.<br>
<br>
If we were to experiment with EBS, that would be a different case as
we'd need to stripe them.<br>
<br>
Our present model requires a single namespace via NFS. The
Instances are running CentOS 6.x. The Instances mount the Direct
Connect disk space via NFS; the only other alternative we'd have is
iSCSI, which wouldn't work for the level of sharing we need.<span class=""><br>
<br>
<br>
<br>
<br>
<div>On 7/14/15 4:18 PM, Mathieu Chateau
wrote:<br>
</div>
</span><blockquote type="cite"><span class="">
<div dir="ltr">by NFS i think you just mean "all servers seeing
and changing sames files" ? That can be done with fuse, without
nfs.
<div>NFS is harder for failover while automatic with fuse (no
need for dynamic dns or virtual IP).</div>
<div><br>
</div>
<div>For redundancy I mean: what failures do you want to survive?</div>
<div>
<ul>
<li>Losing a disk</li>
<li>Filesystem corruption</li>
<li>Server lost or in maintenance</li>
<li>Whole region down</li>
</ul>
<div>Depending on your needs, you may have to replicate
data across gluster bricks, or even across geographically
dispersed bricks.</div>
</div>
<div><br>
</div>
<div>Will the network between your servers and the storage node be able to
handle that traffic (380 MB/s = 3040 Mb/s)?</div>
<div><br>
</div>
<div>I guess gluster can handle that load; you are using big
files, and this is where gluster delivers its highest throughput.
Nevertheless, you will need many disks to provide that I/O,
even more if you are using replicated bricks.</div>
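<div><br>
</div>
<div>As a very rough sizing sketch (the per-disk throughput and replica
count here are assumptions, not a recommendation):</div>
<pre>
# Rough spindle count for a sequential-throughput target (all figures assumed).
target_mb_s = 380            # current peak; try 760 or 1140 for 2-3x growth
per_disk_mb_s = 120          # sequential MB/s one SATA disk can sustain (assumed)
replica = 2                  # each write lands on this many bricks

backend_mb_s = target_mb_s * replica               # replicated writes multiply backend load
disks_needed = -(-backend_mb_s // per_disk_mb_s)   # ceiling division
print(f"backend must sustain ~{backend_mb_s} MB/s, i.e. at least {disks_needed} disks")
</pre>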
<div><br>
</div>
</div>
</span><div class="gmail_extra"><span class=""><br clear="all">
<div>
<div>Regards,<br>
Mathieu CHATEAU<br>
<a href="http://www.lotp.fr" target="_blank">http://www.lotp.fr</a></div>
</div>
<br>
</span><div class="gmail_quote"><span class="">2015-07-14 21:15 GMT+02:00 Forrest
Aldrich <span dir="ltr"><<a href="mailto:forrie@gmail.com" target="_blank">forrie@gmail.com</a>></span>:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Sorry, I should have
noted that. 380MB is both read and write (I confirmed
this with a developer).<br>
<br>
We do need the NFS stack, as that's how all the code and
the various Instances work -- we have several "workers"
that chop up video on the same namespace. It's not
efficient, but that's how it has to be for now.<br>
<br>
Redundancy, in terms of the server? We have RAID
volumes, if that's what you're referring to.<span class=""><br>
<br>
Here's a basic outline of the flow (as I understand it):<br>
<br>
<br>
A Video Capture Agent sends in a large video file (30 GB
+/-)<br>
<br>
An administrative host receives it and writes it to NFS<br>
<br>
A process copies it over to another point in the
namespace<br>
<br>
Another Instance picks up the file, reads it, and starts
processing and writing output (FFMPEG is involved)<br>
<br>
<br></span>
Something like that -- I may not have all the steps, but
essentially there's a ton of I/O going on.  I know our
code model is not efficient, but it's complicated and
can't just be changed (it's based on an open source
product and there's some code baggage).<span class=""><br>
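Putting very rough numbers on that (purely illustrative; the step list
and output size below are guesses, not measurements):<br>
<pre>
# Rough NFS traffic generated by one capture as it moves through the pipeline
# (illustrative figures only).
file_gb = 30
steps = [
    ("ingest write to NFS",   file_gb),        # admin host writes the upload
    ("copy read",             file_gb),        # copy to another namespace point
    ("copy write",            file_gb),
    ("transcode read",        file_gb),        # FFMPEG reads the source
    ("transcode write",       file_gb * 0.5),  # assumed output size
]
total_gb = sum(gb for _, gb in steps)
print(f"one {file_gb} GB capture generates roughly {total_gb:.0f} GB of NFS I/O")
</pre>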
<br>
We looked into another product that allegedly scaled out
using multiple NFS heads with massive local cache (AWS
instances) and sharing the same space, but it was horrible
and just didn't work for us.<br>
<br>
<br>
<br>
Thank you.
</span><div>
<div><span class=""><br>
<br>
<br>
<br>
<div>On 7/14/15 3:06 PM, Mathieu Chateau wrote:<br>
</div>
</span><blockquote type="cite"><span class="">
<div dir="ltr">Hello,
<div><br>
</div>
<div>Is it 380 MB/s in read or write? What level of
redundancy do you need?</div>
<div>Do you really need the NFS stack, or just a mount
point (so you could use the native gluster
protocol)?</div>
<div><br>
</div>
<div>Gluster load is mostly put on the clients, not
the servers (clients do the synchronous writes to all
replicas, and do the memory caching).</div>
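<div><br>
</div>
<div>One practical consequence (just a sketch; the replica count and NIC
speed below are assumptions): with the native client, the client itself
sends every write to each replica, so its network link has to carry the
application write rate multiplied by the replica count.</div>
<pre>
# Client-side write fan-out with the native gluster client (assumed figures).
app_write_mb_s = 380      # what the application writes
replica = 2               # number of copies kept by the volume
nic_mbit = 10000          # client NIC speed in Mb/s (10 GbE, assumed)

needed_mbit = app_write_mb_s * replica * 8   # MB/s to Mb/s, times replica count
print(f"client must push ~{needed_mbit} Mb/s over a {nic_mbit} Mb/s link")
</pre>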
<div><br>
</div>
</div>
</span><div class="gmail_extra"><span class=""><br clear="all">
<div>
<div>Regards,<br>
Mathieu CHATEAU<br>
<a href="http://www.lotp.fr" target="_blank">http://www.lotp.fr</a></div>
</div>
<br>
</span><div class="gmail_quote"><span class="">2015-07-14 20:49
GMT+02:00 Forrest Aldrich <span dir="ltr"><<a href="mailto:forrie@gmail.com" target="_blank">forrie@gmail.com</a>></span>:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I'm exploring
solutions to help us achieve high throughput
and scalability within the AWS environment.
Specifically, I work in a department where
we handle and produce video content that
results in very large files (30 GB, etc.) that
must be written to NFS, chopped up and copied
over on the same mount (there are some odd
limits to the code we use, but that's outside
the scope of this question).<br>
<br>
Currently, we're using a commercial vendor
with AWS, with dedicated Direct Connect
instances as the back end to our production.
We're maxing out at 350 to 380 MB/s, which is
not enough. We expect our capacity will
double or even triple when we bring on more
classes or even other entities, and we need to
find a way to squeeze out as much I/O as we
can.<span class=""><br>
<br>
Our software model depends on NFS, there's no
way around that presently.<br>
<br></span>
Since GlusterFS uses FUSE, I'm concerned about
performance, which is a key issue. It sounds
like a striped volume would be appropriate.<br>
<br>
My basic understanding of Gluster is that it has the
ability to combine several "bricks", which
could be multiple dedicated EBS
volumes or even multiple instances of the
above commercial vendor's storage, served up via an NFS
namespace that would be transparently a
single namespace to client connections.
The I/O could be distributed in this manner.<br>
<br>
I wonder if someone here with more experience
with the above might elaborate on whether
GlusterFS could be used in the above scenario.
Specifically, I/O performance. We'd really
like to push throughput as high as possible, like
700 MB/s to 1 GB/s and up if possible.<span class=""><br>
<br>
<br>
<br>
Thanks in advance.<br>
<br>
<br>
<br>
<br>
<br>
</span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div><span class="">
<br>
</span></blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
<br>_______________________________________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br></blockquote></div><br></div>