The Gluster Blog

Gluster blog stories provide high-level spotlights on our users all over the world

Rough data density calculations

Gluster
2014-08-27

Seagate has just publicly announced 8TB HDD’s in a 3.5″ form factor. I decided to do some rough calculations to understand the density a bit better…

Note: I have decided to ignore the distinction between Terabytes (TB) and Tebibytes (TiB), since I always work in base 2, but I hate the -bi naming conventions. Seagate is most likely announcing an 8TB HDD, which is actually smaller than a true 8TiB drive. If you don’t know the difference it’s worth learning.

Rack Unit Density:

Supermicro sells a high density, double-sided 4U server, which can hold 90 x 3.5″ drives. This means you can easily store:

90 * 8TB = 720TB in 4U,

or:

720TB/4U = 180TB per U.

To store a petabyte of data, since:

1PB = 1024TB,

we need:

1024TB/180TB/U = 5.68 U.

Rounding up we realize that we can easily store one petabyte of raw data in 6U.

Since an average rack is usually 42U (tall racks can be 48U) that means we can store between seven and eight PB per rack:

42U/rack / 6U/PB = 7PB/rack

48U/rack / 6U/PB = 8PB/rack

If you can provide the power and cooling, you can quickly see that small data centers can easily get into exabyte scale if needed. One raw exabyte would only require:

1EB = 1024PB

1024PB/7PB/rack = 146 racks =~ 150 racks.

Raid and Redundancy:

Since you’ll most likely have lots of failures, I would recommend having some number of RAID sets per server, and perhaps a distributed file system like GlusterFS to replicate the data across different servers. Suppose you broke each 90 drive server into five separate RAID 6 bricks for GlusterFS:

90/5 = 18 drives per brick.

In RAID 6, you loose two drives to parity, so that means:

18 drives – 2 drives = 16 drives per brick of usable storage.

16 drives * 5 bricks * 8 TB = 640 TB after RAID 6 in 4U.

640TB/4U = 160TB/U

1024TB/160TB/U = 6.4TB/U =~ 7PB/rack.

Since I rounded a lot, the result is similar. With a replica count of 2 in a standard GlusterFS configuration, you average a total of about 3-4PB of usable storage per rack. Need a petabyte scale filesystem? One rack should do it!

Other considerations:

  • Remember that you need to take into account space for power, cooling and networking.
  • Keep in mind that SMR might be used to increase density even further (unless it’s not already being used on these drives).
  • Remember that these calculations were done to understand the order of magnitude, and not to get a precise measurement on the size of a planned cluster.
  • Petabyte scale is starting to feel small…

Conclusions:

Storage is getting very inexpensive. After the above analysis, I feel safe in concluding that:

  1. Puppet-Gluster could easily automate a petabyte scale filesystem.
  2. I have an embarrassingly small amount of personal storage.

Hope this was fun,

Happy hacking,

James

 

Disclaimer: I have not tried the 8TB Seagate HDD’s, or the Supermicro 90 x 3.5″ servers, but if you are building a petabyte scale cluster with GlusterFS/Puppet-Gluster, I’d like to hear about it!

 

BLOG

  • 06 Dec 2020
    Looking back at 2020 – with g...

    2020 has not been a year we would have been able to predict. With a worldwide pandemic and lives thrown out of gear, as we head into 2021, we are thankful that our community and project continued to receive new developers, users and make small gains. For that and a...

    Read more
  • 27 Apr 2020
    Update from the team

    It has been a while since we provided an update to the Gluster community. Across the world various nations, states and localities have put together sets of guidelines around shelter-in-place and quarantine. We request our community members to stay safe, to care for their loved ones, to continue to be...

    Read more
  • 03 Feb 2020
    Building a longer term focus for Gl...

    The initial rounds of conversation around the planning of content for release 8 has helped the project identify one key thing – the need to stagger out features and enhancements over multiple releases. Thus, while release 8 is unlikely to be feature heavy as previous releases, it will be the...

    Read more