The Gluster Blog

Gluster blog stories provide high-level spotlights on our users all over the world

A Picture Can Beat 1000 Dead Horses

Gluster
2013-04-25

Unless this is your first time reading my blog, you are probably aware that I am beginning to become obsessed with the idea of a data transfer service.  In this post I continue the topic from my previous post by introducing a couple of diagrams.

the_wild_west

A diagram of a possible swift deployment is on the right side.  On the left is a client to that service.  The swift deployment is very well managed, redundant and highly available.  The client speaks to the swift via a well defined REST API and using supported client side software to interpret the protocol.  However, between the server side network protocol interpreter and the client side network protocol interpreter is the wild west.

The wild west is completely unprotected and unmanaged. Many things can occur that cause a lost, slow, or disruptive transfer.  For example

  • Dropped connections
  • Congestion events
  • Network partitions

Such problems make data transfer expensive.  Ideally there would be a service to oversee the transfer.  Transfer could be check-pointed as they progress so that if a connection is dropped it could be restarted with minimal loss.  Also it could try to maximize the efficiency of the pathway between the source and the destination by tuning the protocols in use (like setting a good value for the TCP window), or using multicast protocols where appropriate (like bittorrent), or scheduling transfers so as to not shoot itself in the foot.

A safer architecture would look like this:

tamed_west

The transfer service is now in a position to manage the transfer thus it allows for the following:

  • A fire and forget asynchronous transfer request from the client.
  • Baby sit and checkpoint the transfer.  If it fails restart it from the last checkpoint.
  • Schedule transfer for optimal times.
  • Prioritize transfers and their use of the network.
  • Coalesce transfer requests and schedule appropriately and into multicast sessions.
  • Negotiate the best possible protocol between the two endpoints.
  • Verify that the data successfully is written to the destination storage system and verify its integrity.

BLOG

  • 06 Dec 2020
    Looking back at 2020 – with g...

    2020 has not been a year we would have been able to predict. With a worldwide pandemic and lives thrown out of gear, as we head into 2021, we are thankful that our community and project continued to receive new developers, users and make small gains. For that and a...

    Read more
  • 27 Apr 2020
    Update from the team

    It has been a while since we provided an update to the Gluster community. Across the world various nations, states and localities have put together sets of guidelines around shelter-in-place and quarantine. We request our community members to stay safe, to care for their loved ones, to continue to be...

    Read more
  • 03 Feb 2020
    Building a longer term focus for Gl...

    The initial rounds of conversation around the planning of content for release 8 has helped the project identify one key thing – the need to stagger out features and enhancements over multiple releases. Thus, while release 8 is unlikely to be feature heavy as previous releases, it will be the...

    Read more