The Gluster Blog

Gluster blog stories provide high-level spotlights on our users all over the world

Fun With Renaming Files

Gluster
2012-10-22

One of the “fun” things about consistent hashing in an actual filesystem, as opposed to an object or key/value store, is that you have to support rename operations. Supporting all of the required POSIX semantics around rename can introduce many kinds of complexity, which I’ll probably write about some time, but right now I’ll just focus on one kind of complexity that affects even simple rename operations. Consider the following quite-common sequence to update a file fubar.txt:

  1. Create .fubar.txt.tmp0123
  2. Write everything into the new file
  3. Rename .fubar.txt.tmp0123 over fubar.txt
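For concreteness, here is a minimal Python sketch of that sequence. The helper name `atomic_update` and the exact temp-file naming are my own illustration, not anything GlusterFS-specific; any application that follows the idiom produces the same create/write/rename pattern.

```python
import os
import tempfile

def atomic_update(path, data):
    """Replace the contents of `path` using the create/write/rename idiom."""
    directory, name = os.path.split(os.path.abspath(path))
    # 1. Create a hidden temp file next to the target, e.g. .fubar.txt.tmpXXXX.
    fd, tmp_path = tempfile.mkstemp(prefix="." + name + ".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            # 2. Write everything into the new file and push it to storage.
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # 3. Rename the temp file over the target.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no stray temp file behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling `atomic_update("fubar.txt", b"new contents")` leaves either the old file or the new one in place, never a half-written mix.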

The idea here is to rely on the atomic nature of rename to make the entire update atomic[1]. The problem this poses for GlusterFS has to do with the way we use consistent hashing to locate files. A file’s hash is based on its name, so when the name changes the hash might change too. Therefore, after the rename the file might be on the “wrong” server according to the hash, but we really don’t want to incur the expense of moving it. What we do instead is create a linkfile on the new correct server, pointing to where the file really is. The catch is that creating the link has a cost, and following it on later lookups has another. This cost can be surprisingly high for some workloads, and it turns out that two of our own components, UFO and geosync, are affected by it.
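To make the linkfile mechanism concrete, here is a toy model in Python. Everything in it is an assumption for illustration: `pick_server` uses MD5 modulo the server count as a stand-in for the real hash and layout, and `ToyCluster` keeps each "server" as a dictionary. It is not how GlusterFS is implemented, only a sketch of the behavior described above.

```python
import hashlib

def pick_server(name, num_servers):
    """Toy placement: hash the name, take it modulo the server count."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_servers

class ToyCluster:
    def __init__(self, num_servers):
        # Each server maps name -> ("data", contents) or ("link", server_index).
        self.servers = [{} for _ in range(num_servers)]

    def _hashed(self, name):
        return pick_server(name, len(self.servers))

    def create(self, name, contents):
        self.servers[self._hashed(name)][name] = ("data", contents)

    def lookup(self, name):
        """Return (contents, hops); hops is 2 when a linkfile was followed."""
        kind, value = self.servers[self._hashed(name)][name]
        if kind == "link":
            return self.servers[value][name][1], 2
        return value, 1

    def rename(self, old, new):
        hashed_old = self._hashed(old)
        kind, value = self.servers[hashed_old].pop(old)
        # If the old name was itself a linkfile, the data lives elsewhere.
        home = value if kind == "link" else hashed_old
        contents = self.servers[home].pop(old)[1] if kind == "link" else value
        # Drop whatever the new name used to refer to (data or linkfile).
        for server in self.servers:
            server.pop(new, None)
        # The data stays where it is; only the name changes.
        self.servers[home][new] = ("data", contents)
        hashed_new = self._hashed(new)
        if hashed_new != home:
            # Extra cost: a linkfile on the server the new name hashes to.
            self.servers[hashed_new][new] = ("link", home)
```

Whenever the temp name and the final name hash to different servers, the rename creates a linkfile and every later `lookup` of the final name takes two hops instead of one.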

We already had a fix in place for rsync (which is used by geosync). However, that doesn’t help anything else. Therefore, over the weekend I created a patch to handle the more general case. The patch lets the user specify a regular expression that separates the final form of the name from the transient prefix/suffix associated with the temp-file idiom above. We then apply that expression internally to put the temp file in the right place to start with, so the subsequent rename happens entirely within a single server and doesn’t require a linkfile.
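Here is roughly what that looks like, again as a hedged Python sketch rather than the actual patch. The pattern `TEMP_NAME`, the helpers `hash_key` and `hashed_server`, and the MD5-modulo placement are all hypothetical; the regular expression is one I made up to match the `.fubar.txt.tmp0123` example, and a real deployment would supply its own.

```python
import hashlib
import re

# Hypothetical pattern for the idiom above: a leading dot, the final name,
# then ".tmp" and some digits. Group 1 captures the final name.
TEMP_NAME = re.compile(r"^\.(.+)\.tmp[0-9]+$")

def hash_key(name, pattern=TEMP_NAME):
    """Return the part of the name that should be hashed."""
    match = pattern.match(name)
    # Temp files hash as their eventual name; everything else hashes as-is.
    return match.group(1) if match else name

def hashed_server(name, num_servers, pattern=TEMP_NAME):
    """Toy placement (MD5 modulo server count) applied to the hash key."""
    digest = hashlib.md5(hash_key(name, pattern).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_servers
```

With this in place, `.fubar.txt.tmp0123` and `fubar.txt` always land on the same server, so the rename never needs a linkfile. Names that don't match the pattern, such as `.bashrc`, are hashed unchanged.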

It’s worth noting that we wouldn’t have this problem if we located files by hashing immutable GFIDs instead of mutable names, and that for directory traversals etc. we do have the GFID in hand. However, not all lookups occur that way. For the remainder, we’d have to fall back to querying every server, which would have a catastrophic effect on performance. We could go to a scheme where we always query the “right” server for a file’s parent directory if we don’t already have a GFID, and make sure that server has a linkfile pointing to the file’s real location, but then we’re kind of right back where we started. Actually we’d be even worse off, because the extra linkfile would be needed even for files that were never renamed. We’re better off with name-based hashing plus linkfiles plus rename-target prediction, as inelegant as that combination might be.

[1] The rename might actually not be reflected on storage unless/until you call fsync on the parent directory, but that’s another future post.
