GlusterFS Features

From GlusterDocumentation

  • User-space Design: A user-space design comes with a number of advantages. No kernel patches or kernel modules are required. Complex features can be added relatively easily. The code is easy to debug and maintain. Bugs do not crash the OS. With these advantages, GlusterFS can run as fast as, or even faster than, a kernel-based file system.
  • Stackable Modules: The stackable modular design allows GlusterFS features to be extended beyond the scope of a regular file system without compromising the elegance of the framework. Almost all of the features (from performance options and distributed locking to replication and striping) are implemented as stackable modules (translators). Users can select the translators appropriate to their application and hardware needs and build an optimized storage system. GlusterFS borrowed the concept of a user-space stackable file system from the GNU Hurd kernel.
  • No Meta-data: Unlike other cluster file systems, which address parallelization at the block level, GlusterFS engineers believed that the problem lies at the volume management and I/O scheduling level. This allows meta-data to be offloaded to the mature underlying disk file systems. Eliminating the centralized meta-data server gives GlusterFS a significant scaling and reliability advantage.
  • Self-healing: As your volume size grows beyond 32 TB, fsck (filesystem check) downtime becomes a huge problem. GlusterFS has no fsck. It heals itself transparently with very little impact on performance.
  • NFS-like Backend: Users' files and folders are stored as they are on the backend. Users can always access the data through scp or ftp (as with NFS), even without GlusterFS installed. This simplicity gives a lot of confidence when scaling to multiple petabytes.
  • Automatic Replication: The Automatic File Replication (AFR) feature in GlusterFS replicates all your I/O in real time. With AFR, GlusterFS can withstand hardware failures.
  • Aggregation: The Unify feature in GlusterFS allows aggregation of various storage bricks (servers) into one large volume. It distributes data at the file level. The distribution policy is decided by the chosen I/O scheduler.
  • Scalable Striping: GlusterFS striping scales to a huge number of bricks, unlike a meta-data based approach. Even striped files can easily be recovered by simply dd'ing strided blocks back into regular files.
  • Pluggable I/O Schedulers: Users can choose different I/O schedulers depending on the application's requirements. Available options are the adaptive-least-usage self-tuning I/O scheduler, the round-robin I/O scheduler, the non-uniform-memory-access I/O scheduler, the random I/O scheduler and the wild-card scheduler. It is fairly easy to develop custom schedulers.
  • Pluggable Transport: GlusterFS supports TCP/IP based networks such as Fast Ethernet, GigE and 10 GigE, as well as RDMA based InfiniBand. Available transport options are TCP, IB-verbs and Unix-IPC.
  • Pluggable Auth: GlusterFS supports IP-based and username/password-based authentication. The auth interface can be extended to support MySQL or LDAP based authentication.
  • Distributed Locking: The Locks translator in GlusterFS supports full-featured POSIX distributed locking.
  • Distributed BDB: The BerkeleyDB backend module enables GlusterFS to store small files very efficiently. Billions of small files can be packed into fewer BDB files spread across multiple storage bricks. The user is still presented with a POSIX-compliant file system view.
  • Embeddable: The entire GlusterFS filesystem can be embedded into a web farm of Apache or Lighttpd web servers. This enables web requests to bypass the kernel and access data directly. In particular, if you have InfiniBand, Apache or Lighttpd won't even know it is performing RDMA I/O.
  • Performance Modules: A number of performance modules, such as IO-Cache, IO-Threads, Read-Ahead and Write-Behind, are available for optimizing your storage performance.
  • Flexible Volume Management: Every feature in GlusterFS (from network, scheduler, cache to disk) is represented as a logical volume. Users can stack them up in any meaningful order to build a highly customized, optimized storage environment.
  • Undelete: The Trashcan module provides undelete functionality by transparently moving all deleted/modified files into a /trash directory.
  • Encryption: As of now, GlusterFS supports only a rot-13 encryption module. Rot-13 is a ridiculously weak encryption algorithm. The main purpose of this module is to act as a reference implementation for future development.
  • Trace: Application I/O can be traced call by call by inserting the trace translator at specific points in the file system. This is useful for debugging.
  • Meta: Provides a virtual /proc-like interface for management and monitoring.
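
The Aggregation and Pluggable I/O Schedulers items above say that files are distributed across bricks according to the chosen scheduler. The following Python sketch illustrates only the round-robin idea. It is not GlusterFS code: the class name, the `schedule` method and the brick names are hypothetical and chosen for illustration.

```python
from itertools import cycle


class RoundRobinScheduler:
    """Hypothetical scheduler: hands out bricks in a fixed rotating order."""

    def __init__(self, bricks):
        if not bricks:
            raise ValueError("at least one brick is required")
        self._next_brick = cycle(list(bricks))

    def schedule(self, path):
        # The path is ignored here; a different policy (for example one
        # based on free space) could use it or other per-brick statistics.
        return next(self._next_brick)


def place_files(paths, scheduler):
    """Decide, file by file, which brick each new file is created on."""
    return {path: scheduler.schedule(path) for path in paths}


placement = place_files(
    ["a.txt", "b.txt", "c.txt", "d.txt"],
    RoundRobinScheduler(["brick1", "brick2", "brick3"]),
)
```

Because the policy sits behind a single method, swapping in a random or usage-based policy would only mean supplying another object with the same `schedule` method.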
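
The Scalable Striping item above mentions recovering a striped file by copying strided blocks back into a regular file. The Python sketch below shows that idea under a loudly stated assumption that is not confirmed by this page: block i of the original file is stored at its original byte offset in the copy held by brick i mod N. The function names and the layout are illustrative, not the GlusterFS on-disk format.

```python
import io


def reassemble_striped(parts, block_size, total_size):
    """Rebuild a file from per-brick copies.

    Assumed (hypothetical) layout: block i of the original file sits at its
    original byte offset inside parts[i % len(parts)].
    """
    out = bytearray(total_size)
    n = len(parts)
    for index, offset in enumerate(range(0, total_size, block_size)):
        src = parts[index % n]
        src.seek(offset)
        chunk = src.read(min(block_size, total_size - offset))
        out[offset:offset + len(chunk)] = chunk
    return bytes(out)


def split_striped(data, block_size, n):
    """Build the per-brick copies for the demo, using the same assumed layout."""
    parts = [bytearray(len(data)) for _ in range(n)]
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]
        parts[index % n][offset:offset + len(block)] = block
    return [io.BytesIO(bytes(p)) for p in parts]


original = bytes(range(10))
bricks = split_striped(original, block_size=4, n=3)
recovered = reassemble_striped(bricks, block_size=4, total_size=len(original))
```

Each loop iteration corresponds to one strided copy of the kind a `dd` invocation with matching block size and offsets would perform; no meta-data lookup is needed because the block's position determines which brick holds it.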
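
The Encryption item above calls rot-13 ridiculously weak. A minimal Python sketch of the transform makes the reason visible: there is no key, and applying it twice returns the original data. This is an independent illustration, not the GlusterFS module itself.

```python
import string

_LOWER = string.ascii_lowercase
_UPPER = string.ascii_uppercase

# Map every ASCII letter to the letter 13 places further on, wrapping around.
_ROT13_TABLE = bytes.maketrans(
    (_LOWER + _UPPER).encode("ascii"),
    (_LOWER[13:] + _LOWER[:13] + _UPPER[13:] + _UPPER[:13]).encode("ascii"),
)


def rot13(data: bytes) -> bytes:
    """Rotate ASCII letters by 13 places; all other bytes pass through."""
    return data.translate(_ROT13_TABLE)


scrambled = rot13(b"Hello, GlusterFS")
restored = rot13(scrambled)
```

Since the same function both scrambles and restores the data, anyone who can read the stored bytes can read the content, which is why the module is only suitable as a reference for building real encryption translators.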