New-doc
FIXME: work in progress - suggestions welcome
Introduction
What is GlusterFS
A highly scalable clustered filesystem implemented in userspace (using FUSE). It has a highly adaptable, layered design, is feature rich and POSIX compliant, and can work on any type of interconnect: Infiniband (IB), Gigabit Ethernet, or 10 Gigabit Ethernet.
Why we wrote another filesystem
We ourselves wanted to deploy a large filesystem for one of our customers, who needed more than a petabyte of storage. For a volume of that size, we needed a filesystem which addresses reliability, maintainability (ease of use), and scalability. We couldn't find all three in a single filesystem, which is why we started our own.
Advantages of being in Userspace
It is well known that filesystems are a core part of the OS and hence usually live in kernel space. But when it comes to network filesystems, this is not a requirement. The delay caused by network latency is much larger than the context switch overhead caused by being in userspace. And if the network latency is reduced by using Infiniband or 10 Gigabit Ethernet cards, one can do RDMA from userspace to remote machines, so the question of context switches doesn't arise. Being in userspace therefore costs little in performance.
Also, as we are in userspace, the development cycle for any feature is much shorter than for a kernel based filesystem. If there is any problem with the filesystem, an application restart (say, a umount/mount) is enough. No need for reboots, no nightmares of kernel panics.
Copyright
GlusterFS is distributed under the GNU General Public License version 3 or later (GPLv3), and its documentation is released under the GNU Free Documentation License 1.2 or later. Gluster.com (Z Research Inc) owns the copyright of the product. For more questions about usage, contributions and licensing, see our Legal FAQ section.
Support
Gluster.com (Z Research Inc) follows a support/subscription business model for the product. There is only one code base, which is given to both community users and paid customers. In fact, users have direct read (checkout) access to our source archive, through which they can get new features before enterprise customers do. The version which the developers consider more stable is the one pushed to paid customers.
Free Software community
- Gluster developer mailing list
The Gluster developer mailing list is where discussions are mostly about purely technical problems, like compilation issues, bugs, and new feature development.
- Gluster user mailing list
The Gluster user mailing list is open for discussions about configuration and management.
- #gluster irc channel
The IRC channel #gluster (on irc.gnu.org or irc.freenode.net) can be used to talk directly with GlusterFS experts. As the chat log is indexed, you can leave your questions there; they will be answered as developers or users become available.
- Roadmap open for requests
The GlusterFS Roadmap page is open for requests from the community. If a feature gets more requests, it is pulled up in the roadmap.
Paid Subscription
Visit the Subscription Request page to get a quote based on your requirements. If you take a subscription package, your Roadmap requests and your queries will be given higher priority by the development team.
Features in GlusterFS
It is surely very hard to explain the features of GlusterFS in a few words. To list a few:
- Fully POSIX compliant.
- The features are implemented as layers (called Translators), and are very modular.
- Scales seamlessly to a larger number of servers and more capacity.
- No 'fsck'; errors are self-healed.
- Files are kept as files and folders in the backend (remember NFS?), so recovering data without GlusterFS is very simple.
- Works on multiple OSes and different hardware.
- Cost effective and user friendly.
Translators
- unify
- stripe (raid-0 like)
- afr or Automatic File Replication (raid-1 like)
- dht or Distributed Hash Table
- write-behind
- read-ahead
- io-threads
- io-cache
- posix
- BDB or Berkeley DB storage backend
- filter
- quota
- posix-locks
- trash
Protocol
Earlier versions of GlusterFS used an ASCII based protocol, which let us concentrate on other features and functionality without worrying much about modifying the protocol syntax. Once the product started getting wider acceptance, better performance and lower CPU usage became requirements. GlusterFS is no longer a proof of concept or a fancy, research oriented filesystem. Rather, it aims to be an enterprise class large filesystem which can address a variety of storage problems, from large files to small files. Hence there was a need for a standard binary protocol, which reduces the CPU overhead caused by string comparison/conversion and the extra data transferred over the wire. Both of these were major hurdles to achieving the best possible performance. So now we have a binary protocol between the client and server processes with very little header overhead, which makes small file performance very attractive.
Transports
A common question when you encounter a network based product is: what interface does it need? GlusterFS has different transport modules. It works fine with the TCP/IP stack (both IPv4 and IPv6), and also has a module to work with IB (Infiniband) verbs.
The following transport modules are supported in GlusterFS:
- tcp/ip (both ipv4 and ipv6)
- ib-verbs (Infiniband native RDMA support)
- ib sdp (IB socket direct protocol)
- unix sockets
Authentication
Currently GlusterFS implements very minimal authentication modules.
Address based
Authentication is done based on the client's IP address.
Username based
Authentication is done based on a username and password.
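As a rough illustration, both modules are configured as options on the protocol/server volume of the server specfile. The sketch below writes such a fragment to a temporary file. The volume name brick, the address pattern, the username alice and the password secret are placeholders made up for this example, and the option names are given as we understand them for releases of this era, so check them against the version you installed.

```shell
# Sketch only: volume name, address pattern, username and password are placeholders.
AUTH_SPEC="$(mktemp)"
cat > "$AUTH_SPEC" <<'EOF'
volume server
  type protocol/server
  option transport-type tcp
  # address based: allow clients matching this address pattern
  option auth.addr.brick.allow 192.168.1.*
  # username based: allow this user, with this password
  option auth.login.brick.allow alice
  option auth.login.alice.password secret
  subvolumes brick
end-volume
EOF
cat "$AUTH_SPEC"
```

This is only a fragment: the subvolume brick it refers to would have to be defined earlier in the same specfile.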
Booster
Doing I/O without going through the FUSE layer.
libglusterfsclient
Some applications may prefer to use library calls which access the filesystem directly instead of going through the FUSE layer.
mod_glusterfs
A web server embeddable GlusterFS module. Works fine on Apache 1.3.9 and on Lighttpd (Lighty) 1.4 and 1.5.
Installation process
The first thing we recommend to any interested user is to try out the product themselves to see whether it suits their needs. Heavy marketing is for products which don't do so well technically; for us, the simplicity of the code, the ease of management, and the functionality and performance are the marketing. Installation is the first step to get a feel for it.
Download
Check the download page for the latest version.
Dependencies
Basic Dependencies
- FUSE is the primary requirement for GlusterFS to work. Nowadays it is part of most OSes (sorry, it is not yet stable on MS Windows).
- Extended attribute support in the backend (exporting) filesystem. This may not be required in all cases, but it is required by some of the key features.
Misc/Feature supports
- OFED stack: required if you have an Infiniband network.
- Apache / Lighttpd: required if you want 'mod_glusterfs', a web server embeddable module which doesn't need the FUSE layer to see the filesystem behind it.
- Berkeley DB: required for the BDB backend, which stores very small files as records.
Installation
Here are some distro specific installation steps for GlusterFS. If your OS distribution is not one of those mentioned below, try compiling from source.
After installation, make sure it was successful by checking the version of GlusterFS.
bash# glusterfs -V
Generally it is advised to move to newer releases, as they come with extra features and bug fixes. But you are the best person to judge which version works well for you.
GNU/Linux
'rpm' based distros
One can use the rpm available on the GlusterFS ftp site.
bash# rpm -ivh glusterfs-<version>.rpm
It can be used on Fedora, openSUSE, Red Hat and CentOS distributions.
'deb' based distros
Currently a few contributors maintain the GlusterFS Debian package. If the latest version is available in the Debian repository, you just need to do
bash# apt-cache search glusterfs
bash# apt-get install <glusterfs-*> # whatever the above search shows
If the latest package is not available, you need to install from source, which is described in the following section.
Install from Source
The source tarball is available in the ftp repository of the project. Get the latest version available.
bash# tar -xzf glusterfs-<version>.tar.gz
bash# cd glusterfs-<version>
bash# ./configure > /dev/null
GlusterFS configure summary
===========================
FUSE client        : yes
Infiniband verbs   : yes
epoll IO multiplex : yes
Berkeley-DB        : yes
libglusterfsclient : yes
mod_glusterfs      : yes
argp-standalone    : no
bash# make && make install
bash# ldconfig
bash# glusterfs -V
NOTE: By default ./configure uses /usr/local/ as the installation path (prefix). If you want a different path, just add --prefix=/<your>/<path> to ./configure.
OS X
On Mac, though the source tarball can be built without any problems, you may prefer the click-to-install .dmg images available from our ftp site. Click on the .dmg image after download and you will get a glusterfs package, which needs to be installed by clicking on it again. If you are installing on a remote machine from a terminal, you can use the commands below to install glusterfs.
bash# hdiutil attach glusterfs-<version>.dmg
bash# installer -pkg /Volumes/glusterfs-<version>/glusterfs-<version>.pkg -target /
bash# hdiutil detach /Volumes/glusterfs-<version>/
NOTE: Please go through the 'README.MacOS' file available with the .dmg image for the complete installation steps and any version specific information.
Solaris
On Solaris, one needs to set the PATH variable properly before building GlusterFS. GlusterFS has not been tested with the Sun Studio compiler. It builds fine with GNU make and gcc.
bash# export PATH=/usr/sfw/bin:$PATH
bash# gunzip glusterfs-<version>.tar.gz
bash# tar -xf glusterfs-<version>.tar
bash# cd glusterfs-<version>
bash# ./configure && make && make install
bash# glusterfs -V
BSD
Only tested on FreeBSD 7 or later.
bash# gunzip glusterfs-<version>.tar.gz
bash# tar -xf glusterfs-<version>.tar
bash# cd glusterfs-<version>
bash# ./configure && make && make install
bash# glusterfs -V
If you find any problems, write to the developer mailing list. For quick help, please provide complete information such as machine type, OS version, and the log messages from the failure.
Configuration
We hope your installation process was simple enough. Now let's move to the configuration part, which is also simple, but one needs to understand the design of volume specification files before anything else.
If you don't want to spend more time learning this and want to start right away, there are standard volume specification files available. Just copy them to the proper path, change the IP addresses to match your network, and you are all set to mount GlusterFS.
Volume specification
The volume specification file (specfile for short) is the mechanism through which GlusterFS learns how it has to behave and how it has to load its translators to give a feature rich filesystem. In GlusterFS, all the translators are dynamically loadable shared object libraries. From the specfile, glusterfs gets a graph of translators which defines its behavior. The art of writing a good volume specfile can make you a GlusterFS 'GURU'. So our personal advice is to read this section carefully and understand it :-)
Importance
As said earlier, the specfile defines the behavior of the filesystem, which means it can make it a perfect solution, or a product which doesn't work at all. Think of a Swiss army knife. When all the tools are hidden, it is a small, heavy piece of metal. When you use the appropriate tool for the task at hand, you realize it can solve many problems in daily life. And if you pull out all the tools at once, you can't even hold it. The point is that with the layered approach of GlusterFS and its many features, you can solve most of your storage problems, but proper tuning of parameters and proper organization of translators in the volume specfile are required to get the best performance and stability. Every day, we try to make sure that even the worst possible combination still works functionally.
You just need to understand that the volume specfile is important. Please make sure that you send the volume specfile (which is written to the logfile when the process starts) with your bug report :-)
Syntax
GlusterFS has a strict syntax check on the volume specfile. The keywords allowed are 'volume' , 'end-volume' , 'subvolumes' , 'type' and 'option' .
A snippet of a volume specfile looks like this:
volume union
  type cluster/unify
  option scheduler rr
  option namespace ns
  option self-heal on
  subvolumes client1 client2 client3 client4 client5
end-volume
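To make the syntax more concrete, here is a minimal sketch of a server specfile and a matching client specfile, written from a shell so it can be pasted as-is. The volume names (brick, server, client1), the export directory /data/export and the address 192.168.1.10 are placeholders made up for illustration, and the translator and option names follow the specfile syntax as we understand it for releases of this era, so verify them against your version.

```shell
# Sketch only: all names, paths and addresses below are placeholders.
SPEC_DIR="$(mktemp -d)"

# Server side: export a backend directory over TCP.
cat > "$SPEC_DIR/server.vol" <<'EOF'
volume brick
  type storage/posix
  option directory /data/export
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  subvolumes brick
end-volume
EOF

# Client side: connect to the volume "brick" exported by the server above.
cat > "$SPEC_DIR/client.vol" <<'EOF'
volume client1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.10
  option remote-subvolume brick
end-volume
EOF

ls "$SPEC_DIR"
```

Each volume block names one translator instance with 'type', configures it with 'option' lines, and stacks it on other volumes with 'subvolumes'. A volume such as the unify one would list several client volumes like client1 as its subvolumes.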
The GlusterFS syntax is pretty simple, but there can still be errors while writing or editing a specfile. Hence the GlusterFS team provides 'emacs' and 'vim' specific syntax modes.
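Independent of editor support, mistyped keywords can be caught with a few lines of shell. The function below is a rough pre-check we sketch for illustration, not the parser GlusterFS itself uses: check_spec is a hypothetical helper name, and it only verifies that every non-blank, non-comment line starts with one of the five keywords. It does not check nesting, translator types or option names.

```shell
# check_spec FILE: report lines whose first word is not a specfile keyword.
# Returns non-zero if any such line is found.
check_spec() {
  awk '
    /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
    $1 != "volume" && $1 != "end-volume" && $1 != "subvolumes" &&
    $1 != "type" && $1 != "option" {
      printf "line %d: unknown keyword: %s\n", NR, $1
      bad = 1
    }
    END { exit bad }
  ' "$1"
}

# Try it on a sample with a deliberate typo ("optoin").
SAMPLE="$(mktemp)"
printf 'volume union\n  type cluster/unify\n  optoin scheduler rr\nend-volume\n' > "$SAMPLE"
check_spec "$SAMPLE" || echo "specfile has errors"
```

GlusterFS performs its own strict check when it loads the file, so a script like this is only a convenience for catching typos earlier.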
'emacs' mode
Download the syntax mode file here - glusterfs-mode.el
Add the following lines to your '~/.emacs' file:
;; (add-to-list 'load-path "<directory path, which contains glusterfs-mode.el>")
(add-to-list 'load-path "/usr/share/doc/glusterfs/")
(require 'glusterfs-mode)
Now when you open '*.vol' files in emacs, you will see different colors for the GlusterFS syntax. If the volume specfile name ends with a different extension (other than .vol), do 'M-x glusterfs-mode' to get syntax highlighting.
'vim' mode
Download the syntax mode file here - glusterfs.vim
You can enable it in vim's command mode by typing 'source glusterfs.vim'. You will then see color coding for volume specfiles.
Example
- NFS Like Standalone Storage Server
- Automatic File Replication (Mirror) across Two Storage Servers
- Aggregating Three Storage Servers with Unify
- Striping Across Four Storage Servers
- Mixing Unify and Automatic File Replication
- Mixing Unify and Stripe
- Unify over AFR
- Unify NUFA with single process
- AFR single process
- Fully loaded Simple two nodes Unify
NOTE: The volume specfiles listed above are for functionality tests only. They may not include any, or all, of the performance tuning translators.
Management scripts on OS
When you install glusterfs, the 'mount.glusterfs' script is installed in /sbin, so the administrator has the option of listing the glusterfs mountpoint in /etc/fstab like any other filesystem. This removes a lot of management overhead for admins.
That is the best case, which covers about 90% of setups. Because GlusterFS is so configurable, a mountpoint entry in /etc/fstab is sometimes not enough (for example, when a server process and a client process run on the same machine), and sometimes no entry in /etc/fstab is required at all (when a machine acts only as a storage server). In such cases admins may want to use 'init.d' scripts to start the server processes. Check out extras/init.d/* in the source tarball for suitable init.d scripts. It is not difficult to write one yourself either.
NOTE: Refer to the Roadmap, where we have a WebUI planned to monitor other aspects of a network filesystem; it is intended to handle most of the manageability issues.
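For the common /etc/fstab case, an entry could look like the one built below. This is a sketch under assumptions: the specfile path and the mount point are placeholders, and using the client specfile as the first (device) field is how we understand mount.glusterfs to be invoked in releases of this era, so check it against your version before relying on it.

```shell
# Placeholders: adjust the specfile path and the mount point to your setup.
SPECFILE=/etc/glusterfs/glusterfs-client.vol
MOUNTPOINT=/mnt/glusterfs

# Fields: device (here the client specfile), mount point, type, options, dump, pass.
FSTAB_LINE="$SPECFILE  $MOUNTPOINT  glusterfs  defaults  0  0"
echo "$FSTAB_LINE"

# As root, one would then append it and mount (left commented out in this sketch):
# echo "$FSTAB_LINE" >> /etc/fstab
# mount "$MOUNTPOINT"
```

The filesystem type 'glusterfs' in the third field is what makes mount look for the 'mount.glusterfs' helper in /sbin.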
Exporting over NFS
As GlusterFS needs just a directory on the backend to export, you can run GlusterFS on top of NFS. But as NFS is not a completely POSIX compliant filesystem, operations which fail over NFS also fail over GlusterFS. Also, because some versions of NFS don't support extended attributes, AFR/DHT may not work properly.
And if you are exporting GlusterFS over NFS, don't expect very good performance, as NFS may choke under load.
NFS re-export
Documentation Pending
NFS re-export works fine with GlusterFS, but you may have to go through the 'README.NFS' file of the fuse tarball (fuse-2.7.3 here):
root@space:/tmp/fuse-2.7.3glfs10 # cat README.NFS
FUSE module in official kernels (>= 2.6.14) don't support NFS exporting. In this case if you need NFS exporting capability, use the '--enable-kernel-module' configure option to compile the module from this package. And make sure, that the FUSE is not compiled into the kernel (CONFIG_FUSE_FS must be 'm' or 'n').
You need to add an fsid=NNN option to /etc/exports to make exporting a FUSE directory work.
You may get ESTALE (Stale NFS file handle) errors with this. This is because the current FUSE kernel API and the userspace library cannot handle a situation where the kernel forgets about an inode which is still referenced by the remote NFS client. This problem will be addressed in a later version.
root@space:/tmp/fuse-2.7.3glfs10 # _
You may also need to pass the '--disable-direct-io-mode' option to GlusterFS when starting it.
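The fsid requirement from README.NFS can be sketched as an /etc/exports entry. The mount point, the client network and the fsid value 10 below are placeholders chosen for this example; any number that is unique among your exports should do.

```shell
# Placeholders: mount point, client network and fsid value are examples only.
GLUSTER_MOUNT=/mnt/glusterfs
EXPORT_LINE="$GLUSTER_MOUNT 192.168.1.0/24(rw,fsid=10)"
echo "$EXPORT_LINE"

# As root, one would then append it and re-export (left commented out in this sketch):
# echo "$EXPORT_LINE" >> /etc/exports
# exportfs -ra
```

The explicit fsid is needed because a FUSE mount has no underlying block device from which the NFS server could derive a filesystem identifier on its own.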
Re-exporting with Samba for CIFS clients
You can easily start the smbd/nmbd servers on a directory inside the GlusterFS mountpoint, and it should work fine. With Samba clients on GNU/Linux machines, performance in this setup has been good. If you use a Windows client (or the option to map a network drive in Windows), you have to compile the Samba server with the 'no-utimes' option, which gives very good performance on Windows machines.
RoadMap
Visit the GlusterFS Roadmap.
Highlights are:
- webUI for management
- hot add/remove of nodes
- nodup
- snapshot
and the list goes on.
FAQ
Questions are part of any new initiative, all the more so when a revolutionary concept turns out to be so simple that it is hard to believe. So the number of questions increases. The GlusterFS team is very happy to answer your questions about filesystems in general and GlusterFS in particular.
General
Pending
Legal
What is the license used for Gluster?
GlusterFS is released under the GNU General Public License v3 or later. Documentation is released under the GNU Free Documentation License 1.2 or later.
What is the relation between Gluster and Z RESEARCH Inc?
Z RESEARCH Inc owns the copyright and trademark of GlusterFS. All the current developers are employed by Z Research Inc. Also, Z Research Inc offers support on the software.
How are external contributions handled?
External contributions are copyrighted to their corresponding owners if the work is significant. If the contribution is a very small patch or a minor bug fix, the copyright is currently held by Z Research Inc itself, as this makes the legal side easier to maintain.
Is there a difference between commercially supported Gluster and community version?
Z RESEARCH Inc does not maintain any proprietary extensions to GlusterFS. Free Software can be commercial too. Z RESEARCH bundles support and services into a commercial subscription package. For more information, please visit the Subscription Package page.
Why are software patents evil?
Today it has become impossible to develop any useful software without violating patents. Software patents affect both free and proprietary software, and both small and large corporations. We believe software patents should be abolished altogether.
Management
Pending
Technical
Pending
Developer
Pending
Contribute
GlusterFS is a Free Software project, and all of its developers believe in freedom of mind. You are free to choose how to contribute in your own way. Here are a few common options if you would like to contribute.
Develop
As each functionality of GlusterFS is implemented as layers (or translators), you can write your own functionality layer.
Testing
Again, because of its modularity and layered design, many different combinations are possible with GlusterFS. You can test it with your own combination of translators, run your application on it, and report any bugs you find to us.
Port
The main focus of the development team will initially be on GNU/Linux. Though the developers try to write portable code and avoid OS specific code inside GlusterFS, regular checks on different OSes are very helpful.
Currently tested on Mac OS X, OpenSolaris, and FreeBSD 7.0 or later.
Documentation
Any piece of documentation is helpful for the project. The core developers are good only at writing code (and mostly in C or Lisp at that :p), so helping them write something in plain English (or any other human language, like Kannada, Spanish, Hindi, French, German etc.) will be very helpful.
Spread the word
If you like the product, please let the world know about it by adding an entry to our Who's using GlusterFS page. You can also promote us by writing blogs, speaking at conferences etc.
Buy Support
We can't deny the power of money in any good work. Though all the above types of contribution can be made by individuals, as an industry or enterprise you can choose to support the product by taking the Subscription, which is a win-win for you and us: increased development speed for us, and more prioritized support for you.
NOTE: Users of GlusterFS are not bound by the steps mentioned above. One can choose to use the product without notifying the team. But when you choose to redistribute, make sure you publish the code under the GPLv3 or a later version of the license.
Thanks
To the core team, to the members of the developer mailing list who helped us test the product and get it to a stable state, and to all of you who helped spread the name.

