GlusterFS Hackers Guide
FIXME: WORK IN PROGRESS
Source repository
You can get the latest unreleased developer source from the GNU Arch repository on Savannah.
Anonymous access
$ tla register-archive http://arch.savannah.gnu.org/archives/gluster/
$ tla get -A gluster@sv.gnu.org glusterfs--mainline--BRANCH glusterfs
See the Gluster Download Page for the latest branch-id.
Developer Access
Setting up automatic signature checking
$ mkdir -p -m 700 ~/.arch-params/signing/
$ echo 'tla-gpg-check gpg_command="gpg --verify-files -"' > ~/.arch-params/signing/=default.check
$ echo 'gpg --clearsign --use-agent' > ~/.arch-params/signing/=default
You may also be interested in the GPG Keyring of this project.
Note: Copy tla/=gpg-check.awk from the GNU Arch source to a directory in your PATH as tla-gpg-check. Debian GNU/Linux does this by default. The --use-agent option also requires the gnupg-agent package; you need this option if you do not want to enter your passphrase every time you commit. To start gnupg-agent, run gpg-agent --daemon in your current shell context.
Repository Developer Checkout
$ tla my-id "FIRST-NAME LAST-NAME <YOUR-EMAIL-ID>"
$ tla register-archive sftp://USER@arch.sv.gnu.org/archives/gluster
$ tla my-default-archive gluster@sv.gnu.org
$ tla get glusterfs--mainline--2.5 glusterfs
Hacking Translators
Prerequisites:
- The idea of translators in programming.
- Familiarity with FUSE (not strictly required).
- The GlusterFS asynchronous messaging framework.
Howto:
In GlusterFS, translators live under the 'xlators/' directory of the source tree, grouped by functionality. To get an idea of how GlusterFS translators work, have a look at the following files:
libglusterfs/src/xlator.h --> check for fops and mops prototype.
libglusterfs/src/xlator.c
libglusterfs/src/defaults.c --> check for default fops and mops implementation.
Each translator should implement functions named 'init ()' and 'fini ()'. There are two structures, 'fops' (filesystem operations) and 'mops' (management operations), which the developer should define. Functions that are not implemented are automatically set to the default operations, which are just pass-through functions.
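The default-filling behaviour can be sketched in a few lines of C. This is a minimal, self-contained illustration and not the code in libglusterfs: toy_xlator, toy_fops, toy_fill_defaults and the two operations are hypothetical names invented for the example, and the real structures in xlator.h carry many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real structures in xlator.h. */
struct toy_xlator;

struct toy_fops {
    int (*open)(struct toy_xlator *xl, const char *path);
    int (*unlink)(struct toy_xlator *xl, const char *path);
};

struct toy_xlator {
    const char *name;
    struct toy_xlator *child;   /* next translator in the graph */
    struct toy_fops fops;
    int calls_seen;             /* counts operations seen by this translator */
};

/* Default operations: pass the call straight through to the child. */
static int toy_default_open(struct toy_xlator *xl, const char *path)
{
    return xl->child ? xl->child->fops.open(xl->child, path) : 0;
}

static int toy_default_unlink(struct toy_xlator *xl, const char *path)
{
    return xl->child ? xl->child->fops.unlink(xl->child, path) : 0;
}

/* Any operation the translator left as NULL gets the pass-through default. */
static void toy_fill_defaults(struct toy_xlator *xl)
{
    assert(xl != NULL);
    if (!xl->fops.open)
        xl->fops.open = toy_default_open;
    if (!xl->fops.unlink)
        xl->fops.unlink = toy_default_unlink;
}

/* A translator that implements only unlink: count the call, then pass it on. */
static int counting_unlink(struct toy_xlator *xl, const char *path)
{
    xl->calls_seen++;
    return toy_default_unlink(xl, path);
}
```

A translator built this way sets only the entries it cares about (here unlink) and leaves the rest NULL; after toy_fill_defaults runs, every entry in the table is callable.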
Asynchronous calls:
Translator fops and mops calls are now asynchronous. This is necessary for GlusterFS performance because it reduces the latency caused by network calls. Two macros, 'STACK_WIND' and 'STACK_UNWIND', are used for this; they maintain the call stack in user space. See libglusterfs/src/stack.h for the macro definitions.
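The wind/unwind idea can be illustrated without the real macros. The sketch below is an assumption-laden miniature, not the contents of stack.h: toy_frame, toy_wind and toy_unwind are hypothetical names, and the callee replies immediately, whereas a real network translator would reply later when the response arrives.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical miniature of a user-space call stack; not the stack.h code. */
struct toy_frame;
typedef int (*toy_cbk_t)(struct toy_frame *frame, int op_ret, int op_errno);

struct toy_frame {
    struct toy_frame *parent;  /* frame of the caller */
    toy_cbk_t cbk;             /* where the reply for this frame is delivered */
    int op_ret;                /* filled in when the reply arrives */
    int op_errno;
};

typedef int (*toy_fop_t)(struct toy_frame *frame, const char *path);

/* "Wind": push a new frame for the callee and remember the reply callback. */
static int toy_wind(struct toy_frame *caller, toy_cbk_t cbk,
                    toy_fop_t fop, const char *path)
{
    struct toy_frame *callee = calloc(1, sizeof(*callee));
    if (!callee) {
        errno = ENOMEM;
        return -1;
    }
    callee->parent = caller;
    callee->cbk = cbk;
    return fop(callee, path);
}

/* "Unwind": pop the callee frame and hand the result to the caller's callback. */
static int toy_unwind(struct toy_frame *callee, int op_ret, int op_errno)
{
    assert(callee != NULL && callee->cbk != NULL);
    struct toy_frame *caller = callee->parent;
    toy_cbk_t cbk = callee->cbk;
    free(callee);
    return cbk(caller, op_ret, op_errno);
}

/* Bottom-most operation: replies at once here, but could reply later. */
static int toy_storage_unlink(struct toy_frame *frame, const char *path)
{
    if (!path || path[0] != '/')
        return toy_unwind(frame, -1, EINVAL);
    return toy_unwind(frame, 0, 0);
}

/* Reply callback of the upper layer: records the result in its own frame. */
static int toy_unlink_cbk(struct toy_frame *frame, int op_ret, int op_errno)
{
    frame->op_ret = op_ret;
    frame->op_errno = op_errno;
    return 0;
}
```

The caller never blocks on a return value from the operation itself; the result always arrives through the callback, which is what lets a translator issue a network request and continue with other work.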
Hacking Schedulers
Each scheduler should implement three functions in the sched_ops structure. You have to take care of locking your state variables safely.
- static int my_init (struct xlator *xl)
Module entry initialization function, invoked once at load time. xl is a pointer to this translator's structure. You can keep your own private state in the xl->private void pointer. Return 0 on success or -1 with errno set appropriately.
- static void my_fini (struct xlator *xl)
Module exit uninitialization function, invoked once at unload time. xl is a pointer to this translator's structure. Make sure you free your own private state in xl->private properly. The function is declared void, so it returns no status.
- static struct xlator * my_schedule (struct xlator *xl)
Actual scheduling logic. xl is a pointer to this translator's structure. As the return type indicates, it returns a pointer to the selected translator rather than a status code.
Good examples to begin with are 'random' or 'rr' (round-robin) schedulers.
scheduler/rr/src/rr.c
scheduler/random/src/random.c
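As a rough illustration of the three functions, here is a self-contained round-robin sketch. It is not the code in rr.c: toy_xlator is a hypothetical cut-down structure, the children array is an assumption about how candidate sub-volumes might be reached, and a C11 atomic counter stands in for whatever locking the real scheduler uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, cut-down translator structure for illustration only. */
struct toy_xlator {
    const char *name;
    struct toy_xlator **children;  /* candidate sub-volumes (assumed layout) */
    size_t child_count;
    void *private;                 /* scheduler state lives here */
};

struct rr_state {
    atomic_size_t next;            /* index of the next child to hand out */
};

/* Allocate the private state once, at load time. */
static int my_init(struct toy_xlator *xl)
{
    if (!xl || xl->child_count == 0) {
        errno = EINVAL;
        return -1;
    }
    struct rr_state *state = malloc(sizeof(*state));
    if (!state) {
        errno = ENOMEM;
        return -1;
    }
    atomic_init(&state->next, 0);
    xl->private = state;
    return 0;
}

/* Release the private state at unload time. */
static void my_fini(struct toy_xlator *xl)
{
    assert(xl != NULL);
    free(xl->private);
    xl->private = NULL;
}

/* Hand out children in rotation; the atomic counter keeps this thread-safe. */
static struct toy_xlator *my_schedule(struct toy_xlator *xl)
{
    if (!xl || !xl->private) {
        errno = EINVAL;
        return NULL;
    }
    struct rr_state *state = xl->private;
    size_t turn = atomic_fetch_add(&state->next, 1);
    return xl->children[turn % xl->child_count];
}
```

With two children, successive my_schedule calls return the first, the second, then the first again. Returning NULL with errno set on bad input is a choice made for this sketch, not documented scheduler behaviour.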
Hacking Transport
** TODO **
GlusterFS Protocol
ASCII protocol
The GlusterFS protocol is a simple ASCII protocol, designed to be easy to parse and be human readable.
A sample GlusterFS protocol block looks like this:
Block Start                         header
0000000000000023                    call id
00000001                            type
00000016                            op
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx    human readable name
00000000000000000000000000000ac3    block size
<...>                               block
Block End
call id
This is a hexadecimal integer unique to each request/response pair. This lets the client and server xlators identify the stack frame to which a particular request/response belongs.
type
Type is one of GF_OP_TYPE_FOP_REQUEST, GF_OP_TYPE_MOP_REQUEST, GF_OP_TYPE_FOP_REPLY, or GF_OP_TYPE_MOP_REPLY.
op
Identifies the operation to be performed (for example FOP_UNLINK).
name
A human readable name for the entity that is sending out the block. This is not interpreted by the code.
block size
Size in bytes (expressed in hex) of the block that follows.
block
Serialization of a dict_t whose key/value pairs hold whatever data is necessary to perform the operation (such as the filename, errno, return value, etc.).
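A parser for the header fields described above might look like the following sketch. It rests on assumptions the text does not confirm: that each field sits on its own newline-terminated line, that "Block Start" is sent literally, and that the field widths match the sample (16, 8, 8, 32 and 32 characters). toy_block_header, toy_read_hex and toy_parse_header are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of the header fields listed above. */
struct toy_block_header {
    uint64_t callid;
    uint32_t type;
    uint32_t op;
    char name[33];        /* 32 characters plus terminating NUL */
    uint64_t block_size;
};

/* Read exactly `width` hex digits followed by '\n', then advance *cursor. */
static int toy_read_hex(const char **cursor, size_t width, uint64_t *out)
{
    char digits[33];

    assert(width < sizeof(digits));
    for (size_t i = 0; i < width; i++) {
        /* A NUL is not a hex digit, so this never reads past the string. */
        if (!isxdigit((unsigned char)(*cursor)[i])) {
            errno = EINVAL;
            return -1;
        }
    }
    if ((*cursor)[width] != '\n') {
        errno = EINVAL;
        return -1;
    }
    memcpy(digits, *cursor, width);
    digits[width] = '\0';
    errno = 0;
    *out = strtoull(digits, NULL, 16);
    if (errno == ERANGE)
        return -1;        /* value does not fit in 64 bits */
    *cursor += width + 1;
    return 0;
}

/* Parse everything up to and including the block size line. */
static int toy_parse_header(const char *text, struct toy_block_header *hdr)
{
    static const char start[] = "Block Start\n";
    const char *cursor = text;
    uint64_t type = 0, op = 0;

    if (!text || !hdr || strncmp(cursor, start, sizeof(start) - 1) != 0) {
        errno = EINVAL;
        return -1;
    }
    cursor += sizeof(start) - 1;
    if (toy_read_hex(&cursor, 16, &hdr->callid) != 0 ||
        toy_read_hex(&cursor, 8, &type) != 0 ||
        toy_read_hex(&cursor, 8, &op) != 0)
        return -1;
    hdr->type = (uint32_t)type;
    hdr->op = (uint32_t)op;

    /* The name is opaque text: copy 32 characters without interpreting them. */
    if (strlen(cursor) < 33 || cursor[32] != '\n') {
        errno = EINVAL;
        return -1;
    }
    memcpy(hdr->name, cursor, 32);
    hdr->name[32] = '\0';
    cursor += 33;

    return toy_read_hex(&cursor, 32, &hdr->block_size);
}
```

Fed the sample block, this yields call id 0x23, type 1, op 0x16 and a block size of 0xac3 bytes. A real reader would then consume that many bytes and check for the closing marker.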
Binary Protocol
The idea of the binary protocol is to reduce both the CPU overhead (spent on ASCII conversion) and the size of the header information sent over the wire. It is designed for performance.
This section gives an idea of how the binary protocol of GlusterFS functions.
Header
    0                  8                  16                 24                 32 Bits
    |                  |                  |                  |                  |
0   |------------------------Signature---------------------->|------Version---->|
1   |----------------Type---------------->|------------------Op---------------->|
2   |-----------------------------------Callid--------------------------------->|
3   |-----------------------------------Callid--------------------------------->|
4   |------------------------------------Size---------------------------------->|
5-n |------------------------------------Data---------------------------------->|
n   |---------End Signature ------------->|
The header has a fixed length and starts with a signature field and a version. After verifying the signature and version, the parser reads Size bytes of data following the header and then looks for the end signature. Once a packet has been received correctly, it is dispatched to a handler function according to its Type and Op values, and that function interprets the data.
The Callid field is 8 bytes (64 bits) wide and is used to map each response to its request. This is important in GlusterFS because it has an asynchronous request/response design.
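Reading the header diagram as a 24-bit signature, 8-bit version, 16-bit type, 16-bit op, 64-bit call id and 32-bit size gives a 20-byte header. The decoder below is a sketch under that reading. The byte order (big-endian is assumed), the signature value and the version number are not stated in this guide and are placeholders, and all toy_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEADER_LEN 20u          /* 3+1+2+2+8+4 bytes, per the diagram */
#define TOY_SIGNATURE  0x676C75u    /* placeholder 24-bit value ("glu") */
#define TOY_VERSION    1u           /* placeholder */

struct toy_bin_header {
    uint32_t signature;  /* 24 bits on the wire */
    uint8_t version;
    uint16_t type;
    uint16_t op;
    uint64_t callid;
    uint32_t size;       /* number of data bytes after the header */
};

/* Read a big-endian unsigned integer of `len` bytes (len <= 8). */
static uint64_t toy_read_be(const uint8_t *p, size_t len)
{
    uint64_t value = 0;

    assert(len <= 8);
    for (size_t i = 0; i < len; i++)
        value = (value << 8) | p[i];
    return value;
}

/* Decode and validate the fixed-length header at the start of buf. */
static int toy_decode_header(const uint8_t *buf, size_t len,
                             struct toy_bin_header *hdr)
{
    if (!buf || !hdr || len < TOY_HEADER_LEN) {
        errno = EINVAL;
        return -1;
    }
    hdr->signature = (uint32_t)toy_read_be(buf, 3);
    hdr->version = buf[3];
    hdr->type = (uint16_t)toy_read_be(buf + 4, 2);
    hdr->op = (uint16_t)toy_read_be(buf + 6, 2);
    hdr->callid = toy_read_be(buf + 8, 8);
    hdr->size = (uint32_t)toy_read_be(buf + 16, 4);

    /* Reject the packet before anyone trusts hdr->size. */
    if (hdr->signature != TOY_SIGNATURE || hdr->version != TOY_VERSION) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Decoding byte by byte rather than casting the buffer to a struct avoids alignment and padding surprises and keeps the code independent of host byte order.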
Data
0             8             16             24            32 Bits
|             |             |              |             |
|--------------------------Fixed------------------------->|
|--------------------------Fixed------------------------->|
|--------------------------Fixed------------------------->|
|--------------------------Fixed------------------------->|
|---Type------->|------------------FieldLength----------->|
|---------------------------Data------------------------->|
|---Type------->|------------------FieldLength----------->|
|---------------------------Data------------------------->|
|             |             |              |             |
The data section starts immediately after the header. It begins with a 16-byte fixed field. In a request packet this field carries uid, gid, pid, and one further 4-byte flag (usually an fd). In a reply packet it carries op_ret and errno; the remaining two 4-byte words are available for flags such as an fd when needed.
The fields that follow can be of any of the following types, and each is interpreted according to its field type and field length.
- Character type
- Integer 32bit fields
- Integer 64bit fields
- IO vector type (mostly used for readv and writev calls).
- Misc
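Walking the typed fields that follow the 16-byte fixed area could be sketched as below. The 8-bit type and 24-bit length widths are read off the diagram; the numeric type codes, big-endian length encoding and the absence of padding between fields are assumptions, and the toy_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_FIXED_LEN 16u   /* fixed area that precedes the typed fields */

/* Hypothetical type codes; the real values are not given in this guide. */
enum toy_field_type {
    TOY_FIELD_CHAR = 1,
    TOY_FIELD_INT32 = 2,
    TOY_FIELD_INT64 = 3,
    TOY_FIELD_IOVEC = 4,
    TOY_FIELD_MISC = 5
};

struct toy_field {
    uint8_t type;
    uint32_t length;       /* 24 bits on the wire */
    const uint8_t *data;   /* points into the caller's buffer */
};

/*
 * Decode the field that starts at data[*offset] and advance *offset past it.
 * Returns 1 if a field was read, 0 at the end of the data, -1 on error.
 */
static int toy_next_field(const uint8_t *data, size_t size, size_t *offset,
                          struct toy_field *field)
{
    if (!data || !offset || !field || *offset > size) {
        errno = EINVAL;
        return -1;
    }
    if (*offset == size)
        return 0;
    if (size - *offset < 4) {          /* truncated type/length word */
        errno = EINVAL;
        return -1;
    }
    const uint8_t *p = data + *offset;
    field->type = p[0];
    field->length = ((uint32_t)p[1] << 16) | ((uint32_t)p[2] << 8) | p[3];
    if (field->length > size - *offset - 4) {   /* payload overruns buffer */
        errno = EINVAL;
        return -1;
    }
    field->data = p + 4;
    *offset += 4 + (size_t)field->length;
    assert(*offset <= size);
    return 1;
}
```

A caller would start with the offset set to TOY_FIXED_LEN and loop until the function returns 0, switching on field->type to interpret each payload.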
Debugging GlusterFS
When developing any major application, one of the main problems a developer faces is how to debug it. GlusterFS handles valuable data, so it needs to be free of bugs, and a range of debugging methods is essential. Because it is implemented in user space, it can take advantage of several of them.
Debug/Trace Translator
As an example of the translator framework, a trace translator is implemented that logs all the arguments passed to each function. It is very useful for checking the arguments passed to the underlying volumes and is a good starting point for debugging.
Non-Daemon Mode
Both glusterfs-client and glusterfs-server can run in daemon mode or non-daemon mode. In non-daemon mode, log messages go to the console, so you can see when the process dies. It is also gdb friendly.
Debugging with GNU GDB
GlusterFS is implemented entirely in user space, so it can be debugged with gdb like any other user-space application.
Coding Guidelines
- All input arguments should be checked for validity
- Avoid void return types where possible: return -1 for error and 0 for success, with errno set appropriately. Functions returning pointers should return NULL on error. Invalid input arguments should set errno to EINVAL.
- Whenever you return with an error, logging with gf_log is a must.
- All global functions and variables should be prefixed with gf_; everything else should be declared static. Otherwise we pollute the namespace. This matters not only in libraries but in general code too, because external libraries may clash with our names.
- Use gcc-4.1 or above as a compiler and treat all warnings as errors.
- Use the fixed-width integer types from stdint.h and the format macros from inttypes.h (for example PRId64) to ensure portability between 32-bit and 64-bit platforms.
- Try to avoid functions that return an allocated buffer. Instead, take a pointer to a pre-allocated buffer. You can introduce functions that calculate the required buffer length. This will also encourage the use of alloca in place of malloc.
- Every pointer to dynamically allocated memory should be followed by a size variable in function arguments, and you should check that the buffer is large enough before using it; otherwise memory corruption may result. Use snprintf in place of sprintf and similar calls. Use memcpy or strncpy in place of strcpy.
- Dict class should have iteration function. dict is more effective than linked list in many of our cases.
- Write comments if your code is not readable enough.
- Avoid global variables and take them as part of function arguments. If there are lot of such states, group them under a struct.
- It is better to have readable long names than short cryptic acronyms in naming variables and functions.
- As a convention, always arrange input arguments before output arguments in functions.
- All typedefs should be suffixed with _t.
- All macros and enum values should be in upper case.
- Every function call should be checked for return status.
- Errno should be checked for system calls.
- Always avoid type punning pointers (always obey strict aliasing rules).
- Nested functions should be declared with 'auto' not as static or extern.
- Understanding the GNU Coding Standards is a must before working on GlusterFS.
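Several of these guidelines can be seen together in one small function. The helper below is hypothetical (gf_example_join_path is not a GlusterFS function) and is only a sketch: inputs precede outputs, the output buffer is followed by its size, snprintf replaces sprintf, and errors return -1 with errno set. The gf_log call that the guidelines require on error paths is left out because its signature is not given in this guide.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Buffer length needed for "dir/name" including the NUL; 0 on bad input. */
static size_t gf_example_join_path_len(const char *dir, const char *name)
{
    if (!dir || !name) {
        errno = EINVAL;
        return 0;
    }
    return strlen(dir) + 1 + strlen(name) + 1;
}

/*
 * Hypothetical helper for illustration.
 * Inputs come first, then the caller-supplied output buffer and its size.
 * Returns 0 on success, or -1 with errno set.
 */
static int gf_example_join_path(const char *dir, const char *name,
                                char *buf, size_t buf_size)
{
    int written;

    if (!dir || !name || !buf || buf_size == 0) {
        errno = EINVAL;          /* invalid input arguments */
        return -1;
    }

    written = snprintf(buf, buf_size, "%s/%s", dir, name);
    if (written < 0)
        return -1;               /* output error reported by snprintf */
    if ((size_t)written >= buf_size) {
        errno = ENAMETOOLONG;    /* result did not fit; buf is truncated */
        return -1;
    }
    assert(buf[written] == '\0');
    return 0;
}
```

Because the caller owns the buffer, it can size it with gf_example_join_path_len and place it on the stack or in a pool of its choosing, and nothing has to be freed by whoever receives the result.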
Patch Submission Guidelines
# diff -pruN glusterfs glusterfs-hacked > glusterfs-$VERSION-$PATCH_NAME
Mail the patch inline in a message to the gluster-devel mailing list with the subject: [PATCH: glusterfs] actual subject. Include the Changelog contents in the message body, not in the patch.
Quality Assurance Guidelines
Test Procedures
- Kernel Compilation
- dbench
- iozone
- dd
- parallel dd
- fill/remove file system
- emacs editing
- md5sum of all files
- [un]compress all files
- watch for memory leaks
- LAPACK or LINPACK (Parallel Benchmarking applications)

