<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hello,<div class=""><br class=""></div><div class="">I am using a GlusterFS disperse volume to host QEMU images.&nbsp;</div><div class=""><br class=""></div><div class="">Previously, I used a distribute-replicate volume, but a disperse volume seems like a better fit for us.</div><div class=""><br class=""></div><div class="">I have created a volume with 11 bricks (8 data + 3 redundancy).</div><div class=""><br class=""></div><div class="">During testing, we’ve encountered ongoing problems. Mainly, there appear to be severe glusterfs hangs. We get many log messages like the following:</div><div class=""><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">INFO: task glusterfs:4359 blocked for more than 120 seconds.</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp; &nbsp; &nbsp; Tainted: P &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ---------------&nbsp; &nbsp; 2.6.32-37-pve #1</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">"echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">glusterfs &nbsp; &nbsp; D ffff8803130d8040 &nbsp; &nbsp; 0&nbsp; 4359&nbsp; &nbsp; &nbsp; 1&nbsp; &nbsp; 0 0x00000000</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;ffff88031311db98 0000000000000086 0000000000000000 ffffffff00000000</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;ffff88033fc0ad00 ffff8803130d8040 ffff88033e6eaf80 ffff88002820ffb0</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;00016cebb4c94040 0000000000000006 0000000117e5c6b9 000000000000091c</div><div class="" style="margin: 0px; font-size: 
11px; font-family: Menlo;">Call Trace:</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff81139590&gt;] ? sync_page+0x0/0x50</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff815616c3&gt;] io_schedule+0x73/0xc0</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811395cb&gt;] sync_page+0x3b/0x50</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff8156249b&gt;] __wait_on_bit_lock+0x5b/0xc0</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff81139567&gt;] __lock_page+0x67/0x70</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff810a6910&gt;] ? wake_bit_function+0x0/0x50</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff8115323b&gt;] invalidate_inode_pages2_range+0x11b/0x380</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffffa016da80&gt;] ? fuse_inode_eq+0x0/0x20 [fuse]</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811ccb54&gt;] ? ifind+0x74/0xd0</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffffa016fa10&gt;] fuse_reverse_inval_inode+0x70/0xa0 [fuse]</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffffa01629ae&gt;] fuse_dev_do_write+0x50e/0x6d0 [fuse]</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811ad81e&gt;] ? 
do_sync_read+0xfe/0x140</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffffa0162ed9&gt;] fuse_dev_write+0x69/0x80 [fuse]</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811ad6cc&gt;] do_sync_write+0xec/0x140</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811adf01&gt;] vfs_write+0xa1/0x190</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff811ae25a&gt;] sys_write+0x4a/0x90</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">&nbsp;[&lt;ffffffff8100b182&gt;] system_call_fastpath+0x16/0x1b</div></div><div class=""><br class=""></div><div class="">This plays havoc with the virtual machines.&nbsp;</div><div class=""><br class=""></div><div class="">In addition, read-write performance bogs down more quickly than expected, even under light load. The bricks are distributed among 4 servers connected by bonded gigabit Ethernet (LACP). For our application, the slowdowns are not a major problem, but they are an irritation.</div><div class=""><br class=""></div><div class="">I have been trying different combinations of volume options to address this, and happened to find one that seems to have resolved both issues. On a whim, I disabled performance.io-cache. Client access to the volume now seems to be close to wire speed, at least for large-file reads and writes.&nbsp;</div><div class=""><br class=""></div><div class="">Reading the documentation, it seems that performance.io-cache would not be of much benefit to our workload, but it seems strange that it would cause all of the issues we have been having. Is this expected behavior for disperse volumes? 
We had planned to transition another volume to the disperse configuration, and I’d like to have a good handle on what options are good/bad.</div><div class=""><br class=""></div><div class="">BTW: The options selected are really just based upon trial and error, with some not-very-rigorous testing.</div><div class=""><br class=""></div><div class="">My volume info is below:</div><div class=""><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Volume Name: oort</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Type: Disperse</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Volume ID: 9b8702b2-3901-4cdf-b839-b17a06017f66</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Status: Started</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Number of Bricks: 1 x (8 + 3) = 11</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Transport-type: tcp</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Bricks:</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick1: XXXXX:/export/oort-brick-1/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick2:&nbsp;XXXXX:/export/oort-brick-2/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick3:&nbsp;XXXXX:/export/oort-brick-3/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick4:&nbsp;XXXXX:/export/oort-brick-4/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick5:&nbsp;XXXXX:/export/oort-brick-5/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick6:&nbsp;XXXXX:/export/oort-brick-6/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick7:&nbsp;XXXXX:/export/oort-brick-7/brick</div><div class="" style="margin: 0px; font-size: 
11px; font-family: Menlo;">Brick8:&nbsp;XXXXX:/export/oort-brick-8/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick9:&nbsp;XXXXX:/export/oort-brick-9/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick10:&nbsp;XXXXX:/export/oort-brick-10/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Brick11:&nbsp;XXXXX:/export/oort-brick-11/brick</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">Options Reconfigured:</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">transport.keepalive: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">server.allow-insecure: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">cluster.server-quorum-type: server</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">cluster.quorum-type: auto</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">network.remote-dio: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">cluster.eager-lock: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">cluster.readdir-optimize: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">features.lock-heal: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.stat-prefetch: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.cache-size: 128MB</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.io-thread-count: 64</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.read-ahead: off</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.write-behind: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: 
Menlo;">performance.io-cache: off</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.quick-read: off</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.flush-behind: on</div><div class="" style="margin: 0px; font-size: 11px; font-family: Menlo;">performance.write-behind-window-size: 2MB</div></div><div class=""><br class=""></div><div class="">Thank you for any help you can provide.</div><div class=""><br class=""></div><div class="">Regards,</div><div class=""><br class=""></div><div class="">Sherwin</div></body></html>