Hey folks,

I'm trying to track down rogue QEMU segfaults in my infrastructure; the processes look to be dying because of gluster. What I can tell so far is that the process is in disk sleep when it dies, its storage is backed only by gluster, and the segfault points to an I/O problem. Unfortunately I haven't figured out how to get a full crash dump that I can run through apport-retrace to see exactly what went wrong. The other interesting thing is that this only happens when gluster is under heavy load. Any tips on debugging further or getting this fixed up would be appreciated. (What I'm planning to try for the crash dump is sketched below, after the environment details.)

Segfault:

Dec 30 20:42:56 HFMHVR3 kernel: [5976247.820875] qemu-system-x86[27730]: segfault at 128 ip 00007f891f0cc82c sp 00007f89376846a0 error 4 in qemu-system-x86_64 (deleted)[7f891ed42000+4af000]

Brick log:

[2014-12-30 20:42:56.797946] I [server.c:520:server_rpc_notify] 0-VMARRAY-server: disconnecting connectionfrom HFMHVR3-27726-2014/11/29-00:42:11:436294-VMARRAY-client-0-0-0
[2014-12-30 20:42:56.798244] W [inodelk.c:392:pl_inodelk_log_cleanup] 0-VMARRAY-server: releasing lock on 6e640448-aa4c-4faa-b7ad-33e68aca0d3a held by {client=0x7fe130776740, pid=0 lk-owner=ecb80
[2014-12-30 20:42:56.798287] I [server-helpers.c:289:do_fd_cleanup] 0-VMARRAY-server: fd cleanup on /HFMPCI0.img
[2014-12-30 20:42:56.798384] I [client_t.c:417:gf_client_unref] 0-VMARRAY-server: Shutting down connection HFMHVR3-27726-2014/11/29-00:42:11:436294-VMARRAY-client-0-0-0

Nothing interesting in the VM log or around the segfault event in the hypervisor log.

Environment

Ubuntu 14.04 running stock QEMU 2.0.0, modified only to add gfapi support, from https://launchpad.net/~josh-boon/+archive/ubuntu/qemu-glusterfs, running on top of an SSD RAID0 array. The gluster volumes are connected over back-to-back 10G fiber connections running in a bond using balance-rr.
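On the crash-dump side, here's roughly what I'm planning to try next so that apport actually catches a core from the libvirt-spawned QEMU processes. This is only a sketch from reading the docs, not something I've verified yet, and it assumes the stock 14.04 upstart/apport setup:

    # 1) Raise the core limit that libvirt-spawned guests inherit (0 by default):
    #    append an upstart "limit" stanza to the libvirt-bin job and restart it.
    #    Only guests started after the restart pick the new limit up.
    echo 'limit core unlimited unlimited' | sudo tee -a /etc/init/libvirt-bin.conf
    sudo service libvirt-bin restart

    # 2) Check that apport is enabled and owns the kernel core handler:
    grep enabled /etc/default/apport      # should say enabled=1
    sudo service apport start
    cat /proc/sys/kernel/core_pattern     # should start with |/usr/share/apport/apport

    # 3) After the next segfault a report should land in /var/crash; the file name
    #    is derived from the binary path, so I expect something like the pattern
    #    below. apport-retrace pulls in debug symbols and opens gdb on the dump.
    sudo apport-retrace --gdb /var/crash/_usr_bin_qemu-system-x86_64.*.crash

I'm not 100% sure the limit bump is needed when core_pattern is a pipe, but it shouldn't hurt. If anyone has a better recipe for this I'm all ears.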
Config

  Filesystem mount
    /dev/mapper/VG0-VAR on /var type xfs (rw,noatime,nodiratime,nobarrier)

  Gluster config
    Volume Name: VMARRAY
    Type: Replicate
    Volume ID: c0947aea-d07f-4ca0-bfcf-3b1c97cec247
    Status: Started
    Number of Bricks: 1 x 2 = 2
    Transport-type: tcp
    Bricks:
    Brick1: 10.9.1.1:/var/lib/glusterfs
    Brick2: 10.9.1.2:/var/lib/glusterfs
    Options Reconfigured:
    cluster.choose-local: true
    storage.owner-gid: 112
    storage.owner-uid: 107
    cluster.server-quorum-type: none
    cluster.quorum-type: none
    network.remote-dio: enable
    cluster.eager-lock: enable
    performance.stat-prefetch: off
    performance.io-cache: off
    performance.read-ahead: off
    performance.quick-read: off
    server.allow-insecure: on
    network.ping-timeout: 7

  Machine Disk XML
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='gluster' name='VMARRAY/HFMPCI0.img'>
        <host name='10.9.1.2'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>

Thanks,
Josh
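P.S. If apport doesn't cooperate, the fallback I'm considering is attaching gdb to one of the guests that tends to die and waiting for the next segfault. Again just a sketch: the pgrep pattern is my guess at matching the right guest's command line, and a readable backtrace would need debug symbols for the PPA build.

    # Attach to the running guest (this pauses it briefly) and let it continue
    # until the SIGSEGV arrives; the PID match on the image name is a guess.
    sudo gdb -p "$(pgrep -f 'qemu-system-x86_64.*HFMPCI0')" \
        -ex 'handle SIGSEGV stop print' \
        -ex 'continue'

    # Once gdb stops on the fault:
    #   (gdb) thread apply all bt full
    #   (gdb) generate-core-file /var/tmp/qemu-HFMPCI0.core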