<div dir="ltr"><div><div>It would be better to use sharding rather than striping for your VM use case. It offers better distribution and utilisation of bricks and better heal performance.<br></div>And it is well tested.<br></div><div>A couple of things to note before you do that:<br></div><div>1. Most of the sharding bug fixes went into 3.7.8, so it is advised that you use 3.7.8 or above.<br></div><div>2. When you enable sharding on a volume, files that already exist in the volume do not get sharded. Only files created after sharding is enabled are sharded.<br></div><div>    If you do want to shard the existing files, you would need to cp them to a temporary name within the volume, and then rename them back to the original file name.<br><br></div><div>HTH,<br></div><div>Krutika <br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <span dir="ltr">&lt;<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I couldn&#39;t find anything related to cache in the HBAs.<br>

Which logs are useful in my case? I only see the brick logs, and they contain nothing from the time of the failure.<br>

<br>

###<br>

[2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed [File exists]<br>

[2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed [File exists]<br>

[2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: rmdir issued on /.trashcan/, which is not permitted<br>

[2016-03-13 18:07:55.027635] I [MSGID: 115056] [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) ==&gt; (Operation not permitted) [Operation not permitted]<br>

[2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4<br>

[2016-03-13 18:11:34.353463] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: 3.7.8)<br>

[2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4<br>

[2016-03-13 18:11:34.591173] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: 3.7.8)<br>

###<br>

<br>

ESXi just keeps telling me &quot;Cannot clone T: The virtual disk is either<br>

corrupted or not a supported format.<br>

error<br>

3/13/2016 9:06:20 PM<br>

Clone virtual machine<br>

T<br>

VCENTER.LOCAL\Administrator<br>

&quot;<br>

<br>

My setup is two servers with a floating IP controlled by CTDB, and my ESXi server mounts the NFS export via the floating IP.<div class="HOEnZb"><div class="h5"><br>

<br>

<br>

<br>

<br>

On 03/13/2016 08:40 PM, pkoelle wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 13.03.2016 at 18:22, David Gossage wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan &lt;<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt; wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

My HBAs are LSISAS1068E, and the filesystem is XFS.<br>

I tried EXT4 and it did not help.<br>

I created a striped volume on one server with two bricks: same issue.<br>

I also tried a replicated volume with just sharding enabled: same issue.<br>

As soon as I disable sharding it works just fine; neither sharding nor<br>

striping works for me.<br>

I followed up on some of the threads on the mailing list and tried some of<br>

the fixes that worked for others, but none worked for me. :(<br>

<br>

</blockquote>

<br>

Is it possible the LSI has write-cache enabled?<br>

</blockquote>

Why is that relevant? Even the backing filesystem has no idea if there is a RAID or write cache or whatever. There are blocks and sync(), end of story.<br>

If you lose power and screw up your recovery, OR do funky stuff with SAS multipathing, a controller cache might be an issue. AFAIK that&#39;s not what we are talking about.<br>

<br>

I&#39;m afraid that unless the OP has some logs from the server, a reproducible test case, or a backtrace from the client or server, this isn&#39;t getting us anywhere.<br>

<br>

cheers<br>

Paul<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 03/13/2016 06:54 PM, David Gossage wrote:<br>

<br>

<br>

<br>

<br>

On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan &lt;<br>

<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt; wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Okay, so I enabled sharding on my test volume and it did not help.<br>

Stupidly enough, I also enabled it on a production<br>

&quot;Distributed-Replicate&quot; volume and it corrupted half of my VMs.<br>

I have updated Gluster to the latest version and nothing seems to have changed in<br>

my situation.<br>

The volume info is below:<br>

<br>

</blockquote>

<br>

I was pointing at the settings in that email as an example for corruption<br>

fixing. I wouldn&#39;t recommend enabling sharding if you haven&#39;t gotten the<br>

base working yet on that cluster. What HBAs are you using and what is the<br>

filesystem layout for the bricks?<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Number of Bricks: 3 x 2 = 6<br>

Transport-type: tcp<br>

Bricks:<br>

Brick1: gfs001:/bricks/b001/vmware<br>

Brick2: gfs002:/bricks/b004/vmware<br>

Brick3: gfs001:/bricks/b002/vmware<br>

Brick4: gfs002:/bricks/b005/vmware<br>

Brick5: gfs001:/bricks/b003/vmware<br>

Brick6: gfs002:/bricks/b006/vmware<br>

Options Reconfigured:<br>

performance.strict-write-ordering: on<br>

cluster.server-quorum-type: server<br>

cluster.quorum-type: auto<br>

network.remote-dio: enable<br>

performance.stat-prefetch: disable<br>

performance.io-cache: off<br>

performance.read-ahead: off<br>

performance.quick-read: off<br>

cluster.eager-lock: enable<br>

features.shard-block-size: 16MB<br>

features.shard: on<br>

performance.readdir-ahead: off<br>

<br>

<br>

On 03/12/2016 08:11 PM, David Gossage wrote:<br>

<br>

<br>

On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan &lt;<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt; wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Both servers have HBAs, no RAID, and I can set up a replicated or<br>

dispersed volume without any issues.<br>

The logs are clean; when I tried to migrate a VM and got the error,<br>

nothing showed up in the logs.<br>

I tried mounting the volume on my laptop and it mounted fine, but if I<br>

use dd to create a data file it just hangs: I can&#39;t cancel it and I can&#39;t<br>

unmount it or anything, I just have to reboot.<br>

The same servers have another volume on other bricks, a distributed<br>

replicated one, and it works fine.<br>

I have even tried the same setup in a virtual environment (created two<br>

VMs, installed Gluster, and created a replicated striped volume) and again the same<br>

thing: data corruption.<br>

<br>

</blockquote>

<br>

I&#39;d look through mail archives for a topic &quot;Shard in Production&quot; I think<br>

it&#39;s called.  The shard portion may not be relevant but it does discuss<br>

certain settings that had to be applied with regards to avoiding corruption<br>

with VM&#39;s.  You may want to try and disable the performance.readdir-ahead<br>

also.<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

On 03/12/2016 07:02 PM, David Gossage wrote:<br>

<br>

<br>

<br>

On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan &lt;<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt; wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Thanks David,<br>

<br>

My settings are all defaults; I just created the pool and started<br>

it.<br>

I have applied the settings you recommended and the issue seems to be the<br>

same:<br>

<br>

Type: Striped-Replicate<br>

Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14<br>

Status: Started<br>

Number of Bricks: 1 x 2 x 2 = 4<br>

Transport-type: tcp<br>

Bricks:<br>

Brick1: gfs001:/bricks/t1/s<br>

Brick2: gfs002:/bricks/t1/s<br>

Brick3: gfs001:/bricks/t2/s<br>

Brick4: gfs002:/bricks/t2/s<br>

Options Reconfigured:<br>

performance.stat-prefetch: off<br>

network.remote-dio: on<br>

cluster.eager-lock: enable<br>

performance.io-cache: off<br>

performance.read-ahead: off<br>

performance.quick-read: off<br>

performance.readdir-ahead: on<br>

<br>

</blockquote>

<br>

<br>

Is there a RAID controller perhaps doing any caching?<br>

<br>

Are any errors being reported in the Gluster logs during the migration process?<br>

Since they aren&#39;t in use yet, have you tested making just mirrored bricks<br>

using different pairings of servers, two at a time, to see if the problem follows<br>

a certain machine or network port?<br>

<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


On 03/12/2016 03:25 PM, David Gossage wrote:<br>

<br>

<br>

<br>

On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan &lt;<a href="mailto:mahdi.adnan@earthlinktele.com" target="_blank">mahdi.adnan@earthlinktele.com</a>&gt; wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Dears,<br>

<br>

I have created a replicated striped volume with two bricks and two<br>

servers, but I can&#39;t use it because when I mount it in ESXi and try to<br>

migrate a VM to it, the data gets corrupted.<br>

Does anyone have any idea why this is happening?<br>

<br>

Dell 2950 x2<br>

Seagate 15k 600GB<br>

CentOS 7.2<br>

Gluster 3.7.8<br>

<br>

Appreciate your help.<br>

<br>

</blockquote>

<br>

Most reports of this I have seen end up being settings related.  Post<br>

gluster volume info. Below is what I have seen as most common recommended<br>

settings.<br>

I&#39;d hazard a guess that you may have some of the read-ahead, cache, or prefetch options<br>

on.<br>

<br>

quick-read=off<br>

read-ahead=off<br>

io-cache=off<br>

stat-prefetch=off<br>

eager-lock=enable<br>

remote-dio=on<br>
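<br>
As a rough sketch (not part of the original message), the short names above could be expanded into gluster volume set commands. The full option names are the ones shown under &quot;Options Reconfigured&quot; elsewhere in this thread; the volume name &quot;vmware&quot; and the helper volume_set_commands are placeholders for illustration only. The script only prints the commands, it does not run them.<br>
<pre>
```python
import shlex

# Full option names as they appear under "Options Reconfigured" in this thread.
RECOMMENDED_OPTIONS = {
    "performance.quick-read": "off",
    "performance.read-ahead": "off",
    "performance.io-cache": "off",
    "performance.stat-prefetch": "off",
    "cluster.eager-lock": "enable",
    "network.remote-dio": "on",
}


def volume_set_commands(volume, options=None):
    """Build one 'gluster volume set' command line per option (hypothetical helper)."""
    if options is None:
        options = RECOMMENDED_OPTIONS
    return [
        shlex.join(["gluster", "volume", "set", volume, name, value])
        for name, value in options.items()
    ]


if __name__ == "__main__":
    # "vmware" is a placeholder volume name; substitute your own.
    for command in volume_set_commands("vmware"):
        print(command)
```
</pre>
Printing the commands first makes it easy to review them before pasting them into a shell on one of the servers.<br>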

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Mahdi Adnan<br>

System Admin<br>

<br>

<br>

_______________________________________________<br>

Gluster-users mailing list<br>

<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

<a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>

<br>

</blockquote>

<br>

<br>

<br>

</blockquote>

<br>

<br>

</blockquote>

<br>

<br>

</blockquote>

<br>

<br>

</blockquote>

<br>

<br>

<br>


<br>

</blockquote>

<br>


</blockquote>

<br>


</div></div></blockquote></div><br></div>
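<div>The copy-and-rename step suggested in the reply at the top of this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: the file is not in use (for example, the VM is powered off), the volume has room for a second copy, and the script runs against a client mount of the volume. The function name reshard_by_copy and the temporary suffix are made up for this example.<br></div>
<pre>
```python
import os
import shutil
import tempfile


def reshard_by_copy(path, suffix=".reshard-tmp"):
    """Copy a file to a temporary name in the same directory, then rename it back.

    The copy is a newly created file, so on a volume where sharding is already
    enabled it is the copy that gets written out as shards.
    Assumes nothing else has the file open (hypothetical helper).
    """
    tmp_path = path + suffix
    if os.path.exists(tmp_path):
        raise FileExistsError(tmp_path)
    # copy2 keeps mode and timestamps; ownership is not preserved.
    shutil.copy2(path, tmp_path)
    # Rename the copy over the original name.
    os.replace(tmp_path, path)
    return path


if __name__ == "__main__":
    # Demonstration on a local temporary directory, not on a Gluster mount.
    workdir = tempfile.mkdtemp()
    image = os.path.join(workdir, "disk.img")
    with open(image, "wb") as handle:
        handle.write(b"example data")
    reshard_by_copy(image)
    print(os.listdir(workdir))
```
</pre>
<div>Keeping the temporary copy in the same directory as the original means the final rename stays within the volume, as the reply recommends.<br></div>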