<div dir="ltr"><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Hello, </div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">thanks for the info and sorry for the late reply.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">I will try to explain our complex setup.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">We are using OpenStack to create clusters of VMs for our clients.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">In these VMs we provide basic services, and customers add their own applications, which run on top of our clusters.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">These clusters have HA systems provided by us.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">To keep data persistent and shared across the cluster we use 3 VMs to serve as Storage Nodes.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">We use 2 out of 3 VMs to store Gluster Bricks and the third one as a quorum voter.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">There are 4-5 &quot;partitions&quot; served to all other VMs of the cluster.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">These partitions may contain or store static data (like text configuration files) or other user-generated data.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Most files stored in these partitions are text-based 
and range in size from 4-10 KB to 200-300 MB, with smaller files making up the majority.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Total data stored per cluster is about 50 GB.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">So we have to tune these installations somehow. Tuning based on the underlying hardware (assuming that is applicable in our case) is one way to go, though IMHO it would not give us much of a benefit.</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">That&#39;s why I am asking for a generic dimensioning document or guideline: if we leave hardware out of the equation, how can someone tune Gluster to make the best use of RAM in complex cases like this? </div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Thanks</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Br</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif">Kostas</div><div class="gmail_default" style="font-family:trebuchet ms,sans-serif"><br></div><div class="gmail_extra"><div><div data-smartmail="gmail_signature"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div style="color:rgb(46,52,54);font-family:Ubuntu;font-size:13px"><a href="mailto:kostas.makedos@gmail.com" style="color:rgb(42,118,198)" target="_blank">kostas.makedos@gmail.com</a></div><br><p><br><br></p></div><div dir="ltr"><br></div></div></div></div></div>

<br><div class="gmail_quote">2016-05-27 21:33 GMT+03:00 Paul Robert Marino <span dir="ltr">&lt;<a href="mailto:prmarino1@gmail.com" target="_blank">prmarino1@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Unfortunately that kind of tuning doesn&#39;t have any simple answers, and<br>

anyone who says it does should not be listened to.<br>

<br>

It really depends on your workload and a lot of other factors such as<br>

your hardware. For example, a 20-platter RAID 1+0 on spinning disks with<br>

a wide stripe needs very little cache for streaming large (MultiGB)<br>

files due to the large IOPS they can do, but would need a large cache<br>

for lots of files smaller than the stripe due to the fact that each<br>

file access is a minimum of 1 IOP which means a full read of the<br>

stripe. The reverse may be true if the files are only 4k or less on<br>

average, in which case a standalone SATA SSD would be way faster and<br>

need very little cache, but on large (MultiGB) files it would need a<br>

huge amount of cache due to the 4k per IOP size limitation in SSDs.<br>

Furthermore those scenarios assume your filesystem is correctly<br>

aligned; unfortunately, they usually aren&#39;t. The reasons for this are<br>

complicated but in short the drivers (and in many cases the chipsets)<br>

for many RAID and SATA controllers do not provide the information the<br>

OS (/sys, LVM, and the filesystem) requires to align the filesystem<br>

automatically when it is created.<br>
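<br>
As a rough illustration of the alignment point above, here is a minimal Python sketch of the arithmetic involved. The function name and the example values are hypothetical; on Linux the start sector of a partition can usually be read from a sysfs file such as /sys/block/sda/sda1/start (in 512-byte units), but the stripe size has to come from your RAID configuration, since, as noted, the controller often does not report it.<br>
<br>

```python
def is_aligned(start_sector: int, sector_bytes: int, stripe_bytes: int) -> bool:
    """Return True if the partition's starting byte offset is a multiple of the stripe size."""
    if not (sector_bytes > 0 and stripe_bytes > 0):
        raise ValueError("sizes must be positive")
    return (start_sector * sector_bytes) % stripe_bytes == 0


# Hypothetical example values, not taken from any real system:
# a partition starting at sector 2048 (1 MiB with 512-byte sectors) lines up
# with a 256 KiB stripe, while a partition starting at sector 63 does not.
print(is_aligned(2048, 512, 256 * 1024))
print(is_aligned(63, 512, 256 * 1024))
```

<br>
A misaligned start means that blocks written by the filesystem can straddle two stripes, so a single logical read or write may touch more of the array than necessary.<br>
<br>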

Now, most DBAs will tell you they need an insane number of IOPs; what<br>

they are really telling you is how many operations the database is<br>

doing, not how many IOPs it is doing. In reality, databases do<br>

surprisingly few IOPs and tend more to do large (MultiGB) sequential<br>

reads into the RAM used by the database processes, then do all their<br>

operations there.<br>

<br>

Another key factor is the IO scheduler (elevator=&quot;.....&quot; in the<br>

kernel boot options) you are using in the kernel. CFQ, which is the<br>

default, is great for desktops and servers running 10 or more<br>

different services on inexpensive hardware. On most dedicated servers,<br>

deadline or, if you have a good RAID controller, possibly noop is much<br>

better. Using the proper IO scheduler can have a dramatic impact on<br>

how much RAM you use for cache, especially for writes.<br>
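<br>
To see which scheduler a device is currently using, a minimal Python sketch along these lines may help. It assumes the usual sysfs layout, where /sys/block/DEVICE/queue/scheduler lists the available schedulers and marks the active one with square brackets; the function names and the default device name "sda" are only placeholders.<br>
<br>

```python
from pathlib import Path


def active_scheduler(sysfs_line: str) -> str:
    """Return the scheduler shown in square brackets, e.g. 'cfq' from 'noop deadline [cfq]'."""
    for token in sysfs_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Fall back to the raw value if nothing is bracketed (e.g. a bare "none").
    return sysfs_line.strip()


def read_scheduler(device: str = "sda") -> str:
    """Read the active scheduler for a block device (the device name is an assumption)."""
    return active_scheduler(Path(f"/sys/block/{device}/queue/scheduler").read_text())


print(active_scheduler("noop deadline [cfq]"))
```

<br>
Calling read_scheduler() on each storage node shows what the VMs are actually running with before you change anything in the boot options.<br>
<br>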

<br>

As I said there is no easy answer to this but if you can give us an<br>

idea of the typical workload then we may be able to give some advice.<br>

<div><div><br>

<br>

<br>

On Fri, May 27, 2016 at 8:27 AM, kostas makedos<br>

&lt;<a href="mailto:kostas.makedos@gmail.com" target="_blank">kostas.makedos@gmail.com</a>&gt; wrote:<br>

&gt; Hello,<br>

&gt; Can someone give me an estimated ratio of RAM consumption in<br>

&gt; a node with respect to the GB stored in its bricks?<br>

&gt; Is there a rule of thumb or a guideline document?<br>

&gt;<br>

&gt;<br>

&gt; Thank you,<br>

&gt;<br>

&gt; Best Regards<br>

&gt; Kostas Makedos<br>

&gt;<br>

&gt; <a href="mailto:kostas.makedos@gmail.com" target="_blank">kostas.makedos@gmail.com</a><br>

&gt;<br>

&gt;<br>

&gt;<br>

</div></div>&gt; _______________________________________________<br>

&gt; Gluster-users mailing list<br>

&gt; <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

&gt; <a href="http://www.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://www.gluster.org/mailman/listinfo/gluster-users</a><br>

</blockquote></div><br></div></div>