[Gluster-users] Bricks suggestions

Larry Bates larry.bates at vitalesafe.com
Mon Apr 30 14:28:22 UTC 2012


>> Message: 6
>> Date: Mon, 30 Apr 2012 10:53:42 +0200
>> From: Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com>
>> Subject: Re: [Gluster-users] Bricks suggestions
>> To: Brian Candler <B.Candler at pobox.com>
>> Cc: Gluster-users at gluster.org
>>
>> 2012/4/30 Brian Candler <B.Candler at pobox.com>
>>
>>> KO or OK? With a RAID controller (or software RAID), the RAID subsystem
>>> should quietly mark the failed drive as unusable and redirect all
>>> operations to the working drive.  And you will have a way to detect this
>>> situation, e.g. /proc/mdstat for Linux software RAID.
>>>
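To illustrate Brian's point about detection: with Linux software RAID you
can watch /proc/mdstat for a "_" in the [UU] member-status field. A rough,
untested Python sketch of that kind of check (not something from my setup):

import re

def degraded_arrays(path="/proc/mdstat"):
    # Return md devices whose status string (e.g. [U_]) shows a missing disk.
    bad, current = [], None
    with open(path) as f:
        for line in f:
            m = re.match(r"^(md\d+)\s*:", line)
            if m:
                current = m.group(1)
            status = re.search(r"\[([U_]+)\]", line)
            if status and "_" in status.group(1) and current:
                bad.append(current)
    return bad

for md in degraded_arrays():
    print("WARNING:", md, "is degraded")
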
>>
>> KO.
>> As you wrote, in a RAID environment, the controller will detect a failed
>> disk and redirect I/O to the working drive.
>>
>> With no RAID, is Gluster smart enough to detect a disk failure and
>> redirect all I/O to the other server?
>>
>> A disk can also have a damaged cluster, so that only a portion of it
>> becomes unusable. A RAID controller is able to detect this; will Gluster
>> do the same, or will it still try to reply with broken data?
>>
>> So, do you suggest using RAID10 on each server?
>> - disk1+disk2 raid1
>> - disk3+disk4 raid1
>>
>> raid0 over these raid1 pairs, and then replicate it with Gluster?
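That layout, expressed as commands, could look roughly like the untested
sketch below. Device names, mount point and hostnames are made up, and it
assumes Linux mdadm plus a Gluster version that ships the gluster CLI:

import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Two RAID1 pairs...
run(["mdadm", "--create", "/dev/md0", "--level=1", "--raid-devices=2",
     "/dev/sda", "/dev/sdb"])
run(["mdadm", "--create", "/dev/md1", "--level=1", "--raid-devices=2",
     "/dev/sdc", "/dev/sdd"])
# ...striped together (RAID0) into one brick device per server.
run(["mdadm", "--create", "/dev/md2", "--level=0", "--raid-devices=2",
     "/dev/md0", "/dev/md1"])
run(["mkfs.xfs", "/dev/md2"])
run(["mount", "/dev/md2", "/export/brick1"])
# Then, from either server, replicate the two bricks with Gluster.
run(["gluster", "volume", "create", "vol0", "replica", "2",
     "server1:/export/brick1", "server2:/export/brick1"])
run(["gluster", "volume", "start", "vol0"])
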
>> ------------------------------
> I have been running the following configuration for over 16 months
> with no issues:
>
> Gluster V3.0.0 on two SuperMicro servers, each with 8x2TB hard drives
> configured as JBOD. I use Gluster to replicate each drive between the
> servers and then distribute across the drives, giving me approx. 16TB as
> a single volume.  I can pull a single drive, replace it, and then use
> self heal to rebuild.  I can shut down or reboot a server and traffic
> continues to the other server (good for kernel updates).  I use logdog
> to alert me via email/text if a drive fails.
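For reference, the shape of that volume (replicate each disk across the two
servers, then distribute over the eight pairs) can be shown with a short,
untested Python sketch. Hostnames and paths are invented, and it assumes a
Gluster with the CLI (the CLI arrived after 3.0, which uses volfiles), so
this is only to illustrate the layout:

import subprocess

servers = ["serverA", "serverB"]
bricks = []
for d in range(1, 9):                 # 8 drives per server
    for s in servers:                 # consecutive bricks form a replica pair
        bricks.append(f"{s}:/export/disk{d}")

cmd = ["gluster", "volume", "create", "vol0", "replica", "2"] + bricks
print(" ".join(cmd))
# subprocess.run(cmd, check=True)     # uncomment to actually create it
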
>
> I chose this config because it was 1) simplest, 2) maximized my disk
> storage, 3) effectively resulted in a shared nothing RAID10 SAN-like
> storage system, 4) minimized the amount of data movement during a
> rebuild, 5) it didn't require any hardware RAID controllers which
> would increase my cost.  This config has worked for me exactly as
> planned.
>
> I'm currently building a new server with 8x4TB drives and will be
> replacing one of the existing servers in a couple of weeks.  I will
> force a self heal to populate it with files from the primary server.
> When done, I'll repeat the process for the other server.
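On a pre-3.3 Gluster like this one there is no "gluster volume heal"
command, so a full self heal is forced by walking the client mount and
stat-ing every entry (the usual one-liner is find piped to xargs stat).
A rough Python equivalent, with a hypothetical mount point:

import os

MOUNT = "/mnt/vol0"          # client mount of the replicated volume

count = 0
for root, dirs, files in os.walk(MOUNT):
    for name in dirs + files:
        try:
            os.lstat(os.path.join(root, name))   # touching metadata triggers heal
            count += 1
        except OSError as exc:
            print("skipped", os.path.join(root, name), exc)
print("stat'ed", count, "entries")
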
>
> Larry Bates
> vitalEsafe, Inc.
>
> ------------------------------

I should have added the other reasons for choosing this configuration:

6) Hardware RAID REQUIRES hard drives that support TLER, which forces you
to use enterprise drives that are much more expensive than desktop drives.
Lastly, 7) I'm an old-timer with over 30 years of experience doing this,
and I've seen almost every RAID5 array that was ever set up fail due to
some "glitch" where the controller just decides that multiple drives have
failed simultaneously. Sometimes it takes a couple of years, but I've seen
a LOT of arrays fail this way, so I don't trust RAID5/RAID6 with vital
data. Hardware RAID10 is OK, but it would have more than doubled my
storage cost. My goal was highly available, mid-performance (30MB/sec)
storage that is immune to single device failures and can be rebuilt
quickly after a failure.

Larry Bates
vitalEsafe, Inc.


