RAID
I've been working on building up a storage/services machine using four 1TB SATA drives in a RAID configuration, and I've learned a few things.
1) The motherboard (Asus K8N-DRE) has an onboard RAID controller which is apparently capable of RAID 0, 1, 0+1 and 5. I intended to put the OS on a 5th drive - an older 120GB ATA drive - and stripe/mirror the four 1TB drives in a 0+1 array, which would apparently provide 1.8TB of striped and mirrored disk; very safe. RAID is configured with a BIOS tool. All went well, but when I booted, the OS reported 4x1TB drives instead of 1x1.8TB. Research showed that
2) This is known as fakeRAID. Rather than being a true RAID controller, the onboard chipset provides "multi-channel disk controllers combined with special BIOS configuration options and software drivers to assist the OS in performing RAID operations" (from here). The quoted article also describes the value of this type of arrangement: "The most common reason for using fakeRAID is in a dual-boot environment, where both Linux and Windows must be able to read and write to the same RAID partitions." Well, I'm not doing that, so it's back to square one with this whole RAID thing.
3) Further research shows that while there is a performance hit when using software RAID, it is not onerous, especially in this context. Better yet, a software array is entirely portable: I can move the disks to a different hardware setup (even a different OS) and expect the RAID to work perfectly.
4) Setting up software RAID is not too complicated. A few sites seem to point the way for RAID 1, RAID 0 and so forth, but I want to look at RAID 5 and RAID 10 too.
Links:
http://advosys.ca/viewpoints/2007/04/setting-up-software-raid-in-ubuntu-server/
https://help.ubuntu.com/community/Installation/RAID1
http://riseuplabs.org/grimoire//storage/software-raid/
http://juerd.nl/site.plp/debianraid
The last one looks at RAID 5 as well.
This one is the one I used:
http://www.gagme.com/greg/linux/raid-lvm.php
I am now building the RAID 5 array - 200 minutes and counting.
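For the record, the creation step boils down to a single mdadm command, something like this (the device names are assumptions based on how my drives happened to enumerate, each carrying a single partition of type fd):

mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1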
I can check on the status of the array with this command:
mdadm --detail /dev/md0
and I can watch the process live by doing this:
watch cat /proc/mdstat
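A side note on point 3 above: the array metadata lives in superblocks on the disks themselves, so my understanding is that moving the whole set to another Linux box should just be a matter of reassembling, roughly:

mdadm --assemble --scan

which tells mdadm to look for RAID superblocks and bring up whatever arrays it finds.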
The general idea is that I'll end up with about 3TB of disk that can sustain a complete single-disk failure. I went with this because the data we'll be storing here will be duplicates (or will be duplicated elsewhere by default). It is not going to be an archive machine; rather, it will be a service/data duplicator for projects running on TAPoR boxes.
Next step is to duplicate services. The machine was built as a stock Ubuntu LAMP+PgSQL stack, so it's mostly config tasks to complete at this point, but I will have to deploy a Tomcat stack that works in concert with Apache (like the TAPoR machines). I'll need an in-service from sysadmin, I expect. I'll also take a crack at MapServer and deploying viHistory.
UPDATE: the disk is now finished, formatted and mounted. It's 3TB of ext3 goodness. I had numerous problems with the creation of a filesystem - too much information is available on such subjects - so I got some advice, and some unexpected hands-on help, from Evan.
What it boiled down to was: create the filesystem directly on the device (/dev/md0) with mkfs rather than partitioning it first with fdisk. He also edited /etc/mdadm/mdadm.conf to include the details of the array.
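In command terms, I believe that advice works out to something like the following (ext3 matches what I used; the mount point /srv/data is just a placeholder, and the mdadm line appends the array details to the config rather than hand-editing it):

mkfs.ext3 /dev/md0
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
mkdir /srv/data
mount /dev/md0 /srv/data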
For future reference, if a drive fails, I need to mark it failed and remove it from the array using mdadm, then remove it physically, replace it, use sfdisk to copy the partition table from a remaining disk to the new one, then add the new drive to the array and watch it start to sync.
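In command form, my understanding of that procedure is roughly this, using /dev/sdb as a stand-in for the failed drive and /dev/sda as a survivor. First, fail and remove the dead drive:

mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1

Swap the physical drive, then copy the partition table from a surviving disk to the new one:

sfdisk -d /dev/sda | sfdisk /dev/sdb

Finally, add the new partition back into the array:

mdadm /dev/md0 --add /dev/sdb1

and /proc/mdstat should show the resync begin.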