We mentioned before that there were some really old backups (six months old) of Radix's data that we were going to restore at some point. It turns out that these files are actually about a year old. They include a complete image of /home as of 26-04-2004 and a snapshot of files that had changed from that point until 22-05-2004.
But the news keeps getting better. The full image snapshot has been corrupted. Right now I've only been able to pull about 5 home directories off. I don't know if we'll get any more unless we can find a TAR recovery tool. The best bets to look for recovered data are in:
/srv/RECOVERED/public/old-backups/partial-full-2004-04-26/home
/srv/RECOVERED/public/old-backups/incr-2004-05-22/home
We'll let you know if we pull anything else useful out of here.
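For anyone curious what a "TAR recovery" pass might look like, here is a minimal sketch using Python's standard tarfile module. It assumes the full image is an uncompressed tar archive (we have not confirmed the format here), and the function name and paths are made up for illustration. The ignore_zeros option tells tarfile to skip empty or invalid blocks instead of stopping at the first one, which sometimes lets you get past a damaged region.

```python
import os
import tarfile


def salvage_members(archive_path, dest_dir):
    """Copy out every regular file that can still be read from a damaged tar.

    Returns the list of member names that were recovered.
    """
    recovered = []
    # ignore_zeros=True: skip empty/invalid blocks rather than treating
    # them as the end of the archive.
    with tarfile.open(archive_path, mode="r:", ignore_zeros=True) as archive:
        while True:
            try:
                member = archive.next()
            except (tarfile.TarError, OSError):
                break  # header damage we cannot get past
            if member is None:
                break  # clean end of archive
            if not member.isfile():
                continue
            parts = member.name.split("/")
            if ".." in parts:
                continue  # never write outside dest_dir
            target = os.path.join(dest_dir, member.name.lstrip("/"))
            try:
                data = archive.extractfile(member).read()
            except (tarfile.TarError, OSError):
                continue  # this file's data is unreadable, try the next
            os.makedirs(os.path.dirname(target) or dest_dir, exist_ok=True)
            with open(target, "wb") as out:
                out.write(data)
            recovered.append(member.name)
    return recovered
```

This only rescues file contents, not ownership or permissions, and it will not help if the damage is in the middle of the data you care about.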
The copy of old data proceeds. As we mentioned before, the /home directories were restored for all but about 10 people, and it looks like we completely lost the /www tree (a skeleton is back there now, which you can begin to reconstruct). /crypt/users was unaffected by the crash and is about 90% copied back. Only the people with the largest directories are still coming over. You can find everything that has made it back in /srv/RECOVERED/public/crypt-users. THIS IS A TEMPORARY LOCATION ONLY. /srv/RECOVERED is only where we are copying recovered data to. Even though you can write to it, consider this a read-only location. We will be removing all files from there eventually, so use it only for recovering your old data; do not put new content there.
We are moving to a monolithic /home disk for all user data. This means that what you used to put in /crypt/users, you should now put in your home directory. We are keeping local snapshots of the home directories, but there's too much data in there to perform off-site backups. We have currently deployed a 2 Gigabyte quota on /home for all users, and a shared 10 Gigabyte quota for anything owned by the Apache user (this means all "gallery" installations will be subject to a shared 10 Gigabyte quota). More news to follow.
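If you want a rough idea of how close you are to the 2 Gigabyte limit before copying your old /crypt/users data into your home directory, something like the following sketch can total up a directory. It is only an approximation: it adds up file sizes, whereas real disk quotas count allocated blocks, and it assumes "2 Gigabyte" means 2 * 1024^3 bytes. The function names are ours, not part of any system tool.

```python
import os

# Assumption: the quota is 2 GiB; the real limit may be counted differently.
QUOTA_BYTES = 2 * 1024**3


def directory_usage(path):
    """Sum the sizes of all files under path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def quota_report(path, quota=QUOTA_BYTES):
    """Return (bytes used, fraction of quota used) for path."""
    used = directory_usage(path)
    return used, used / quota


if __name__ == "__main__":
    used, fraction = quota_report(os.path.expanduser("~"))
    print(f"{used} bytes used, {fraction:.0%} of quota")
```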
So to sum up:
For all you Thunderbird and Eudora loving folks out there, you can once again relay mail through Radix's MTA, but you have to properly authenticate (before, we had a hack in place so you didn't have to configure this). Basically, you have to configure your mail client's Outgoing SMTP Server to use both TLS (may be called SSL) and a Username/Password (possibly called SMTP-AUTH).
Thunderbird
Go to Tools :: Account Settings, then select "Outgoing Server (SMTP)" on the left-hand side. Set the following:
Now you should be able to send mail through Radix. When you first try to do so it will ask you to accept the TLS certificate presented (we'll have something to quiet that down later), and then ask for your password. This is the same as your usual Radix password. You should be all set!
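If you send mail from a script rather than a mail client, the same two requirements (TLS plus a username and password) apply. Here is a minimal sketch using Python's standard smtplib. The hostname radix.cryptio.net is taken from the webmail address and is assumed to be the SMTP host as well; port 587 is also an assumption, so adjust it if the server listens elsewhere. Because the certificate currently has to be accepted by hand in mail clients, the default certificate check in this sketch may well reject it until that is sorted out.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a simple plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_radix(msg, username, password,
                   host="radix.cryptio.net",  # assumed SMTP host
                   port=587,                  # assumed submission port
                   smtp_factory=smtplib.SMTP):
    """Relay msg using STARTTLS followed by SMTP-AUTH."""
    context = ssl.create_default_context()
    with smtp_factory(host, port) as server:
        server.starttls(context=context)  # switch the connection to TLS
        server.login(username, password)  # your usual Radix password
        server.send_message(msg)
```

The smtp_factory parameter exists only so the function can be exercised without a live server; normal callers can ignore it.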
Radix now thinks it's hosting all the sites it used to. Of course, we don't have the data to go in said websites, but the framework is back up. If you have files you want to put back in the web tree, you can find your DocumentRoots in /var/www/[www.yourdomain.com]/doc/
First of all, we're running Debian Linux now instead of FreeBSD. Someone pointed out that would be worth mentioning.
Also, WRT data recovery, we've copied back all the home directories we were able to recover. If your data isn't there, we couldn't recover it. We have the disk, though, and may consider hiring a professional data recovery service if there are enough people who want to chip in for it (our only hesitation is that these folks are usually very expensive for only a marginal rate of success).
/crypt/users was unaffected and is currently copying over to the new box at a very slow pace. I wouldn't expect all the data to make it over for another week or so (we're talking nearly 100 Gigs of photos and whatnot).
The old backups are also copying over, and should come online sometime after this weekend. We'll note when they are available.
Everything that we've been able to recover and that is safe for public consumption is available in /srv/RECOVERED/public/. We've already copied over everything that has completed a copy, so the only reason to look here is to see if maybe your /crypt/users directory got copied over earlier. DO NOT COPY THIS DATA TO YOUR HOME DIRECTORY YET.
Note that we are currently running without a net: there are no backups being performed and there is no disk space management happening on /var (where /home is currently located). If any single user fills up the disk, it will stop machine operations for all other users.
Squirrelmail is available again, and can be accessed at:
https://radix.cryptio.net/squirrelmail/
If you have trouble logging in, please email root@radix.cryptio.net from a separate account. If you have no other access to email please contact your friendly admin directly between the hours of 9am and 9pm Pacific time.
At this point the most essential services are up and running, so we are going to try and get some sleep. Some things will come back up over the weekend, and we continue to transfer data from the old disks to the new ones, which should complete some time next week (there is a lot of data).
We have successfully recovered everyone's mail spool and restarted mail and DNS services. This means that mail is flowing in to Radix from various points around the net. We have recovered nearly 80% of /home directory data and re-enabled SSH access to the box. Web mail and remote IMAP/POP access are not yet configured. ETA for those is still sometime tomorrow, err, today....ummm...later...
Around midnight on Monday, March 14, Radix suffered a physical disk failure that took the system down. The team worked late on Tuesday night in order to replace broken hardware and attempt a painstaking recovery from the affected disk.
As of Wednesday morning, recovered data is being fed on to the new hard drives and we are working to restore basic services such as shell access, mail processing, and web mail access. We hope to have the system stabilized for public access sometime on Thursday, March 17.
DNS requests for domains hosted on Radix are distributed over a network of machines unaffected by this issue, and email bound for those domains is being queued at various locations for delivery when the system comes back up.
Please continue to visit this web page for further information.