Friday, January 15, 2010

Clean Install with Windows 7 upgrade license - from Windows XP to Windows 7

This post describes the little trick that let me upgrade from Windows XP to Windows 7 with an upgrade license while still doing a clean install. I probably don't need to mention that this is only legal if you own a valid Windows XP license (I don't know whether it works for Vista too, but it should).
When I got my hands on a Windows 7 upgrade license a while ago I didn't know how much trouble it would be to upgrade my Windows XP SP3. After purchasing the license I was able to download my Windows 7 upgrade media. To my surprise this was not a bootable iso-file. Instead I got two box-files and one exe-file which, when started, simply extracts the box-files onto my harddisk into a folder named expandedSetup. In this folder I found five folders (boot, efi, sources, support and upgrade) and four files (autorun.inf, bootmgr, bootmgr.exe and setup.exe). Looks like a DVD structure, doesn't it?
Normally Microsoft's upgrade policy wants people to start the upgrade process by running setup.exe from within a running Windows session. But that was impossible in my case because I wanted to upgrade from a 32bit Windows XP to a 64bit Windows 7. The 64bit setup.exe simply wouldn't start on the 32bit platform (it just threw an error message). I have read that Microsoft has since fixed this and now lets you download an iso-file. Thank you ba******.
But I had these files on my laptop now. So what did I do? First I made a bootable iso out of the expandedSetup folder. For that I needed a valid Vista or Windows 7 bootable medium from which I could extract the BootImage. I used the "Windows 7 RC" iso, which I had legally downloaded from the Microsoft servers last year. If you ask the unholy G. for it you will definitely find a description of how to get your hands on a working BootImage and how to burn it together with the expandedSetup folder onto a DVD (didn't I tell you it looks like a DVD?! ;-)). The fun part started when I finally had my bootable Windows 7 media.
The thing is that all Windows 7 install media are essentially the same, and only the product key you enter during installation holds the information about the license. The trick was that I didn't give the product key during the installation; I just finished it without one. After the installation was done and I had logged in for the first time I had 30 days to activate my Windows 7. Once logged in I opened the registry editor by typing regedit into the search field of the start menu and pressing ENTER. There I looked for the key
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\OOBE
There I set the entry
MediaBootInstall to 0.
After that I reset the license status of Windows 7. I typed cmd into the search field of the start menu, confirmed with Ctrl+Shift+ENTER and clicked OK on the appearing UAC window.
This opened a command prompt with administrative privileges. There I typed
slmgr -rearm
and hit ENTER. After a short while I was asked to restart. After the restart and login I hit the Windows+Pause keys, and at the bottom of the window that opened I could enter my product key and finally activate my copy of Windows 7.
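For reference, the registry tweak and the rearm can also be done entirely from that elevated command prompt instead of going through regedit. This is just a sketch of the equivalent commands, not exactly what I typed:
rem set MediaBootInstall to 0 (the same value edited in regedit above)
reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Setup\OOBE" /v MediaBootInstall /t REG_DWORD /d 0 /f
rem reset the license status, then reboot
slmgr -rearm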
And I was done.
Again: this works, but it should only be done with legal copies and valid licenses of Windows XP and Windows 7.

cheers

Wednesday, January 13, 2010

Increase performance of an encrypted RAID5

Maybe someone has read my last post on encrypting a RAID5 with cryptsetup (dm-crypt/LUKS). I knew that the encryption layer would cost me some performance, but I was not aware of how much. Keep in mind that we're talking about an encrypted software RAID5, which means we're losing performance both by calculating the parity information and by adding an encryption layer. If you don't need or want encryption, or if you use a hardware RAID controller, the performance should of course be much better. Nevertheless, the trick I needed to apply also holds true for unencrypted RAID arrays.
So after the RAID5 was set up I started to copy my data back from the backup disk. Curious as I am, I did this with the time command. But first let's look at what amount of data we're talking about:
root@server:~# du -csh /mnt/RAID/DATEN/
752G    /mnt/RAID/DATEN/
752G    total
That's quite a bit. But hey, needs must, so let's fire it up:
root@server:~# time cp -a /mnt/BACKUP/DATEN/ /mnt/RAID/

real    789m46.834s
user    0m11.391s
sys     55m13.869s
oh f*** that took a while. Simple math gives us something like 16MiB/s write performance, and that is not acceptable. OK, I know that copying such an amount of data with cp can't be taken as a serious write performance test, but it gives you an idea. Searching through the net I quickly found out what I had missed while creating the encrypted RAID5 and especially the ext3 filesystem on it. I have to admit that I still don't fully understand all the chunk-size, stripe-size, block-size harddisk slang 1337 5P34K, but I could at least extract the information I needed.
By default mdadm sets the chunk size of the RAID array to 64K. I've read somewhere that for a file server with large files you could increase this to 128K or 256K. I don't know if that's true or how big the performance gain would be, but I do know that I didn't want to spend another two days reshaping the RAID array. So I stuck with 64K.
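If you want to check which chunk size an existing array uses, mdadm will tell you (just a quick sketch, assuming the array is /dev/md0):
sudo mdadm --detail /dev/md0 | grep -i chunk
# /proc/mdstat also prints it, e.g. "64k chunk" in the md0 line
cat /proc/mdstat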

I have to make a little insertion here. I also didn't care much about partition alignment. As far as I understand it, it is not that important for a software RAID where you assemble the array out of partitions, e.g. /dev/sda1, /dev/sdb1 etc. It may have an impact when using whole devices (/dev/sda etc.) instead and putting the partition table on top of the RAID device /dev/md0. I simply wanted to avoid any trouble with the partitions and running into the situation of reshaping the RAID array again.

Ok back to the topic. The last time I put the filesystem on the encrypted RAID device I used
me@server:~$ sudo mkfs.ext3 -c /dev/mapper/cryptraid
which created the filesystem with a default block size of 4K. But because the data is written to the RAID in 64K chunks (see above) I should have passed additional parameters to mkfs.ext3. This time I did a
me@server:~$ sudo mkfs.ext3 -b 4096 -R stride=16,stripe-width=32 /dev/mapper/cryptraid
which tells the filesystem that the data comes in chunks of 16 blocks (16*4K = 64K) and that the stripe-width of my three disk array is 32 blocks. The math goes like this:
  • chunk-size = 64K (that's what mdadm uses as default, maybe not the best choice here... but anyway)
  • block-size = 4K (that's the recommended block size for a large-file filesystem)
  • stride = chunk-size / block-size = 64K / 4K = 16 blocks
  • stripe-width = stride * [(x disks in RAID5) - 1] = 16 * [(3)-1] = 32 blocks
If the chunk-size is 64K, it means that 64K of consecutive data will reside on one disk. With a 4K block-size there are therefore 16 filesystem blocks in one array chunk, which is the stride. The stripe-width is calculated by multiplying the stride=16 value with the number of data disks in the array. That's why we use x - 1: per stripe, one disk's worth of space holds the parity information.
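To make the arithmetic explicit, here is a tiny shell sketch (not from my original session) that derives the two values; as far as I know, newer e2fsprogs document the same settings as extended options via -E, with -R being the older spelling I used above:
# derive the ext3 stride / stripe-width values from the RAID geometry (sketch)
CHUNK_KB=64        # mdadm default chunk size in KiB
BLOCK_KB=4         # ext3 block size in KiB
DATA_DISKS=2       # 3-disk RAID5 minus one disk's worth of parity
STRIDE=$((CHUNK_KB / BLOCK_KB))         # 16 filesystem blocks per chunk
STRIPE_WIDTH=$((STRIDE * DATA_DISKS))   # 32 blocks of data per full stripe
echo "mkfs.ext3 -b 4096 -R stride=$STRIDE,stripe-width=$STRIPE_WIDTH /dev/mapper/cryptraid"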
Making a simple dd if=/dev/zero of=/mnt/RAID/test.dat bs=X count=Y benchmark gives:
  • bs=   4K count = 480000 = 2GiB ==>  80.9 MiB/s
  • bs=   8K count = 240000 = 2GiB ==>  88.8 MiB/s
  • bs=  16K count = 120000 = 2GiB ==>  98.0 MiB/s
  • bs=  32K count =  60000 = 2GiB ==> 107.0 MiB/s
  • bs=  64K count =  30000 = 2GiB ==>  93.1 MiB/s
  • bs= 128K count =  15000 = 2GiB ==>  94.2 MiB/s
  • bs= 256K count =   7500 = 2GiB ==>  95.8 MiB/s
  • bs= 512K count =   3750 = 2GiB ==> 103.0 MiB/s
  • bs=1024K count =   1875 = 2GiB ==>  93.5 MiB/s
so it seems the maximum write performance of about 107 MiB/s is reached around bs=32K, with a second peak at bs=512K. Maybe this should tell me something, but right now it doesn't. ;-)
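If you want to reproduce these numbers, a small loop does the job. This is just a sketch; note that the conv=fdatasync makes dd flush to disk before reporting, which I didn't use in the runs above:
# quick-and-dirty write benchmark over several block sizes (roughly 2 GiB per run)
TESTFILE=/mnt/RAID/test.dat
for BS_KB in 4 8 16 32 64 128 256 512 1024; do
    COUNT=$((2 * 1024 * 1024 / BS_KB))     # keep the total amount of data constant
    echo -n "bs=${BS_KB}K: "
    dd if=/dev/zero of=$TESTFILE bs=${BS_KB}K count=$COUNT conv=fdatasync 2>&1 | tail -n 1
    rm -f $TESTFILE
done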

But reading through several posts on the net, people kept mentioning that one should check the stripe_cache_size value of the RAID.
root@server:~# cat /sys/block/md0/md/stripe_cache_size
256
seemed quite low to me, because everyone was talking about stripe_cache_size values of 4K, 8K and 16K. So I decided to set it to 8K
root@server:~# echo 8192 > /sys/block/md0/md/stripe_cache_size
and to fire up the copying again. This time I got
root@server:~# time cp -a /mnt/BACKUP/DATEN/ /mnt/RAID/

real    234m27.764s
user    0m19.860s
sys     58m33.030s
which gives something like 55MiB/s and looks much better than before. hdparm shows me around 90MiB/s for buffered disk reads, which together with the current write performance is enough for my needs.
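One thing worth noting: the stripe_cache_size setting lives in sysfs and does not survive a reboot. To make it stick you could re-apply it at boot time; a minimal sketch, assuming the array is /dev/md0, would be a line like this in /etc/rc.local:
# re-apply the larger stripe cache at every boot
# (roughly: memory use = value * 4K page * number of member disks)
echo 8192 > /sys/block/md0/md/stripe_cache_size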

I'd appreciate any comments and tips on this topic because, first of all, I'm really not very deep into this and, second, I really don't know which of the two tunings had the bigger influence on the performance increase. Very helpful information can be found here, here and here and of course on the unholy G.

cheers

Tuesday, January 5, 2010

Encrypted RAID5 with cryptsetup (dm-crypt + LUKS)

Ok folks, after successfully converting a RAID1 to a RAID5 on my home server I decided to play around with the machine a little bit more.
With cryptsetup, a package available from the Ubuntu repositories, I easily encrypted the whole RAID5 array on my ubuntu-server 8.04 LTS. Cryptsetup combines dm-crypt, which does the actual encryption, with LUKS for key management.
This is what I did:

OK, first things first: I installed the cryptsetup package
me@server:~$ sudo apt-get install cryptsetup
After installation I loaded the corresponding kernel modules with
me@server:~$ sudo modprobe dm-crypt
me@server:~$ sudo modprobe dm_mod
me@server:~$ sudo modprobe aes
me@server:~$ sudo modprobe cbc
Some guides mention that loading only dm-crypt is enough, but somehow that didn't work for me. You can check with
me@server:~$ lsmod
if they were loaded successfully.
me@server:~$ cat /proc/crypto
showed me the available crypto stuff. After unmounting the RAID5 with
me@server:~$ sudo umount /dev/md0
I could haul out the big guns
me@server:~$ sudo cryptsetup luksFormat -c aes-cbc-essiv:sha256 -s 256 -y /dev/md0

WARNING!
========
This will overwrite data on /dev/md0 irrevocably.

Are you sure? (Type uppercase yes):
of course I'm sure, I have backups ;-) (yeah, I'm a little picky about that).
  • -s gives the key length in bits, so here it is 256bit, which is also the maximum size
  • -y makes sure the passphrase is asked for twice 
  • -c gives the encryption parameters:
    • aes is the encryption algorithm used
    • cbc is the block mode used
    • essiv is the mode for the initialization vector
    • sha256 is the hash function used for it
  • and /dev/md0 is of course the device to be formatted.
Enter LUKS passphrase:
Verify passphrase:
Command successful.
What actually happened is that a header with the crypto information was written to the RAID5. Then I opened the RAID5 with
me@server:~$ sudo cryptsetup luksOpen /dev/md0 cryptraid
Enter LUKS passphrase:
key slot 0 unlocked.
Command successful.
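To double-check what was just created, cryptsetup can show the LUKS header and the state of the mapping; a quick sketch:
sudo cryptsetup luksDump /dev/md0      # cipher, key slots etc. from the LUKS header
sudo cryptsetup status cryptraid       # the active device-mapper mapping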
Now cryptraid can be used like any other block device (like a partition /dev/sdXY) on your system. The first thing I did was to make sure that no old data can be restored
me@server:~$ sudo dd if=/dev/zero of=/dev/mapper/cryptraid bs=10M
dd: writing `/dev/mapper/cryptraid': No space left on device
286160+0 records in
286159+0 records out
3000597344256 bytes (3,0 TB) copied, 86115 s, 34,8 MB/s
because encrypted zeros look exactly like any other encrypted data. As you can see this took a while (fyi: a day has 86400s... so you get the point). The next thing was to put a filesystem on it, so I did a
me@server:~$ sudo mkfs.ext3 -c /dev/mapper/cryptraid
[sudo] password for me:
mke2fs 1.40.8 (13-Mar-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
183148544 inodes, 732567711 blocks
36628385 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
22357 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000, 550731776, 644972544

Checking for bad blocks (read-only test): done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 36 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Thanks to the -c this again took a little while, about 16 hours. But hey, who doesn't like to check for bad blocks? The final steps are just cosmetics. Never check the filesystem again, please, neeevvveeerrr.
me@server:~$ sudo tune2fs -c 0 -i 0 /dev/mapper/cryptraid
[sudo] password for me:
tune2fs 1.40.8 (13-Mar-2008)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
Then I set journal_data as a default mount option in the superblock
me@server:~$ sudo tune2fs -o journal_data /dev/mapper/cryptraid
and enabled the dir_index filesystem feature
me@server:~$ sudo tune2fs -O dir_index /dev/mapper/cryptraid
and did one more short(!) check
me@server:~$ sudo e2fsck /dev/mapper/cryptraid
e2fsck 1.40.8 (13-Mar-2008)
/dev/mapper/cryptraid: clean, 11/183148544 files, 5823464/732567711 blocks
To effectively use the encrypted RAID5 it needs to be mounted (lo and behold!)
me@server:~$ sudo mount /dev/mapper/cryptraid /mnt/RAID/
and done.

Because the server is a headless machine that I only use via a remote console, I didn't create a crypttab entry or put mount points into fstab. I will simply write a script which handles luksOpen --> mount and umount --> luksClose respectively (a rough sketch follows below), so I can decide when to use the encrypted RAID and when not. Keep in mind that after luksOpen everyone who has access to the machine can see and work with the decrypted data. Always do a luksClose after unmounting. But the management of an encrypted data server is up to everyone's own degree of built-in paranoia ;-).
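Something along these lines would do; just a sketch with my device names and mount point, not a finished script:
#!/bin/sh
# cryptraid.sh (sketch): open/close the encrypted RAID on demand
case "$1" in
  open)
    cryptsetup luksOpen /dev/md0 cryptraid && mount /dev/mapper/cryptraid /mnt/RAID
    ;;
  close)
    umount /mnt/RAID && cryptsetup luksClose cryptraid
    ;;
  *)
    echo "usage: $0 open|close" >&2
    exit 1
    ;;
esac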

cheers

A few sources I used and abused were this, this and that apart from the almighty mother of doom G.

Monday, January 4, 2010

Convert a 2 - disk - Raid1 to a 3 - disk - Raid5 with mdadm

Ok folks, with this post I document how I converted a two disk RAID1 array (2x1.5TB = 1.5TB) into a three disk RAID5 array (3x1.5TB = 3TB) on my home server running ubuntu-server 8.04 LTS. The good thing about the whole story is that I was able to do this without any data loss, thanks to the linux-raid-tool-of-god mdadm.
To make a long story short, I simply added a third disk of the exact same size to the array and extended the filesystem.
OOOOOOkkkkkkaaaaayyyyy it's not that simple but almost:

First a few details on my RAID1 array. I built it a year ago with two Seagate ST31500341AS 1.5TB disks on the same ubuntu server we're talking about now (yeah... never change a running... you know). The two disks actually performed quite well, so no need to get rid of them, although they needed to be patched with a new firmware right after I purchased them. Maybe you heard about the Seagate firmware drama.
But as we all know, size matters, and 3TB is more than 1.5TB, so I decided to buy two additional disks to make myself a happy new year. After working through a thousand tests and recommendations (at least it felt like a thousand) I got myself a pair of Western Digital WD15EADS 1.5TB Caviar Green disks and fitted them into my server case.
After rebooting I checked whether they were recognized correctly or not:
me@server:~$ cd /dev/disk/by-id/
me@server:/dev/disk/by-id$ ls -la
total 0
drwxr-xr-x 2 root root 840 2010-01-03 20:10 .
drwxr-xr-x 6 root root 120 2009-12-31 11:09 ..

lrwxrwxrwx 1 root root   9 2009-12-31 11:09 ata-ST31500341AS_9VS0AFRP -> ../../sdb
lrwxrwxrwx 1 root root  10 2009-12-31 11:09 ata-ST31500341AS_9VS0AFRP-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 ata-ST31500341AS_9VS0DEF3 -> ../../sda
lrwxrwxrwx 1 root root  10 2009-12-31 11:09 ata-ST31500341AS_9VS0DEF3-part1 -> ../../sda1
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 ata-WDC_WD15EADS-00S2B0_WD-WCAVY1287280 -> ../../sdd
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 ata-WDC_WD15EADS-00S2B0_WD-WCAVY1325320 -> ../../sdc
.
.
.
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 scsi-1ATA_ST31500341AS_9VS0AFRP -> ../../sdb
lrwxrwxrwx 1 root root  10 2009-12-31 11:09 scsi-1ATA_ST31500341AS_9VS0AFRP-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 scsi-1ATA_ST31500341AS_9VS0DEF3 -> ../../sda
lrwxrwxrwx 1 root root  10 2009-12-31 11:09 scsi-1ATA_ST31500341AS_9VS0DEF3-part1 -> ../../sda1
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 scsi-1ATA_WDC_WD15EADS-00S2B0_WD-WCAVY1287280 -> ../../sdd
lrwxrwxrwx 1 root root   9 2009-12-31 11:09 scsi-1ATA_WDC_WD15EADS-00S2B0_WD-WCAVY1325320 -> ../../sdc
It seems they are. Because I'm no good friend of Mr. Murphy, I decided to back up the data from the RAID1 to one of the new disks and add the other one to the array. After creating a partition on /dev/sdd I mounted it and backed up all the data:
me@server:~$ sudo cp -a /mnt/RAID/ /mnt/sdd
did the trick. To my surprise this worked like a charm, even for my rsnapshot folder containing half a year of incremental backups. Btw, take a look at rsnapshot for incremental backups on local or remote machines. It works on Linux as well as on Windows (Cygwin) and is really easy to set up. But back to the RAID. First I created a partition on the third disk:
me@server:~$ sudo fdisk /dev/sdc
I made sure that the new partition had the exact same size (in blocks, not in bytes) as the two other ones and that the partition type is fd (Linux RAID autodetect).
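A quick way to compare the partition tables as a sanity check (just a sketch; device names as in my setup):
# dump the partition tables (start and size in sectors) and compare them
sudo sfdisk -d /dev/sda
sudo sfdisk -d /dev/sdc
# or simply list them all and eyeball the block counts
sudo fdisk -l /dev/sda /dev/sdb /dev/sdc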

Now I started with the funny stuff:
I first unmounted the RAID
me@server:~$ sudo umount /dev/md0
Then I stopped the RAID
me@server:~$ sudo mdadm --stop /dev/md0
Now I created the RAID5 metadata and overwrote the RAID1 metadata
me@server:~$ sudo mdadm --create /dev/md0 --level=5 -n 2 /dev/sda1 /dev/sdb1
At this point I got warnings that the two disks seemed to be part of a raid array already and I was asked if I really wanted to proceed with creating the array. I said yes, but you know, I had backups ;-). What actually happened then is that I created a two disk RAID5 array, which mdadm initially treats as degraded and then rebuilds; this took something like 7 to 8 hours. I kept another terminal open and checked the progress from time to time with
me@server:~$ cat /proc/mdstat
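Instead of re-running that by hand, something like watch does the polling for you (a small aside, not part of my original workflow):
# refresh the rebuild/reshape status every 60 seconds
watch -n 60 cat /proc/mdstat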
After mdadm was finished I mounted the RAID again and was very happy that the data was still there. I was very happy...
Now I added the new disk to the array
me@server:~$ sudo mdadm --add /dev/md0 /dev/sdc1
At this point the third disk is added as a spare device to the array. Growing the array to its new size was done with
me@server:~$ sudo mdadm --grow /dev/md0 --raid-disks=3 --backup-file=/home/me/raid1to5backup.file
To be honest, I gave the --backup-file parameter to be on the safe side in case something went wrong while growing, but somehow the file was never created. When I checked the output of /proc/mdstat again I got a little nervous. It was telling me that the reshape would take something like 13000 minutes at 1412K/sec. No kidding. After some googling I found a nice tweak to speed up the process a bit.
me@server:~$ cat /proc/sys/dev/raid/speed_limit_max
200000
me@server:~$ cat /proc/sys/dev/raid/speed_limit_min
1000
When I changed these values to
me@server:~$ sudo su
root@netzwerker:/home/me# echo 20000000 > /proc/sys/dev/raid/speed_limit_max
root@netzwerker:/home/me# echo 25000 > /proc/sys/dev/raid/speed_limit_min
root@netzwerker:/home/me# exit
I got a significant speed-up to about 5000K/sec. I didn't push any harder because atop showed me that the disks were now around 90% busy. After more than 14 hours I was still at
me@server:~$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdc1[2] sdb1[1] sda1[0]
      1465135936 blocks super 0.91 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      [====>................]  reshape = 24.7% (362362752/1465135936) finish=3174.3min speed=5788K/sec

unused devices: <none>
but hey, haste makes waste, so I was patient. Finally, after 48 hours, the reshaping was finished. Now I did a
me@server:~$ sudo e2fsck -f /dev/md0
e2fsck 1.40.8 (13-Mar-2008)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md0: 240560/91578368 files (1.3% non-contiguous), 325896911/366283984 blocks
to check the filesystem on /dev/md0, which was still only 1.5TB. After that I grew it to its final size of 3TB with
me@server:~$ sudo resize2fs -p /dev/md0
resize2fs 1.40.8 (13-Mar-2008)
Resizing the filesystem on /dev/md0 to 732567968 (4k) blocks.
Begin pass 1 (max = 11178)
Extending the inode table     XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
The filesystem on /dev/md0 is now 732567968 blocks long.
Done. I now have a working three disk RAID5 array with an effective storage capacity of 2.8TB. Thanks to the linux community for such a great piece of software.
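One small follow-up worth mentioning (not part of the steps above, so treat it as a suggestion): if /etc/mdadm/mdadm.conf contains an ARRAY line describing the old two disk setup, it may be worth regenerating it so it matches the new layout, roughly like this:
# print the current array definition ...
sudo mdadm --detail --scan
# ... and replace the old ARRAY line in /etc/mdadm/mdadm.conf with that output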

Cheers.

Inspiration for the whole stunt I found mainly here, here and here apart from using the big bad G.