Building a Graphics Workstation: Hard Drives
June 21, 2008
In my experience, the major bottleneck when working with large files is the hard drive. One can argue that having enough RAM avoids swapping to the hard drive, but one still needs to boot the operating system, load applications, and open and save files. When processing panoramas, e.g. from Sydney, the temporary files frequently reached 10-15 GB and the resulting stitched image was a 16-bit TIFF file of somewhere between 500 and 800 MB. Adding a few layers in Photoshop quickly resulted in scratch sizes of more than 3-4 GB. 8 GB of RAM would likely increase the speed of processing, but one still has to address the issue of hard drive 'speed'.
Issues to consider regarding the choice of hard drives:
- Interface: SATA, IDE, SAS
- solid state drives versus magnetic
- RAID
- access time
- transfer rates
- reliability
- capacity
- cost
It is likely that you still have a few older IDE hard drives with photos. Unless you want to transition completely to the newer SATA interface, make sure that you have at least one IDE port on your motherboard. SAS hard drives are aimed at servers: they have fast access times but are also quite expensive. That leaves SATA hard drives as the best value.
What to look for when considering a particular SATA hard drive: manufacturers generally specify capacity, random access time, and rotational speed. Generally, the higher the rotational speed, the shorter the access time and the higher the transfer rate. Most current SATA hard drives run at 7200 rpm. WD offers the 'Raptor' series with a rotational speed of 10,000 rpm. Other information that might be useful is the recording technology (e.g. perpendicular recording) and the number of platters inside. Perpendicular recording allows higher areal data density on the platters, which improves transfer rates at the same rotational speed. Features like NCQ do not seem to affect performance for desktop applications much, but might result in less noise.
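To make the link between rotational speed and access time concrete, here is a minimal sketch that computes the average rotational latency, i.e. the time the platter needs for half a revolution. This is only one component of the access time a manufacturer quotes (seek time is the other and is not modelled here), and the function name is my own.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds.

    On average the requested sector is half a revolution away from the head,
    so the latency is half of the time needed for one full revolution.
    """
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


for rpm in (7200, 10000):
    print(f"{rpm} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

For 7200 rpm this gives about 4.17 ms and for 10,000 rpm 3.00 ms, so the faster spindle saves a bit over a millisecond per random access before seek time is even considered.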
For photography one needs a lot of hard drive space, and the faster the hard drive the better. Of course, cost becomes an issue. One of many possible ways to address this is to separate the operating system from the data and put them on separate hard drives.
The operating system, applications and temporary (scratch) files could get the fastest hard drive you can afford, as they require fast access times and do not require a huge amount of space. I usually find that 20-30 GB is enough for the 32-bit Windows XP operating system. You might be able to estimate the minimal size of your scratch disk based on your previous use. I usually allocate about 50 GB to the scratch disk. One also needs to account for the Windows XP page file (which is about 1.5x the capacity of RAM) and the location of the Lightroom library (if you use this application), which can reach tens of GB depending on your preview settings.
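The numbers above can be added up to get a rough minimum size for the fast drive. The sketch below is only an illustration: the 30 GB and 50 GB defaults and the 1.5x page file factor come from the text, while the 20 GB Lightroom figure and the 4 GB of RAM in the example are placeholder assumptions that you should replace with your own values.

```python
def fast_drive_budget_gb(ram_gb, os_gb=30, scratch_gb=50,
                         lightroom_gb=20, pagefile_factor=1.5):
    """Return (breakdown, total) of the space needed on the fast drive, in GB.

    lightroom_gb is a placeholder: the library size depends on preview settings.
    """
    breakdown = {
        "operating system and applications": os_gb,
        "scratch disk": scratch_gb,
        "page file": pagefile_factor * ram_gb,
        "Lightroom library": lightroom_gb,
    }
    return breakdown, sum(breakdown.values())


# Example with a hypothetical 4 GB of RAM.
parts, total = fast_drive_budget_gb(ram_gb=4)
for name, size in parts.items():
    print(f"{name}: {size:.0f} GB")
print(f"total: {total:.0f} GB")
```

With these assumptions the total comes to 106 GB, which suggests that even a relatively small fast drive is enough for this role.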
Solid state drives are based on Flash technology and have very low access times. Their transfer rates are improving and prices are decreasing. They seem very suitable as operating system drives. I will be considering a very interesting solution from Sandisk when the product becomes available in the near future.
For data one can use high-capacity hard drives. Data tend not to be accessed as often, but the files are generally large. Currently the optimal capacity seems to be 640-750 GB, which has the lowest cost per GB, and given their high data densities (250-333 GB per platter) these drives offer good transfer rates.
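Comparing cost per GB is simple arithmetic, and a few lines of code make it easy to repeat whenever prices change. In the sketch below the prices are made-up placeholders, not real quotes, and the function names are my own; plug in current prices from your preferred shop.

```python
def cost_per_gb(capacity_gb, price):
    """Price divided by capacity, in currency units per GB."""
    return price / capacity_gb


def best_value(drives):
    """Return the name of the drive with the lowest cost per GB.

    drives maps a name to a (capacity_gb, price) tuple.
    """
    return min(drives, key=lambda name: cost_per_gb(*drives[name]))


# Placeholder prices for illustration only; these are NOT real market data.
example_drives = {
    "500 GB": (500, 80.0),
    "640 GB": (640, 85.0),
    "750 GB": (750, 100.0),
    "1000 GB": (1000, 200.0),
}

for name, (capacity, price) in example_drives.items():
    print(f"{name}: {cost_per_gb(capacity, price):.3f} per GB")
print("best value:", best_value(example_drives))
```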
Many people take advantage of the RAID controllers available on modern motherboards. There are a few issues one needs to be aware of before deciding on RAID. RAID 1 (data mirroring) copies data to two (or more) hard drives simultaneously. The idea is that if one of your hard drives dies, you still have the other copy, and you can replace the faulty hard drive and rebuild your RAID 1. The problem arises if your RAID controller dies (e.g. you need to replace a motherboard with the Intel ICH9R chip). There is a chance you will have a hard time reading your data. RAID 1 is not a full backup solution.
RAID 0 (data striping) writes data to two (or more) hard drives simultaneously, e.g. half of the file to each hard drive. In theory it sounds good, as you would expect 2x the write and read speeds (in a two-drive RAID 0) compared to a single hard drive. In practice the increase in speed is significantly less unless you work with very large files. The likelihood of losing your data is increased with RAID 0, as you will lose all data if any one of the drives fails.
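The difference in risk between the two RAID levels can be estimated with basic probability. The sketch below assumes that drives fail independently and that each has the same failure probability over some period; the 5% figure is a hypothetical value chosen only for illustration. It also ignores controller failure, which, as noted above, can make a RAID 1 array hard to read.

```python
def raid0_loss_probability(p_drive, n_drives):
    """RAID 0 loses all data if ANY drive fails."""
    return 1.0 - (1.0 - p_drive) ** n_drives


def raid1_loss_probability(p_drive, n_drives):
    """RAID 1 loses data only if ALL mirrored drives fail."""
    return p_drive ** n_drives


p = 0.05  # hypothetical per-drive failure probability over the period
print(f"single drive:   {p:.4f}")
print(f"2-drive RAID 0: {raid0_loss_probability(p, 2):.4f}")
print(f"2-drive RAID 1: {raid1_loss_probability(p, 2):.4f}")
```

Under these assumptions a two-drive RAID 0 is nearly twice as likely to lose data as a single drive (0.0975 versus 0.05), while a two-drive RAID 1 is far less likely (0.0025).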
Reliability of hard drives: they will all fail eventually, some of them on arrival, others after a few years. Google has data on the reliability of different brands, but is not sharing them. Some additional data to help make an informed decision in this regard is available here. I suggest that you check all your new hard drives for errors before you start using them. That might save you a lot of time later.
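A very crude first check of a new drive is to write a large file to it, read it back and compare checksums. The sketch below does that; it is my own illustration and not a replacement for the manufacturer's diagnostic tool or a full surface scan. Note that the operating system may serve the read from its cache rather than from the platters, so use a test size larger than your RAM if you want the data to really come from the disk. The function name and default sizes are assumptions.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, total_bytes=64 * 1024 * 1024,
                     chunk_bytes=1024 * 1024):
    """Write random data to a temporary file in directory, read it back,
    and return True if the SHA-256 checksums of both passes match."""
    written = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            remaining = total_bytes
            while remaining > 0:
                chunk = os.urandom(min(chunk_bytes, remaining))
                f.write(chunk)
                written.update(chunk)
                remaining -= len(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the drive

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_bytes), b""):
                read_back.update(chunk)
        return written.hexdigest() == read_back.hexdigest()
    finally:
        os.remove(path)  # always clean up the test file


if __name__ == "__main__":
    # Point this at a directory on the drive you want to test.
    print("OK" if write_and_verify(tempfile.gettempdir()) else "MISMATCH")
```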