It’s all in the firmware. Here’s why you should care:
The Western Digital Black one terabyte drive is an outstanding value with a street price of $99, but the manufacturer advises against using it in RAID arrays. It turns out that they have good reason. Modern high-density drives are designed to fail regularly but safely. With all those magnetic domains packed so close to each other, even perpendicular recording technology won’t keep the magnetic records from influencing one another in naughty ways. Modern consumer drives monitor the health of each sector of data, and if the signal-to-noise ratio drops too low, the drive packs up the data and moves it to a fresher portion of the disk platter. This happens far more often than you might expect, and is the principal reason that lower-density drives are used in enterprise settings. This means two things to the user:
- High-density drives become less reliable the closer they are to being full.
- High-density drives will sometimes stop to perform routine data recovery, for up to two minutes, while your disk controller waits.
It is the second behavior that is of real concern in a RAID setting. Most RAID controllers will only wait a short time for a drive to respond. Usually this period is as short as 10-20 seconds. If the drive hits this time limit, the controller will mark the entire drive as failed. This is not something you want to happen routinely, as your entire array is in jeopardy of going offline. The preferred behavior in RAID environments is for the drive to time out quickly and report the sector as bad, so that the RAID controller can reconstruct the missing data from the parity information stored on the other drives in the array. RAID parity strategies combined with high-density drives synergize extremely well with the fault-tolerant approach adopted in most modern high-availability infrastructure. The philosophy is to use cheap hardware, add redundancy, and then fail and recover quickly and transparently.
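To make the parity reconstruction concrete, here is a minimal sketch of the XOR parity idea used by RAID 5 style arrays. The block contents and the helper name `xor_blocks` are made up for illustration; a real controller works on whole stripes in hardware or in a driver, not in Python.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks from one stripe (hypothetical contents, equal length).
data = [b"disk0 data", b"disk1 data", b"disk2 data"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Suppose the drive holding data[1] quickly reports its sector as bad.
# XOR-ing the surviving blocks with the parity block yields the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # True
```

The point is that the controller only needs a prompt "this sector is bad" answer from the drive. It can then fill in the data itself from the other members of the array.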
So what’s the real difference between the consumer models and the “RAID Edition” drives from Western Digital? They have the same vital specs. They are even rumored to come off the same assembly line as the “Black” edition consumer-grade drives. The chief difference, apart from a 60%-80% difference in price, comes down to a firmware setting. (source) The setting prevents the drive from spending too much time attempting to recover data from a sector on disk. This feature is called Time-Limited Error Recovery (TLER). It’s turned off by default on the WD Black, and it’s turned on by default on the RE3. And guess what? You can change the defaults with a simple command-line utility, WDTLER.EXE. Wikipedia has an article on how to make the change.
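The interaction between the drive’s recovery time and the controller’s patience can be sketched as a toy model. This is only an illustration of the reasoning above, not how any firmware or controller is implemented: the function name, the return strings, and the 7-second TLER limit are assumptions, while the 10-second controller timeout and two-minute recovery come from the figures quoted earlier.

```python
def controller_outcome(recovery_seconds, controller_timeout, tler_limit=None):
    """Toy model of one bad-sector event as seen by the RAID controller.

    recovery_seconds   -- how long the drive would need to recover the sector
    controller_timeout -- how long the controller waits before failing the drive
    tler_limit         -- cap on drive-side recovery time, or None if TLER is off
    """
    # With TLER on, the drive gives up after tler_limit seconds.
    if tler_limit is not None:
        busy = min(recovery_seconds, tler_limit)
    else:
        busy = recovery_seconds

    if busy >= controller_timeout:
        return "drive dropped from array"
    if busy < recovery_seconds:
        return "sector reported bad; rebuild from parity"
    return "sector recovered by drive"


# TLER off: a two-minute recovery blows past a 10-second controller timeout.
print(controller_outcome(120, 10))                # drive dropped from array

# TLER on (assumed 7-second limit): the drive gives up first and stays online.
print(controller_outcome(120, 10, tler_limit=7))  # sector reported bad; rebuild from parity

# A quick recovery succeeds either way.
print(controller_outcome(3, 10, tler_limit=7))    # sector recovered by drive
```

The only thing that matters in this model is that the drive’s limit is shorter than the controller’s timeout.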
#1 by Jed on February 6th, 2010
Interesting write up, thanks for that!
I don’t suppose you’ve heard of anyone successfully turning a 1TB or 2TB Black into a RE4?
Sincerely,
Jed
#2 by Markus on April 20th, 2010
Bought 4 640GB WD Black drives last week, but it seems that the WDTLER tool is not available anymore and, most of all, is not supported anymore. Do you know if this will still work?
I don’t want my RAID to go into degraded mode once a week or so just because of the long error correction times.
Any advice?
Cheers,
Markus