The SSD Anthology: Understanding SSDs and New Drives from OCZ
by Anand Lal Shimpi on March 18, 2009 12:00 AM EST
Posted in Storage
The Anatomy of an SSD
Let’s meet Mr. N-channel MOSFET again:
Say Hello
This is the building block of NAND-flash; one transistor is required per cell. A single NAND-flash cell can store either one or two bits of data. If it stores one, then it’s called Single Level Cell (SLC) flash, and if it stores two then it’s Multi Level Cell (MLC) flash. Both are physically made the same way; in fact there’s nothing that physically separates MLC from SLC flash, it’s just a matter of how the data is stored in and read from the cell.
SLC flash (left) vs. MLC flash (right)
Flash is read from and written to in a guess-and-test fashion. You apply a voltage to the cell and check to see how it responds. You keep increasing the voltage until you get a result.
Operation | SLC NAND flash | MLC NAND flash |
Random Read | 25 µs | 50 µs |
Erase | 2 ms per block | 2 ms per block |
Programming | 250 µs | 900 µs |
With four voltage levels to check, MLC flash takes around 3.6x longer to write to than SLC (900 µs vs. 250 µs per program operation). On the flip side you get twice the capacity at the same cost. Because of this distinction, and the fact that even MLC flash is more than fast enough for an SSD, you’ll only see MLC used for desktop SSDs while SLC is used for enterprise level server SSDs.
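To make the guess-and-test idea concrete, here is a minimal sketch in Python. It is not how any particular controller works: the 4 V window, the evenly spaced reference voltages, and the function name `read_cell` are all assumptions chosen for illustration. It only shows why a cell with four levels can need more comparisons than a cell with two.

```python
SLC_LEVELS = 2  # one bit per cell
MLC_LEVELS = 4  # two bits per cell


def read_cell(cell_threshold_v, levels, v_max=4.0):
    """Step a reference voltage upward until the cell responds.

    Returns (stored_value, comparisons_made). The voltage window (v_max)
    and the even spacing of reference voltages are illustrative
    assumptions, not figures from a real NAND datasheet.
    """
    # Split the window into `levels` bands; the boundaries between bands
    # are the reference voltages to try, lowest first.
    step = v_max / levels
    references = [step * i for i in range(1, levels)]
    for tries, v_ref in enumerate(references, start=1):
        if cell_threshold_v < v_ref:  # the cell responds at this voltage
            return tries - 1, tries
    # The cell never responded, so it sits in the highest band.
    return levels - 1, len(references)
```

In this toy model an SLC read always finishes after one comparison, while an MLC read can take up to three, which is the intuition behind the slower MLC figures in the table above.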
Cells are strung together in arrays as depicted in the image to the right
So a single cell stores either one or two bits of data, but where do we go from there? Groups of cells are organized into pages, the smallest structure that’s readable/writable in an SSD. Today 4KB pages are standard on SSDs.
Pages are grouped together into blocks; today it’s common to have 128 pages in a block (128 x 4KB = 512KB in a block). A block is the smallest structure that can be erased in a NAND-flash device. So while you can read from and write to a page, you can only erase a block (128 pages at a time). This is where many of the SSD’s problems stem from. I’ll repeat this again later because it’s one of the most important parts of understanding SSDs.
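The page/block asymmetry is easier to see in a small model. The Python sketch below is a deliberately naive illustration: the `Block` class and the `rewrite_page` helper are hypothetical names, and the copy-erase-rewrite strategy is the simplest possible one, not what a real controller does. The one rule it encodes, that a programmed page has to be erased before it can be written again and that erasing happens a whole block at a time, is standard NAND behavior.

```python
PAGE_SIZE_KB = 4
PAGES_PER_BLOCK = 128
BLOCK_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_BLOCK  # 512 KB, as in the text


class Block:
    """Toy model of one NAND block: page-level read/write, block-level erase."""

    def __init__(self):
        # None marks an erased page that is ready to be written.
        self.pages = [None] * PAGES_PER_BLOCK

    def read(self, page_no):
        return self.pages[page_no]

    def write(self, page_no, data):
        if self.pages[page_no] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page_no] = data

    def erase(self):
        # There is no per-page erase: all 128 pages go at once.
        self.pages = [None] * PAGES_PER_BLOCK


def rewrite_page(block, page_no, data):
    """Naive in-place update of one page. Returns the number of pages written."""
    saved = list(block.pages)  # copy every page out of the block
    saved[page_no] = data      # change the one page we care about
    block.erase()              # erase the whole block
    written = 0
    for i, content in enumerate(saved):
        if content is not None:
            block.write(i, content)  # write everything back
            written += 1
    return written
```

With a full block, changing a single 4KB page this way means writing all 128 pages (512KB) again, which is the kind of overhead the rest of the article keeps coming back to.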
Arrays of cells are grouped into a page, arrays of pages are grouped into blocks
Blocks are then grouped into planes, and you’ll find multiple planes on a single NAND-flash die.
The combining doesn’t stop there; you can usually find either one, two or four die per package. While you’ll see a single NAND-flash IC, there may actually be two or four die in that package. You can also stack multiple ICs on top of each other to minimize board real estate usage.
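The hierarchy multiplies out to a package capacity. The Python sketch below does that arithmetic; the page and block sizes come from the text, but `BLOCKS_PER_PLANE` and `PLANES_PER_DIE` are made-up values for illustration only, since the article does not give them and they vary from part to part.

```python
PAGE_KB = 4                # from the text
PAGES_PER_BLOCK = 128      # from the text
BLOCKS_PER_PLANE = 2048    # assumption, for illustration only
PLANES_PER_DIE = 2         # assumption, for illustration only
DIES_PER_PACKAGE = 4       # the text mentions one, two or four die


def package_capacity_kb(dies=DIES_PER_PACKAGE,
                        planes=PLANES_PER_DIE,
                        blocks=BLOCKS_PER_PLANE):
    """Multiply the hierarchy out: page -> block -> plane -> die -> package."""
    block_kb = PAGE_KB * PAGES_PER_BLOCK  # 512 KB per block
    plane_kb = block_kb * blocks
    die_kb = plane_kb * planes
    return die_kb * dies


capacity_gb = package_capacity_kb() / (1024 * 1024)
print(f"{capacity_gb:.0f} GB per package under these assumptions")
```

Under these assumed figures a four-die package works out to 8 GB; changing any of the assumed constants scales the result accordingly.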
250 Comments
zdzichu - Sunday, March 22, 2009
Very nice and thorough article. I only lack more current status of TRIM command support in current operating systems. For example, Linux supports it since last year: http://kernelnewbies.org/Linux_2_6_28#head-a1a9591...
Sinned - Sunday, March 22, 2009
Outstanding article that really helped me understand SSD drives. I wonder how much of an impact the new SATA III standard will have on SSD drives? I believe we are still at the beginning stage for SSD drives and your article shows that much more work needs to be done. My respect for OCZ and how they responded in a positive and productive way should be a model for the rest of the SSD makers. Thank you again for such a concise article.
Respectfully,
Sinned
529th - Sunday, March 22, 2009
The first thing I thought of was Democracy. Don't know why. Maybe it was because a company listened to our common goal of performance. Thank you OCZ for listening, I'm sure it will pay off!!!
araczynski - Saturday, March 21, 2009
very nice read. the 4/512 issue seems a rather stupid design decision, or perhaps more likely a stupid problem to find this 4/512 solution as 'acceptable'.
although a great marketing choice, built in automatic life expectancy reduction.
sounds like the manufacturers want the hard drives to become a disposable medium like styrofoam cups.
perhaps when they narrow the disparity down to 4/16, i might consider buying an ssd. that, or when they beat the 'old school' physical platters in price.
until then, get back to the drawing board and stop crapping out these half arsed 'should be good enough' solutions.
IntelUser2000 - Sunday, March 22, 2009
araczynski: The 4/512 isn't done by accident. It's done to lower prices. The flash technology used in SSDs are meant to replace platter HDDs in the future. There's no way of doing that without cost reductions like these. Even with that the SSDs still cost several times more per storage space.
araczynski - Tuesday, March 24, 2009
i understand that, but i don't remember original hard drives being released and being slower than the floppy drives they were replacing.
this is part of the 'release beta' products mentality and make the consumer pay for further development.
the 5.25" floppy was better than the huge floppy in all respects when it was released. the 3.5" floppy was better than the 5.25" floppy when it was released. the usb flash drives were better than the 3.5" floppies when they were released.
i just hate the way this is being played out at the consumer's expense.
hellcats - Saturday, March 21, 2009
Anand,
What a great article. I usually have to skip forwards when things bog down, but they never did with this long, but very informative article. Your focus on what matters to users is why I always check anandtech first thing every morning.
juraj - Saturday, March 21, 2009
I'm curious what capacity is the OCZ Vertex drive reviewed. Is it an 120 / 250g drive or supposedly slower 30 / 60g one?
Symbolics - Friday, March 20, 2009
The method for generating "used" drives is flawed. For creating a true used drive, the spare blocks must be filled as well. Since this was not done, the results are biased towards the Intel drives with their generous amount of spare blocks that were *not* exhausted when producing the used state. An additional bias is introduced by the reduction of the IOmeter write test to 8 GB only. Perhaps there are enough spare blocks on the Intel drives so that these 8 GB can be written to "fresh" blocks without the need for (time-consuming) erase operations.
Apart from these concerns, I enjoyed reading the article.
unknownError - Saturday, March 21, 2009
I also just created an account to post, very nice article!
Lots of good well thought out information, I'm so tired of synthetic benchmarks glad someone goes through the trouble to bench these things right (and appears to have the education to really understand them). Whats with the grammar police though? geez...