Write Amplification: The Next Device Optimization Battle?

Wikipedia describes Write Amplification as “an undesirable phenomenon associated with Flash memory and solid-state drives (SSDs) where the actual amount of physical information written is a multiple of logical amount intended to be written,” and offers this formula to calculate it:

$$\frac{\text{Total data written to media}}{\text{Data written by user}} = \text{Write Amplification}$$

The Wikipedia article is technically correct, but only tells part of the story in my opinion. Write amplification does occur in flash memory such as raw NAND, e•MMC and SSDs with file systems like JFFS2, YAFFS2, UBIFS or ext4, as Wikipedia describes, but it also occurs in servers, desktops, laptops, tablets, cellphones and anywhere else data is stored.

If the Wikipedia formula above is the definition of write amplification, then it also occurs on Android, Linux, Windows, Windows CE, Windows Mobile, and any RTOS application that allows a mismatch between the size of individual writes and the logical block size of the media. One place where this mismatch will occur that may not be obvious is in the writing of metadata.
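To make the size mismatch concrete, here is a minimal Python sketch that applies the formula to writes smaller than the media's block size. The function name and the 512-byte block size are assumptions for illustration, and the model assumes every write is rounded up to whole blocks with no buffering or coalescing, which real storage stacks may well do.

```python
import math

def write_amplification(write_sizes, block_size=512):
    """Bytes written to media divided by bytes the user asked to write.

    Assumes each write is rounded up to whole blocks and nothing is coalesced.
    """
    user_bytes = sum(write_sizes)
    if user_bytes == 0:
        raise ValueError("no user data written")
    media_bytes = sum(math.ceil(n / block_size) * block_size for n in write_sizes)
    return media_bytes / user_bytes

# 100 separate 64-byte writes each consume a full 512-byte block.
small_writes = write_amplification([64] * 100)   # 51200 / 6400
# The same 6400 bytes in one write needs 13 blocks (6656 bytes).
one_big_write = write_amplification([6400])      # 6656 / 6400
```

Under these assumptions the many small writes are amplified eightfold, while the single large write costs only about 4% extra.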

Metadata functions as a map to pinpoint where the user data resides on the physical media, as well as to record contextual information such as date, time and directory. In order to be accessible later, data files and their associated location metadata must be completely written to the media before power to the system is turned off. As files are updated, the metadata must also be updated periodically. Frequent updates to metadata will likely result in increased overhead (write amplification); infrequent updates, however, increase the risk of data loss when power is lost unexpectedly.

$$\frac{\text{Total data written to media}}{\text{Data written by user}} = \frac{\text{User data} + \text{metadata written}}{\text{Data written by user}} = \text{Write Amplification}$$
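The trade-off between metadata update frequency and overhead can be sketched with a toy model in Python. Everything here is an assumption for illustration: user data arrives one block at a time, a metadata flush costs a fixed number of blocks, and a flush happens after every `flush_interval` user blocks.

```python
def amplification_with_metadata(user_blocks, flush_interval, metadata_blocks_per_flush=1):
    """(user data + metadata written) / user data, in blocks.

    Hypothetical model: metadata is flushed once per `flush_interval` user blocks.
    """
    if user_blocks <= 0 or flush_interval <= 0:
        raise ValueError("user_blocks and flush_interval must be positive")
    flushes = -(-user_blocks // flush_interval)  # ceiling division
    return (user_blocks + flushes * metadata_blocks_per_flush) / user_blocks

# Amplification for 1000 user blocks at three flush intervals.
results = {interval: amplification_with_metadata(1000, interval)
           for interval in (1, 10, 100)}
```

In this model, flushing after every block doubles the data written, while flushing every 100 blocks adds only 1%; the price is that up to 99 blocks written since the last flush have no metadata on the media if power fails.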

 

Hard disks of the past didn’t suffer from write amplification; data could be easily mapped by sector since disk sectors translated to the physical location of data on the disk, and therefore metadata to define the location was not needed. Identifying which sector held which data was managed by accessing the physical area defined by the logical block address (LBA). This is no longer true for modern hard disk drives, which now identify bad sectors and use metadata to remap data to replacement sectors. Modern hard disk drives now also store error correcting codes (ECC) within each sector in order to assure data accuracy. Assuming a 64-bit (8-byte) ECC for a 512-byte sector, the write amplification is relatively small at 520 to 512 (about 1.6% write amplification overhead).
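The ECC overhead is simple arithmetic on the sector size. A minimal Python sketch, with an illustrative function name and ECC sizes chosen only as examples rather than taken from any particular drive:

```python
def ecc_write_amplification(sector_bytes=512, ecc_bits=64):
    """(sector + ECC bytes) / sector bytes for one sector write."""
    ecc_bytes = ecc_bits / 8
    return (sector_bytes + ecc_bytes) / sector_bytes

# Compare two example ECC sizes for a 512-byte sector.
for bits in (64, 128):
    wa = ecc_write_amplification(ecc_bits=bits)
    print(f"{bits}-bit ECC: {512 + bits // 8} bytes per sector, "
          f"overhead {100 * (wa - 1):.1f}%")
```

A 64-bit code adds 8 bytes per sector (520 to 512), and a 128-bit code adds 16 bytes (528 to 512); either way the overhead stays in the low single-digit percent range.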

The FAT file system uses a 32-bit value to map a block of file system data. So if a block of data is 512 bytes, a 32-bit value written to keep track of that block translates into a ratio of 128 to 1. Unfortunately, writing the 32-bit value and making sure it gets to the media adds another 512 byte block of data that must be written, giving us a ratio of 2 to 1 (write amplification = 2). What’s worse, FAT file systems typically maintain two copies, so another 512 byte block is written to update the other FAT, making the ratio 3 to 1 when performed in a single operation – a less than optimal write amplification ratio by almost any standard.
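The FAT arithmetic above can be sketched in Python as follows. The names are illustrative, and the model assumes that the data blocks are contiguous, that their 32-bit entries pack into as few FAT sectors as possible, and that each FAT copy is written once per operation.

```python
import math

SECTOR_BYTES = 512
FAT_ENTRY_BYTES = 4                                    # one 32-bit entry per data block
ENTRIES_PER_SECTOR = SECTOR_BYTES // FAT_ENTRY_BYTES   # 128 entries fit in one sector

def fat_write_amplification(data_sectors, fat_copies=2):
    """Sectors written to media per sector of user data (hypothetical model)."""
    fat_sectors = math.ceil(data_sectors / ENTRIES_PER_SECTOR) * fat_copies
    return (data_sectors + fat_sectors) / data_sectors

single_fat = fat_write_amplification(1, fat_copies=1)  # 1 data + 1 FAT sector
mirrored_fat = fat_write_amplification(1)              # 1 data + 2 FAT sectors
batched = fat_write_amplification(128)                 # 128 data + 2 FAT sectors
```

Writing one block at a time gives the 2 to 1 and 3 to 1 ratios described above. If 128 blocks can be written before the FAT is flushed, the same two FAT sectors are amortized over all of them and the ratio in this model drops to 130 to 128.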

A database operates much like a file system in that it uses metadata to track user data. The above formula for write amplification applies to databases too. Also, like many file systems, databases use transactions to make sure all the data is on the disk and the database does not get corrupted due to unexpected power loss. This reliability requirement increases write amplification even further!

In a paper entitled “Revisiting Storage for Smart Phones,” NEC Laboratories highlighted this problem of exponential metadata growth by analyzing the data use of an Android mobile phone running common applications like Facebook and Twitter. The applications downloaded only 1.6 MB over the wireless network in a two-hour period, yet they wrote nearly 30 MB to the flash. There may be other factors at play, but by rough estimates the SQLite and YAFFS2 Android storage stack appears to have a write amplification factor of about 20! The graph below illustrates how much data came from the network vs. the total written out to the flash memory.

Graphic from “Revisiting Storage for Smart Phones,” by Kim, Agrawal, Ungureanu NEC Labs
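The rough factor of 20 follows from dividing the two figures quoted from the paper. A short Python check, treating the data downloaded over the network as the "data written by user" (an assumption of this estimate, not something the paper is quoted as stating):

```python
network_mb = 1.6   # downloaded over the wireless network in two hours
flash_mb = 30.0    # "nearly 30 MB" written to flash, used here as an upper figure

estimated_wa = flash_mb / network_mb
print(f"estimated write amplification: about {estimated_wa:.1f}x")
```

The ratio comes out a little under 19, which rounds to the factor of roughly 20 cited above.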

The ramifications of a 20x write amplification factor in terms of power usage, performance and device life are significant, making me wonder how much unnecessary work device manufacturers are doing to boost processor speed, lengthen battery life, and deal with endurance issues, when what they really should be doing is taking a look at reducing write amplification.

Read More About Datalight File Systems

 

RoySherrill | May 8, 2012 | Uncategorized

CES: The Embedded Storage Perspective

Steve Ballmer did a nice job kicking off the keynotes on Monday night with an impassioned presentation about Windows Phone, Windows 8, and Xbox 360, but when it ended I found myself wondering why he didn’t talk about any other Microsoft products. Windows Phone looks pretty slick though, and I’m assuming it will run on eMMC flash for data storage. Next year Microsoft will be passing the baton to someone else for their traditional opening keynote and will not be back – not sure what (if anything) that means. We’ll also have to wait and see if Ryan Seacrest will be invited back…

Sony announced a new flash memory card promising even faster performance, which goes to show users are still looking for faster flash-based devices and manufacturers are paying attention. One of the guys in the Sony booth also mentioned a flash card, coming soon, that can read at up to 1 GB/second. He didn’t have any samples available, but he sure enjoyed telling me about it.

There were tons of SSDs displayed at the show. They’re not very exciting to look at since they all look the same, but check out these specs! 80K random read IOPS and 36K random write IOPS – amazing!

There was a lot of talk about super thin and light ultrabooks, which sounds like yet another Windows product following in Apple’s footsteps (MacBook Air). After schlepping my heavy Dell all over Storage Visions and CES though, I have to admit feeling some pangs of want; the new Windows ultrabooks look awfully sleek and comfortable to work with. The latest version demands an on-circuit board SSD design to meet the form factor and weight requirements.

Coming to a beach near you: water-proof cell phones! I love the ocean and beach life, but I’m in the habit of leaving my phone at home. This demo showing water-proof phones in an aquarium was certainly eye-catching, though I wonder if people really want to be connected while they’re swimming. Clearly the eMMC in these phones is water-proof as well, but we have yet to see any in our labs. Then again, our engineers have been known to work and swim.

Ken Jacobson with Qualcomm had a nice keynote presentation Tuesday morning, including “augmented reality,” a new feature that allowed him to animate several plastic Sesame Street characters, which were talking and interacting with the demonstrator. It being Vegas, Qualcomm decided to include some very funky dancers just for pure entertainment value.

Automotive was hot this year, with lots of really nice cars. It was a little odd to me that graphics and CPU chip maker NVIDIA had a Lamborghini in its booth, but it definitely got my attention!

Car telematics demos were everywhere. This in-dash version looked nice, but I was surprised that last year’s big concept of connecting a cell phone to function as your telematics device was nowhere to be seen. It seemed like such a good idea given that this technology becomes obsolete so much faster than the average life of a car. Do they really expect us to use the same telematics for 15 years? Or is there some kind of planned obsolescence at work here?

Forget about driving a Prius; this year’s uber-environmentalist should be driving this solar car. It only costs a few hundred thousand dollars, but think of how much you’ll save in gas. Note to my fellow Seattleites: Do not attempt between October and April, you may not reach your destination.

Below is a concept car from Audi that will out-Smartcar the Smartcar.

RoySherrill | January 23, 2012 | Uncategorized

Breaking Through the Sub-20nm NAND Flash Barrier

That cracking you may or may not have heard last month was the sound of SanDisk and Toshiba breaking the sub-20 nanometer NAND barrier. Flying in the face of conventional wisdom (and more than a few industry analysts), both companies recently announced they will be delivering 19nm NAND this year. Intel and Micron are close behind, each with their own 20 nanometer announcements. Those who said it couldn’t (or shouldn’t) be done had some very compelling reasons, chiefly that the physics behind multi-cell architecture in a 1x nanometer cell are shaky at best. How many electrons will there be in a 1x nanometer cell? How many levels of data can possibly be detected with so few of them? The supporting technologies for this detection, not to mention for correcting the unavoidable errors that will creep in, will be critical.

In an industry that has come to expect product innovation in the form of shrinking die sizes being announced roughly every 12-18 months, keeping pace with this trend indefinitely is pushing not only the boundaries of physics but also manufacturers’ technical abilities. How low can they go? While the introduction of 19nm parts shows that innovation and scaling of NAND Flash memory continue at breakneck speed, one wonders when the end point of this shrinkage will finally come. And while the drive for NAND innovation has dramatically improved both the cost and performance of the technology, moving to ever smaller die sizes is beginning to have severe consequences for data storage reliability and flash endurance – challenges which must be addressed not only by the supporting hardware technologies but also by the file system and flash management software. Bottom line: Will the devices you’re responsible for provide the performance, life span and flexibility your customers require? What contingencies should you be planning for as the storage technologies get ever smaller?

Learn more about Datalight flash management software

RoySherrill | June 27, 2011 | Flash Memory Manager, Uncategorized

Flash Memory & Android Dominate ESC Silicon Valley

The topic of storage technology seemed to be everywhere at last week’s Embedded Systems Conference in San Jose, appearing in numerous keynote speeches, presentations and exhibit booths. It appears the industry is waking up to the difficulties of storing and managing a torrent of data being produced by new mobile applications.

Micron’s presence both on the show floor and in conference sessions highlighted their philosophy of creating application-specific storage technologies. In particular, Tom Eby’s keynote address considered both ends of the device and storage spectrum, dividing the market into devices that run applications and those that don’t, that is, devices that demand lots of storage and those that run on meager memory systems (i.e., feature phones). In an interesting side note for Micron customers, Tom announced Micron’s Product Longevity Program (also referred to as PLP), which assures developers of the availability of Micron flash parts for a 10-year period – especially helpful for makers of long-life-cycle embedded products. Also from Micron, Wanmo Wong gave an excellent presentation on flash file system options for Linux and Android in which he laid out a laundry list of questions that must be answered when making that selection. It was just the right amount of detail for a Linux and flash memory newbie, highlighting the sheer number of issues that must be addressed when selecting the right flash file system for a particular application.

The Woz (Steve Wozniak, chief scientist at Fusion-io) gave a lively fireside chat on the challenges and roadblocks for passionate engineers, from societal problems like our school systems’ failure to nurture brilliant engineering minds, to the difficult balance between getting a (sometimes tedious) job done and following your engineering passion.

Virtually every storage technology was on display from embedded storage vendors around the globe, from PCM (Phase Change Memory) chips, to eMMC 4.41 parts, to on-board storage, USB and 2.5-inch HDD form factor solutions. Flash storage solutions were presented by Apacer, Innodisk, Viking, STEC and others. In a sea of slick memory packaging, the Viking example below really screamed embedded to me…

On the Android front, Datalight demonstrated how open source and great proprietary solutions can come together and give developers the best of both worlds. Our temporary home in booth 2320 featured this sneak preview of our upcoming Android support on a TI BeagleBoard. The tiny-but-slick Android demo is shown at Jimm’s right elbow.

Learn more about Datalight products for Linux/Android

 

RoySherrill | May 11, 2011 | Uncategorized