Write Amplification: The Next Device Optimization Battle?

Wikipedia describes Write Amplification as “an undesirable phenomenon associated with Flash memory and solid-state drives (SSDs) where the actual amount of physical information written is a multiple of logical amount intended to be written,” and offers this formula to calculate it:

$$\frac{\text{Total data written to media}}{\text{Data written by user}} = \text{Write Amplification}$$

The Wikipedia article is technically correct, but only tells part of the story in my opinion. Write amplification does occur in flash memory such as raw NAND, e•MMC and SSDs with file systems like JFFS2, YAFFS2, UBIFS or ext4, as Wikipedia describes, but it also occurs in servers, desktops, laptops, tablets, cellphones and anywhere else data is stored.

If the Wikipedia formula above is the definition of write amplification, then it also occurs on Android, Linux, Windows, Windows CE, Windows Mobile, and any RTOS application that allows a mismatch between the size of individual writes and the logical block size of the media. One place where this mismatch will occur that may not be obvious is in the writing of metadata.

Metadata functions as a map to pinpoint where the user data resides on the physical media, as well as to record contextual information such as date, time and directory. In order to be accessible later, data files and their associated location metadata must be completely written to the media before power to the system is turned off. As files are updated, the metadata must also be updated periodically. Frequent metadata updates increase overhead (write amplification); infrequent updates, however, increase the risk of data loss when power is lost unexpectedly.

$$\frac{\text{Total data written to media}}{\text{Data written by user}} = \frac{\text{User data} + \text{metadata written}}{\text{Data written by user}} = \text{Write Amplification}$$
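As a rough illustration of the formula and of the trade-off around how often metadata is updated, here is a minimal Python sketch. The function names and the model (one 512-byte metadata block flushed every `flush_interval` user writes) are assumptions made for illustration, not a description of any particular file system.

```python
def write_amplification(user_bytes, metadata_bytes):
    """Total bytes written to the media divided by bytes written by the user."""
    if user_bytes <= 0:
        raise ValueError("user_bytes must be positive")
    return (user_bytes + metadata_bytes) / user_bytes


def metadata_bytes_written(num_writes, flush_interval, metadata_block=512):
    """Hypothetical model: one metadata block is flushed every `flush_interval` writes."""
    flushes = -(-num_writes // flush_interval)  # ceiling division
    return flushes * metadata_block


# 1000 user writes of 512 bytes each
user_bytes = 1000 * 512

# Metadata flushed after every write: safest, but doubles the data written
every_write = write_amplification(user_bytes, metadata_bytes_written(1000, 1))

# Metadata flushed after every 10th write: less overhead, more data at risk
every_tenth = write_amplification(user_bytes, metadata_bytes_written(1000, 10))

print(every_write, every_tenth)
```

In this simplified model the only knob is the flush interval, which makes the tension visible: the lower the write amplification, the more user writes are exposed if power fails before the next metadata flush.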

 

Hard disks of the past didn’t suffer from write amplification; data could be easily mapped by sector since disk sectors translated to the physical location of data on the disk, and therefore metadata to define the location was not needed. Identifying which sector held which data was managed by accessing the physical area defined by the logical block address (LBA). This is no longer true for modern hard disk drives, which now identify bad sectors and use metadata to remap data to replacement sectors. Modern hard disk drives also store error correcting codes (ECC) within each sector in order to assure data accuracy. Assuming a 128-bit (16-byte) ECC for a 512-byte sector, the write amplification is relatively small at 528 to 512 (about 3.1% write amplification overhead).

The FAT file system uses a 32-bit value to map a block of file system data. So if a block of data is 512 bytes, a 32-bit value written to keep track of that block translates into a ratio of 128 to 1. Unfortunately, writing the 32-bit value and making sure it gets to the media adds another 512 byte block of data that must be written, giving us a ratio of 2 to 1 (write amplification = 2). What’s worse, FAT file systems typically maintain two copies, so another 512 byte block is written to update the other FAT, making the ratio 3 to 1 when performed in a single operation – a less than optimal write amplification ratio by almost any standard.
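The arithmetic in the FAT example can be checked with a short Python sketch. The helper name and the best-case assumption (FAT entries for the written blocks are contiguous and start on a FAT sector boundary) are mine, added only to show how the ratio changes with the size of the write.

```python
import math

SECTOR_BYTES = 512       # block size used in the example above
FAT_ENTRY_BYTES = 4      # one 32-bit FAT entry
ENTRIES_PER_SECTOR = SECTOR_BYTES // FAT_ENTRY_BYTES  # 128 entries per FAT sector


def fat_write_amplification(user_blocks, fat_copies=2):
    """Blocks written to the media per block of user data.

    Assumption: the FAT entries touched are contiguous and sector-aligned,
    so the write dirties as few FAT sectors as possible (best case).
    """
    fat_sectors = math.ceil(user_blocks / ENTRIES_PER_SECTOR)
    total_blocks = user_blocks + fat_copies * fat_sectors
    return total_blocks / user_blocks


print(fat_write_amplification(1, fat_copies=1))  # one data block plus one FAT block
print(fat_write_amplification(1))                # one data block plus both FAT copies
print(fat_write_amplification(128))              # a larger write shares the FAT update
```

Under these assumptions a single-block write costs 2x with one FAT and 3x with two, while a 128-block write spreads the same two FAT sector updates over far more user data.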

A database operates much like a file system in that it uses metadata to track user data. The above formula for write amplification applies to databases too. Also, like many file systems, databases use transactions to make sure all the data is on the disk and the database does not get corrupted due to unexpected power loss. This reliability requirement increases write amplification even further!

In a paper entitled “Revisiting Storage for Smart Phones,” NEC Laboratories highlighted this problem of exponential metadata growth by analyzing the data use of an Android mobile phone running common applications like Facebook and Twitter. Over a two-hour period the applications downloaded only 1.6 MB over the wireless network, yet they wrote nearly 30 MB to the flash. There may be other factors at play, but by rough estimates the SQLite and YAFFS2 Android storage stack appears to have a write amplification factor of about 20! The graph below illustrates how much data came from the network vs. the total written out to the flash memory.

Graphic from “Revisiting Storage for Smart Phones,” by Kim, Agrawal, Ungureanu NEC Labs
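The rough estimate can be reproduced from the two figures quoted above. This is only a back-of-the-envelope sketch: it treats the downloaded data as the "data written by user," which is my simplification, and the function name is hypothetical.

```python
def observed_write_amplification(mb_from_network, mb_written_to_flash):
    """Ratio of data written to flash to data the applications received."""
    return mb_written_to_flash / mb_from_network


# Figures quoted from the NEC Labs study: 1.6 MB downloaded, nearly 30 MB written
factor = observed_write_amplification(1.6, 30.0)
print(f"roughly {factor:.2f}x")
```

The result lands a little under 20, which is consistent with the rounded factor used in the text.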

The ramifications of a 20x write amplification factor in terms of power usage, performance and device life are significant, making me wonder how much unnecessary work device manufacturers are doing to boost processor speed, lengthen battery life, and deal with endurance issues, when what they really should be doing is taking a look at reducing write amplification.

Read More About Datalight File Systems

 

RoySherrill | May 8, 2012 | Uncategorized

Performance Testing Reliance Nitro and ext3/4

Recently we ran benchmarks comparing ext3 and ext4 to Reliance Nitro on eMMC and Linux. Here’s what we learned….

IOzone Test – Reliance Nitro compared to ext3/4 on eMMC

Test Environment

Operating System: Linux 2.6.39 with eMMC enhancements
Reference Platform: OMAP 4430 PandaBoard
Memory: Socketed eMMC via SD adapter (SanDisk 4 GB iNAND) with 3 partitions

  • 64 MB boot partition
  • 512 MB root file system
  • 3.4 GB test partition

 

Test Configurations

Testing was run on three different PandaBoards (two Rev A3 boards and one Rev A2 board) in two different configurations that reflect distinctly different operating environments that Linux file systems may experience.
Each test configuration was run against each file system twice on each target board, for a total of six tests per file system/configuration. The six sets of test results for each file system/configuration were then averaged together, graphed and analyzed.

The configurations were as follows:

Configuration 1 – Low RAM, fsync
In this configuration the target system had little available RAM, and the test issued the fsync command to the file system to force data to be written out to the media. This measures the performance of the file system when only small amounts of data can be cached and the data must be written out to the media. The goal of this configuration is to measure the file system overhead.

  • Designed to show hardware throughput
  • Embedded Device or High Reliability Device Use Case
  • Boot Arguments:
    root=/dev/mmcblk0p2 rw rootwait mem=64M console=ttyO2,115200n8 init=/sbin/init earlyprintk
  • Test Arguments:
    iozone -Raze -+w 1 -q 16m -g 128m

Configuration 2 – High RAM, no fsync
This test configuration had large amounts of available RAM and did not include the fsync command. This measures the performance of the file system when large amounts of data can be cached and the data need not be written to the disk, and therefore measures the file system's interaction with the cache.

  • Reflects performance through internal caches
  • Closer to desktop or data center use case
  • Boot arguments:
    root=/dev/mmcblk0p2 rw rootwait mem=463M console=ttyO2,115200n8 init=/sbin/init earlyprintk
  • Test Arguments:
    iozone -Raz -+w 1 -q 16m -g 128m

Limitations

  • Benchmarks not a normal use case
    • IOzone creates a single file, then deletes it before creating the next
    • Tests were run on freshly formatted disk
  • Test runs were single threaded
    • IOzone can test multithreaded
  • Reliance Nitro is protecting user data; ext3/ext4 journals only the file metadata

It is important to note that IOzone and the tests run are for benchmarking purposes and do not attempt to measure real world use case performance. IOzone creates a single file, and deletes it before creating the next, and the test was run against a freshly formatted disk. Both test configurations used were single-threaded. While this is the default “automatic” behavior for IOzone, it is unlikely for a real world use case. Target systems have many files that are accessed in different ways, often all at once.

Additionally, IOzone measures performance and in no way compares the data at risk for a given file system. For all tests run, Reliance Nitro is protecting user data and file metadata via transaction points while ext3 and ext4 are only protecting file metadata via the journal.

Performance Data

Performance data was generated using IOzone v3.397.

The test results are graphed in a manner that allows side-by-side comparison of each file system’s performance in each test case. The test executes the test cases against file sizes that are powers of 2 from 64 KB to 128 MB, and record sizes that are also powers of 2 from 4 KB to 16 MB.

The vertical axis of the graphs shows the performance measured during the test, in units of kilobytes per second (KB/s).

The horizontal axis shows the file sizes, and the data points between the file sizes represent the record sizes.  For example, the first data point (immediately above the 64 KB mark) is for 64 KB files with 4 KB records, the next is for 64 KB files with 8 KB records, and so on.
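To make the ordering of the data points concrete, here is a small Python sketch that enumerates the file size and record size pairs in the order described above. It assumes that a record size is never larger than the file it is written to; the function names are mine and are not part of IOzone.

```python
KB = 1024
MB = 1024 * KB


def powers_of_two(low, high):
    """Yield low, 2*low, 4*low, ... up to and including high."""
    size = low
    while size <= high:
        yield size
        size *= 2


def data_points(min_file=64 * KB, max_file=128 * MB,
                min_record=4 * KB, max_record=16 * MB):
    """Yield (file_size, record_size) pairs in left-to-right graph order.

    Assumption: record sizes larger than the file size are skipped.
    """
    for file_size in powers_of_two(min_file, max_file):
        for record_size in powers_of_two(min_record, min(max_record, file_size)):
            yield file_size, record_size


points = list(data_points())
for file_size, record_size in points[:3]:
    print(f"{file_size // KB} KB file, {record_size // KB} KB records")
```

The first pairs printed match the example in the text: 64 KB files with 4 KB records, then 64 KB files with 8 KB records, and so on.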

Configuration 2

 

In both test configurations, the read performance of all three file systems is largely comparable.

With the sequential read test, all three file systems appear to be operating outside the cache at file sizes of 32 MB and greater, which is the point where the files are too large to be cached. Ext3 and ext4 are operating within the cache on 8 MB and 16 MB file sizes. All three file systems are within the cache below 8 MB. A likely reason for Reliance Nitro operating outside the cache at smaller file sizes than ext3 and ext4 is that Reliance Nitro allocates more RAM for private structures, thereby starving the cache sooner.

 

 

 

Sequential writes are comparable in Configuration 1; in Configuration 2, ext3 and ext4 perform better due to more optimized integration with the cache.

With random writes in Configuration 1, Reliance Nitro outperforms ext3 and ext4 at larger file sizes, perhaps due to its extent-based design. This could also explain ext4’s advantage over ext3, since ext4 is also extent based. In Configuration 2, ext3 and ext4 perform random writes faster due to better integration with the cache.

IOzone Testing Conclusions

  • Read performance of all three file systems is comparable
  • Configuration 1: sequential writes are comparable; on random writes, Reliance Nitro outperforms ext3 and ext4 at larger file sizes
  • Configuration 2: ext3 and ext4 perform better on sequential and random writes due to Linux cache integration

Linux has a very cache-centric I/O architecture that, with proper support in the file system, leverages available RAM to avoid doing actual I/O to the storage media.  The media used in this test is capable of reading at about 25 MB/s and writing at about 11 MB/s.  Measured performance in excess of those constraints indicates that some or all of the data is being cached.
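The reasoning in the paragraph above can be expressed as a simple check. This Python sketch uses the approximate media limits quoted in the text (25 MB/s read, 11 MB/s write) and assumes 1 MB = 1024 KB when converting the KB/s values from the graphs; the function name and the sample measurements are hypothetical.

```python
# Approximate raw capabilities of the media, from the text above
MEDIA_LIMIT_MB_PER_S = {"read": 25.0, "write": 11.0}


def likely_cached(operation, measured_kb_per_s):
    """Return True when a result exceeds what the media alone could deliver."""
    limit = MEDIA_LIMIT_MB_PER_S[operation]
    measured_mb_per_s = measured_kb_per_s / 1024  # assumes 1 MB = 1024 KB
    return measured_mb_per_s > limit


# Hypothetical measurements, in KB/s as on the vertical axis of the graphs
print(likely_cached("write", 50_000))  # far above 11 MB/s, so the cache is involved
print(likely_cached("read", 20_000))   # below 25 MB/s, consistent with real media I/O
```

A result below the limit does not prove the cache was bypassed, so the check only works in one direction: anything above the limit must have had help from RAM.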

In Configuration 1, all three file systems tested perform comparably on reads and sequential writes. For random writes, Reliance Nitro outperforms ext3 and ext4 at file sizes of 16 MB and above, likely due to the extent-based design of Reliance Nitro.

In Configuration 2, there is ample available RAM to cache the test files and the test is not forcing the data to be written to the disk with fflush/fsync, so the read and write performance of all three file systems consistently exceeds the disk’s capabilities. Ext3 and ext4 perform the same in almost all test cases, but Reliance Nitro performs slower in write cases.  This is because Reliance Nitro integration with the Linux page cache is not optimized, resulting in a less efficient cached performance.

While optimal use of the Linux cache is valuable in terms of performance, cache utilization and its costs and risks must be considered for the individual use case. Using the cache is RAM intensive, meaning it can add both BOM cost in the form of extra memory and additional power consumption cost. Cached data also adds a reliability risk: until cached data is flushed to the media, it is vulnerable in the event of power loss. Reliance Nitro can dynamically mitigate this risk with transaction points, although further cache optimization is required to boost performance for some use cases.

 

Michele Pike | March 21, 2012 | Uncategorized

Device Longevity using Software

The new chief executive for Research in Motion Ltd., Thorsten Heins, mentioned recently that 80 to 90 percent of all BlackBerry users in the U.S. are still using older devices, rather than the latest BlackBerry 7.

Longevity of a consumer device is something that we at Datalight know belongs firmly in the hands of the product designer, rather than being limited by the shortened lifespan of incorrectly programmed NAND flash media. Both Datalight’s FlashFX Tera and Reliance Nitro incorporate algorithms which reduce the Write Amplification on all Flash media. These methods are especially important on e-MMC, which is at its heart NAND flash. In addition, the static and dynamic wear leveling in FlashFX Tera provides even wearing of all flash for maximum achievable lifetime.

A shorter lifetime may be acceptable for some consumer devices, such as low end cell phones. However, many newer converged mobile devices that command a higher price, such as tablets, are expected by consumers to have a much longer lifetime. These devices may be replaced by the primary user with some frequency, but since they are viewed as mini-computers and therefore less “disposable,” they will likely be handed down to younger users rather than being discarded or recycled. Consumers will protest if they discover their $500 tablet only has a lifespan of 3 years, and they will be even more upset if, due to flash densities and write amplification, the next version they purchase has an even shorter lifespan.

How will flash longevity affect your new embedded design?

Thom Denholm | March 6, 2012 | Extended Flash Life, Flash Industry Info, Flash Memory, Flash Memory Manager

Datalight Sponsors Local High School Robotics Team

The Arlington Neobots are not like other high school technology clubs. For one thing they have access to a phenomenal pool of mentors from local technology companies like Boeing, Microsoft and now Datalight. They also have a growing number of female members, a rarity in youth organizations oriented to math and science.

Founded in 2008 with seed money from Boeing, the team competes in an annual robot building competition created by national non-profit organization FIRST (For Inspiration and Recognition of Science and Technology), and this year the competition is already ramping up. For 2012, FIRST has challenged the robotics teams to a game similar to basketball called Rebound Rumble. Six teams are split up into two alliances of three; one alliance is blue and the other red. During the 2-minute and 15-second match, teams compete by trying to make as many baskets as they can. Part of the match is devoted to a 15-second autonomous mode where the robot is controlled through an XBox Kinect instead of the robot’s standard remote control. There are four hoops – one high, two middle, and one low. The higher the hoop, the more points awarded for making a basket in it.

The Neobots will need to work together in teams to finish their robot by the competition deadline. First, the one-week design phase involves team analysis of the game and its rules manual, and a group decision on game strategy and design criteria for the team robot. Next, the team will split into design groups to brainstorm, research and present their findings to the team. Then, using 3D models and prototypes, each group will propose a robot design to be voted on by the team. After the design is established, the build phase involves again breaking into sub-groups that are each assigned projects like System Integration, Programming, Drive-Base and other functions. The team will follow an iterative process; every major milestone will be tested rigorously before they proceed.

You might ask why Datalight would sponsor a high school robotics club. VP of Engineering Ken Whitaker puts it this way: “This is one of the most important things we can do as a technology company. What you’re seeing in its raw form is the next generation of embedded engineers, and we have a responsibility to nurture and support them. In a few years’ time I could see any of these motivated students ending up on my engineering team.”

Learn more about Datalight

RobHart | February 20, 2012 | Datalight Products

CES: The Embedded Storage Perspective

Steve Ballmer did a nice job kicking off the keynotes on Monday night with an impassioned presentation about Windows Phone, Windows 8, and Xbox 360, but when it ended I found myself wondering why he didn’t talk about any other Microsoft products. Windows Phone looks pretty slick though, and I’m assuming it will run on eMMC flash for data storage. Next year Microsoft will be passing the baton to someone else for their traditional opening keynote and will not be back; I'm not sure what (if anything) that means. We’ll also have to wait and see if Ryan Seacrest will be invited back…

Sony announced a new flash memory card promising even faster performance, which goes to show users are still looking for faster flash-based devices and manufacturers are paying attention. One of the guys in the Sony booth also mentioned a flash card that can read up to 1 GB/second that is coming soon. He didn’t have any samples available, but he sure enjoyed telling me about it.

There were tons of SSDs displayed at the show. They’re not very exciting to look at since they all look the same, but check out these specs! 80K random read IOPS and 36K random write IOPS – amazing!

There was a lot of talk about super thin and light ultrabooks, which sounds like yet another Windows product following in Apple’s footsteps (MacBook Air). After schlepping my heavy Dell all over Storage Visions and CES though, I have to admit feeling some pangs of want; the new Windows ultrabooks look awfully sleek and comfortable to work with. The latest version demands an on-circuit board SSD design to meet the form factor and weight requirements.
Coming to a beach near you: water-proof cell phones! I love the ocean and beach life, but I’m in the habit of leaving my phone at home. This demo showing water-proof phones in an aquarium was certainly eye-catching, though I wonder if people really want to be connected while they’re swimming. Clearly the eMMC in these phones is water-proof as well, but we have yet to see any in our labs. Then again, our engineers have been known to work and swim.

Ken Jacobson with Qualcomm had a nice keynote presentation Tuesday morning, including “augmented reality,” a new feature that allowed him to animate several plastic Sesame Street characters, which were talking and interacting with the demonstrator. It being Vegas, Qualcomm decided to include some very funky dancers just for pure entertainment value.
Automotive was hot this year, with lots of really nice cars. It was a little odd to me that graphics and CPU chip maker NVIDIA had a Lamborghini in its booth, but it definitely got my attention!

Car telematics demos were everywhere. This in-dash version looked nice, but I was surprised that last year’s big concept of connecting a cell phone to function as your telematics device was nowhere to be seen. It seemed like such a good idea given that this technology becomes obsolete so much faster than the average life of a car. Do they really expect us to use the same telematics for 15 years? Or is there some kind of planned obsolescence at work here?

Forget about driving a Prius; this year’s uber-environmentalist should be driving this solar car. It only costs a few hundred thousand dollars, but think of how much you’ll save in gas. Note to my fellow Seattleites: Do not attempt between October and April, you may not reach your destination.

Below is a concept car from Audi that will out-Smartcar the Smartcar.

RoySherrill | January 23, 2012 | Uncategorized

Storage Visions Awards

Last week we attended Storage Visions, a show adjacent to CES that focuses on data storage solutions. This year’s theme was Heavy Storage for thin Clients.
We were honored to be a finalist for the Storage Visions award “New Enabling Consumer Storage Technology,” for the Datalight Reliance Nitro fault-tolerant file system. Although we didn’t get to take home the trophy, we want to congratulate our partner, Micron, for winning the award with their RealSSD C400 Self Encrypting Drive.

Learn more about Datalight Embedded File Systems

Michele Pike | January 20, 2012 | Uncategorized

The Next Generation File System for Windows

There’s a lot of buzz on the MSDN blog site regarding their latest file system post. http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx - and plenty of insightful comments as well.

I for one am happy to see people talking about file system features, especially Data Integrity, knowledge of Flash Media, and faster access through B+ trees. Of course, Datalight’s own Reliance Nitro file system has had all this and more for some time now…

Microsoft has a new term for a thing we’ve seen often in the case of unexpected power loss – a “Torn Write”. They point this out as a specific problem for their journalling file system, NTFS, but updating any file system metadata in place can be problematic. It looks to me like this new file system, ReFS, handles this by bundling the metadata writes with other metadata writes or with the file data. If the former, this demonstrates the trade-off between Reliability and Performance that we are very familiar with at Datalight. Bundling smaller writes will help with spinning media and flash. In time we will see how much control the application developer has over this configuration – another important point for our customers.
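As an application-level analogy for why in-place updates are risky, the following Python sketch writes a new copy of a file and then swaps it in, so a reader sees either the old contents or the new contents rather than a partially written mix. This is only an illustration of the general idea; it is not how ReFS, NTFS or Reliance Nitro implement their metadata updates, and `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` without updating it in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to the media
        # Swap the new file in; the old contents stay intact until this point
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "settings.bin")
    atomic_write(target, b"version 1")
    atomic_write(target, b"version 2")
    with open(target, "rb") as f:
        print(f.read())
```

The cost of this approach is extra writes, which ties back to the trade-off between reliability and performance mentioned above.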

One of the commenters posted that error correction belongs at the block device layer, and I tend to agree. Microsoft’s design goal “to detect and correct corruption” is a noble one, but how would they detect corruption for user data? Additional file checksums and ECC algorithms would be intrusive and potentially time consuming. Keeping a watch on vital file system structures is important, of course, and a good backup in case block level error detection fails.

I look forward to reading more from Microsoft’s file system team in the future, and especially hope to see a roadmap for when these important changes will make it down to the embedded space.

Learn more about what happens during a power interruption.

 

Thom Denholm | January 18, 2012 | Reliability, Uncategorized

Software Perspective on eMMC

We here at Datalight are seeing a lot of interest, across a broad spectrum, in this week’s “Software Perspective on eMMC” presentation. This is apparently a pretty hot topic!

If you are interested in joining us, seats are still available – http://www.datalight.com/welcome/web-seminar-switching-to-emmc

Thom Denholm | December 5, 2011 | Datalight Products, Flash Industry Info

Advances in Nonvolatile Memory Interfaces Keep Pace with the Data Volume

This article, entitled Advances in Nonvolatile Memory Interfaces Keep Pace with the Data Volume and recently published in RTC Magazine, gives a nice overview of maintaining performance on newer technologies.

 

Learn more about Datalight and ClearNAND

Michele Pike | November 22, 2011 | Flash Memory, Flash Memory Manager, Performance

Announcing Reliance Nitro 2.5

Today we are excited to announce Reliance Nitro 2.5. This release focused on the requirements of “Converged Mobile Devices,” or multi-function mobile device types. Reliance Nitro 2.5 continues to be a competitive advantage for any device storing critical data, and now also improves responsiveness, security and control of the device. For more information visit our Reliance Nitro 2.5 landing page or view our press releases:
Datalight Releases File Level Secure Delete on eMMC 4.4x
and
Datalight Improves User Experience with Low-Latency Reliable File System

Read more about our exciting new release

 

Michele Pike | November 10, 2011 | Uncategorized