The new chief executive of Research in Motion Ltd., Thorsten Heins, mentioned recently that 80 to 90 percent of all BlackBerry users in the U.S. are still using older devices, rather than the latest BlackBerry 7.
Longevity of a consumer device is something that we at Datalight know belongs firmly in the hands of the product designer, rather than being limited by the shortened lifespan of incorrectly programmed NAND flash media. Both Datalight’s FlashFX Tera and Reliance Nitro incorporate algorithms which reduce the Write Amplification on all Flash media. These methods are especially important on e-MMC, which is at its heart NAND flash. In addition, the static and dynamic wear leveling in FlashFX Tera provides even wearing of all flash for maximum achievable lifetime.
A shorter lifetime may be acceptable for some consumer devices, such as low-end cell phones. However, consumers expect newer converged mobile devices that command a higher price, such as tablets, to last much longer. The primary user may replace these devices fairly often, but since they are viewed as mini-computers and are therefore less “disposable,” they will likely be handed down to younger users rather than being discarded or recycled. Consumers will protest if they discover their $500 tablet only has a lifespan of 3 years, and they will be even more upset if, due to flash densities and write amplification, the next version they purchase has an even shorter lifespan.
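A rough back-of-the-envelope model shows how write amplification eats into device lifespan. The sketch below assumes ideal wear leveling, and every figure passed to it (capacity, rated program/erase cycles, daily host writes) is illustrative rather than taken from any particular part.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_gb_per_day, write_amplification):
    """Rough flash lifetime estimate, assuming perfectly even wear leveling."""
    if host_gb_per_day <= 0:
        raise ValueError("host_gb_per_day must be positive")
    if write_amplification < 1:
        raise ValueError("write amplification cannot be below 1")
    # Total data the flash can absorb before its cells reach their rated cycles.
    total_flash_writes_gb = capacity_gb * pe_cycles
    # Each gigabyte the host writes costs this many gigabytes of flash writes.
    flash_gb_per_day = host_gb_per_day * write_amplification
    return total_flash_writes_gb / flash_gb_per_day / 365.0
```

With illustrative inputs of 16 GB, 3,000 cycles, 10 GB of host writes per day and a write amplification of 4, the estimate comes to roughly 3.3 years. Halving the write amplification doubles it.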
How will flash longevity affect your new embedded design?
I for one am happy to see people talking about file system features, especially Data Integrity, knowledge of Flash Media, and faster access through B+ trees. Of course, Datalight’s own Reliance Nitro file system has had all this and more for some time now…
Microsoft has a new term for a thing we’ve seen often in the case of unexpected power loss – a “Torn Write”. They point this out as a specific problem for their journalling file system, NTFS, but updating any file system metadata in place can be problematic. It looks to me like this new file system, ReFS, handles this by bundling the metadata writes with other metadata writes or with the file data. If the former, this demonstrates the trade-off between Reliability and Performance that we are very familiar with at Datalight. Bundling smaller writes will help with spinning media and flash. In time we will see how much control the application developer has over this configuration – another important point for our customers.
One of the commenters posted that error correction belongs at the block device layer, and I tend to agree. Microsoft’s design goal “to detect and correct corruption” is a noble one, but how would they detect corruption for user data? Additional file checksums and ECC algorithms would be intrusive and potentially time consuming. Keeping a watch on vital file system structures is important, of course, and a good backup in case block level error detection fails.
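To make the cost of file-level checksums concrete, here is a minimal sketch of per-block checksumming over user data. The 4 KB block size and both function names are assumptions for illustration. Note that a CRC only detects corruption; correcting it would need redundancy on top of this.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def block_checksums(data, block_size=BLOCK_SIZE):
    """Return one CRC-32 per block of data."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def find_corrupt_blocks(data, expected, block_size=BLOCK_SIZE):
    """Return the indexes of blocks whose CRC no longer matches the stored one."""
    actual = block_checksums(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

Every read pays for a pass over the whole block, and every write has to update a stored checksum, which is the overhead the paragraph above is worried about.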
I look forward to reading more from Microsoft’s file system team in the future, and especially hope to see a roadmap for when these important changes will make it down to the embedded space.
We here at Datalight are seeing a lot of interest in this week’s “Software Perspective on eMMC” presentation, across a broad spectrum. This is apparently a pretty hot topic!
One consideration I would add to this quite excellent summary is about the availability of drivers. Raw NAND has been around for quite a while and the market supplies a large range of drivers. Many of these will utilize the basic functionality of SmartNAND and other EZ NAND chips with only small modifications. Drivers for eMMC, on the other hand, are much harder to find. Only Linux has a freely available driver, which Google’s Android has taken advantage of in recent releases.
At Datalight, we continue to be excited by both of these new technologies. From the JEDEC eMMC parts, the cool features such as Secure Delete and Replay Protected Memory Block are very exciting. On the other hand, the sheer performance of Toshiba’s SmartNAND and other EZ NAND solutions is very much in demand.
Just in case you’ve ever wondered what we mean when we say Reliance Nitro is a “transactional file system,” or how it differs from FAT or exFAT, have I got the video for you. Using a whiteboard and a seemingly endless supply of colored pens, I doodled my way through a 2-minute explanation of the technology behind our flagship product and why Reliance Nitro provides rock-solid reliability and lightning-fast mount times. Hope you enjoy it!
Flash Memory Summit has come of age with a conference agenda that has technical depth and an exhibit hall with professional quality booths and demos. This year SSD and PCI-e memory card solutions ruled, and Micron demonstrated a PCI-e memory card that had 128 individual Enhanced ClearNAND devices on it, 16 chips on each of 8 DIMMs. It totaled up to 2 terabytes of flash. It was very cool to see this kind of application demonstrated, and it reaffirmed our decision to support ClearNAND. In addition to demonstrating ClearNAND support in our booth, we debuted new file system technology to mitigate write amplification. There were a lot of tough questions asked, but the most common reaction was, “Really, you’re doing that many IOPS with small (4 KB) random writes? How are you doing it?” We saw several references to write amplification in presentations and on the show floor, so clearly this is an issue the industry is actively wrestling with.
One of our senior software engineers at the show was particularly impressed by what he termed the “unbelievable performance” of some of the flash hardware being demonstrated (well over a hundred thousand IOPS), including the aforementioned PCI-e ClearNAND solutions. He said, “In some of the panel discussions, they were talking about IOPS in the millions! With the individual flash memory cells getting slower as lithography shrinks, there is only one way this is being achieved: pipelining and parallelism.”
This year we hosted the Software panel, or so-called Lunatic Fringe. It was surprisingly well attended for a show focused on hardware, and it ran the gamut from Flash Memory on Linux to Databases to a file system for SD cards. For next year, I’d like to see a push towards an Embedded portion of the show, because there was plenty of interest in our embedded software solutions.
Finally, I was excited to see some of the Enterprise level solutions mirror what Datalight is doing at the embedded scale. Tier 1 caching, which is using a few SSDs to speed up the slower hard drives? That matches our Paired Storage performance enhancement. Treating Flash differently than a Hard Drive? We’ve been doing that since the beginning, and it’s only getting better in this next release.
To protect against unexpected power loss, so common in the embedded world, file writes need to be atomic.
Linux file systems ext3 and ext4 were designed for server or desktop environments. Google developer Tim Bray suggests that appropriate use of fsync() can mitigate the risk of data loss, but I am sure that’s not the best solution. The use of delayed allocation means that metadata is committed but the data is not. Alternatively, both can be committed to the journal at a performance penalty. Performance is crucial in both desktops and devices, but not at the expense of data corruption.
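For reference, the fsync() approach looks roughly like the sketch below. The helper name is hypothetical. It only asks the operating system to flush the file's data; whether the bytes actually reach the media still depends on the driver and the hardware honoring that flush.

```python
import os


def durable_append(path, payload):
    """Append payload and request that it reach the storage device before returning."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()             # drain Python's own userspace buffer
        os.fsync(f.fileno())  # ask the kernel to flush the file's data to the device
```

Every caller that cares about its data has to remember to do this, and each call stalls until the flush completes, which is where the performance penalty comes from.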
This problem is readily demonstrated when updating files, an action which usually happens “in place”. This is quite common in database and other important system files. When power is lost, data can be only partially overwritten, or metadata can be updated to point to a location where the new data has not yet been written. An alternative to liberal use of fsync() is a rename strategy: write the new data to a separate file, then rename it over the old file. Rename is atomic, at least.
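A minimal sketch of the rename strategy, assuming a POSIX-style file system where renaming over an existing file is atomic. The function name is hypothetical, and the temporary file is created in the same directory so the rename never crosses file systems.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace the file at path with data without ever updating it in place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new data should be on the media before the rename
        os.replace(tmp_path, path)  # readers see the old file or the new one, never a mix
    except BaseException:
        try:
            os.unlink(tmp_path)     # clean up the temporary file on failure
        except FileNotFoundError:
            pass
        raise
```

The cost is that the whole file is rewritten even for a small change, which hurts for large database files.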
The best solution, and one which does not require applications to change the way they do writes, is to perform all data writes atomically. In addition to that, the file system should never overwrite live data and always retain a “known good” state on the media. This way caching does not have to be removed – either user data changes get to the disk fully or not at all. No partial writes or incorrect metadata, and no mount-time journal rebuilds or disk checks either.
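The idea can be illustrated with a toy in-memory model. This is only a sketch of the general "never overwrite live data" principle, not how Reliance Nitro is implemented. All names are hypothetical, and reclaiming freed blocks is left out.

```python
class ToyTransactionalStore:
    """Toy model: writes land in fresh blocks; a transaction point switches state in one step."""

    def __init__(self):
        self._blocks = {}     # physical block number -> bytes
        self._next_free = 0   # next unused physical block
        self._committed = {}  # logical -> physical: the "known good" state
        self._working = {}    # mapping changes since the last transaction point

    def write(self, logical, data):
        physical = self._next_free          # always a fresh block,
        self._next_free += 1                # so live data is never overwritten
        self._blocks[physical] = bytes(data)
        self._working[logical] = physical

    def read(self, logical):
        physical = self._working.get(logical, self._committed.get(logical))
        return None if physical is None else self._blocks[physical]

    def transact(self):
        new_state = dict(self._committed)
        new_state.update(self._working)
        self._committed = new_state         # single switch to the new state
        self._working = {}

    def power_loss(self):
        self._working = {}                  # uncommitted changes vanish; known good state remains
```

After a simulated power loss, a read returns the contents as of the last transaction point: the update is either fully visible or not visible at all.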
Instead of adapting a desktop or server file system for embedded use, it is far better to use a file system designed specifically for embedded use.
The JEDEC eMMC 4.4 specification added two variations to the basic erase command for data security. These were:
Secure Erase – A command indicating a secure purge should be performed on an erase group. The specification states that this should be done not only to the data in this erase group but also on any copies of this data in separate erase groups. This command must be executed immediately, and the eMMC device will not return control to the host until all necessary erase groups are purged. One erase group is the minimum block of memory that can be erased on a particular NAND flash.
Secure Trim – Similar to Secure Erase, this command operates on write blocks instead of erase groups. To handle this properly, the specification breaks this into two steps. The first step marks blocks for secure purge, and this can be done to multiple sets of blocks before the second step is called. The second step is an erase with a separate bit flag sequence that performs all the requested secure trims.
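The two-step shape of Secure Trim can be modeled with a toy device. The class and method names below are hypothetical and stand in for the real command sequences, which are defined by the specification.

```python
class ToyEmmc:
    """Toy model of the two-step secure trim flow; not a real eMMC driver."""

    def __init__(self, num_blocks):
        self.blocks = [b"data"] * num_blocks
        self._marked = set()

    def secure_trim_mark(self, start, end):
        """Step 1: mark write blocks start..end (inclusive) for secure purge."""
        if not 0 <= start <= end < len(self.blocks):
            raise ValueError("block range out of bounds")
        self._marked.update(range(start, end + 1))

    def secure_trim_execute(self):
        """Step 2: purge every block marked so far and return how many were purged."""
        for index in self._marked:
            self.blocks[index] = None
        purged = len(self._marked)
        self._marked.clear()
        return purged
```

Step 1 can be called for several ranges in a row; nothing is purged until step 2 runs.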
This feature changes in the eMMC 4.5 specification, due out later this year, where neither of these commands will be functional. To properly handle this change and allow a board design to support multiple types of eMMC parts, the file system or driver will have to have built-in flexibility. The alternative, assuming both eMMC vendor drivers work in the design, is still a complete recoding phase and full software test cycle.
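One way to build in that flexibility is to choose the purge method at run time from whatever the attached part reports, rather than hard-coding one command set. The sketch below is only an illustration: the capability names and handlers are hypothetical placeholders, not real driver entry points.

```python
class PurgeDispatcher:
    """Pick a purge handler based on the capabilities a part reports at run time."""

    def __init__(self):
        self._handlers = []  # (capability name, handler), in order of preference

    def register(self, capability, handler):
        self._handlers.append((capability, handler))

    def purge(self, part_capabilities, start, end):
        for capability, handler in self._handlers:
            if capability in part_capabilities:
                return handler(start, end)
        raise NotImplementedError("part supports none of the registered purge methods")
```

Supporting a new part then means registering one more handler instead of recoding every caller, although the new path still needs testing.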
Do you need defrag? It mostly depends on your hardware and your use case. While defragmenting a file system can make the computer run faster, it’s not the only answer.
Fragmentation is usually caused when modifying a file. Overwriting the file or making it larger means storing a fragment of the file in a new place, unless the file system creates a complete new copy of the file. Databases are particularly susceptible here – they are usually large files and often updated in the middle.
Another way fragmentation happens is when the file system initially stores the file in pieces. This could happen if the file system is not configured to keep file blocks together, or if the media is fairly full and there are no spaces of sufficient size for the new file.
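A simple way to quantify fragmentation is to count the contiguous runs in a file's block list. The sketch below assumes the list of block numbers, in file order, is already available; how to obtain it depends on the file system.

```python
def count_fragments(block_numbers):
    """Count runs of consecutive block numbers in a file's allocation list."""
    if not block_numbers:
        return 0
    fragments = 1
    for previous, current in zip(block_numbers, block_numbers[1:]):
        if current != previous + 1:  # a jump on the media starts a new fragment
            fragments += 1
    return fragments
```

A file stored in blocks 10 through 13 is one fragment. If a later update in the middle lands elsewhere, giving a block list such as 10, 11, 40, 41, 12, the same file is now three fragments.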
What about the impact of fragmentation? In the days of rotating media, a fragmented file meant extra head movement and platter rotation to read the file. With flash media, the extra overhead is just additional block reads – a far smaller cost.
Avoiding fragmentation if you’re using Reliance Nitro can be as simple as customizing your transaction points. Instead of transacting on a timed basis, create a new transaction point only when the entire file is on the media, at “file close”. Similar settings may be available on other file systems.
If your use case causes fragmentation, a valid workaround might be to reformat the media after backing up the database files. A fresh file format is fairly quick on modern hardware, and can be coupled with a bad block test as well.
JEDEC, the Joint Electron Devices Engineering Council (see http://www.jedec.org), is a group of manufacturers and suppliers collaborating to create specifications for Flash memory access and parts. The current revision of their specification for Embedded MultiMedia Cards, eMMC, is 4.41, and is available on the website above.
I’m excited to be a part of Datalight and JEDEC, and am looking forward to the upcoming eMMC 4.5 and UFS 1.0 specifications. Datalight is not in the business of manufacturing hardware, of course, but our file system products like to work closely with the underlying driver. Until those products are fully eMMC 4.5 compliant, what can you expect?
The most fundamental thing a Reliance file system needs is a block device that writes data when it says it will. Any data left in a cache or not flushed upon command could be lost data in an unexpected power loss. The Enhanced Reliable Write feature means any eMMC 4.41 or 4.5 flash part will work perfectly with Reliance and Reliance Nitro.
The High Priority Interrupt feature of eMMC basically means that a block device write might pause, reporting back only a partial write. This is fully supported in the Reliance Nitro file system, which will then loop back and continue the write after the HPI is complete.
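The loop-back behavior follows the familiar pattern for handling short writes. The sketch below is not Reliance Nitro's code: device_write is a hypothetical callable that returns how many bytes the block device accepted, which may be fewer than requested if the write was interrupted.

```python
def write_fully(device_write, offset, data):
    """Keep calling device_write until every byte of data has been accepted."""
    view = memoryview(data)
    written = 0
    while written < len(data):
        accepted = device_write(offset + written, view[written:])
        if accepted <= 0:
            raise OSError("block device made no progress")
        written += accepted  # resume right after the last accepted byte
    return written
```

The guard against a zero-byte result keeps the loop from spinning forever if the device stops accepting data.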
The Trim feature of eMMC 4.41 is being replaced by a Discard feature in eMMC 4.5. The latter fits in more closely with the way Reliance interacts with our own FlashFX product.
Basic functionality (Read, Write, and Erase) is of course supported, and full compliance with eMMC 4.5 is on Datalight’s roadmap, so keep an eye out here for more news soon.