The USB Mass Storage class (also called UMS) is a protocol that allows a device connected through USB (Universal Serial Bus) to become accessible to a host computing device. This allows file transfers between the device and the desktop, as long as the file system used on the device is known to the desktop. One common example of this is the FAT file system. Other file systems can be supported through an external driver. An example of this is a device formatted with Reliance Nitro can be accessed through the Reliance Nitro Windows Driver (RNWD) on a Windows host.
To the host machine, the USB device appears similar to an external hard drive, enabling drag-and-drop file transfers. In fact the host has exclusive access to the media shared through this USB Mass Storage class. The connection happens below the file system layer – the physical media is revealed through this protocol. All transfers are done on a media block basis, and all metadata writes are separate from file data writes.
This can cause a few problems, especially when a transfer is interrupted. To truly prevent corruption, the host computer must release access to the device before it is disconnected. Often the software on the device cannot access the media while any transfer is in operation. One example of this is Android, which warns you that applications and data stored on the SD card will be inaccessible.
Another option which is gaining popularity is the Media Transfer Protocol, or MTP. This is an extension of the Picture Transfer Protocol which was originally developed for cameras. A host computer equipped with MTP will be able to communicate with the device at a layer above the file system with basic packet transfer commands. The unit of storage is a local file rather than a media block. The target device can use whatever file system is required for the device, and the device can access that file system simultaneous with the host computer with no fear of corruption.
In some ways this makes the Media Transfer Protocol rather like a transactional file system – either the entire file is written or nothing is. In most cases, the storage media is not affected by failed transfers, because the protocol on the device side handles this properly. This works especially well with the Reliance Nitro transactional file system. A customer implementing MTP on their device can use the Dynamic Transaction Points of Reliance Nitro to perform a manual transaction after each completed MTP transfer. Best of all, file data and metadata writes can be done at the same time, in an atomic fashion.
In summary, Datalight’s software works with both of these standard protocols. The improvements and advantages of Media Transfer Protocol make it the superior solution for embedded devices.
Thom Denholm | March 20, 2013 | Consumer Mobile, Consumer Other, Reliability |
Datalight’s Reliance Nitro and journaling file systems such as ext4 are designed to recover from unexpected power interruption. These kinds of “post mortem” recoveries typically consists of determining which files are in which states, and restoring them to the proper working state. Methods like these are fine for recovering from a power failure, but what about a media failure?
When a media block fails, it is either in the empty space, the user data, or the file system data. A block from the empty space can be detected on the next write, which will either cause failure at the application, or will be marked bad internally and the system will move on to another block. When a media block in the user space fails, it cannot be reliably read. Often, the media driver will detect and report an unreadable sector, resulting in an error status (and probably no data) to the user application. When a media block containing file system data or metadata fails, it is the responsibility of the file system to detect and (if possible) repair that damage. Often the best thing that can be done is to stop writing to the media immediately.
In some ways, blocks lost due to media corruption present a problem similar to recovering deleted files. If it is detected quickly enough, user analysis can be done on the cyclical journal file, and this might help determine the previous state of the file system metadata. Information about the previous state can then be used to create a replacement for that block, effectively restoring a file.
Metadata checksums have been added to several file system data blocks for ext4 in the 3.5 kernel release. Noticeably absent from this list are the indirect and double indirect point blocks, used to allocate trees of blocks for a very large file. The latest release of Datalight’s Reliance Nitro file system (version 3.0) adds CRCs to all file system metadata and internal blocks, allowing for rapid and thorough detection of media failures.
Optional within this new version of Reliance Nitro is using CRCs on user data blocks, for individual files or entire volumes. This failsafe can be configured to write protect the volume or halt system operations. Diagnostic messages are also available to indicate the specific logical block number of the corrupted block.
The combination of full CRC protection on every metadata block and optional protection of user file data blocks is one of the key attributes of this release of Reliance Nitro. Embedded system designers can detect more media failures in testing, and can diagnose failed units more quickly, leading to greater success in the marketplace.
Thom Denholm | January 26, 2013 | Flash File System, Flash Memory, Reliability |
Before embedded devices, file systems were designed to work in servers and desktops. Power loss was an infrequent occurrence, so little consideration was given to protecting the data. Frequent checks of the file system structures were important, and were often handled at system startup by a program such as chkdsk (for FAT) or fsck (for Linux file systems). In each case, the OS could also request a run of these utilities when an inconsistency is detected, or when the power was interrupted.
The method behind these tools is a check of the entire disk – reading each block to determine if it is allocated for use, then cross checking with an allocated list located elsewhere on the media. FAT file systems have little other protection, and can only flag sections of the media without matching metadata by creating a CHK file for later user analysis. Linux file systems add in a journal mechanism to detect which files are affected, and can often correct the damage without user intervention.
These utilities are necessary because these basic file systems are not atomic in nature – data and metadata are written separately. Datalight’s Reliance Nitro file system treats updates as a single operation, and thus the file system is never in a state where it would need to be corrected. Our Dynamic Transaction Point technology allows the user to customize just how atomic their design is, protecting not just a block of data and metadata but the whole file – half a JPG is pretty much useless, from a user perspective.
The repairs that fsck and chkdsk can perform are completely unnecessary with the Reliance Nitro file system. At the device design level, this results in quicker boot times for a system that is completely protected from power failure. A file system checker is of course provided, and is useful for detecting failures caused by media corruption.
Taking chkdsk and fsk to the next level of protection would be a tool to repair some media corruption. If a block of data on the media becomes only partially readable, this tool could read it multiple times (to try and collect the most data) and store the results in a newly allocated block, correcting the file system structures appropriately. User intervention would likely be required to understand if enough data was recovered to make this effort worthwhile. Stay tuned for more updates on this topic.
Thom Denholm | January 7, 2013 | Reliability |
The challenge – making ext4 just as reliable as Datalight’s Reliance Nitro file system, within limitations of the POSIX specification. Unlike most real world embedded designs, performance and media lifetime are not a consideration for this exercise.
1) Change the way programs are written. By changing the habits of coders (and underlying libraries), it is possible to gain some measure of reliability. I’m talking specifically about the write(), fsync(), close(), rename() combination discussed on some forums. This gets around the aggressive buffering of ext4, and gives some assurance that a file either exists or doesn’t. What this does not do is handle an update, which overwrites part of the file, unless an entire new copy of the file is written.
What this also fails to handle is a multithreaded environment. As each of these operations is not atomic, cohesiveness can be lost if power is interrupted while one thread is performing an fsync() while another is writing, or issuing a rename.
2) Journal the data as well as the metadata. For smaller files or overwrites, this solution will have all the data in the journal and available for playback when recovering from a power loss. While this worked well in ext3, this option was changed in ext4. The delayed allocation strategy gives an increase in performance, but can cause more loss of data than is expected. Applications which frequently rewrite many small files seem especially vulnerable.
Shortening the system’s writeback time, stored in shell variables such as dirty_expire_centisecs, can help mitigate this by reducing the amount of data which can be lost.
Some changes just aren’t possible, however. Writes are not atomic, as the metadata and data are written separately. The power can be interrupted between the two types of writes, no matter what is done at the application level. While these suggestions make ext4 more reliable, the cost in changes to user code, loss of performance and additional wear are not insubstantial. Far better to use a file system designed for unexpected power loss, where atomic writes allow the system designer to decide when to put data at risk.
Thom Denholm | October 31, 2012 | Reliability |
Embedded device designers have to come up with systems that handle variations in network traffic, resource constrained environments, and battery limitations. All that pales in comparison with challenges at system start and power down times.
In powering down a multi-threaded device, each application will want to get state information committed to the media. The operating system will have its own set of files or registries to update before the power goes off. At the low end, the MLC NAND flash or eMMC media requires power to be maintained while writing a block, in order to prevent corruption of that or other blocks on the media. Therefore a large group of block writes arrive at the media level simultaneously, all with the requirement to finish up before the power drops.
The only thing worse than having to perform several writes in a short time frame is when those writes display the worst performance characteristics of eMMC media. These block writes are often located in different locations, essentially defining Random I/O.
When restoring power, the same situation occurs in reverse, with each thread requesting state information. For some file systems, the interrupted writes will need to be validated, the file system checked, or the journal replayed. Here the time pressure is not with actual loss of power, but with frustrated users, who want the device to be on, now!
This is a very important use case to consider for designers of file systems and block device drivers. While shutdown consists of primarily writes, a multithreaded file system should not block reads. The same applies at startup time, where a write from one thread should not block the other read threads. Where possible, the block device driver should sequence the reads and writes to match the high performance characteristics of the media, allowing the most blocks to be accessed in the least amount of time.
We are currently working on an eMMC-specific driver that will manage the sequencing of writes. It promises to improve write performance by many times the rate of standard drivers. Check back with us for more on this topic next month!
If possible, applications should also be written to minimize the startup and shutdown impact. With more applications being written by outside developers, this is getting harder to control. The system designer must focus on what they can control – choice of hardware, device drivers and file system to mitigate these problems.
Thom Denholm | June 26, 2012 | Extended Flash Life, Flash Memory, Performance, Product Benefit, Reliability, Uncategorized |
There’s a lot of buzz on the MSDN blog site regarding their latest file system post. http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx - and plenty of insightful comments as well.
I for one am happy to see people talking about file system features, especially Data Integrity, knowledge of Flash Media, and faster access through B+ trees. Of course, Datalight’s own Reliance Nitro file system has had all this and more for some time now…
Microsoft has a new term for a thing we’ve seen often in the case of unexpected power loss – a “Torn Write”. They point this out as a specific problem for their journalling file system, NTFS, but updating any file system metadata in place can be problematic. It looks to me like this new file system, ReFS, handles this by bundling the metadata writes with other metadata writes or with the file data. If the former, this demonstrates the trade-off between Reliability and Performance that we are very familiar with at Datalight. Bundling smaller writes will help with spinning media and flash. In time we will see how much control the application developer has over this configuration – another important point for our customers.
One of the commenters posted that error correction belongs at the block device layer, and I tend to agree. Microsoft’s design goal “to detect and correct corruption” is a noble one, but how would they detect corruption for user data? Additional file checksums and ECC algorithms would be intrusive and potentially time consuming. Keeping a watch on vital file system structures is important, of course, and a good backup in case block level error detection fails.
I look forward to reading more from Microsoft’s file system team in the future, and especially hope to see a roadmap for when these important changes will make it down to the embedded space.
Thom Denholm | January 18, 2012 | Reliability, Uncategorized |
If you’ve noticed the numerous posts lately on the Datalight blog regarding JEDEC and eMMC, you might be wondering why we’re so excited about this particular standard. There are many features that this “smarter” memory will enable for OEMs; In this post I’ll focus on one of those features in the eMMC specification –secure delete.
Securely deleting information on flash memory is more complicated than it seems. For one thing, files are constantly being moved around to ensure even wear of the flash, resulting in multiple copies of file data on the media. Furthermore, when a file is marked for delete, it is typically not physically deleted, rather the space is only marked as available to be overwritten. Until that happens, the “deleted” data is still present and recoverable on the media. In fact, the University of California San Diego Non-volatile systems lab has produced an in depth study of file deletion on flash memory, where they found significant data still present on the media even after deleting the files. A copy of the report can be found at: http://cseweb.ucsd.edu/users/swanson/papers/Fast2011SecErase.pdf
In order to securely delete a file on raw flash, you must use a controller that will either track every block where the file has been stored, or will overwrite the space the file was stored in each time it is moved. The latter describes exactly the secure erase and secure trim features found in the eMMC 4.41 standard. This means that the hardware will finally be capable of securely deleting files –brilliant! There is just one problem: Who has software to support this functionality? As of this writing, there is no file system which supports the feature. While an application can make a call to the media to delete a file securely, the file system may have a backup copy stored somewhere. Fact is, the file system must support the secure delete capabilities of the hardware in order for these features to function correctly.
If an OEM wants to take advantage of the secure erase and secure trim features, their application will need to communicate with the eMMC driver, which may differ from part to part. As the only software company that is an active member of JEDEC, we are excited offer support for quite a few eMMC features. File system support for secure erase and secure trim will be coming later this summer!
Michele Pike | June 29, 2011 | Flash File System, Reliability |
To protect against unexpected power loss, so common in the embedded world, file writes need to be atomic.
Linux file systems ext3 and ext4 were designed for server or desktop environments. Google developer Tim Bray suggests that appropriate use of fsync() can mitigate the risk of data loss, but I am sure that’s not the best solution. The use of delayed allocation means that metadata is committed but the data is not. Alternatively, both can be committed to the journal at a performance penalty. Performance is crucial in both desktops and devices, but not at the expense of data corruption.
This problem is readily demonstrated when updating files, an action which usually happens “in place”. This is quite common in database and other important system files. When power is lost, data can be overwritten only partially, or else metadata can be altered to point to where the data will be updated but has not been. Another alternative to liberal use of fsync() is a rename strategy, that is, write only new data, then rename and replace the old file. Rename is atomic, at least.
The best solution, and one which does not require applications to change the way they do writes, is to perform all data writes atomically. In addition to that, the file system should never overwrite live data and always retain a “known good” state on the media. This way caching does not have to be removed – either user data changes get to the disk fully or not at all. No partial writes or incorrect metadata, and no mount-time journal rebuilds or disk checks either.
Instead of adapting a desktop or server file system for embedded use, it is far better to use a file system designed specifically for embedded use.
View whitepaper: Breakthrough Performance with Tree-based File Systems
Thom Denholm | June 8, 2011 | Performance, Reliability |
The JEDEC eMMC 4.4 specification added two variations to the basic erase command for data security. These were:
Secure Erase – A command indicating a secure purge should be performed on an erase group. The specification states that this should be done not only to the data in this erase group but also on any copies of this data in separate erase groups. This command must be executed immediately, and the eMMC device will not return control to the host until all necessary erase groups are purged. One erase group is the minimum block of memory that can be erased on a particular NAND flash.
Secure Trim – Similar to Secure Erase, this command operates on write blocks instead of erase groups. To handle this properly, the specification breaks this into two steps. The first step marks blocks for secure purge, and this can be done to multiple sets of blocks before the second step is called. The second step is an erase with a separate bit flag sequence that performs all the requested secure trims.
This feature was changed in the eMMC 4.5 specification, due out later this year, and neither of these commands will be functional. To properly handle this change and allow a board design to support multiple types of eMMC parts, the file system or driver will have have a built in flexibility. The alternative, assuming both eMMC vendor drivers work in the design, is still a complete recoding phase and full software test cycle.
Thom Denholm | March 25, 2011 | Reliability |
Earlier this month (June 4th) SpaceX sent the Falcon 9 rocket on its maiden voyage from Cape Canaveral Air Force Station with a successful orbital insertion. Falcon 9 is a reusable spacecraft which will be used to resupply the International Space Station under the Commercial Orbital Transportation Services (COTS) program. We are delighted that SpaceX selected Datalight Reliance and FlashFX Pro to protect mission-critical data from the rigors of space travel, such as shock, vibration, temperature extremes and radiation. Emily Shanklin,Director, Marketing and Communications for SpaceX had this to say: “Datalight software enables reliable flash-based embedded computing for SpaceX’s upcoming Falcon 9 and Dragon spacecraft missions in the data-hostile conditions of space.”
Michele Pike | June 23, 2010 | Flash File System, Flash Memory Manager, Military/Aerospace, Reliability |