Author Topic: SSD 'Friendly' OS/Filesystem  (Read 2476 times)


Offline Floid

  • Hero Member
  • Join Date: Feb 2003
  • Posts: 918
Re: SSD 'Friendly' OS/Filesystem
« on: January 24, 2009, 10:53:50 PM »
Most *NIX systems can be tuned for no or minimal swap, although of course you will have problems when you run out of memory.

Intel was pushing the "turbo memory" flash caches briefly until it turned out they offered no real advantage; if nothing else, that episode suggests that *certain* modern devices can take the wear.

I have yet to really run a flash device to exhaustion (aside from frying them outright in defective hardware :)), but I've determined the following:

Because FAT is the 'worst' type of filesystem for Flash (the FAT is in a fixed location and constantly rewritten), nearly all USB sticks include wear leveling in hardware.

FAT is also the most popular filesystem on CompactFlash, but for some reason I hear more people complaining about failed CompactFlash devices.  CF has been around longer, so I'm sure the controllers built into old (e.g. 4MB) cards had no wear leveling, or only primitive wear leveling, but I'm not sure what the deal with failure rates on newer cards could be.  It could be that certain controllers on certain cards skimp on this feature under the assumption that they will just be used in cameras for "unimportant" data, but in my experience, it seems more likely that CF is simply more electrically vulnerable -- something about the interface seems to increase the chances of frying the controller, or at least certain lines into the controller on the card.  Heck, it could also be that the wear leveling is there, but some cards use lower-quality flash dies that withstand fewer writes because they're "just for photos."

Perhaps even the cheapest USB devices are able to offer better isolation against ESD or faulty power.

I know very little about SecureDigital, but even with the exposed contacts, I've heard very few complaints about failures and haven't experienced any yet myself.  Though cameras that can be turned off in the middle of writing out a file from their buffers have been responsible for much unpleasant "data corruption" in my use!  No idea about wear leveling or that form of longevity.

Netbooks are a sort of special case, because they use some relatively new and unusual devices, and the hardware is somewhat diverse.  If I understand correctly (don't trust me, I haven't been paying much attention), the early Eees paired some Flash on the board with some common Flash-to-PCI or Flash-to-IDE chipset.  Most Atom machines are based around an Intel reference design that either just exposes a PCI(!?) interface to plug in a "SSD" card, or -- as is probably actually the case -- repurposes the old Turbo Memory interface (which, again, would look like some sort of PCI hardware), and the reviews I've been reading have got it wrong.  In all these cases, you will want to find out what the chips between you and the flash are actually doing before wasting a lot of time trying to outsmart them.

When it comes to those devices, and also the big, $1,000+ SSDs for "enterprisey" applications, it has been observed that performance goes down as the disk fills up and the wear-leveling algorithms have to "work harder" to find where they put your data on a read or decide where to put your write.  Thus, while a conventional defrag will only make things much worse (and cause a lot of useless wear), and trying to low-level format with "conventional" tools (the same utility you used in 1985?) will only cause problems, it's likely these things will benefit from an occasional complete wipe so they can go back to laying down data nearly linearly (subject to the wear map, etc, but not all the other variables that need tracking after years of use).

Intel, SanDisk, etc are putting a lot of work into making that less necessary under "normal" use patterns (which probably assume NTFS, HFS+, and EXT3 these days), but I assume the occasional (every N years, or when you start to notice performance problems) backup followed by issuance of whatever magical ATA command (vendor-specific?) can zero the disk and inform it that no sectors on it are important will bring some of the magic back on an "aging" drive.  
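The "magical ATA command" is most likely what's standardized as ATA Secure Erase.  As a sketch of the usual sequence with Linux's hdparm (the device name is a placeholder, and the destructive steps are deliberately left commented out -- don't run them against a disk you care about):

```shell
# Sketch only: /dev/sdX is a placeholder, not a real device, and the
# erase commands are commented out because they destroy all data.
DEV=/dev/sdX

# 1. Check that the drive supports the ATA Security feature set:
#      hdparm -I "$DEV"        # look for a "Security:" section
# 2. Set a temporary password (the spec requires one before an erase):
#      hdparm --user-master u --security-set-pass p "$DEV"
# 3. Issue the secure erase, which tells the drive every sector is free:
#      hdparm --user-master u --security-erase p "$DEV"

echo "secure-erase sequence prepared for $DEV"
```

On a SSD, step 3 is what lets the controller reset its wear/translation map and go back to laying data down nearly linearly.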

At the same time, you'll want to be monitoring the device via SMART (if it supports it) to determine if it "knows" it has taken too many writes and is about to fail.  But per the above, they've been working on getting the longevity of the flash cells up to the "years of constant rewrites" figure, so don't panic too much.  (I've heard the MTBF on the Intel product is claimed to be 5 years under data-center loads; individual users probably make fewer writes even when swapping.  Linus Torvalds bought in, so see when he starts complaining.)
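For the monitoring side, smartmontools' smartctl will dump the vendor attributes (`smartctl -A /dev/sda`); on Intel's drives, attribute 233 (Media_Wearout_Indicator) counts down from 100 as the cells wear.  A sketch of pulling that one value out -- the sample line below is made up for illustration:

```shell
# Hypothetical sample line from `smartctl -A /dev/sda` output;
# the numbers are invented for the example.
sample='233 Media_Wearout_Indicator 0x0032 097 097 000 Old_age Always - 0'

# Pick out the normalized value for the wearout attribute:
echo "$sample" | awk '$2 == "Media_Wearout_Indicator" { print "wearout remaining:", $4 }'
```

Anything drifting toward the threshold column is your cue to start backing up.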

...

Right now, titanium-dioxide "memristors" seem like the tech that will give us extraordinarily cheap solid-state storage without the "wear" concerns (and the complicated mechanisms for avoiding same).  They're still a few years away, though, and might be more vulnerable to developing bit errors when being read -- but that's an easier problem to solve than wear leveling techniques (and the associated multiple levels of indirection and thus 'fragmentation') in Flash. :)
 

Re: SSD 'Friendly' OS/Filesystem
« Reply #1 on: January 27, 2009, 09:51:25 PM »
Quote

Lorraine wrote:
Okay that last post went a bit over my head. But I think I got the gist of it! :rtfm:

I have found a page with some Linux SSD info:

-- Link

Quote
Never choose to use a journaling file system on the SSD partitions


Wait wouldn't this mean most modern filesystems? Surely not.


Hooboy, I write a "detailed" post and completely forget this issue!

Sadly, lorddef's response is also a bit "bollocks."

Journals were invented to improve filesystem integrity while allowing writes to occur with reduced/minimal blocking (that is, letting them commit asynchronously) on spinning disks.  Well, even that's an oversimplification; journaling really found its niche in systems that had to be able to check and return to integrity rapidly -- there are many related approaches, like the FFS "softupdates" approach roughly equivalent to ext3's data=ordered, which don't use a "journal" but order the data directly into its position in the FS.

A filesystem like ext3 has the following options for its journal:  (credit this quote to whoever wrote Ubuntu's mount(8) manpage)
Quote
data=journal / data=ordered / data=writeback
    Specifies the journalling mode for file data.  Metadata is always journaled.  To use modes other than ordered on the root file system, pass the mode to the kernel as boot parameter, e.g. rootflags=data=journal.

    journal
        All data is committed into the journal prior to being written into the main file system.

    ordered
        This is the default mode.  All data is forced directly out to the main file system prior to its metadata being committed to the journal.

    writeback
        Data ordering is not preserved - data may be written into the main file system after its metadata has been committed to the journal.  This is rumoured to be the highest-throughput option.  It guarantees internal file system integrity, however it can allow old data to appear in files after a crash and journal recovery.
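Concretely, the mode is chosen per mount, so a hypothetical /etc/fstab might look like this (device names are placeholders; the root filesystem needs the rootflags= boot parameter instead, as the manpage says):

```
# /etc/fstab -- illustrative only, /dev/sda* are placeholders
/dev/sda3  /var   ext3  data=writeback   0 2   # throughput over post-crash data guarantees
/dev/sda4  /home  ext3  data=journal     0 2   # everything goes through the journal first
```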


However, you also have the option to mount the file system as sync, that is, fully synchronized, at the expense of actually having to wait for all writes to complete.  A disk mounted sync should have as much integrity as a journaled one (though thanks to races when power is cut off, and hardware that 'lies' before performing its own write caching, it's still possible to corrupt a filesystem if you're "lucky" enough either way).

...

For people who wanted a full sync level of integrity, even the type of journal ext3 offers as data=journal had a performance advantage on a spinning disk -- rewriting all data into one linear journal [and then moving that out into the actual FS asynchronously] had less seek expense than constantly moving the disk heads to the next available sectors.  

To some extent, this just amounts to rewriting the same data twice on a SSD.  Flash is still nonlinear (data is blanked and rewritten in certain block sizes) so there might be some mild advantage in performance ... but you do want to avoid *constantly* rewriting a journal as may be permitted on a spinny disk.  Or in other words, if you're rewriting the journal more often than a FAT system rewrites the FAT, you should expect the SSD to fail faster than it would have with FAT.  :-D

Depending on your FS, it might be possible to find the tunables (the infamous-for-being-dangerous if-used-improperly with-mechanical-disks Linux "laptop-mode" might be a shortcut) to change the commit frequency to 30 seconds or more.  This would still be mildly 'abusive,' but probably well within the wear-leveling algorithms of hardware meant to put up with FAT or journaled NTFS.  Remember, if journaling, you need to make sure the actual journal isn't rewritten more often than that, as opposed to any tunables for guarantees to the rest of the filesystem from the journal.  (I'm using ext3 as the common example, but I've never used it on a SSD or felt a need to tinker with any tunables at that level, so YMMV.)
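On ext3 specifically, that tunable is the commit= mount option (the default interval is 5 seconds); assuming a non-root data partition, stretching it would look something like:

```
# Illustrative fstab line: sync metadata and data every 60 seconds
# instead of every 5 (device name is a placeholder)
/dev/sda4  /home  ext3  data=ordered,commit=60  0 2
```

The tradeoff is exactly what you'd guess: up to a minute of recent writes at risk in exchange for far fewer journal rewrites.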

...

On a laptop, you probably don't want to lose *any* data, including what you were working on in the last 30 seconds before power goes away (battery died?  dropped it?  battery's dead/not present and the power plug falls out?), so if you can eat the performance degradation, it can make sense to mount /home or the equivalent sync.  (Alternatively, I've toyed with the idea of creating a separate /sync and creating some symlinks so I could save anything "worky," like a document in progress, to ~/sync.)

Unfortunately, this also makes the "boring," easily reproducible writes -- like installing software from a package -- take longer, so you probably want /usr to be "less than sync," be it journaled or through whatever other mechanism can keep you from blocking before writes commit.  But you also don't want it to get completely corrupted -- you only want to have to force-reinstall that last package you were playing with, not your whole OS -- so use a mechanism with better guarantees than fully async!

It can get a little ridiculous to make this fine-grained, but for similar reasons, you usually want /boot to be something like a sync mount -- so when you see your new kernel finish copying over, you can be pretty sure that it's made it there -- and /etc is another candidate, because you don't want to lose that last edit to a configuration file if you bump the reset switch rather than shutting down properly.  (Of course, nearly everyone, myself included, will just live with a distribution's defaults for / there on less-than-critical systems.)
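Put together, the fine-grained version of that policy might look like this in /etc/fstab (device names and the exact split are placeholders -- a sketch, not a recommendation):

```
# Illustrative per-mount policy; /dev/sda* are placeholders
/dev/sda1  /boot  ext2  sync,noatime           0 2   # small, rarely written: land writes immediately
/dev/sda5  /usr   ext3  data=ordered,noatime   0 2   # reproducible contents: journaled is enough
/dev/sda6  /home  ext3  sync,noatime           0 2   # irreplaceable data: eat the performance hit
```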

On top of all this, note that the presence of a journal does make fsck faster because, if the journal is present, fsck trusts it [and may not inspect the rest of the real FS unless you ask]... if you're mounting sync without a journal, a fsck might take longer because it has no such "this is what was going on right before the power was cut" cheat-sheet, but the FS will be in an equally consistent state.

Confused yet?

---
Edit: And don't forget noatime to avoid the performance degradation and stress on the wear-leveling features from making lots of little atime writes everywhere.  Unless you need atime.

Edit #2: Whups, just caught that I'd said "journal=ordered" (nonsense) when I meant "data=journal".  I was trying to refer to the case where *all* data -- both metadata ("these blocks are allocated") and data ("this is what's in each block" -- the full content of the write) is journaled.