
Topic: Amiga history: Why was "Disk doctor" so spectacularly bad at its job? Here is why...


Offline olsen

If you have been around long enough, you may remember the "standard" disk repair utility which shipped with Kickstart/Workbench versions 1.x and 2.0 and was dropped when Workbench 2.1 was introduced.

Whenever a floppy disk error came up, as they did in those early days, you would see an AmigaDOS requester window urging you to repair your floppy disk with the "Disk doctor" command. Many of us tried, because there was practically no alternative, and very few succeeded. Because it had such a poor track record, it became a running joke. You had to be really desperate to use "Disk doctor", because like it or not, "Disk doctor" mostly left the disk in a poorer state than it was before, but sometimes it succeeded in "rescuing" data.

So, why exactly did it work so poorly? I did some research this week, because I was curious.

The purpose of "Disk doctor" was to return a damaged volume to an operational state after the disk validator or the Amiga ROM file system had failed to achieve this. Specifically, "Disk doctor" was supposed to make the disk validator work again by repairing the on-disk data structures, even reconstructing the first block and the root directory from scratch if necessary.

To this end, it scanned the entire disk, figuring out which disk block contents were still sound and whether there were any read errors. This was the first step at which things could go very wrong. If "Disk doctor" found that the first block of the disk, or the root block, was unreadable, it would first attempt to rewrite its contents and then, as a last resort, reformat the block (this could correct "hard errors"). However, it would not format just that single block; it would format the entire track. So you lost not just the contents of one block, you lost 11 blocks of data. Because "Disk doctor" overwrote those blocks with the contents of whatever was last in its scan buffer, random data was repeated all over those 11 trashed blocks.
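
To make the collateral damage concrete, here is a minimal Python sketch. It assumes the standard double-density floppy geometry (11 blocks per track, 1760 blocks in total, root block in the middle at 880). The function name is made up for illustration and is not taken from "Disk doctor".

```python
BLOCKS_PER_TRACK = 11            # double-density Amiga floppy: 11 sectors per track
TOTAL_BLOCKS = 1760              # 80 cylinders * 2 heads * 11 sectors
ROOT_BLOCK = TOTAL_BLOCKS // 2   # 880, the middle of the disk


def blocks_lost_by_track_format(block):
    """Return every block number sharing a track with `block`.

    Formatting works on whole tracks, so all of these are overwritten
    when a single block is "repaired" by reformatting.
    """
    if not 0 <= block < TOTAL_BLOCKS:
        raise ValueError("block number out of range")
    first = (block // BLOCKS_PER_TRACK) * BLOCKS_PER_TRACK
    return list(range(first, first + BLOCKS_PER_TRACK))


print(blocks_lost_by_track_format(ROOT_BLOCK))
```

Under these assumptions, reformatting the root block takes blocks 880 through 890 with it.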

And it only got worse from there. By knocking out those blocks, "Disk doctor" could damage file contents, would later detect that corruption and subsequently delete those files. Deletion was incomplete in that the data structures which allow for directory scanning would be damaged, and "Disk doctor" was unaware of the need to repair them. This in turn rendered the disk validator inoperable.

Mind you, pre-existing damage to file contents had the same knock-on effect: "Disk doctor" detected the file damage, but did not know that it had to repair the directory data structures. Once a directory was corrupted, "Disk doctor" would never repair it.

"Disk doctor" version 1.3.3 (and presumably older versions, too) suffered from a bug in the initial scanning process. It failed to scan the entire portion of the disk which was used by the file system, omitting the final two blocks. If there was a file or directory header stored there, or a file data block, then "Disk doctor" would assume that these directory entries did not exist, or that the files associated with the "missing" data blocks were corrupt.

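The effect of a scan that stops short can be shown with a toy model in Python. This is a hypothetical sketch: how the real loop bounds were computed is not described here, so the `skipped_at_end` parameter only models the outcome, and the header location is invented for the example.

```python
def scan_blocks(total_blocks, skipped_at_end=0):
    """Return the block numbers a scan visits, optionally stopping early."""
    return range(0, total_blocks - skipped_at_end)


TOTAL_BLOCKS = 1760                  # double-density floppy
file_header_at = TOTAL_BLOCKS - 1    # hypothetical: a file header in the last block

correct_scan = scan_blocks(TOTAL_BLOCKS)
short_scan = scan_blocks(TOTAL_BLOCKS, skipped_at_end=2)

print(file_header_at in correct_scan)  # True
print(file_header_at in short_scan)    # False: the header is never seen
```
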
As part of its recovery operation "Disk doctor" identified files and directories whose parent directories no longer had valid parent directories themselves. These were considered "orphaned" directory entries, because they were no longer reachable through the existing directory structures. "Disk doctor" would add these orphans to the root directory, making them accessible again. This may sound reasonable, but there was a twist: if several orphaned files shared the same name, they would all show up in the root directory. If you tried, as the "Disk doctor" documentation recommended, to copy the contents of the "corrected" disk to a separate disk, only one of those orphans which shared the same name would be copied. Trying to delete those files with the same name on the corrected disk would most likely lead to all of them getting deleted.
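
A small Python sketch of the name clash, using made-up file names and block numbers. It models the copy as "a later file with the same name replaces the earlier one"; which of the duplicates actually survived a real copy is not something this sketch claims to know.

```python
# Hypothetical orphans found on the damaged disk: (file name, header block).
orphans = [("readme", 900), ("readme", 1200), ("notes", 1300)]

# The "corrected" root directory simply lists every orphan, duplicates included.
root_directory = list(orphans)

# Copying by name to another disk: files are addressed by name, so two
# entries called "readme" end up as a single file on the destination.
destination = {}
for name, header_block in root_directory:
    destination[name.lower()] = header_block  # AmigaDOS names are case-insensitive

print(len(root_directory))  # 3
print(len(destination))     # 2
```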

That was, of course, not the end of it. "Disk doctor" had trouble recovering files which were longer than 72 blocks of data (on a floppy disk, that would be 36865 bytes or larger). Due to how the Amiga ROM file system works, a file covering more than 72 blocks of data (assuming a block size of 512 bytes) needs to record where the other blocks are stored in a separate data structure called the "extension block list".
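
The 72-block threshold can be worked through in a few lines of Python. The first figure counts the full 512 bytes of every block as file data, as the text does. The second is an added assumption: data blocks of the original file system carry a 24-byte header and hold only 488 bytes of file data, which would lower the threshold for those disks.

```python
BLOCK_SIZE = 512
POINTERS_PER_HEADER = 72  # data block pointers held directly in the file header


def smallest_file_needing_extension(payload_per_block):
    """Smallest file size in bytes that needs more than 72 data blocks."""
    return POINTERS_PER_HEADER * payload_per_block + 1


# Counting the full 512 bytes of each block as file data:
print(smallest_file_needing_extension(BLOCK_SIZE))       # 36865

# Assumption: 24 bytes of each data block are header, leaving 488 for data:
print(smallest_file_needing_extension(BLOCK_SIZE - 24))  # 35137
```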

"Disk doctor" ran into trouble with the extension blocks because of a mix-up in the disk read routines. One version read blocks, verified that their contents were sound and that the block was of a certain type. The other version did not insist on checking all of that, and it did not have to. The problem was that "Disk doctor" used the version which insisted that the block had to be of a certain type, which happened not to match the extension block type. Hence, "Disk doctor" considered all files larger than 72 blocks to be corrupt and would schedule them for deletion. Which had further knock-on effects, corrupting the directory structures.

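The mix-up between the two read routines can be sketched in Python as follows. The block type values (2 for header blocks, 16 for extension blocks) and the checksum rule (all longwords of a block sum to zero, checksum stored in longword 5) follow the documented Amiga file system layout. Which type the strict routine actually expected is not stated above, so `expected_type=T_HEADER` is an assumption, and all function names are invented.

```python
import struct

BLOCK_SIZE = 512
T_HEADER = 2    # primary type of file and directory header blocks
T_LIST = 16     # primary type of file extension blocks


def make_block(block_type):
    """Build a block of the given type with a valid checksum (longword 5)."""
    longs = [0] * (BLOCK_SIZE // 4)
    longs[0] = block_type
    longs[5] = (-sum(longs)) & 0xFFFFFFFF   # all longwords must sum to zero
    return struct.pack(">128L", *longs)


def checksum_ok(block):
    return sum(struct.unpack(">128L", block)) & 0xFFFFFFFF == 0


def read_block_lenient(block):
    """Accept any block whose checksum is sound."""
    return checksum_ok(block)


def read_block_strict(block, expected_type=T_HEADER):
    """Additionally insist on one specific block type (assumed here)."""
    block_type = struct.unpack(">L", block[:4])[0]
    return checksum_ok(block) and block_type == expected_type


extension_block = make_block(T_LIST)
print(read_block_lenient(extension_block))  # True
print(read_block_strict(extension_block))   # False: a sound block gets rejected
```
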
This still is not the end of the tale. "Disk doctor" used an incredible amount of memory for its internal bookkeeping. You might have been able to "correct" a floppy disk with its 1760 blocks on your 512 KByte Amiga (which would require 140 KBytes of RAM), but with a 20 MByte hard disk partition you would need at least 1 MByte of RAM. In the early days, that was a lot to ask for.

"Disk doctor" reached the end of its usefulness (for some values of "useful") when it turned out to be too hard to adapt it for use with the Fast Filesystem (FFS) in 1988. Commodore struggled to find a proper fix for its limitations, but the basic problem was that so much of "Disk doctor"'s code was restricted to the original Amiga ROM file system layout and operations.

What directories and data blocks look like is notably different for the FFS, but confusion prevailed. The documentation for the last "Disk doctor" version 1.3.5 manages to both discourage and encourage you to use it with FFS volumes at the same time. As it turned out, if you were unlucky enough to try, "Disk doctor" could render perfectly workable FFS format directory structures unsuitable for use with the FFS. If you were spectacularly unlucky (the first disk block was unreadable), "Disk doctor" would simply assume that your volume was actually in standard Amiga ROM file system layout and then make it so: when rewriting the first disk block, it always assumed that the disk used the original Amiga ROM file system layout.
« Last Edit: September 09, 2017, 08:42:53 AM by olsen »
 

Offline BozzerBigD

It was before my time but DiskSalv does perform admirably for FFS partitions and was programmed by 'hardware' Amiga guru Dave Haynie!!!
"Art challenges technology. Technology inspires the art."

John Lasseter, Pixar Animation Studios
 

Offline Matt_H

@ olsen

Wonderful analysis, thank you!

@ thread

Dave Haynie's notes about Disk Doctor (from the DiskSalv documentation) also merit repeating here. The story he recounts is that the developers were unsure of whether to keep Disk Doctor, so they put its source code on a disk and ran Disk Doctor on it. Needless to say, it did not survive. :)

We may never know if that's 100% true, but it's a great story.
 

Offline Pentad

I remember using Disk Doctor back in the day and it was near useless. The jokes about Disk Doctor and malpractice seemed to be everywhere. The only time it sort of worked for me was on a copied game that somehow developed errors. I ran Disk Doctor on it and it actually fixed the disk...or so I thought.

When I booted the game it actually worked but the sprites were all messed up. I was amazed that Disk Doctor somehow repaired enough of the disk to make the game run but managed to corrupt the sprite data.

After a year or so on AmigaOS 1.2, the release of AmigaOS 1.3 was amazing. With the advent of the FFS, we saw faster boot times, less corruption, and the whole system was a lot more stable.  FFS was written in Assembler so speed jumped like 10x over the OFS.  What a time to be alive.  :-)

-P
2015 15" Macbook Pro Retina * 2.8 GHz QCore * 16 GB RAM, 1TB SSD * Windows 10 via Boot Camp * Amiga via Emulation (WinUAE in WINE Staging)
 

Offline olsen

Quote from: BozzerBigD;830475
It was before my time but DiskSalv does perform admirably for FFS partitions and was programmed by 'hardware' Amiga guru Dave Haynie!!!


DiskSalv 1.0 was released in May 1986, but "Disk doctor" certainly is older ;)

No change history is recorded for "Disk doctor", so my best guess is that it may have been written or adapted from earlier code by Dr. Tim King in 1984/1985.
 

Offline olsen

Quote from: Matt_H;830476
Dave Haynie's notes about Disk Doctor (from the DiskSalv documentation) also merit repeating here. The story he recounts is that the developers were unsure of whether to keep Disk Doctor, so they put its source code on a disk and ran Disk Doctor on it. Needless to say, it did not survive. :)

We may never know if that's 100% true, but it's a great story.


This story was one of the reasons why I looked into how "Disk doctor" worked. Dave Haynie acknowledges that the story came from a secondary source ("The story I was told goes something like this").

Can "Disk doctor" corrupt an otherwise perfectly sound disk, rendering files stored on it unreadable? From what I now know about how "Disk doctor" works, there are only two cases in which this might happen. Both cases assume that the disk is physically readable and writable, and the on-disk data structures are all consistent.

Case #1: "Disk doctor" would find orphaned files and directories on the disk, representing data which had been deleted. These files and directories would be added to the root directory. If any of these newly added directory entries shared a name with one of the "Disk doctor" source code files, this could render the source code inaccessible.

Case #2: This assumes that the floppy disk uses the FFS format rather than the original Amiga file system format. "Disk doctor" always resets the root directory and subsequently adds those files back to it (this is actually a side-effect and arguably unnecessary to begin with). Because the FFS format requires that the directory entries are added so that they are sorted by ascending block number, and "Disk doctor" fails to do that, those files might not show up when the directory contents are listed. This could happen to the "Disk doctor" source code files, which would appear to be gone, although the individual files could still have been opened (using the "Type" or "Copy" commands).
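
Case #2 can be illustrated with a toy model in Python. A directory hash chain is represented as a plain list of header block numbers. The lister below is a deliberate simplification that trusts ascending order and only ever moves forward; the real FFS directory scanning code is more involved, and all names and block numbers here are invented.

```python
import bisect


def insert_sorted(chain, block):
    """Insert a header block number, keeping the chain in ascending order."""
    bisect.insort(chain, block)


def insert_at_head(chain, block):
    """Insert without regard to order (the behaviour described above)."""
    chain.insert(0, block)


def list_entries(chain):
    """Toy lister that trusts ascending order: it only moves forward to
    entries with a higher block number than the last one it returned."""
    seen, last = [], -1
    for block in chain:
        if block > last:
            seen.append(block)
            last = block
    return seen


good, bad = [], []
for block in (900, 1200, 950):
    insert_sorted(good, block)
    insert_at_head(bad, block)

print(list_entries(good))  # [900, 950, 1200]
print(list_entries(bad))   # [950, 1200]: block 900 is skipped
print(900 in bad)          # True: a direct lookup still finds the entry
```

In this model the entry at block 900 is still present in the unsorted chain and can be found directly, yet it never shows up in the listing.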

If the assumption that the disk was sound to begin with does not hold, trouble would be expected, and "Disk doctor" would likely leave the volume in a corrupted state.
 

Offline rvo_nl

great story, thanks for writing it up!
Amiga 1200 (1d4) Kickstart 3.1 (40.68), Elbox Power/Winner tower (450w psu), BlizzardPPC 603e+ @240mhz & 060 @50mhz, 256MB, Bvision, IDE-fix Express, IndivisionAGA, 120GB IDE, cd, dvd, Cocolino, Micronik Keycase, PCMCIA Ethernet, Ratte monitor switcher, Prelude1200, triple boot WB3.1 / OS3.9 / OS4.1, Win95 / MacOS8.1
 

Offline gregthecanuck

Thanks for the Disk Doctor history. A good read and some good follow-up as well.
 

Offline Thomas Richter

If I recall correctly, it was the OFS in Kickstart 1.2 which presented an error message such as

Quote
Use Disk Doctor to corrupt disk in df0:
if the file system or restart segment (aka "disk validator") found a problem it could not resolve. Telling.
 

Offline kolla

Speaking of file systems...
* when booting, if no RTC is available, the OS will set system time to the creation time stamp of the filesystem from which it boots  - very clever, if only there was a tool to adjust and update this time stamp :)
* OS4.1 comes with a newer version of FFS (FFS2?), with long filenames (0x444F5307) - is there a filesystem handler for OS3.x that supports this?

EDIT:

Oh, to stay somewhat on topic - DiskSalv rarely solved much for me, however, I have seen QuarterBackTools perform miracles :)
« Last Edit: September 09, 2017, 11:23:45 AM by kolla »
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC
 

Offline bloodline

Brilliant analysis! :)

Offline olsen

Quote from: kolla;830500
Speaking of file systems...
* when booting, if no RTC is available, the OS will set system time to the creation time stamp of the filesystem from which it boots
Actually it's even weirder - the default ROM file system reads the volume 'last altered date', and if the system time happens to be unset, will change it to that date.

Quote
 - very clever, if only there was a tool to adjust and update this time stamp :)
The way the default file system works, it should be sufficient to change a file on the volume. That change will bubble up to the root directory, and it should replace the 'last altered date' unless that change came with a time stamp which preceded what's already recorded in the root directory.

Quote
* OS4.1 comes with a newer version of FFS (FFS2?)
Well... it's a reimplementation in 'C' and not so much a version of the code which existed before it. It's like the original FFS, which was reimplemented in assembly language by Steve Beats, with the precursor written in a completely different language (BCPL).

Quote
... with long filenames (0x444F5307) - is there a filesystem handler for OS3.x that supports this?
Of course there is :)  I wrote it, starting in 2001, for use on AmigaOS 2.x/3.x. This is still the most complex and challenging software I ever wrote. The AmigaOS 4 and MorphOS versions are ports of the original 68000 implementation. I still use the 68000 version to this day on my A3000UX development machine at home.
 

Offline olsen

Quote from: Thomas Richter;830499
If I recall correctly, it was the OFS in kickstart 1.2 which presented an error message such as


if the file system or restart segment (aka "disk validator") found a problem it could not resolve. Telling.


That might have come from the SetCPU command, which "pranked" the original requester text through a patch. In the original text it says "corrected", the same word which "Disk doctor" uses when it prompts you to "Insert disk to be corrected and press RETURN". At least it's consistent ;)
 

Offline Thomas Richter

Quote from: kolla;830500
* OS4.1 comes with a newer version of FFS (FFS2?), with long filenames (0x444F5307) - is there a filesystem handler for OS3.x that supports this?

Yes.
 

Offline klx300r

@ olsen

thanks for posting! love the insider's perspective ;)
____________________________________________________________________
c64-dual sids, A1000, A1200-060@50, A4000-CSMKIII
Indivision AGA & Catweasel MK4+= Amazing
! My Master Miggies-Amiga 1000 & AmigaOne X1000 !
--- www.mancave-ramblings.blogspot.ca ---
  -AspireOS.com & Amikit- Amiga for your netbook-
***X1000- I BELIEVE *** :angel: