Hi Thomas and Olsen!
Thank you for your answers, very interesting!
The Amiga operating system and all its components matche one of the textbook definitions of legacy code: code for which no automated tests exist. It is not designed to be testable, which makes it very hard to both design effective tests and to deploy a testable version in the same context as it would be in if it were production code and not test code (ROM space is the major limitation here). The operating system source code is a mix of 'C' and assembly code, both of which provide very little leverage to achieve abstraction layers which are much more easily available in object-oriented languages.
If you are used to testing frameworks to make your life easier, you know what you are missing. I did investigate how testing of complex 'C' code is being done, and the sad result seems to be that is an unsolved problem. The testing frameworks I found which apparently catered to the need of automated 'C' code testing seemed to be not much more useful and powerful than the equivalent to planting assert() calls and the occasional printf() in the code (debugging like it's 1983), so I was none the wiser.
As far as I can tell there is no worthwhile test framework for 'C' legacy code, but I'd very much like to be proven wrong on that... So we did not use one on AmigaOS, which is why there are no use cases and no test suite to speak of.
Compatibility testing still worked very much like it did when Commodore was still in business: testing by exercising the features, discovering compatibility issues more or less by chance as well as by methodical exploration.
One exception is the new "Disk Doctor" which I wrote from scratch for Workbench 3.1.4. The code was designed to be testable, and it even includes a damage simulation feature which makes the disk access layer pretend that certain forms of data corruption or read errors exist on the medium. Automated testing involved having Disk Doctor examine and recover data from hundreds of Amiga disk images.
I don't know of a good framework to test C code either, even less so for OS and OS ROM and libraries... Could you explain in more details what you mean about "It is not designed to be testable"? Naively, I would have think that, being a set of libraries (in ROM or software), there could be tests for each and every available data structures and functions provided by these libraries, but I'm certainly missing something here... in particular, when it comes to dynamic stuff, like tasks and processes.
Could you tell us more about the features that you exercised? Do you have a set of programs that you know are "tough" on the OS and "play" with them to test the OS?
Best!