Topic: PowerPC accelerator - how does that work then?

Offline Tension

Re: PowerPC accelerator - how does that work then?
« on: January 30, 2012, 08:27:59 PM »
Quote from: Karlos;678302
Neglecting OS4.x and MorphOS 1.x for the moment, there are 2 solutions on OS3.x. The following description is somewhat generalized and may have mistakes and omissions as it was a long time ago that I last looked at it.

First there was PowerUP (or just PUP). This was developed by phase5 as the software to allow applications to take advantage of the PPC. Essentially, it comprises a kernel for the PPC, along with some APIs for developers and a set of patches to the host operating system to bootstrap the processor.

Once the system was up and running, the (patched) host OS was able to load PPC binaries in ELF format (it may have been modified a bit, but it was essentially the same format described in the old System V ABI). An ELF could be an entire application for the PPC, or just a set of PPC optimised functions to be used by an otherwise 68K application. Code running on the PPC was basically independent of the 68K until it needed to call a host OS function (note that the PUP kernel provides memory management, tasks and so on for the PPC side).
The mechanism required to call 68K code from the PPC (or vice versa) necessitated that both processors flush their data caches. The parameters for the call would then be passed through memory from one to the other. While the "other" CPU is working, the corresponding process on the current CPU is basically asleep until the opposite CPU completes and returns. This process was simply called a "context switch" (a widely used term in other areas of computing, but technically accurate nonetheless). I'm afraid I forget the specific underlying details of how all this was achieved. If memory serves, there was an arbitration "server" process involved in the communication between the two processors. Perhaps someone better qualified can answer that.
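
To make the handoff concrete, here is a toy model in Python. It is only a sketch of the idea described above, not how PowerUP actually did it: the class name, the mailbox dictionary and the flush counter are all made up for illustration, and two ordinary threads stand in for the two CPUs.

```python
import threading


class CrossCall:
    """Toy model of a PPC -> 68K call. All names here are hypothetical."""

    def __init__(self):
        self.mailbox = {}                  # stands in for shared memory
        self.request = threading.Event()   # "PPC" wakes the "68K" side
        self.reply = threading.Event()     # "68K" wakes the "PPC" side
        self.flushes = 0                   # counts simulated cache flushes
        server = threading.Thread(target=self._serve_68k, daemon=True)
        server.start()

    def _flush_cache(self):
        # A real flush is the expensive part; here it is only counted.
        self.flushes += 1

    def _serve_68k(self):
        # The "68K" side sleeps until a request arrives, runs it, replies.
        while True:
            self.request.wait()
            self.request.clear()
            func = self.mailbox["func"]
            args = self.mailbox["args"]
            self.mailbox["result"] = func(*args)
            self._flush_cache()            # make the result visible
            self.reply.set()

    def call_68k(self, func, *args):
        # Parameters go through memory, then the caller goes to sleep.
        self.mailbox["func"] = func
        self.mailbox["args"] = args
        self._flush_cache()                # make the parameters visible
        self.request.set()
        self.reply.wait()                  # caller is asleep until the reply
        self.reply.clear()
        return self.mailbox["result"]


cc = CrossCall()
print(cc.call_68k(lambda a, b: a + b, 2, 3))  # prints 5
```

Each round trip costs two simulated flushes (one per side), and the caller does nothing at all while it waits, which is the overhead discussed below.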

The system worked, but early versions suffered poor performance due to the overhead of said context switches (this was addressed later). In the time it took either processor to flush its cache, it could have done millions of instructions. Other criticisms revolved around the use of ELF which was seen as an alien standard on the Amiga* which had its own hunk format for binaries. Soon thereafter, WarpUP appeared (later called WarpOS), a rival system that extended the Amiga hunk format to allow PPC code segments in executables, leading to "fat" binaries.

Fundamentally, it was similar to PUP: a PPC kernel, API and a set of OS patches. The main differences were in the implementation. WarpUP tried to mirror Exec's structure for the services it provided. Threads were in pairs, one for each CPU and only one of which would be active at any time (there were asynchronous context switches but they were tricky to use). A single-threaded application that used the PPC would thus really be two threads, one on either chip, only one of which was running at any instant, the other sleeping and waiting for its counterpart to return. However, they shared many fundamental components, such as signals. If you sent CTRL-C to one, the corresponding signal bit would be set in both the Task and TaskPPC (or at least it was routed to whichever was active, I forget).
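
As a rough picture of the paired-thread idea, here is a small Python sketch. It assumes the "set in both" variant of the signal behaviour described above (the routed-to-active variant is not modelled), and the class and method names are invented for illustration rather than taken from WarpUP. The CTRL-C bit value is the standard AmigaOS one.

```python
SIGBREAKF_CTRL_C = 1 << 12  # AmigaOS break signal bit for CTRL-C


class TaskPair:
    """Hypothetical model of a 68K Task and its mirrored TaskPPC."""

    def __init__(self):
        self.signals = {"68k": 0, "ppc": 0}  # pending signal bits per side
        self.active = "68k"                  # only one side runs at a time

    def switch(self):
        # A context switch puts the active side to sleep and wakes the other.
        self.active = "ppc" if self.active == "68k" else "68k"

    def signal(self, mask):
        # Assumption: the signal is mirrored to both halves of the pair.
        for side in self.signals:
            self.signals[side] |= mask

    def pending(self, mask):
        # What the currently running half would see if it checked.
        return bool(self.signals[self.active] & mask)


pair = TaskPair()
pair.switch()                    # the PPC half is now the one running
pair.signal(SIGBREAKF_CTRL_C)    # user hits CTRL-C
print(pair.pending(SIGBREAKF_CTRL_C))  # prints True
```

Whichever half happens to be awake sees the break, so the application behaves like a single task from the user's point of view.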

It still required cache flushing in the same way as PUP; the main advantage of the system (at the time) was that it was considerably quicker at it. That performance gap closed in later versions of PUP, but by then we'd enjoyed the "kernel" wars and WarpUP had become the unofficial standard. At least until OS3.9 where the picture.datatype used it. That's about as official as it ever got.

Each system had its strengths and weaknesses. The Achilles heel of both was the context switch. It may have been faster on WarpUP, but in real terms it was still shockingly expensive. You had to carefully design code to minimise these if you wanted your application to run acceptably.
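
A back-of-envelope model shows why minimising switches mattered so much. The numbers below are invented purely for illustration (they are not measurements from either kernel): each item of work costs 1 unit, and each cross-CPU round trip costs a fixed amount on top.

```python
def total_cost(work_items, switch_cost, batch_size):
    """Cost in arbitrary units: one round trip per batch, plus 1 per item."""
    batches = -(-work_items // batch_size)  # ceiling division
    return batches * switch_cost + work_items


# Hypothetical figures: 1000 items, 10000 units per context switch.
per_item = total_cost(1000, 10_000, batch_size=1)    # one switch per item
batched = total_cost(1000, 10_000, batch_size=100)   # one switch per 100 items
print(per_item, batched)  # prints 10001000 101000
```

With one switch per item the switching dwarfs the real work; handing over 100 items per call cuts the total by roughly a factor of 100 in this toy model. That is the sort of restructuring you had to do to get acceptable performance.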

*Ironically, all the offshoot operating systems use ELF for their native binary format.


barry do teh money dance

Offline Tension

Re: PowerPC accelerator - how does that work then?
« Reply #1 on: January 30, 2012, 10:01:54 PM »
Quote from: Karlos;678311
Also, the above...

/me slowly edges backwards towards the door, smiling at the strange man from Belfast in a non-threatening manner :lol:


Sorry, I think moobunny is starting to get inside my head. ;)