Author Topic: AROS SMP Research: Technical Discussion (Read 29853 times)

Ezrec · « **on:** August 22, 2013, 05:56:14 PM »

Continued from thread: http://www.amiga.org/forums/showthread.php?t=65748

Ezrec · « **Reply #1 on:** August 22, 2013, 06:02:28 PM »

Quote from: psxphill;745838

Is that how your forbid works? That the other cpu is only stopped when its quantum expires? That would probably be a mistake: the first cpu might be trying to send a message to a port that is on the other cpu (which currently requires a forbid to make sure the port doesn't disappear before you send the message).

In that case, you (the programmer) need to update your code anyway, since you could have gotten pre-empted right before the SendMsg()/Signal() and lost that port, even on AmigaOS 3.x

Quote from: psxphill;745838

How is the second cpu task switching if it has no hardware interrupts? Is it controlled by the timer on cpu0?

That is architecture specific. On the 'unix hosted' AROS environment, CPU0 proxies the timer scheduling interrupt for the other CPUs.

On pc-x86, we will probably use the Local APIC timers, and an IPI signal to tell the cores to 'please stop running for a bit, until I tell you otherwise'.

psxphill · « **Reply #2 on:** August 22, 2013, 06:12:55 PM »

Quote from: Ezrec;745841

In that case, you (the programmer) need to update your code anyway, since you could have gotten pre-empted right before the SendMsg()/Signal() and lost that port, even on AmigaOS 3.x

You need a forbid round the find/sendmsg, but you won't want the forbid to wait for all cpus to finish their quantum. When the forbid happens the other cpus need to stop what they are doing immediately.
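The find-then-send pattern discussed here can be sketched on a host machine as below. This is only a mock: the `mock_` functions, the `registry` array and `safe_send()` are hypothetical stand-ins, not the exec.library API. On a real system the corresponding calls would be Forbid(), FindPort(), PutMsg() (presumably what "SendMsg()" refers to) and Permit().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side stand-ins; none of this is the real exec API. */
struct mock_port { const char *name; int queued; };

struct mock_port *registry[4];  /* public ports that can be looked up by name */
int forbid_nest = 0;            /* > 0 means task switching is forbidden */

void mock_forbid(void) { forbid_nest++; }
void mock_permit(void) { assert(forbid_nest > 0); forbid_nest--; }

struct mock_port *mock_find_port(const char *name)
{
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; i++)
        if (registry[i] && strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}

/* Look the port up and queue a message without letting go of the
 * forbid in between, so the port cannot vanish between the two steps.
 * Returns 1 if the message was queued, 0 if no such port exists. */
int safe_send(const char *name)
{
    int sent = 0;

    mock_forbid();
    struct mock_port *port = mock_find_port(name);
    if (port) {
        port->queued++;         /* stands in for queuing the message */
        sent = 1;
    }
    mock_permit();
    return sent;
}
```

The point of the pattern is that the lookup and the send sit inside one forbidden region. On SMP that region only protects anything if the other cpus are actually stopped while it runs.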

Ezrec · « **Reply #3 on:** August 22, 2013, 06:17:47 PM »

Quote from: psxphill;745843

You need a forbid round the find/sendmsg, but you won't want the forbid to wait for all cpus to finish their quantum. When the forbid happens the other cpus need to stop what they are doing immediately.

I think you're misunderstanding: the Forbid() doesn't return until the other CPUs have stopped.

Ezrec · « **Reply #4 on:** August 22, 2013, 06:31:44 PM »

Quote from: Ezrec;745844

I think you're misunderstanding: the Forbid() doesn't return until the other CPUs have stopped.

And I think *I* have something wrong. Michal Schulz did some rough performance calculations, and even though my method (wait for quantum to expire) is semantically correct, the performance penalty is terrifying.

I'll experiment with signalling the other cores to stop immediately, and see how that works out.

psxphill · « **Reply #5 on:** August 22, 2013, 06:57:01 PM »

Quote from: Ezrec;745846

And I think *I* have something wrong. Michal Schulz did some rough performance calculations, and even though my method (wait for quantum to expire) is semantically correct, the performance penalty is terrifying.

Yeah that was my point. Making forbid wait for the other cpus will mean that these four lines of code will take over 1 task quantum.

forbid()
permit()
forbid()
permit()

The first forbid() will take anywhere from nothing to 1 task quantum depending on how it aligns with the other cpu's tasks.

Stopping the other cpus immediately will have some performance penalty, but even though it's much higher than the overhead in AOS 3.1, it should be nowhere near a quantum.

You also don't want the other cpu's tasks to lose their quantum when another cpu does a forbid(); the other cpu's tasks should have their quantum extended by the length of time they are suspended.

Rather than signalling the other cpu, it might be enough to actually stop them. The performance might depend on architecture, plus I don't know how you're abstracting all this stuff, so either way might make more sense.
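The arithmetic behind "over 1 task quantum" can be written down as a rough model. It assumes a single other cpu, that the other cpu starts a fresh full quantum right after each permit(), and that the calls themselves cost nothing. The function names and the time unit are made up for illustration.

```c
#include <assert.h>

/* Cost of forbid(); permit(); forbid(); permit(); when forbid() waits
 * for the other cpu's quantum to expire.
 * remaining_at_first: time left in the other cpu's quantum when the
 * first forbid() is called (anywhere from 0 to quantum). */
unsigned long forbid_pair_cost_wait(unsigned long quantum,
                                    unsigned long remaining_at_first)
{
    assert(remaining_at_first <= quantum);
    /* First forbid() waits out what is left of the current quantum.
     * Second forbid() arrives just after the other cpu was restarted,
     * so under the assumption above it waits a whole quantum. */
    return remaining_at_first + quantum;
}

/* Same four calls when each forbid() stops the other cpu right away and
 * only pays a fixed stop latency (for example an interrupt round trip). */
unsigned long forbid_pair_cost_stop(unsigned long stop_latency)
{
    return 2 * stop_latency;
}
```

With a hypothetical 20000 microsecond quantum and 5000 microseconds left at the first call, the waiting variant costs 25000 microseconds, while the immediate-stop variant costs twice whatever the stop latency is.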

Ezrec · « **Reply #6 on:** August 22, 2013, 07:19:51 PM »

Quote from: psxphill;745851

Yeah that was my point. Making forbid wait for the other cpus will mean that these four lines of code will take over 1 task quantum.

Ok, looks like we're on the same page now.

Michal's planning on using IPI to signal the other CPUs to stop (on x86 SMP, there isn't some "magic register" you can use to stop other CPUs, you have to ask them nicely), but it's a lot faster than waiting until they reach a Switch()/Dispatch() point.

psxphill · « **Reply #7 on:** August 22, 2013, 07:44:41 PM »

Quote from: Ezrec;745855

Michal's planning on using IPI to signal the other CPUs to stop (on x86 SMP, there isn't some "magic register" you can use to stop other CPUs, you have to ask them nicely), but it's a lot faster than waiting until they reach a Switch()/Dispatch() point.

Cool, that should work better.

Do you think the time the cpu is suspended not counting towards the current task's quantum makes sense? Otherwise the fairness will depend on what is running on the other cpus & you could get one task that is permanently starved in pathological cases. If it's got its own timer that fires when the quantum is up then it might just be a case of pausing it, but if you can only stop it you'd need to keep track of the current time left and use that when you start the cpu again.

Ezrec · « **Reply #8 on:** August 22, 2013, 07:46:22 PM »

Quote from: psxphill;745857

Do you think the time the cpu is suspended not counting towards the quantum makes sense? Otherwise the fairness will depend on what is running on the other cpus & you could get one task that is permanently starved in pathological cases.

Right now, suspended CPUs do not have their Elapsed updated when they are suspended, so they should not be starved.
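A minimal sketch of that accounting rule, assuming a per-cpu tick handler: ticks that arrive while the cpu is suspended are simply not charged to the running task. The struct and function names are hypothetical and are not taken from the AROS sources.

```c
/* Hypothetical per-cpu time slice state. */
struct cpu_slice {
    unsigned quantum;    /* ticks a task may run before being switched out */
    unsigned elapsed;    /* ticks charged to the current task so far */
    int      suspended;  /* nonzero while another cpu holds a forbid */
};

/* Called on every timer tick for this cpu.
 * Returns 1 when the current task has used up its quantum. */
int slice_tick(struct cpu_slice *s)
{
    if (s->suspended)
        return 0;        /* time spent stopped is not counted */
    if (s->elapsed < s->quantum)
        s->elapsed++;
    return s->elapsed >= s->quantum;
}
```

Because `elapsed` only advances while the cpu is really running, a task gets its full quantum no matter how long other cpus keep it stopped, which is what avoids the starvation case raised above.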

minator · « **Reply #9 on:** August 22, 2013, 08:00:51 PM »

It's interesting that this is being tried but I suspect it will never get past the experimental phase.

Even if it can be made to work, it's going to serialise the CPUs so much that there's no point having multiple CPUs.

It might be possible to show a nice speedup on some long running highly parallelisable benchmark but that's it. In any real system apps will be constantly stalling the system and you don't need to be Gene Amdahl to know what the result will be.
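For reference, Amdahl's law puts a number on this: if only a fraction p of the work can run in parallel, n cpus give a speedup of 1 / ((1 - p) + p / n). The small function below just evaluates that formula; its name and the example fractions are illustrative and are not measurements of AROS.

```c
#include <assert.h>

/* Amdahl's law: best-case speedup on `cpus` processors when only
 * `parallel_fraction` (0.0 to 1.0) of the work can run in parallel. */
double amdahl_speedup(double parallel_fraction, unsigned cpus)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(cpus > 0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cpus);
}
```

For example, if frequent system-wide stalls left only half of the run time parallel, two cpus would give about 1.33x, and no number of cpus could ever reach 2x.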

matthey · « **Reply #10 on:** August 22, 2013, 09:12:22 PM »

@Ezrec
Congratulations! Great effort! You have already proved some people wrong with your experiments.

How are you handling the ENABLE/DISABLE FORBID/PERMIT macros (ables.i) that increment and decrement the ExecBase IDNestCnt and TDNestCnt?
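As background for this question, here is a small host-side model of the nesting counter. On classic exec the counters start at -1 and the macros modify them in place, so a forbid done that way never passes through a library call that an SMP kernel could hook. Only the field names IDNestCnt and TDNestCnt come from ExecBase; the struct and functions are hypothetical.

```c
#include <assert.h>

/* Hypothetical cut-down model of the two ExecBase nesting counters. */
struct mock_execbase {
    signed char IDNestCnt;  /* interrupt disable nesting, -1 = enabled */
    signed char TDNestCnt;  /* task switch disable nesting, -1 = enabled */
};

void nest_init(struct mock_execbase *eb)
{
    eb->IDNestCnt = -1;
    eb->TDNestCnt = -1;
}

/* What an inline FORBID amounts to: a bare increment, no function call. */
void td_forbid(struct mock_execbase *eb)
{
    eb->TDNestCnt++;
}

/* Permit: undo one level of nesting.
 * Returns 1 when task switching is allowed again. */
int td_permit(struct mock_execbase *eb)
{
    assert(eb->TDNestCnt >= 0);  /* must be paired with a forbid */
    eb->TDNestCnt--;
    return eb->TDNestCnt < 0;
}
```

Nested pairs only re-enable switching on the outermost permit. A bare increment like `td_forbid()` is invisible to the other cpus unless something is watching the counter, which is presumably why the question matters for SMP.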

Quote from: minator;745861

It's interesting that this is being tried but I suspect it will never get past the experimental phase.

Even if it can be made to work, it's going to serialise the CPUs so much that there's no point having multiple CPUs.

It might be possible to show a nice speedup on some long running highly parallelisable benchmark but that's it. In any real system apps will be constantly stalling the system and you don't need to be Gene Amdahl to know what the result will be.

The performance of most current SMP processors would be limited by limitations of the AmigaOS. However, specialized hardware (and fpga-ware whatever you want to call it) could drastically reduce this overhead and increase compatibility. ExecBase could be set up in a particular area of memory with certain addresses that are monitored for changes and trigger some fpga programming action that affects all cores. Some of the multi-tasking and multi-core handling could even move into hardware (fpga code). Think of the Fido processor (68k) with its semi-hardware handling of multi-tasking (it has a per task time slice countdown value with auto hardware interrupt when the time is up) being upgraded to SMP. It would be a little bit complex in hardware but then could offer the advantage of more protection of SMP and multi-tasking from errant and malicious software. Add partial memory protection with an MMU and virtual addressing for >4MB memory support (each task would be limited to 2MB or so) and the Amiga with 68k might be competitive again (with an ASIC). Gunnar von Boehn would like to make a multi-core version of the 68k Apollo processor. Duplicating the cores in fpga is very simple. The rest is just giving Jason what he needs provided his ideas do not have flaws.

Ezrec · « **Reply #11 on:** August 22, 2013, 09:34:39 PM »

Quote from: minator;745861

Even if it can be made to work, it's going to serialise the CPUs so much that there's no point having multiple CPUs.

Very true.

I'm investigating a spinlock-style SignalSemaphore that has a lower latency for protecting frequently used internal data structures in Exec.
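For readers unfamiliar with the idea, a bare-bones spinlock can be sketched with C11 atomics as below. This is a generic illustration and not the AROS implementation: the names are hypothetical, and a real SignalSemaphore replacement would also need nesting, ownership and a way to sleep on long waits.

```c
#include <stdatomic.h>

/* Hypothetical minimal spinlock built on a C11 atomic_flag. */
struct spinlock {
    atomic_flag locked;
};

void spin_init(struct spinlock *l)
{
    atomic_flag_clear(&l->locked);
}

/* Try once; returns 1 if the lock was taken, 0 if it was already held. */
int spin_try_lock(struct spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is free. Cheap when the holder releases
 * within a few instructions, wasteful when it holds the lock for long. */
void spin_lock(struct spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked,
                                             memory_order_acquire)) {
        /* spin */
    }
}

void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The attraction for short critical sections is that only the cpus that actually contend for the same lock wait for each other, instead of every cpu being stopped.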

Just so people know - even though I am one of the m68k developers for AROS, AROS SillySMP is *not* targeted for m68k. It is only for *existing* processors with *actual* SMP hardware.

Once someone puts an actual piece of SMP m68k hardware into my hands, I'll be happy to develop for it.

Zac67 · « **Reply #12 on:** August 22, 2013, 09:48:20 PM »

Sorry if I'm a bit naive here - but what would be the problem with leaving the other cores running on a Forbid() as long as they stay in userland?
Of course, you'd need to take care of them not running over another Forbid() or into an interrupt. Both could be prevented with a simple semaphore in Forbid() and all relevant interrupts (which need to be deferred until Permit()).

psxphill · « **Reply #13 on:** August 23, 2013, 12:12:19 AM »

Quote from: Zac67;745872

Sorry if I'm a bit naive here - but what would be the problem with leaving the other cores running on a Forbid() as long as they stay in userland?

Because Forbid() is used to protect userland data structures shared between tasks too. As those tasks might be running on different cpus, you have no choice but to stop the other cpus.

kamelito · « **Reply #14 on:** August 23, 2013, 12:22:32 AM »

Might be interesting to ask Carl Sassenrath what he thinks about it.
Kamelito
