Can anyone tell me what the speed penalty from context switching is now?
Virtually none. The context-switching penalty on dual CPUs came from the two CPUs fighting over the address space. With one CPU that's gone. It'll be just like Amithlon, but of course better. :-)
OK, software context switching slows it down a little too, maybe about a ten-thousandth of the slowdown hardware switching causes. I don't think anyone knows exactly how much slowdown there will be, but it should be a truly tiny amount. It's a bone of contention: MOS say their way is faster, OS4 say their way (using the MMU) is faster. Only benchmarks will prove it, and no amount of trolling and flaming will change that (I thought I'd add that before the FUD throwing started).
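For anyone who wants numbers rather than arguments, here's a rough sketch of the kind of benchmark that could settle it. Big assumptions: this is plain Python for a POSIX system (it uses os.fork and os.pipe), not anything that runs on MOS or OS4, and the names measure_round_trip and iterations are just made up for illustration. Two processes bounce a single byte back and forth over pipes, so on a single CPU every round trip forces the scheduler to switch between them at least twice. The figure also includes pipe and syscall overhead, so treat it as an upper bound on the switch cost, not a precise measurement.

```python
import os
import time


def measure_round_trip(iterations=10000):
    """Return the average seconds per ping-pong round trip between two processes."""
    parent_r, child_w = os.pipe()   # child writes, parent reads
    child_r, parent_w = os.pipe()   # parent writes, child reads

    pid = os.fork()
    if pid == 0:
        # Child: close the parent's ends, then echo every byte straight back.
        os.close(parent_r)
        os.close(parent_w)
        for _ in range(iterations):
            os.write(child_w, os.read(child_r, 1))
        os._exit(0)

    # Parent: close the child's ends, then time the ping-pong loop.
    os.close(child_r)
    os.close(child_w)
    start = time.perf_counter()
    for _ in range(iterations):
        os.write(parent_w, b"x")   # wake the child
        os.read(parent_r, 1)       # block until the child answers
    elapsed = time.perf_counter() - start

    os.waitpid(pid, 0)             # reap the child
    os.close(parent_r)
    os.close(parent_w)
    return elapsed / iterations


if __name__ == "__main__":
    per_trip = measure_round_trip()
    print(f"average round trip: {per_trip * 1e6:.2f} microseconds")
```

Run the same thing on each system you want to compare and look at the difference, not the absolute number. On a multi-core box the two processes may sit on different cores, in which case you're mostly measuring wake-up latency instead, so pin both to one CPU if you want a fair comparison.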