Something I don't understand is why most software for the 68k/PPC side of things on the Amiga was released in WarpOS versions rather than vanilla PPC versions. Whenever you do any speed comparisons, the WarpUP versions always run slower... :confused:
It depends. Early versions of PowerUP suffered a larger performance hit when switching between 68K and PPC.
On my machine, a round-trip context switch from PPC to an empty 68K routine and back took around 2ms under PowerUP and 1ms under WarpOS. For context-switch heavy code, that made quite a difference. Later versions of the PowerUP kernel seemed to close that gap.
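To put those figures in perspective, here is a rough back-of-envelope sketch in C. The function name, the 50Hz frame rate (PAL) and the count of 5 round trips per frame are my own assumptions for illustration, not measurements; only the 2ms and 1ms figures come from the timings above.

```c
#include <assert.h>

/* Fraction of one frame spent purely on PPC <-> 68K round trips.
 * round_trip_ms   : cost of one PPC -> 68K -> PPC round trip
 * trips_per_frame : round trips made per frame (hypothetical workload)
 * frame_hz        : display rate; 50.0 assumes a PAL display
 */
double switch_overhead_fraction(double round_trip_ms,
                                int trips_per_frame,
                                double frame_hz)
{
    double frame_ms;

    assert(frame_hz > 0.0);
    assert(trips_per_frame >= 0);

    frame_ms = 1000.0 / frame_hz;               /* time budget per frame */
    return (round_trip_ms * trips_per_frame) / frame_ms;
}
```

With those assumed numbers, switch_overhead_fraction(2.0, 5, 50.0) comes out at half the frame gone on switching alone, and switch_overhead_fraction(1.0, 5, 50.0) at a quarter.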
Was it easier to program in WarpUP than plain old PPC (PowerUp)? I wouldn't have thought so. Can anyone who has ever written something in WarpUP clarify this?
No, the same caveat applied to both: write your code to minimise context switches.
Back in those days, I wrote myself a small library in vbcc that provided a multiple-buffered Screen (or a window with an offscreen BitMap) with input handling, stream handling and other features. The display context had a single "refresh" method which, when invoked, made a Run68K() call (WarpOS). The 68K code it invoked did all the screen buffer switching and gathered all the IDCMP messages into a local buffer, decoding keypresses into their character codes and so on. On return, the PPC side invoked user-supplied callbacks for the buffered input events before returning to the caller.
In this manner, refreshing the display and processing the input events took just 2 context switches - to 68K and back. Hard to optimize it any further than that.
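The shape of that refresh call can be sketched roughly as below. This is a portable simulation, not the original library: every name here (DisplayContext, InputEvent, display_refresh, m68k_refresh_stub, count_event) is made up for illustration, and the stub stands in for the 68K routine that the real code reached through Run68K(). It only counts switches and fakes two key presses.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 32

/* Hypothetical decoded input event; the real layout is not known. */
typedef struct {
    int type;   /* 1 = key press (illustrative value) */
    int code;   /* decoded character code */
} InputEvent;

typedef void (*EventCallback)(const InputEvent *ev, void *user);

typedef struct {
    InputEvent    events[MAX_EVENTS]; /* filled on the "68K side" */
    size_t        count;              /* number of buffered events */
    unsigned long switches;           /* simulated context switches */
} DisplayContext;

/* Stand-in for the 68K routine: would swap the screen buffer and
 * drain the IDCMP port. Here it just fakes two key presses. */
static void m68k_refresh_stub(DisplayContext *dc)
{
    dc->count = 0;
    dc->events[dc->count].type = 1;
    dc->events[dc->count].code = 'a';
    dc->count++;
    dc->events[dc->count].type = 1;
    dc->events[dc->count].code = 'b';
    dc->count++;
}

/* One refresh: cross to the 68K once, come back once, then run the
 * user callbacks on the PPC side from the local buffer. */
void display_refresh(DisplayContext *dc, EventCallback cb, void *user)
{
    size_t i;

    assert(dc != NULL);

    dc->switches++;            /* PPC -> 68K */
    m68k_refresh_stub(dc);
    dc->switches++;            /* 68K -> PPC */

    for (i = 0; cb != NULL && i < dc->count; i++)
        cb(&dc->events[i], user);   /* no further switches needed */
}

/* Example callback: counts the events it is handed. */
void count_event(const InputEvent *ev, void *user)
{
    (void)ev;
    (*(int *)user)++;
}
```

Calling display_refresh() on a zeroed DisplayContext with count_event as the callback delivers both fake events while the switch counter only goes up by two, however many events were buffered.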
I saw a lot of code written in those days that just treated OS calls as normal function calls. This was in part encouraged by compilers like StormC that made the switch transparent. The end result was code that turned into a context-switch orgy.
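For a feel of how quickly that adds up, here is a small counting sketch. It assumes a particular naive call pattern (one call to swap buffers, then polling the message port one message at a time until it comes back empty), which is my guess at typical code rather than anything taken from a real program. transparent_os_call() is a hypothetical stand-in for a compiler-generated stub, not an actual StormC or WarpOS function.

```c
#include <assert.h>

static unsigned long g_switches;   /* simulated context-switch counter */

/* Hypothetical stub: hop to the 68K, run one OS call, hop back. */
static void transparent_os_call(void)
{
    g_switches += 2;               /* PPC -> 68K, then 68K -> PPC */
}

/* Naive frame: every OS call goes through its own round trip. */
unsigned long naive_frame_switches(unsigned pending_msgs)
{
    unsigned i;

    g_switches = 0;
    transparent_os_call();         /* swap the screen buffer */
    for (i = 0; i < pending_msgs; i++)
        transparent_os_call();     /* fetch one message */
    transparent_os_call();         /* last fetch finds the port empty */

    /* sanity check: (pending_msgs + 2) round trips in total */
    assert(g_switches == 2UL * ((unsigned long)pending_msgs + 2UL));
    return g_switches;
}

/* Batched frame: the same work done inside a single 68K routine. */
unsigned long batched_frame_switches(unsigned pending_msgs)
{
    (void)pending_msgs;            /* cost is independent of message count */
    g_switches = 0;
    transparent_os_call();         /* one round trip covers everything */
    return g_switches;
}
```

Under that assumed pattern, 10 pending messages cost 24 switches the naive way against 2 when batched, and the naive figure keeps growing with every extra message.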