If I still had a working PPC classic and a job that didn't consume all my time these days, I might look into it.
-edit-
That said, the strategy for solving problems like this is as I already said.
Never call multiple 68K native OS functions directly from the PPC in time-critical sections. Always, and I mean always, write a 68K "entry" function that calls your OS functions before returning, and have the PPC call that instead.
The solution I used had a Context structure. It contained the Screen, ScreenBuffer and Window pointers, plus a small Event buffer (implemented as an array of Event structures). Each Event structure contained a type field and a bunch of unions whose meaning depended on the type. The Context structure also contained several function pointers that acted as callback handlers for events.
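For illustration, a Context along those lines might look roughly like the sketch below. It is not the original code: the event types, the union members, the handler names and the use of void pointers in place of the real Intuition structures are all assumptions made so the example compiles without the Amiga headers. Only the 64-entry event array and the general layout come from the description above.

```c
#include <stddef.h>

#define MAX_EVENTS 64   /* per-frame capacity of the event buffer */

/* Hypothetical event types; the original set is not listed in the post. */
enum EventType { EVENT_NONE, EVENT_KEY, EVENT_MOUSE_MOVE, EVENT_MOUSE_BUTTON };

struct Event {
    enum EventType type;
    union {                 /* which member is valid depends on 'type' */
        struct { unsigned short code, qualifier; } key;
        struct { short x, y; } move;
        struct { unsigned short code; short x, y; } button;
    } data;
};

struct Context;
typedef void (*EventHandler)(struct Context *ctx, const struct Event *ev);

struct Context {
    void *screen;           /* stand-in for struct Screen * */
    void *buffers[2];       /* stand-in for struct ScreenBuffer * */
    void *window;           /* stand-in for struct Window * */

    struct Event events[MAX_EVENTS];  /* filled on the 68K side */
    size_t numEvents;                 /* number of valid entries */

    /* Callbacks invoked on the PPC side; NULL means "not interested". */
    EventHandler onKey;
    EventHandler onMouseMove;
    EventHandler onMouseButton;
};
```

Keeping everything in one structure means a single pointer is all that has to cross the PPC/68K boundary.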
A PPC native function "update(struct Context*)" would take a Context instance and then pass it to a 68K function via WarpOS's Run68K().
This 68K function would then flip the ScreenBuffers and then process the IDCMP events from the Window's IDCMP port using a standard GetMsg() based loop. Processing these events simply involved decoding them and filling up values in the Context's Event array.
Unless you were really unlucky, you'd never fill this buffer between frames, though it is an obvious design flaw that it couldn't hold more than 64 events per frame. Any more than that would end up waiting to be processed in the next frame. However, in practice I rarely managed to get more than four or five between frames, even when blitzing the mouse ;-)
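The capped drain described here can be sketched in portable C as follows. This is only a model of the behaviour: FakePort is a hypothetical stand-in for the window's IDCMP port, and the loop copies from an array where the real 68K code would call GetMsg() and reply to each message. The Event layout is also simplified for the example.

```c
#include <stddef.h>

#define MAX_EVENTS 64

struct Event { int type; int value; };   /* simplified for this sketch */

/* Hypothetical stand-in for the IDCMP port: a FIFO of pending messages. */
struct FakePort {
    const struct Event *pending;
    size_t count;   /* total messages queued */
    size_t next;    /* index of the next unread message */
};

struct EventBuffer {
    struct Event events[MAX_EVENTS];
    size_t numEvents;
};

/* Copy pending messages into the buffer until the port is empty or the
 * buffer is full. Anything left over stays queued for the next frame.
 * Returns the number of events stored. */
size_t drain_port(struct FakePort *port, struct EventBuffer *buf)
{
    buf->numEvents = 0;
    while (buf->numEvents < MAX_EVENTS && port->next < port->count) {
        buf->events[buf->numEvents++] = port->pending[port->next++];
    }
    return buf->numEvents;
}
```

With 70 messages queued, the first call stores 64 and the next call picks up the remaining 6, which is the "waits for the next frame" behaviour mentioned above.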
Once this was completed, the 68K function would return back to the PPC.
Before update() returned, the PPC would then step through the events that had been received, calling the appropriate Context callback for each one.
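Put together, the PPC-side update() might look something like this. Again, this is a sketch under assumptions rather than the original code: entry68k() is a plain C function standing in for the real 68K entry point (which would be reached through Run68K(), flip the buffers and drain the IDCMP port), the two events it produces are fabricated, and the crossings and lastKey fields exist only to make the example observable.

```c
#include <stddef.h>

#define MAX_EVENTS 64

enum EventType { EVENT_KEY, EVENT_MOUSE };

struct Event { enum EventType type; int value; };   /* simplified */

struct Context;
typedef void (*EventHandler)(struct Context *ctx, const struct Event *ev);

struct Context {
    struct Event events[MAX_EVENTS];
    size_t numEvents;
    EventHandler onKey;     /* NULL means "ignore this event type" */
    EventHandler onMouse;
    int crossings;          /* illustration only: simulated PPC->68K calls */
    int lastKey;            /* illustration only: set by on_key() below */
};

/* Stand-in for the 68K entry function. The real one would flip the
 * ScreenBuffers and fill ctx->events from the window's IDCMP port. */
static void entry68k(struct Context *ctx)
{
    ctx->crossings++;
    ctx->numEvents = 0;
    ctx->events[ctx->numEvents++] = (struct Event){ EVENT_KEY, 0x45 };
    ctx->events[ctx->numEvents++] = (struct Event){ EVENT_MOUSE, 7 };
}

/* One context switch per frame, then dispatch entirely on the PPC side. */
void update(struct Context *ctx)
{
    entry68k(ctx);   /* real code: a single Run68K() call */

    for (size_t i = 0; i < ctx->numEvents; i++) {
        const struct Event *ev = &ctx->events[i];
        EventHandler handler = NULL;
        switch (ev->type) {
        case EVENT_KEY:   handler = ctx->onKey;   break;
        case EVENT_MOUSE: handler = ctx->onMouse; break;
        }
        if (handler) handler(ctx, ev);
    }
}

/* Example handler: remember the most recent key code. */
void on_key(struct Context *ctx, const struct Event *ev)
{
    ctx->lastKey = ev->value;
}
```

The callbacks run after the 68K function has returned, so they never trigger a further context switch unless they call into the OS themselves.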
The upshot of all this is that you basically got one full PPC->68K->PPC round trip per frame for both flipping the screen buffer and retrieving the input.