This reminds me of the BSD(?) Big Lock(?) discussion. All the SMP code paths went through a single lock, and it took a lot of work to split things up so that each part had its own lock.
The first port of call is of course to drop Forbid/Permit and use semaphores wherever possible, and perhaps to find a lockless design.
IIRC I optimized my Exec patch so that semaphores assume success on the first try, and so don't call Forbid unless the semaphore is already held by someone else.
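
To make the "assume success on first try" idea concrete, here is a rough sketch in portable C11. It is not the actual Exec patch: the struct layout and every sketch_* name are made up for illustration, sketch_forbid() only counts calls in place of the real Forbid(), and the queueing and wake-up of waiting tasks is left out entirely. It just shows the uncontended path claiming the semaphore with one atomic compare-and-swap and never touching Forbid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Exec's Forbid()/Permit().
 * Here they only count calls so the fast path can be observed. */
static int forbid_calls = 0;
static void sketch_forbid(void) { forbid_calls++; }
static void sketch_permit(void) { }

/* Hypothetical semaphore layout, not the real struct SignalSemaphore. */
struct sketch_sem {
    atomic_int owner;  /* 0 = free, otherwise the id of the holding task */
    int nest;          /* nesting count, only touched by the owner */
};

/* Try to take the semaphore for task_id (must be non-zero).
 * Returns true if the caller now holds it. */
static bool sketch_attempt(struct sketch_sem *s, int task_id)
{
    int expected = 0;

    /* Fast path: assume the semaphore is free and claim it with a
     * single compare-and-swap. No Forbid is needed if this succeeds. */
    if (atomic_compare_exchange_strong(&s->owner, &expected, task_id)) {
        s->nest = 1;
        return true;
    }

    /* On failure, expected holds the current owner.
     * Re-entry by the owner is still lock-free. */
    if (expected == task_id) {
        s->nest++;
        return true;
    }

    /* Slow path: someone else holds it. A real implementation would
     * queue the task and wait here; this sketch only marks where the
     * Forbid/Permit pair would go and reports failure. */
    sketch_forbid();
    /* ... enqueue the waiter (omitted) ... */
    sketch_permit();
    return false;
}

/* Release one level of ownership. Waking waiters is omitted. */
static void sketch_release(struct sketch_sem *s, int task_id)
{
    assert(atomic_load(&s->owner) == task_id);
    if (--s->nest == 0)
        atomic_store(&s->owner, 0);
}
```

With this shape, a task that takes and releases a free semaphore never increments forbid_calls. Only a second task hitting a held semaphore goes through the Forbid section. A real version would also have to handle the race between a release and a waiter being queued, which this sketch ignores.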