What a mess audio on Amiga is 
Not really. I would rather say "The typical misuse of the audio.device is a mess".
If audio.device were used properly, there would be no problem: Intuition (via IPrefs) "steals" the channels for the duration of the DisplayBeep().
The application whose audio channels have been stolen is informed of this, and can re-allocate the channels once IPrefs is done. The protocol for that is all there; it is just that "nobody cares" to use it. Typically, the audio.device is used to allocate channels at the beginning (if at all), and then applications just "poke the hardware", assuming "all is fine".
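To make the steal / notify / re-allocate cycle concrete, here is a small toy model in plain C. It is only a sketch of the idea, not the real audio.device interface: the names Client, allocate(), release() and may_play() and the precedence values are all made up for illustration. As far as I recall, on the real device the notification arrives through a pending ADCMD_LOCK request that comes back with ADIOERR_CHANNELSTOLEN.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: NOT the audio.device API, every name here is hypothetical. */
#define NUM_CHANNELS 4

typedef struct Client {
    const char *name;
    int precedence;   /* a higher precedence may steal from a lower one */
    unsigned wanted;  /* bit mask of channels the client wants */
    unsigned owned;   /* bit mask of channels the client currently holds */
    unsigned stolen;  /* bit mask of channels taken away since last allocate */
} Client;

/* Current owner of each channel, NULL if the channel is free. */
static Client *owner[NUM_CHANNELS];

/* Try to get every channel in c->wanted, all or nothing.
 * Channels held by a lower-precedence client are stolen and that
 * client is notified through its 'stolen' mask. Returns 1 on success. */
int allocate(Client *c)
{
    int ch;
    assert(c != NULL);

    /* Pass 1: every wanted channel must be free, ours, or stealable. */
    for (ch = 0; ch < NUM_CHANNELS; ch++) {
        Client *cur = owner[ch];
        if (!(c->wanted & (1u << ch)))
            continue;
        if (cur != NULL && cur != c && cur->precedence >= c->precedence)
            return 0;
    }

    /* Pass 2: take the channels and tell previous owners they lost them. */
    for (ch = 0; ch < NUM_CHANNELS; ch++) {
        Client *prev = owner[ch];
        if (!(c->wanted & (1u << ch)))
            continue;
        if (prev != NULL && prev != c) {
            prev->owned &= ~(1u << ch);
            prev->stolen |= 1u << ch;   /* the "channel stolen" notification */
        }
        owner[ch] = c;
        c->owned |= 1u << ch;
    }
    c->stolen = 0;
    return 1;
}

/* Give back every channel the client holds. */
void release(Client *c)
{
    int ch;
    for (ch = 0; ch < NUM_CHANNELS; ch++)
        if (owner[ch] == c)
            owner[ch] = NULL;
    c->owned = 0;
}

/* What a well-behaved application checks before touching the hardware. */
int may_play(const Client *c)
{
    return c->owned == c->wanted;
}
```

In this model, an application with precedence 0 that wants channels 0 and 1 gets both from allocate(). When a "beep" client with a higher precedence allocates channel 0, the application's stolen mask becomes non-zero and may_play() reports 0, so it should stop poking the hardware. Once the beep client calls release(), the application calls allocate() again and carries on. The applications I am complaining about skip the may_play() check and the second allocate() entirely.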
Nowadays, one would of course expect audio.device to implement a software mixer for the various audio sources, but that's probably just too much complexity for a poor 68K.