So there is a kind of selection process, such that instructions that don't depend on each other can be executed in parallel, while the rest go through a single pipeline?
Yeah, you can't execute two instructions in parallel if the first one modifies a register that the second one uses. For example:
MOVEQ #0,D0
TST.W D0
The secondary pipeline can't execute all instructions either. Floating-point instructions, for instance, can only be dispatched from the primary pipeline.
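To make those two rules concrete, here is a rough Python sketch of a pairing check. The `Instr` class, the `can_pair` function and the register sets are all made up for illustration. It only models the two conditions mentioned above (register dependency and primary-only instructions) and ignores condition codes and whatever other rules the real chip applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical, heavily simplified instruction record."""
    text: str
    reads: frozenset = frozenset()    # registers the instruction reads
    writes: frozenset = frozenset()   # registers the instruction modifies
    primary_only: bool = False        # e.g. floating-point instructions


def can_pair(first: Instr, second: Instr) -> bool:
    """Return True if `second` may issue next to `first` in this toy model."""
    if second.primary_only:
        # The secondary pipeline cannot execute this instruction at all.
        return False
    if first.writes & second.reads:
        # `second` needs a result that `first` has not produced yet.
        return False
    return True


moveq = Instr("MOVEQ #0,D0", writes=frozenset({"D0"}))
tst = Instr("TST.W D0", reads=frozenset({"D0"}))
addq = Instr("ADDQ.L #1,D1", reads=frozenset({"D1"}), writes=frozenset({"D1"}))
fadd = Instr("FADD.X FP0,FP1", reads=frozenset({"FP0", "FP1"}),
             writes=frozenset({"FP1"}), primary_only=True)

print(can_pair(moveq, tst))   # False: TST.W reads D0, which MOVEQ writes
print(can_pair(moveq, addq))  # True: no shared registers
print(can_pair(moveq, fadd))  # False: floating point is primary-only
```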
From the description it would seem that it checks in the DS stage whether the instruction can be executed in parallel, which implies there is one FIFO for both execution units. I'd have thought that would make it trickier than a FIFO for each pipeline, but the documentation is what you'd have to go on for a pure clone.
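As a rough illustration of what "one FIFO for both execution units" could look like, here is a Python sketch of in-order dispatch. This is a guess at the general shape, not the documented 68060 logic. The `dispatch` function, the tuple layout and `simple_rule` are all hypothetical.

```python
from collections import deque


def dispatch(fifo, can_pair):
    """Issue instructions in order from a single FIFO, one or two per cycle."""
    cycles = []
    while fifo:
        first = fifo.popleft()            # always goes to the primary pipeline
        if fifo and can_pair(first, fifo[0]):
            # The next entry is independent: send it to the secondary pipeline.
            cycles.append((first, fifo.popleft()))
        else:
            cycles.append((first,))
    return cycles


def simple_rule(a, b):
    """Hypothetical rule: entries are (text, reads, writes) tuples."""
    return not (a[2] & b[1])              # b must not read what a writes


program = deque([
    ("MOVEQ #0,D0", set(), {"D0"}),
    ("TST.W D0", {"D0"}, set()),
    ("MOVEQ #1,D1", set(), {"D1"}),
    ("MOVEQ #2,D2", set(), {"D2"}),
])

schedule = dispatch(program, simple_rule)
for cycle, group in enumerate(schedule):
    print(cycle, [entry[0] for entry in group])
```

Because the check only ever looks at the head of the FIFO and the entry right behind it, nothing is reordered. An instruction that can't pair simply issues alone.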
The manual is largely vague on the FIFO:
"The instruction is pre-decoded for pipeline control information"
"The MC68060 variable-length instruction system is internally decoded into a fixed-length representation and channeled into an instruction buffer."
There are 96 bytes for the FIFO. Someone claims it's 16 entries of 6 bytes each, but even a plain MOVE $10000,$20000 is 10 bytes long, and there is no way you're going to squeeze that into 6 bytes. It's more likely to be 6 entries of 16 bytes or 4 entries of 24 bytes. I can't find anything that suggests that instructions are split into multiple "micro ops", like Intel does.
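The arithmetic behind those layouts is easy to check. A quick Python sketch, assuming the instruction is stored as raw bytes (the manual says it's decoded into some fixed-length representation, so the real entry format is unknown and this is only a sanity check):

```python
FIFO_BYTES = 96

# MOVE $10000,$20000: one opcode word plus two 32-bit absolute addresses.
MOVE_ABS_ABS_BYTES = 2 + 4 + 4

layouts = []
for entries in (16, 6, 4):
    entry_bytes = FIFO_BYTES // entries
    fits = entry_bytes >= MOVE_ABS_ABS_BYTES
    layouts.append((entries, entry_bytes, fits))
    print(f"{entries:2d} entries x {entry_bytes:2d} bytes: "
          f"{'fits' if fits else 'too small for'} a {MOVE_ABS_ABS_BYTES}-byte instruction")
```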
The 68060 cannot execute out of order and doesn't do anything complex like the register renaming that Intel did on the Pentium Pro. It really is the simplest design for dual issue that you can possibly do.
There is no reason why you have to duplicate the functionality exactly. However, if there is documentation available, then it might make sense to do it the same way: they probably spent a while designing it, so it's probably good.