Nice work.
What are the differences between this implementation and the actual arcade hardware? Did the original have a 16-bit bus?
I presume the 1280x1024 VGA output is not generated by the game itself, but by some form of scan doubler/quadrupler that captures the simulated game video hardware's output? If so, I guess adding algorithms such as Scale2x in hardware to this module wouldn't touch the arcade game hardware at all, and the module could be reused for other games in the future?
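
For reference, this is roughly the Scale2x rule I have in mind, as a software sketch in Python rather than anything resembling the hardware. The function name `scale2x`, the list-of-rows image format, and clamping at the image edges are my own assumptions for illustration; a hardware version would presumably work from a couple of line buffers instead.

```python
def scale2x(img):
    """Scale a 2D image (list of rows of pixel values) by 2x using the Scale2x rule.

    Assumption: neighbours outside the image are clamped to the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            e = img[y][x]                      # centre pixel
            b = img[max(y - 1, 0)][x]          # pixel above
            d = img[y][max(x - 1, 0)]          # pixel to the left
            f = img[y][min(x + 1, w - 1)]      # pixel to the right
            hh = img[min(y + 1, h - 1)][x]     # pixel below
            if b != hh and d != f:
                e0 = d if d == b else e        # top-left output pixel
                e1 = f if b == f else e        # top-right output pixel
                e2 = d if d == hh else e       # bottom-left output pixel
                e3 = f if hh == f else e       # bottom-right output pixel
            else:
                e0 = e1 = e2 = e3 = e          # no edge detected: plain pixel doubling
            out[2 * y][2 * x] = e0
            out[2 * y][2 * x + 1] = e1
            out[2 * y + 1][2 * x] = e2
            out[2 * y + 1][2 * x + 1] = e3
    return out
```

Since each output pixel depends only on the source pixel and its four direct neighbours, it looks like it could sit entirely in the scaler stage, after the game's own video output.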