Author Topic: Coldfire status (Read 8520 times)

jdiffend · « **on:** May 12, 2006, 08:00:36 PM »

Ok, do a search and you'll find this discussed by several people in depth in several old threads. The oldest ones probably aren't around anymore (1990ish?) but to summarize:

The Coldfire CPU is missing some 68K instructions, and some that are present behave differently from their 68K equivalents. Current Coldfires also have additional built-in hardware that must be configured at startup so it doesn't conflict with existing Amiga hardware. It was even less feasible on older Coldfire cores, which were missing a few more instructions.

There are ways around all the software problems, but to do this would require a new Amiga exec.library since it executes illegal (on the coldfire) instructions early in the startup code. It's so early that it crashes before it could even transfer to an expansion ROM (actually, it happens *while* it's checking for one).

FWIW, *IF* someone could get past the exec issue it would be the fastest 68K Amiga any of us are likely to see even with the illegal instruction traps. Oh yeah, the latest Amiga ROMs are FULL and the instruction traps require some space.
I also think a couple other devices might need to be native Coldfire code due to the close tie in with the exec. (timer.device?)

I actually gave a commented disassembly of the latest exec to a guy (he gave me the raw disassembly and I commented it) with a V2 Coldfire dev board over a year ago and he was going to port it but I don't know how far he got. Actually, I think it was 2 years ago that Freescale gave away the dev boards so it's been even longer than that.

If someone had access to the Amiga ROM sources this wouldn't be a bad project. Without that you have to reverse engineer things, fix them, etc...

jdiffend · « **Reply #1 on:** May 12, 2006, 08:29:49 PM »

Quote

motorollin wrote:
Well, I suppose a lot of stuff wouldn't need to be emulated (i.e. custom chips). But emulating a 68k processor on a 266MHz CPU probably wouldn't be much fun, and also may not work for non OS-friendly software.

--
moto

You aren't emulating the 68K in the same way as most people think of emulation. Most 68K instructions would be executed natively.

If Motorola's figures for older Coldfire cores are correct (they were based on millions of lines of existing code), 68K code would run about 20% slower than native Coldfire code (that's at least about a 200MHz 68000). And those cores were missing more instructions and address modes than the 4e, which should be even faster.
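As a rough check on that figure, here is a small sketch of the arithmetic. It assumes the 266MHz clock mentioned in the quote above, and it shows both readings of "20% slower" since the post doesn't say which one Motorola meant. The function name is made up for illustration, and the result is only a clock-equivalent for native Coldfire code, not a benchmark.

```python
def effective_clock_mhz(clock_mhz: float, slowdown: float) -> dict:
    """Return the native-equivalent clock under two readings of 'X% slower'."""
    return {
        # Reading 1: throughput drops by the given fraction.
        "throughput_reduced": clock_mhz * (1.0 - slowdown),
        # Reading 2: run time grows by the given fraction.
        "runtime_increased": clock_mhz / (1.0 + slowdown),
    }


if __name__ == "__main__":
    for reading, mhz in effective_clock_mhz(266.0, 0.20).items():
        print(f"{reading}: {mhz:.1f} MHz")
```

Either way the result stays above 200MHz, which is where the "about 200MHz" estimate comes from.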

Also remember that as shared libs and devices are updated to be coldfire native code, all apps that use them will be sped up.

Non OS friendly stuff that won't run on an 060 certainly wouldn't run on a Coldfire.
Also, some OS friendly stuff would need to be patched to run properly on the Coldfire.

jdiffend · « **Reply #2 on:** May 12, 2006, 08:47:03 PM »

Quote

Actually it's more complex than that. It depends on the frequency of the emulated instructions in the specific code being executed, and the method of emulation used.

Motorola analyzed millions of lines of code and a 20% speed reduction was the final number they figured on but some code may be slower or faster. But since most code is generated by a compiler it's unlikely the number will go higher than that.

Quote

IMO to be usable the Dragon m68k emulation should be complete (and not fail with multiplication overflow and such stuff, and it should emulate supervisor mode). To get this you'd need full emulation of both user and supervisor mode. This is not fast on a 266MHz chip.

Software could be written to scan through a program looking for the mul/div instructions where they wouldn't function the same. They could be replaced with an illegal instruction that could be trapped just like the others.
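A minimal sketch of the replacement step, assuming the offsets of the problem instructions have already been found. `patch_with_traps` and the side table are hypothetical names for illustration; `0x4AFC` is the 68K ILLEGAL opcode. Only the opcode word is overwritten, so any extension words stay in place for a trap handler to read.

```python
import struct

ILLEGAL = 0x4AFC  # 68K ILLEGAL opcode; executing it raises an illegal-instruction exception


def patch_with_traps(code: bytes, offsets):
    """Replace the opcode word at each offset with ILLEGAL.

    Returns the patched code plus a side table {offset: original opcode word}
    that a (hypothetical) trap handler could consult to emulate the original.
    """
    patched = bytearray(code)
    originals = {}
    for off in offsets:
        if off % 2 or off + 2 > len(patched):
            raise ValueError(f"bad opcode offset {off}")
        (originals[off],) = struct.unpack_from(">H", patched, off)
        struct.pack_into(">H", patched, off, ILLEGAL)
    return bytes(patched), originals


if __name__ == "__main__":
    # MULS.L D1,D0 / BVS.S +4 / NOP (encodings assumed for the demo)
    demo = bytes.fromhex("4c01080069044e71")
    patched, table = patch_with_traps(demo, [0])
    print(patched.hex(), table)
```

The code length doesn't change, so branch targets and relocations stay valid.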

Most software doesn't use supervisor mode so the changes to the interrupt handling wouldn't be a problem. Games had a habit of breaking on some Amigas back in the day so I wouldn't expect to run them on the Coldfire as is.

Quote

jdiffend already covered some other stuff regarding this, but you can also visit the other threads (use the search function).

I think I end up discussing this about once a year.
Nobody ever seems to use the search.

Quote

Anyway, I'd love to be wrong here, I'd love to see 266MHz m68k, but I believe I won't see that.

Still, Elbox can easily prove me wrong by releasing the thing and having it perform as promised.

Well, I have no doubts they can do the hardware... it's the software that is a PITA. Since the Minimig can load the OS from a device I have no doubts someone could load the Kickstart and patch it.

jdiffend · « **Reply #3 on:** May 12, 2006, 08:58:41 PM »

BTW, whether or not the mul/div is a problem depends on how the compilers generated code. If it wasn't optimal to do the mul/div and then test-branch, it should rarely appear. If it was optimal, it will be all over the place and the code pattern from the compiler should be easy for a program to identify and patch as I suggested.

The full emulation would only be needed if someone couldn't figure out another way or for problem software. If you did do full emulation it would be pretty easy to use just in time compilation and write out the alternate code to execute natively. It wouldn't be like going from 68K to X86. Your registers are the same (on the 4e anyway... stupid supervisor mode on 2-3 cores sucks) and most instructions are the same.

jdiffend · « **Reply #4 on:** May 12, 2006, 09:02:56 PM »

Quote

Doobrey wrote:
I was gonna jump in and say something like ' Easy, override the ROM on a reset with one that inits the coldfire and then go back to the standard kickstart once the traps are setup'

But then the coldstart in exec resets all traps/exceptions to the ones in ROM. So that's the simple trap method up the creek :-x

Actually... that's not far from what I was thinking. Start with an alternate ROM, copy the existing kickstart to RAM, patch it, jump to the kickstart in RAM and turn off the boot ROM. Sorta like a 1000 that patches the ROM after loading it.

jdiffend · « **Reply #5 on:** May 12, 2006, 09:35:47 PM »

Quote

Piru wrote:
That doesn't work well. The problem is that there is no way to determine if some value is code or data. Checksums would also be a problem (just try running xvs.library or elbox drivers when the code has been modified, for example).

The checksums are copy protection and there isn't much you can do about it unless you want to crack the software completely.

BTW, I've seen it done in the past.

jdiffend · « **Reply #6 on:** May 13, 2006, 07:47:23 AM »

Quote

Piru wrote:
It depends on if you want to fail randomly or not. Personally I wouldn't want to have such code in my system, but that's just me.

Scuse me but there is no randomness to it at all.
And good for you.

jdiffend · « **Reply #7 on:** May 13, 2006, 09:39:10 AM »

Quote

utri007 wrote:
Piru is most likely one of the world, he knows much Amiga hardware.

Actually, you need to know the guts of the AmigaOS more than Amiga hardware. All you do with the hardware is exactly what the old OS does. It may take more instructions but it's not that big of a deal. The OS is written to deal with faster CPUs so timing shouldn't be an issue within the OS.

:roll: I'd at least want someone working on it who knows that Amiga executables are normally stored on disk with separate hunks for code, data and bss sections. It's easy to have a program scan through an executable file, find a code hunk, search for the numeric pattern of an illegal instruction and then patch it. There was a program that did this for a 68000 instruction that was used in some early Amiga software and became privileged (so illegal in user code) on the 68010 or higher. But that was before the Amiga 500 was even introduced, so he may not remember it. Still, the hunk info is even on a Wiki page.
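A rough sketch of that kind of scanner, assuming a plain AmigaDOS load file that contains only header, code, data, bss, reloc32 and end hunks. Symbol, debug and overlay hunks are not handled and raise an error. The function names are made up for illustration, and the example pattern (`0x40C0` with mask `0xFFC0`, the 68000 `MOVE SR,<ea>` encoding) is my choice; the post does not name the instruction.

```python
import struct

HUNK_CODE, HUNK_DATA, HUNK_BSS = 0x3E9, 0x3EA, 0x3EB
HUNK_RELOC32, HUNK_END, HUNK_HEADER = 0x3EC, 0x3F2, 0x3F3


def code_hunks(image: bytes):
    """Yield (file_offset, length_in_bytes) for each HUNK_CODE payload."""
    pos = 0

    def read_long():
        nonlocal pos
        (value,) = struct.unpack_from(">I", image, pos)
        pos += 4
        return value

    if read_long() != HUNK_HEADER:
        raise ValueError("not an AmigaDOS load file")
    while (n := read_long()) != 0:      # resident library names, normally empty
        pos += 4 * n
    read_long()                          # hunk table size
    first, last = read_long(), read_long()
    pos += 4 * (last - first + 1)        # skip the per-hunk size table

    while pos < len(image):
        kind = read_long() & 0x3FFFFFFF  # top two bits carry memory flags
        if kind in (HUNK_CODE, HUNK_DATA):
            size = 4 * (read_long() & 0x3FFFFFFF)
            if kind == HUNK_CODE:
                yield pos, size
            pos += size
        elif kind == HUNK_BSS:
            read_long()                  # size only, no payload on disk
        elif kind == HUNK_RELOC32:
            while (count := read_long()) != 0:
                pos += 4 * (count + 1)   # target hunk number plus the offsets
        elif kind == HUNK_END:
            continue
        else:
            raise ValueError(f"unhandled hunk type {kind:#x}")


def find_opcode(image: bytes, opcode: int, mask: int = 0xFFFF):
    """Return file offsets of word-aligned matches inside code hunks only."""
    hits = []
    for start, size in code_hunks(image):
        for off in range(start, start + size, 2):
            (word,) = struct.unpack_from(">H", image, off)
            if (word & mask) == opcode:
                hits.append(off)
    return hits
```

Something like `find_opcode(open(path, "rb").read(), 0x40C0, 0xFFC0)` would list candidate offsets. Each hit is only a candidate, since a word-aligned match can still be data or an extension word of another instruction.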

Quote

So would you like to write a new Amiga exec for Coldfire?

Actually, there is a code translator that can convert most 68K assembly to Coldfire code, but last I checked it still hadn't been updated to support the 4e core, which requires fewer changes. Whenever the converter doesn't know what to do it marks the code and you can do it by hand in those spots. I tested it on some code but not the exec, due to differences in the supervisor register model in the Coldfire chips it supports. Without support for the 4e core's addition to the supervisor mode the translator is almost useless on supervisor code. At least as far as the Amiga exec goes... it depends heavily on the register that is missing on all but the 4e Coldfires.

However, with the disassembly I have and an updated translator the exec itself wouldn't be that bad. The most work to the exec would be to support the stack frame differences for interrupts and integration of the instruction traps. There's still probably more than a month of work there. Without the conversion tool it would take a very long time. The exec has a lot of code.

At that point the Coldfire could run through the self tests, setup the timer device (which would also need to be native Coldfire code), set up memory, autoconfig devices, setup graphics/intuition and move to the bootup sequence.
Then you have to see what fails and fix that.

Quote

We need a positive attitude in this dark time.

I'd settle for a lot less FUD.

jdiffend · « **Reply #8 on:** May 13, 2006, 08:14:09 PM »

Quote

platon42 wrote:
@jdiffend:

Sorry to interfere, but I'd suggest you check Harry 'Piru' Sintonen's background and references before you continue to doubt the things he says. I suppose only a handful of people (if any) know more about the Amiga internals than Piru.

Nobody is infallible and it's pretty obvious he favors some CPU besides the Coldfire.

Piru has obviously done a lot with the miggy from some of the posts I've read elsewhere but I disagree with him for a reason. I spent a lot of time looking at what would be required to do this.

FWIW, I have a background with the Amiga, software engineering, embedded systems and hardware. My opinions are based on experience and research.

Quote

And no, there is *no* way to find out if a word in a data or code section is code or data -- other than doing a full CPU emulation and stepping through the code that's reached (and still this will not yield the sections that contain dead or exceptional code). Though there are code and data hunks in an executable, nothing ever has prevented a coder to mix the contents at his will.

It was a simplistic cheap shot but it has some bearing on the discussion even if it wasn't totally valid.

Yes, it's true you can have data in code segments. You can also put code in data segments if you really want. Since you are scanning code blocks, the more likely mistake is to accidentally patch what looks like one instruction but is actually a combination of parts of two others.

Those are the possibilities. However, reality has a few things in our favor, so you don't have to identify it as code or data. This has more to do with the odds of failure than the possibility that it will fail. It will always be possible for a patcher to fail.

Remember, I'm also talking specifically about the math instructions that are legal but behave differently. Also remember that not all math instructions will need to be patched. Trappable instructions don't need to be patched. Those conditions mean there is a pretty limited number of instructions that may need patching to begin with.

This also has a lot to do with the general nature of code and data. If you look through a data segment you'll see a lot of zeros, $ff, $0f... stuff like that at random intervals. The odds of a data pattern in a code block matching one of the math instructions in a legal manner are very low, but it's certainly possible.
The odds of it being followed by data matching a legal branch-on-condition-code instruction (the case where it actually needs to be patched) are even lower.
If you decode a couple more instructions in the sequence to see if they are legal then the odds of failure are lower still.
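A minimal sketch of that kind of context check. It assumes the sequence of interest is a long multiply with a data-register source (`MULU.L`/`MULS.L Dn,Dl`, 32-bit result) immediately followed by `BVC` or `BVS`. That choice of pattern is my assumption, based on the multiplication overflow case mentioned earlier in the thread, and the function name is made up.

```python
import struct


def looks_like_long_mul_then_overflow_branch(code: bytes, off: int) -> bool:
    """True if code[off:] decodes as MULU.L/MULS.L Dn,Dl followed by BVC/BVS."""
    if off % 2 or off + 6 > len(code):
        return False
    op, ext, nxt = struct.unpack_from(">HHH", code, off)
    is_mul_l_reg = (op & 0xFFF8) == 0x4C00   # MULx.L with a data-register source
    ext_is_legal = (ext & 0x87F8) == 0x0000  # must-be-zero bits clear, 32-bit result
    is_bvc_bvs = (nxt & 0xFE00) == 0x6800    # Bcc with condition VC or VS
    return is_mul_l_reg and ext_is_legal and is_bvc_bvs


if __name__ == "__main__":
    sample = bytes.fromhex("4c0108006904")   # MULS.L D1,D0 / BVS.S +4
    print(looks_like_long_mul_then_overflow_branch(sample, 0))
```

Each extra word that has to decode legally cuts down the number of data patterns that can slip through. Other addressing modes and the divide forms would need their own masks.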

If you look through the bytes making up a program you will also notice that you may find instructions made up from part of the bytes of two instructions. However, it's usually soon followed by an illegal instruction as well. You will also notice frequently used sequences of bytes. Since compiler output is pretty consistent, it should be easier to identify safely patchable sequences of code. Assembly would be less predictable and may suffer a higher failure rate.

Code in a data segment will cause this to fail. If a game loads in some code specific to a level as data or from a data file it would never be patched. But then I said games would be a problem.

This kind of patcher isn't about making ALL software work, it's about making the MOST software work for the least effort. The more intelligence you add to the patcher the more reliable it will be.

Quote

And also with my own technical background, I don't see how a ColdFire board could ever *work* (regardless of the performance) without full CPU emulation, especially regarding those non-compatible instructions with different behaviour or side-effects, that cannot be trapped and then emulated on the fly.

I never said you wouldn't need full emulation for some software. Games will probably require it since many have compatibility issues anyway.

The instructions that can't be trapped but have different behavior are exactly what I was talking about patching.

Quote

My conclusion is: The Dragon is never going to work in an amiga system, or, if it does, it would be using full CPU emulation (possibly with JIT?*), and hence, unbearably slow -- much slower than an MC68060/040.

I'm commenting strictly on the feasibility of using the Coldfire 4e attached to an Amiga... not on the Dragon itself. That gets into design specific details and I have no technical info on the Dragon.

Full emulation without JIT would be slow, but remember, you don't have to emulate the hardware as well as the CPU, and OS calls would be native. JIT would be fast enough to run the real problem software like games, which usually required a 68000 or 68ec020.
The emulator would *not* be needed for most software outside of games if someone spent any amount of time working on the patcher.

Quote

* I don't think Elbox has enough technically skilled people to write something like a JIT compiler.

That would not surprise me but I don't have any relationship with them so I can't say.
Without any real details about their hardware I wouldn't even want to guess.

jdiffend · « **Reply #9 on:** May 13, 2006, 08:30:05 PM »

Quote

FrenchShark wrote:
Hello,

I think you are talking about me :-)

Yup. Good to hear from you.

Quote

I do have the exec.library ported to the coldfire 5282 (it was the first coldfire to have the USP/SSP implementation).
This coldfire exec.library is not compatible with the 68k exec.library because of some differences in ExecBase!

Uh oh... here it comes.

Quote

For example, IDNestCnt and TDNestCnt have to be LONG since addq.b does not exist on CF and you cannot replace the addq.b by multiple instructions : the update of IDNestCnt and TDNestCnt must be done with one atomic instruction.

I'd have to look to see if the 4e supports addq.b. I know it supports more address/data modes but which specific modes I can't remember.

Quote

I have also a piece of timer.device translated into CF, it is necessary for the time slice part of the multi-tasking.

Yeah, I saw the tie in very early in the exec code. If I had a commented version of the timer.device I'd pass it on, but I haven't touched the disassembly files at all since I sent the exec to you.

Quote

Oh, BTW I had a thought about how to provide a full CPU emulation on the Coldfire: we can use the trace mode and check if the instruction is compatible. When it is compatible you exit the exception and let the CF do the work; when it is not, you execute the emulated code. You can also use the debugging registers to separate the memory into two areas: one containing 68k code, one containing CF code (CF kickstart, libraries compiled for CF, etc...)

Regards,

Frederic

That would definitely work... slowly, but it would work. I think someone suggested that once but I can't remember who or in what group.
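To show why the trace-mode approach is slow, here is a toy model of the dispatch logic in Python. It is not Coldfire code, and `trace_dispatch`, `is_compatible` and `emulate` are hypothetical names. The point is only that every instruction costs one exception, whether or not it needs emulating.

```python
from typing import Callable, Iterable


def trace_dispatch(opcodes: Iterable[int],
                   is_compatible: Callable[[int], bool],
                   emulate: Callable[[int], None]) -> dict:
    """Model a trace handler that inspects every instruction before it runs."""
    stats = {"exceptions": 0, "native": 0, "emulated": 0}
    for op in opcodes:
        stats["exceptions"] += 1   # one trace exception per instruction boundary
        if is_compatible(op):
            stats["native"] += 1   # handler returns and the CPU executes it itself
        else:
            emulate(op)            # handler performs the instruction in software
            stats["emulated"] += 1
    return stats


if __name__ == "__main__":
    seen = []
    result = trace_dispatch([0x4E71, 0x4C01, 0x4E71],
                            lambda op: op != 0x4C01,
                            seen.append)
    print(result, [hex(op) for op in seen])
```

The exception count always equals the instruction count, so the per-exception cost is the bottleneck no matter how few instructions are actually incompatible.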

jdiffend · « **Reply #10 on:** May 13, 2006, 08:34:34 PM »

Quote

I tried this approach once on 68060@64. The result was slower than 68000@7 (constant exceptions really kill the performance it seems, must be the stack memory accesses).

Maybe coldfire 5282 exceptions aren't as slow as 68060, though?

The exception hit wouldn't be as bad just from the higher MHz and with the normally higher performance of the Coldfire. It should make 68000-7 emulation acceptable but fast it won't be.

jdiffend · « **Reply #11 on:** May 13, 2006, 08:41:57 PM »

Quote

It's the stack memory accesses, plus flushing the pipeline, plus thrashing the instruction cache. Given that faster processors generally have longer pipelines I doubt the situation would be any better on Coldfire; it would most likely be worse.

The Coldfire doesn't have a deep pipeline like the P4. Part of the reason it doesn't support the more complex address modes is to keep the pipeline shorter. It is probably at least 4 or 5 stages deep though.

jdiffend · « **Reply #12 on:** May 14, 2006, 01:31:19 AM »

Quote

Quote
For example, IDNestCnt and TDNestCnt have to be LONG since addq.b does not exist on CF and you cannot replace the addq.b by multiple instructions : the update of IDNestCnt and TDNestCnt must be done with one atomic instruction.

I'd have to look to see if the 4e supports addq.b. I know it supports more address/data modes but which specific modes I can't remember.

The manual indicates that the 4e core does have the additional features they originally published for it 5(?) years ago. It has the higher code compatibility and density they originally promised but the 4e manual just refers to the standard coldfire series manual when you get to the instruction set section and I can't be sure if the addq.b instruction is among the additions.

It's been too long since I looked at my exec disassembly to remember if multiple instructions would be an issue here or not so I really don't know.

BTW, the chip uses a 5 stage execution pipeline and it has a 4 stage prefetch pipeline to help prevent stalls.
