Author Topic: Coldfire status  (Read 8508 times)

Offline Piru

  • \' union select name,pwd--
  • Hero Member
  • *****
  • Join Date: Aug 2002
  • Posts: 6946
    • http://www.iki.fi/sintonen/
Re: Coldfire status
« Reply #44 from previous page: May 13, 2006, 06:50:09 PM »
Yeah. The 68060 spends 19 cycles per instruction on the trace exception (1 read, 3 write cycles), and that is not counting the actual exception code itself (except fetching the first instruction).

Considering that the 68060 normally spends only a couple of cycles per instruction on average, this is quite slow indeed.
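
A rough back-of-the-envelope sketch of that slowdown. The 19-cycle overhead is the figure quoted above; taking "a couple of cycles" to mean 2 cycles per instruction is an assumption, and the handler's own code is not counted:

```python
# Rough estimate of the slowdown when every instruction takes a trace exception.
TRACE_OVERHEAD_CYCLES = 19   # per-instruction trace exception cost quoted above
AVG_CYCLES_PER_INSN = 2      # assumption: "a couple of cycles" taken as 2


def traced_slowdown(overhead=TRACE_OVERHEAD_CYCLES, avg=AVG_CYCLES_PER_INSN):
    """Return how many times slower traced execution is than normal execution."""
    return (overhead + avg) / avg


if __name__ == "__main__":
    print(f"roughly {traced_slowdown():.1f}x slower before any handler code runs")
```

With those numbers the trace overhead alone costs about an order of magnitude.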
 

Offline jdiffend

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 302
Re: Coldfire status
« Reply #45 on: May 13, 2006, 08:14:09 PM »
Quote

platon42 wrote:
@jdiffend:

Sorry to interfere, but I'd suggest you check Harry 'Piru' Sintonen's background and references before you continue to doubt the things he says. I suppose only a handful of people (if any) know more about the Amiga internals than Piru.

Nobody is infallible and it's pretty obvious he favors some CPU besides the Coldfire.

Piru has obviously done a lot with the miggy from some of the posts I've read elsewhere but I disagree with him for a reason.  I spent a lot of time looking at what would be required to do this.

FWIW, I have a background with the Amiga, software engineering, embedded systems and hardware.  My opinions are based on experience and research.

Quote
And no, there is *no* way to find out if a word in a data or code section is code or data -- other than doing a full CPU emulation and stepping through the code that's reached (and still this will not yield the sections that contain dead or exceptional code). Though there are code and data hunks in an executable, nothing ever has prevented a coder to mix the contents at his will.

It was a simplistic cheap shot but it has some bearing on the discussion even if it wasn't totally valid.  

Yes, it's true you can have data in code segments.  You can also put code in data segments if you really want.  And since you are scanning code blocks, the more likely mistake is to accidentally patch what you think is one instruction but is actually a combination of parts of two others.

Those are the possibilities.  However, reality has a few things in our favor, so you don't have to identify it as code or data.  This has more to do with the odds of failure than with the possibility that it will fail.  It will always be possible for a patcher to fail.

Remember, I'm also talking specifically about the math instructions that are legal but have different behavior.  Also remember that not all math instructions will need to be patched.  Trappable instructions don't need to be patched.  Those conditions mean there is a pretty limited number of instructions that may need to be patched to begin with.

This also has a lot to do with the general nature of code and data.  If you look through a data segment you'll see a lot of zeros, $ff, $0f... stuff like that at random intervals.  The odds of a data pattern in a code block matching one of the math instructions in a legal manner is very low but certainly possible.
The odds of it being followed by data matching a legal branch on condition code instruction (the conditions where it actually needs patched) is even lower.
If you decode a couple more instructions in the sequence to see if they are legal then the odds of failure are lower still.
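
A minimal sketch of that first filter, purely as an illustration. It assumes the "math" opcodes of interest are the single-word, register-direct MULU.W/MULS.W forms (a stand-in I picked, not the actual list of instructions that behave differently), and it only checks for a conditional branch in the very next word. The names `MATH_MASK`, `MATH_PATTERN` and `find_patch_candidates` are made up for the example, and the further "decode a couple more instructions" check is left out:

```python
import struct

# Assumption for illustration only: single-word, register-direct
# MULU.W / MULS.W (1100 rrr x11 000 rrr). The real list of opcodes
# that need patching would have to come from the Coldfire manuals.
MATH_MASK = 0xF0F8
MATH_PATTERN = 0xC0C0


def is_conditional_branch(word):
    """Bcc: top nibble 0110, condition field 2..15 (0 is BRA, 1 is BSR)."""
    return (word & 0xF000) == 0x6000 and ((word >> 8) & 0xF) >= 2


def find_patch_candidates(code):
    """Return byte offsets of math opcodes directly followed by a Bcc."""
    count = len(code) // 2
    words = struct.unpack(f">{count}H", code[:count * 2])
    hits = []
    for i in range(count - 1):
        is_math = (words[i] & MATH_MASK) == MATH_PATTERN
        if is_math and is_conditional_branch(words[i + 1]):
            hits.append(i * 2)
    return hits


if __name__ == "__main__":
    # NOP, MULU.W D1,D0, BVS, NOP, MULU.W D1,D0, NOP, then data-like words
    sample = bytes.fromhex("4e71c0c169024e71c0c14e710000ffff")
    print(find_patch_candidates(sample))
```

Note that this scans word by word rather than instruction by instruction, so it can still match an extension word or plain data; that is exactly the failure case being weighed here.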

If you look through the bytes making up a program you will also notice that you may find instructions made up from part of the bytes of two instructions.  However, such a false match is usually soon followed by an illegal instruction as well.  You will also notice frequently used sequences of bytes.  Since compiler output is pretty consistent it should make it easier to identify safely patchable sequences of code.  Assembly would be less predictable and may suffer a higher failure rate.

Code in a data segment will cause this to fail.  If a game loads in some code specific to a level as data or from a data file it would never be patched.  But then I said games would be a problem.

This kind of patcher isn't about making ALL software work, it's about making the MOST software work for the least effort.  The more intelligence you add to the patcher the more reliable it will be.

Quote
And also with my own technical background, I don't see how a ColdFire board could ever *work* (regardless of the performance) without full CPU emulation, especially regarding those non-compatible instructions with different behaviour or side-effects, that cannot be trapped and then emulated on the fly.

I never said you wouldn't need full emulation for some software.  Games will probably require it since many have compatibility issues anyway.

The instructions that can't be trapped but have different behavior are exactly what I was talking about patching.

Quote
My conclusion is: The Dragon is never going to work in an amiga system, or, if it does, it would be using full CPU emulation (possibly with JIT?*), and hence, unbearably slow -- much slower than an MC68060/040.

I'm commenting strictly on the feasibility of using the Coldfire 4e attached to an Amiga... not on the Dragon itself.  That gets into design specific details and I have no technical info on the Dragon.

Full emulation without JIT would be slow, but remember, you don't have to emulate the hardware as well as the CPU, and OS calls would be native.  JIT would be fast enough to run the real problem software like games, which usually required a 68000 or 68ec020.
The emulator would *not* be needed for most software outside of games if someone spent any amount of time working on the patcher.

Quote
* I don't think Elbox has enough technically skilled people to write something like a JIT compiler.

That would not surprise me but I don't have any relationship with them so I can't say.
Without any real details about their hardware I wouldn't even want to guess.
 

Offline jdiffend

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 302
Re: Coldfire status
« Reply #46 on: May 13, 2006, 08:30:05 PM »
Quote

FrenchShark wrote:
Hello,

I think you are talking about me :-)

Yup.  Good to hear from you.

Quote
I do have the exec.library ported to the coldfire 5282 (it was the first coldfire to have the USP/SSP implementation).
This coldfire exec.library is not compatible with the 68k exec.library because of some differences in ExecBase!


Uh oh... here it comes.

Quote
For example, IDNestCnt and TDNestCnt have to be LONG since addq.b does not exist on CF and you cannot replace the addq.b by multiple instructions : the update of IDNestCnt and TDNestCnt must be done with one atomic instruction.

I'd have to look to see if the 4e supports addq.b.  I know it supports more address/data modes but which specific modes I can't remember.

Quote
I have also a piece of timer.device translated into CF, it is necessary for the time slice part of the multi-tasking.

Yeah, I saw the tie-in very early in the exec code.  If I had a commented version of the timer.device I'd pass it on, but I haven't touched the disassembly files at all since I sent the exec to you.

Quote
Oh, BTW I had a thought about how to provide a full CPU emulation on the Coldfire: we can use the trace mode and check if the instruction is compatible. When it is compatible you exit the exception and let the CF do the work; when it is not, you execute the emulated code. You can also use the debugging registers to separate the memory into two areas: one containing 68k code, one containing CF code (CF kickstart, libraries compiled for CF, etc...)

Regards,

Frederic

That would definitely work... slowly, but it would work.  I think someone suggested that once but I can't remember who or in what group.
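
To make the cost of the trace-mode idea a bit more concrete, here is a toy model. Every instruction pays for the trace exception, then either runs natively or gets emulated. All three cycle figures and the `is_compatible` check are hypothetical placeholders, not numbers from any Coldfire manual:

```python
# Toy model of the trace-mode idea: every instruction raises a trace
# exception; the handler either returns (native execution) or emulates.
# All cycle figures below are made-up placeholders, not measurements.
TRACE_COST = 20      # hypothetical cycles to enter and leave the handler
NATIVE_COST = 2      # hypothetical cycles for an instruction run natively
EMULATED_COST = 50   # hypothetical cycles to emulate one instruction


def run_traced(instructions, is_compatible):
    """Return (native_count, emulated_count, total_cycles) for a stream."""
    native = emulated = cycles = 0
    for insn in instructions:
        cycles += TRACE_COST              # handler runs for every instruction
        if is_compatible(insn):
            native += 1
            cycles += NATIVE_COST         # handler exits, CPU executes it
        else:
            emulated += 1
            cycles += EMULATED_COST       # handler emulates it in software
    return native, emulated, cycles


if __name__ == "__main__":
    stream = ["move", "add", "mulu", "bvs"]
    print(run_traced(stream, lambda insn: insn != "mulu"))
```

Even when nearly everything is compatible, the per-instruction handler cost dominates the total, which is why it works but slowly.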
 

Offline jdiffend

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 302
Re: Coldfire status
« Reply #47 on: May 13, 2006, 08:34:34 PM »
Quote
I tried this approach once on 68060@64. The result was slower than 68000@7 (constant exceptions really kill the performance it seems, must be the stack memory accesses).

Maybe coldfire 5282 exceptions aren't as slow as 68060, though?

The exception hit wouldn't be as bad, just from the higher MHz and the normally higher performance of the Coldfire.  It should make 68000-7 emulation acceptable, but fast it won't be.
 

Offline jdiffend

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 302
Re: Coldfire status
« Reply #48 on: May 13, 2006, 08:41:57 PM »
Quote

It's the stack memory accesses, plus flushing the pipeline, plus thrashing the instruction cache. Given that faster processors generally have longer pipelines I doubt the situation would be any better on Coldfire; it would most likely be worse.

The Coldfire doesn't have a deep pipeline like the P4.  Part of the reason it doesn't support the more complex address modes is to keep the pipeline shorter.  It is probably at least 4 or 5 stages deep though.
 

Offline Georg

  • Jr. Member
  • **
  • Join Date: Feb 2002
  • Posts: 90
Re: Coldfire status
« Reply #49 on: May 13, 2006, 10:32:11 PM »
Quote
For example, IDNestCnt and TDNestCnt have to be LONG since addq.b does not exist on CF and you cannot replace the addq.b by multiple instructions : the update of IDNestCnt and TDNestCnt must be done with one atomic instruction.


I used to think so, too, but then started to suspect that they probably do not need to be atomic. The reason is that across context switches (task - task, task - interrupt) they are saved/restored, so if a context is left and later returned to, the IDNestCnt/TDNestCnt value is the same as it was when the context was left.

They would need to be atomic if a context switch could cause a change of value.

context 1:
  read idnestcnt to register
  inc register
  [context switch to something else]
  [context switch back to here]
  /* no problem, as idnestcnt is same as when context was left */
  write register to idnestcnt
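
The same sequence as a small runnable model, just to check the reasoning. It assumes the counter really is saved and restored on every context switch, as described above; the `Context` and `Machine` classes and the starting values are invented for the example and are not taken from exec:

```python
class Context:
    def __init__(self, saved_idnestcnt):
        self.saved_idnestcnt = saved_idnestcnt  # per-context saved counter


class Machine:
    def __init__(self, first):
        self.current = first
        self.idnestcnt = first.saved_idnestcnt  # the "live" counter

    def switch_to(self, other):
        self.current.saved_idnestcnt = self.idnestcnt  # save outgoing value
        self.idnestcnt = other.saved_idnestcnt         # restore incoming value
        self.current = other


def demo():
    ctx1, ctx2 = Context(-1), Context(3)   # arbitrary starting counts
    m = Machine(ctx1)
    reg = m.idnestcnt        # read idnestcnt to register
    reg += 1                 # inc register
    m.switch_to(ctx2)        # interrupted before the write-back
    m.idnestcnt += 5         # the other context changes its own count
    m.switch_to(ctx1)        # back here: live value is ctx1's old value again
    m.idnestcnt = reg        # write register to idnestcnt
    return m.idnestcnt, ctx2.saved_idnestcnt


if __name__ == "__main__":
    print(demo())
```

Neither context loses an update here, but only because of the save/restore assumption. If a switch could leave a changed value behind, the split read/increment/write would no longer be safe.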


 
 

Offline boing

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 293
    • http://www.TribeOfHeart.org
Amateurs
« Reply #50 on: May 13, 2006, 11:19:17 PM »
Three pages of mental masturbation.

You shouldn't even begin to speculate like this until well after you've looked at the registers, instructions and addressing modes for the specific chip used by Elbox.

Pontificate after doing that. You know who you are.


 

Offline jdiffend

  • Sr. Member
  • ****
  • Join Date: Apr 2002
  • Posts: 302
Re: Coldfire status
« Reply #51 on: May 14, 2006, 01:31:19 AM »
Quote

Quote
For example, IDNestCnt and TDNestCnt have to be LONG since addq.b does not exist on CF and you cannot replace the addq.b by multiple instructions : the update of IDNestCnt and TDNestCnt must be done with one atomic instruction.

I'd have to look to see if the 4e supports addq.b.  I know it supports more address/data modes but which specific modes I can't remember.

The manual indicates that the 4e core does have the additional features they originally published for it 5(?) years ago.  It has the higher code compatibility and density they originally promised but the 4e manual just refers to the standard coldfire series manual when you get to the instruction set section and I can't be sure if the addq.b instruction is among the additions.

It's been too long since I looked at my exec disassembly to remember if multiple instructions would be an issue here or not so I really don't know.

BTW, the chip uses a 5 stage execution pipeline and it has a 4 stage prefetch pipeline to help prevent stalls.
 

Offline Piru

  • \' union select name,pwd--
  • Hero Member
  • *****
  • Join Date: Aug 2002
  • Posts: 6946
    • http://www.iki.fi/sintonen/
Re: Coldfire status
« Reply #52 on: November 13, 2006, 02:28:41 PM »
Quote
Anyway, I'd love to be wrong here, I'd love to see 266MHz m68k, but I believe I won't see that.

...and it seems I wasn't wrong. Elbox themselves quote performance between 040 and 060. From the videos it looks more on the 040 side.