Author Topic: FPGA Replay Board (Read 953060 times)

ChaosLord · « **Reply #359 on:** March 28, 2011, 06:31:34 PM »

Ok so start the tracer in the kickstart.

But how can the JIT tracing work inside of interrupts?

freqmax · « **Reply #360 on:** March 28, 2011, 07:17:13 PM »

It ought to be possible to create a working 68060 softcore, though it might be slow. A 68020 uses about 200k transistors, while the 68060 uses about 2500k. Assume size scales linearly with transistor count. According to yaqube, the 68020 softcore takes about 60% of a Xilinx Spartan-3E 1200, which has 19k logic cells in total. Using two of the newer Spartan-6 XC6SLX75 parts, at 120 USD each and with 75k logic cells each, it should be possible to implement a 68060 softcore with the free Xilinx ISE WebPACK synthesis software.
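The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch that takes the figures in the post at face value and assumes, as the post does, that FPGA resource use scales linearly with transistor count. The function and parameter names are made up for illustration, and the cost of splitting one core across two chips is ignored.

```python
def estimate_68060_fit(
    m68020_transistors=200_000,
    m68060_transistors=2_500_000,
    spartan3e_capacity=19_000,   # capacity figure quoted in the post
    m68020_share_percent=60,     # share of the Spartan-3E used by the 68020 core
    spartan6_capacity=75_000,    # per XC6SLX75, as quoted in the post
    spartan6_count=2,
):
    """Scale the 68020 softcore's footprint linearly up to a 68060."""
    scale = m68060_transistors / m68020_transistors
    m68020_footprint = spartan3e_capacity * m68020_share_percent / 100
    m68060_footprint = m68020_footprint * scale
    available = spartan6_capacity * spartan6_count
    return {
        "scale": scale,
        "m68060_footprint": m68060_footprint,
        "available": available,
        "utilisation": m68060_footprint / available,
    }
```

With the quoted figures this comes to a footprint of 142.5k against 150k available, or about 95% utilisation. That leaves very little headroom even before routing and chip-to-chip overhead are considered.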

Any takers?

Iggy · « **Reply #361 on:** March 28, 2011, 07:24:44 PM »

Quote from: ChaosLord;625339

You make it sound very very very easy.

If it is that easy then why couldn't Elbox get it working at high speed on their Dragon?

What do you do about games that don't use LoadSeg() ?

Chaos brings up something that has bothered me from the start. Not only are Coldfire processors missing some 68K instructions, but other instructions don't work exactly the same way as they do on a 68K.
If everything has to run through a JIT interpreter, then the performance hit may nullify the speed advantage.

A real '060 or an FPGA emulated 68K processor may have a performance advantage over a Coldfire processor running 68K code through a JIT interpreter.

psxphill · « **Reply #362 on:** March 28, 2011, 07:35:08 PM »

Quote from: vidarh;625337

You misunderstand what I was describing. I was describing a purely software solution similar to a JIT (just in time compiler)

JIT may work, but it'll be slower & need more ram. Dealing with self modifying code is especially tricky.
Code accessed through jump tables is also difficult to find until you actually get to it.

There is a difference between writing a JIT for a language that was designed for it & one that isn't.

While I don't think it's as easy or good as you think, I'd love to see how it works out.

Iggy · « **Reply #363 on:** March 28, 2011, 07:39:08 PM »

Quote from: psxphill;625363

JIT may work, but it'll be slower & need more ram. Dealing with self modifying code is especially tricky.

Frankly, that is a trick I always avoided. A true sign of bad programming.

A few OSes I've worked with essentially forbid self modifying code.

psxphill · « **Reply #364 on:** March 28, 2011, 08:25:04 PM »

Quote from: Iggy;625366

Frankly, that is a trick I always avoided. A true sign of bad programming.

Self modifying code is fine as long as you clear the CPU caches afterwards, so as long as you ditch the JIT cache when the caches are cleared, it'll work.

How about pushing an address on the stack and then returning?

Although rts will need to cope anyway as the code you're returning to might have been flushed if your jit cache fills up. So it will have to always do a lookup to find the real code.
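The lookup described above can be sketched as follows. This is a toy model under stated assumptions, not how any particular JIT does it: `TranslationCache`, `emulate_rts` and the `translate` callback are hypothetical names, the stack holds original (untranslated) return addresses, and the eviction policy is simply to drop everything when the cache is full.

```python
class TranslationCache:
    """Toy JIT cache keyed by the address of the original code."""

    def __init__(self, capacity, translate):
        self.capacity = capacity
        self.translate = translate   # hypothetical: original address -> translated block
        self.blocks = {}

    def flush(self):
        """Drop every translation, e.g. when the guest clears the CPU caches."""
        self.blocks.clear()

    def lookup(self, original_address):
        """Return the translated block, translating again if it was flushed."""
        if original_address not in self.blocks:
            if len(self.blocks) >= self.capacity:
                self.flush()         # cache full: simplest policy is to drop everything
            self.blocks[original_address] = self.translate(original_address)
        return self.blocks[original_address]


def emulate_rts(stack, cache):
    """Pop an original return address and resolve it through the cache."""
    return cache.lookup(stack.pop())
```

Because the stack only ever holds original addresses, a return still works after a flush. The price is one lookup per return.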

vidarh · « **Reply #365 on:** March 28, 2011, 09:01:43 PM »

Quote from: psxphill;625363

JIT may work, but it'll be slower & need more ram.

Slower and need more RAM than what? The alternative is to not run the application at all, or run it under emulation. For stuff you have source to, recompiling it is the better alternative.

In terms of RAM, unless the code is self-modifying, you can get away with only very minor amounts for cases like m68k to CF, in order to patch in emulation of instructions that are not supported and that can't be replaced in-line with code of the same size.

You only need to maintain two copies of the code *if* you need to deal with self modifying code. A simple solution is to not deal with it in ordinary cases, and possibly not at all (frankly, given the small, finite amount of legacy code relying on self modifying code, it's probably better to spend the time patching the few programs that do).

Quote

Code accessed through jump tables is also difficult to find until you actually get to it.

Exactly, and that's the reason to do a tracing JIT instead of static translator.

With a tracing JIT it's easy, as you'll always hit a breakpoint when the branch should happen until all paths have been completely traced. A major point of a tracing JIT as opposed to a method based JIT is exactly to make it trivial to handle control flow.

Quote

There is a difference between writing a JIT for a language that was designed for it & one that isn't.

Yes, but in this case it's vastly *easier*, as the mapping function for the vast majority of instructions is simply the identity function (that is, nothing is done other than to skip to the next instruction).
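A minimal sketch of that "mostly identity" mapping, assuming the code has already been decoded into a list of mnemonics. A real translator would work on encoded instruction bytes and would have to make each replacement fit the size of the original instruction. `patch_unsupported` and the `emulate_*` handler names are hypothetical, and `MOVEP` in the usage note is only an illustrative stand-in for an instruction assumed missing on the target.

```python
def patch_unsupported(code, unsupported):
    """Rewrite only the instructions the target CPU lacks; leave the rest untouched.

    code is a mutable list of mnemonics and is patched in place.
    Returns the indexes that were patched.
    """
    patched = []
    for index, mnemonic in enumerate(code):
        if mnemonic in unsupported:
            # Replace the instruction with a call to an emulation handler.
            code[index] = f"JSR emulate_{mnemonic.lower()}"
            patched.append(index)
    return patched
```

For example, patching `["MOVE", "ADD", "MOVEP", "RTS"]` with `{"MOVEP"}` as the unsupported set changes only the third entry, so the extra memory needed is limited to the handlers themselves.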

The existence of JITs that JIT m68k to i386 or PPC demonstrates a worst case bound where all instructions need to be JIT'd. Yet there are decently performing JITs that do that. For m68k to Coldfire the case is far simpler.

vidarh · « **Reply #366 on:** March 28, 2011, 09:07:03 PM »

Quote from: psxphill;625374

Self modifying code is fine as long as you clear the CPU caches afterwards, so as long as you ditch the JIT cache when the caches are cleared, it'll work.

Having to clear the cache is another reason why it's seen as bad practice, beyond being horribly unmaintainable. There's a reason people pretty much stopped doing it in the mid 80's.

Quote

How about pushing an address on the stack and then returning?

Simple enough to detect.

Quote

Although rts will need to cope anyway as the code you're returning to might have been flushed if your jit cache fills up. So it will have to always do a lookup to find the real code.

There would be no "jit cache" - that's the entire point of how to make it fast and simple - you'd patch the live code directly, so no, it doesn't need to do any lookups because the pushed address would be the address of the real code.

Iggy · « **Reply #367 on:** March 28, 2011, 09:27:34 PM »

>Having to clear the cache is another reason why it's seen as bad practice, beyond being horribly unmaintainable. There's a reason people pretty much stopped doing it in the mid 80's.

Honestly, self modifying code isn't just a bad practice, it's a recourse used by sloppy programmers. For the small improvement you might see in performance, you destroy easy traceability and can no longer use reentrant code.

freqmax · « **Reply #368 on:** March 28, 2011, 09:32:18 PM »

Self modifying code can have significant performance gains when cycles are hard to come by.

Btw, does self modifying code put pipelined cpus into an undefined state?

vidarh · « **Reply #369 on:** March 28, 2011, 09:56:42 PM »

Quote from: freqmax;625394

Self modifying code can have significant performance gains when cycles are hard to come by.

Let's separate two definitions here. *Generating* code at runtime is not necessarily a bad thing - after all that's what a JIT does. *Modifying* code by writing into already in-use parts of the code segment is a nasty thing.

The former is easy enough to handle with JIT too - any indirect jump would necessarily need to be replaced with a guard/breakpoint (not necessarily a trap - a jump to a handler in the JIT is sufficient, and can be much cheaper) that ensures no direct jump to untranslated code happens unless the indirection can be shown to be "safe" (relative to a known base, such as a library base).

Actual self modifying code as opposed to code that safely generates new code is not necessary for performance at all in my view - I believe you can get all the benefits of it by generating code in cleaner ways. But even self-modifying code is not _necessarily_ a big problem to handle - in most cases you can reasonably easily determine with tracing which instruction sequences can lead to writes to address ranges in the code segment, though it does complicate the tracer for very little benefit.

Frankly, I haven't seen self modifying code used for any good purpose since my Commodore 64 days (and then for cycle exact timing for raster effects, not for performance)... I'd be very interested in seeing a good example of it being used in a way where it couldn't easily be avoided without sacrificing a lot of performance.

psxphill · « **Reply #370 on:** March 28, 2011, 10:12:45 PM »

Quote from: vidarh;625387

There would be no "jit cache" - that's the entire point of how to make it fast and simple - you'd patch the live code directly, so no, it doesn't need to do any lookups because the pushed address would be the address of the real code.

So all you're going to do is patch at load time? You might find some software that this works for, but you can't get 100% coverage of all opcodes on all software at load time. Even worse, you might patch some data, because you can't be guaranteed to figure out which is code and which is data (technically it can even be both).

Quote from: vidarh;625400

Frankly, I haven't seen self modifying code used for any good purpose since my Commodore 64 days (and then for cycle exact timing for raster effects, not for performance)... I'd be very interested in seeing a good example of it being used in a way where it couldn't easily be avoided without sacrificing a lot of performance.

Is copy protection a good purpose?

vidarh · « **Reply #371 on:** March 28, 2011, 11:14:32 PM »

Quote from: psxphill;625408

So all you're going to do is patch at load time?

No. At runtime. It wouldn't be a JIT if it tried to do it all at once.

Quote

You might find some software that this works for, but you can't get 100% coverage of all opcodes on all software at load time.

Which is why you trace the execution until each branching point and JIT trace by trace rather than the whole thing at once, at which point determining the instruction stream is trivial (couple of hundred lines of C, at most, as I said - I have about half a dozen M68k instruction decoders sitting around on my harddisk from various disassemblers and other tools).

Doing it this way means you can analyze each trace fairly easily to determine whether the branch point is static (return to caller or branch to a specific address) or dynamic. In the latter case you'd need to insert a jump to a small guard function to ensure you don't jump to untranslated code, unless you can compute the full set of branch points. If in doubt, you err on the side of treating it as dynamic, at a slight performance cost.
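The trace-by-trace idea can be sketched on a toy instruction stream. Everything here is an assumption for illustration: the program is a dict from address to mnemonic, every instruction is one unit long, and the mnemonic sets and the `trace` function are hypothetical. A real tracer would need a proper M68k instruction decoder and would also handle conditional branches.

```python
STATIC_EXITS = {"BRA", "RTS"}      # branch to a fixed address, return to caller
DYNAMIC_EXITS = {"JMP_INDIRECT"}   # target only known at run time


def trace(program, entry):
    """Walk straight-line code from entry up to and including the first branch.

    program maps an address to a mnemonic; every toy instruction is one unit long.
    Returns the list of traced addresses and how the trace ended.
    """
    traced = []
    address = entry
    while address in program:
        mnemonic = program[address]
        traced.append(address)
        if mnemonic in STATIC_EXITS:
            return traced, "static"
        if mnemonic in DYNAMIC_EXITS:
            return traced, "dynamic"   # a guard would be inserted here
        address += 1
    return traced, "fell off known code"
```

Each call covers one trace. Only the instructions actually reached are decoded, which is why code and data do not have to be told apart up front.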

In reality, the cases where you'd need a guard are so rare that it's most likely not even worth optimizing (though there are a number of well understood ways of doing it, such as polymorphic inline caching, first developed for Self).

Note that jump tables, for example, for the most part do *not* fall in this category, as recognizing a sufficient number of the most common jump table approaches is fairly simple and handling them is easy enough. You add a small guard function that checks bounds and adds breakpoints for all functions between the previous highest/lowest jump table values used if they can't be statically determined, or that otherwise just jumps to them and triggers a breakpoint if the code hasn't been traced yet. You suffer a worst case cost of a couple of compares and branches once the translation has been done.
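A jump table guard of the kind described above might look roughly like this. It is a sketch under assumptions: the table bounds are already known, `JumpTableGuard` and the `trace_target` callback are hypothetical names, and calling the callback stands in for hitting a breakpoint on untraced code.

```python
class JumpTableGuard:
    """Bounds-check a jump table index and make sure the target has been traced."""

    def __init__(self, table, trace_target):
        self.table = list(table)
        self.trace_target = trace_target   # hypothetical callback into the tracer
        self.traced = set()

    def dispatch(self, index):
        """Return the jump target for index, tracing it first if it is new."""
        if not 0 <= index < len(self.table):
            raise IndexError(f"jump table index {index} out of bounds")
        target = self.table[index]
        if target not in self.traced:
            self.trace_target(target)      # stands in for hitting a breakpoint
            self.traced.add(target)
        return target
```

Once a target has been traced, each dispatch costs only the bounds check and a membership test, which corresponds to the couple of compares and branches mentioned above.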

Quote

Is copy protection a good purpose?

No. Given that few of them prevented anything from getting copied for more than days back in the day, I'd say that's an exceedingly good example of how pointless it is. While getting originals to run would be nice, and handling the most basic self modifying code is reasonably straightforward, it's a clear example of what I'd consider a waste of time given that finding cracks is easy enough.

Hattig · « **Reply #372 on:** March 29, 2011, 12:54:06 PM »

Can we move the Coldfire recompilation posts into a different thread please.

[If you're patching a few incompatible instructions, then you're surely better off doing this once up front and making the fixed binary available online, or integrating the patching mechanism into the loader (presumably WHDLoad does a similar thing for Amiga games). No need to over-engineer a solution here.]

Everblue · « **Reply #373 on:** March 30, 2011, 08:06:13 AM »

Sorry for asking - haven't been following lately, but is there a price and/or date yet?

Cheers!

espskog · « **Reply #374 from previous page:** March 30, 2011, 08:17:27 AM »

I got mine yesterday (first batch). I'm not sure whether what I paid is what the batch #2 boards will sell for, so I can't comment on the price. I believe Mike wrote some messages in this thread a while ago about what he'll charge for the boards.
