
Author Topic: in case you are interested to test new fpga accelerators for a600/a500  (Read 38689 times)


Offline Lurch

  • Lifetime Member
  • Hero Member
  • *****
  • Join Date: Dec 2003
  • Posts: 1716
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #194 on: March 31, 2015, 07:49:37 AM »
Quote from: xboxOwn;787053

I just want to see new games coming to my A500 again. I...I want to BUY games for my A500. I want that box again, that manual again, that disk again, that price tag again. I spend 70 dollars on PS 4 games...you don't think I can cough up 40 bucks for an Amiga game?? I will treat my A500 as a console that is all. Another console to buy games for it.


I think there is a market for that too, kind of why I got the Amiga Future sub. I forgot how good it was to read a real magazine.

I also think there is room in the market for new games that take advantage of faster CPUs. I would love to see some more taking advantage of the 030s out there.

Even if they are improvements over the old school platformers like ruff'n'tumble, Bubble 'n Squeak, Bubba 'n Stix or The Chaos Engine I/II. Push what the 030 is capable of.

I'd buy an updated version of any of those especially ruff'n'tumble.
-=[LurcH]=-
A500 Plus Black 030@40MHz 128MB | A1200T 060@80MHz 320MB | Pegasos II G4@1GHz 1GB  | Amiga Future Sub
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #195 on: March 31, 2015, 08:05:14 AM »
Quote from: matthey;787026
I don't think anyone involved with the Phoenix/Apollo project has considered a software library for 100% ColdFire compatibility up to ISA_C (excluding MAC, EMAC and FPU) but it could be done if there was a specific purpose and enough demand.

If you compare the ColdFire and 68k general instruction sets, then
it's obvious that the ColdFire general instruction set is much smaller and simpler.

- A good number of instructions are missing.
- Some instructions are crippled, e.g. MOVEM.
- The ALU operations no longer support BYTE/WORD/LONG, only LONG.
- The available EA modes are limited; this heavily weakens all immediate instructions.

The MAC instructions are specialist. For example, they combine a MOVE with another operation.
The move included in the MAC is useful for ColdFire because it allows "hiding" memory access time.
Phoenix does not need this: it has streaming detection and can prefetch, so it
already hides memory latency by itself with normal code.
Some MAC operations include special operation handling. This is useful for DSP applications.
This is nice. Phoenix includes conditional instruction rewrite.
This means that in many cases the Phoenix core can already get the same benefits from normal instructions.

So yes, the MAC instructions are nice for some special-purpose areas.
But Phoenix provides general CPU improvements, such as the conditional rewrite and memory streaming, which automatically bring some of the MAC benefits to all general 68k programs.

If you compare CPU architectures, then you see that the best ColdFire, the ColdFire V4e,
can do 1 ALU operation per cycle, while Phoenix can do 3 operations per cycle.

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #196 on: March 31, 2015, 08:39:36 AM »
Quote from: biggun;787060

If you compare CPU architectures, then you see that the best ColdFire, the ColdFire V4e,
can do 1 ALU operation per cycle, while Phoenix can do 3 operations per cycle.


My 68060 @50MHz can already do 3 operations per cycle.

Code:

loop addq.l #4,d0
     addq.l #1,d1
     bne.b loop

That is 3 entire instructions executed every clock cycle.

This proves my 50MHz 060 is a 150 MIPS machine.
A 100MHz 060 is a 300 MIPS speed demon.
Wanna try a wonderfull strategy game with lots of handdrawn anims,
Magic Spells and Monsters, Incredible playability and lastability,
English speech, etc. Total Chaos AGA
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #197 on: March 31, 2015, 08:49:19 AM »
Quote from: ChaosLord;787062
My 68060 @50MHz can already do 3 operations per cycle.

Code:
loop
     addq.l #4,d0
     addq.l #1,d1
     bne.b loop
That is 3 entire instructions executed every clock cycle.

This proves my 50MHz 060 is a 150 MIPS machine.
A 100MHz 060 is a 300 MIPS speed demon.

First of all, predicting a branch is not an ALU operation.
Doing 3 ADDs would be three ALU operations.

And did you actually measure the above?
« Last Edit: March 31, 2015, 08:54:12 AM by biggun »
 

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #199 on: March 31, 2015, 09:32:59 AM »
@Gunnar

So you are saying you can do 3 ALU operations per clock?  If so, that is awesome!

Do you mean the normal ALU?  Or are you counting the EA-Calculator as an ALU? Or ?

And can you do a correctly predicted branch at the same time as these 3 ALU ops?
So that would be 4 instructions simultaneously.  And can you do an FPU operation at the same time?

Give some examples.  C0derz gotta code! :)
 

Offline OlafS3

Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #200 on: March 31, 2015, 10:09:19 AM »
Quote from: Thomas Richter;787031
Neither - nor. Nor do I have an axe somewhere. The problem when modifying the ISA is the value/price ratio. The value of the above extensions are minimal, the cost is potential software incompatibility, potentially causing a lot of useless support requests for whomever creates software. There are really better ways to spend the ISA space.  

Really, Gunnar and I chat frequently, and friendly. But that still does not mean that one cannot have an argument from time to time. I personally would not extend the ISA in that way, or at least for so little returns.

I think people sometimes see problems where there really are none.

There will be not only a new core but also a (planned) chipset for the card, and both will be in development and change over time. So updating will be required in any case. Software that hits the hardware directly will be affected; applications will (hopefully) only use the OS (e.g. AROS 68k) and thus not be affected at all. Games that are planned to run everywhere will also not be affected. So we are talking about certain heavyweight software, such as a specific browser version or certain games, that might make full use of the new features. Old 68k software will not be affected either. As long as it is clear what is added, every developer can decide whether to use the new features or not. And users have to update the FPGA regularly anyway (because of changes to both core and chipset).
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #201 on: March 31, 2015, 12:12:35 PM »
Quote from: ChaosLord;787065
@Gunnar

So you are saying you can do 3 ALU operations per clock?  If so, that is awesome!

Not me. Phoenix does 3 ALU operations per clock.

Quote from: ChaosLord;787065


Give some examples.  



MOVE.L  ($1245998,A0,D7*4),D2
ADDA.L  #$00012345,A0
SWAP    D7
ANDI.W  #$88aa,(-8456,A7,D0*2)
OR.L    D7,D2
LEA     (18,A1,A5*8),A6

6 instructions = executed in 2 cycles

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #202 on: March 31, 2015, 12:36:18 PM »
Quote from: biggun;787067
Not me. Phoenix does 3 ALU operations per clock.
:lol:


Quote

MOVE.L  ($1245998,A0,D7*4),D2
ADDA.L  #$00012345,A0
SWAP    D7
ANDI.W  #$88aa,(-8456,A7,D0*2)
OR.L    D7,D2
LEA     (18,A1,A5*8),A6

6 instructions = executed in 2 cycles


That is kRaZy! :crazy:  That would be like 14 kazillion PPC instructions!

How do I align the code to make it go "down the pipe" 3 instructions at a time?
A) Don't do anything, it's all 1000% automatic.
B) Code must be aligned to a word address that is evenly divisible by 3.
C) Other?


In real life I don't think I have any "real" code that can execute 3 instructions at a time (in an important time-critical loop) due to dependencies.  It's honestly really hard to write code that executes just 2 instructions at a time on 060 due to dependencies.  But still this is a nice technological feature. :thumbsup:

I predict a lot of cycle-counters will be selling their Grandmas to code this chip. :)
« Last Edit: March 31, 2015, 12:39:52 PM by ChaosLord »
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #203 on: March 31, 2015, 05:52:25 PM »
Quote from: ChaosLord;787070
:lol:
That is kRaZy! :crazy:  That would be like 14 kazillion PPC instructions!


Lol yes.
And the 68060 needs 8 clocks for the instructions above, which Phoenix can do in 2.
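
To put those figures side by side, here is a minimal sketch of the arithmetic in Python. It uses only the numbers quoted in this thread (6 instructions, 2 cycles on Phoenix, 8 clocks on the 68060); these are claimed counts rather than measurements, and the helper name `ipc` is made up for illustration.

```python
from fractions import Fraction

def ipc(instructions: int, cycles: int) -> Fraction:
    """Instructions per cycle for one pass through a code sequence."""
    return Fraction(instructions, cycles)

INSTRUCTIONS = 6      # the six-instruction sequence from the thread
PHOENIX_CYCLES = 2    # cycle count claimed for Phoenix (not measured here)
M68060_CYCLES = 8     # cycle count claimed for the 68060 (not measured here)

phoenix_ipc = ipc(INSTRUCTIONS, PHOENIX_CYCLES)
m68060_ipc = ipc(INSTRUCTIONS, M68060_CYCLES)

# Same instruction count on both, so the speedup per clock is the cycle ratio.
speedup = Fraction(M68060_CYCLES, PHOENIX_CYCLES)

print(f"Phoenix: {float(phoenix_ipc):.2f} instructions/cycle")
print(f"68060:   {float(m68060_ipc):.2f} instructions/cycle")
print(f"Per-clock speedup on this sequence: {float(speedup):.1f}x")
```

This only compares work per clock on one hand-picked sequence; it says nothing about clock speeds or about code with more dependencies.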


What is often very cool is the following:


 BLT .thisway
 MOVE.L #$F00FF,D0
.thisway


How long do you think these 2 instructions take on Phoenix and on other systems?
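
The thread does not say how Phoenix handles this pair internally, and it gives no cycle counts for it. Assuming the "conditional instruction rewrite" mentioned earlier turns a short forward branch over one instruction into a single predicated move, the Python sketch below checks only that the two forms produce the same D0. The function names and the mask-based select are illustrative assumptions, not a description of the actual core.

```python
IMM = 0x000F00FF  # the immediate from MOVE.L #$F00FF,D0

def branch_over_move(n: bool, v: bool, d0: int) -> int:
    """BLT .thisway / MOVE.L #$F00FF,D0, executed as written."""
    if n != v:        # BLT is taken when N xor V is set
        return d0     # branch taken: the MOVE.L is skipped
    return IMM        # fall through: D0 is overwritten

def predicated_move(n: bool, v: bool, d0: int) -> int:
    """Hypothetical rewritten form: one move that commits only when LT is false."""
    commit = 0xFFFFFFFF if n == v else 0   # all-ones mask when the move must happen
    return (IMM & commit) | (d0 & ~commit & 0xFFFFFFFF)

# Compare both forms over every N/V flag combination and a few D0 values.
for n in (False, True):
    for v in (False, True):
        for d0 in (0, 0x12345678, 0xFFFFFFFF):
            assert branch_over_move(n, v, d0) == predicated_move(n, v, d0)
print("both forms agree")
```

The point of such a rewrite would be that no branch has to be predicted at all, since the result no longer depends on control flow.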

guest11527

  • Guest
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #204 on: March 31, 2015, 11:23:25 PM »
Quote from: biggun;787078
Lol yes.
And the 68060 needs 8 clocks for the Instructions above which Phoenix can do in 2

So wait. Why exactly do we need a "move zero extended" instruction again?

After all, "moveq #0,d0; move.b (a0),d0" could also be merged into a single "meta"-instruction, right?

Similarly, "move.w (a0),d0;ext.l d0" could also be merged into one instruction....

I now see even less need to extend the ISA.
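
For readers who do not write 68k assembly, here is a small Python model of what the two sequences above compute. The first pair leaves a zero-extended byte in d0, the second a sign-extended word. The function names are made up for illustration, memory is reduced to a single value, and registers are modelled as plain 32-bit integers.

```python
MASK32 = 0xFFFFFFFF

def moveq_then_move_b(mem_byte: int, d0_before: int) -> int:
    """moveq #0,d0 ; move.b (a0),d0"""
    d0 = 0                                        # moveq #0 clears all 32 bits, so d0_before is lost
    d0 = (d0 & 0xFFFFFF00) | (mem_byte & 0xFF)    # move.b writes only the low byte
    return d0

def move_w_then_ext_l(mem_word: int, d0_before: int) -> int:
    """move.w (a0),d0 ; ext.l d0"""
    d0 = (d0_before & 0xFFFF0000) | (mem_word & 0xFFFF)  # move.w writes only the low word
    if d0 & 0x8000:                               # ext.l copies bit 15 into bits 16..31
        d0 |= 0xFFFF0000
    else:
        d0 &= 0x0000FFFF
    return d0

def zero_extend_byte(b: int) -> int:
    """What a single 'move zero extended' byte load would produce."""
    return b & 0xFF

def sign_extend_word(w: int) -> int:
    """What a single 'move sign extended' word load would produce."""
    w &= 0xFFFF
    return (w - 0x10000) & MASK32 if w & 0x8000 else w

# The two-instruction sequences match the single-operation definitions.
for b in range(256):
    assert moveq_then_move_b(b, 0xDEADBEEF) == zero_extend_byte(b)
for w in (0x0000, 0x0001, 0x7FFF, 0x8000, 0xFFFF):
    assert move_w_then_ext_l(w, 0xDEADBEEF) == sign_extend_word(w)
print("sequences match their single-operation equivalents")
```

So the question in this post is whether a dedicated instruction is worth adding when a core can recognise these pairs and execute each one as a single operation.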
 

Offline xboxOwn

  • Jr. Member
  • **
  • Join Date: Mar 2015
  • Posts: 97
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #205 on: March 31, 2015, 11:30:13 PM »
Quote from: Thomas Richter;787092
So wait. Why exactly do we need a "move zero extended" instruction again?

After all, "moveq #0,d0; move.b (a0),d0" could also be merged into a single "meta"-instruction, right?

Similarly, "move.w (a0),d0;ext.l d0" could also be merged into one instruction....

I see now even less the need to extend the ISA.


Can someone please tell me what this assembly source code does?
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #206 on: April 01, 2015, 07:18:46 PM »
Quote from: Thomas Richter;787092

So wait. Why exactly do we need a "move zero extended" instruction again?


If you want to blame someone, then blame Intel, it's their fault.  :-)
Intel did research studies in the 80s and found that these instructions are very useful.


Quote from: Thomas Richter;787092

After all, "moveq #0,d0; move.b (a0),d0" could also be merged into a single "meta"-instruction, right?
Similarly, "move.w (a0),d0;ext.l d0" could also be merged into one instruction....


This is correct.
In theory a smart CPU could fuse this


move.b  (ea),Dn
extb.l  Dn


and


move.w  (ea),Dn
ext.l   Dn


Each pair could be fused to execute in a single cycle.

I agree with you that this is an option.
This was also my argument: Phoenix could already do this in a single cycle.

Offline asymetrix

  • Full Member
  • ***
  • Join Date: May 2007
  • Posts: 118
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #207 on: April 01, 2015, 08:13:38 PM »
Great stuff.

What we need is a scriptable assembly language with OpenGL commands.

This would make assembly portable, with OO overloading compatible commands.

We also need complete regression testing for each command to test pass / fail cpu emulation.

Nice info on chip enhancements
http://www.gamedev.net/page/resources/_/technical/graphics-programming-and-theory/graphics-programming-black-book-r1698
 

Offline biggun

  • Sr. Member
  • ****
  • Join Date: Apr 2006
  • Posts: 397
    • http://www.greyhound-data.com/gunnar/
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #208 on: April 01, 2015, 08:34:11 PM »
Quote from: Thomas Richter;787092

I see now even less the need to extend the ISA.


Thomas
I understand the idea of in theory being able to run new applications on old hardware.

But to be honest I'm not sure how realistic this hope is.
Please read my points and explain your view on them.


As you know, the new cards are going to have a local RTG graphics card included.
The CPU has very fast access to this memory.
We are talking here about direct read/write access from the CPU at several hundred MB/sec.
This means direct single-pixel manipulation will be very, very fast.
Compared to even the fastest Amiga Zorro card, we have a speedup of 50 or 100 times here.
Yes, the local memory access will be 50 or 100 times faster than your memory access over Zorro.

As soon as new applications use this, running them on an old 68060 system with an RTG card is hopeless,
as the speed difference between the new and old systems is over 10 times.

This means that if your new game or application uses the new card,
the frame rate on an old system would be so much lower that it becomes pointless.

I assume you see this too, right?

The new CPU is so much faster that we can now look at stuff like H264 playback, which is totally senseless to try on an old 68030.
So whether a new instruction is used or not makes no difference: the new video datatype will not run on an old, slow 68030 anyway.
And the 68060 will also be too slow in many cases.


Olaf asked a similar question, but you have not answered it yet.
He asked you what happens if people use the SAGA chipset in their apps.

I agree with you, Thomas, that a "hello world" does not need any new instructions.
But a "killer app" that we all look forward to, like a modern web browser or a good video player,
will out-spec the old systems anyway. Whether it uses new instructions or not makes no difference; the old 68k CPUs are simply too slow for it.


Do you agree here, or what is your point?

Offline alphadec

  • Full Member
  • ***
  • Join Date: Oct 2003
  • Posts: 118
Re: in case you are interested to test new fpga accelerators for a600/a500
« Reply #209 on: April 01, 2015, 09:03:36 PM »
Quote from: biggun;787159


The new CPU is so much faster that we can now look at stuff like H264 playback, which is totally senseless to try on an old 68030.
So whether a new instruction is used or not makes no difference: the new video datatype will not run on an old, slow 68030 anyway.
And the 68060 will also be too slow in many cases.



When can we see some screengrabs where SAGA is used, or is it too early?
Amiga 4Ever