Author Topic: in case you are interested to test new fpga accelerators for a600/a500  (Read 39783 times)

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • Show all replies
    • http://totalchaoseng.dbv.pl/news.php
I prefer A3000T, A1200T and A4000T because a real Man(tm) uses a computer that requires a Fork Lift to get the thing in/out of the car at the Amiga Meetings. :D
Wanna try a wonderful strategy game with lots of hand-drawn anims,
Magic Spells and Monsters, Incredible playability and lastability,
English speech, etc. Total Chaos AGA
 

Offline ChaosLord

Quote from: Lurch;786750

Who doesn't want a 500Mhz A500? Kind of fitting when you think about it A500...500MHz...


I am sorry, but these cards are not 500 MHz.

Offline ChaosLord

Quote from: Thomas Richter;786950
Which compilers would support it, which assemblers would? Does it enable any "killer features"?

Matt Hey has been ready for years now, with 1 or more assemblers revved up and ready to go as soon as Gunnar publishes the official list of bit patterns for all the instructions.


Quote

Instead of adding a series of seemingly nice, but in reality almost useless low-level additions, it would be much more sensible to add higher level functionalities that enable new killer features the Amiga did not have originally, and from which multiple programs could benefit. Say, JPEG decoding or MP3 decoding. None of the new instructions make these tasks simpler, easier or faster - they are too low-level. What it would probably take would be a hardware engine for some of the components of these standards (Huffman decoder, DCT, digital filters... hence, multiply-add instructions, shift-and-mask instructions and so on).

...

I know, there is the cycle counter party whose members would sell their grandma to reduce the number of clock cycles in a totally uninteresting part of a program, and maybe you want to address the demand from those folks. However, beware! The outcome of such activity is usually less useful than one may hope for, let alone the stability, and the increased performance is usually not what you have hoped for.


The 2 main ways to speed up a program are A) cycle-counting to cycle-optimize things and B) rewriting the algorithm from the ground up to be more efficient.

We are not allowed to rewrite the mp3 algorithm or the jpeg algorithm or the h264 algorithm or the h265 algorithm.  So the ONLY thing we can do to speed up our software right now is to do the cycle-counting tricks.  The best way is to add new instructions to the ISA.

All the asm coders on the Natami forum asked for "Multiply-add with clipping" instruction(s) because they allow a dramatic speed increase for audio and video codecs.  IIRC it applies to JPEG decoding also.
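
To make "multiply-add with clipping" concrete, here is a minimal Python sketch of what such an operation computes on signed 16-bit samples. The names `clip16`, `madd_clip16` and `fir_sample`, and the Q15 fixed-point scaling, are assumptions made up for this illustration. It models the arithmetic only and is not the Apollo Core's instruction, whose encoding has not been published.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def clip16(value):
    """Saturate an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, value))


def madd_clip16(acc, a, b, shift=15):
    """Hypothetical multiply-add with clipping.

    Computes acc + (a * b >> shift) and saturates the result to 16 bits.
    shift=15 treats b as a Q15 fixed-point coefficient (16384 == 0.5).
    """
    return clip16(acc + ((a * b) >> shift))


def fir_sample(samples, coeffs):
    """One output sample of a FIR filter built from the operation above."""
    acc = 0
    for sample, coeff in zip(samples, coeffs):
        acc = madd_clip16(acc, sample, coeff)
    return acc
```

Without a single instruction for this, each step of the inner loop needs a multiply, a shift, an add, and two compare-and-branch pairs for the clipping, which is where codecs spend their cycles.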

Does the Apollo Core have a MADD instruction?  When I left off years ago, Gunnar wasn't allowing it.  But maybe he changed his mind now?  I asked to read the instruction manual to find out but Gunnar hasn't provided it yet.

Sadly, without a list of the new instructions and their bit patterns (sometimes referred to as opcodes), I can't go around saying how great the new ISA is when I don't even know what it is.

Offline ChaosLord

Quote from: Thomas Richter;787000
A multiply-add-capable eight-fold vector logic would be a nice addition for a JPEG DCT, for example.


Well at least we can all sit around the campfire singing kumbaya because we all can agree on that 1 sentence. :)

Offline ChaosLord

Quote from: biggun;787060

If you compare CPU architectures then you see that the best Coldfire, the Coldfire V4e,
can do 1 ALU operation per cycle - while Phoenix can do 3 operations per cycle.


My 68060 @ 50 MHz can already do 3 operations per cycle.

Code:

loop addq.l #4,d0
     addq.l #1,d1
     bne.b loop

That is 3 entire instructions executed every clock cycle.

This proves my 50 MHz 060 is a 150 MIPS machine.
A 100 MHz 060 is a 300 MIPS speed demon.
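
The arithmetic behind those numbers is just clock rate times instructions per cycle. A small Python sketch, with function names invented for this example, shows the peak figure next to what you get when a loop needs more cycles than the ideal case. It assumes, as the post does, that the two ADDQs pair up and the predicted branch costs nothing extra.

```python
def peak_mips(clock_mhz, instructions_per_cycle):
    """Best-case MIPS: every cycle retires the maximum number of instructions."""
    return clock_mhz * instructions_per_cycle


def sustained_mips(clock_mhz, instructions, cycles):
    """MIPS actually achieved by code that runs `instructions` in `cycles`."""
    return clock_mhz * instructions / cycles
```

For example, `peak_mips(50, 3)` gives the 150 quoted above, while the same 3 instructions taking 2 cycles would give `sustained_mips(50, 3, 2)`, which is 75.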

Offline ChaosLord

@Gunnar

So you are saying you can do 3 ALU operations per clock?  If so, that is awesome!

Do you mean the normal ALU?  Or are you counting the EA-Calculator as an ALU? Or ?

And can you do a correctly predicted branch at the same time as these 3 ALU ops?
So that would be 4 instructions simultaneously.  And can you do an FPU operation at the same time?

Give some examples.  C0derz gotta code! :)

Offline ChaosLord

Quote from: biggun;787067
Not me. Phoenix does 3 ALU operations per clock.
:lol:


Quote

MOVE.L  ($1245998,A0,D7*4),D2
ADDA.L  #$00012345,A0
SWAP    D7
ANDI.W  #$88aa,(-8456,A7,D0*2)
OR.L    D7,D2
LEA     (18,A1,A5*8),A6

6 instructions = executed in 2 cycles


That is kRaZy! :crazy:  That would be like 14 kazillion PPC instructions!

How do I align the code to make it go "down the pipe" 3 instructions at a time?
A) Don't do anything, it's all 1000% automatic.
B) Code must be aligned to a word address that is evenly divisible by 3.
C) Other?


In real life I don't think I have any "real" code that can execute 3 instructions at a time (in an important time-critical loop) due to dependencies.  It's honestly really hard to write code that executes just 2 instructions at a time on 060 due to dependencies.  But still this is a nice technological feature. :thumbsup:
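
To show why dependencies eat the benefit of a wide core, here is a toy in-order issue model in Python. It is only a sketch under simplifying assumptions: an instruction may share a cycle with earlier ones unless it reads or writes a register that an earlier instruction in the same cycle writes. The function name and the (writes, reads) representation are invented for this example, and the real pairing rules of the 68060 or Phoenix are certainly more detailed than this.

```python
def count_issue_cycles(instructions, width):
    """Count cycles needed to issue `instructions` in order.

    instructions: list of (writes, reads) pairs, each a collection of
                  register names such as {"d0"}.
    width:        maximum number of instructions issued per cycle.
    """
    cycles = 0
    written = set()   # registers written so far in the current cycle
    slots = 0         # instructions already issued in the current cycle
    for writes, reads in instructions:
        conflict = bool(written & (set(reads) | set(writes)))
        if slots == 0 or slots == width or conflict:
            # Start a new cycle: nothing issued yet, group full, or dependency.
            cycles += 1
            written = set()
            slots = 0
        written |= set(writes)
        slots += 1
    return cycles


# Three independent instructions versus a chain where each needs the previous result.
independent = [({"d0"}, {"d0"}), ({"d1"}, {"d1"}), ({"d2"}, {"d2"})]
chain = [({"d0"}, {"d1"}), ({"d2"}, {"d0"}), ({"d3"}, {"d2"})]
```

In this model the independent group issues in 1 cycle at width 3, while the dependent chain takes 3 cycles no matter how wide the core is.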

I predict a lot of cycle-counters will be selling their Grandmas to code this chip. :)
« Last Edit: March 31, 2015, 12:39:52 PM by ChaosLord »

Offline ChaosLord

For those of you who don't understand what is being discussed, I will now summarize:

Matt Hey has shown up to the debate with a library full of facts and figures and has won the debate on all counts.

Matt shoots.... he scores!  Matt Hey FTW!

PlaySound CrowdGoesWild.8svx

Offline ChaosLord

@Thomas Richter

You seem to put a lot of energy into opposing the addition of new instructions to Jay Miner Compatible Computers.

I would dearly love to read some of your criticisms of Intel adding in new instructions to Bill Gates Compatible Computers.  Where may I read these scathing criticisms?

Intel gets to add new instructions all the time.  Endlessly.  This must frustrate you terribly.  It would be most enjoyable to me to read your crushing rebukes of Intel Corporate Policy.

If you could provide a link to some of your old writings or just write up some of your complaints against Intel here in the forum then I am certain they would be very educational.

Thanks! :)

Offline ChaosLord

Quote from: kolla;787236
And WHDLoad runs fine on Phoenix - but WHDLoad and the games it supports also run fine on any Amiga with a bit of RAM already, none of the old games have any use for an improved 68k CPU.

All my games have use for an improved 680x0 CPU.

Even my old A500 games required a 68020+ after circa 1990.  25 MHz 68030 accelerators were sold everywhere for the A500 in the 1990s.  There were 68020 and 68030 accelerators in the 1980s too.

All my games have use for a faster CPU.
All my games have use for additional instructions.
All my games have use for more RAM.

Offline ChaosLord

« Reply #10 on: April 03, 2015, 01:37:44 PM »
Quote from: ElPolloDiabl;786969
If you are going FPGA which is better at parallel processing instead of raw MHz just keep going with it. It will probably end up better or faster than a Coldfire system.

What can be done on the software side to fix this? Could you add libraries that make it Coldfire compatible?


No.

Coldfire was designed on purpose to be permanently incompatible with all previous processors.

It was very deviously and fiendishly designed.  Someone paid someone a LOT of money as a bribe to cook up this whacked-out, mysteriously hyper-incompatible CPU.

When Matt et al. talk about Coldfire compatibility they are referring to adding in a few new instructions that can be 100% compatible.  But you need to understand that some Coldfire instructions are just completely ridiculously incompatible.

If you want to run Coldfire code on a non-Coldfire CPU then you must use emulation.

Offline ChaosLord

« Reply #11 on: April 05, 2015, 02:26:13 PM »
Quote from: psxphill;787436
I suspect you'll find that in practise it will, but time will tell eh.
 
 Zorro 3 is flakey as hell, I don't envy anyone trying to make something compatible with everything (unless that too is not a priority).


What exactly does the MMU do to the Zorro bus?

Offline ChaosLord

« Reply #12 on: April 05, 2015, 02:37:37 PM »
All the MMU could possibly do to the Zorro bus is 1 or more of the following:
1. Mark memory ranges as copyback cacheable.
2. Mark memory ranges as write-through cacheable.
3. Mark memory ranges as noncacheable.
4. Mark memory ranges as write-protected.

I don't know whether Kickstart requires an MMU to be present on Zorro III.  It is certainly possible, whether by design or just by accident.
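
The four attributes listed above amount to a per-range lookup that the CPU consults on every access. A minimal Python sketch of that idea follows. The address ranges, the `Region` type and the function names are all made up for illustration and are not an authoritative Amiga memory map or any real MMU table format.

```python
from collections import namedtuple

# One entry per address range: [start, end) plus the attributes from the list above.
Region = namedtuple("Region", "start end cache_mode write_protected")

# Hypothetical example map; the ranges are illustrative only.
REGIONS = [
    Region(0x00000000, 0x00200000, "noncacheable", False),   # e.g. RAM shared with DMA
    Region(0x00F80000, 0x01000000, "write-through", True),   # e.g. a ROM image
    Region(0x10000000, 0x18000000, "copyback", False),       # e.g. RAM on an expansion card
    Region(0x40000000, 0x40010000, "noncacheable", False),   # e.g. I/O registers on a card
]

# Anything not listed is treated as uncached and writable in this sketch.
DEFAULT = Region(0x00000000, 0x100000000, "noncacheable", False)


def attributes_for(address):
    """Return the Region describing how `address` should be accessed."""
    for region in REGIONS:
        if region.start <= address < region.end:
            return region
    return DEFAULT


def store_allowed(address):
    """True if a write to `address` is permitted by the map."""
    return not attributes_for(address).write_protected
```

The point of the sketch is that none of these attributes changes what the bus itself does. They only decide whether the CPU caches an access or refuses a write.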

Offline ChaosLord

« Reply #13 on: April 05, 2015, 05:57:41 PM »
Does the Phoenix core you have been working with have all 4 TTR registers?
Are they compatible with 68060?

I think PsxPhill would be perfectly happy as long as the mini-MMU that is contained in ALL 68060 chips was included in the Phoenix/Apollo/68070/WhateverItIsCalled.
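
For anyone wondering what the "mini-MMU" does, here is a rough Python sketch of how a transparent translation register decides whether it covers an address. The field positions (address base in bits 31-24, address mask in bits 23-16, enable in bit 15) are written from memory of the 68040/68060 manuals, so treat them as an assumption and check the manual before relying on them. The sketch also ignores the supervisor/user field and the cache mode bits, and `ttr_matches` is a name invented for this example.

```python
def ttr_matches(ttr, address):
    """Return True if the 32-bit TTR value `ttr` covers `address`.

    Only the top 8 address bits are compared. A 1 bit in the mask means
    "ignore this bit of the base" when comparing.
    """
    enabled = (ttr >> 15) & 1
    base = (ttr >> 24) & 0xFF
    mask = (ttr >> 16) & 0xFF
    top = (address >> 24) & 0xFF
    return bool(enabled) and ((top ^ base) & ~mask & 0xFF) == 0
```

With a mask of $FF the register covers the whole 4 GB address space, and with a mask of $00 it covers exactly one 16 MB block. That is why four of these registers (two for instructions, two for data) are enough to set cache behaviour for a few large areas without full page tables.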

Offline ChaosLord

« Reply #14 on: April 08, 2015, 03:36:55 AM »
Quote from: Thomas Richter;787545
It can, but why? New 68Ks can be bought for cents,


Amiga community does not always make cents :D