Topic: FastCache040+ Released!

SpeedGeek (Topic starter)

FastCache040+ Released!
« on: October 05, 2017, 09:34:29 AM »
FastCache040+ 1.0 ©SpeedGeek 2017
             
INTRODUCTION:
FastCache040+ is a patch to replace the CachePreDMA() and
CachePostDMA() functions of most 68040/060 libraries. While
the old functions are adequate, they are far from optimal.
These old functions have up to 3x more code than the new ones
provided with this patch!

Also, the new functions implement a much more efficient method
of managing the Copyback cache for DMA. While every system
will have some CPU performance loss under DMA conditions, the
new functions keep this performance loss to a bare minimum.          
               
FEATURES:
- Replaces CachePreDMA() and CachePostDMA() with smaller
  and more efficient code
- Replaces complex MMU code with simple and fast DTTR code
- Temporarily changes Copyback mode to Write Through for DMA
  (but only when required!)
- Never flushes the ATC!
- Never flushes the DC for Chip RAM DMA!          
- Uses 68040/060 library detection code
- Will not patch itself
- 100% Assembler code

CODE SIZE COMPARISONS:
- FastCache040+ 1.0  (NewFunc 132 bytes)
- 68060.library 46.7 (OldFunc 304 bytes)
- 68040.library 44.2 (OldFunc 414 bytes)  
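
The "3x more code" figure can be checked against the sizes listed above. The following is only a small arithmetic sketch; the helper names are made up for illustration and are not part of the patch.

```c
#include <stdio.h>

/* Code sizes in bytes, copied from the comparison list above. */
#define NEWFUNC_BYTES      132u  /* FastCache040+ 1.0  */
#define OLDFUNC_060_BYTES  304u  /* 68060.library 46.7 */
#define OLDFUNC_040_BYTES  414u  /* 68040.library 44.2 */

/* Ratio of an old function size to the new function size. */
static double size_ratio(unsigned old_bytes, unsigned new_bytes)
{
    return (double)old_bytes / (double)new_bytes;
}

/* Hypothetical helper: print both ratios with two decimals. */
static void print_size_ratios(void)
{
    printf("68060.library: %.2fx\n",
           size_ratio(OLDFUNC_060_BYTES, NEWFUNC_BYTES));
    printf("68040.library: %.2fx\n",
           size_ratio(OLDFUNC_040_BYTES, NEWFUNC_BYTES));
}
```

With these numbers the 68060.library code is a bit over 2.3 times the size of the new code and the 68040.library code a bit over 3.1 times, so the 3x figure applies to the 68040.library case.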

REQUIREMENTS:
- Amiga with 68040 or 68060 CPU and MMU
- 68040.library or 68060.library

WARNING:
Do NOT use this patch with GigaMEM, VMM or any similar
virtual memory software! Do NOT use this patch with any
code which uses the MMU to write protect or remap modified
data structures!

NOTES:
Remapping a mirror image of the Kickstart ROM with the MMU
is OK! The new functions still have one thing in common with
the old functions. They do NOT translate virtual addresses
as specified in the Amiga RKRM! For more info on the old
functions see the Enforcer.guide by Michael Sinz.          

HISTORY:
v1.0 - First release

Here is the link:

http://eab.abime.net/showthread.php?p=1189690#post1189690
« Last Edit: October 11, 2017, 01:38:29 PM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #1 on: October 05, 2017, 03:31:48 PM »
Quote from: Oldsmobile_Mike;831335
How do these patches compare to THoR's mmu libraries?

Well, you probably don't know that he has insisted the old function code was the only way to guarantee reliable DMA transfers... and of course I strongly disagree with his claim. :furious:

Nevertheless, I don't recommend my patches for use with his MMU libraries.
They are not tested for compatibility, so if they do work it's by chance rather than by design.  ;)
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #2 on: October 06, 2017, 03:45:25 AM »
** NEWS UPDATE **

Sorry, there was a bug in v1.0 with the patch install code. :angry:

v1.1 - Fixed a bug which prevented the patch from installing
     - Added code to use OldCachePreDMA for MEMF_24BIT
       transfers (I don't know why errors occurred here)
« Last Edit: October 06, 2017, 07:27:19 AM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #3 on: October 06, 2017, 02:13:31 PM »
** 2ND NEWS UPDATE **

v1.2 released (updated patch size info)
- Added code to use OldCachePostDMA for MEMF_24BIT
  transfers (so MMU pages can be restored to original)

EDIT:
OK, I believe I have found a solution to the MEMF_24BIT transfer
error problem without OldPre/OldPost calls. Unfortunately, the cache mode would have to be changed to NoCache.

This would make the NewFunc code a little smaller but could reduce CPU performance a little for MEMF_24BIT transfers.

So it's a trade off situation... will give it some more thought! :biglaugh:
« Last Edit: October 06, 2017, 04:37:07 PM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #4 on: October 10, 2017, 09:14:05 PM »
** 3RD NEWS UPDATE **

v1.3 Released!
- Added code to change MEMF_24BIT transfers to NoCache.
This eliminated all OldFunc calls. MEMF_24BIT transfers may have
some CPU performance loss but the NewFunc code performance
benefits should still justify this.

NOTES: v1.2 will remain available for download for users who
believe using OldFunc calls is still justified. The v1.2 NewFuncSrc
for lbC00004E should read as follows:
CINVA    NC        ;Support 060, 040 not sure?

EDIT:
v1.4 Released!
- Removed MEMF_24BIT code from PreDMA/PostDMA for the
case of 16 byte aligned transfers. This will allow
some MEMF_24BIT transfers to be cache enabled!
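
For readers unfamiliar with the alignment condition: the 68040 and 68060 data caches use 16-byte lines, so a buffer that starts and ends on a line boundary shares no cache line with unrelated data. A minimal sketch of such a test follows. It assumes that both the start address and the length are checked, which the release notes do not spell out, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 16u  /* 68040/68060 data cache line size */

/* True when the buffer covers only whole cache lines, i.e. both the
 * start address and the length are multiples of 16 bytes.
 * Assumption: the real patch may test these differently. */
static bool dma_is_line_aligned(uint32_t address, uint32_t length)
{
    const uint32_t mask = CACHE_LINE_BYTES - 1u;

    /* Any low bit set in either value means a partial line. */
    return ((address | length) & mask) == 0u;
}
```

For example, a 512-byte transfer starting at $00200000 passes the test, while the same transfer starting at $00200004 does not.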

EDIT2:
The v1.4 NewFuncSrc for lbC000080 should read as follows:
ORI.W   #$8000,D1    ;Cache WT mode + User FC
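
The comment on that instruction can be read against the layout of the low word of a 68040/060 transparent translation register. The sketch below decodes that word in C. It assumes the usual layout (E = bit 15, S field = bits 14-13, CM = bits 6-5, W = bit 2) and that D1 holds the value later written to a DTTR; the type and function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TTR_E_BIT     0x8000u  /* bit 15: register enabled            */
#define TTR_S_SHIFT   13       /* bits 14-13: 0 = match user accesses,
                                  1 = supervisor, 2/3 = ignore FC2    */
#define TTR_S_MASK    0x3u
#define TTR_CM_SHIFT  5        /* bits 6-5: cache mode                */
#define TTR_CM_MASK   0x3u
#define TTR_W_BIT     0x0004u  /* bit 2: write protect                */

/* Cache mode encodings. On the 68060, values 2 and 3 are named
 * cache-inhibited precise and cache-inhibited imprecise. */
enum ttr_cache_mode {
    CM_WRITETHROUGH       = 0,
    CM_COPYBACK           = 1,
    CM_NOCACHE_SERIALIZED = 2,
    CM_NOCACHE            = 3
};

struct ttr_fields {
    bool enabled;
    unsigned s_field;
    enum ttr_cache_mode cache_mode;
    bool write_protect;
};

/* Split the low 16 bits of a TTR value into its fields. */
static struct ttr_fields ttr_decode_low_word(uint16_t low)
{
    struct ttr_fields f;

    f.enabled       = (low & TTR_E_BIT) != 0u;
    f.s_field       = (low >> TTR_S_SHIFT) & TTR_S_MASK;
    f.cache_mode    = (enum ttr_cache_mode)((low >> TTR_CM_SHIFT) & TTR_CM_MASK);
    f.write_protect = (low & TTR_W_BIT) != 0u;
    return f;
}
```

Decoding $8000 gives an enabled register that matches user accesses with Write Through cache mode, which agrees with the "Cache WT mode + User FC" comment, provided the other low-word bits in D1 are already clear.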
« Last Edit: October 13, 2017, 04:43:37 PM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #5 on: October 14, 2017, 01:43:26 PM »
Ok guys, now it's your turn to post your compatibility results!

Please provide information on 68040.library or 68060.library vendor and version. The accelerator card type and vendor are requested too. Thank you! :)
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #6 on: October 15, 2017, 05:58:20 PM »
** 4TH NEWS UPDATE **

There was another stupid version bug in v1.4 which has now been fixed (It was just a fully functional v1.4 reporting itself as v1.3).

I now have a simple benchmark tool called "CacheDMAmips" (see attached image). I will probably release it when I am satisfied with the compatibility results. ;)

EDIT: CacheDMAmips was removed for providing bogus results. Obviously, programs compiled on an old "Pile of Crap" C compiler and using v34 timer.device functions are not so reliable. Mips benchmark results are generally bogus anyway! Thus a new improved benchmark tool is called for! :biglaugh:
« Last Edit: October 23, 2017, 04:59:03 PM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #7 on: October 17, 2017, 12:28:58 AM »
Quote from: matt3k;831781
Hey Speed,

Been running 1.4 for a few days.  No issues so far.  System feels faster, not sure if it's reality. :)  Thanks for the great work.

System used for test:
Amiga 3000
Phase 5 Cyberstorm PPC
68060 version: 46.15

I have the command in my user-startup behind some other performance programs:
My CPU 060 Best
MemTrailer 96
MinStack 70000
CopyMem060
UtilPatch060
FastCache040+

Thanks for the info Matt! :)

Unfortunately, your system is very similar to my system (A3000, A3660, 68060.library 46.7). So hopefully, some users with different systems will post their results too.
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #8 on: October 17, 2017, 12:34:52 AM »
Quote from: Thomas Richter;831804
No wonder, there's nothing in your system that calls CachePre/PostDMA(), even though it should. Thus, a rather pointless exercise on your side.

There are reasons for these functions, and what this patch essentially does is that it disables or bypasses one of the functionalities the functions should have.

Their API is certainly not very wisely designed, though that is not a reason to break them...

The A3000 scsi.device uses these functions, and just in case you didn't know Commodore was testing the 040 CPU prototype card (which eventually was replaced by the A3640) on the A3000 long before the A4000 was released!

http://www.bigbookofamigahardware.com/bboah/product.aspx?id=221
« Last Edit: October 17, 2017, 02:15:15 AM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #9 on: October 23, 2017, 04:39:06 PM »
Ok, here are images of the new improved benchmark tool. Sadly, only 1 user has provided compatibility results so far? :rolleyes:
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #10 on: October 28, 2017, 04:42:41 PM »
** 5TH NEWS UPDATE **

The new benchmark tool has now been released! The lamers who failed to provide compatibility feedback owe a BIG THANKS to the users who did. A very special Thanks to thebajaguy for providing feedback on multiple systems! :)

BTW, these benchmark results were easily predictable. It's a No-Brainer!
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #11 on: March 30, 2018, 02:45:03 PM »
** 6TH NEWS UPDATE **

v1.5 - Found an occasional Recoverable Alert bug which could
possibly result in a crash but only on 060 systems!
The simple fix was to move "CINVA NC" in PostDMA to the
end of the code.
- Removed the "+" character from the executable name due
to an unknown "Feature" of the Amiga Shell causing script
execution and version command problems.

EDIT: [CPU060 NOWRITEBUFFER] with the Phase5 46.7 68060.library seems to be a more reliable solution than the v1.5 update. Some more testing is required.
« Last Edit: March 31, 2018, 01:57:44 AM by SpeedGeek »
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #12 on: April 01, 2018, 05:50:24 PM »
** 7TH NEWS UPDATE **

v1.6 - Added code to PostDMA to Flush the cache conditionally
       (if the Store buffer and cache are enabled). Added NOPs
       to sync the pipelines before RTE (CINVA is now obsolete)
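
The condition in the v1.6 note can be sketched as a test on the 68060 cache control register (CACR). This is a hedged illustration, not the patch source: it assumes "cache" means the data cache, and it assumes the 68060 CACR layout with EDC (enable data cache) in bit 31 and ESB (enable store buffer) in bit 29. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 68060 CACR bit positions. */
#define CACR060_EDC 0x80000000u  /* bit 31: enable data cache   */
#define CACR060_ESB 0x20000000u  /* bit 29: enable store buffer */

/* True only when both the data cache and the store buffer are
 * enabled, which is the case where v1.6 flushes in PostDMA. */
static bool postdma_needs_flush(uint32_t cacr)
{
    const uint32_t both = CACR060_EDC | CACR060_ESB;

    return (cacr & both) == both;
}
```

With the store buffer disabled (for example via CPU060 NOWRITEBUFFER, as mentioned earlier in the thread) the test fails and no flush would be needed under this reading.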

UPDATE:
68040 users can use v1.4 or v1.5 if they like since they will
be a little faster than v1.6 but 68060 users should use v1.6!
68060 users will now have a performance trade off to consider
in deciding whether to enable the store buffer.
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #13 on: April 04, 2018, 02:34:55 PM »
** 8TH NEWS UPDATE **

v1.6P5 Removed code to allow PostDMA cache Flush for the case of      
       16 byte aligned transfers. Added code to skip PostDMA
       cache Flush for the case of cache disabled MEMF_24BIT
       transfers.

UPDATE:
v1.6P5 is my last attempt to solve compatibility problems with
the Phase5 68060.library and Store buffer enabled. This
library is unstable and buggy WITH or WITHOUT FastCache040+
so either disable the Store buffer or expect the problems to
continue with only a MINIMAL improvement provided by this
patch!
       
v1.7 - Removed all v1.6P5 PostDMA cache flush code so most users
       (except Phase5 68060.library users) can run at full speed!

UPDATE:
Phase5 68060.library users should use v1.6P5. All other users
can (probably) use v1.4, v1.5 or v1.7 without any problems.
 

SpeedGeek (Topic starter)

Re: FastCache040+ Released!
« Reply #14 on: April 21, 2018, 01:09:45 PM »
** 9TH NEWS UPDATE **

FastCache040+ v1.6P5 has been removed. Phase5 68060.library users should use FixMapP5 before using this patch.

FixMapP5 1.2 ©SpeedGeek 2018 (MMU Handler ©Michael Sinz 2001)
             
INTRODUCTION:
FixMapP5 is a tool to modify some of the default MMU mapping of
the Phase5 68040 and 68060 libraries. This can improve stability
and prevent crashing under the following condition:          

- Hardware or software interrupts which occur during a Chip RAM access by the 68060 (In particular when Store buffer is enabled).

Software bugs which allow illegal writes to the $F80000 Standard Kickstart ROM can cause a debugging problem in Copyback mode, so this patch corrects that problem as well.

FEATURES:
- Changes Chip RAM mode to Precise (68060 only)
- Changes Standard ROM cache to Writethrough (68040 or 68060)
- Uses 68040/060 library detection code
- 100% Assembler code

REQUIREMENTS:
- Amiga with 68040 or 68060 CPU and MMU
- Phase5 68040.library or 68060.library

WARNING:
This tool was developed ONLY for use with the Phase5 libraries but
it does NOT actually verify such usage. So it can and probably
will mess up the mapping of ANY other libraries!        

CREDITS:
Thanks to Michael Sinz for his freely distributable MMU handler.

HISTORY:
v1.0 - First release
v1.1 - Added code to skip mapping $F00000 space (which includes $F80000 space) for CyberstormPPC, CyberstormMK3 and BlizzardPPC
v1.2 - Replaced FindName() with FindResident() since v1.1 wasn't working at all. Also, fixed a typo on module names.
« Last Edit: April 28, 2018, 12:11:52 AM by SpeedGeek »