Author Topic: New Kickstart 3.9.1 68k on the way (Read 38283 times)

matthey · « **on:** December 14, 2014, 11:30:48 PM »

Quote from: Thomas Richter;779710

For a quick moment, I want to come back to the math libraries. As Olaf already said, they are rarely used, but for floating point intensive applications, it does make a difference. FPU intensive: JPEG decoding is one nice example where an FPU is beneficial. I haven't had the time to compile my JPEG on Amiga, thus I took the freedom of choosing another benchmark to show the point - Mandelbrot computation. Here I have a program available (DMandel) which comes with various (assembler optimized) computing kernels: IEEEDoubBas, FPU, and others. Everything ran on my 68060@50; note that IEEEDoubBas *also* uses the FPU, but requires register ping-pong to do the work.

Numbers for a zoomed in Mandelbrot: With IEEEDoubBas: 5:15 minutes, with raw FPU, 1:15 min. I believe that's a non-negligible difference. I'm usually not much of a fan of "optimizing pointless register moves away", but when it makes a difference, it makes a difference. It is probably an artificial benchmark, but it shows one thing: If you *need* to do numerics, it's probably best to go directly on the FPU.

Direct FPU support on the 68060 is likely using many software 6888x instructions through traps for Mandelbrot calculations, since most compilers don't avoid the traps (except the new unreleased vbcc), vs. the handicapped IEEE library functions, which also use mostly software. Or did you use MuRedox for your stats? I guess this tells us that less software and more hardware fp usage is probably faster. Direct FPU use probably won't become more common until FPUs are more common. The new fpga processors are not getting them yet. It looks like the IEEE libraries will be around for a while. Using the IEEE libraries really isn't that bad for non-CPU intensive multi-68k processor distributions, with an FPU. The IEEE support in vbcc works surprisingly well. The default vbcc 68k distribution uses IEEE instead of direct floating point. I have compiled my own 68060+FPU version, which is significantly faster though.

Quote from: Thomas Richter;779807

The reason why I'm against that ROM-idea is simply because it does not allow users to exchange components. If I have to fiddle-open my machine every time I'm updating a component, chances are better than even that I'll break the ROM socket at some time. A minimal bootstrap ROM could be very stable and would not require a lot of updating. Everything else can be placed on flash, and can be upgraded easily by writing on a regular file system.

Given that you get such Flash-ROMs in GB size today for pennies, there's no reason to allocate an entire partition just for system components, write-lock it in regular operating mode, or even unmount it if it is no longer needed.

You shouldn't have to exchange ROMs with the new fpga hardware. Installing a kickstart to a flash slot could be nearly as easy as installing any other file. I use BlizKick to install a custom ROM in MAPROM, which is easy. There is a nice advantage to write protecting the whole kickstart and being able to select different standard kickstarts quickly. Yes, the individual modules can be write protected also, but it's faster and easier to write protect the whole kickstart. That's not to say that I would put everything in kickstart like Cosmos, but the current kickstart setup isn't that bad either. A few more updated modules could allow booting from more modern devices, better diagnostics and a more consistent and stable core OS. Only mature and stable OS components should go in the kickstarts though. The big problem with kickstarts is that the developers are not producing new ones that can be distributed.

matthey · « **Reply #1 on:** December 15, 2014, 05:27:47 PM »

Quote from: Thomas Richter;779853

No, the mandelbrot computations only use add,sub and multiplication. Thus, MuRedox makes no difference here. The only traps that may occur are due to non-normalized results where the FPU requires some help. IEEE uses the same instructions, but includes software overhead to load the numbers from the CPU registers into the FPU registers and back. While that makes typically no difference (the called function is long, the register ping-pong is short - intuition!) it makes a difference here. The called function is short (a single add, or sub, or mul) and the overhead is large compared to the actual function. For your average all-day purpose, it will hardly make any difference, indeed. But for that purpose, you don't need an FPU in first place either.

I thought that Mandelbrot used 6888x logarithm instructions, but I see that the basic algorithm uses mostly normal fp math.
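For anyone following along, here is a minimal sketch in C of the basic escape-time loop, to show that the inner kernel needs only multiply, add, subtract and a compare. This is not DMandel's code (its kernels are assembler); the function name `mandel_iterations` and its parameters are made up for illustration. With a library-based build, each of the fp operations in the loop would become its own short library call, which is where the relative overhead comes from.

```c
#include <assert.h>

/* Hypothetical illustration, not taken from DMandel.
   Iterates z = z*z + c starting from z = 0 and returns the number of
   iterations completed before |z|^2 exceeded 4, or max_iter if it never did.
   Only fp multiply, add, subtract and compare are used. */
int mandel_iterations(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n;

    assert(max_iter > 0);

    for (n = 0; n < max_iter; n++) {
        double zr2 = zr * zr;           /* real part squared */
        double zi2 = zi * zi;           /* imaginary part squared */

        if (zr2 + zi2 > 4.0)            /* escaped: |z| > 2 */
            break;

        zi = 2.0 * zr * zi + ci;        /* Im(z*z + c) */
        zr = zr2 - zi2 + cr;            /* Re(z*z + c) */
    }
    return n;
}
```

Calling `mandel_iterations(0.0, 0.0, 50)` runs the full 50 iterations because the origin never escapes, while a point far outside the set such as (2, 2) escapes after a single iteration.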

Quote from: Thomas Richter;779853

Do you mean, it uses IEEE for compiling - or IEEE for the running program? The latter is switchable, but the former is pretty critical. To parse floating point constants in C code correctly, you need a *higher* precision than that used for computing in the program (otherwise, you get an additional loss in the compilation phase you want to avoid). For optimizing, you should run in the C compiler exactly the same computations as the code would have performed, so that's not good news either. Gcc has its own math library for emulating various FPUs and math models, and yes - for good compilation and optimization, this is really required.

This may be true when compiling direct FPU code with the version of vbcc that uses the IEEE library, but not when compiling IEEE versions of programs, where the lower precision becomes the standard precision. Yes, it would be good to make direct FPU using compiles of vbcc available as well. I will suggest this when the new version is finalized. I could always make unofficial compiles of the new version of vbcc publicly available as well.

Most versions of GCC compiled code open the IEEE double precision math libs (mixing IEEE lib and direct FPU code), which auto changes the FPCR to double precision rounding. GCC also likes to use the FD and FS instructions, which are good for IEEE compliance, but the precision is less than the 68k FPU supports. Vbcc uses regular F instructions even for 68040 and 68060 FPU libraries so the code will execute on 68881-68060. This gives extra intermediate precision and backward compatibility at the cost of IEEE compliance, but the extra precision may be lost at function calls where double precision fp values are passed to functions (except where inlines can maintain extended precision). The FPCR rounding precision can be changed to double precision using C99 functions for better IEEE compliance. Vbcc 68k may eventually get fp register passing libraries with full extended precision, as the overhead of passing extended precision values on the stack is expensive. My point is that there isn't any current 68k compiler that I am aware of which is capable of maintaining full extended precision. You will get considerably less with direct compiled code or with the IEEE libraries.
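To make the intermediate precision point concrete, here is a small portable sketch. It is only an illustration under stated assumptions: the function names `keeps_extra_precision` and `sum_through_double` are made up, and it relies on nothing beyond the standard C99 `<float.h>` macros `FLT_EVAL_METHOD`, `FLT_ROUNDS` and `DBL_EPSILON`. The first function may see extra bits when the compiler keeps the sum in an extended precision register; the second forces the sum through a double in memory, much like passing a double to a function, and the extra bits are gone.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical illustration. Returns nonzero if the intermediate sum
   1 + DBL_EPSILON/2 kept more than double precision. With pure double
   evaluation (FLT_EVAL_METHOD == 0) the sum rounds back to exactly 1.0. */
int keeps_extra_precision(void)
{
    /* volatile keeps the compiler from folding the expression at compile time */
    volatile double one = 1.0;
    volatile double half_ulp = DBL_EPSILON / 2.0;

    assert(FLT_ROUNDS == 1);    /* the argument assumes round to nearest */

    return ((one + half_ulp) - one) != 0.0;
}

/* Same computation, but the sum is stored to a double before the subtraction.
   The store rounds to double precision, so any extra bits are dropped and the
   result is 0.0 under round to nearest. */
double sum_through_double(void)
{
    volatile double one = 1.0;
    volatile double half_ulp = DBL_EPSILON / 2.0;
    volatile double stored = one + half_ulp;    /* rounded to double here */

    return stored - one;
}
```

On a target that evaluates everything in double, both functions report no extra precision. On a target that evaluates in extended precision, the first may return nonzero while the second still returns 0.0, which is the loss at function call boundaries described above.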

Quote from: Thomas Richter;779853

True, except that handling of the files or exchanging modules within the kickstart is harder, i.e. the overall user experience is not quite as good for updates. Otherwise, when I remember the Natami here, it booted so fast it made no difference whether it went through another reset or not, so I don't need an updated rom for this machine in first place. Protecting modules can be done easily by MuProtectModules, no need for a ROM actually.

There is more effort in compiling kickstarts for developers, but the result is easy to distribute (like any archive) and there should be fewer install and corruption issues. The new fpga hardware will initially not have MMUs, but they can still have MAPROM support with write protection.
