Author Topic: New ppc board by Acube/A-Eon: A1222 "Tabor" (Read 159281 times)

Hans_ · « **on:** October 13, 2015, 09:38:05 PM »

Does anyone have any hard data on the performance of software trap-based FPU emulation? I'd be interested to see how much of an impact it actually has, but I can't find any published benchmarks anywhere.

NOTE: Any benchmark used to test this must be compiled with hardware FPU instructions enabled (hard-float). Otherwise the compiler replaces the floating point operations with library calls, and you're testing GCC's software FPU emulation rather than the trap-based emulator.
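
As a minimal sketch of the kind of test I mean (an illustration only, not an existing benchmark), the C snippet below times a loop of dependent multiply-adds. The names `fp_kernel` and `measure_mflops` are made up for this example. The assumption is that a build with GCC's `-mhard-float` emits real FPU instructions, which a trap-based emulator would then have to handle on a CPU without a classic FPU, while a `-msoft-float` build calls GCC's library routines and never traps.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Runs n dependent multiply-add steps. The accumulator is volatile so
   the compiler cannot fold the loop away; this also adds a load and a
   store per step, so the result is only a rough figure. */
double fp_kernel(long n)
{
    assert(n >= 0);
    volatile double x = 1.0;
    for (long i = 0; i < n; i++) {
        x = x * 0.5 + 1.0;   /* one multiply and one add */
    }
    return x;
}

/* Returns an approximate MFlops figure for n iterations, counting two
   floating point operations per iteration. Returns 0.0 if the run was
   too short for clock() to resolve. */
double measure_mflops(long n)
{
    clock_t start = clock();
    double result = fp_kernel(n);
    clock_t end = clock();
    (void)result;

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0) {
        return 0.0;
    }
    return (2.0 * (double)n / seconds) / 1e6;
}
```

Calling `measure_mflops(10000000L)` from both a hard-float and a soft-float build of the same source, on the same machine, would give one pair of numbers to compare. A single kernel like this says nothing about real applications, which mix floating point with plenty of integer work.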

Hans

Hans_ · « **Reply #1 on:** October 14, 2015, 12:33:53 AM »

Quote from: matthey;797406

Hard data is going to be difficult to find. How are the floating point operations going to be performed after the exception?

1) full precision floating point using integer instructions
2) reduced precision floating point using integer instructions
3) partial hardware accelerated floating point (possible with this new PPC board? Lattice FPGA?)

I'm talking about whatever FPU emulation is currently available (under Linux), using whatever benchmarks are available. Sure, people can argue ad nauseam about whether benchmark results are meaningful, but having any data would be better than making big claims about performance based on assumptions.

Of course the emulation is going to have overhead; I'd like to know how it actually performs.

Hans

Hans_ · « **Reply #2 on:** October 14, 2015, 08:25:14 PM »

Quote from: matthey;797444

The article I linked to has performance tests but they are for individual floating point instructions/functions. Obviously, a function/instruction like fabs is going to be near full speed (possibly a faster operation in an integer register) while an fsqrt is going to be slower than slow. The 3 common floating point emulations are compiler softfloat, FastFPE and NetWinder (NWFPE). The performance varies with the effective precision where the hardware FPU has the most precision for the floating point format (half, single, double, extended, quad). The effective precision of the floating point data in the IEEE floating point format is already reduced during many operations (extended precision like the 68k FPU uses can avoid this) with a hardware FPU. Some applications like games and 3D graphics probably want the best performance but math, science and engineering people would rather have extended precision floating point like the 68k FPU uses.

I could not find any good comprehensive benchmarks but I did see that one of the FastFPE authors using a StrongARM@200MHz with a Linux kernel tested 1.1 MFlops with a ~570 ns trap overhead and 0.4 MFlops with a ~2040 ns trap overhead. ARM claims 1.3 MFlops/MHz for the VFP9-S and 2.0 MFlops/MHz for the VFP10 which would be 260 MFlops and 400 MFlops at 200MHz respectively. I doubt any applications or games would use enough floating point to benchmark several hundred times faster on a StrongARM@200MHz with VFP but maybe it gives an idea of how handicapped software floating point can be.

http://linux-arm-kernel.infradead.narkive.com/gqDFIXbv/kernel-2-6-and-fastfpe

All very interesting, and it does demonstrate how big such overhead can be. Nevertheless, none of the results in the documents you linked to are for the P1022 or even for a PowerPC processor.
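
To put rough numbers on the quoted ARM figures, here is a small sketch in C that uses only the values from the quote. The function names are hypothetical, and it assumes the vendor MFlops/MHz ratings are peak values, so the ratios are upper bounds rather than measured speedups.

```c
#include <assert.h>

/* Peak MFlops of a hardware FPU, from a vendor MFlops/MHz rating. */
double peak_mflops(double mflops_per_mhz, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return mflops_per_mhz * clock_mhz;
}

/* Average time per floating point operation, in nanoseconds. */
double ns_per_flop(double mflops)
{
    assert(mflops > 0.0);
    return 1000.0 / mflops;
}

/* How many times faster the hardware peak is than the emulated rate. */
double speedup(double hw_mflops, double emulated_mflops)
{
    assert(emulated_mflops > 0.0);
    return hw_mflops / emulated_mflops;
}
```

With the quoted values, `peak_mflops(1.3, 200.0)` gives 260 and `peak_mflops(2.0, 200.0)` gives 400, so the ratio runs from about 236x (`speedup(260.0, 1.1)`) up to 1000x (`speedup(400.0, 0.4)`). Also, `ns_per_flop(1.1)` is about 909 ns against a ~570 ns trap overhead, and `ns_per_flop(0.4)` is 2500 ns against ~2040 ns. If each emulated operation takes one trap (an assumption on my part), the trap itself accounts for most of the cost in both cases.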

The performance of such an emulator will depend on a number of factors ranging from how well the emulator itself has been designed (i.e., the software) through to the CPU architecture and indeed the design of the individual chip. I have no idea how any of these factors compare between the P1022 and the ARM CPUs in the test results, so I'd really prefer some results from the actual device.

Hans

Hans_ · « **Reply #3 on:** October 24, 2015, 12:49:58 AM »

Quote from: kolla;797982

What prevents binaries from asking the OS about the abilities of the hardware, and run code accordingly? I mean, other operating systems manage to have "fat" binaries that contain entirely different architectures, but on Amiga it is not even possible to have one binary for variations within one architecture.

This is already done in various programs for altivec/non-altivec code. The W3D_SI driver, for example, will use altivec on hardware that has it, and non-altivec code on machines that don't. The graphics library goes one step further and has copy routines that are optimized for specific processors (incl. using DMA on certain platforms). IIRC, MiniGL has a few altivec routines as well.

The examples above are a bit different from simply having a fat binary with completely different compiles sitting there side-by-side. Instead, the developer compiles different versions of just the parts that matter. I personally prefer that approach, as the sections of a program that actually benefit from altivec/SPE/whatever are usually limited.
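
A minimal sketch of that approach in C, assuming a hypothetical `has_vector_unit` flag that a real program would obtain by asking the OS at startup (the actual query is not shown here). `sum_unrolled4` is only a stand-in for a hand-tuned AltiVec routine, written in plain C so the example runs anywhere; none of this is taken from the drivers mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Common signature shared by every variant of the hot routine. */
typedef float (*sum_fn)(const float *data, size_t count);

/* Portable baseline version. */
float sum_scalar(const float *data, size_t count)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}

/* Stand-in for a vector-unit version: four independent accumulators,
   then a scalar tail for the leftover elements. */
float sum_unrolled4(const float *data, size_t count)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        acc0 += data[i];
        acc1 += data[i + 1];
        acc2 += data[i + 2];
        acc3 += data[i + 3];
    }
    for (; i < count; i++) {
        acc0 += data[i];
    }
    return (acc0 + acc1) + (acc2 + acc3);
}

/* Picks the variant once, based on what the hardware reports.
   has_vector_unit is a placeholder for the result of an OS query. */
sum_fn select_sum(int has_vector_unit)
{
    return has_vector_unit ? sum_unrolled4 : sum_scalar;
}
```

The program would call `select_sum()` once at startup, keep the returned pointer, and use it everywhere the routine is needed. The rest of the code is compiled once for the baseline CPU.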

Hans

Hans_ · « **Reply #4 on:** October 25, 2015, 08:14:25 PM »

Quote from: Iggy;798105

Ah, I did not examine those closely enough.
So GPU acceleration is a good thing.
I'm curious as to how Hyperion implemented it.
I thought the primary reason that Linux and MorphOS didn't use it was the difficulty in getting really complete documentation for GPUs themselves.

The GPU acceleration that he's talking about is the composited video feature of the latest Radeon HD driver.

Quote from: Iggy;798105

I mean, when you compare Linux systems that have proprietary drivers supplied by the GPU manufacturers to systems with open drivers, you see a big performance hit in the latter.

AFAIK, AMD's proprietary drivers work only on x86/x64 systems. Linux on the Sam460ex uses the open-source drivers, and they are definitely lagging behind AMD's proprietary drivers. I think that I managed to get the Radeon HD 7xxx series (Southern Islands GPUs) working on AmigaOS before the open-source Linux driver did, and that was also before AMD released the docs for the new series. Ironically, I deciphered the new GPU instruction set from the LLVM code in their Work-In-Progress (WIP) Gallium3D driver.

Hans
