"Those timing tests are dead-on accurate and reflect all other tests made by big and professional companies I read ... Those tests can be used to give you a very clear picture of what you're buying ..."
If you showed that timing graph to a real benchmarking forum, they'd just laugh at you, because it answers none of the questions a valid comparison depends on:
1) What version of the benchmarked tool was used on each system? If not the same version, then no comparison can be made.
2) What compiler was used for each system? If not the exact same version of the same compiler, then no comparison can be made.
3) What RAM did each computer have - did each platform have the best RAM available? If one platform was using sub-optimal RAM and another was using the fastest available, then no comparison can be made.
4) Do the tasks involve disk access? If so, what drive was used? Was one system using an SSD and another an HDD? Did one have DMA available and another not? Was one SATA and the other PATA? If the storage differed, then no comparison can be made.
5) Was each benchmark carried out on the latest version of the OS on a clean installation? If not, no comparison can be made.
6) Were all background tasks halted before the benchmark was executed? If not, no comparison can be made.
7) Was the network disconnected from each machine? If not, no comparison can be made.
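
To make this concrete, here is a minimal sketch of what a defensible harness would at least do (Python is my choice here, not anything from the original thread, and the field names and the `run`/`comparable` helpers are hypothetical): record the checklist variables alongside every timing, and refuse to compare results whose controlled variables differ.

```python
import platform
import statistics
import subprocess
import time

def fingerprint(tool_version, compiler):
    """Record the checklist variables; tool_version and compiler
    must be supplied by whoever runs the benchmark."""
    return {
        "tool_version": tool_version,   # point 1
        "compiler": compiler,           # point 2
        "os": platform.platform(),      # point 5: OS name and version
        "arch": platform.machine(),     # the one thing allowed to differ
    }

def run(cmd, tool_version, compiler, runs=5):
    """Time a command several times; keep the fingerprint with the numbers."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return {"env": fingerprint(tool_version, compiler),
            "median_s": statistics.median(timings)}

def comparable(a, b):
    """Two results may only be compared when every controlled variable
    matches; only the architecture under test is allowed to differ."""
    a_ctrl = {k: v for k, v in a["env"].items() if k != "arch"}
    b_ctrl = {k: v for k, v in b["env"].items() if k != "arch"}
    return a_ctrl == b_ctrl
```

Numbers pulled from forum posts fail the `comparable` test by construction: nobody recorded the fingerprint in the first place.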

Does one of the systems support SIMD instructions such as SSE or AltiVec and the other not? If so, then the benchmark is meaningless except in the area of SIMD-optimized apps, and should not be used arbitrarily.
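
As a rough illustration (a Linux-only Python sketch of my own; the `/proc/cpuinfo` parsing and the helper name are assumptions, not anything from the thread), you would at minimum check what vector extensions each CPU actually advertises before treating two machines' numbers as comparable:

```python
def simd_flags(cpuinfo_path="/proc/cpuinfo"):
    """Linux-only sketch: collect the SSE/AVX-family flags the CPU
    advertises, so a SIMD mismatch between two machines is visible."""
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return {flag for flag in line.split(":", 1)[1].split()
                            if flag.startswith(("sse", "avx"))}
    except OSError:
        pass
    return set()  # unknown platform: report nothing rather than guess
```

If one machine's flag set is a superset of the other's, a vectorized benchmark is really measuring the instruction sets, not the systems.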
Benchmarking between systems isn't just a matter of running a program with the same parameters on two systems and timing the difference. You have to be certain that ALL other parameters are equal.
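
Even the single-machine case makes the point. In this quick sketch (Python again; the workload is an arbitrary stand-in I picked for illustration), identical code with identical parameters still does not produce identical timings, because caches, background tasks, and power states differ between runs:

```python
import timeit

# An arbitrary stand-in workload; any pure computation would do.
workload = "sum(i * i for i in range(100_000))"

# Ten executions per run, five runs of the identical statement.
times = timeit.repeat(workload, number=10, repeat=5)

# The spread between the fastest and slowest run is noise from variables
# nobody controlled. If one machine disagrees with itself by this much,
# a lone number trawled from a forum says nothing about two machines.
print(f"min {min(times):.4f}s  max {max(times):.4f}s  "
      f"spread {(max(times) / min(times) - 1) * 100:.1f}%")
```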
I remember the thread that spawned those benchmarks. Piru had gone through the forums and collected results posted by other people, with no control over how their systems were configured. I also know that his AmigaOS 4 numbers came from an older version of the tool.
These "benchmarks" were just numbers trawled from forums, and have no meaning whatsoever, as they only show 10% of the full picture.