Author Topic: New improved intuition.library version from the Kickstart 3.1  (Read 74264 times)

guest11527

  • Guest
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #134 on: August 28, 2014, 04:01:51 PM »
Quote from: Thorham;771893
To olsen:

Perhaps you're right, but I'm a 68k coder and when I see crap code like that I can't help but think what I wrote earlier.

But look, this is exactly the difference between a "coder" and a "software engineer". In essence, the "register-ping-pong" probably costs something like 20-30 cycles per function per call, in total probably around 400 cycles. Consider that in perspective with the number of cycles you can save by *not* copying a cliprect thanks to an improved algorithm.
 

Offline kamelito

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #135 on: August 28, 2014, 06:22:42 PM »
@Olsen
Since they bought it, is the owner now Amiga Inc or Hyperion?
Kamelito
 

Offline itix

  • Hero Member
  • *****
  • Join Date: Oct 2002
  • Posts: 2380
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #136 on: August 28, 2014, 07:33:07 PM »
Quote from: Thomas Richter;771904
Quote

And another note: now that you have optimized away the function setup, you still have the problem of slow gfx output. What you have achieved is a CopyMemQuick() optimization of intuition...


I don't quite understand what you want to say here...


CopyMemQuick() is a perfect example of redundant micro-optimization. Assuming the routines are done right, CopyMemQuick() is never faster than CopyMem(). Saving six asm instructions on each CopyMemQuick() call is not worth it.
My Amigas: A500, Mac Mini and PowerBook
 

guest11527

  • Guest
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #137 on: August 28, 2014, 08:16:22 PM »
Quote from: itix;771914
CopyMemQuick() is a perfect example of redundant micro-optimization. Assuming the routines are done right, CopyMemQuick() is never faster than CopyMem(). Saving six asm instructions on each CopyMemQuick() call is not worth it.

No, it depends. The *correct* optimization is to avoid copying memory if you can. Instead, just pass a pointer. However, there are cases where you *must* copy memory, and then of course, the performance impact of the memory copy is dominant in the algorithm. In *that* case, it helps of course to optimize it.

It is really quite simple: Measure where the time is spent in your algorithm. If you have a bottleneck where 80% of the CPU time is spent in 20% of the code, optimize that 20%. If you have to move a window on the screen, you have to copy the data - no way around it. You can then either use the blitter (if you can), or use the CPU. Since that's the dominant operation, you should invest your time there to get this specific part fast.
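
To put numbers on the 80/20 argument, here is a minimal sketch of the usual Amdahl's-law estimate. It is written in Python purely for illustration; the function name `overall_speedup` and the sample fractions are assumptions, not anything taken from intuition.library or layers.library.

```python
def overall_speedup(hot_fraction, hot_speedup):
    """Amdahl's law: whole-program speedup when only one part gets faster.

    hot_fraction -- share of total run time spent in the optimized part (0..1)
    hot_speedup  -- how many times faster that part becomes (> 0)
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if hot_speedup <= 0:
        raise ValueError("hot_speedup must be positive")
    # The untouched part keeps its cost; the optimized part shrinks.
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)


# Hypothetical numbers: make one part twice as fast.
print(round(overall_speedup(0.80, 2.0), 2))  # part that takes 80% of the time
print(round(overall_speedup(0.05, 2.0), 2))  # part that takes 5% of the time
```

With these made-up inputs, doubling the speed of the 80% part gives roughly a 1.67x gain overall, while doubling the speed of a 5% part gives only a few percent.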

Or to put it in different words: How often are the movem instructions within CopyMemQuick() executed for the average copy? Probably several thousand times. How often is the register-ping-pong used when calling intuition? At most a hundred times. See why it makes a difference?

Basic rule of profiling: Measure first. Then optimize.
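
A rough sketch of what "measure first" can look like in practice, again in Python for illustration only. The helper `measure_phases` and the two workloads (`setup_call`, `copy_block`) are hypothetical stand-ins, not real Amiga code; a proper profiler would give finer detail.

```python
import time


def measure_phases(phases, repeats=5):
    """Run each (name, callable) pair and return its share of the total time."""
    elapsed = {}
    for name, work in phases:
        start = time.perf_counter()
        for _ in range(repeats):
            work()
        elapsed[name] = time.perf_counter() - start
    total = sum(elapsed.values())
    if total == 0:
        # Timer too coarse to measure anything; report no shares.
        return {name: 0.0 for name in elapsed}
    return {name: seconds / total for name, seconds in elapsed.items()}


def setup_call():
    # Stand-in for cheap per-call overhead.
    return sum(range(100))


def copy_block():
    # Stand-in for the bulk data copy.
    source = bytearray(200_000)
    return bytes(source)


shares = measure_phases([("setup", setup_call), ("copy", copy_block)])
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%} of measured time")
```

Whichever phase ends up at the top of that list is the one worth optimizing; the numbers will differ from machine to machine, which is exactly why they have to be measured.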

I have a super-fast JPEG 2000 implementation here. Almost everything is in C++, except the 5% of the code where it really, really matters. The code is fast because it is C++ - it allowed constructions where memory is touched as little as possible, keeping the data in cache whenever possible. This required the correct algorithm, and that algorithm was easier to formulate, debug and benchmark in a higher-level language. Once that was done, and the hot-spots were understood, the hot-spots that touch a lot of data in a small loop were optimized. The amount of time the code spends in other parts is below 5%, so I don't even bother touching them *except* for getting the data organized correctly so the 95% bottleneck gets its data in an ideal way, namely in the CPU cache.
 

Offline olsen

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #138 on: August 28, 2014, 08:25:57 PM »
Quote from: kamelito;771911
@Olsen
Since they bought it, is the owner now Amiga Inc or Hyperion?
Kamelito

Probably neither... Amiga likely bought the compiler source code under license, and licenses such as these are not necessarily transferrable. If the company ownership changes, or the company goes into liquidation, you might have to talk to the licensor about terms, and you may have to pay a fee in order to continue using the source code.

When ESCOM acquired certain Commodore assets, paperwork and documentation on software licenses, even contracts with publishing companies such as those who made the RKMs and the AmigaDOS manual, were lost. They stayed lost, or became more lost, when Gateway 2000 acquired patents and stuff from ESCOM.
« Last Edit: August 28, 2014, 08:37:18 PM by olsen »
 

Offline olsen

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #139 on: August 28, 2014, 08:31:24 PM »
Quote from: itix;771914
CopyMemQuick() is a perfect example of redundant micro-optimization. Assuming the routines are done right, CopyMemQuick() is never faster than CopyMem(). Saving six asm instructions on each CopyMemQuick() call is not worth it.

You're assuming that CopyMemQuick() was always supposed to leverage an unrolled movem.l loop.

There are notes in the old autodocs which hint that somebody was dreaming about having hardware-accelerated data copying operations available at some point. I take it that this type of hardware actually did exist for 68k Sun workstations, so this wasn't completely unrealistic.

Given how cheap Commodore was, the ambition never resulted in such hardware showing up, though.

I'm speculating: had this hardware existed for the Amiga, it would have hooked into CopyMemQuick().
 

Offline itix

  • Hero Member
  • *****
  • Join Date: Oct 2002
  • Posts: 2380
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #140 on: August 28, 2014, 09:02:42 PM »
Quote from: olsen;771920
You're assuming that CopyMemQuick() was always supposed to leverage an unrolled movem.l loop.


No, I am assuming CopyMem() is highly optimized just like CopyMemQuick(). If CopyMemQuick() is using a better memory copy routine than its CopyMem() counterpart, then there is something seriously wrong.

Quote

There are notes in the old autodocs which hint that somebody was dreaming about having hardware-accelerated data copying operations available at some point. I take it that this type of hardware actually did exist for 68k Sun workstations, so this wasn't completely unrealistic.

Given how cheap Commodore was, the ambition never resulted in such hardware showing up, though.

I'm speculating: had this hardware existed for the Amiga, it would have hooked into CopyMemQuick().


I always assumed they meant the blitter. DMAing between chip and fast RAM could have been nice.
My Amigas: A500, Mac Mini and PowerBook
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #141 on: August 28, 2014, 09:18:57 PM »
Quote from: Thomas Richter;771905
But look, this is exactly the difference between a "coder" and a "software engineer".
Bollocks.

Quote from: Thomas Richter;771905
In essence, the "register-ping-pong" probably costs something like 20-30 cycles per function per call, in total probably around 400 cycles.
I understand that, it's just impossible to look at code like that, and not think how crappy that is, and comment on it.

Quote from: Thomas Richter;771905
Consider that in perspective with the amount of cycles you can save by *not* copying a cliprect by an improved algorithm.
Yeah, the right algorithm. You told me this before, and it was just as obvious then as it is now. You assume again that I don't understand that, while it's an open door of obviousness.
 

Offline kamelito

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #142 on: August 28, 2014, 09:33:09 PM »
Quote from: Thomas Richter;771905
But look, this is exactly the difference between a "coder" and a "software engineer". In essence, the "register-ping-pong" probably costs something like 20-30 cycles per function per call, in total probably around 400 cycles. Consider that in perspective with the number of cycles you can save by *not* copying a cliprect thanks to an improved algorithm.


I've seen a benchmark of the legacy layers.library vs yours, pretty amazing.
I'm wondering why that library was not optimized like yours back then.
Are all Amiga libraries being profiled and optimized like the layers one?
Is the current GCC any good at doing optimizations for PowerPC after Apple's "departure"?
Kamelito
 

Offline kamelito

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #143 on: August 28, 2014, 09:50:49 PM »
"You can then either use the blitter (if you can), or use the CPU. Since that's the dominant operation, you should invest your time there to get this specific part fast."

I suppose that in some cases you could also use the blitter and CPU in parallel. I've seen this done in some demos (read: demoscene), for example to clear the screen.

"If you have to move window on the screen, you have to copy the data - no way around it."

Isn't it possible with the copper to have a copperlist for each window, so that you avoid copying data and just change some values in the copperlist of the moved window? (Of course you're limited by the copper resolution for the window position/width/height, depending on the screen mode used.)

Kamelito
 

Offline buzz

  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 612
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #144 on: August 29, 2014, 12:16:11 AM »
The arrogant elitism in this thread is really irritating. As though a few think they are the only ones who can develop professionally and know better than everyone else. It links in directly with the out of date attitudes to source too imho. As a retro computing scene, the Amiga scene really stinks sometimes.
 

Offline ferrellsl

Re: New improved intuition.library version from the Kickstart 3.1
« Reply #145 on: August 29, 2014, 12:29:37 AM »
Quote from: buzz;771934
The arrogant elitism in this thread is really irritating. As though a few think they are the only ones who can develop professionally and know better than everyone else. It links in directly with the out of date attitudes to source too imho. As a retro computing scene, the Amiga scene really stinks sometimes.

Agreed.  And it's a sad commentary about those whom you mention because their egos seem tied to or intertwined with a platform that's been dead for years.  It makes me wonder how they act in person.
 

guest11527

  • Guest
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #146 on: August 29, 2014, 02:56:02 AM »
Quote from: buzz;771934
The arrogant elitism in this thread is really irritating. As though a few think they are the only ones who can develop professionally and know better than everyone else. It links in directly with the out of date attitudes to source too imho. As a retro computing scene, the Amiga scene really stinks sometimes.

None of the knowledge presented here is new, nor my own invention or finding, or in any way over everyone's head. It is really simple, basic stuff, depending on really simple basic math. Actually, one of the first examples that is usually taught is that a super-fast super-optimized bubble-sort will still be slower than a quick-sort in BASIC (or perl, or python) if there is enough data to look at. The "kool koderz" will complain about the slowness of a high-level language; the software engineer will instead understand that the high-level code is not only faster, but also easier to maintain, update and bugfix than the super-optimized algorithm. It's a different perspective you take once you have to work "for a living" and not just for fun. If there's still time to optimize that, good for you.
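
The bubble-sort versus quick-sort point can be checked by counting work instead of timing it. The following is a small Python sketch under stated assumptions: both function names are made up for this illustration, the quick-sort variant (middle pivot, list partitioning) is just one simple choice, and each element's three-way check against the pivot is counted as a single comparison.

```python
def bubble_sort_comparisons(values):
    """Plain bubble sort; returns (sorted list, number of comparisons)."""
    items = list(values)
    count = 0
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            count += 1
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items, count


def quick_sort_comparisons(values):
    """Simple quick-sort; returns (sorted list, number of pivot comparisons)."""
    count = 0

    def sort(items):
        nonlocal count
        if len(items) <= 1:
            return items
        pivot = items[len(items) // 2]
        less, equal, greater = [], [], []
        for x in items:
            count += 1  # one comparison against the pivot per element
            if x < pivot:
                less.append(x)
            elif x == pivot:
                equal.append(x)
            else:
                greater.append(x)
        return sort(less) + equal + sort(greater)

    return sort(list(values)), count


data = list(range(200, 0, -1))  # 200 items in reverse order
_, bubble_count = bubble_sort_comparisons(data)
_, quick_count = quick_sort_comparisons(data)
print("bubble sort comparisons:", bubble_count)
print("quick sort comparisons: ", quick_count)
```

The bubble sort always performs n(n-1)/2 comparisons, so the gap widens quickly as the data grows; no amount of per-comparison tuning makes up for that once n is large enough.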

Yet, at the same time, I see people here arguing about such elementary truths, and that hurts a lot. It's a matter of education everybody here can simply fix himself, just by making the experiment and gaining the experience. I understand that some people here probably haven't, but what I do not understand is the resistance to actually taking this as a serious suggestion, not as a matter of arrogance, but as a matter of life experience one probably still has to gain. There is still time to argue afterwards. Actually, I see it as pretty much a matter of stubbornness to argue about elementary truths - something that really drives me mad.

If that makes you feel any better: Yes, I started with assembler on the Amiga. However, sooner or later, as your projects grow in size, you'll see that you'll hit a "brick wall" where you create code that requires an amount of maintenance to update or improve that you cannot manage anymore. Especially in the "improve" part you'll learn that this often goes beyond micro-optimizations - you'll often find yourself having to turn entire code parts around because your architecture did not work as intended.

Once again: Intuition is a very high-level library, with very few elementary calls, made relatively rarely. It really does not matter much how fast or slow that part is. In a sense, you're looking at the wrong end of the picture and should instead try to understand the high-level code rather than the low-level code. I know that's all you see, but I can't help it. Yes, there are probably a couple of things that are wrong in intuition, but nothing seriously so. It's especially not the "register ping-pong" that's wrong. It's a minor annoyance that hurts nobody and that could be avoided by simply recompiling the code, but without giving a measurable benefit. Emphasis on "measurable". Why do you care about this in the first place, then?
 

guest11527

  • Guest
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #147 on: August 29, 2014, 03:08:53 AM »
Quote from: kamelito;771926
I'm wondering why that library was not optimized like yours back then.
Are all Amiga libraries being profiled and optimized like the layers one?
The fact is that layers *was* actually optimized; that's probably the thing one should understand. It was just optimized for a different goal. That's probably one of the surprises one has to learn, and one of the "take home" lessons from all this discussion.

Yes, surprising, isn't it?

The point is: At the time layers was written, the CPU was slow, the blitter was fast, and chipmem was an expensive resource. Thus, layers was designed with these goals in mind: Use as little buffer memory as possible, possibly at the expense of additional data manipulations to be made by the fast blitter. This resulted in an algorithm that uses the double-xor trick to avoid an additional buffer when copying data between screen and off-screen buffer. Amongst many other decisions, of course. It was the right algorithm.

Nowadays (or rather, ten years ago) things changed: The CPU became fast, the blitter slow, graphics memory was not even reachable by the blitter, so the CPU had to emulate the blitter, and buffer memory became cheap. Thus, it requires a completely different algorithm to make this fast. Avoid double-xor, use extra buffers if necessary to avoid any extra copy operation that would slow down the CPU.
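
The post does not spell out the double-xor trick. One common way to exchange two equally sized regions without a third buffer is the XOR swap; whether layers.library does exactly this is an assumption. The Python sketch below, with hypothetical function names, contrasts it with an exchange through a temporary buffer to show the memory-versus-work trade-off described above.

```python
def xor_swap(a, b):
    """Exchange two equal-length bytearrays in place, using no extra buffer.

    Costs three XOR passes over the data (assumed illustration of the
    buffer-free approach, not the actual layers.library code).
    """
    if len(a) != len(b):
        raise ValueError("regions must have the same size")
    if a is b:
        return  # XOR-swapping a region with itself would zero it
    for i in range(len(a)):
        a[i] ^= b[i]  # a now holds a0 ^ b0
        b[i] ^= a[i]  # b now holds a0
        a[i] ^= b[i]  # a now holds b0


def buffered_swap(a, b):
    """Exchange two equal-length bytearrays via a temporary copy.

    Needs extra memory the size of one region, but only plain copies.
    """
    if len(a) != len(b):
        raise ValueError("regions must have the same size")
    saved = bytes(a)  # the extra buffer
    a[:] = b
    b[:] = saved


screen = bytearray(b"window A")
backing = bytearray(b"window B")
xor_swap(screen, backing)
print(screen, backing)
```

When buffer memory is scarce and the XOR passes are cheap, the first variant is the sensible choice; when memory is plentiful and every pass over the data is expensive, the second one wins.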

Nobody at CBM back then was overly stupid in creating the slice & dice operation in layers. They just optimized for the "wrong" goal, from today's perspective. The *good* part about layers is that it was written in C, not assembly, so one could take the algorithm apart and replace it with something equivalent, optimized for a different goal.

That's "software engineering", actually. Consider that you're shooting at a moving target, and that your target might move too fast for a low-level approach to your problem to be feasible. And within ten years, the hardware was apparently already moving too fast for CBM to take the opportunity to optimize...
Quote from: kamelito;771926
Is the current GCC any good at doing optimizations for PowerPC after Apple's "departure"?

I seriously don't know. I don't have enough experience with the PPC to judge. Yes, I did a bit of programming on PPC, but that's not sufficient to make any statements on the compiler quality.
 

guest11527

  • Guest
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #148 on: August 29, 2014, 03:22:29 AM »
Quote from: ferrellsl;771935
Agreed.  And it's a sad commentary about those whom you mention because their egos seem tied to or intertwined with a platform that's been dead for years.  It makes me wonder how they act in person.

Quite nicely, actually. Anyhow, it's not the dead platform I care so much about. I care about engineering practice, or rather, its apparent absence in the minds of some people at times. AmigaOS engineering is nothing more than a fairly general example of a wider class of problems you encounter when constructing larger projects.

Yes, I'm really serious about "coders" and "engineers". There is a difference, and you should at least try to understand why I'm stressing this difference so much, and what the difference is about. There is nothing wrong with starting as a "coder", but you should try to conquer new worlds and understand more perspectives as you grow older.
 

Offline Cosmos Amiga (Topic starter)

  • Hero Member
  • *****
  • Join Date: Jan 2007
  • Posts: 954
    • http://leblogdecosmos.blogspot.com
Re: New improved intuition.library version from the Kickstart 3.1
« Reply #149 on: August 29, 2014, 05:18:59 AM »
New beta 7, maybe today...

I removed all 68000/010 support: for me, it takes a lot of cycles for nothing...

There is a big difference between 000/010 and 020+: I cannot write good code anymore with all the 000/010 limitations...



:)