The basic code generation was ok, but they did some weird stuff like branching into CMP.L #imm,Dn instructions for little if any advantage.
That's what the peephole optimizer does to avoid a branch around an instruction: the instruction that would otherwise have to be branched over is hidden in the 32-bit immediate data of a cmp.l #imm,Dn, so the fall-through path executes a harmless compare, while the other path branches right into the immediate and executes the hidden instruction. That's probably not an advantage on the 060 as it likely invalidates the branch-prediction cache, but it was at least a common optimization on even older microprocessors, like the 6502 (yes, really), where the BIT instruction served a similar purpose for avoiding "short branches".
Also, there are way too many byte and word operations for the 68060, which works best with longword operations.
That rather depends on the source code. If the source uses a WORD, then what can the compiler do? There's an interesting interaction with the C language here that I only mention for the curious (it doesn't make the compiler better or worse, it just gives the optimizer a harder time), and that is integer promotion. As soon as an operand narrower than int takes part in an operation, for example with an integer literal (or any wider data type in general), it is first promoted to int. Thus, even something trivial like
short x = 2;
short y = x+1;
requires (by the language rules) first widening x to an int, then adding one, then casting the result back down to a short. This is a trivial example where the optimizer will likely remove all the cruft of widening and narrowing again, but there are more complicated examples like
if (x+1 == y)
which first widens x on the left, adds the integer one, then widens y as well, and then a full 32-bit comparison has to be made. And that's of course not the same as just adding one to x in word size, since it differs in one single corner case (the wrap-around at the word boundary), so all the widening cannot be optimized away. If I write that in assembler, and I know from other constraints that a wrap-around cannot happen (or I don't want to bother about it for other reasons, who knows..), then I can of course do much better with a single addq.w #1,dx followed by a cmp.w. But that's strictly speaking not correct, and not the same comparison.
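To make that corner case concrete, here is a small stand-alone demonstration. This is my own sketch, not anything from the layers sources; it assumes the usual 32-bit int and a two's-complement machine like the 68k for the 16-bit wrap-around.

#include <stdio.h>

int main(void)
{
    short x = 32767;   /* largest value a 16-bit short can hold */
    short y = -32768;

    /* What the C language prescribes: x and y are promoted to int,
       so x+1 is 32768 and the comparison is false. */
    if (x + 1 == y)
        printf("promoted comparison: equal\n");
    else
        printf("promoted comparison: not equal\n");

    /* What a bare addq.w #1 / cmp.w would compute: the sum wraps around
       to -32768 in 16 bits, so the comparison is true. The cast back to
       short models that truncation (implementation-defined in C, but a
       plain wrap-around on the 68k). */
    if ((short)(x + 1) == y)
        printf("word-size comparison: equal\n");
    else
        printf("word-size comparison: not equal\n");

    return 0;
}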
In the end, it doesn't really matter much, unless you're in a tight loop somewhere in a compute-intensive algorithm, and then you would probably look more closely at what is actually happening there.
So, long story short: Some of the "seemingly useless" instructions are really there to follow the C language specs.
Looking at other compilers' code generation is a good start. It's hard to imagine that the Green Hills compiler was once better after looking at the intuition.library disaster.
It's not really a disaster. Green Hills didn't have registerized parameters, thus you see a lot of register ping-pong, but that's probably the only bad thing about it. Besides, it isn't heavy-duty code to begin with.
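To illustrate what registerized parameters buy you, here is a hedged sketch with made-up function names, assuming the SAS/C keywords for register arguments; it is not taken from the intuition or layers sources.

#include <exec/types.h>

/* With registerized parameters, the compiler picks the arguments up directly
   in the registers the Amiga library interface prescribes, so no stub is
   needed between the library vector and the C code. */
LONG __saveds __asm MyLibCall(register __a0 APTR object,
                              register __d0 LONG x,
                              register __d1 LONG y);

/* A compiler without such support only sees a plain stack-based C function,
   so a small assembler stub has to move the register arguments onto the
   stack and the result back again; that shuffling is the "register
   ping-pong" mentioned above. */
LONG MyLibCallStackBased(APTR object, LONG x, LONG y);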
Don't you mean the LIBFUNC macro causes the function to use A4 like a small data pointer, loading it from the library base (GfxBase) in A6? What does the LIBFUNC macro look like?
LIBFUNC for SAS/C is just __saveds, i.e. it requires the compiler to reload its NEAR data pointer, i.e. a4. Then there is another magic compiler switch that tells the compiler that the near data pointer actually comes from A6 plus an offset, where the offset depends on the size of the library base and on whether there is any other magic that requires an offset from the library base, all to be determined at link time.
Thus, what the compiler essentially generates is a
lea NEAR(a6),a4
for __saveds in library code. Since NEAR is unknown until link time, the instruction stays in, even when the linker replaces NEAR with zero. That is the reason why you see some "seemingly useless" "lea 0(a6),a4" in layers: the compiler could not possibly figure out that NEAR=0 here, and at link time it is too late to remove the instruction.
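To answer the earlier question what the macro looks like: in essence it can be as little as the following sketch. The function name is made up and any register-parameter decoration is left out; this is only my guess at the shape, not the actual layers source.

#include <exec/types.h>

/* Assumption: LIBFUNC simply expands to the SAS/C __saveds keyword, which
   forces the compiler to re-establish its near data pointer (a4) on entry. */
#define LIBFUNC __saveds

/* Hypothetical library entry point. With the link-time NEAR offset described
   above, the a4 reload in its prologue is exactly the "lea NEAR(a6),a4". */
LONG LIBFUNC MyLayersFunction(LONG arg)
{
    return arg;   /* real code would access near (a4-relative) data here */
}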
The C language did not specify much back then, so every compiler had its own customized features and pragmas. We have better C standards now with C99, which should be used where possible over custom compiler features.
Yes, but they don't have anything to say about library generation. A "shared library" is nothing C (or C99) has anything to say about, let alone an Amiga shared library. So in one way or another it requires some compiler support to build one, even nowadays. SAS/C offered a pretty good infrastructure for that, which is the reason why it is still what I use today. There's nothing in C99 to help you with that. The only other alternative would be to use a couple of assembler stubs (aka "register ping-pong"), which is what happened for intuition. You didn't like that either. (-:
It's always a pain to convert the old stuff though. You should see the GCCisms that the AROS 68k build system uses and that would need to be updated to compile with vbcc. It makes these problems look easy.
I have no doubt about that, but that's pretty much the reason why I'm reluctant to switch compilers. These are code-generation problems I want to stay away from. I would reconsider if there were the potential for a dramatic speedup or a dramatic size reduction from switching, but that doesn't look too likely.