
Author Topic: New Replacement Workbench 3.1 Disk Sets www.amigakit.com


Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« on: November 14, 2014, 07:22:23 AM »
Quote from: olsen;777237
Actually, if one were to build a new ROM there could be sufficient free space to keep all components in it. I tested this with a fully native build that utilizes SAS/C for almost all components. That change allows for considerable space savings, e.g. the A4000T ROM has enough room for the SCSI/ATA scsi.device flavours and workbench.library to fit. Taken a step further, the disk-based icon.library that was part of AmigaOS 3.5/3.9 could fit, too. The same could be done for the A1200 V40 ROM which has almost no free space left.

Please leave the icon.library out of Kickstart. Everybody is using PeterK's icon.library because it's much faster, smaller and supports most Amiga icon types.

Quote from: olsen;777276
The A4000T ROM has even more crammed into it, which includes the ATA scsi.device, the really large A4091 scsi.device (which itself includes the bootstrap script for the NCR SCSI controller) and the large AA graphics.library. The combination of these components left no room for workbench.library, which was moved out to disk.

V40 was built almost exclusively using Lattice 'C' 5.04, which did not feature the more refined and effective optimization functionality available later in SAS/C. Commodore did not use SAS/C for production code, nor, for that matter, 68020 code generation, because of code generation maturity issues. Because of this, all the compiled 'C' code was targeted at the plain 68000, and no 68020-specific optimizations could have helped to reduce the code size for the A1200/A3000/A4000/A4000T.

SAS/C "refined and effective optimizations"? You have to be kidding. The icon.library was compiled with SAS/C and PeterK's optimized version is now about 35% smaller with much added functionality (my record library reduction is 43% but that was an early version of GCC/EGCS which I could take to half size with some effort). I would say that SAS/C is better for size than speed. I have a working and well tested workbench.library which is 191168 bytes without any hand optimizations from me (it has bug fixes applied). I bet I could optimize away another 10kB or so with basic hand optimization (getting rid of that slow SAS/C copymem routine would probably save 500 bytes alone). Granted the code quality is nowhere near as bad as the intuiton.library. It might be worth trying vbcc for small executable sizes. Vbcc's features:

+ best 68k peephole optimizing assembler ever in vasm
+ uses optimized inlined assembler functions (the default)
+ sophisticated optimizations that exceed SAS/C (some don't seem to work)
+ cross-assembler for fast compiles on faster computers
+ good Amiga and 68k features and support (Amiga hunk output, IEEE math libraries for fp)
+ actively maintained by knowledgeable and helpful people who know the 68k and Amiga
+ source code available and compiles on a 68k Amiga with few dependencies
+ free for Amiga use
+ easy Amiga installation
+ good c99 support
? some of the link code is highly optimized (hit and miss)
- the 68k backend is average at best
- no 68k instruction scheduler
- lacking tools although many GCC and SAS/C tools are compatible (CPR debugger)
- slow at compiling
- memory hungry
- no C++ support

There should be a much improved version of vbcc out in the next few weeks. SAS/C is a dead-end, last-decade compiler. How about giving the new version of vbcc a try?
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #1 on: November 14, 2014, 09:01:43 AM »
Quote from: Thomas Richter;777321

Back then, SAS/C had a good optimizer compared to the other compiler(s) that were available, namely Manx (Aztec). Gcc had an even better optimizer, but no (or insufficient) native support for the Amiga toolchain, so it was often not an option to use.


That was, what, 20 years ago now?

Quote from: Thomas Richter;777321

I really haven't tried vbcc, so I cannot judge, but a lot of the compiles here depend on some SAS/C magic, for example layers depends on SAS/C being able to use the library base as the "data segment" of the code, so "global variables" (that's not exactly a C term, I know) appear in the library base. I'm unclear whether vbcc can do that, but gcc couldn't back then.


Vbcc uses the standard register conventions:

A7=stack pointer
A6=library base
A5=frame pointer (unused unless -use-framepointer)
A4=small data pointer (unused unless -sd)

Both A5 and A4 are free by default so there should be no conflicts. There is a function attribute called __saveds which loads A4, or a function called geta4() which can be placed at the start of a function using small data when not compiled for it. If A4 is used as the pointer to your library base, it may not take much to convert to compiling with vbcc. What SAS/C attributes and functions are used to set up and access the library base in this way?

Vbcc supports these attributes:
 __far, __near, __chip, __saveds, __interrupt, __amigainterrupt, __stdargs, __section
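
For comparison, here is a minimal sketch of what a library function using __saveds could look like under vbcc (a hypothetical example; it assumes vbcc's standard proto headers and leaves out how GfxBase and the A4 data base get initialized):

Code: [Select]
  #include <proto/graphics.h>

  struct Library *GfxBase;   /* ends up in the small data section */

  void __saveds foo(struct RastPort *rp)
  {
      /* __saveds reloads A4 on entry, so the small-data access to
         GfxBase resolves even when foo() is called from outside */
      SetAPen(rp, 0);
  }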

Quote from: Thomas Richter;777321

So it's much more a question of the overall toolchain than really some particular feature of the optimizer. SAS/C, even the latest version, has some bugs in its optimizer, too. Try to use mathieeedoubbas mathematics and see it spill the lower 32 bits of the IEEE double floats from time to time... )-:


Vbcc has bugs too but they are getting fixed :).
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #2 on: November 14, 2014, 12:36:42 PM »
Quote from: olsen;777325
Well, I'm not using it ;) So far I'm reasonably satisfied with the icon.library which I wrote for the OS 3.5 update. If there is a problem badly in need of a solution, it's in how workbench.library and icon.library interact for directory scanning and display. It just does not scale: larger icon files, more files, no matter what, the performance and responsiveness quickly goes down the drain.

I can't comment on the size and functionality of the replacement icon.library, as I have never used it. I only spent a couple of months rewriting the icon.library from scratch, integrating NewIcons support, colour icon support, etc., making it work better with workbench.library, building new APIs, etc. The focus was not on optimizations for size or speed, because icon loading is pretty much restricted by what the file system can do (and that isn't much). My focus was more on making the whole thing as robust as I could, and on opening up the API.

I appreciate that you are proud of your own work, and I'm not saying it's bad, but PeterK's icon.library really is significantly faster, and it supports PNG and AmigaOS 4 icons as well as everything it did before while shrinking by more than a third. There is good and then there is amazing ;).

Quote from: olsen;777325
Hey, I wrote "*more* refined and effective", and the reference was Lattice 'C' 5.04. SAS/C 6 was definitely an improvement considering the quality of the code optimization. However, it did take a couple of years to mature (1995-1996), by which time Commodore could no longer put it to good use.

The guys that did SAS/C were professional, fixing a lot of bugs and giving a lot of Amiga support. The basic code generation was OK, but they did some weird stuff like branching into CMP.L #imm,Dn instructions for little if any advantage, and they loved the memory indirect addressing modes like ([d16,An],od.w), which were used more with later versions (IBrowse has 1968 uses). These didn't hurt the 68020 code as much as the 68040 and 68060, where instruction scheduling is sorely needed. There are way too many byte and word operations for the 68060, which is also most efficient with longword operations. The direct FPU code generation is poor for the 6888x and worse for the 68040+. It should be possible to generate good quality code for the 68020-68060, excluding the 16 bit 68000.
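
For anyone who hasn't seen the branch-into-CMP.L trick, here is a hand-written sketch (labels hypothetical): on the fall-through path the CPU executes the CMPI.L opcode word, which swallows the next four bytes as its immediate operand, so the alternative path is skipped without a branch. The catch is that the condition codes get clobbered and the hidden code must be exactly four bytes.

Code: [Select]
  ; conventional form: branch around the alternative path
   moveq #0,d0
   bra.b .done
.alt:
   move.w #1,d0   ; 4 bytes, reached via a conditional branch to .alt
.done:

  ; branch-into-CMP.L form: no BRA needed
   moveq #0,d0
   dc.w $0c80     ; opcode word of CMPI.L #imm,D0; on fall-through the
                  ; next 4 bytes become its immediate and never execute
.alt:
   move.w #1,d0   ; only executes when entered at .alt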

Quote from: olsen;777325
I was told that Commodore was a driving force in getting SAS, Inc. to improve the code generator and the optimizer. They would submit samples of code as produced by the Green Hills compiler (obviously, they could not share compiler source code) and ask the compiler developers at SAS to replicate the results. Step by step the compiler improved.

Looking at other compilers' code generation is a good start. It's hard to imagine that the Green Hills compiler was once better after looking at the intuition.library disaster. The Green Hills compiler is still around and pretty well respected in the embedded market for its optimizing capabilities. They still have a ColdFire backend, but I couldn't tell whether they had dropped 68k support.

Quote from: olsen;777325
Colour me curious. Where do I start?

The lack of an interactive source debugger is something of a dealbreaker, though. I'd hate to go back to where I was back in 1987. Life's too short for peppering your code with printf()s and assert()s, rerunning it, watching it crash, modifying it and rerunning it all over again. Now CodeProbe may not be much fun, but it's not as big a productivity sink as "old school" debugging is.

Frank Wille's vbcc site is here:

http://sun.hasenbraten.de/vbcc/

Unfortunately, the version there is pretty old now. There should be a new version available anytime (surely before the end of the year) with a huge number of bug fixes and improvements. You can always e-mail Frank for the newest sources also.

The newest version of vbcc for the Amiga 68k target generates Amiga symbols and debug information (with -g) in Amiga Hunk format executables, which is compatible with CodeProbe and BDebug. CodeProbe is a good debugger but I prefer BDebug from the Barfly package.

http://aminet.net/dev/asm/BarflyDisk2_00.lha

BDebug is another great developer tool you should try if you have not.
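
For reference, a sketch of compiling with debug information via vbcc's vc driver (assuming the usual aos68k config file name):

Code: [Select]
  vc +aos68k -g -o test test.c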

Quote from: Thomas Richter;777326
Sorry, that's not exactly what the problem is. The trick that is used here is the following: If you have a file like this:

struct Library *GfxBase;

void LIBFUNC foo(struct RastPort *rp)
{
 SetAPen(rp,0);
}

then, with a properly defined LIBFUNC macro, SAS/C will place "GfxBase" as an object into the library you are creating (*not* the data segment), will create a library entry for "foo" and will use a6 to load a4 with the "near" data pointer, i.e. it will use the library base as "near" data. SAS/C can also create the fd file for you, or place the functions at specific offsets in the library.

Don't you mean the LIBFUNC macro causes the function to use A4 like a small data pointer to load A6 from GfxBase in the library base? What does the LIBFUNC macro look like?

Quote from: Thomas Richter;777326
As said, SAS/C provides a lot of system-specific "magic" to support AmigaOs development and a bit of the toolchain depends on this magic.

There are a couple of other "magical" support features, so it does require some work to move from one compiler to another for system components. For user programs, this is much less of an issue.

The C language did not specify much back then, so every compiler had its own customized features and pragmas. We have better C standards with C99 now that should be used where possible over custom compiler features. It's always a pain to convert the old stuff though. You should see the GCCisms that the AROS 68k build system uses which would need to be updated to compile with vbcc. It makes these problems look easy.
« Last Edit: November 14, 2014, 12:39:18 PM by matthey »
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #3 on: November 14, 2014, 04:46:37 PM »
Quote from: Thomas Richter;777348
That's what the peephole optimizer does, to avoid a branch around an instruction. Instead, this instruction is hiding in the data of the CMP.L # instruction. That's probably not an advantage on the 060 as it probably invalidates the branch-prediction cache, but it was at least a common optimization on even older microprocessors, like the 6502 (yes, really) where the BIT instruction served a similar purpose for avoiding "short branches".


I don't see other compilers like GCC or vbcc doing this trick. The 68020+ can hide code in TRAPcc.W and TRAPcc.L instructions (TPF in ColdFire) as well to avoid a branch, sometimes with an if-then-else (actually recommended and described in the ColdFire PRM). This technique can save a couple of cycles on the 68040 with its large instruction fetch, but it's not worth it on the 68020 and is usually slower on the 68060. It's not very friendly for debugging either.
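
A sketch of the TRAPcc variant (hypothetical labels; TRAPF.W never traps, it just carries a one-word operand that gets skipped on fall-through):

Code: [Select]
   tst.l d1
   beq.b .else
   moveq #1,d0
   dc.w $51fa     ; TRAPF.W opcode; eats the next word on fall-through
.else:
   moveq #2,d0    ; real code via the BEQ, operand data on fall-through
  ; result: d0 = (d1 == 0) ? 2 : 1, with no BRA around the else path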

Quote from: Thomas Richter;777348

That rather depends on the source code. If the source uses a WORD, then what can the compiler do? There's an interesting interaction with the C language here I only mention for the curious (it doesn't make the compiler better or worse, it is just a feature that makes life harder for the optimizer) and that is integer promotion. As soon as you have an operation with an integer literal (or any wider data type in general), it is first promoted to int. Thus, even something trivial like

...

In the end, it doesn't really matter much, unless you're in a tight loop somewhere in a computing-intense algorithm, and then you would probably look closer on what is actually happening there.


Modern compilers will commonly promote shorter integer sizes to the register size if there isn't too much overhead. Many superscalar processors like the 68060 can only forward full register results. Unfortunately, the 68060 doesn't have the instructions it needs to make this happen efficiently, like the MVS/MVZ ColdFire instructions and the MOVSX/MOVZX x86 instructions. I've been trying to get the ColdFire instructions (as encoded on the CF) into a new 68k-like ISA for the new fpga processors coming out, where most recently I have referred to them as SXT/ZXT (SXT replacing the EXT name, which is still supported).

http://www.heywheel.com/matthey/Amiga/68kF_PRM.pdf
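
A sketch of the difference (ColdFire MVS shown; the 68060 sequence is an assumption of typical compiler output):

Code: [Select]
  ; ColdFire (ISA_B): one instruction, full register write
   mvs.w (a0),d0   ; load word and sign-extend to 32 bits

  ; plain 68060: two instructions, and the MOVE.W is a partial
  ; register write that hinders result forwarding
   move.w (a0),d0
   ext.l d0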

Promoting integers to the register size simplifies the backend also. Vbcc does this a lot and generates better 68060 code as a result with some cost to 68020 code speed and size (the 68040 handles the big instructions but is slowed some by extra instructions). GCC is somewhere in the middle between vbcc and SAS/C, trying to be smart about whether to promote registers or not.

Quote from: Thomas Richter;777348

It's not really a disaster. Green Hills didn't have registerized parameters, thus you see a lot of register ping-pong, but that's probably the only bad thing about it. Besides, it isn't heavy duty code to begin with.


It may not be a processor intensive library, but it's one that could have a significant percentage of its code optimized away. Register ping-pong usually isn't too bad, but it can commonly result in register spilling, which is bad. This compiler did a poor job of peephole optimizing also.

Quote from: Thomas Richter;777348

LIBFUNC for SAS/C is just __saveds, i.e. it requires the compiler to reload its NEAR data pointer, i.e. a4. Then there is another magic compiler switch that tells the compiler that the near data pointer actually comes from A6 plus an offset, where the offset depends on the size of the library base and whether there is any other magic that requires an offset from the library, to be determined at link time.

Thus, what the compiler essentially generates is a

lea NEAR(a6),a4

for __saveds in library code. Since NEAR is unknown until link time, the instruction remains in, even when the linker replaces the NEAR with zero. Which is the reason why you see some seemingly useless "lea 0(a6),a4" in layers, because the compiler could not possibly figure out that here NEAR=0, and at link time it is too late to remove that.


OK, so vbcc already has the __saveds attribute. It just needs a way to set the global data pointer in A4 to something besides what small data uses. It would be nice to add support for resident/pure executables that use global variables (in allocated memory) as SAS/C supports, since the mechanism is similar. I need to do some further research and investigation (I have the SAS/C manuals, which are good). It may help if you could show how the custom data pointer is set up.

The "lea 0(a6),a4" may be difficult for the compiler to optimize but it's not a problem for a peephole optimizing assembler like vasm. There are only a few link code related optimizations that vasm can't make (where the section is unknown before linking) which are usually branches (JSR->BSR and JMP->BRA).
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #4 on: November 15, 2014, 08:38:49 PM »
Quote from: Thomas Richter;777374
It's basically set at the beginning of the __MERGED segment, or, if that grows larger than 32K, right in the middle so data can be accessed with positive or negative offsets. I don't think the manual states that, at least I don't remember having seen it there. The trick for library code is that it is reloaded relative to the library base, and not relative to the __MERGED segment, so the data is allocated by exec when the library is created, allowing the lib to be placed in ROM.

It looks like vlink for the 68k Amiga will use the value 0x7ffe (32766) for the small data __MERGED section offset unless overridden.
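
To illustrate why a mid-section bias works: the d16(An) displacement is a signed 16-bit value spanning -32768 to +32767, so pointing A4 0x7ffe bytes into the section brings nearly 64K within reach (a sketch with hypothetical data):

Code: [Select]
  ; with a4 = __MERGED + $7ffe:
   move.l -32766(a4),d0   ; reaches the first longword of __MERGED
   move.l 32764(a4),d1    ; reaches data almost 64K later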

Quote from: Thomas Richter;777374
No, look, the lea __NEAR(a4),a6 is never seen by the assembler. The __NEAR section is generated by the linker (or not generated at all, in case of the library), and SLink is smart about which point of the segment a4 will point to, depending on how large the data is. Thus, in the end, __NEAR can be zero, or 0x8000, or some other value, if constant data is moved into the __MERGED segment that is addressed by absolute addresses rather than relative to a4. Thus, it is only the linker that knows what the symbol will be, and it is the linker that puts it in. For layers, the library base is so small that it is just put at the beginning and __NEAR remains zero, but when the linker puts in the zero offset, it is too late to patch up the code for a smaller move instruction, as the linker would have to resolve all relative branches around such lea's. Which, of course, it cannot do since the references are simply lost at this point. The best it could do is to replace it by a move and a NOP, but that's not exactly an improvement either (in fact, it spills the pipeline).

I see what you mean about the lea __NEAR(a4),a6 optimization now. The symbol isn't evaluated until link time, after the assembler. Vbcc does have cross-module optimizations (high optimization levels have bugs so I don't generally use more than -O2) which could take care of the JSR->BSR and JMP->BRA optimizations I talked about earlier, but it's highly doubtful it would be able to take care of symbols that are defined for the linker. The assembly vbcc's 68k backend generates for loading the small data base looks like this.

Code: [Select]
  xref t_LinkerDB
   lea t_LinkerDB,a4

That's not going to work, as the library base is dynamically allocated, making it impossible to put a label (t_LinkerDB) there. I believe it would need a "MOVE.L custom_DB,a4" or similar. So much for my hopes of being able to use most of the small data handling. It looks like it would need some custom work in the backend to make it happen, and it's tricky. Here are links to the vbcc 68k backend (machine.c) and startup.asm so you can have a look.

http://www.heywheel.com/matthey/Amiga/machine.c
http://www.heywheel.com/matthey/Amiga/startup.asm

Frank Wille can be e-mailed for the latest version of the vbcc sources.

Quote from: wawrzon;777418
if you think vbcc is lacking in comparison with sas/c it might be worth talking to phx personally. perhaps there is room for improvement. i have a feeling he is open to suggestions and it would be great to have an up to date compiler for amiga-m68k that is actively maintained and aware of system requirements, which apparently is not the simplest case when going with gcc.

If we could figure out what to do, we could make the changes and do the testing but Volker would need to look it over and ok it. The backend is complex enough that it's easy to have unintended consequences with changes.

Quote from: olsen;777487
This was an optimizing 'C' compiler intended for use on Sun 2 / Sun 3 workstations, which was adapted so that it emitted Amiga compatible 68k assembly language source code (as an intermediate language). That source code was then translated using a 'C' language precursor version of the ancient "assem" assembler into object code format suitable for linking. Mind you, this was not an optimizing assembler, just a plain translator. All optimizations happened strictly within the 'C' compiler.

What exactly rubs you the wrong way with Intuition?

There isn't anything wrong with how the Green Hills compiler works, but there are signs of a lack of maturity in the compiler, like this:

Code: [Select]
  movea.w ($c,a3),a0  ; movea.w moves and then sign extends to a longword
   move.l a0,d7  ; d7 is used later so this is needed
   ext.l d7  ; Unnecessary instruction! We are already sign extended!
   movea.l d7,a0  ; Unnecessary instruction! We are already sign extended!
Waste: 2 instructions, 4 bytes

Code: [Select]
  movea.w ($1c,a0),a1  ; movea.w moves and then sign extends to a longword
   move.l a1,d1  ; Unnecessary instruction! d1 is not used later!
   ext.l d1  ; Unnecessary instruction! We are already sign extended!
   movea.l d1,a1  ; Unnecessary instruction! We are already sign extended!
   cmpa.l a3,a1
Waste: 3 instructions, 6 bytes, 1 scratch register

Code: [Select]
  cmpi.l #$f3333334,d2
   blt lab_1521c
   cmpi.l #$f3333334,d2  ; Unnecessary big instruction! CC from first cmpi still valid
   bne lab_1521c
Waste: 1 instruction, 6 bytes

Code: [Select]
  divs.w #2,d0  ; could be asr.l #1,d0 as the remainder is unused
Waste: 2 bytes and a bunch of cycles

Code: [Select]
  move.l d0,d0  ; funny or should we say scary way to do a tst.l
   seq d0  ; the next 3 instructions could be replaced with and.l #1,d0
   neg.b d0
   ext.w d0  ; 68020 extb.l could replace next 2 instructions
   ext.l d0
   lea ($c,sp),sp
   ext.l d0  ; Unnecessary instruction!
Waste: 3 instructions, 2 bytes

When I see MOVE.L Dn,Dn, I know the compiler has problems with its register management. Compilers repeat the same mistakes, of course. Add in all the function stubs because it can't do registerized function parameters, and it's pretty ugly. It might be passable for a 68000, but for a 68020+ there are a lot of places where EXTB.L could be used, MOVE.L mem,mem instead of MOVE.B mem,mem, and index register scaling in addressing modes.
« Last Edit: November 15, 2014, 08:43:21 PM by matthey »
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #5 on: November 16, 2014, 12:23:03 AM »
Quote from: Thomas Richter;777560
It would need a lea _DATA(a6),a4 here for libraries as the "bss-segment" is part of the library base, and the compiler generated library startup code would have taken care to copy the constant data to there. SAS has also a couple of additional options, as to give every program opening the library a new library base, if you configure it right. As said, there is a lot of magic happening here, also for "load and stay resident" programs, where SAS/C also provides a startup that copies the data segment into a private memory segment and relocates data from a private database of relocation offsets. I personally never had a use for this, but it's just another example of what the compiler was able to offer as a service and how well it was integrated into the system. As far as the quality of the compiled code goes, gcc 2.95 was ahead of SAS/C, but it was harder to use, even slower, and it was hard to use it for anything else but POSIX compliant C, i.e. integration was much worse.

I agree that it would be good to add support for several of the SAS/C custom data pointer features at the same time, since the support needed will be similar. It would probably need to link with a custom startup as well as requiring changes in the 68k backend and some new options in vc (the vbcc frontend driver). I still need to do some more research and I would probably need Frank's help. He wrote vlink, so he knows the linker stuff like the back of his hand, which could be very useful. If you look at the source, it's complex but at least manageable, unlike GCC. Jason McMullan tried to change some things in GCC for AROS and what follows are some of his comments.

 
Quote from: Jason McMullan
Fix gcc to give diagnostics when it feels 'forced' to use a frame pointer, sufficient to allow a programmer to make the correct changes so that gcc would not make a frame pointer in that routine.
 
 - This code is very convoluted and ugly. I tried this once, and ran away screaming.
 

 
Quote from: Jason McMullan
Fix gcc to never need a frame pointer on m68k
 
 - This may be impossible. reload1.c is an impenetrable morass of evil.
 

With vbcc we have friendly support help, manageable sources and the source is available. That's about as good as it gets.

Quote from: Thomas Richter;777560
  Just a couple of short comments here: First of all, you're describing a set of low-level peephole optimizations the compiler misses.

I don't believe what I have described are peephole optimizations, except possibly for the DIVS instruction (MOVE Dn,Dn -> TST Dn would be peephole, but it's ridiculous enough that the best peephole assemblers don't look for it). Integer division by an immediate can involve a complex algorithm that converts the DIV to a multiply by a magic number, shifts and adds. This goes beyond a this-for-that low level substitution. GCC has been doing magic number integer DIV to MUL since the Amiga days, but Motorola ignorantly took out an important tool for the multiply on the 68k by removing the 64 bit multiply. The other examples show that the compiler was at times not aware of the data types and sizes in its registers and had issues with its register management. These are serious issues that could potentially lead to a crash, but it's bad enough that they cause poor quality code.
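
As a concrete illustration, here is the standard magic-number rewrite for an unsigned divide by 10 on the 68020/030/040 (a sketch; the magic value is ceil(2^35/10), and this 64 bit MULU.L form is exactly what is unimplemented on the 68060):

Code: [Select]
  ; d0 = d0 / 10 (unsigned), no hardware divide needed
   mulu.l #$cccccccd,d1:d0   ; 64-bit product, high 32 bits in d1
   move.l d1,d0
   lsr.l #3,d0               ; quotient = high32 >> 3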

Quote from: Thomas Richter;777560
I don't have any experience with greenhill, but it's not untypical that many optimizations happen at a higher level through code-flow analysis, e.g. leading to dead-code removal that does not show at instruction level. Thus, just from looking at the instruction level, you only get a very incomplete view of the compiler and its performance. What you rather see here is probably a lack of maturity of the code-generator, but that's only one out of many phases a compiler has to go through. That doesn't mean that the upper levels were any better, I just don't know. All I'm saying is that your judgement is a bit premature. In addition, allow me to fix a couple of your suggestions: First, divs #2,d0 is *not* equivalent to asr.w #1,d0. The former rounds to zero, the latter rounds to minus infinity. Specifically, (-1/2) = 0, but (-1 >> 1) = -1, so you cannot replace one with the other. Condition codes are also not equivalent.

DIVS.W is a 32/16 division, so the replacement has to shift the full longword with ASR.L #1,D0. You are correct that a correction is necessary so that negative results round toward zero. Maybe something like this:

Code: [Select]
  tst.l d0
   bpl.b .skip
   addq.l #1,d0  ; pre-correct negative dividends so the shift rounds toward zero
.skip:
   asr.l #1,d0

This is still a big savings over a hardware divide. This is normally not done by a peephole optimizer.

Quote from: Thomas Richter;777560
 tst.l d0, seq d0 is also not equivalent to and.l #1,d0.

My point was that an Scc+NEG.B+EXT.W+EXT.L sequence can be replaced by Scc+AND.L #1, which is significantly faster on most 68k processors.
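
Side by side (both leave 0 or 1 in the full register, though the final condition codes differ):

Code: [Select]
  ; Green Hills sequence:
   seq d0
   neg.b d0
   ext.w d0
   ext.l d0
  ; replacement:
   seq d0
   and.l #1,d0   ; $ff -> 1, $00 -> 0, upper bits cleared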

Quote from: Thomas Richter;777560
move.l d0,d0 is probably untypical for a tst, but it would probably work completely alike, but it is indeed more likely that the code generator had a hiccup here and did not notice that there was no need to generate the code in the first place. Allow me to add that if you look nowadays at code gcc generates on x64 platforms you'll find that it generates considerably ugly assembler from your sources, with many code repetitions and jumps around labels etc. Still, the results show that the decisions made by the compiler were not that bad; it's partially due to loop unrolling, and it also features a pretty good high-level code analysis that can shorten a lot of computations - but at a much higher level than you would usually notice. Thus, it's not always that easy to see from the assembler what the compiler really intended, or what the original code should have been. That works for modern compilers, at least, only at the very low optimizer settings.

Optimized code may be ugly, especially if it's big and increases complexity. I didn't give any marks for looks though, other than the MOVE Dn,Dn remark. This code does not look particularly complex. It looks like it's mostly data movement with a lot of sign extensions, especially word->longword. It's too bad compilers haven't learned that this comes for free when loading to an address register.

I did an ADis disassemble to a vasm reassemble with peephole optimizations and intuition.library went from 114536 to 107776 bytes for a savings of 6760 bytes or 5.90%. I would expect that compiling for the 68020 would save somewhere around that much again. I don't know if vbcc would do any better as it doesn't always generate the most optimized code yet either (vasm tries to clean it up but only safe peephole optimizations can be made). Vbcc wouldn't need the function stubs anymore so that could save more.
« Last Edit: November 16, 2014, 12:28:47 AM by matthey »
 

Offline matthey

Re: New Replacement Workbench 3.1 Disk Sets www.amigakit.com
« Reply #6 on: November 16, 2014, 10:25:33 PM »
Quote from: wawrzon;777618
@matthey
where do the jason quotes come from? it's interesting in the context of compiling aros with vbcc.


From you :D.

http://en.wikibooks.org/wiki/Aros/Platforms/68k_support/Developer/Compiler

@ThoR and Olsen
I read in the SAS/C manual about libinit, libinitr and the whole library building system interface. Some support by a compiler is necessary, but this is a lot. Aminet has example library build systems that I need to research for vbcc (even GCC is supported). Maybe then I would be educated enough to talk to Frank Wille, who has probably already been asked for more library support. I imagine it will take me a while (no hurry with the current situation anyway), so I'm for letting the thread get back on topic. If it were possible, I think it would be good to have all the sources compiling with one compiler (but switchable to others if there is enough support) using a build system that could build everything or particular modules, compiling for the 68000 or 68020 (with instruction scheduling for the 68060 if possible). I don't see a need to abandon 68000 owners (including fpga hardware with only 68000 support), and compiling for the 68020 is a nice space saver and extra performance for the larger AGA 68020 modules. It doesn't take much time to compile both and the size is plenty small enough to distribute on the internet. I have a feeling that there will be a time in the not so distant future when those sources are compiling again ;).