It's not necessarily a bad idea; you just have to know to what end the patches are created. Collapsing more complex assembly code into less complex code, saving space and reducing execution time, used to get a lot more respect when storage space was scarce and CPUs were far less powerful. Like, say, in the 1980s and 1990s.
Yes, indeed, these were the "even older days" of computing. Back then, in the 6502 era, squeezing more program into less RAM was pretty much a necessity, given how little of it you had. I remember back then on the Atari (yes, the Atari 800XL; same chip designer, different company) that the file management system (back then called "DOS") was bootstrapped from disk, took probably 5K of your precious RAM, and had pretty limited capabilities. Plus it took time to bootstrap that 5K (it wasn't a 1541, so it wasn't as bad as on the C64, after all).
Indeed, one can try to rewrite the whole thing, throw out the less-used part of the ROM (the support for a parallel port interface that came so late to the market that no devices were ever made to fit into this port), and fill the newly available 3K of ROM with a more powerful replacement for the 5K DOS, cleaning up the math routines along the way. For such extremely tiny systems, this type of hobby did make sense because it was a noticeable improvement (as in: 5K more for your programs out of the total of 40K available). Not that it was commercially viable - it wasn't.
Anyhow, byte counting stopped making sense already when 512K was the norm, and priorities changed. As soon as projects get bigger, one starts to notice that there is no benefit in squeezing out every possible byte, or every possible optimization. There is too much code to look at, and the problems are typically about maintaining the whole construction rather than making it fast.
As Olsen already said, either execution time is not critical because I/O or human input limits the speed, or 80% of the program time is spent in less than 20% of the program. In that case, those 20% are hand-tuned, possibly written in assembly. For the 68K, I did this myself. Nowadays, not even that anymore; we had a specialist for it in the company when I worked on problems that required this type of activity. Even then, it turns out that the really critical part is not the algorithm itself, but keeping the data in cache, i.e. constructing the algorithm around the "worker" such that data is ideally pipelined, and that again was done in a high-level language (C++).
To keep the story short, even today the use of assembly, even for optimization, is diminishing. There are hot spots where you have to use it, but if speed is essential, you typically want to be as flexible as possible: to rearrange your data structures to allow for fast algorithms, and to organize data such that the access pattern fits the CPU's organization - and you don't get this flexibility in assembly. It sounds weird, but a high-level language and more code can sometimes make an algorithm faster.
But anyhow, I confess I did byte counting in the really old days, two computer generations ago, and yes, it created a good deal of spaghetti code, though the requirements were quite a bit different.
http://www.xl-project.com/download/os++.tar.gz

It's part of becoming a good engineer to learn which tools you need to reach your goal, which tools to pick for a specific use case, and foremost to understand what the problem actually is (this is more complicated than one may guess). Ill-defined problems create ill-defined program architecture. Not saying that I'm the perfect software engineer - I have no formal education in this sector - but at least I learned a bit by failing often enough.