
Author Topic: CopyMem Quick & Small released!  (Read 14388 times)


Offline olsen

Re: CopyMem Quick & Small released!
« on: January 03, 2015, 04:59:56 PM »
Quote from: biggun;780976
One question:

If you compare the time needed to develop the memcopy with the time spent talking about / defending it here, how does this time compare?
Seems to me it's not so much about developing something as about making sure that what is developed has a positive impact and no side effects. Measure twice, cut once ;)

However, this somewhat sober and "not much fun" side of system software development doesn't seem to be much in favour here. More or less, this speaks of the Amiga in its current form as a hobby.

Nothing wrong with computers as hobbies, or with the fun of tinkering with the operating system. Spoilsports like Thomas and I do seem to have the engineering side of operating system patches in mind, because that's what a lot of software builds upon, and it's sadly too easy to break things and never quite find out what actually caused the problems. If you're playing with the fundamentals of the operating system, a bit of responsibility comes with it, and that can't always be fun.

I'm not sure if this has been mentioned before, but the operating system itself, as shipped, hardly ever uses CopyMem() or CopyMemQuick() in a way which would benefit from optimization. In ROM, CopyMem() is used to save space, and it usually moves only small amounts of data around, with the NCR scsi.device and ram-handler being the exceptions. The disk-loaded operating system components tend to prefer their own memcpy()/memmove() implementations, and only programs which were written with saving disk space in mind (e.g. the prefs editors) use CopyMem() to some extent. Again: only small amounts of data are being copied, which in most cases means that you could "optimize" CopyMem() by plugging in a short unrolled move.b (a0)+,(a1)+ loop for transfers shorter than 256 bytes.
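To make the idea concrete, here is a rough sketch of such a short unrolled copy, written in C rather than the 68k assembly suggested above (the function name is made up; a Duff's-device-style loop stands in for the unrolled move.b sequence):

#include <stddef.h>

/* Sketch: unrolled byte copy for small, non-overlapping transfers.
   For n < 256 the call overhead dominates, so a simple unrolled
   loop like this can beat a cleverer general-purpose routine. */
void copy_small(unsigned char *to, const unsigned char *from, size_t n)
{
    size_t rounds = (n + 7) / 8; /* number of 8-byte passes, rounded up */

    if (n == 0)
        return;

    switch (n % 8) { /* enter the unrolled loop mid-way (Duff's device) */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--rounds > 0);
    }
}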

I have no data on how third party applications use CopyMem()/CopyMemQuick(), but if these are written in 'C', it's likely that they will use the memcpy()/memmove() functions which the standard library provides, and those typically aren't some crude, bumbling implementation. However, they might still benefit from optimization.

Now if you wanted to make a difference and speed up copying operations that are measurable and will affect a large number of programs, I'd propose a project to scan the executable code loaded by LoadSeg() and friends and replace the SAS/C, Lattice 'C' and Aztec 'C' statically linked library implementations of their respective memcpy()/memmove() functions with something much nicer. That would not be quite the "low-hanging fruit" of changing the CopyMem()/CopyMemQuick() implementation, but it might have a much greater impact.
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #1 on: January 07, 2015, 12:49:43 PM »
Quote from: kolla;781105
CygnusEd has the option of using OS routines, and on native chipset that is a major slowdown. Ditto for MuchMore iirc.
CygnusEd's own custom display update routines bypass several layers of operating system code which must be able to handle any case of moving and clearing the screen contents. By comparison, CygnusEd can restrict itself to dealing with just one single bit plane, and there is no need to handle clipping or occlusion. Actually, if you are using the topaz/8 font, CygnusEd can even bypass the operating system's text rendering operations altogether.

Well, this is how it can work out if you know exactly which special case you need to cater for. If you have to have a general solution, you'll always end up making sacrifices with regard to performance and resource usage.
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #2 on: January 07, 2015, 01:11:43 PM »
Quote from: matthey;781162
I agree. I like the idea of using the OS but it needs to provide reasonably optimal functions. Is aligning the destination and using an unrolled MOVE.L loop too much to ask for CopyMem()/CopyMemQuick() when it is competitively the fastest for the 68000-68060? Would it be a bad thing if Olsen sold more copies of Roadshow because the memory copying bottleneck was reduced?
Roadshow is a peculiar case. Because incoming and outgoing data needs to be copied repeatedly, the copying operation had better be really, really well-optimized. For Roadshow I adapted the most efficient copying routine I could find, and this is what accounts for Roadshow's performance (among a few other tricks).

Because the TCP/IP stack needs to handle overlapping copying operations, it would not have been possible to use CopyMem(), which is not specified to support them.

There is also special-case copying code in Roadshow to support the original Ariadne card, which, due to a hardware bug, handles single byte writes to its transmit buffer incorrectly (a single byte write operation is treated like a word write operation with a random MSB value). This is what the S2_CopyFromBuff16 SANA-II R3 command is for, in case you ever wondered why this oddball command is part of the standard ;)
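Working around that kind of hardware bug boils down to never issuing a byte write at all. A hypothetical sketch of the idea (the names are illustrative, not taken from the actual Ariadne driver):

#include <stddef.h>

typedef unsigned short UWORD;
typedef unsigned char  UBYTE;

/* Copy a frame into a transmit buffer using 16-bit writes only.
   An odd trailing byte is widened to a full word; the padding
   byte lies beyond the frame length and is ignored by the card. */
void copy_to_buffer_16(volatile UWORD *txbuf, const UBYTE *src, size_t len)
{
    size_t words = len / 2;

    while (words-- > 0) {
        *txbuf++ = (UWORD)((src[0] << 8) | src[1]); /* big-endian 68k order */
        src += 2;
    }

    if (len & 1)
        *txbuf = (UWORD)(src[0] << 8); /* last byte in the MSB, pad the LSB */
}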

A general, efficient copying function has its merits, but it also ought to be sufficiently general in operation. Neither CopyMem() nor its subset CopyMemQuick() will handle overlapping copy operations, and neither even flags an error condition if it cannot do what the caller asks of it (you get "undefined behaviour" instead).
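The distinction matters in practice: a memmove()-style routine has to pick the copy direction based upon how the buffers overlap, which is exactly what CopyMem() is free not to do. A minimal sketch:

#include <stddef.h>

/* Overlap-safe copy, memmove() style: copy backwards when the
   destination starts inside the source range, forwards otherwise.
   CopyMem() makes no such promise, hence the undefined behaviour. */
void copy_overlap_safe(unsigned char *dst, const unsigned char *src, size_t n)
{
    if (dst == src || n == 0)
        return;

    if (dst < src) {
        while (n-- > 0)
            *dst++ = *src++;
    } else {
        dst += n;
        src += n;
        while (n-- > 0)
            *--dst = *--src;
    }
}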

The lack of support for overlapping copy operations is likely a deliberate design choice. If you read the addendum to the original exec.library AutoDocs, it mentions that a future version of CopyMem() might use hardware acceleration to perform the operation. To me it's not quite clear what was meant by that. It could mean that somebody was thinking about putting the blitter to use, if available, to move the data around. That could have been twice as fast as copying the data with the CPU on the original 68000 Amiga design, but you would have to wait for the blitter to become ready for use, and again for it to complete its work, which taken together may have nullified any speed advantage over using the CPU straight away. Or it could mean that somebody was considering adding DMA-assisted memory copying to the Amiga system design, like the 1985 Sun workstations reportedly had.

Now there is an interesting question which I'd like to ask Carl Sassenrath :)
« Last Edit: January 07, 2015, 01:21:00 PM by olsen »
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #3 on: January 07, 2015, 01:44:25 PM »
Quote from: Thomas Richter;781175
For the record: I took the time and checked what ViNCEd does (the 3.9 Shell console). Actually, it does have the raster-scroll optimization as well. However, if I look at the sources nowadays and see how I had to "jump in circles" to get this done correctly, I would really not recommend doing so anymore.
What CygnusEd does goes beyond picking a single bit plane to render into and scroll: it talks directly to the blitter itself to perform the update operation, bracketed between LockLayers()/OwnBlitter() and DisownBlitter()/UnlockLayers(), calling WaitBlit() before hitting the blitter registers.

The parameters which are fed into the blitter registers are precalculated in 'C', and only the hardware access itself is written in assembly language.
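For anybody curious, the bracketing described above translates into a calling sequence roughly like this (a hedged sketch using the standard graphics.library calls; the actual register setup is CygnusEd's own and is omitted here):

#include <graphics/layers.h>
#include <proto/graphics.h>

/* Sketch: drive the blitter directly, but politely. The layers are
   locked so nothing moves underneath us, and the blitter is owned
   so the OS won't queue its own jobs while we poke the registers. */
void scroll_plane_with_blitter(struct Layer_Info *li)
{
    LockLayers(li);
    OwnBlitter();

    WaitBlit(); /* registers must not be touched while a blit runs */
    /* ... write the precalculated values to the blitter registers ... */

    WaitBlit(); /* let our blit finish before handing the blitter back */
    DisownBlitter();
    UnlockLayers(li);
}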

Cool stuff indeed :)
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #4 on: January 11, 2015, 11:28:35 AM »
Quote from: itix;781476
Interestingly, the original Ed found on Workbench 1.3 does not. On an Amiga 500 it could not handle even 50 kB text files very well. There was always a noticeable lag when inserting a new line.
The original Ed managed its text buffer in a very peculiar manner. The entire text was stored in a single consecutive buffer, whose contents had to be moved around prior to inserting text, and after removing text.

Because of how the BCPL memory management worked out, the text was not managed by storing a list of pointers which referenced the individual lines. Instead, the management data structures were interleaved with the text itself. If you looked at it, you would find the whole text buffer broken down into individual lines, each beginning with a pointer to the next line, followed by the text itself (which in turn began with a byte indicating how long the text is). This is one of the reasons why the size of the file managed by Ed is restricted.
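Expressed as a C structure, the layout described above would look something like this (a speculative sketch; the real code was BCPL, and the field names are invented):

/* One line inside Ed's single text buffer: a link to the next line,
   followed by a BCPL-style length-prefixed string. Inserting or
   removing text means shifting everything after the edit point. */
struct EdLine {
    struct EdLine *next;  /* the following line, interleaved in the buffer */
    unsigned char length; /* number of characters that follow */
    char text[1];         /* 'length' characters, no NUL terminator */
};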

Because the text has to be shuffled around, and because there is another layer of management data structures which keeps track of where the first and the last line currently displayed end up, which lines precede them, and so on, every display update and every editing operation kicks off an avalanche of data structure maintenance operations.

The code is clearly optimized so as to minimize the impact of display updates, which was a sensible and necessary strategy back in the 1970s, when you had one server connected to a bunch of terminals through slow serial links.

The scalability of the implementation is poor, though, and that's because the code is not optimized to minimize the impact of making changes in the document. Even reading and writing files is painfully slow because the editor has to extract the interleaved management/text data in order to store it to disk, and it has to interleave the management/text data when it reads documents from disk.

Here comes the fun part: the operating system which the old "Ed" was a part of was likely written using "Ed" itself. How can you tell? The "Ed" source code is larger than 64K, and the author broke it down into six individual files, presumably to keep it manageable. The longest line in that source code is about 105 characters, and almost every line is shorter than 80 characters. The same holds true for the dos.library code, which contains only a few files larger than 64K (by 1985 somebody must have been using a different text editor, I suppose).
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #5 on: January 11, 2015, 05:07:43 PM »
Quote from: Thorham;781482
What an absolutely horrible editor that Ed thing :(
We can only judge the design from today's point of view. Nobody sets out to write a slow, restricted text editor. What drove the design must have made great sense at the time "Ed" was written, but it did not hold up. I suppose even by 1985 the limitations may have become hard to bear.

Or maybe the Amiga as it was back then was just as powerful as the minicomputers of the late 1970s on which TRIPOS (the system from which AmigaDOS, its shell, its commands, etc. derive) was developed, and the constraints we see today did not seem like constraints at all back then. There must have been a reason why TRIPOS was picked for the Amiga, beyond time pressure and a shortage of other options at the time.
« Last Edit: January 11, 2015, 05:09:54 PM by olsen »
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #6 on: January 12, 2015, 03:42:14 PM »
Quote from: kolla;781488
Actually, I like Ed, it seems related to vi.
When I used "vi" for the first time (that must have been around 1991-1992; I had never used a Unix system before I went to university), I was puzzled by the fact that some of the control sequences were exactly the same as in "Ed".

Since TRIPOS seems to have had so much in common with early Unix, by way of imitation and reimplementation, you can't rule out that the design of "Ed" was shaped by "vi". Both were created in about the same time frame, around 1978. Also, there's another odd "parallel evolution" in that what "Edit" is to "Ed", "ed" is to "vi".
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #7 on: January 12, 2015, 05:28:07 PM »
Quote from: Thomas Richter;781545
There's actually more stuff like this. Look at the Aztec-C editor "Z" that came with one of the later versions. The same type of crude (aka "unusable") editor. "Ed" has a lot in common with "vi", though the v37 version finally got a menu (which improved usability by about 200%).
Back in the early days of Amiga programming (that would have been 1987/1988 in my case) it was hard to find a decent programmer's editor.

I knew "Z" but quickly discarded it for being too obtuse. Funny that the Aztec 'C' documentation gave it such prominence, stressing how compatible it was with "vi". I think the defining sentence in the documentation was "if you know vi, then you know Z", which works the other way around, too, but not in Z's favour: I didn't have a clue what the documentation was talking about in the first place ("vi"? was that a Roman numeral or something? and what does the number six have to do with text editors anyway?) and had to conclude that whatever the authors were so excited about probably wasn't for me.

My first 'C' programs were written using "Ed", until the programs became too large to endure the time it took for "Ed" to read and write them. At some point "Ed" even complained that the file was too large. I could take a hint: if the row of '@' characters "Ed" printed as it read a file was so long that it caused the screen to scroll, it was high time to look for something else. The more '@' characters "Ed" printed, the slower it became, like it was climbing a steep hill, sweating and cursing with every step; I held out for some cartoon-character swearing, but "Ed" never once admitted that it wanted to use one of "#$&%*", possibly because it was too well bred, coming from a posh British university. I now know that the original "Ed" prefers files to be not much larger than 10,000 characters. I could have used the often overlooked "size" parameter for "Ed", but then again, who has that much patience in the long run?

Back then the next best text editor I could find was on a Fish Disk with a number < 100, written by a French author (if I remember correctly). That, too, had its limitations. Don't get me started on "Microemacs" which, while it shipped on the Workbench disk, was barely usable either. And then there was "uEdit" (also found on a Fish Disk), which at the time appeared to me to be some sort of science fiction experiment gone terribly wrong. There were other Amiga text editors which I tried along the way. There was something called "SuperEd" which was not only fast, but also had a crash recovery feature (which I learned to appreciate). Then there was a strange editor ported over from the Atari ST which had a split screen feature that promised to be super wonderful: how odd that it only supported a *vertical* split (and didn't have a crash recovery feature, which I quickly learned it ought to have had).

These were really tough times. Eventually, I was saved by discovering what still is my Amiga text editor of choice, and in fact would be *the* text editor of choice on any platform, if it were more portable than it is. "CygnusEd" for life ;)

Quote
"Ed" was partially ok, good enough to modify the startup-sequence, but not really usable for anything beyond that. "vi" is pretty much the same, and I'm still scared that people use that (or vim) to work on projects, but who am I to judge...  :wq
I suppose "vi" sits in the sweet spot of being quick to launch and (given enough available brain capacity) quick at letting you commit keystroke sequences to muscle memory. Yes, it's a weird design, but so is the standard keyboard layout. If you learned touch-typing, it's amazing how well you can use that weird layout at great speed. It doesn't work quite so well with more heavy-weight editors such as the original "emacs".
« Last Edit: January 12, 2015, 05:48:08 PM by olsen »
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #8 on: January 13, 2015, 12:10:45 PM »
Quote from: Thorham;781594
I also hate how CubicIde uses Lisp as its scripting language. Terrible!
Well, it worked for Emacs ;)

Should you wake me up in the dead of the night, stressing that the fate of the world depended upon me instantly adding scripting language support to an application, I'd probably start yawning, make coffee and write a Lisp-like language interpreter.

With the exception of "Forth", there's probably no other type of programming language which is as robust and powerful, yet as easy to implement. Whether this necessarily translates into a language which empowers the user, or one which just succeeds in making his life harder, is up for debate.
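To illustrate how little it takes, here is a toy of the species in C (everything here is made up for illustration; a real interpreter would add symbols, an environment and error handling, but the shape stays the same: one cell type, a tiny reader, a tiny evaluator):

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One uniform data type: an atom, or a pair; NULL is the empty list. */
typedef struct Cell {
    char *atom;             /* non-NULL for atoms */
    struct Cell *car, *cdr; /* used when atom is NULL */
} Cell;

static Cell *cons(Cell *car, Cell *cdr) {
    Cell *c = calloc(1, sizeof(*c));
    c->car = car;
    c->cdr = cdr;
    return c;
}

static Cell *make_atom(const char *s, size_t n) {
    Cell *c = calloc(1, sizeof(*c));
    c->atom = malloc(n + 1);
    memcpy(c->atom, s, n);
    c->atom[n] = '\0';
    return c;
}

/* The reader: parentheses make lists, everything else is an atom. */
static Cell *read_expr(const char **p) {
    while (isspace((unsigned char)**p))
        (*p)++;
    if (**p == '(') {
        Cell head = { 0 }, *tail = &head;
        (*p)++;
        for (;;) {
            while (isspace((unsigned char)**p))
                (*p)++;
            if (**p == ')') {
                (*p)++;
                return head.cdr;
            }
            tail->cdr = cons(read_expr(p), NULL);
            tail = tail->cdr;
        }
    } else {
        const char *s = *p;
        while (**p && !isspace((unsigned char)**p) && **p != '(' && **p != ')')
            (*p)++;
        return make_atom(s, (size_t)(*p - s));
    }
}

/* The evaluator: numbers evaluate to themselves, (+ a b) adds. */
static long eval(Cell *c) {
    if (c->atom)
        return strtol(c->atom, NULL, 10);
    if (c->car->atom && strcmp(c->car->atom, "+") == 0)
        return eval(c->cdr->car) + eval(c->cdr->cdr->car);
    return 0; /* a real interpreter would dispatch more operators here */
}

int main(void) {
    const char *src = "(+ 1 (+ 2 3))";
    printf("%ld\n", eval(read_expr(&src))); /* prints 6 */
    return 0;
}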

Sometimes it's enough just to make a system scriptable which wasn't scriptable before.
« Last Edit: January 13, 2015, 12:15:11 PM by olsen »
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #9 on: January 13, 2015, 01:59:10 PM »
Quote from: Thorham;781597
Quote
Should you wake me up in the dead of the night, stressing that the fate of the world depended upon me instantly adding scripting language support to an application, I'd probably start yawning, make coffee and write a Lisp-like language interpreter.
Or you could get Lua, and add that. Seems a lot easier than writing a script language from scratch. Not to mention that Lua is a lot nicer than Lisp.
Sometimes you don't get to choose, and there are overriding constraints which spell out in so many words why we can't always have nice things.

I've been in that position several times, and although I don't recommend the approach, it can make great sense to explore the boundaries set by the constraints and use that playing field to the best of your ability.

That can lead to wickedly strange solutions which you'd rather not admit having cooked up in a moment of weakness. But then, sometimes the overriding constraints are what guide your decision-making, and not that nagging conscience of yours which keeps reminding you that the choices you are forced to make may not look so good in the long run. Being a programmer can suck.

Quote
Quote
Sometimes it's enough just to make a system scriptable which wasn't scriptable before.
Why not just add a nice language? Best choices for a script language seem to be Lua or a C interpreter.
You don't always get to choose. The last time I was really upset by the choices made in a scripting language design was when I had to write something in AppleScript to clean up my iTunes library. What a bizarre language. Why is Apple holding onto it, and its equally bizarre ecosystem? Because that scripting language, and all the ideas that went into its design, has been around for decades with nobody willing to admit that it doesn't hold up well.

Quote
Lua is easy, and easy to add if you're working in C. It's also pretty fast, works well on old systems like lower end 68k Amigas (68020/30), and very portable (SASC compiles it properly).

Adding a C interpreter is good, because many programmers know C. That's why FrexxEd's script system is so nice. If you know C, then you know FrexxEd's script language.
I think that Lua's a decent enough design: powerful, well-documented, and something newcomers can learn and apply. It's also embeddable, with a small memory footprint. I once came close to using it in one of my applications, but then time constraints made me - wait for it - knock off one of those Lisp-like language interpreters instead (the fate of the world didn't exactly depend upon it, and if it did, I didn't notice, but sometimes you just want to finish a project and not keep on tinkering).
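For what it's worth, wiring up Lua really is only a handful of calls. A minimal sketch against the stock Lua 5.2+ C API (the host_greet function and the script text are made up for illustration):

#include <stdio.h>

#include <lauxlib.h>
#include <lua.h>
#include <lualib.h>

/* A host function exported to scripts under the name "greet". */
static int host_greet(lua_State *L) {
    const char *who = luaL_checkstring(L, 1);
    printf("hello, %s\n", who);
    return 0; /* number of results pushed onto the Lua stack */
}

int main(void) {
    lua_State *L = luaL_newstate(); /* fresh interpreter state */
    luaL_openlibs(L);               /* load the standard libraries */

    lua_pushcfunction(L, host_greet);
    lua_setglobal(L, "greet");

    if (luaL_dostring(L, "greet('world')") != LUA_OK)
        fprintf(stderr, "script error: %s\n", lua_tostring(L, -1));

    lua_close(L);
    return 0;
}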

As for using a 'C'-like language for the purpose of scripting, I can see the attraction for programmers who are already familiar with the language. For everybody else it's a long and arduous journey to even become competent in using it, so I wouldn't want to force it upon anybody.
« Last Edit: January 13, 2015, 02:03:06 PM by olsen »
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #10 on: January 13, 2015, 03:29:39 PM »
Quote from: Thorham;781606
Quote
I think that Lua's a decent enough design, which is both powerful, well-documented and something newcomers can learn and apply. It's also embeddable with a small memory footprint. I once came close to using it in one of my applications, but then time constraints made me - wait for it - knock off one of those Lisp-like language interpreters instead (the fate of the world didn't exactly depend upon it, and if it did I didn't notice, but sometimes you just want to finish a project and not keep on tinkering)

What kind of time constraints cause you to have to make concessions like that?

I create, modify and debug programs both as part of my day job and as a hobby. The programming I do for fun happens in my somewhat limited spare time. In that situation you can weigh the benefits and drawbacks of one solution to a specific problem against a different solution by looking at how long it would take to implement it, and how well it would solve the problem at hand.

In the case of wiring up a Lua interpreter vs. plugging in some slightly grubby but well-tested Lisp-like language interpreter it was tempting to use the leverage which Lua would have provided, because it was a "real" language with variables, loops and all those shiny things that just might come in handy (or may never get used).

However, not all applications really need that power, and they still get the job done that they were intended for. So plugging in the Lisp-like language interpreter solved the problem at hand with a minimum of implementation and testing effort. Lua would have provided a much more powerful and well-rounded solution, but I would have had to spend another day or two getting it to work properly, and just maybe I would never have used the flexibility and options Lua provided anyway.

So, sometimes a "good enough" solution can beat a "good" or even "perfect" solution.
 

Offline olsen

Re: CopyMem Quick & Small released!
« Reply #11 on: January 13, 2015, 05:36:07 PM »
Quote from: Thorham;781610
Quote
However, not all applications really need that power, and they still get the job done which they were intended for. So plugging in the Lisp-like language interpreter solved the problem at hand, with a minimum of implementation and testing effort. Lua would have provided a much more powerful and well-rounded solution, but I would have had to spend another day or two to get it working properly, and just maybe I wouldn't have used the flexibility and options which Lua would have provided anyway.
It depends on the software, sure, but for some software you shouldn't make such concessions.

For some software, 'limitless power' is part of the design goal. FrexxEd is a good example of that, and it's script language is the main reason why it's so powerful.
My outlook on software quality, and how to get it, has changed over the years. When I started noodling around with BASIC on whatever home computer I could get my hands on, my curiosity was the driving force in getting stuff done. A couple of years down the line I got it into my head that working on a program is actually a task that can always be finished.

At times that even was true when somebody was willing to pay me money for the work I did, or when somebody was very keen on putting my work to good use. That's when you had to make sure that everything you promised or hoped for was accounted for and in the box, before you closed it, tied a bow around it and handed it over.

I've been programming for some 30-odd years now, both as a hobby and as a profession, and I couldn't help noticing that there were recurring patterns in the work I did. One important pattern is that your work is rarely finished, and that you will end up iterating on it. You'll invariably find bugs, understand your own work and working methods better, understand the requirements of the project better than you did before, and with that insight will come the need to give the job another go, so as to make things better.

This is one of the key insights I gained: your choices, when it comes to designing and implementing software, may not be the best at the time you make them, but that is not the end of the story. You will return to your work, and this time it may improve. Even if it doesn't, then maybe the next iteration will be better.

With this insight you gain a different perspective on how you spend your time on the project. You begin to accept that you will be unable to make the best choices, and that the next best thing you can do is focus on the specific parts of the task which benefit most from your attention. This is where you'll discover that you have been making trade-offs all the time. Some code may be best in a state in which it's readable and not necessarily optimized for time or space. Some code may be best in a state in which it's optimized. Some code just doesn't benefit from any polishing at all. It turns out that some of the trade-offs you make don't look so good in hindsight, and off you'll go for another round of making better choices.

And that's about it: just because I pick one quirky scripting language that has trouble walking and chewing gum at the same time over an arguably superior alternative, it doesn't have to stay that way forever.

I distrust the notion of perfect code or the perfect solution for a problem, as implemented by a program. Perfect code has no bugs and always solves the problem at hand. I've seen that, but the scope such perfect code covers is usually tiny, and if it isn't, it takes a crazy amount of work to produce it. I'm not in the business of producing that kind of work ;) Trying to get to the perfect solution, that I can agree with as part of a process. But you can only get very, very close (asymptotically close, for the mathematically inclined among us) to it and never quite reach that point. Close enough is good enough for me, as otherwise you'll spend your time chipping away at only one small part of the interesting stuff you might otherwise get a chance to explore instead.
« Last Edit: January 13, 2015, 05:43:16 PM by olsen »