Author Topic: CopyMem Quick & Small released!  (Read 14245 times)

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #59 on: January 05, 2015, 03:33:51 PM »
Quote from: Georg;781067
Maybe smart if this works on a "per cliprect" basis, but does it?

Actually, it is a single buffer. Manually clipping the text before rendering it to screen would complicate matters a lot. Clipping is done in BltTemplate() of the graphics library once rendering is complete.
Quote from: Georg;781067
Otherwise for things like text output in hidden simple refresh windows (like output in a shell window while compiling something, with the source code text editor in the front hiding all or most of it) it can do a lot of unnecessary work in the off-screen buffers.
I wouldn't be so sure. Look, you have to clip at some point. You can either clip while rendering the glyphs (which is what 1.3 did) or clip only once. Given that the complexity of the clipping is pretty high compared to rendering the text itself, it is probably better to "do the additional work" because it results in a much simpler algorithm. I believe the right approach is to optimize for the *typical* case, and the typical case is that the window you render text to is front-most, thus no clipping done.  
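A minimal sketch of the "render everything, then clip only once" idea from this reply, in portable C rather than Amiga code. Everything in it is an assumption made for illustration: the buffer uses one byte per pixel instead of bitplanes, `glyph_row()` is a stand-in font where every non-space character is a solid block, and `blit_clipped()` only plays the role that BltTemplate() has in the real system.

```c
#include <assert.h>
#include <string.h>

enum { GLYPH_W = 8, GLYPH_H = 8 };

/* Half-open clip rectangle: [x0, x1) x [y0, y1), assumed to lie inside the screen. */
typedef struct { int x0, y0, x1, y1; } Rect;

/* Hypothetical font: one byte per glyph row, MSB is the leftmost pixel. */
static unsigned char glyph_row(char c, int row)
{
    (void)row;
    return c == ' ' ? 0x00 : 0xFF;
}

/* Step 1: render n characters into an off-screen buffer with no clipping at all.
   buf is buf_w pixels wide (buf_w >= n * GLYPH_W) and GLYPH_H pixels high. */
static void render_line(unsigned char *buf, int buf_w, const char *text, int n)
{
    for (int i = 0; i < n; i++)
        for (int row = 0; row < GLYPH_H; row++) {
            unsigned char bits = glyph_row(text[i], row);
            for (int col = 0; col < GLYPH_W; col++)
                buf[row * buf_w + i * GLYPH_W + col] = (bits >> (7 - col)) & 1;
        }
}

/* Step 2: copy the finished buffer to the screen at (dx, dy), clipping once.
   Returns the number of pixels actually written. */
static int blit_clipped(unsigned char *screen, int screen_w,
                        const unsigned char *buf, int buf_w, int buf_h,
                        int dx, int dy, Rect clip)
{
    assert(clip.x0 >= 0 && clip.y0 >= 0 && clip.x1 <= screen_w);

    /* Intersect the destination rectangle with the clip rectangle. */
    int x0 = dx > clip.x0 ? dx : clip.x0;
    int y0 = dy > clip.y0 ? dy : clip.y0;
    int x1 = dx + buf_w < clip.x1 ? dx + buf_w : clip.x1;
    int y1 = dy + buf_h < clip.y1 ? dy + buf_h : clip.y1;
    if (x0 >= x1 || y0 >= y1)
        return 0;                       /* completely hidden */

    for (int y = y0; y < y1; y++)
        memcpy(&screen[y * screen_w + x0],
               &buf[(y - dy) * buf_w + (x0 - dx)],
               (size_t)(x1 - x0));
    return (x1 - x0) * (y1 - y0);
}
```

The glyph loop never looks at the clip rectangle, which is what keeps it simple. The price is the one Georg points out: a fully hidden window still pays for `render_line()` before `blit_clipped()` discovers there is nothing to copy.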
Quote from: Georg;781067
Similar for long text strings where big parts may end up being clipped away. Like maybe in a listview gadget.

Actually, the typical ASL/Reqtools requester isn't *that* stupid. I don't know how MUI works, but the system requesters only render those lines that are actually visible on the screen and not those that are clipped away completely.
 

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #60 on: January 05, 2015, 03:43:43 PM »
Quote from: Thorham;781066
No, it's not, hence the reason FBlit+FText makes a real difference.
How much, and is that due to FBlit? How does that work on graphics cards?  
Quote from: Thorham;781066
Which is slow, because you get additional memory accesses. Far better to do everything in registers, write to chipmem and be able to use the CPU pipeline on 68020+.
Well, there isn't really much chance to avoid memory accesses. You can probably get away with rendering in fast ram for graphics cards in the first place and then copying directly to the screen, but in one way or another, you need to fiddle all the bits into the right places to begin with, and there isn't much to be optimized *unless* you restrict yourself to some "nice" font sizes. Optimizing topaz.8 is pretty easy and you can double the speed of the Os, but that's really the exception.
Quote from: Thorham;781066
You can write a properly optimized font renderer for any normal text editor font size. You can also take syntax coloring in account and not write all bit planes for each character.

Actually, all this bit-plane handling is pretty much obsolete in the first place (I mean, custom-chip graphics), but leaving that aside: rendering only a single bitplane is pretty dangerous for an Os function because it cannot know what else is on the screen. For the program, it may be possible (I believe ViNCEd even does that, but my memory is fading) - but you don't need a new Os function for that, or to write your own renderer. You can just set the rastport flags.
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: CopyMem Quick & Small released!
« Reply #61 on: January 05, 2015, 04:45:44 PM »
Quote from: Thomas Richter;781078
How much, and is that due to FBlit?
Don't have any numbers, but text rendering is definitely faster with FBlit+FText on my system. I use 688x564 double scan screen modes, and these patches improve things quite a bit. FrexxEd in 16 colors benefits a lot from this. Goes from annoying to use in 16 colors to working perfectly fine.

Quote from: Thomas Richter;781078
How does that work on graphics cards?
It doesn't, because it patches the OS blitter functions with CPU based functions. You don't need FBlit for graphics cards anyway, so it doesn't really matter.

Quote from: Thomas Richter;781078
Well, there isn't really much chance to avoid memory accesses.
Of course, but instead of reading font data and writing to a buffer which you later have to copy to chipmem, you can do the work in registers, write those to chipmem directly and utilize the pipeline on 68020+.

Quote from: Thomas Richter;781078
You can probably get away rendering in fast ram for graphics cards in first place and then copy directly to the screen
For graphics cards you might not have to bother with anything. I wouldn't optimize for graphics cards anyway. Not interested, and probably always faster than native anyway.

Quote from: Thomas Richter;781078
but in one way or another, you need to fiddle all the bits in the right places to begin with, and there isn't much to be optimized *unless* you restrict yourself to some "nice" font sizes.
There's plenty of room for optimizing, and you don't have to restrict yourself to nice font sizes at all. The optimizations come from doing the work in registers and using the pipeline when doing chipmem writes.

Quote from: Thomas Richter;781078
Actually, all this bit-plane handling is pretty much obsolete in first place (I mean, custom-chip graphics)
Yes, but it's what you have to deal with when writing native Amiga software. Native chipset is also the most important to get fast, because it needs extra speed the most. This is one of the reasons why I'm not concerned with graphics cards: They just don't need optimizing as much as the chipset does.

Quote from: Thomas Richter;781078
but leave this as it is: Rendering only a single bitplane is pretty dangerous for an Os function because it cannot know what else is on the screen.
That's one of the reasons why I would write my own font blitting routine.
 

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #62 on: January 05, 2015, 06:22:36 PM »
Quote from: Thorham;781081
Of course, but instead of reading font data and writing to a buffer which you later have to copy to chipmem, you can do the work in registers, write those to chipmem directly and utilize the pipeline on 68020+.

Just to give you an idea what I'm talking about: There are fonts that are wider than 32 pixels and higher than 32 pixels, thus no chance to put everything into registers. The Os also takes care of making the font bold, italic, underlined or any combination thereof.
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: CopyMem Quick & Small released!
« Reply #63 on: January 05, 2015, 06:50:19 PM »
Quote from: Thomas Richter;781086
There are fonts that are wider than 32 pixels and higher than 32 pixels
And why would anyone use those for text editors on native screens? Typical font sizes are much smaller for text editing.

Quote from: Thomas Richter;781086
thus no chance to put everything into registers.
Sure you can, because you don't have to copy whole characters to registers. You only copy parts of characters.

Quote from: Thomas Richter;781086
The Os also takes care of making the font bold, italic, underlined or any combination thereof.
You can do that in your own font renderer, too.

When dealing with native graphics there are quite a few ways to get things to run faster. You have to decide for yourself if it's worth doing or not. In my opinion it is.
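To make "do the work in registers and only copy parts of characters" concrete, here is a hedged sketch in portable C. It packs one pixel row of several glyphs into 32-bit words, writing each word only once it is full. The function name `pack_row()`, the right-aligned row layout and the 16-pixel width limit are assumptions for this example; a real 68k routine would keep `acc` in a data register and write straight to chipmem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack one pixel row of `count` glyphs into 32-bit words, leftmost pixel in the MSB.
   row_bits[i] holds the row of glyph i, right-aligned in `width` bits.
   Returns the number of words written to dst. */
static size_t pack_row(uint32_t *dst, const uint16_t *row_bits,
                       size_t count, unsigned width)
{
    assert(width >= 1 && width <= 16);

    uint32_t acc = 0;     /* the "register" that collects pixels */
    unsigned used = 0;    /* number of bits already occupied in acc */
    size_t words = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t bits = row_bits[i] & ((1u << width) - 1u);
        unsigned space = 32 - used;

        if (width < space) {
            /* The whole glyph row fits into the current word. */
            acc |= bits << (space - width);
            used += width;
        } else {
            /* Only part of the glyph row fits: store the full word,
               then start the next word with the remaining bits. */
            unsigned spill = width - space;
            acc |= bits >> spill;
            dst[words++] = acc;
            acc = spill ? bits << (32 - spill) : 0;
            used = spill;
        }
    }
    if (used)
        dst[words++] = acc;   /* partially filled last word */
    return words;
}
```

With this layout each destination word is written exactly once per row, and a glyph that straddles a word boundary is split between two registers' worth of data rather than going through an intermediate buffer. Styles such as bold or italic are not handled here.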
 

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #64 on: January 05, 2015, 09:58:13 PM »
Quote from: Thorham;781087
And why would anyone use those for text editors on native screens? Typical font sizes are much smaller for text editing.
I would rather say that this depends on the screen resolution and on the eyes of the user. At least, I wouldn't base an optimization on this unless I also have a fallback mode that allows arbitrary fonts.

Quote from: Thorham;781087
When dealing with native graphics there are quite a few ways to get things to run faster. You have to decide for yourself if it's worth doing or not. In my opinion it is.

Well, as you wish. I personally would pick an editor on other qualities, though.
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: CopyMem Quick & Small released!
« Reply #65 on: January 05, 2015, 10:23:36 PM »
Quote from: Thomas Richter;781096
I would rather say that this depends on the screen resolution and on the eyes of the user. At least, I wouldn't base an optimization on this unless I also have a fallback mode that allows arbitrary fonts.
Just saying that it seems odd that you'd use fonts wider than 32 pixels for text editing, that's all. And no, I don't have hawk eyes (glasses) ;)

Quote from: Thomas Richter;781096
Well, as you wish. I personally would pick an editor on other qualities, though.
You're implying that I only look at one thing. In fact, for existing text editors the only speed requirement I have is that the editor is fast enough. FrexxEd is an example of this. It's not the fastest (not by a long shot), but it's undoubtedly one of the most powerful editors in Amiga 68k land (makes CygnusEd look like Ed).

If I would write my own editor, then I'd go for speed as well as power, ease of use and try to write a tidy, clean program that's maintainable (but yeah, in asm). I know how to get speed, so why not put that into the software I might write? This is especially important on low end 68k. Not much CPU time available, so getting good speed without sacrificing features is important.
« Last Edit: January 05, 2015, 10:26:31 PM by Thorham »
 

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #66 on: January 05, 2015, 10:42:20 PM »
Quote from: Thorham;781100
If I would write my own editor, then I'd go for speed as well as power, ease of use and try to write a tidy, clean program that's maintainable (but yeah, in asm). I know how to get speed, so why not put that into the software I might write? This is especially important on low end 68k. Not much CPU time available, so getting good speed without sacrificing features is important.

Well, here is my math. How much time does an average editor spend in rendering text, compared to waiting for my input? My personal guess is that it doesn't really matter that much in real world applications, unless the editor is "brain dead". For example, the editor of Microsoft Basic (aka AmigaBasic) was brain dead and too slow for any reasonable work, but everything else I remember was simply fast enough, including "Ed", and all of them used the plain simple Os routines.

As far as my editor choices are concerned, I'm still using GoldEd on the Amiga, mostly because it can be customized to the very end. Here it runs the compiler, linker, debugger and configuration editor, and jumps to errors and between sources... It is considerably more powerful than CED. Ok, CED is certainly a nice editor, but not quite on par with GED.

For unix, it's emacs. Actually, more an operating system written in Lisp with an editor front-end.
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: CopyMem Quick & Small released!
« Reply #67 on: January 05, 2015, 11:52:27 PM »
Quote from: Thomas Richter;781101
Well, here is my math. How much time does an average editor spend in rendering text, compared to waiting for my input?
It's about the scrolling (line and page). The scroll speed depends on the combination of text rendering speed, syntax coloring system speed and scroll routine speed. If it's too slow, then the editor becomes uncomfortable to use. Especially in hires double scan modes with 16 colors. That's why speed is important.
 

Offline kolla

Re: CopyMem Quick & Small released!
« Reply #68 on: January 06, 2015, 01:19:51 AM »
CygnusEd has the option of using OS routines, and on native chipset that is a major slowdown. Ditto for MuchMore iirc.
B5D6A1D019D5D45BCC56F4782AC220D8B3E2A6CC
---
A3000/060CSPPC+CVPPC/128MB + 256MB BigRAM/Deneb USB
A4000/CS060/Mediator4000Di/Voodoo5/128MB
A1200/Blz1260/IndyAGA/192MB
A1200/Blz1260/64MB
A1200/Blz1230III/32MB
A1200/ACA1221
A600/V600v2/Subway USB
A600/Apollo630/32MB
A600/A6095
CD32/SX32/32MB/Plipbox
CD32/TF328
A500/V500v2
A500/MTec520
CDTV
MiSTer, MiST, FleaFPGAs and original Minimig
Peg1, SAM440 and Mac minis with MorphOS
 

Offline danbeaver

Re: CopyMem Quick & Small released!
« Reply #69 on: January 06, 2015, 02:54:32 AM »
Quote from: psxphill;780766
I think that might be a perception bias. I have a 2.5ghz Windows 8.1 laptop and if commodore had anything that felt this quick they wouldn't have gone bankrupt.

IMHO, the corrupt heads of Commodore, and I mean Irving Gould (Chairman) & Mehdi Ali (president of Commodore), would have ruined ANY company in their "Need for Greed."
 

guest11527

  • Guest
Re: CopyMem Quick & Small released!
« Reply #70 on: January 06, 2015, 08:56:43 AM »
Quote from: Thorham;781102
It's about the scrolling (line and page). The scroll speed depends on the combination of text rendering speed, syntax coloring system speed and scroll routine speed. If it's too slow, then the editor becomes uncomfortable to use. Especially in hires double scan modes with 16 colors. That's why speed is important.

There are other alternatives, though. You don't need to scroll every line. Buffer scroll commands or output commands, interpret several commands at once and use jump scrolling then. ViNCEd does that to avoid slowing down the output, i.e. while it prints its data, it already buffers new incoming commands, then executes all at once without scrolling through each of them individually.
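The jump-scrolling idea can be sketched roughly as follows, in portable C. This is not ViNCEd's code: the `Screen` type, the fixed 6x16 character grid and the `scrolls` counter are invented for the example, and the screen is modelled as plain characters rather than bitplanes. The point is only that a batch of pending lines costs one block move instead of one per line.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 6, COLS = 16 };

typedef struct {
    char cell[ROWS][COLS];   /* visible text, one char per cell */
    int  scrolls;            /* number of scroll operations performed */
} Screen;

/* Scroll up by n rows with a single block move, then blank the freed rows. */
static void scroll_up(Screen *s, int n)
{
    assert(n >= 0);
    if (n == 0)
        return;
    if (n > ROWS)
        n = ROWS;
    if (n < ROWS)
        memmove(s->cell[0], s->cell[n], (size_t)(ROWS - n) * COLS);
    memset(s->cell[ROWS - n], ' ', (size_t)n * COLS);
    s->scrolls++;
}

/* Output a whole batch of buffered lines: scroll once, then print them all. */
static void flush_lines(Screen *s, const char *const *lines, int count)
{
    /* Lines that would scroll off again immediately are never drawn. */
    int skip  = count > ROWS ? count - ROWS : 0;
    int shown = count - skip;

    scroll_up(s, shown);
    for (int i = 0; i < shown; i++) {
        size_t len = strlen(lines[skip + i]);
        if (len > COLS)
            len = COLS;
        memcpy(s->cell[ROWS - shown + i], lines[skip + i], len);
    }
}
```

Passing three buffered lines to `flush_lines()` moves the screen contents once and then draws three rows. Calling it once per line would move the screen three times, which is the slow case both posters want to avoid. How fast the rows are drawn afterwards is a separate cost that this sketch does not address.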
 

Offline Thorham

  • Hero Member
  • *****
  • Join Date: Oct 2009
  • Posts: 1150
Re: CopyMem Quick & Small released!
« Reply #71 on: January 06, 2015, 09:44:49 AM »
Quote from: Thomas Richter;781127
There are other alternatives, though. You don't need to scroll every line. Buffer scroll commands or output commands, interpret several commands at once and use jump scrolling then. ViNCEd does that to avoid slowing down the output, i.e. while it prints its data, it already buffers new incoming commands, then executes all at once without scrolling through each of them individually.
Yes, of course you don't scroll ten lines individually when you want to scroll ten lines. The software creates ten lines worth of space with one copy operation, and prints ten lines, otherwise you get ten screen copy operations, which is very slow. However, if those lines are printed slowly, then it's still going to be slow (especially bad for page up and page down when you have lots of long lines).

I understand that you prefer using the OS for things. It's less work and it's easier to get things to run on graphics cards and what not. On the peecee that's usually fine, but on Amiga hardware you can do better if you write your own optimized code. You just have to put in the extra effort that's required to do it properly, and when you do, you'll end up with software that runs better on lower end machines, and I think that's important.
 

Offline psxphill

Re: CopyMem Quick & Small released!
« Reply #72 on: January 06, 2015, 10:26:46 PM »
Quote from: Thomas Richter;781101
Well, here is my math. How much time does an average editor spend in rendering text, compared to waiting for my input? My personal guess is that it doesn't really matter that much in real world applications,

I've heard that argument before and I don't buy it. If it's easily possible to write code that can render text faster then you should do that, because there are easily situations where an average editor is too slow. Like if you're running something reasonably intensive in the background.

Just being fast enough when nothing else is running isn't fast enough.
 
 Sure we need it all to be standardised and consistent so it makes it easy to write software, but that should be doable.
 

Offline itix

  • Hero Member
  • *****
  • Join Date: Oct 2002
  • Posts: 2380
Re: CopyMem Quick & Small released!
« Reply #73 on: January 06, 2015, 11:59:46 PM »
Quote from: psxphill;781156
I've heard that argument before and I don't buy it. If it's easily possible to write code that can render text faster then you should do that, because there are easily situations where an average editor is too slow. Like if you're running something reasonably intensive in the background.


If you are running something CPU intensive in the background, like compiling a large project with GCC, all you need is a good scheduler.
My Amigas: A500, Mac Mini and PowerBook
 

Offline matthey

  • Hero Member
  • *****
  • Join Date: Aug 2007
  • Posts: 1294
Re: CopyMem Quick & Small released!
« Reply #74 on: January 07, 2015, 02:15:23 AM »
Quote from: psxphill;781156
I've heard that argument before and I don't buy it. If it's easily possible to write code that can render text faster then you should do that, because there are easily situations where an average editor is too slow. Like if you're running something reasonably intensive in the background.

Just being fast enough when nothing else is running isn't fast enough.
 
 Sure we need it all to be standardised and consistent so it makes it easy to write software, but that should be doable.



I agree. I like the idea of using the OS but it needs to provide reasonably optimal functions. Is aligning the destination and using an unrolled MOVE.L loop too much to ask for CopyMem()/CopyMemQuick() when it is competitively the fastest for the 68000-68060? Would it be a bad thing if Olsen sold more copies of Roadshow because the memory copying bottleneck was reduced? We need to improve and use Amiga profilers but memory copying is a CPU intensive task that is easily improved. The Amiga philosophy has always been about efficiency and not just replacing the CPU with a faster one.
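For readers who want to see the shape of "align the destination, then use an unrolled long-word loop", here is a rough portable C sketch. It is not the CopyMem() patch discussed in this thread; `copy_mem()` is a made-up name, the unroll factor of four is arbitrary, and the fixed-size `memcpy()` calls are just the portable C way to express a 32-bit move that a compiler can turn into a single instruction. A real 68k version would be written with MOVE.L, and on a plain 68000 it would also have to deal with an odd source address, since long-word accesses there must be word aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes from src to dst (regions must not overlap). */
static void *copy_mem(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: byte copies until the destination is 4-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: unrolled loop, four 32-bit moves (16 bytes) per iteration. */
    while (n >= 16) {
        uint32_t w0, w1, w2, w3;
        memcpy(&w0, s,      4);
        memcpy(&w1, s + 4,  4);
        memcpy(&w2, s + 8,  4);
        memcpy(&w3, s + 12, 4);
        memcpy(d,      &w0, 4);
        memcpy(d + 4,  &w1, 4);
        memcpy(d + 8,  &w2, 4);
        memcpy(d + 12, &w3, 4);
        s += 16;
        d += 16;
        n -= 16;
    }

    /* Tail: whatever is left, one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Only the destination is aligned here, which matches the wording of the post; whether that is the best choice on every 68k model is exactly the kind of question a profiler would have to answer.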

Quote from: itix;781160
If you are running something CPU intensive in the background, like compiling a large project with GCC, all you need is a good scheduler.


The 68k frontend for vbcc, vc, had the task priority lowered for better multi-tasking. Editing is now practical while compiling which is very convenient.

I believe 68k GCC will use the current shell process priority (ChangeTaskPri).