Amiga.org

Operating System Specific Discussions => Amiga OS => Amiga OS -- Development => Topic started by: DamageX on January 09, 2006, 07:03:48 AM

Title: newb questions, hit the hardware or not?
Post by: DamageX on January 09, 2006, 07:03:48 AM
I want to write some programs in assembly (please don't suggest a HLL as I won't use one) on my A2000. I want to do file I/O, some graphics on a HAM6 screen, and perhaps audio as well. Now the question is how to go about it. Looking at bits of source code and documentation that I've been able to find, the method of disabling the OS and accessing hardware registers directly seems to be very clear. But, can OS calls still be used for file I/O in this case? And where could I learn to open a custom screen and play audio the OS-friendly way anyways?
Title: Re: newb questions, hit the hardware or not?
Post by: Piru on January 09, 2006, 08:21:50 AM
@DamageX
Quote
Now the question is how to go about it. Looking at bits of source code and documentation that I've been able to find, the method of disabling the OS and accessing hardware registers directly seems to be very clear. But, can OS calls still be used for file I/O in this case? And where could I learn to open a custom screen and play audio the OS-friendly way anyways?

You can do exactly this, but you need to be very careful when setting it up or it will b0rk. Basically you need to take over the display only (and prevent the system from opening any requesters, since you won't see them!), and keep the system running in the background. You should then use an input.device inputhandler to get input from the user. If you intend to use disk I/O, you can't disable the interrupts or the DMA.

Here's a simple startup code that does exactly this:
hwstartup.asm (http://www.iki.fi/sintonen/src/hwstartup/hwstartup.asm)

With this startup code the system is still running normally while the display is taken over by it. You can do the disk I/O just fine.

To get user input you can modify ih_code to process input events (IECLASS_RAWKEY etc) BEFORE the events are nulled.
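
For instance, a handler along those lines might look like this (a hedged sketch: it assumes the usual devices/inputevent.i includes, and that is_Data was set to point at a word-sized buffer; the label and buffer names are made up for illustration):
Code: [Select]
; called by input.device: a0 = InputEvent chain, a1 = is_Data
myhandler:
	move.l	a0,-(sp)		; remember the head of the chain
.next:	cmp.b	#IECLASS_RAWKEY,ie_Class(a0)
	bne.s	.skip
	move.w	ie_Code(a0),(a1)	; latch the raw key code for the main loop
	move.b	#IECLASS_NULL,ie_Class(a0) ; swallow the event
.skip:	move.l	ie_NextEvent(a0),d0
	move.l	d0,a0
	bne.s	.next
	move.l	(sp)+,d0		; pass the chain on to the next handler
	rts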

The display can be built in several ways. You can build your own copperlist that sets up display window size, depth and plane pointers, and then render directly to the bitmaps. Or you can use the graphics.library low-level display routines to build the view, viewport, rasinfo, bitmap etc. In this case the display is "loaded" with LoadView + WaitTOF. The advantage here is that you can use the normal graphics.library rastport routines for rendering (just InitRastPort the rp and set rp_BitMap to point to your own bitmap, then fire away with gfx draw routines). However, if you intend to use a HAM6 display, the graphics.library functions aren't that useful.
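
A hedged sketch of that rastport route (assumes GfxBase in a6, the usual graphics/rastport.i includes, and a struct BitMap you have set up yourself; note a0/a1/d0/d1 are scratch across library calls, hence the reloads):
Code: [Select]
	lea	myrp(pc),a1
	jsr	_LVOInitRastPort(a6)	; initialise the rastport to defaults
	lea	mybitmap(pc),a0
	lea	myrp(pc),a1
	move.l	a0,rp_BitMap(a1)	; render into our own bitmap
	lea	myrp(pc),a1
	moveq	#1,d0
	jsr	_LVOSetAPen(a6)		; SetAPen(rp,1)
	lea	myrp(pc),a1
	moveq	#0,d0
	moveq	#0,d1
	jsr	_LVOMove(a6)		; Move(rp,0,0)
	lea	myrp(pc),a1
	move.l	#319,d0
	move.l	#199,d1
	jsr	_LVODraw(a6)		; Draw(rp,319,199)
; elsewhere, in a data section:
; myrp:	ds.b	rp_SIZEOF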

For audio you can allocate the hardware via audio.device and then proceed to bang the hw, or you can use audio.device for playing the samples (more restrictive), or you could use ahi.device (but this makes little sense if you bang the hw already; the only reason would be to allow using soundcards).
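
A hedged sketch of the allocation step (assumes devices/audio.i and exec includes, SysBase in a6, and that "ioreq" is an otherwise initialised struct IOAudio with a valid reply port; the names are made up for illustration):
Code: [Select]
	lea	ioreq(pc),a1
	move.b	#127,LN_PRI(a1)		; allocation precedence (ADALLOC_MAXPREC)
	lea	combos(pc),a0
	move.l	a0,ioa_Data(a1)		; acceptable channel combinations
	moveq	#1,d0
	move.l	d0,ioa_Length(a1)	; just one combination in the array
	lea	audioname(pc),a0
	moveq	#0,d0			; unit is picked by the allocator
	moveq	#0,d1
	jsr	_LVOOpenDevice(a6)
	tst.l	d0
	bne.s	.noaudio		; non-zero = allocation failed
	; ... on success io_Unit holds the channel mask, bang away ...

audioname:	dc.b	'audio.device',0
combos:		dc.b	%1111		; one combination: all four channels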
Title: Re: newb questions, hit the hardware or not?
Post by: DamageX on January 10, 2006, 01:45:07 AM
This is just the kind of info I needed, thanks ^_^
Title: Re: newb questions, hit the hardware or not?
Post by: DamageX on July 14, 2014, 08:22:40 AM
for the record...
Code: [Select]
.nosignal:
move.l _WBenchMsg(pc),d0
beq.s .notwb
move.l a0,a1 ; BUG: a0 is uninitialised here, should most likely be d0
jsr _LVOForbid(a6)
jsr _LVOReplyMsg(a6) ; relies on a1 surviving the Forbid() call

.notwb:
move.l d7,d0
rts

I was informed that this part is buggy: a0 was perhaps a typo for d0, and it is also not clear whether a1 is preserved across the call to _LVOForbid (exec's Forbid() is documented to preserve all registers, so a1 should survive, but the a0 bug remains).
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 14, 2014, 09:22:34 AM
Quote from: DamageX;225347
I want to write some programs in assembly (please don't suggest a HLL as I won't use one) on my A2000. I want to do file I/O, some graphics on a HAM6 screen, and perhaps audio as well. Now the question is how to go about it. Looking at bits of source code and documentation that I've been able to find, the method of disabling the OS and accessing hardware registers directly seems to be very clear. But, can OS calls still be used for file I/O in this case? And where could I learn to open a custom screen and play audio the OS-friendly way anyways?

I would strongly discourage you from accessing the hardware yourself. The Os can do everything you can do by accessing the hardware directly; there is no advantage. If you do this, for example, users will no longer be able to switch screens, or to safely run multiple programs. Also, people then often run into a Forbid() to get rid of task-switching, even though any Os-bound I/O call will require task switching and enabled interrupts.

Opening a custom screen requires the Os, but it is not hard. Information like this can be found in the RKRMs. Basically, you have to fill in a struct NewScreen and perform an OpenScreen() from intuition, or use OpenScreenTagList() directly. See intuition/screens.i.
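
A minimal sketch of the NewScreen route in assembly (assuming IntuitionBase in a6 and the usual includes; HAM comes from graphics/view.i, CUSTOMSCREEN from intuition/screens.i):
Code: [Select]
	lea	mynewscreen(pc),a0
	jsr	_LVOOpenScreen(a6)	; d0 = screen pointer, 0 on failure
	tst.l	d0
	beq.s	.fail

mynewscreen:
	dc.w	0,0		; ns_LeftEdge, ns_TopEdge
	dc.w	320,200		; ns_Width, ns_Height
	dc.w	6		; ns_Depth: six planes for HAM6
	dc.b	0,1		; ns_DetailPen, ns_BlockPen
	dc.w	HAM		; ns_ViewModes: hold-and-modify
	dc.w	CUSTOMSCREEN	; ns_Type
	dc.l	0		; ns_Font
	dc.l	0		; ns_DefaultTitle
	dc.l	0		; ns_Gadgets (unused)
	dc.l	0		; ns_CustomBitMap

Clean up with CloseScreen() (screen pointer in a0) when you are done.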
Title: Re: newb questions, hit the hardware or not?
Post by: OlafS3 on July 14, 2014, 10:38:04 AM
Quote from: DamageX;225347
I want to write some programs in assembly (please don't suggest a HLL as I won't use one) on my A2000. I want to do file I/O, some graphics on a HAM6 screen, and perhaps audio as well. Now the question is how to go about it. Looking at bits of source code and documentation that I've been able to find, the method of disabling the OS and accessing hardware registers directly seems to be very clear. But, can OS calls still be used for file I/O in this case? And where could I learn to open a custom screen and play audio the OS-friendly way anyways?


I would also discourage directly hitting the hardware. There are a couple of new FPGA cores in development (FPGA Arcade and the Apollo core are examples, but I also know of at least two others). If you want your software to run everywhere, it is a safer bet to use the OS.
Title: Re: newb questions, hit the hardware or not?
Post by: Jose on July 14, 2014, 11:37:01 AM
8 years later!? Awesome :)
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 14, 2014, 12:07:29 PM
Quote from: OlafS3;768906
If you want your software to run everywhere, it is a safer bet to use the OS.
You can't always do that if you want your software to run well on current low to mid end systems. Take planar CPU blitting (in cases where it's faster than c2p+chunky blitting): you just don't want to use the OS for that.

Making your software system friendly is one thing (I'm all for that), but getting it to run everywhere isn't always a good idea (depending on what the software is).
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 14, 2014, 12:27:02 PM
Quote from: OlafS3;768906
I would also discourage directly hitting the hardware. There are a couple of new FPGA cores in development (FPGA Arcade and the Apollo core are examples, but I also know of at least two others). If you want your software to run everywhere, it is a safer bet to use the OS.


I can only speak for the Apollo/Phoenix CPU and for the S-AGA chipset.
With them you can safely hit the hardware as you are used to doing.
I myself write old school copperlists when I write demos testing the CPU or the AGA chipset.
So doing this is safe and no problem, at least for the Apollo/Phoenix CPU and the SAGA++ chipset.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 14, 2014, 03:45:37 PM
Quote from: Thorham;768912
You can't always do that if you want your software to run well on current low to mid end systems. Take planar CPU blitting (in cases where it's faster than c2p+chunky blitting): you just don't want to use the OS for that.

There is no problem using the blitter if the Os calls do not offer what you need - and if you are sure what you are doing. You can reserve the blitter for your own purpose with OwnBlitter()/DisownBlitter() from the graphics.library, or by using QBlit()/QSBlit() to queue a blitter job.
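
For the first route, the skeleton is short (a sketch, assuming GfxBase in a6):
Code: [Select]
	jsr	_LVOOwnBlitter(a6)	; lock other users out of the blitter
	jsr	_LVOWaitBlit(a6)	; let any pending blit finish first
	; ... set up BLTCON0/BLTCON1 etc. and start your own blit here ...
	jsr	_LVOWaitBlit(a6)	; wait for our blit before releasing
	jsr	_LVODisownBlitter(a6)	; give the blitter back to the system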

By "you are sure what you are doing" I mean that you cannot, in general, expect that you have a simple old planar bitmap for your screen. Graphics cards exist since a long time, and there working with the blitter is often a bad idea. Whenever possible, use the system calls to get your job done, because then you'll be sure it's done correctly even on non-native hardware.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 14, 2014, 04:52:36 PM
Quote from: Thomas Richter;768926
There is no problem using the blitter if the Os calls do not offer what you need
I actually meant blitting with the CPU (because on a 68020+ the CPU is faster than the blitter).

Quote from: Thomas Richter;768926
By "you are sure what you are doing" I mean that you cannot, in general, expect that you have a simple old planar bitmap for your screen.
So, you'd have to sacrifice performance on actual native hardware just because the OS might not open a planar screen on non-native hardware? That means that basically everything you'd write ends up running like crap on actual Amigas without GFX boards. Sounds like a bad idea.
Title: Re: newb questions, hit the hardware or not?
Post by: itix on July 14, 2014, 05:02:12 PM
Quote from: Thomas Richter;768926

By "you are sure what you are doing" I mean that you cannot, in general, expect that you have a simple old planar bitmap for your screen. Graphics cards exist since a long time, and there working with the blitter is often a bad idea. Whenever possible, use the system calls to get your job done, because then you'll be sure it's done correctly even on non-native hardware.


He was going to use HAM6 which locks out certain configurations anyway.
Title: Re: newb questions, hit the hardware or not?
Post by: Sean Cunningham on July 14, 2014, 05:05:28 PM
Speed or compatibility, you can rarely pick more than one of these.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 14, 2014, 05:05:28 PM
Quote from: Thorham;768935
So, you'd have to sacrifice performance on actual native hardware just because the OS might not open a planar screen on non-native hardware? That means that basically everything you'd write ends up running like crap on actual Amigas without GFX boards. Sounds like a bad idea.

No, I do not mean this. Do you have an idea how much "performance you sacrifice" by using the Os calls? Or "how much worse" the Os is at using the native hardware? Actually, not much. The core function (BltBitmap()) is pretty much low-level and bare to the metal. I wonder whether you can do any better yourself. The amount of additional checking needed for non-native hardware is a single pointer (in the bitmap). How much performance does that cost?

Sorry, but that's a typical argument of the "cycle counter party": Just hack the hardware because we don't know any better and argue "it's for speed". Rule number one for performance optimizations: Measure first. Then optimize.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 14, 2014, 05:55:52 PM
@Thomas Richter

I've researched the OS functions extensively in the interest of encapsulating the most efficient special effects of the Amiga chipsets in shared libraries.  This is not only to make things easier for newbies on the Amiga but also to be able to reroute some of them to emulations on graphics cards for better compatibility.

BltBitMap only works on the Blitter.  What about MrgCop, the rather incomplete interface to the Copper?  CINIT, CBUMP, CWAIT, and CMOVE are hardly enough macros to suffice!  The Copper is able to implement display-list like properties to queue the Blitter much more efficiently than QBlit can do using the CPU and an interrupt.  Also, I'd like to be able to merge partial Copper lists based on different starting raster positions so that multiple effects can be combined.  (Both usages need not be concurrent since waiting for the Blitter and raster positions in the same CWAIT tend to result in race conditions.)  Neither the CNEXTBUFFER macro nor its underlying subroutine to skip to another buffer of Copper instructions was ever implemented even though certain infrastructure was added to graphics/copper.h indicating its existence.
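
For readers who haven't seen that interface, the macros boil down to plain graphics.library calls, so a user copper list can also be built directly from assembly. A hedged sketch (assumes GfxBase in a6, a2 = a struct UCopList allocated with AllocMem(MEMF_PUBLIC|MEMF_CLEAR), which the OS frees at CloseScreen, and a3 = the target ViewPort):
Code: [Select]
	move.l	a2,a0
	moveq	#3,d0			; room for three copper instructions
	jsr	_LVOUCopperListInit(a6)
	move.l	a2,a1
	moveq	#100,d0			; CWAIT: wait for raster line 100
	moveq	#0,d1
	jsr	_LVOCWait(a6)
	move.l	a2,a1
	jsr	_LVOCBump(a6)
	move.l	a2,a1
	move.l	#$dff180,d0		; CMOVE: COLOR00...
	move.l	#$0f00,d1		; ...goes red from line 100 down
	jsr	_LVOCMove(a6)
	move.l	a2,a1
	jsr	_LVOCBump(a6)
	move.l	a2,a1
	move.l	#10000,d0		; CEND is just CWAIT(10000,255)
	move.l	#255,d1
	jsr	_LVOCWait(a6)
	move.l	a2,a1
	jsr	_LVOCBump(a6)
	move.l	a2,vp_UCopIns(a3)	; attach to the viewport (under Forbid() in real code)
	; ...then call intuition's RethinkDisplay() to merge it in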
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 14, 2014, 06:48:48 PM
Quote from: Thomas Richter;768938
The core function (BltBitmap()) is pretty much low-level and bare to the metal. I wonder whether you can do any better yourself.
If this call uses the blitter, then you can do much better with a 25 MHz 68020/30, depending on what you want to do.

An example is a Fire Emblem/Advance Wars style game, where you have several layers of tiles (all of which can be animated) and sprites (all animated) that are aligned to 16 pixels on the x-axis. With the CPU you can do nice pipelined 32 bit blits that do two tile positions at the same time by reading data for both, and doing a single transpose on that data. Much faster than what the blitter can do.
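
The inner loop of such a routine is essentially a movem.l pipeline. A hedged toy sketch of just the copy part (register usage and labels are illustrative only):
Code: [Select]
.blitrow:
	movem.l	(a0)+,d0-d5/a2-a3	; read 32 bytes: each longword covers two 16px tile columns
	movem.l	d0-d5/a2-a3,(a1)	; one burst write to the destination plane
	lea	32(a1),a1
	dbf	d7,.blitrow		; d7 = row count - 1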

Quote from: Thomas Richter;768938
The amount of additional checking needed for non-native hardware is a single pointer (in the bitmap). How much performance does that cost?
Nothing, because you simply do that once when the software starts. Not a problem.

Quote from: Thomas Richter;768938
Sorry, but that's a typical argument of the "cycle counter party": Just hack the hardware because we don't know any better and argue "it's for speed".
In the case of the blitter it's already known that it's slower than a 25 MHz 68020/30 with fastmem, especially for the simple blits I described above.

Quote from: Thomas Richter;768938
Rule number one for performance optimizations: Measure first. Then optimize.
Obviously :)
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 14, 2014, 10:03:32 PM
Quote from: Thorham;768944
If this call uses the blitter, then you can do much better with a 25 MHz 68020/30, depending on what you want to do.
Whether it does or does not use the blitter is a matter of circumstances. But that's exactly the reason for *using* this function. If you hack the hardware yourself, you cannot take advantage of improvements or configurations which would otherwise help to keep your program operating (say, on a graphics card) or working fast (by taking advantage of a replacement implementation).
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 14, 2014, 10:36:34 PM
Quote from: Thomas Richter;768958
Whether it does or does not use the blitter is a matter of circumstances. But that's exactly the reason for *using* this function. If you hack the hardware yourself, you cannot take advantage of improvements or configurations which would otherwise help to keep your program operating (say, on a graphics card) or working fast (by taking advantage of a replacement implementation).
Took a look at BltBitmap() in the autodocs, and it looks like a generic blit routine. Definitely not going to be faster for the example I gave above than an optimized, specialized, pipelined 32 bit CPU blit routine. The example above is too specific to handle properly with a generic routine.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 15, 2014, 06:39:10 AM
Quote from: Thorham;768959
Took a look at BltBitmap() in the autodocs, and it looks like a generic blit routine. Definitely not going to be faster for the example I gave above than an optimized, specialized, pipelined 32 bit CPU blit routine. The example above is too specific to handle properly with a generic routine.

You still do not get it. BltBitmap() may use the CPU, and may run into an optimized, specialized, pipelined 32 bit CPU blit routine. Actually, it most certainly does if P96 is installed.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 15, 2014, 08:09:28 AM
Quote from: Thomas Richter;768970
You still do not get it. BltBitmap() may use the CPU, and may run into an optimized, specialized, pipelined 32 bit CPU blit routine. Actually, it most certainly does if P96 is installed.
I get it perfectly fine, but you don't seem to get the difference between generic, and non-generic blitting routines. BltBitmap() has a source bitmap, a mask and a destination bitmap. How is that going to be optimal when you have a background tile, sprite+mask, overlay+mask and on top of that some pixels from a sprite one tile below, because the sprites are 24 pixels high and another mask for that (tiles are 16x16, sprites are 24 pixels high)? With a specialized routine, you can read ALL of that data in one go, AND you can do it for two tile positions at the same time. With BltBitmap() you need multiple calls to do one tile position.

Basically you're trying to tell me that a generic, one size fits all routine is better than specialized code that has been written specifically for the job. It's just not true.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 15, 2014, 11:19:23 AM
Quote from: Thorham;768972
Basically you're trying to tell me that a generic, one size fits all routine is better than specialized code that has been written specifically for the job. It's just not true.

It is at least better than writing software that does not work for some configurations. That is what the operating system is good for.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 11:26:39 AM
The main advantage of using the OS functions for blitting is that it works asynchronously so that the CPU can multitask while the Blitter is still plotting graphics.

Thorham's approach is the old-fashioned way of using the CPU to copy pixels.  It works best when you have sufficient cache memory and CPU time to do it that way.

The Blitter is clocked slow (3.5 MHz) and doesn't cache the mask plane when blitting BOBs.  I'd call that a shortcoming of the Blitter more than a shortcoming of the OS routines though.

Once faster, FPGA-based chipsets come on the scene, other advantages for using OS functions appear:  Using native chunky modes instead of chunky to planar conversion will save a lot of CPU time, for example.  If you're banging the hardware, your program will be oblivious to this new chunky hardware.

Does that sum it up?
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 11:36:30 AM
Quote from: Thomas Richter;768982
It is at least better than writing software that does not work for some configurations. That is what the operating system is good for.


Or would be good for if it adequately supported all features of the chipset...  Copper lists anyone?

My opinion is that until we run all of our software in a static-compiled VM or have some way of propagating macros at install time rather than compile/assemble time, the OS will have some compatibility advantage.  Once we reach that point, however, a lot of stuff will shift from runtime to the programming ABI as accessed by the API defined by the OS so that unnecessary runtime checks can be optimized away via constant propagation and dead-code elimination in an ahead-of-time compiler.  That's one thing (and maybe the only thing) that AmigaDE did right.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 15, 2014, 11:51:41 AM
Quote from: Thomas Richter;768982
It is at least better than writing software that does not work for some configurations. That is what the operating system is good for.
It's not when your target is AGA+68020/30 and possibly A1200+trapdoor fastmem. Why do we have to take higher end systems into account, and make sacrifices on the low end, when the software could work perfectly fine on the low end?

This is Amiga we're talking about, not some high end computer where making use of the OS for everything might make sense. For low end (68020/30), using the OS is fine. Well-behaved software is nice, after all; killing the OS isn't a good thing (except for WHDLoad and demos), and the same goes for not using the OS's screen-opening functions. But blitting has to be fast on low end machines, or the software is going to run like crap.

Why does everything have to be adapted for high end machines and custom expansion boards? Want to run Amiga software? Use an Amiga.

Quote from: SamuraiCrow;768983
The main advantage of using the OS functions for blitting is that it works asynchronously so that the CPU can multitask while the Blitter is still plotting graphics.
With the blitter, yes, but the blitter is too slow.

Quote from: SamuraiCrow;768983
Thorham's approach is the old-fashioned way of using the CPU to copy pixels.  It works best when you have sufficient cache memory and CPU time to do it that way.
It's also the only way to get good performance on low end machines. It's old fashioned because the hardware is old, and many Amiga users use this hardware.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 12:00:37 PM
Quote from: Thorham;768987
It's also the only way to get good performance on low end machines. It's old fashioned because the hardware is old, and many Amiga users use this hardware.


Hand-optimizing is a pain, but being able to reinstall and reoptimize software from some bytecode that is optimized and compiled at install time will let you have your cake and eat it too!  (At the cost of some install time.)
Title: Re: newb questions, hit the hardware or not?
Post by: nicholas on July 15, 2014, 12:06:34 PM
Quote from: Thorham;768987
It's not when your target is AGA+68020/30 and possibly A1200+trapdoor fastmem. Why do we have to take higher end systems into account, and make sacrifices on the low end, when the software could work perfectly fine on the low end?

This is Amiga we're talking about, not some high end computer where making use of the OS for everything might make sense. For low end (68020/30), using the OS is fine. Well-behaved software is nice, after all; killing the OS isn't a good thing (except for WHDLoad and demos), and the same goes for not using the OS's screen-opening functions. But blitting has to be fast on low end machines, or the software is going to run like crap.

Why does everything have to be adapted for high end machines and custom expansion boards? Want to run Amiga software? Use an Amiga.


With the blitter, yes, but the blitter is too slow.


It's also the only way to get good performance on low end machines. It's old fashioned because the hardware is old, and many Amiga users use this hardware.

It's not like we can't detect the machine we are running on and use a different codepath for each architecture. Best of both worlds. :)
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 15, 2014, 12:10:50 PM
Quote from: SamuraiCrow;768989
Hand-optimizing is a pain
680x0 assembly language is one of my computer-related hobbies, and optimizing is part of the fun :) Actually, it's probably my favorite part of 680x0 coding. It's also the reason I stick to my 50 MHz 68030: it's challenging and interesting to get certain things to be fast on such systems.

Quote from: SamuraiCrow;768989
but being able to reinstall and reoptimize software from some bytecode that is optimized and compiled at install time will let you have your cake and eat it too!  (At the cost of some install time.)
Do we currently have such tools? How fast would that be for 68020/30 compared to hand optimized code?

Quote from: nicholas;768991
It's not like we can't detect the machine we are  running on and use a different codepath for each architecture. Best of  both worlds. :)
Indeed.
Title: Re: newb questions, hit the hardware or not?
Post by: OlafS3 on July 15, 2014, 12:14:18 PM
Would it not be better to have standard libraries for that, with versions for the different target platforms? When I worked on building up Aros Vision I did a lot of searching (both on the web and on Aminet) and found a lot of libraries dedicated to offering fast graphics while hiding what platform you use (partly supporting ECS, AGA and RTG). SamuraiCrow knows one of those (and the owner has permitted making changes to it). If adapted, it would make life a lot easier. The 68k codebase is huge and we should make use of it as far as possible. We should have a set of portable libraries covering all important areas like graphics, sound and so on, so that developers do not have to care about implementations/optimizations for a certain core (and it would be possible to port them to AROS/MorphOS/AmigaOS too).
Title: Re: newb questions, hit the hardware or not?
Post by: nicholas on July 15, 2014, 12:22:15 PM
Quote from: OlafS3;768994
Would it not be better to have standard libraries for that, with versions for the different target platforms? When I worked on building up Aros Vision I did a lot of searching (both on the web and on Aminet) and found a lot of libraries dedicated to offering fast graphics while hiding what platform you use (partly supporting ECS, AGA and RTG). SamuraiCrow knows one of those (and the owner has permitted making changes to it). If adapted, it would make life a lot easier. The 68k codebase is huge and we should make use of it as far as possible. We should have a set of portable libraries covering all important areas like graphics, sound and so on, so that developers do not have to care about implementations/optimizations for a certain core (and it would be possible to port them to AROS/MorphOS/AmigaOS too).

There are many. Not sure about publicly released ones, but I know of several that are in various states of completion.
Title: Re: newb questions, hit the hardware or not?
Post by: smerf on July 15, 2014, 12:29:22 PM
You can bang the hardware, but remember you may have to make several different versions depending on which Amiga you want the program to run on.


smerf
Title: Re: newb questions, hit the hardware or not?
Post by: OlafS3 on July 15, 2014, 12:30:07 PM
Quote from: nicholas;768996
There are many. Not sure about publicly released ones, but I know of several that are in various states of completion.

Yes, there are quite a lot... but many are long forgotten (if they were ever used).

example is this:
http://aminet.net/package/dev/misc/gms_dev

The author allows it to be patched. AGA works (and probably ECS). RTG could be supported too if someone makes the changes. The problem is that it seems not to be used very much. There are years of work in it, but everyone reinvents the wheel. We should use it, perhaps write documentation about its state and recommendations on when to use it, get permissions and sources where possible, and start to improve these libraries and make includes/modules for different languages. That would help development a lot more than many other ideas.
Title: Re: newb questions, hit the hardware or not?
Post by: itix on July 15, 2014, 12:34:46 PM
Quote from: OlafS3;768994
Would it not be better to have standard libraries for that, with versions for the different target platforms? When I worked on building up Aros Vision I did a lot of searching (both on the web and on Aminet) and found a lot of libraries dedicated to offering fast graphics while hiding what platform you use (partly supporting ECS, AGA and RTG). SamuraiCrow knows one of those (and the owner has permitted making changes to it). If adapted, it would make life a lot easier. The 68k codebase is huge and we should make use of it as far as possible. We should have a set of portable libraries covering all important areas like graphics, sound and so on, so that developers do not have to care about implementations/optimizations for a certain core (and it would be possible to port them to AROS/MorphOS/AmigaOS too).


Overall they are not better than the OS routines. Maybe some functions are better optimized for some hardware, but you still can't get maximum performance out of older Amigas. Sprites are hard to abstract since they have many HW restrictions on the Amiga, and the Copper is completely missing from RTG.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 12:37:27 PM
Quote from: nicholas;768991
It's not like we can't detect the machine we are running on and use a different codepath for each architecture. Best of both worlds. :)


We could if we coded for a static VM and let the final optimization take place at install time.

Quote from: Thorham;768993
Do we currently have such tools? How fast would that be for 68020/30 compared to hand optimized code?


Mostly all that exists at this point is compiler middle-ware like LLVM or fixed-function compilers like GCC and VBCC.  But I know LLVM has a PBQP register allocator that is smart enough to stuff multiple small values in a large register when it makes sense to do so, for example:
Code: [Select]
move.w var1(a0),d0  ;load var1
swap d0  ;stuff it in the top half of d0
move.w var2(a0),d0  ;load var2 in the bottom half of d0

and so on.  Compilers can be smart if they are programmed to be.

Once implemented, a virtual machine based on this could make a difference for anybody with generally unsupported hardware! This technology is already in use by the PNaCl VM in the Google Chrome browser, but for little-endian machines only. This is why low-level coding is dying out: not only because high-level code is cheaper to make, but because the optimization can be automated!

I've been pushing for this stuff since 1998, and if Amiga, Inc. hadn't been so pig-headed stubborn, we'd have had this on the Amigas by now! AmigaDE could have made Amiga much more compatible without the need to alter the hardware. Since LLVM is Apple-supported open-source freeware and the backend used by the PNaCl VM is likewise open source (Google-supported), it is almost in reach again!
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 12:48:31 PM
Quote from: itix;769001
Overall they are not better than the OS routines. Maybe some functions are better optimized for some hardware, but you still can't get maximum performance out of older Amigas. Sprites are hard to abstract since they have many HW restrictions on the Amiga, and the Copper is completely missing from RTG.


I've been studying how to do this since 1995.  Sprites may be hard to abstract but they are easy to emulate.

One of my planned libraries will make it possible to "tile" sprites side-by-side and top-to-bottom.  All it needs is for a rectangle of a layer to be allocated for each 16-color sprite pair (using a non-standard interleave so that it appears to the OS to be a narrow 16-color screen).  Layers.library will queue a separate blitter operation to each rectangle with clipping applied, so that the seams between the sprites will not be detectable.  Of course the main graphics will have to be specially coded for Amiga, but getting it to work on other systems will be dead-simple by comparison.

I've got similar plans for the effects created with the Copper as well.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 15, 2014, 02:25:36 PM
To SamuraiCrow (http://www.amiga.org/forums/member.php?u=72):

That's all very nice, but I like programming in 680x0 assembly language; it's one of my computer-related hobbies, and I'm not the only one.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 15, 2014, 02:38:54 PM
@Thorham

That's cool.  Just don't throw away your Assembly source.  The comments help others to figure out how to cross-assemble it into bytecode.  :-)
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 15, 2014, 05:39:57 PM
Quote from: SamuraiCrow;769006
One of my planned libraries will make it possible to "tile" sprites side-by-side and top-to-bottom.  All it needs is for a rectangle of a layer to be allocated for each 16-color sprite pair (using a non-standard interleave so that it appears to the OS to be a narrow 16-color screen).  Layers.library will queue a separate blitter operation to each rectangle with clipping applied, so that the seams between the sprites will not be detectable.  Of course the main graphics will have to be specially coded for Amiga, but getting it to work on other systems will be dead-simple by comparison.

Why do you make this needlessly complicated? First, layers is *not* the right library or abstraction for sprite emulation or blitting. It only maintains cliprects and damage regions, nothing else. But if you want movable objects, there is already perfect support for this. Graphics.library BOBs have existed forever and provide an unlimited number of moving and animated objects. The correct abstraction is there, it is in the Os and it is completely supported. IIRC, the workbench uses them for moving drawers and icons around.

The copper is too specialized and, given the number of colors a graphics card supports, not even required there. The copper was a chip from the 80s, required to work around the limited bandwidth chips had back then. Similar effects do not require a copper nowadays, and allow a simple straightforward approach that was not available back then.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 15, 2014, 05:47:14 PM
@Thorham

Hitting the hardware is an issue of speed vs compatibility/flexibility; for the user it means limiting their hardware choice.

It depends on what you want to do. For the most part you should avoid banging the hardware, but if you are interested in making an OCS game for the Amiga 500, it makes the most sense to take advantage of every CPU cycle you can get and to avoid the overhead of the software API.

Writing a game for a modern graphics card while using the CIAA/CIAB timers will exclude many people from using your software (AmigaONE-*/Sam4*0); therefore it's not a good idea.

Poking around in $DFF00A and $DFF00C to read the joystick and mouse will limit your software to working only with 9-pin joysticks and mice, and will not let your software support USB mice and joysticks.

Many people have USB upgrades on their classic Amiga (Algor/Subway), and NG users don't have a 9-pin bus mouse at all.

Another annoying thing people do is assume they know BytesPerRow; this forces the game or program to run in only one screen resolution, with fixed mode IDs that can't be changed.
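
Reading the real values out of the struct BitMap instead costs almost nothing (a sketch, assuming the graphics/gfx.i includes and a bitmap pointer in a0):
Code: [Select]
	move.w	bm_BytesPerRow(a0),d0	; actual row stride in bytes
	moveq	#0,d1
	move.b	bm_Depth(a0),d1		; number of bitplanes
	move.l	bm_Planes(a0),a1	; pointer to the first plane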

For example, if you write a game/program to use the Paula sound chip, it might be faster on an MC68020, which doesn't have a lot of CPU power, but you will not let your game/program use a modern sound card like the Prelude 1200 or any other sound card.

Sticking to graphics.library will make your game or program run on any hardware, but it will most likely run slower than if you wrote directly to memory instead of using the graphics draw functions, and so on.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 15, 2014, 06:22:19 PM
Quote from: LiveForIt;769029
Sticking to graphics.library will make your game or program run on any hardware, but it will most likely run slower than if you wrote directly to memory instead of using the graphics draw functions, and so on.

I pretty much doubt this. If there is no cliprect (i.e. the rastport has no layer) graphics.library goes directly to the hardware on native amigas. There is no overhead in such a case. Again, before making such claims, please measure.

If there is a graphics card, it goes to the hardware of the graphics card if it supports 2D acceleration, same story. Only difference: It works on all hardware.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 15, 2014, 07:06:09 PM
Quote from: Thomas Richter;769032
graphics.library goes directly to the hardware on native amigas.

I guess it depends on what you want to do, but writing directly to hardware might be slower.

Chipmem is slow and fastmem is fast, so pre-rendering into fastmem makes sense if you're doing a lot of reads and writes from/to memory, and then using the most efficient way to transfer the result to chip memory.

The same is also true for modern graphics cards: writing directly to graphics card memory over the PCIe interface is slow, because it does not let you use DMA to copy the data.

The OS pen system is not optimized for any particular picture format.

Quote
Again, before making such claims, please measure.

I should maybe not be so absolute in my statement.

Quote
it goes to the hardware of the graphics card if it supports 2D acceleration, same story. Only difference: It works on all hardware.

I take your word for it.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 15, 2014, 09:58:13 PM
Quote from: LiveForIt;769029
It depends on what you want to do
Write software that's fast on 20s and 30s with AGA, and when possible OCS/ECS, while keeping it system friendly, but without sacrificing speed (no need for syskill practices, custom screens and what not, of course).

Quote from: LiveForIt;769029
Writing a game for a modern graphics card while using the CIAA/CIAB timers will exclude many people from using your software (AmigaONE-*/Sam4*0); therefore it's not a good idea.
I'm only interested in writing Amiga software anyway.

Quote from: LiveForIt;769029
Poking around in $DFF00A and $DFF00C
On 20s and 30s there's no need for that. Especially the mouse and keyboard are very easy to handle properly with the OS.

Quote from: LiveForIt;769029
Another annoying thing people do is assume they know BytesPerRow
For a game that opens a fixed-size screen this isn't a problem. On Amigas this is generally not a problem, and if some user can't use this, then I wonder how their Amiga is set up.

Quote from: LiveForIt;769029
this forces the game or program to run in only one screen resolution, with fixed mode IDs that can't be changed
You can at least use different monitor IDs so that double scan is supported. For non-games software it's a definite no-no.

Quote from: LiveForIt;769029
For example, if you write a game/program to use the Paula sound chip, it might be faster on an MC68020, which doesn't have a lot of CPU power, but you will not let your game/program use a modern sound card like the Prelude 1200 or any other sound card.
Not a problem when you run the software on an actual Amiga, unless someone doesn't have Paula output connected, and that's their own problem.

Quote from: LiveForIt;769029
Sticking to graphics.library will make your game or program run on any hardware, but it will most likely run slower than if you wrote directly to memory instead of using the graphics draw functions, and so on.
Only using the OS in general is unacceptable for me, because it only helps NG users. Why should we take NG users into account when writing Amiga software? If people want to run Amiga software, then let them use Amiga computers or emulation.
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 15, 2014, 10:07:06 PM
Quote from: LiveForIt;769029
@Thorham

Hitting the hardware is an issue of speed vs compatibility/flexibility; for the user it means limiting their hardware choice.

It depends on what you want to do. For the most part you should avoid banging the hardware, but if you are interested in making an OCS game for the Amiga 500, it makes the most sense to take advantage of every CPU cycle you can get and to avoid the overhead of the software API.

Writing a game for a modern graphics card while using the CIAA/CIAB timers will exclude many people from using your software (AmigaONE-*/Sam4*0); therefore it's not a good idea.

Poking around in $DFF00A and $DFF00C to read the joystick and mouse will limit your software to working only with 9-pin joysticks and mice, and will not let your software support USB mice and joysticks.



What you say is correct.

But it does not need to be like this.
A CIA needs only a minimal FPGA.
By adding the smallest and cheapest FPGA to a new system, every system could be made CIA-compatible for a price of close to nothing.

The same is true for USB and accessing it via the DFFxxx registers.
A USB-to-DFFxxx bridge logic costs around $2.

Technically there is really no reason not to have both.
A NEO system could, with no problem at all, implement USB and the DFF chipset and CIAs for nearly no money.
Title: Re: newb questions, hit the hardware or not?
Post by: matthey on July 16, 2014, 12:16:29 AM
Quote from: Thomas Richter;769032
I pretty much doubt this. If there is no cliprect (i.e. the rastport has no layer) graphics.library goes directly to the hardware on native amigas. There is no overhead in such a case. Again, before making such claims, please measure.

If there is a graphics card, it goes to the hardware of the graphics card if it supports 2D acceleration, same story. Only difference: It works on all hardware.


I agree with your point on using the AmigaOS (where possible given constraints), but I disagree with the "no overhead" claim for using the AmigaOS, even if it "goes directly to the hardware". Function calls through the jump table have overhead, and compiled AmigaOS code is not optimal. For example, your new layers.library is riddled with instructions like:

   lea (0,a6),a4  ; optimize to move.l a6,a4
   move.l #0,-(sp) ; optimize to clr.l -(sp)
   lea (a3),a6 ; optimize to move.l a3,a6

That's just talking about peephole optimizations. It's compiled for the 68000, which isn't a huge loss with this particular code, but it's a few percent more. Actually, the biggest gain could be using the auto word extension of address registers, which SAS/C doesn't take advantage of (MVS or MOVE.W+EXT.L fusion/folding would be a huge gain with this code). There is no instruction scheduling for superscalar processors like the 68060. Redundant memory accesses sometimes aren't even avoided, like this:

   move.w (a5),d0
   move.w (a5)+,d1  

We are not talking about a cycle or 2. All these missed optimizations add up, and then programmers roll their own code to gain 10%+ speed over the AmigaOS. I want programmers to use the AmigaOS functions (but not be required to). We need to improve compilers and try to make code close to optimal for this to happen. Call me a cycle counter and ignore me if you like.

Quote from: Thorham;769052
Write software that's fast on 20s and 30s with AGA, and when possible OCS/ECS, while keeping it system friendly, but without sacrificing speed (no need for syskill practices, custom screens and what not, of course).


With a little bit of learning, it's possible to write code that is fairly optimal on the 68020-68060. Instruction scheduling for a 68060 generally doesn't affect 68020/68030 performance but can double 68060 performance with some code. Learning about modern CPU pipelining, superscalar execution, caches, hazards/bubbles, etc. (not just 68020/68030 timings and specifics) will improve your code for the 68020/68030 also. We may get a superscalar 68k fpga CPU someday where your code will magically be much faster too ;).
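
To make the scheduling point concrete, here is a hedged toy example (illustrative only, not code from the thread): the reordered version removes the load-to-use dependency between neighbouring instructions, which lets the 68060's two pipes overlap them, while a 68020/68030 runs both versions essentially the same.
Code: [Select]
; serial: every instruction waits on the result of the one before it
	move.l	(a0)+,d0
	add.l	d0,d2
	move.l	(a0)+,d1
	add.l	d1,d3
; interleaved: each add no longer waits on the move right before it
	move.l	(a0)+,d0
	move.l	(a0)+,d1
	add.l	d0,d2
	add.l	d1,d3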

Quote from: biggun;769054
What you say is correct.

But it does not need to be like this.
A CIA needs only a minimal FPGA.
By adding the smallest and cheapest FPGA to a new system, every system could be made CIA-compatible for a price of close to nothing.

The same is true for USB and accessing it via the DFFxxx registers.
A USB-to-DFFxxx bridge logic costs around $2.

Technically there is really no reason not to have both.
A NEO system could, with no problem at all, implement USB and the DFF chipset and CIAs for nearly no money.


The SAM 440 and 460 have a small Lattice FPGA. The mentality of some of the so-called next generation Amiga guys is to get away from hardware dependency. They also may be trying to keep their AmigaOS closed for proprietary and security reasons.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 16, 2014, 12:28:37 AM
Quote from: matthey;769066
The mentality of some of the so-called next generation Amiga guys is to get away from hardware dependency.

If you can't hit the hardware to read the mouse buttons then it's not an Amiga. However, that only supports two ports; if you want more than two, you need to use the OS.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 16, 2014, 06:04:47 AM
Quote from: Thorham;769052
On 20s and 30s there's no need for that. Especially the mouse and keyboard are very easy to handle properly with the OS.


It's even easier to check the bits; that's why some developers do it.
No need to reply to Intuition messages and so on. It's maybe more common in games and demos.

Quote
and if some user can't use this, then I wonder how their Amiga is set up.


Some people like to use ModePro or some tool like that :-)
lol

Quote
Only using the OS in general is unacceptable for me, because it only helps NG users.


Not really. Many Amiga 1200/4000 users have OpalVision/CyberVision64/CyberVision3D/BlizzardVision cards, or G-REX/Mediator bus board upgrades with some Radeon graphics card. Sure, they get around it by simply selecting a different video source on the monitor or something like that, or by having two monitors connected. It's maybe more of a convenience for these people, but it might be a problem for some without a scandoubler, if they exist.

:-)
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 16, 2014, 07:07:53 AM
@biggun

I'm sure it will help, but it's just a drop in the sea.

Quote from: matthey;769066
The SAM 440 and 460 have a small Lattice FPGA. The mentality of some of the so-called next generation Amiga guys is to get away from hardware dependency.

It might not be connected in a way that allows it to be used to emulate CIAA/CIAB.

I believe it's used to control the clock speed. The FPGA is also programmed as GPIO, so in theory it could easily be used as a joystick port: just configure the pins as inputs and make a small 9-pin D-sub cable.

But to get it at the right address I think you need to do some MMU magic.

Quote
They also may be trying to keep their AmigaOS closed for proprietary and security reasons.

Most things are no harder to do than on AmigaOS 3.x. AmigaOS 4.x is hackable if you want, but it's not open, as you say.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 08:20:03 AM
Quote from: matthey;769066
With a little bit of learning, it's possible to write code that is fairly optimal on the 68020-68060. Instruction scheduling for a 68060 generally doesn't affect 68020/68030 performance but can double 68060 performance with some code.
Can't do it. 20s and 30s have priority for me. Not to mention that instruction scheduling sucks. My goal is to get something to run well on the lower end machines (25 MHz 68020). When something runs well on such machines, why would I need to optimize for 68060s? For me anything above a 68030 is irrelevant in terms of optimizing, because if a '30 can run it fast enough, then so can a '40 or '60. I also don't have a '40 or '60.

Quote from: matthey;769066
Learning about modern CPU pipelining, superscalar execution, caches,  hazards/bubbles, etc. (not just 68020/68030 timings and specifics) will  improve your code for the 68020/68030 also.
Really? So, you're telling me that on 20/30 there's more than cache+timings+pipeline? Interesting!
Title: Re: newb questions, hit the hardware or not?
Post by: matthey on July 16, 2014, 09:23:23 AM
Quote from: Thorham;769105
Can't do it. 20s and 30s have priority for me. Not to mention that instruction scheduling sucks. My goal is to get something to run well on the lower end machines (25 MHz 68020). When something runs well on such machines, why would I need to optimize for 68060s? For me anything above a 68030 is irrelevant in terms of optimizing, because if a '30 can run it fast enough, then so can a '40 or '60. I also don't have a '40 or '60.


More performance is always useful. Settings with better gfx, more sound effects/music and more options can be turned on on a 68040/68060. Some games are nicer at 30fps than 20fps, even if they are playable and fun at 20fps on a 68020/030. It does take a little more time to instruction-schedule code, but the code becomes re-usable for more and expanded projects.

Quote from: Thorham;769105

Really? So, you're telling me that on 20/30 there's more than cache+timings+pipeline? Interesting!


The 020/030 is friendly, being only lightly pipelined, but performance is affected by alignment and data sizes (32-bit is sometimes faster than 16-bit) at least. Unfortunately, documentation is lacking in general for 68k instructions in regard to hazards/bubbles and instruction scheduling. I know the 020/030 has some instruction overlap, but I don't know if it's enough to affect resource availability from instruction to instruction. Contrary to most 68k Amiga programmers, I have studied the 040 and 060 more (and I know more about the AmigaOS functions than about banging the Amiga hardware, too). Avoiding general slowdowns for the 040/060 rarely hurts and sometimes helps 020/030 performance. This is in contrast to the 68000, where optimizing for 68000-68060 is difficult as the 68000 is a 16-bit processor.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 09:52:54 AM
Quote from: matthey;769110
More performance is always useful. Settings with better gfx, more sound effects/music and more options can be turned on on a 68040/68060. Some games are nicer at 30fps than 20fps, even if they are playable and fun at 20fps on a 68020/030. It does take a little more time to instruction-schedule code, but the code becomes re-usable for more and expanded projects.
Sure, but it depends on what you're writing. I'm writing a Fire Emblem clone with a full tiled display with several layers (16x16 pixel anim tiles, 16x24 map sprites, 16x8 status icons), and it's already much faster than needed on my 50 MHz '30 (you only need around seven FPS to make the animations look like the original), and that's without pipelining. I see no reason to optimize for 40/60 at all in this case, and will instead try to get as close as I can to an A1200 with trapdoor fastmem.

Another example might be a text editor that has to run well in 640x512+overscan in eight colors (nice for double scan modes). Getting this to run as fast as possible on 20/30 obviously has priority over 40/60 where optimized 20/30 code will be fast enough anyway.

There is some software where it might be important, but for those cases it might be a better idea to write separate loops for 20/30 and 60. It's not as if it's much extra work.

Quote from: matthey;769110
The 020/030 is friendly, being only lightly pipelined, but performance is affected by alignment and data sizes (32-bit is sometimes faster than 16-bit) at least. Unfortunately, documentation is lacking in general for 68k instructions in regard to hazards/bubbles and instruction scheduling. I know the 020/030 has some instruction overlap, but I don't know if it's enough to affect resource availability from instruction to instruction. Contrary to most 68k Amiga programmers, I have studied the 040 and 060 more (and I know more about the AmigaOS functions than about banging the Amiga hardware, too). Avoiding general slowdowns for the 040/060 rarely hurts and sometimes helps 020/030 performance. This is in contrast to the 68000, where optimizing for 68000-68060 is difficult as the 68000 is a 16-bit processor.
Should be interesting to check out.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 16, 2014, 11:15:54 AM
Quote from: Thomas Richter;769028
Why do you make this needlessly complicated? First, layers is *not* the right library or abstraction for sprite emulation or blitting. It only maintains cliprects and damage regions, nothing else. But if you want movable objects, there is already perfect support for this. Graphics.library BOBs have existed forever and provide an unlimited number of moving and animated objects. The correct abstraction is there, it is in the Os and it is completely supported. IIRC, the workbench uses them for moving drawers and icons around.

Now that I think about it you're right.  Calling ClipBlit on multiple clipping regions would do the job just as well.

BOBs won't work for what I have planned because split-screen effects require seams in the display DMA.  Sprites may be able to overlap this though.

Quote from: Thomas Richter;769028
The copper is too specialized and, given the number of colors a graphics card supports, not even required there. The copper was a chip from the 80s, required to work around the limited bandwidth chips had back then. Similar effects do not require a copper nowadays, and allow a simple straightforward approach that was not available back then.

Who said anything about JUST using palette changes? I'm talking about seamless horizontal and vertical split-screen effects (unlike those buggy ones that leave a pixel seam, like the OS does), at the same time as palette changes and maybe a few more Copper effects.

The Copper and sprite hacks would be less necessary if Commodore hadn't cheated their customers on the amount of Chip RAM that was addressable on AGA. It should have been expandable beyond 2 megs! For what it's worth, the compatible version of my libraries won't use hacks at all; it will only support hacks on the low-end systems that require them for performance reasons. Encapsulation is the goal.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 11:28:54 AM
Quote from: SamuraiCrow;769115
Sprites may be able to overlap this though.
That won't work well for double scan users, because of the limited number of sprites. I think there may actually be only one sprite available in double scan modes :(
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 16, 2014, 11:33:47 AM
@Thorham

I'm not going to support double-scan resolutions without a hardware scan-doubler. There's also not enough bandwidth on the AGA chip bus in double-scan modes for what I'm planning to do anyway.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 12:13:28 PM
Quote from: SamuraiCrow;769117
for what I'm planning to do
What exactly are you planning?
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 16, 2014, 12:19:42 PM
Quote from: matthey;769066
I agree with your point on using the AmigaOS (where possible given constraints), but I disagree with the "no overhead" claim for using the AmigaOS, even if it "goes directly to the hardware". Function calls through the jump table have overhead, and compiled AmigaOS code is not optimal. For example, your new layers.library is riddled with instructions like:

   lea (0,a6),a4  ; optimize to move.l a6,a4
   move.l #0,-(sp) ; optimize to clr.l -(sp)
   lea (a3),a6 ; optimize to move.l a3,a6

That's exactly what I call a "cycle counter party argument". It is completely pointless because it makes no observable difference. Probably the reverse: the compiler likely made the choice for a reason. Anyhow, the low-level graphics.library is in assembly, if that makes you feel any better. Still, it does not make a difference. Fast code comes from smart algorithms, not micro-optimizations. V45 is smarter in many respects because it avoids thousands of CPU cycles of worthless copy operations in most cases, probably at the expense of a couple of hundred CPU cycles elsewhere.
Quote from: matthey;769066
We are not talking about a cycle or 2. All these missed optimizations add up, and then programmers roll their own code to gain 10%+ speed over the AmigaOS.
Which I actually doubt, and even if so, it would be hardly noticeable, because there is more that adds up to the complexity of moving windows than a couple of trivial move operations. Actually, V45 is faster, not slower, because it is smarter algorithmically.
Quote from: matthey;769066
I want programmers to use the AmigaOS functions (but not be required to). We need to improve compilers and try to make code close to optimal for this to happen. Call me a cycle counter and ignore me if you like.
Pointless argument, see above. It requires algorithmic improvements, or probably additional abstractions, to make it fit the requirements of its time. Arguing about a
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 16, 2014, 12:52:51 PM
Quote from: Thorham;769119
What exactly are you planning?


One of the libraries I was planning to write was a video codec that would stream Copperlists and Chip memory data from a CD-ROM, hard drive, or Flash memory in realtime.  The Blitter and Copper will need as much bandwidth as they can get.  It's a well-known fact that AGA's scan-doubling hardware takes twice as much display DMA bandwidth to create the same resolution as a single-scan display mode.  (I'm debating the format to use for the disk-based portion since the CPU will have to run address relocation on each frame's Copper-list at the minimum.  Maybe a simple JIT will make sense as well, to improve disk transfer speeds.)
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 16, 2014, 12:53:05 PM
You seem to have misread the "algorithm first, implementation later" rule of optimization as "algorithm first, implementation never, also you're stupid and horrible for thinking that a human being could ever be smarter than a piece of software engineered by human beings, or that multiple small numbers can add up into a larger number!" there, Thomas.
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 16, 2014, 01:00:59 PM
@Thomas Richter

If you used GCC to generate 68k code, I'd have to ask which version.  The 68k backends have bit-rotted terribly due to lack of maintenance.  (And as Matthey observed, it misses loads of optimizations and may never have been fully complete in the first place.  Simply using an optimizing assembler like VASM instead of GAS would help too.)

Also, cycle counting works for compiler designers.  But if you want to avoid counting cycles yourself, you ought to choose your compiler more carefully than to use a bit-rotted heap of old code.  The x86 may have nearly-optimal code generation in free compilers, but the 68000 has never had terribly good compilers.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 16, 2014, 01:41:59 PM
Quote from: SamuraiCrow;769124
@Thomas Richter

If you used GCC to generate 68k code, I'd have to ask which version.  

That's a plain simple SAS C 6.51, simply because the Os development chain depends on it (with the exception of intuition, actually, which depended on a really rotten compiler). Anyhow, I stand by my opinion. Pointless argument. If you want to write a video codec, the *blitter* is your least problem. The decoding algorithm will make a huge difference, and even there it makes sense to optimize the algorithm first. Been there, done that. That was actually a JPEG 2000 codec, if you care.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 16, 2014, 02:37:34 PM
Quote from: commodorejohn;769123
or that multiple small numbers can add up into a larger number!" there, Thomas.

Please get your math fixed. If you have n algorithms, each of them spends 1/nth of the time solving a problem, and each of them is sped up by 10%, the overall speedup is still 10%. In fact, if you only speed up one of them (e.g. layers) by 10%, the overall improvement is much smaller, depending on n, and even marginal. For example, if layers accounts for a tenth of the total time, a 10% gain there is a 1% gain overall.

If, however, you have an algorithm whose running time grows as O(N^2) (N being the number of layers being moved, arranged or resized) and it is replaced by an O(N) algorithm (as actually happened), then even for suitably small N the improvement can be enormous. It is really that simple. Do not waste your time optimizing the useless details. Get the big picture correct. Then, if performance is still not right, check where the problem is, find the bottlenecks, and either get rid of them by changing the algorithm, or optimize only there.
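To put numbers on it, here is a toy model of the copy counts. This is a minimal sketch and not the actual layers.library code; a "copy" stands for moving one layer-sized block of bitmap data, which is what actually costs time:

Code: [Select]
/* Toy model of the copy-count argument -- not the real layers code. */
#include <stdio.h>

static long copies;
static void copy_block(void) { copies++; }   /* one layer-sized blit */

/* V40-style: depth-arranging one of n stacked layers shuffles the
 * backing store of every layer in the stack -> O(n) per operation,
 * O(n^2) for cycling the whole stack. */
static void send_to_back_v40(int n)
{
    for (int i = 0; i < n; i++)
        copy_block();
}

/* V45-style: save the front layer, reveal the next one -- two copies,
 * independent of n. */
static void send_to_back_v45(void)
{
    copy_block();    /* front layer   -> backing store */
    copy_block();    /* backing store -> screen        */
}

int main(void)
{
    copies = 0; send_to_back_v40(5);
    printf("V40-style, 5 stacked windows: %ld copies\n", copies);
    copies = 0; send_to_back_v45();
    printf("V45-style, 5 stacked windows: %ld copies\n", copies);
    return 0;
}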
Title: Re: newb questions, hit the hardware or not?
Post by: SamuraiCrow on July 16, 2014, 02:44:28 PM
Quote from: Thomas Richter;769125
That's a plain simple SAS C 6.51, simply because the Os development chain depends on it (with the exception of intuition, actually, which depended on a really rotten compiler).


I've worked with that one also.  It generates pretty good code most of the time.  If you use deep orders of operation in a formula, it stuffs the temporaries onto the stack regardless of how many registers are free for temporary variables.  Also, ChaosLord used SAS/C for his game writing and it occasionally got confused and generated pure nonsense code that wouldn't even execute.  In that event inline assembly is unavoidable.

Quote from: Thomas Richter;769125
Anyhow, I stand by my opinion. Pointless argument. If you want to write a video codec, the *blitter* is your least problem. The decoding algorithm will make a huge difference, and even there it makes sense to optimize the algorithm first. Been there, done that. That was actually a JPEG 2000 codec, if you care.


I would care, if I were making a bitmap-based codec.  I was planning on using mostly filled vectors, though.  I know how to optimize a full-screen vector into the minimum number of line-draws so that the whole screen can be done in a single vector-fill operation.  That full-screen, full bitplane-depth pass is going to be costly though, as are the uncompressed audio samples.  I may have to triple-buffer the display and use the CPU to clear the screen after it's been displayed, just to take some strain off the Blitter -- roughly as sketched below.
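The buffer rotation itself is cheap; something like this sketch, with stub functions standing in for the real chipset work (all names are placeholders of mine):

Code: [Select]
/* Triple-buffering sketch: while one bitmap is shown, the Blitter
 * renders the next frame into a second, and the CPU clears the third
 * (the one shown two frames ago). */
#include <stdio.h>

static void show_bitmap(int i) { printf("copper points at buffer %d\n", i); }
static void cpu_clear(int i)   { printf("CPU clears buffer %d\n", i); }
static void blit_render(int i) { printf("Blitter draws into buffer %d\n", i); }

int main(void)
{
    int shown = 0, drawing = 1, clearing = 2;

    for (int frame = 0; frame < 4; frame++) {
        int old = shown;                 /* rotate the three roles */
        shown = drawing;
        drawing = clearing;
        clearing = old;

        show_bitmap(shown);
        cpu_clear(clearing);             /* off the Blitter's back */
        blit_render(drawing);
    }
    return 0;
}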
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 16, 2014, 04:23:58 PM
Quote from: Thomas Richter;769129
Please get your math fixed. If you have n algorithms, each of them spends 1/nth of the time solving a problem, and each of them is sped up by 10%, the overall speedup is still 10%. In fact, if you only speed up one of them (e.g. layers) by 10%, the overall improvement is much smaller, depending on n, and even marginal.
Yes, obviously. But: on a platform with a bus speed of less than 8MHz, 10% can make the difference between having enough time to get everything done in one frame and suffering a reduction in framerate. It doesn't matter how big or small of a percentage it is; it's the practical impact that matters.

Quote
If, however, you have an algorithm whose running time grows as O(N^2) (N being the number of layers being moved, arranged or resized) and it is replaced by an O(N) algorithm (as actually happened), then even for suitably small N the improvement can be enormous. It is really that simple. Do not waste your time optimizing the useless details. Get the big picture correct. Then, if performance is still not right, check where the problem is, find the bottlenecks, and either get rid of them by changing the algorithm, or optimize only there.
Nobody's arguing that algorithmic optimization shouldn't be the first resort, or that it doesn't have much greater potential for performance increases. Of course that's true; it's so well known to be the case that everybody but you is taking it as a given. But that in no way makes the question of whether good code is being generated "useless details." You can't come up with an infinite series of successive algorithmic optimizations - eventually you're going to hit the optimal algorithm for the particular application and platform, and if that's not enough, no amount of casting about for better, purer Ideas will get you any further; you either have to get down and dirty with the low-level stuff, or give up on making it work. Cycles count, no matter how much you want to pretend that isn't the case - if that weren't true, there would be no difference between my Amiga 1200 and my Core 2 Duo laptop other than the quality of their algorithms.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 16, 2014, 07:32:17 PM
Quote from: commodorejohn;769133
Yes, obviously. But: on a platform with a bus speed of less than 8MHz, 10% can make the difference between having enough time to get everything done in one frame and suffering a reduction in framerate. It doesn't matter how big or small of a percentage it is; it's the practical impact that matters.
Was layers V40 usable on that machine before? Yes. Thus, apparently, even with the deficiencies the algorithm had, the result was ok. The improvement you get now is more than 10%. Did you notice? Most of the time, no. Thus, do you think a 10% improvement in micro-optimizations would make a difference? I can tell you my answer: I don't care.
Quote from: commodorejohn;769133
But that in no way makes the question of whether good code is being generated "useless details." You can't come up with an infinite series of successive algorithmic optimizations - eventually you're going to hit the optimal algorithm for the particular application and platform, and if that's not enough, no amount of casting about for better, purer Ideas will get you any further; you either have to get down and dirty with the low-level stuff, or give up on making it work. Cycles count, no matter how much you want to pretend that isn't the case - if that weren't true, there would be no difference between my Amiga 1200 and my Core 2 Duo laptop other than the quality of their algorithms.

No, cycles do not count, at least not at this microscopic level. Improve the algorithm. If that's not enough, identify the bottlenecks. Then optimize there. Arguing about micro-optimizations like replacing leas with moves is pointless since it won't make a difference. What's the bottleneck in layers? Not the leas. The bottleneck is copying data from A to B, bitmap data. Layers V40 copied N times when re-arranging a single one of N overlapping layers. Layers V45 copies a constant number of times, twice: current data to backing store, backing store to front layer, independent of N. *That* makes a difference, because the number of cycles required to copy graphics data around makes a difference - not the individual instructions that arrange the layer structure. It is a pointless cycle-counter argument to replace individual instructions at the microscopic level because that's not where the bottleneck is. The algorithm is better, the number of copies is lower. So, no, whoever counts cycles does not understand the problem; you're looking at the problem on a much too fine scale to be able to identify it. The right approach is to measure performance, use a profiler or some other tool to identify the bottlenecks, then improve there. I do not need to count cycles to get there.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 16, 2014, 07:40:17 PM
Quote from: Thomas Richter;769148
Thus, do you think a 10% improvement in micro-optimizations would make a difference? I can tell you my answer: I don't care.
Then why are you getting so worked up about it?
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 08:56:39 PM
Use the right algorithm? Really? How obvious :rolleyes: And because we're now using the right algorithm, optimizing its implementation isn't necessary, implying any crappy implementation will do. Yeah, right :rolleyes:

Use the right algorithm, AND write a PROPER implementation of it. Why bother with anything less? And besides, cycle counting is fun. Nothing beats hand-optimizing tight loops on '020s and '030s :p Waste of time? Not to me, it's a hobby :p
Title: Re: newb questions, hit the hardware or not?
Post by: matthey on July 16, 2014, 10:15:17 PM
Quote from: Thomas Richter;769121
That's exactly what I call a "cycle counter party argument". It is completely pointless because it makes no observable difference. Probably the reverse, the compiler had likely made the choice for a reason.

In all 3 of my peephole optimization examples, the CCR is set in the same way. Vbcc's vasm assembler would optimize these 3 examples by default. The savings may be more than you expect, even if a single peephole optimization "makes no observable difference". Let's take a look at the simplest:

   lea (0,a6),a4

It looks short and harmless. The instruction is 4 bytes instead of 2 bytes for the MOVE equivalent so the extra fetch is only a fraction of the cost of executing an instruction on the 68000-68030, not counting any code that falls out of the ICache. The 68040 and 68060 can handle this instruction in 1 cycle with the 68060 only using 1 pipe. Now let's use A4 for addressing in the next instruction like this:

   lea (0,a6),a4
   move.l (8,a4),d0

There is a 2-4 cycle bubble between the instructions on the 68040+ (including pipelined fpga 68k processors). A superscalar processor like the 68060 can have 2 integer pipes sitting idle for several cycles instead of executing half a dozen instructions. The above code looks like a compiler flaw anyway. If a6=a4 then it should use a6 as a base instead of copying it and using the copy.

Quote from: Thomas Richter;769121
Anyhow, the low-level graphics.library is in assembly, if that makes you feel any better. Still, it does not make a difference. Fast code comes from smart algorithms, not micro-optimizations. V45 is smarter in many respects because it avoids thousands of CPU cycles of worthless copy operations in most cases, probably at the expense of a couple of hundred CPU cycles elsewhere.

Smart algorithms are the starting point of efficient executables, and I appreciate your work to that end. Your layers.library probably runs at half the speed the 68060 is capable of because of non-algorithm issues, even though it may be several times faster than it was before with many overlapping windows. You could say the 68060 is fast enough already, so there is no need for efficient code on it. If you were a compiler writer, you would have a 68000 backend for the 68060, an 8086 (or would it be 8080?) backend for x86/x86_64, and a 32-bit ARM backend for Thumb and ARMv8 processors, all with no optimization options and no optimizations. Your job would be complete and any complaints would be met with "make better algorithms".

Quote from: Thomas Richter;769121
Which I actually doubt, and even if it were true it would be hardly noticeable, because there is more that adds up to the complexity of moving windows than a couple of trivial move operations. Actually, V45 is faster, not slower, because it is smarter algorithmically.

I estimated your layers.library code could be ~10% faster on the 68020/68030 if compilers were better and you cared. Yes, that doesn't mean layers operations will be 10% faster because other code probably has the same problems.

Quote from: Thomas Richter;769121
Pointless argument, see above. It requires algorithmic improvements, or probably additional abstractions to make it fit to the requirements of its time. Arguing about a

I guess it's such a pointless argument that you stopped typing mid sentence?
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 16, 2014, 10:26:56 PM
You tell'm, matthey :D
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 07:36:36 AM
Quote from: matthey;769161
In all 3 of my peephole optimization examples, the CCR is set in the same way. Vbcc's vasm assembler would optimize these 3 examples by default. The savings may be more than you expect, even if a single peephole optimization "makes no observable difference". Let's take a look at the simplest:

   lea (0,a6),a4

It looks short and harmless. The instruction is 4 bytes instead of 2 bytes for the MOVE equivalent so the extra fetch is only a fraction of the cost of executing an instruction on the 68000-68030, not counting any code that falls out of the ICache. The 68040 and 68060 can handle this instruction in 1 cycle with the 68060 only using 1 pipe. Now let's use A4 for addressing in the next instruction like this:

   lea (0,a6),a4
   move.l (8,a4),d0

There is a 2-4 cycle bubble between the instructions on the 68040+ (including pipelined fpga 68k processors). A superscalar processor like the 68060 can have 2 integer pipes sitting idle for several cycles instead of executing half a dozen instructions. The above code looks like a compiler flaw anyway. If a6=a4 then it should use a6 as a base instead of copying it and using the copy.

And I tell you again that this is a pointless argument because it makes no observable difference. It is a waste of time to spend such effort on micro-optimizations because you lose sight of the big picture. There is not much time spent in these algorithms in the first place, but a lot of time copying data around that does not require copying. The problem was elsewhere. If any of you cycle-counters had layers (or any other component of the Os, for that matter) under your fingers, you would probably replace leas by moves or vice versa, and probably gain some minor improvement, but the project would still not be done, would be bug ridden by using assembly, and you would have lost the real opportunities for optimization because you would not have been able to work on the necessary level of abstraction. Actually, major parts of gfx show that exactly this happened, because gfx is tied too closely to the hardware, is ill-designed and lacks a proper abstraction. Needless to say, major parts are in assembly. On the other hand, intuition has a proper level of abstraction (at least for its time), and has a very small interface. It's totally written in C. Intuition is fast enough for the 68K. Do you see a pattern here?
Title: Re: newb questions, hit the hardware or not?
Post by: Georg on July 17, 2014, 08:33:38 AM
Quote from: Thomas Richter;769148
Layers V40 copied N times when re-arranging a single one of N overlapping layers. Layers V45 copies a constant number of times, twice: current data to backing store, backing store to front layer, independent of N.


How do you do the re-arranging with only two copies if, for example, 3 or 4 smart refresh layers have their visible area changed after the single layer operation (moving, depth arrangement)? And the hidden area of a single smart refresh layer may consist of several cliprects = more than 1 backing store.
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 17, 2014, 08:34:30 AM
Aren't these two tasks for two people?

1) The application developer should focus on solving his problems effectively.

2) The compiler writer should improve the compiler so that reasonably good code is generated.

To write an application like a text editor, using C and OS calls sounds perfectly ideal to me.

If you want to write a 1KB bootblock demo with a sinus scroller and copper plasma, then using ASM and banging the hardware is probably the ideal way to do it.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 08:55:01 AM
Quote from: Georg;769184
How do you do the re-arranging with only two copies if, for example, 3 or 4 smart refresh layers have their visible area changed after the single layer operation (moving, depth arrangement)? And the hidden area of a single smart refresh layer may consist of several cliprects = more than 1 backing store.

No, of course in that case two copies are not sufficient. If I have three overlapping layers, and I move one of them down such that the areas of the two other layers become visible, then three copies are obviously necessary.

The problem V40 had was in cases where you had a stack of layers sitting on top of each other (say, five stacked windows) and you depth-arranged them, i.e. moved the topmost to the bottom. V45 does simply that: copy the frontmost layer to the backing store, and the backing store of the next one to the screen. Sounds like the logical thing to do. Unfortunately, V40 did *not* operate like this. Instead, it copied the data of *all* layers in the stack around.

V40 (or rather, V32) was designed for a different target: back then, copying always used the blitter, the blitter was fast, the CPU was slow, and backing store was an expensive resource. So V32 used an algorithm that minimizes the amount of backing store allocated at once, at the expense of using too many copy operations in situations where many windows overlap. Nowadays, the situation has turned around: copying is slow, the CPU is fast, the blitter is slow, and enough memory is available. Thus, the algorithm had to change to adapt to the new requirements.

V32 tried to optimize the memory footprint of the backing store at all costs, at the price of using too many copy operations. It also used the double-XOR trick to swap regions (good for the blitter, bad on a graphics card since it requires emulation).
V45 tries to optimize the number of copy operations at the price of potentially using more backing store. It no longer uses double-XOR, and it uses the primitives of the graphics card where available; in particular, it allocates the bitmaps for the backing store from graphics card memory if it can.
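(For the record, the double-XOR trick is presumably the familiar XOR swap; a minimal C sketch of my own, not anything from layers: three XOR passes exchange two buffers without a temporary, which maps nicely onto a blitter that can XOR but needs read-modify-write on a gfx card.)

Code: [Select]
/* XOR swap of two pixel buffers: no temporary storage needed. */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

static void xor_swap(uint8_t *a, uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] ^= b[i];    /* a = a^b          */
        b[i] ^= a[i];    /* b = b^(a^b) = a  */
        a[i] ^= b[i];    /* a = (a^b)^a = b  */
    }
}

int main(void)
{
    uint8_t screen[4]  = {1, 2, 3, 4};   /* visible region */
    uint8_t backing[4] = {9, 8, 7, 6};   /* backing store  */
    xor_swap(screen, backing, sizeof screen);
    printf("%d %d\n", screen[0], backing[0]);   /* prints: 9 1 */
    return 0;
}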
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 08:57:44 AM
Quote from: biggun;769185
Aren't these two tasks for two people?

1) The application developer should focus on solving his problems effectively.

2) The compiler writer should improve the compiler so that reasonably good code is generated.

To write an application like a text editor, using C and OS calls sounds perfectly ideal to me.

That pretty much sums it up. It is the job of the compiler writer to create a reasonably good compiler. If that is *still* not good enough, one can still go and hand-optimize the bottlenecks, which is exactly what happened in the past for my professional (as in "work for money") software, too.
Quote from: biggun;769185
If you want to write a 1KB bootblock demo with a sinus scroller and copper plasma, then using ASM and banging the hardware is probably the ideal way to do it.

Except that, in case you really want such a thing, you'd probably be better off nowadays with a couple of lines of javascript run in a browser. (-; Yes, times have changed.
Title: Re: newb questions, hit the hardware or not?
Post by: vxm on July 17, 2014, 09:24:36 AM
Quote from: biggun;769185
Isn't this two tasks for two people?

1) The application developer should focus on solving his problems effectively.

2) The compiler writer should improve the compiler so that reasonable good code is generated.
This is correct. In the same way that small streams make big rivers, the sum of the various levels of optimization contributes to the overall system operation.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 17, 2014, 10:50:20 AM
Quote from: Thomas Richter;769182
would be bug ridden by using assembly
Blame the language for bugs? What complete and utter nonsense.

Also, just because people optimize implementations doesn't mean they are unable to choose the right algorithms to implement. You talk about this as if the two were mutually exclusive, and that's nonsense.

Another thing: just because YOU think some things are a waste of time doesn't mean everyone else does. Some nerve you have, talking for everyone like that.

And if you're using a compiler that writes lea (0,a0),a1 instead of move.l a0,a1, then you need a better compiler. It's crap code, whether you agree with that or not.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 12:19:00 PM
Quote from: Thorham;769192
Blame the language for bugs? What complete and utter nonsense.
You haven't worked in professional software development, have you? Yes, humans make mistakes, no matter what language. And yes, the language matters: some languages provide facilities to detect mistakes, some more, some less. Higher-level languages are better at that than assembly.
Quote from: Thorham;769192
Also, just because people optimize implementations doesn't mean they are unable to choose the right algorithms to implement. You talk about this as if the two were mutually exclusive, and that's nonsense.
See above. Apparently, you do not yet have enough experience implementing complex and large software projects. Such projects iterate, are in flux, change; several algorithms are tried, benchmarked, debugged. You cannot do that efficiently in assembly.
Quote from: Thorham;769192
Another thing: just because YOU think some things are a waste of time doesn't mean everyone else does. Some nerve you have, talking for everyone like that.
Probably because I've been in the business a bit longer than you have? It's called experience. It comes over time. I've done many things in assembly, many things in other languages: C, C++, Pascal, Java, many more. Each has its drawbacks and advantages. Assembly is something you rarely ever need. It is in most cases a waste of time: too much time to develop, too much time to debug, diminishing returns in terms of performance. Hence: waste of time. If you do not believe me, ask other experienced developers.
Quote from: Thorham;769192

And if you're using a compiler that writes lea (0,a0),a1 instead of move.l a0,a1, then you need a better compiler. It's crap code, whether you agree with that or not.

The code is good enough - it makes no observable difference. You are of course invited to write a better compiler. I personally do not waste my time on micro-optimizations. Optimizations make sense once the bottleneck is identified. Even assembly sometimes makes sense, if you find that >50% of the time is spent in an isolated part of the code and there is no further algorithmic improvement you can make use of. But layers is no such code. The low-level copy functions that move backing store and images around are. Guess what: that's assembly. It's still slow, but there's no chance of making it faster, since the Zorro bus is the bottleneck - the only way to make it faster is to avoid the operations in the first place whenever possible, and that's exactly what V45 does. Whether the leas are coded as moves or not makes no bloody difference.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 17, 2014, 01:05:46 PM
Quote from: Thomas Richter;769182
If any of you cycle-counters had layers (or any other component of the Os, for that matter) under your fingers, you would probably replace leas by moves or vice versa, and probably gain some minor improvement, but the project would still not be done, would be bug ridden by using assembly, and you would have lost the real opportunities for optimization because you would not have been able to work on the necessary level of abstraction.
That's not even remotely how that works. Low-level optimization does not prevent you from doing high-level optimization (and the idea that assembly is more prone to bugs than high-level languages is a myth. Bugs come from sloppy thinking, not from lack of language features.)

Quote from: Thomas Richter;769187
Except that, in case you really want such a thing, you'd probably be better off nowadays with a couple of lines of javascript run in a browser. (-; Yes, times have changed.
No they haven't.

Quote from: Thomas Richter;769195
You haven't worked in professional software development, have you?
I have - and you're talking nonsense.

Quote
Such projects iterate, are in flux, change; several algorithms are tried, benchmarked, debugged. You cannot do that efficiently in assembly.
Bull.

Quote
Probably because I've been in the business a bit longer than you have? It's called experience. It comes over time.
As the canonical New Yorker cartoon caption says, "Christ, what an asshøle."
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 17, 2014, 02:35:42 PM
Quote from: commodorejohn;769198
"you're talking nonsense."

"Bull."

"what an asshøle."



How about we try to stay calm and behave?
Maybe we could even go back to the main topic?


I think the main question was: is going directly to the hardware OK?
And I think we can agree that it is - as long as one is aware that the program then won't run on OS4 machines.

One thing is clear, and everybody will agree with it:
Coding in high-level languages is easier.
Coding complex routines in ASM takes more time.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 02:48:38 PM
Quote from: commodorejohn;769198
Bull.

As the canonical New Yorker cartoon caption says, "Christ, what an asshøle."

So, quite frankly, you haven't. Because that's what real projects look like. It's not "here are the specs, start coding today, be ready next Tuesday". It's more like: here's what we want, we'll come back in a month, see what we got, here's a list of changes, here's what we want too, here's what we don't need anymore, please iterate on what you did two years ago. That's real life. And sorry, assembler does not work well there. It's something that comes, if at all, at the very end of the project. Been there, done that.

Well, anyhow. If you say you've done professional software, may I ask for whom? Who paid, and what was the project?  

Maybe I'm an ass, but maybe an ass with a few more years behind me. Get older, learn more. Once you've done some larger real-world software, we'll talk again in a few years. Either you'll no longer be in the business, or you'll have learned how it works.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 17, 2014, 03:20:34 PM
Quote from: commodorejohn;769198
Low-level optimization does not prevent you from doing high-level optimization

It does not completely prevent you from doing it, but writing complex code in assembler is a lot harder than writing it in C#/Java because you have to do everything yourself.
 
I've replaced assembler with C in projects and ended up with the C code being quicker; it was rewritten in C to make it portable, but in the process I noticed simple ways to optimise the algorithm. The effort required to optimise the assembler code wasn't worth it, and neither was writing another version of it for a different assembler.
 
Quote from: commodorejohn;769198
(and the idea that assembly is more prone to bugs than high-level languages is a myth. Bugs come from sloppy thinking, not from lack of language features.)

It's not a myth. It's much easier to spot bugs in a high level language than it is in assembler. If you have infinite time to study a small assembler program then yes, you can get it to zero bugs, and sometimes that is important. Time pressure usually means that if you can't spot a bug by speed-skimming your code as it scrolls by, then it's likely to ship. Automated testing and code analysis also help reduce bugs, but again these are easier to do with high level languages than with assembler.
 
I guess you have never had to spend a couple of months writing 400 lines of code a day, to have it tested for a couple of days before it's deployed to thousands of users who work outside of office hours. You need that to work. The language doesn't give you that magically, but spending the time to evaluate whether you could do it faster in assembler is likely to break your budget.
 
Having the same source compiled for PPC and 68k is more important than shaving off a few microseconds. Even if you could speed layer operations up by a further 10%, it would only make a noticeable difference if software were constantly performing layer operations. If it is doing anything else, the return diminishes.
 
If your argument is that every single piece of software ever written should be micro-optimised, then you're likely to be dead before any of the software is finished. It would be cheaper to phone up Motorola and pay them to design a faster 68k just for you.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 17, 2014, 03:34:14 PM
Quote from: Thomas Richter;769207
So, quite frankly, you haven't. Because that's what real projects look like. It's not "here are the specs, start coding today, be ready next Tuesday". It's more like: here's what we want, we'll come back in a month, see what we got, here's a list of changes, here's what we want too, here's what we don't need anymore, please iterate on what you did two years ago. That's real life. And sorry, assembler does not work well there. It's something that comes, if at all, at the very end of the project. Been there, done that.
Or, in other words, "anything that doesn't fit within my sphere of experience doesn't count."

Quote
Well, anyhow. If you say you've done professional software, may I ask for whom? Who paid, and what was the project?
I do backend-processing development for a call center whose business model is based around delivering captured data in arbitrary, client-specified formats. No, I don't do it in assembler, but it's not a shop where management gets anal about methodology; the goal is to deliver what the clients want in a way that's not too much of a drain on our systems, not to check the boxes that will make us the Purest and Holiest programmers who are removed from such earthy and abhorr'd concerns as how the code actually runs on real machines.

But since that doesn't fit with the point you want to make, I fully expect to be told "that's not professional," even though it's my job and I get paid for it.

Quote from: psxphill;769209
If your argument is that every single piece of software ever written should be micro-optimised, then you're likely to be dead before any of the software is finished.
That was never what I was arguing. I was simply saying that the fact that algorithm optimization should come first doesn't make low-level optimization irrelevant.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 04:03:54 PM
Quote from: commodorejohn;769212
I do backend-processing development for a call center whose business model is based around delivering captured data in arbitrary, client-specified formats. No, I don't do it in assembler.
Now, there you go. Why don't you approach your clients the way you approached me: tell them that you could get their software a couple of ms faster by writing it in assembler, and probably take 10 times as long to debug it and get it running? Why not? Oh, probably because that's impractical and you wouldn't get paid? Then, once again, why do you claim otherwise here?

Assembler is *not* a practical language to develop projects in. If you believe otherwise, I'm happy to offer you a project. I know how to write 68K assembler. I'm happy to write a 68K assembler version of layers just for you, provided you pay. Estimated development time: approximately two months. Given my typical development rates, that's a couple of thousand Euros. Is that acceptable to you? Otherwise, I'd say you'd better stick with the free version you got, and stop complaining about a couple of irrelevant micro-optimizations the compiler missed.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 17, 2014, 04:17:14 PM
Quote from: Thomas Richter;769214
Now, there you go. Why don't you approach your clients the way you approached me: tell them that you could get their software a couple of ms faster by writing it in assembler, and probably take 10 times as long to debug it and get it running? Why not? Oh, probably because that's impractical and you wouldn't get paid? Then, once again, why do you claim otherwise here?
Gee, I dunno, maybe because that was never what I told you? The time saved doing a process that runs once per day on a multi-gigahertz system in assembler is not worth the bother, I agree. (But I wouldn't not get paid, because our clients don't know and don't care what I write it in, as long as they get results.)

But claiming that it's across-the-board irrelevant everywhere, and especially that it's irrelevant in systems where the maximum clock speed, ever, is 100 MHz, and the average is much more likely to be in the 7-25MHz range, is absurd.

Quote
Assembler is *not* a practical language to develop projects in. If you believe otherwise, I'm happy to offer you a project. I know how to write 68K assembler. I'm happy to write a 68K assembler version of layers just for you, provided you pay. Estimated development time: approximately two months. Given my typical development rates, that's a couple of thousand Euros. Is that acceptable to you?
No, because I don't care about layers at all, I just take exception to your claims that using assembler is never relevant.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 17, 2014, 04:55:08 PM
I'll stick with assembly language on my A1200, because it's one of my computer related hobbies, and I'm certainly not going to have some bloated ego tell me it's a waste of time.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 17, 2014, 05:08:22 PM
Quote from: commodorejohn;769216
I just take exception to your claims that using assembler is never relevant.

I've only seen him taking exception to the criticism that he didn't write it in assembler, with people justifying themselves by saying it's a myth that writing in assembler takes longer and is more error prone.
 
Quote from: commodorejohn;769212
That was never what I was arguing. I was simply saying that the fact that algorithm optimization should come first doesn't make low-level optimization irrelevant.

His argument appears to be: before you do any optimization, you should determine how much time the code actually runs for. In this circumstance he believes it doesn't run often enough for writing it in assembler to have any noticeable effect. The same goes for any change that increases the ongoing maintenance cost. Sometimes optimising an algorithm in C has no visible benefits because it's not called often enough and it's more cost effective to throw away the optimised version.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 17, 2014, 05:20:39 PM
Quote from: psxphill;769221
with people justifying themselves by saying it's a myth that writing in assembler takes longer and is more error prone.
Who the hell ever said that? Of course it takes longer and is easier to mess up (doesn't mean you end up with bug riddled code, like someone claimed).
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 17, 2014, 05:38:25 PM
Quote from: commodorejohn;769216
But claiming that it's across-the-board irrelevant everywhere, and especially that it's irrelevant in systems where the maximum clock speed, ever, is 100 MHz, and the average is much more likely to be in the 7-25MHz range, is absurd.

Once again: Assembler is pretty much irrelevant *unless* you can detect a bottleneck somewhere in the code that cannot be resolved by a better algorithm. I believe I stated this before, and I stand by my opinion. Layers does not have such bottlenecks. The only bottleneck is the memory copy for moving data around, and that *is* already in assembly (though not part of layers).

Thus, if you see any delay in depth arrangements or window movements, these delays cannot be resolved by changing the code to assembly or by irrelevant modifications from lea to move. It won't make any freaking difference. It would make a difference to get rid of the Zorro bottleneck, but that's beyond what you can reach with software.

The bottleneck is known, and approached as well as possible: by *avoiding* such slow operations whenever there is a chance to. That's an algorithmic improvement, and no, it does not take one bit of assembly to make it happen. It would actually have been much harder to improve the code in the same way if it had been in assembly.

Whether 3GHz or 25MHz makes no difference to the argument: if 90% of execution time is spent in 5% of the code, optimizing the remaining 95% from C to assembly will not make a freaking difference, regardless of whether the system runs at 25MHz, 3GHz or 1.7MHz. The problem remains in that 5% of the code, and if that 5% is already optimal for the case at hand, there is no chance to make anything any better.
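You can put numbers on that with Amdahl's law (standard formula, my round figures):

Code: [Select]
/* Amdahl's law check of the 90%/5% figures above: speed up the code
 * that accounts for only 10% of the runtime by a factor s, and the
 * total speedup is 1/(0.90 + 0.10/s) -- it can never exceed ~1.11x,
 * no matter what the clock rate is. */
#include <stdio.h>

int main(void)
{
    const double hot = 0.90;   /* share of time in the 5% of code */
    for (double s = 1.0; s <= 16.0; s *= 2.0) {
        double total = 1.0 / (hot + (1.0 - hot) / s);
        printf("rest of the code %2.0fx faster -> total %.2fx\n", s, total);
    }
    return 0;
}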
Title: Re: newb questions, hit the hardware or not?
Post by: Sean Cunningham on July 17, 2014, 05:43:43 PM
Quote from: psxphill;769221
...Sometimes optimising an algorithm in C has no visible benefits because it's not called often enough and it's more cost effective to throw away the optimised version.

Your name made me recall the first generation of next-gen consoles and how they relate directly to this discussion.  The developers of the original PSX titles were coding games in C and couldn't achieve anything but porkish performance; absolutely nothing in the first couple of generations could achieve anything approaching arcade-quality responsiveness and framerates.

Over at SEGA, in particular their AM divisions, they were coding for the Saturn "close to the metal" and could actually achieve arcade feel with arguably less power (if you weren't fully taking advantage of its multi-processor design, and most 3rd parties weren't really up for this kind of complexity, just like they weren't for a long time with the PS3 and virtually any other multi-processor design).  

The Saturn didn't have the true 3D acceleration that the PSX had, but the Sony developers made it swim through a pool of peanut butter by coding the games the way they did.  At the resolutions those titles were working at, the Sony games should have been 30-60fps out of the gate with no lag, and that was definitely not the standard.

It cracks me up that after all this time there's any room for discussion over how to get the best performance on an Amiga (or any system that doesn't have excess horsepower to subsidize generic coding methodologies, which relates to my dislike for X-Windows).  The highest-performance apps, games and demos all speak to the truth that you don't get there through the OS.  Where are the high-performance applications that were coded through the OS, particularly with respect to graphics effects, realtime or other forms of manipulation?
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 17, 2014, 05:55:19 PM
Quote from: Thomas Richter;769224
Once again: Assembler is pretty much irrelevant *unless* you can detect a bottleneck somewhere in the code that cannot be resolved by a better algorithm. I believe I stated this before, and I stand by my opinion.
And again, absolutely nobody is saying algorithm optimization shouldn't come first.

Quote
Whether 3GHz or 25MHz makes no difference to the argument: if 90% of execution time is spent in 5% of the code, optimizing the remaining 95% from C to assembly will not make a freaking difference, regardless of whether the system runs at 25MHz, 3GHz or 1.7MHz.
That's not how math works. 5% at 3GHz is 150,000,000 cycles per second of execution; saving a few dozen here and there is indeed pretty much irrelevant. 5% at 25MHz is 1,250,000 cycles per second - also pretty hefty, but depending on how many iterations of whatever code you're optimizing are running in that time, cycle-shaving can still make a bit of a difference. But at 7MHz? That's but 350,000 cycles - on a system where even basic instructions can take 8-20ish cycles to complete, that can get frittered away a lot faster than you'd think.
Title: Re: newb questions, hit the hardware or not?
Post by: Leffmann on July 17, 2014, 06:31:53 PM
Quote from: Thorham;769218
I'll stick with assembly language on my A1200, because it's one of my computer related hobbies, and I'm certainly not going to have some bloated ego tell me it's a waste of time.


That's a bit over the top :) he's perfectly entitled to be this assertive, and he is right in what he says - there are no gains to be had from withering away doing micro-optimizations on parts that have little or no bearing on the performance of the program.

Quote from: commodorejohn;769226
That's not how math works. 5% at 3GHz is 150,000,000 cycles per second of execution; saving a few dozen here and there is indeed pretty much irrelevant. 5% at 25MHz is 1,250,000 cycles per second - also pretty hefty, but depending on how many iterations of whatever code you're optimizing are running in that time, cycle-shaving can still make a bit of a difference. But at 7MHz? That's but 350,000 cycles - on a system where even basic instructions can take 8-20ish cycles to complete, that can get frittered away a lot faster than you'd think.


He's not trying to teach you maths, he's just stating the simple fact that if you for some inexplicable reason choose to optimize a part of your code that accounts for 5% of your execution time, then you can never get more than a 5% performance gain.
Title: Re: newb questions, hit the hardware or not?
Post by: matthey on July 17, 2014, 06:43:29 PM
Quote from: Thorham;769222
Who the hell ever said that? Of course it takes longer and is easier to mess up (doesn't mean you end up with bug riddled code, like someone claimed).

It doesn't mean algorithms can't be improved in assembler either (even if ThoR considers them worthless micro-optimizations also). There are high-level algorithm improvements that are generally easier in high-level languages, and low-level algorithm improvements that high-level languages may make difficult to see or implement because of abstraction. Assembler is complete freedom; it's just that most people don't know what to do with it. It requires logical and organized thinking to create good code. It's like a puzzle with beauty in the simplest and most logical code. A time schedule would take away from the creative freedom, though. Some people have to code for a living, and that's why we should try to improve the assembler output of compilers for those imprisoned souls ;).

Algorithms can require assumptions that change, too. New fpga Amiga chipsets may have a blitter much faster than the CPU, with faster memory again. Perhaps SmartRefresh should be selected for the chipset and SimpleRefresh for a gfx board. Maybe layers should query the gfx system for the best refresh method to use for a gfx mode. There is all that algorithm work fishing for the big fish, and it can all become outdated if the "best" algorithm becomes obsolete. At least thousands of micro-optimizations can be applied with quick compiler switches over and over again, provided they are used. Personally, I like tuna and sardines. I prefer to be a little more open-minded than "640kB of memory with perfect algorithms is enough for everyone". Isn't it really about the time saved compared to the amount of work? I would think that optimizing compilers, so that a compiler switch can be used, would be efficient in processing time saved vs programming time spent, even if the savings were a few percent. Of course, I'm not a professional programmer, so my opinion doesn't seem to count, according to some people.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 17, 2014, 07:27:44 PM
Quote from: Leffmann;769228
That's a bit over the top :) he's perfectly entitled to be this assertive, and he is right in what he says
No, he's not, because he's saying assembly language is a waste of time. To me, my hobby is NOT a waste of time, thank you very much. It would be a different story if he said it was a waste of time for himself, but he acts as if it's a waste of time for everyone.

Quote from: Leffmann;769228
- there are no gains to be gotten from withering away doing micro-optimizations on parts that have little or no bearing on the performance of the program.
Obviously. It's just that when you write everything in assembler from the start (hobby!), you wouldn't write compiler style crap in the first place.

Quote from: matthey;769229
It's like a puzzle with beauty in the simplest and most logical code.
Indeed :)

Quote from: matthey;769229
Some people have to code for a living
Fortunately I don't :)
Title: Re: newb questions, hit the hardware or not?
Post by: wawrzon on July 17, 2014, 08:24:23 PM
This thread is becoming unnecessarily personal. Apparently everybody agrees that high-level languages are best for maintaining huge modular projects, while asm is best for in-place optimizations. Nobody needs to convince others of their personal interests or choices, and it especially makes no sense to insult or attack anyone over them. Use your skill where you want, or where it's used best. People are different for a reason; they just have to realize they can cooperate, complementing each other, instead of quarreling.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 17, 2014, 08:45:13 PM
Quote from: Thorham;769222
Of course it takes longer and is easier to mess up (doesn't mean you end up with bug riddled code, like someone claimed).

How does "easier to mess up" not mean "you end up with bug riddled code"?
 
Quote from: Sean Cunningham;769225
The developers on the original PSX titles were coding games in C and couldn't achieve anything but porkish performance and absolutely nothing for the first couple generations could achieve anything approaching arcade quality responsiveness and framerates.

I believe every single PSX game ever was mostly written in C, although there are a couple of early games that might have had parts written in assembler, because of the bugs in them. Tekken 2 has a bug that mostly goes unnoticed due to luck, which requires emulators be more accurate than Sony had envisaged. Otherwise it looks like http://smf.mameworld.info/img/tekk0009.png
 
Tekken and Tekken 2 were written for the arcade System 11 hardware first and then ported. System 11 appears to have been originally based on one of the PSX prototypes as the GPU is very different. At the end of System 11 life they started shipping with the PSX GPU, so a lot of games detect which GPU is fitted and adapt to it. Some versions of arcade Tekken 2 will run on the newer hardware, but the early versions won't. I suspect they used different tool chains or libraries for those games, which might be where the Tekken 2 bug comes from.
 
I think one of the tomb raider games has some odd register usage which someone speculated made it look like it was partly written in assembler too. The Saturn version was much worse than the PSX version. https://www.youtube.com/watch?annotation_id=annotation_2637363251&feature=iv&src_vid=q6oh_y9Tdao&v=z3GalI7AVj8 https://www.youtube.com/watch?v=NgQP7JOqgsk I believe the Saturn was the lead platform & it shipped three months earlier than the PSX version. http://www.game-rave.com/psx_galleries/battle_tombraider/index.htm
 
The reason the software became better was mostly the performance analyser telling you what was actually making your software slow. Up until then, people were blindly making low-level optimisations and crossing their fingers that they would work; instead, it would tell you that the slowdown was actually caused by cache misses, which means you need to rewrite/restructure your engine to fit the cache better. Other reasons could be that the GPU was being starved because you were overloading the GTE, or that the GPU was saturated because you were trying to draw too much or use too many textures. You needed to be able to rapidly change your engine all the way through development, and that puts assembler out of the question.
 
I believe Gran Turismo was the first game to be developed using the Performance Analyser. Namco did a faster version of Ridge Racer which was bundled with Ridge Racer Type 4; it ran at 60fps instead of the original 30fps. I don't know whether they just used their experience or whether it benefited from the performance analyser.
 
When Namco wrote Ridge Racer the PSX didn't exist in its finished form & the hardware was actually quite different. Once SN Systems talked them into using PCs for development and putting the console hardware onto ISA cards, the target boxes were returned to Sony. Not many of the DTL-H500 target boxes exist, so it's hard to tell how different it was. I don't think Namco went back and optimised the game for the final hardware. The CD drive didn't exist when they wrote the game either, which is one of the reasons it is a single load and only uses the drive for red book audio at run time. They only got hold of a prototype drive after the game was finished.
 
I don't believe that Sega ever had any tools like the ones Sony had, so the PSX games just kept getting better. While the Saturn had some good games, most were generally poor. It did well for 2d games because I think the fill rate for 2d might have been higher than the PSX's. Also, Sony banned 2d games in some regions for a while because they wanted to focus on 3d games, which might be why the 2d shooters ended up on the Saturn.
 
Quote from: Sean Cunningham;769225
The Saturn didn't have the true 3D acceleration that the PSX had

The Saturn had the exact same "3d" capabilities as the PSX. Both had hardware to do the 3d to 2d transforms, as both GPUs could only render 2d; the main difference was that the PSX scanned triangles and looked up the textures, while the Saturn scanned the textures and plotted quads. The Saturn could draw the same screen coordinate more than once or not at all, which made the graphics look a bit wonky and made it hard to do transparency and gouraud shading. The PSX GPU could accept quads, but it split them into two or more triangles for rendering (it also has to split triangles sometimes, as the renderer has some specific requirements to reduce texture-coordinate rounding errors), and this itself causes other rendering issues (though they can be worked around more easily than the Saturn issues).
 
The Saturn had a 2d display chip as well & a 2nd CPU, which for games that were released on both formats was probably underutilised. You couldn't justify taking a game that ran and spending another year to make it run 20% quicker when the market was so much smaller.
 
The only major low-level optimisation that Sony introduced was inlining the GTE opcodes (the geometry transform engine that does the 3d to 2d transformations); originally you called them through a function, as Sony tried to hide and abstract everything about the hardware so that future consoles could be backward compatible. They backed off in this circumstance because they measured the effect. Sony really tried hard to make developers write software that was portable to different hardware. There were three main revisions of the retail PSX, which all ran at different speeds. Games with race conditions are a problem if you only test on one speed of console, but it mostly worked out. It wasn't until the PS2, where the PSX GPU is software emulated, that they had to patch games to make them run properly. They had to do something similar for PS3 backward compatibility; they advertised that job on their web site. There are no 100% accurate PSX emulators out there, because nobody even knows what that means (including, it seems, Sony, as they can't even emulate the GTE 100% accurately).
 
IMO the PSX is like the Amiga, while the Saturn is like the ST. In the next generation the PS2 was like the Saturn, the Dreamcast and Xbox were like a PC, and the GameCube was the nicest hardware. The 360 and PS3 were pretty similar due to Microsoft buying the PS3 CPU from IBM (read the book http://www.amazon.co.uk/Race-New-Game-Machine-The/dp/0806531010). Sony kept their tradition of making more and more complex hardware that required low-level optimisation to work properly, which is what finished Ken Kutaragi's career. They've both gone back to PC hardware now, with RAM type being the main difference, which introduces interesting issues for cross-platform games.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 17, 2014, 08:50:06 PM
Quote from: psxphill;769235
How does "easier to mess up" not mean "you end up with bug riddled code"?
Just because it's easier to make mistakes doesn't mean you can't properly debug your code. Writing good software in assembly language just takes longer. Also, the bug riddled thing makes it sound like you can't write good software in assembly language, which is obviously nonsense.
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 17, 2014, 09:01:40 PM
Quote from: Thorham;769236
Just because it's easier to make mistakes doesn't mean you can't properly debug your code. Writing good software in assembly language just takes longer. Also, the bug riddled thing makes it sound like you can't write good software in assembly language, which is obviously nonsense.


Yea - I know what you mean,
God blessed me with the gift that I can write the most complex algorithms and they are always bugfree.
I never need to debug. Whether I write in C or ASM or right away in hexcode. My code is always bug free.
;-)

So if you are like me then coding everything right away in ASM is fine.
But I was told that some people find coding in C easier.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 17, 2014, 11:34:36 PM
Quote from: Thorham;769236
Just because it's easier to make mistakes doesn't mean you can't properly debug your code.

In a lot of cases, if the bug doesn't jump out of the screen at you and it only fails in specific cases, then it will get shipped. That is why so much money has been spent on better compilers and code analysis tools.
 
Quote from: Thorham;769236
Also, the bug riddled thing makes it sound like you can't write good software in assembly language, which is obviously nonsense.

It's not impossible, but neither is winning the jackpot on the lottery. It's just very unlikely. And if you're on a death march http://en.wikipedia.org/wiki/Death_march_(project_management) then you will release software as soon as you can because you're sick of it. Writing it in a high-level language will definitely increase the chance of releasing it without major bugs.
 
It's also more likely to be good if the source code is from an existing project that has already had many hours of testing, like layers V45.
 
Quote from: biggun;769237
Yea - I know what you mean,
God blessed me with the gift that I can write the most complex algorithms and they are always bugfree.
I never need to debug. Whether I write in C or ASM or right away in hexcode. My code is always bug free.
;-)
 
So if you are like me then coding everything right away in ASM is fine.
But I was told that some people find coding in C easier.

You either have a different definition of complex than I do, or that is sarcasm, or both.
Title: Re: newb questions, hit the hardware or not?
Post by: Sean Cunningham on July 18, 2014, 01:18:23 AM
Sorry, but the Saturn games were not "poor".  The AM divisions' games were outstanding and offered arcade feel, something virtually no PSX game ever did across its entire lifetime.  The closest to that feel I ever got was from Psygnosis' Wipe-out, but it still didn't have the refresh.  The Tekken series was okay but still didn't feel "arcade", and felt laggy compared to VF2, though Tekken was light-years better than Toshinden, ugh.

None of the PSX 3D fighters had the same responsiveness or arcade feel as the VF series, Fighting Vipers, Last Bronx, etc. and the 2D fighters that were available for both platforms played better on the Saturn.  One of the few 3D fighters available for both, Dead or Alive, was better on the Saturn (I had it for both, to counter your Tomb Raider example).  The PSX had some clever games but it was a major disappointment and rarely got pulled out at my house, and I had both systems the first day they went on sale.  

I played Ridge Racer some and Midnight Club and Rage Racer but they didn't have the arcade feel of Sega Rally Championship.  The top PSX track-n-field game was disappointing after playing Sega's high-resolution 60fps game.  

Sorry, nope.  3rd party offerings on the Saturn were generally not too good, I'll give you that, unless they were 2D.  None of them invested in coding the way the AM divisions at SEGA did.  But playing PSX games it was like Sony had never been to an arcade before.  NAMCO was likely the most successful but they still seemed like they couldn't quite get there.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 18, 2014, 03:12:58 AM
Quote from: biggun;769237
Yea - I know what you mean,
God blessed me with the gift that I can write the most complex algorithms and they are always bug free.
I never need to debug. Whether I write in C or ASM or right away in hex code. My code is always bug free.
;-)

Few people remember the hex values. Do you have an exceptional memory, or did you spend a lot of time looking at hex code?

Quote
But I was told that some people find coding in C easier.

It's easier to debug something if you have debug symbols; otherwise you need to remember the assembler code, and some people have problems with that if they haven't looked at the code in a while. That's why debug symbols and stack traces are a big help, even more so if you're trying to fix something someone else wrote.

Quote
So if you are like me then coding everything right away in ASM is fine.

I think that comes with experience. If he knows what he wants to do, and knows how to do it, then that's fine.

But sometimes you don't know what the best approach is and you need to try different methods to find the best one. Unless you already know the best machine code to use, it may not be the best idea to spend too much time optimizing a bad idea that you're going to throw away later for a different approach.

I have seen a few examples of people spending a lot of time writing assembler code and ending up doing everything on the CPU, instead of using existing routines or OS functions that take advantage of DMA and hardware acceleration.

It's also silly not to use an existing routine that someone has spent years perfecting, only to write your own routine that ends up being slower. So it's a good idea to do some benchmarking.

And if you did write a better routine, why not replace/optimize the old routine instead of bloating the code with duplication?

And when it comes to bug-free code, I have seen my share of programs that were so-called bug free but wrote outside of their allocated memory blocks.

It's a good idea to run Enforcer/Mungwall: a program can run without crashing and yet corrupt memory belonging to other applications or the OS. If the blocks close to the overwritten block were reserved by another program, this might happen while you're writing the program without you ever noticing it.

For example, allocating 256 bytes of memory and then counting to 256 instead of 255; that's a mistake that is so easy to make.
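A minimal sketch of that exact bug (hypothetical buffer-clearing code; the commented-out loop condition is the overrun Mungwall would flag):
Code: [Select]
#include <exec/memory.h>
#include <proto/exec.h>

void clear_buffer(void)
{
    int i;
    UBYTE *buf = AllocMem(256, MEMF_ANY);

    if (!buf)
        return;
    for (i = 0; i < 256; i++)   /* correct: touches bytes 0..255 */
        buf[i] = 0;
    /* for (i = 0; i <= 256; i++) would write one byte past the
       allocation; exactly the overrun Mungwall catches */
    FreeMem(buf, 256);
}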
Title: Re: newb questions, hit the hardware or not?
Post by: itix on July 18, 2014, 04:21:39 AM
Quote from: Thorham;769236
Just because it's easier to make mistakes doesn't mean you can't properly debug your code. Writing good software in assembly language just takes longer. Also, the bug riddled thing makes it sound like you can't write good software in assembly language, which is obviously nonsense.


There is a limit to how large a code base you can manage yourself. With assembly language you hit this limit sooner than with C, and in C you probably hit it sooner than with C++. And so on. Very likely a project written in assembly language never grows to "full scale", as you get tired of maintaining a huge code base.

Asm makes it easier to create bugs because you must remember more small details. That was not a problem for me when I was writing 68k asm, but on the Amiga, for example, you must remember to assign parameters to the right registers. Using the stack for temporary variables in asm is also more difficult than in C: you must count how many bytes you need from the stack and then calculate the correct offset to access each variable. With a good C compiler, register usage is possibly more efficient, because at least in theory it can compute optimal register usage within functions. In practice it isn't so, it seems.

But coding in 68k asm can be fun. I did that several years in the past.
Title: Re: newb questions, hit the hardware or not?
Post by: biggun on July 18, 2014, 07:11:46 AM
Quote from: psxphill;769258
You either have a different definition of complex than I do, or that is sarcasm, or both.



Haha lol.
I thought a little bit of fun does not hurt, right :-D
Better than people getting at each other's throats here.


But it's true that I sometimes write a handful of instructions directly in hex code.
As you know, I developed the instruction decoders for three full 68K CPUs (68050/Apollo/Phoenix),
so I have quite good practice in knowing how every 68K instruction is encoded...
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 18, 2014, 07:29:27 AM
I recommend:

http://xkcd.com/378/

http://www.pbm.com/~lindahl/real.programmers.html

Now, be serious. Anyone recommending assembler for a full-scale project probably either has never done it, or has a different definition of "full scale" than I do. "layers" is a small project. (Yes, really).

ViNCEd is fully assembler, but still only medium-sized. Just to give you an idea, there are four years of work in this code. The same in C would have taken a quarter of the time. Besides debugging, which was one problem, the major problem was extending and enhancing the project. In a higher-level language, you rewrite a couple of functions or classes, change the interfaces here and there, and the compiler warns you about the places where it no longer fits, most of the time at least. In assembler, I had to go through the complete code base over and over again. The version you have is version 3, which is more or less "rewritten for the third time", because that's more or less the only viable option you have in assembler. Well, not exactly rewritten, but for each new revision I went through the complete code, every single function, line by line. Each version took about three months to complete, and more than a year to debug completely to my satisfaction.

As a hobbyist, you may find time to do that for a mid-size project. As a professional who needs to ship software at some point, and who needs to attack somewhat larger-scale projects than this, such a development pace is unacceptable. The ratio of work to generated code is quite bad.

I also learned during the development that my assembler code looks more and more like compiler output. To keep the project manageable, you'd better pick a strict register allocation policy (so you know what is a scratch register and what is not) and depend on single entry, single exit, and a single return value. That keeps the code manageable. Code that does not becomes unmanageable quite early - ARexx is such an example: completely in assembler, multiple return values, no clear register allocation policy - not maintainable anymore, a big pile of mess.

Then I learned to use the compiler for larger projects, and to rely on assembler only where needed. VideoEasel is such a project. That's a bit larger-scale (still not big, but larger). C everywhere, except where assembly is really required. It also took three attempts to get the project done. First some prototypes; then I started in assembler, but learned that it would be unmanageable. Version 2 was never completed. Then I started again, version 3, in C. That version was completed.

Thus, I really learned the hard way *not* to use assembler. Anyone who claims that assembler is the way to go has not yet tried to write a full-fledged, full-scale application with it. Been there, done that. Live and learn.

Greetings, Thomas
Title: Re: newb questions, hit the hardware or not?
Post by: Bif on July 18, 2014, 09:19:22 AM
Quote from: Sean Cunningham;769225
Your name made me recall the first generation of next-gen consoles and how they relate directly to this discussion.  The developers on the original PSX titles were coding games in C and couldn't achieve anything but porkish performance and absolutely nothing for the first couple generations could achieve anything approaching arcade quality responsiveness and framerates.

The PlayStation series are an interesting study on this topic. I think each machine faced different problems and required something quite different to achieve high performance.

For PSX, indeed I'm really not aware of much ASM being used on projects. I don't think I ever used any in my area. I think this was for two reasons: 1) all the graphics and sound heavy lifting needed was done via dedicated hardware; the main CPU was really too slow to waste precious cycles doing heavy lifting, and 2) there wasn't anything terribly special about the CPU that would let ASM produce vastly better performance than C (no vector unit, etc.). I think the interesting irony here is that the PSX (of all PlayStations) hardware setup (mediocre CPU with custom hardware to do the heavy lifting) would be closest to the Amiga. In this case it's the old slow platform that required no ASM to squeeze performance out of it, the opposite of what you might expect.

Now for PSX, I believe one of the things that really dragged down early game performance was the piss-poor Sony APIs we were forced to use. Not only did they not always make a lot of sense, their performance was atrocious in some cases (for no great reason, just brain-dead code / API design), with no legal way around it. In my game area, using their stuff robbed the R3000 of 10% of its total cycles across the whole game. I'm sure this was pretty much true for almost any game that ever shipped. I got frustrated, bypassed the problem area, and got 10% of the game's cycles back. I did fear the trouble I might get in - they eventually found out what I was doing through performance analysis of our games, but just gave me the nudge nudge wink wink, as having games perform that much better is not going to look bad for their brand. I'm sure other gameplay areas ran into similar issues and worked out improvements over time.

For PS2 I spent a crapload of time writing ASM. Gobs of heavy-lifting code written for both the main vector unit and the R3000. Luckily it supported inline ASM, so you only had to code the critical part of each function in ASM - it's really not bloody exciting, fun or useful coding function call/entry code in ASM. At that point ASM was the only way to use the vector unit to full advantage, and it can provide a huge boost in performance. In the PS2 the R3000 also sat pretty much unused, so I used the crap out of it for my stuff, and I think coding in ASM did help in many cases. When a loop to get something done is just several cycles, it can really help to knock one or two cycles off. The R3000 was interesting in its instruction pairings, and I think the compiler wasn't daring enough to get as aggressive as it could. I think I also got a lot of performance out of trial and error with straight C code, though. With these older compilers it could be a lot of trial and error in how a loop is constructed: pointer increment vs. index increment, the magic number of times to unroll a loop, etc.
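To illustrate (a contrived sketch, not code from any shipped game), these two loops do the same work; which one was faster on those compilers came down to measuring:
Code: [Select]
/* index-increment version */
void copy_idx(int *dst, const int *src, int n)
{
    int i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* pointer-increment version, unrolled four times
   (assumes n is a multiple of 4) */
void copy_ptr(int *dst, const int *src, int n)
{
    const int *end = src + n;
    while (src != end) {
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
    }
}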

The PS3 requires even more gobs of ASM to make it shine as all the power is in the SPUs, and there are lesser amounts of other hardware you can offload work to. You need ASM to take advantage of the vector processing in the SPUs. Actually, that is not fully true - unless you are insane, you use "intrinsics" instead to get at vector or other instructions that a compiler cannot easily use. Intrinsics are essentially ASM instructions, but they do not take registers as arguments, they take C variables. The compiler then does the work of register allocation. It's a beautiful compromise as register allocation/tracking is always what drives me totally nuts about ASM programming when dealing with a more complex algorithm, and a good compiler is going to do a better job of this than you unless you REALLY want to spend a lot of time thinking about how to beat it. I did have to work with a code base that was actually programmed in a large amount of real SPU ASM, probably out of stubbornness as I couldn't see a performance advantage to it - I really wanted to bitchslap the programmer as it is brutally hard to try to understand and debug that amount of someone else's ASM.
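For anyone who hasn't seen them, an SPU intrinsics loop looks something like this (a sketch assuming the usual SPU language extensions: spu_intrinsics.h, the vec_float4 type and spu_madd):
Code: [Select]
#include <spu_intrinsics.h>

/* acc += a * b over arrays of float 4-vectors; each intrinsic maps
   onto an SPU instruction, but the compiler allocates the registers */
void madd_arrays(vec_float4 *acc, const vec_float4 *a,
                 const vec_float4 *b, int n)
{
    int i;
    for (i = 0; i < n; i++)
        acc[i] = spu_madd(a[i], b[i], acc[i]);
}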

Now I've not touched a bit of Amiga code in 25 years, but if I had to get something going, I think I'd be inclined to at least try to code in C/C++ as much as possible, and only ASM-optimize the heavy-lifting tight loops where the compiler is sucking. I'd try to use the libraries provided, but if they became a problem, I'd bypass them. Just saying I'm not sure there is one 100% right or wrong way to do anything; it will depend. Though I would say I'd certainly avoid 100% ASM coding just for the sake of it, I'm too old for that crap now.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 18, 2014, 05:30:33 PM
Quote from: Sean Cunningham;769263
None of them invested in coding the way the AM divisions at SEGA did.

That was more because they already had experience with 3d. Most developers had no experience of 3d at all at the beginning as they had megadrive/snes/amiga backgrounds. That is why Sony got involved with Namco.
 
I always found the Saturn ports of arcade games very disappointing, and nothing like the frame rate or resolution of the arcade games they were supposed to be. I just watched a video of Fighting Vipers and it looks like quite a low frame rate.
 
Quote from: Bif;769279
The R3000 was interesting in its instruction pairings and I think the compiler wasn't daring enough to get as aggressive as it could.

FWIW It's not an R3000, although Sony went to great lengths to make you think it was.
 
Sony licensed the R33300 from LSI and modified it. You get the HDL and you can change what you want, like adding the GTE and making it so the data cache can only run in scratchpad mode (the R33300 can be switched at run time between a traditional cache and a scratchpad). You then go back to LSI and they turn it into a sea of gates. The MDEC, DMA, IRQ and the RAM interface were also included here; if you look at a decap of the chip it's pretty much just one big blob of algorithmically generated gates, while gates designed by humans tend to have well-defined areas for each piece of functionality. If we had the HDL you could convert it to run on an FPGA.
 
There aren't instruction pairings as such; it's pipelined so that each instruction should finish in a cycle, and there is no register stalling (apart from the GTE and mult/div). So if you read from a register that hasn't been written to yet by the previous instruction, you get the old contents, unless an interrupt has occurred. There is a FIFO write cache so writes don't always stall (this is a standard R33300 feature which can be turned on or off at runtime; they didn't bother crippling that), and that can throw you off if you don't know about it.
 
Quote from: Bif;769279
With these older compilers it could be a lot of trial in error in how a loop is constructed, pointer increment vs. index increment, the magic amount of times to unroll a loop, etc.

The instruction cache has only one set, so it's very easy to churn the instruction cache when you call a function. If the entire function plus the functions it calls cannot fit in the cache, then just moving them around in memory can make a huge difference. But that can happen whether you write your application in C or assembler; the key is to have your code written in such a way that you can easily refactor it, and that isn't assembler.
 
Quote from: Bif;769279
Now for PSX, I believe one of the things that really dragged down early game performance was the piss poor Sony APIs we were forced to use. Not only did they not always make a lot of sense, their performance was atrocious in some cases (for no great reasons, just brain dead code / API design), with no legal way around it.

There is some interesting code in their libraries; it was in part caused by having to work around bugs in the hardware. Some of the later APIs were better, some of them were worse. The problem was that Sony only wanted you to use their libraries, because then they only needed to make sure that the next hardware would work with those libraries. They should have spent more effort on them to start with, because they were reluctant to improve them later on. Even the BIOS has some pretty poor code in it, which they didn't fix because they didn't want to hurt compatibility. It was definitely a lesson for them.
 
Quote from: biggun;769276
Haha lol.
But its true that I write sometimes a handfull instructions in Hexcode directly.

I do too, but too infrequently and for too many different cpus to remember the opcodes. I generally look them up and poke them into something with a disassembler as I don't usually have a cross assembler.
 
Quote from: Bif;769279
though I would say I'd certainly avoid 100% ASM coding just for the sake of it, I'm too old for that crap now.

There is a lot more investment in better compilers these days. If anyone likes staring at ASM trying to figure out ways of making it faster and wants better Amiga software then writing a new back end for gcc or clang would probably be the best bet.
Title: Re: newb questions, hit the hardware or not?
Post by: Bif on July 19, 2014, 09:11:55 AM
Quote from: psxphill;769317
There aren't instruction pairings as such, it's pipelined so that each instruction should finish in a cycle and there is no register stalling (apart from GTE and mult/div).


Yeah, you are right, I wasn't thinking of superscalar pairing; my memory was bringing back the weirdness with the branch delay slot, where the instruction after the branch is always executed. If you didn't have a useful instruction to put in the slot at the end of a loop, you wasted cycles. I can hardly remember any of this stuff; your memory and knowledge are really quite amazing, you live up to your moniker. I'm only now remembering a bit more, where I recall designing every loop to do at least 6 loads before anything else. Or maybe that was the R5900, my memory is not that reliable.
 
Quote from: psxphill;769317
There is a lot more investment in better compilers these days. If anyone likes staring at ASM trying to figure out ways of making it faster and wants better Amiga software then writing a new back end for gcc or clang would probably be the best bet.


Yeah, I agree. I was going to say the same thing but got too tired typing all that out. I still think there is room to use ASM to leverage some things that compilers can probably never be good at. E.g. in the early days, on integer-only machines, you could do tricks with add-with-carry type instructions to shave an instruction or two off a tight loop. That's the kind of stuff I'd be looking at if I went down to ASM.
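For example, a 64-bit add on a 32-bit integer-only machine (a sketch; the (lo < a_lo) test recovers the carry that an add-with-carry instruction such as the 68k's ADDX gives you for free):
Code: [Select]
#include <stdint.h>

/* 64-bit addition built from 32-bit halves */
void add64(uint32_t a_hi, uint32_t a_lo,
           uint32_t b_hi, uint32_t b_lo,
           uint32_t *r_hi, uint32_t *r_lo)
{
    uint32_t lo = a_lo + b_lo;

    *r_lo = lo;
    *r_hi = a_hi + b_hi + (lo < a_lo);  /* carry out of the low word */
}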
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 19, 2014, 10:18:44 AM
Quote from: Bif;769355
If you didn't throw an instruction outside the end of a loop you wasted cycles.

Yeah, that is where it got its name from: "Microprocessor without Interlocked Pipeline Stages". Even a branch modifying the program counter is performed without interlocks, so the following instruction is executed.
 
Branch and load delay slots made the hardware much simpler, moving the complexity to the compiler. The CPU in your PC is doing peephole optimisation constantly, which could be done at compile time. The disadvantage is that you're baking a lot of CPU architecture into the binary, which is why virtual machines are more interesting. The ART runtime on Android is moving away from JIT and going from bytecode to optimised code at install time.
 
Quote from: Bif;769355
I'm only now remembering a bit more where I recall designing every loop to do at least 6 loads before anything else. Or maybe that was the R5900, my memory is not that reliable.

I can't think why doing 6 loads would make a difference without a data cache, so it probably is the R5900. The cache prefetch on the R5900 is interesting, as it triggers the data to be fetched from RAM into the cache but doesn't stall the application waiting for results. So you can request that all your data is fetched, then do some calculation on data previously loaded into the cache, before finally loading the newly cached data into registers. This is the kind of thing that is really hard to get optimal even in assembler, because you might end up flushing data out of the cache that you will need again soon.
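In C the pattern would look something like this (a sketch using GCC's __builtin_prefetch as a stand-in for the R5900's prefetch instruction; the distance of 32 elements is an arbitrary assumption you would have to tune):
Code: [Select]
/* request data well ahead, then work on data already in the cache */
void scale(const float *src, float *dst, int n)
{
    int i;
    for (i = 0; i < n; i += 4) {        /* assumes n is a multiple of 4 */
        __builtin_prefetch(&src[i + 32]);
        dst[i]     = src[i]     * 2.0f;
        dst[i + 1] = src[i + 1] * 2.0f;
        dst[i + 2] = src[i + 2] * 2.0f;
        dst[i + 3] = src[i + 3] * 2.0f;
    }
}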
 
The PS1 was definitely the best design that Sony did. The PS2 and PS3 were too complex and it's hard to think of the PS4 as anything other than a fixed specification PC.
 
I believe that if Commodore had ignored AAA and started AGA earlier, but included some chunky 8- and 16-bit modes and some simple texture mapping in the blitter, and released it in 1990, then they would have stood a chance against the 3D consoles, Doom on the PC, etc. AGA was designed in a year; AAA was started in 1988. So giving them two years should have been enough.
Title: Re: newb questions, hit the hardware or not?
Post by: ppcamiga1 on July 19, 2014, 10:36:20 AM
More than twenty years ago, when I bought the Amiga 1200, the most annoying thing was that games did not work with a hard drive, because some idiots making these games "optimized" them to read data from floppy disk without the operating system.

On an Amiga with only a floppy disk those games were maybe about 1% faster, but on an Amiga with a hard drive they were useless, because these games do not use the hard disk. The same games work better on the PC, because there they do use the hard disk.

The second most annoying thing on the Amiga 1200 was that software did not work with a VGA monitor. Because again some idiots "optimized" the software, users lost the ability to connect a low-cost VGA monitor to the Amiga. Those idiots could have gained maybe 1.5% in performance, maybe not. A VGA cable to connect to the Amiga 1200 may cost 4 Euro, maybe less. Users should not be forced to purchase a scandoubler for 150 Euro or more because some developers are too stupid to give up useless "optimization". AGA must be programmed one way to use an ordinary monitor and another way to use a VGA monitor. It is sad, but this is what Commodore did many years ago, and developers have to just accept it.

Access to the hard disk on the classic Amiga should be made only through the system; the original IDE interface is too slow, and software for the classic Amiga should work with FastATA. Access to the graphics on the classic Amiga should be made only through the system; users should be able to connect a low-cost VGA monitor to the Amiga. Access to the keyboard and mouse on the classic Amiga should be made only through the system; users should be able to use a USB mouse and keyboard with only a USB interface and without additional hardware.
Title: Re: newb questions, hit the hardware or not?
Post by: itix on July 19, 2014, 10:59:52 AM
Quote from: ppcamiga1;769358
More than twenty years ago, when I bought the Amiga 1200, the most annoying thing was that games did not work with a hard drive, because some idiots making these games "optimized" them to read data from floppy disk without the operating system.


During the 80s hard disks were rare and expensive. It made sense to optimize games for the typical Amiga 500 configuration.

Quote

On an Amiga with only a floppy disk those games were maybe about 1% faster, but on an Amiga with a hard drive they were useless, because these games do not use the hard disk. The same games work better on the PC, because there they do use the hard disk.


They do. But on the Amiga you can fit more data on the floppies if you throw away the filesystem.

Quote

The second most annoying thing on the Amiga 1200 was that software did not work with a VGA monitor. Because again some idiots "optimized" the software, users lost the ability to connect a low-cost VGA monitor to the Amiga. Those idiots could have gained maybe 1.5% in performance, maybe not.


By using a VGA monitor you lose 50% of your performance.

Quote

Access to the hard disk on the classic Amiga should be made only through the system; the original IDE interface is too slow, and software for the classic Amiga should work with FastATA. Access to the graphics on the classic Amiga should be made only through the system; users should be able to connect a low-cost VGA monitor to the Amiga.


Btw, if you only have a VGA monitor on your Amiga, you get into trouble if your system stops booting. AmigaOS can't display the early boot menu or the boot console on VGA.
Title: Re: newb questions, hit the hardware or not?
Post by: spirantho on July 19, 2014, 12:26:19 PM
Itix has covered most of this already but...

Quote from: ppcamiga1;769358
More than twenty years ago, when I bought the Amiga 1200, the most annoying thing was that games did not work with a hard drive, because some idiots making these games "optimized" them to read data from floppy disk without the operating system.


Why does that make someone an idiot?
Reading data from a floppy disk is much faster if you know exactly what you're doing. If you're doing dynamic disk access (i.e. reading data while playing the game without interruption) it's almost a necessity.
With a well-organised disk you can just blast the data from certain tracks straight into memory - much faster than messing around with file tables, and with much less memory required.
And that's not even to mention copy-protection systems, which were standard at the time.

Quote

On an Amiga with only a floppy disk those games were maybe about 1% faster,


Floppy access can be MUCH faster without the OS overheads on smaller chunks of data, and with a smaller memory footprint, which is very important. Plus, to use the OS routines you need the OS in memory too, which can be a very large chunk of available memory.
1% massively understates the potential gains, in speed and in memory.

Quote

but on an Amiga with a hard drive they were useless, because these games do not use the hard disk.
The same games work better on the PC, because there they do use the hard disk.


But hardly anyone at the time actually had a hard disk. Most of the Amigas sold were the standard versions, and those who DID have a hard disk tended to use it for serious things which really needed it, because drives were so small. The A600 came with a 20MB hard disk. By the time you'd installed Workbench and some serious utilities, you barely had enough room for anything else, especially a game with 1MB per disk. You could always put in another, larger hard disk, but these were seriously expensive at the time.

Quote

The second most annoying thing on the Amiga 1200 was that software did not work with a VGA monitor. Because again some idiots "optimized" the software, users lost the ability to connect a low-cost VGA monitor to the Amiga.


You need to think about what the market was at the time. Most Amiga users were using a TV; maybe by 1992 more people had VGA-capable monitors, but those who did usually had the Commodore or Microvitec monitors, which could do both anyway. Standard VGA monitors were not very much cheaper than the Commodore/Microvitec ones, so there was little to gain by supporting VGA only, and a LOT to lose.
To support VGA - apart from the fact that the AGA chipset just doesn't have the bandwidth to do most games in VGA - would require re-engineering games to work on both VGA and normal modes, and when 99% of the market has, or has access to, 15KHz monitors, which would you support?

Quote

Those idiots could have gained maybe 1.5% in performance, maybe not.


Those idiots quite often knew the hardware inside-out and knew the limitations and capabilities of the machine better than most of the people here.

Quote

A VGA cable to connect to the Amiga 1200 may cost 4 Euro, maybe less.  Users should not be forced to purchase a scandoubler for 150 Euro or more because some developers are too stupid to give up useless "optimization".


Nobody was ever forced to buy a scandoubler, and the reason games were done in 15KHz is because they had to be. The AGA chipset was never designed for rapid access to 31KHz screenmodes (hence why the screenmode is called "Productivity").

Quote

AGA must be programmed one way to use an ordinary monitor and another way to use a VGA monitor. It is sad, but this is what Commodore did many years ago, and developers have to just accept it.


Developers had to accept the limitations of the machine, yes.

Quote

Access to the hard disk on the classic Amiga should be made only through the system; the original IDE interface is too slow, and software for the classic Amiga should work with FastATA. Access to the graphics on the classic Amiga should be made only through the system; users should be able to connect a low-cost VGA monitor to the Amiga. Access to the keyboard and mouse on the classic Amiga should be made only through the system; users should be able to use a USB mouse and keyboard with only a USB interface and without additional hardware.


I think you're underestimating the impact of the OS on a game. Squeezing a game into a floppy and 1MB or even 2MB could be a real challenge (2MB on the A1200 sounds like more, but the higher-quality graphics soon make that disappear).

There are very good reasons why these "idiots" made the choices they did. Yes, in a perfect world everything would work in 31KHz with OS-legal everything, but to get the performance out of a system like the A1200 (which was far slower and more limited in resources than the PC you compare it to), developers had to sacrifice things.

This is why certain games like Colonization, Robosport and Sim City 2000 work in the OS - because they can happily work in the OS with slower disk access, and the market is people with "serious" machines - but making games like Zool run in the OS is pointless for nearly all the market they're appealing to.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 19, 2014, 01:26:15 PM
Quote from: spirantho;769360
Floppy access can be MUCH faster without the OS overheads on smaller chunks of data, and with a smaller memory footprint, which is very important. Plus, to use the OS routines you need the OS in memory too, which can be a very large chunk of available memory.
1% massively understates the potential gains, in speed and in memory.

You should be able to see the result by using WHDLoad.
 
I think the reason games kept using custom disk loading was piracy, and that not enough people cared about running games from a hard disk.
 
There were plenty of PC games that did exactly the same thing in the mid-80s, but eventually publishers decided that allowing games to be installed on a hard disk would boost sales. http://www.vintage-computer.com/vcforum/showthread.php?16334-PC-Floppy-Disk-Games-Copy-Protection
 
I remember removing the floppy disk protection check from one of the lemmings games on the PC so it could run without the original disk in the drive.
Title: Re: newb questions, hit the hardware or not?
Post by: spirantho on July 19, 2014, 01:37:53 PM
WHDLoad has the benefit of running on machines with more resources, though. Running WHDLoad'ed games on an A600 - even where it's possible - isn't terribly enjoyable!

Copy protection was a large part of it, though, yes. But it certainly wasn't stupidity or idiocy.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 19, 2014, 04:30:55 PM
Reading data from floppy faster than from hard drive? What an absolute load of nonsense. Where do people come up with that crap?
Title: Re: newb questions, hit the hardware or not?
Post by: spirantho on July 19, 2014, 04:32:55 PM
Of course not, but track loading is much faster than file loading from a floppy.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 19, 2014, 06:30:00 PM
Quote from: spirantho;769370
It course not, but track loading is much faster than file loading from a floppy.

Why? Do you think games use higher magic for loading? The trackdisk.device also reads the data in full tracks, and then decodes the entire track in a single go, buffering the result. The limiting factor is the I/O speed of the floppy and the timing of the step motor. Everything else is just software, and quite a bit faster than any type of I/O operation.

The only reason games used custom track formats was either to fit a little more data on the floppy by using a smaller track gap, or copy protection.
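To make the point concrete, here is a minimal sketch of reading one full track through trackdisk.device (error handling trimmed; a real program must check every return value):
Code: [Select]
#include <devices/trackdisk.h>
#include <proto/exec.h>

#define TRACK_BYTES (NUMSECS * TD_SECTOR)   /* 11 * 512 on a DD disk */

/* read track 80 (cylinder 40, head 0) of DF0: into buffer */
void read_track(UBYTE *buffer)
{
    struct MsgPort *port = CreateMsgPort();
    struct IOExtTD *io = (struct IOExtTD *)
        CreateIORequest(port, sizeof(struct IOExtTD));

    if (io && !OpenDevice("trackdisk.device", 0,
                          (struct IORequest *)io, 0)) {
        io->iotd_Req.io_Command = CMD_READ;
        io->iotd_Req.io_Data    = buffer;
        io->iotd_Req.io_Length  = TRACK_BYTES;
        io->iotd_Req.io_Offset  = 80 * TRACK_BYTES;
        DoIO((struct IORequest *)io);

        io->iotd_Req.io_Command = TD_MOTOR;   /* motor off again */
        io->iotd_Req.io_Length  = 0;
        DoIO((struct IORequest *)io);

        CloseDevice((struct IORequest *)io);
    }
    if (io)
        DeleteIORequest((struct IORequest *)io);
    if (port)
        DeleteMsgPort(port);
}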
Title: Re: newb questions, hit the hardware or not?
Post by: spirantho on July 19, 2014, 07:46:32 PM
Just because file loading requires more seeking (to block 880, in particular). With track loading you know at any time where the head is and where it needs to be. I/O time is more or less the same (usually slightly faster with custom loaders), but seek time can be much less. Just compare copying a disk to RAM with "Copy DF0:#? RAM:" against "DiskCopy from DF0: to RAD:". They're both doing exactly the same thing, but DiskCopy will be massively faster, because the seeking is nearly eliminated.
With track loading you can optimise everything for fast, efficient, low-memory loading, whereas the OS has to cope with the generic case. When you know exactly where the data you want is, you can fetch it directly, much faster.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 19, 2014, 07:59:00 PM
Quote from: spirantho;769379
Just because file loading requires more seeking (to block 880, particularly).

I guess we're talking about two different abstraction levels here. I was talking about the difference between using trackdisk.device and a custom track loader that hacks the hardware. You are talking about using a custom file system (or just data stored track by track) vs. the standard OFS.

These are different levels. A floppy not using OFS can still use the Amiga floppy format and the trackdisk.device for reading, but can equally simply be copied. Yes, I agree, if you do not depend on the OFS, you can be faster due to the lack of seeking. However, that's not what most games did. They also used (besides a custom "filing system") a custom low-level format, and *that* makes no major difference speed-wise. There was no other reason to disable the Os except that; the trackdisk.device worked perfectly fine. Thus, even with the Os you can load faster. Just do not depend on files and the OFS.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 19, 2014, 08:30:05 PM
Quote from: Thomas Richter;769374
Why? Do you think that games use higher magic for loading? The trackdisk device also reads the data in full tracks, and then decodes the entire track in a single go, buffering the result. The limiting factor is the I/O speed of the floppy, and the timing of step motor. Everything else is just software and quite a bit faster than any type of I/O operation.

trackdisk.device is awful. In the Abacus book there was a program that patched the 1.2 trackdisk to double its speed (http://issuu.com/ivanguidomartucci/docs/amiga-disk-drives-inside-and-out---ebook-eng page 249, real page 240). Either the person at Commodore/Amiga who wrote trackdisk didn't understand the hardware, or it was written before the functionality was added/working and the code wasn't revisited. I think Commodore improved it in Release 2, but it was a little late by then, as Amiga games were already in decline.
 
If we'd had the OS loaded from flash ROM or hard disk instead of mask ROM, then it would have made more sense to use the OS.
 
Final Fight uses the OS for disk loading during levels, so it's entirely possible. But I guess it is slower and leaves less RAM and CPU for the game in the process.
When dos.library and the filesystem were brought in, they were only minimally changed to fit into the Amiga, and it wasn't a great design in the first place. Commodore also improved them in Release 2, but there are some things they couldn't change because of compatibility.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 19, 2014, 11:10:56 PM
Peecee operating system in assembly language: http://www.menuetos.net.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 19, 2014, 11:13:42 PM
Quote from: Thorham;769392
Peecee operating system in assembly language: http://www.menuetos.net.
No, no, no, Thorham! You can't manage a large-scale project in assembler, therefore that doesn't exist!
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 19, 2014, 11:38:50 PM
Quote from: psxphill;769383
Either the person at commodore/amiga who wrote trackdisk didn't understand the hardware, or it was written before functionality was added/worked and the code wasn't revisited.

The issue is that the Amiga 500 does not have a lot of RAM; there is no space for prefetching blocks.
Instead the disk has to rotate to the correct sector and read a block, discarding the rest, then rotate to the next sector and read a block, discarding the rest.

You have the same problem now with faster IDE/SATA devices; because the disk is so fast anyway they don't care, but it is a problem for DVD and CD. If the filesystem is not smart about how it reads blocks, everything filters down to device I/O, where there is no intelligence to make it as efficient as possible.

Anyway, prefetching every block in a track on the Amiga 500 was not something they did, most likely because it takes up memory. When you have only 512K, every byte counts.

What does not make sense is why they don't do it today.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 01:29:43 AM
Quote from: commodorejohn;769393
No, no, no, Thorham! You can't manage a large-scale project in assembler, therefore that doesn't exist!
Even if you could, it would probably end up being riddled with bugs anyway :rofl:
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 20, 2014, 07:40:46 AM
Quote from: commodorejohn;769393
No, no, no, Thorham! You can't manage a large-scale project in assembler, therefore that doesn't exist!

It's a monolithic kernel and has little hardware support, yet it's taken 9 years. That sounds unmanageable to me; you seem to be conflating something being unmanageable with something not existing. I'll give you the benefit of the doubt that you just don't understand the meaning of the words rather than trying to bend the meaning on purpose (if you want an analogy: unmanageable hair doesn't mean you're bald).
 
http://www.osnews.com/story/1385/Comparing-MenuetOS-SkyOS-and-AtheOS/
 
Quote from: LiveForIt;769394
The issue is that Amiga500 does not have lot RAM, there is space for pre fetching blocks.
Instead the disk has rotate to correct sector read a block discard the rest, rotate to next sector read a block and discard the rest.

trackdisk.device only ever reads and buffers whole tracks.
 
If you read the Abacus chapter you'll see that the trackdisk in 1.2 doesn't use the word sync to find where the track starts; it reads more than a track's worth and then uses the CPU to search through the result. I think they might have stopped doing that in Release 2.
 
The disk format wasn't optimal for the hardware either. For reading, it would make more sense if there was just one $4489 per track; this wouldn't affect writing, as you have to write an entire track even if you have only modified one byte anyway. It looks like they wanted to allow sector writing, because Paula can search for the sync word when writing, but it doesn't have any way of checking which sector it would be writing to. My guess is the disk format was decided on and the code hacked to work on the hardware that existed, but nobody had time, or thought it would be a good idea, to go back and review the design after the hardware was finished.
Title: Re: newb questions, hit the hardware or not?
Post by: DamageX on July 20, 2014, 08:25:38 AM
Thanks for jacking my thread, LOL.
Title: Re: newb questions, hit the hardware or not?
Post by: matthey on July 20, 2014, 08:41:15 AM
Quote from: DamageX;769408
Thanks for jacking my thread, LOL.


From 2006 no less :D.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 20, 2014, 09:21:39 AM
Quote from: psxphill;769407
If you read the abacus chapter you'll see that the trackdisk in 1.2 doesn't use the word sync to find where the track starts, it reads more than a track worth and then uses the cpu to search through the result. I think they might have stopped doing that in release 2.

Well, it has to read the raw data and decode the MFM; only after the MFM is decoded can it know what's on the track and find the block that was requested.
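For reference, the per-longword decode looks like this; Amiga MFM stores the odd and even data bits separately, so recombining is two masks and a shift (a sketch):
Code: [Select]
#include <exec/types.h>

/* recombine one data longword from its odd-bits and even-bits halves */
ULONG mfm_decode(ULONG odd, ULONG even)
{
    return ((odd & 0x55555555UL) << 1) | (even & 0x55555555UL);
}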

Quote
The disk format wasn't optimal for the hardware either. For reading it would make more sense if there was just one $4489 per track

Sure, more data if there are no gaps, but I think it comes down to the filesystem: you don't want too-big blocks, because then small files take up more space. But physical storage blocks and filesystem blocks do not need to be the same, I agree. On the other hand, with no sectors there would be only one CRC for one large block, so more data would be lost if there was a read/write error; maybe dividing it into sectors made it more reliable, I don't know.

Quote
It looks like they wanted to allow sector writing because Paula can search for the sync word when writing

Some floppy drives/disks did have a sync hole; maybe it was legacy.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 09:22:58 AM
Anyone who says that managing big assembly language projects is impossible, is basically saying that we humans are too damned stupid for that. Speak for yourself, please.
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 20, 2014, 09:29:13 AM
Quote from: Thorham;769412
Anyone who says that managing big assembly language projects is impossible, is basically saying that we humans are too damned stupid for that. Speak for yourself, please.

Humans are good at poisoning the food they eat, so sure, they are stupid; from time to time they do something smart, but not often.

Regards
The Gray Alien from planet X.
(I run to hide in my UFO.)
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 09:44:10 AM
Quote from: LiveForIt;769394
The issue is that Amiga500 does not have lot RAM, there is space for pre fetching blocks.

Actually, the trackdisk.device does prefetch entire tracks. It always has.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 09:54:53 AM
Quote from: psxphill;769407
If you read the Abacus chapter you'll see that the trackdisk in 1.2 doesn't use the word sync to find where the track starts; it reads more than a track's worth and then uses the CPU to search through the result. I think they might have stopped doing that in Release 2.
This is correct, and indeed they stopped this nonsense in Kickstart 2.0. The reason for this "feature" in 1.3 was likely a mis-documentation in the RKRM Hardware book on how the sync-word feature works. According to the RKRMs, the sync word works by PAULA watching the incoming MFM stream and enabling DMA as soon as the pattern is detected. However, this is incorrect; PAULA does more than that. If the sync word is enabled and DMA is running, PAULA *also* resynchronizes on every detected sync word. This makes an important difference in the track gap, where synchronization can be lost and it is not clear that the start of the track aligns correctly, bit-wise, with the end of it. From the RKRM description, you would only have gotten unaligned MFM data in the buffer after the track gap, and hence would need to re-align manually - which is what they did. However, PAULA is not that stupid. Whenever it sees the sync word, it restarts at a word boundary from that sync word on, discarding any incomplete or unsynchronized bits.
Quote from: psxphill;769407
The disk format wasn't optimal for the hardware either. For reading it would make more sense if there was just one $4489 per track, this wouldn't affect writing as you have to write an entire track even if you have only modified one byte anyway. It looks like they wanted to allow sector writing because paula can search for the sync word when writing, but it doesn't have any way of checking which sector it would be writing to. My guess is the disk format was decided on and code hacked to work on the hardware that existed but nobody had time, or thought it would be a good idea, to go back and review the design after the hardware was finished.

Actually, no. If you only had a single sync word, then PAULA would have only a single chance of finding the sync word per track, i.e. in the worst case an entire track would have to be read twice: first to detect the sync word - if you enabled PAULA right after the sync word passed under the head - and a second time to get the full data. With the sector layout, the MFM reader will at most spoil an entire sector plus the track gap, which is a much smaller part of the track.

Some games, however, use the entire track to store data, with a minimal track gap, and thus squeeze more bytes into a track. Of course at the expense of possibly reading tracks slower by missing the sync word in the first rotation.

The PC floppy disk controller does the reverse: it synchronizes on every sector, has a much larger inter-sector gap, and does no buffering, i.e. sectors are read and written individually. With the relatively short sector gap the Amiga trackdisk layout has, this would be rather impossible; the chance of overwriting the next sector would be very high. On the PC, the uncertainty in write alignment is compensated for by the larger inter-sector gap (i.e. a sector can overflow a little bit past its natural location, filling the sector gap without overwriting the next sector header).
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 10:02:47 AM
Quote from: Thorham;769412
Anyone who says that managing big assembly language projects is impossible, is basically saying that we humans are too damned stupid for that. Speak for yourself, please.

Again, you apparently haven't tried that yet. It is not a matter of stupidity. It is a game of chances and discipline. For every line of code you write, there is a certain chance of getting it wrong. With higher-level languages, the compiler detects such things for you. In assembler, it doesn't. It is also a matter of flexibility and maintainability. A project does not start from the first line and then get written in one go to the last line. You typically get spec changes somewhere in the middle, be it from your employer, or because you find that your initial design doesn't work for some reason. In a higher-level language, it is easy to adapt. In assembly, it usually means that you have to rewrite major parts of the code because the interfaces no longer fit, and errors sneak in during this process.

You know (but probably haven't experienced) that a piece of software is more than a collection of instructions. It is a software *design*. You don't need to design small projects, but you do for larger projects. The more abstract the tools you have for designing the code, the simpler it becomes to adapt and fix.

Anyhow, since you haven't gone through this, I suggest that you just try. I would have one or two assignments for you, to be written completely in assembler, *BY YOU*. If you go through this, and complete it in time, I stand corrected.

Be warned, however. This will not be an easy undertaking. It is designed *not to be easy*; it is a "real life" project and not a toy project like your average demo.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 10:45:47 AM
Quote from: Thomas Richter;769418
Again, you apparently haven't tried that yet.
You're right, I haven't. Doesn't mean it's impossible.

Quote from: Thomas Richter;769418
You know, (but probably haven't experienced this) a piece of software is more than a collection of instructions.
Obviously. A system is always more than the sum of its parts.

Quote from: Thomas Richter;769418
Anyhow, since you haven't gotten through this, I suggest that you just try. I would have had one or two assignments for you, to be written completely in assembler, *BY YOU*. If go you through this, and complete this in time, I stand corrected.
I have some interesting things I've been wanting to do for a long time now. One of which is a new, written from scratch, modern GUI system for 68k Amigas. Another one is a new OS.

What are those assignments you have in mind?

Quote from: Thomas Richter;769418
Be warned, however. This will not be an easy untertaking. It is designed *not to be easy*, it is a "real life" project and not a toy project like your average demo.
Wouldn't have it any other way :)
Title: Re: newb questions, hit the hardware or not?
Post by: OlafS3 on July 20, 2014, 12:27:06 PM
Quote from: Thorham;769421
You're right, I haven't. Doesn't mean it's impossible.


Obviously. A system is always more than the sum of it's parts.


I have some interesting things I've been wanting to do for a long time now. One of which is a new, written from scratch, modern GUI system for 68k Amigas. Another one is a new OS.

What are those assignments you have in mind?


Wouldn't have it any other way :)


A GUI system sounds interesting. Or will you look at the existing systems and improve one of them? Sources are partly available.

I think we are all talking about different things. "Real-world applications" like a word processor are difficult to develop in pure assembler. Efficiency is very important, and most developers would need much longer in assembler than in a high-level language. Also, you often have a debugger and similar tools, and outside the Amiga world you even have configurable components that make life much easier. Another problem is that assembler is not portable, which matters for some projects, and you have the problem of finding another similarly skilled developer if, for example, the main developer leaves. There are obviously far more developers with experience in C than in a particular assembler. That many people wrote in assembler many years ago was because of the lack of system resources, not because most people liked it. And as I said, I am currently aware of a number of different cores for FPGAs in development. It is not predictable whether all cores will be identical (from a developer's view), so if you are a game developer hitting the hardware in assembler, you have to test on every core available (and UAE) if you want to be sure. Or you just target one core, with the risk that it will not run everywhere. That might be OK for hobby development, but it is a no-go for potential commercial development. And someone working on an application will certainly not hit the hardware, and will use the OS instead.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 12:41:53 PM
Quote from: OlafS3;769424
Or you look at the existing systems and improve one of them?
Sorry, but no. I want my own system that's more modern than what's available now for 68k. The idea is to start from scratch and use the OS for user I/O. And yes, that means it would only run on its own screen.

Quote from: OlafS3;769424
That many people wrote in assembler many years ago was because of the lack of system ressources and not because most people liked it.
It's a hobby for me.

Quote from: OlafS3;769424
And as I said I am right now aware of a number of different cores for FPGAs that are in development.
I'm interested in writing Amiga software. If someone wants to run Amiga software, let them use an Amiga (or an emu)!
Title: Re: newb questions, hit the hardware or not?
Post by: OlafS3 on July 20, 2014, 12:59:22 PM
Quote from: Thorham;769425
Sorry, but no. I want my own system that's more modern than what's available now for 68k. The idea is to start from scratch, and use the OS for user IO. And yes, that means it would only run on it's own screen.


It's a hobby for me.


I'm interested in writing Amiga software. If someone wants to run Amiga software, let them use an Amiga (or an emu)!


Ah ok

if you do not care whether other people can run it, then go ahead

A GUI system is, for me, something like Intuition or Triton: a library to create a GUI for applications and tools/utilities. If it doesn't run everywhere (or it is at least not certain that it works), no one else will use it.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 01:30:27 PM
Quote from: OlafS3;769426
if you do not care if other people can run it then go
Maybe I put it a little too harshly. The intention is to write true Amiga software that goes further than what's currently available, especially on lower-end 020s and 030s with some fastmem. If that's successful, it shouldn't be a big deal to add some chunky GFX support for GFX cards.

The emphasis is on OCS/ECS/AGA+20s/30s, because I believe these machines are capable of more than what we're seeing today. All that's available is old desktops and old GUIs, and this can be massively modernized without any crazy requirements (obviously some of the eye candy will be missing, but that's not what makes a modern GUI system modern).

All of the low-level graphics code for this will be planar blitting routines, some c2p, and a hardware sprite for the mouse pointer. Getting this stuff to work on GFX cards would require adding some code to access the GFX card functions, adding chunky versions of the planar blitting routines (which are vastly easier to write than the planar ones), and not using the c2p. Certainly not a massive task when you have an actual system running. It's just not something I would do from the start if I took on a project like this.
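For reference, the completely unoptimized form of the c2p part would look like this (a sketch; the real thing would be a merging 68k routine, and it assumes the plane rows are pre-cleared):
Code: [Select]
#include <exec/types.h>

/* naive chunky-to-planar: one row of 8-bit chunky pixels into
   up to 8 bitplane rows */
void c2p_row(const UBYTE *chunky, UBYTE *planes[8],
             int width, int depth)
{
    int x, p;

    for (x = 0; x < width; x++)
        for (p = 0; p < depth; p++)
            if (chunky[x] & (1 << p))
                planes[p][x >> 3] |= 0x80 >> (x & 7);
}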

As for the 68k assembly language, that's not negotiable :D It's a hobby after all ;)
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 01:34:57 PM
Quote from: Thorham;769421
What are those assignments you have in mind?

Ok, here are a couple of ideas. Write a complete JPEG 2000 codec, from scratch, from the specs, in assembler. If that's not interesting enough, you can also start with HEVC (the latest MPEG standard), again from the specs. For the first project, I could give you help since I did this. For the second, I would be of no help since it's not exactly my branch.

The first project takes approximately 6 months in C++ (been there, did that). The second takes probably longer since it is more complex. In either case, it would be beneficial for the Amiga community, and both projects have *some* use for assembler - though it would be insane, as I said before, to write them in assembler completely. But again, you claim it's possible, so go ahead.

My bet is, you'll probably start, let it go for a month, then lose interest because it is too complex. But, as said, it's your choice. Rule number 1 is not to give up, but that's easier to realize in C++ than it is in assembler.
Title: Re: newb questions, hit the hardware or not?
Post by: commodorejohn on July 20, 2014, 01:48:36 PM
Quote from: psxphill;769407
That sounds unmanageable to me, you seem to be confused between something being unmanageable and something existing. I'll give you the benefit of the doubt that you just don't understand the meaning of words rather than trying to bend the meaning on purpose
And I'll give you the benefit of the doubt and assume that you're a robot from space who does not understand the thing we hu-mans refer to as a "joke."
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 02:17:30 PM
Quote from: Thomas Richter;769428
Ok, here are a couple of ideas. Write a complete JPEG 2000 codec, from scratch, from the specs, in assembler. If that's not interesting enough, you can also start with HEVC (the latest MPEG standard), again from the specs. For the first project, I could give you help since I did this. For the second, I would be of no help since it's not exactly my branch.
I lack the math knowledge to implement these optimally, and they're boring to me. Also, MPEG is useless on my preferred targets (20s and 30s with some fastmem) because they're just not fast enough.

What about that GUI system I was talking about? This would be much more interesting for me, because it also requires designing everything. Coding from specs is boring to me, and I want the freedom to do what I want. It also doesn't seem to be a small project.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 02:36:45 PM
Quote from: Thorham;769430
I lack the math knowledge to implement these optimally
So did I when I started. That's not an argument - learn it. You might experience something new.
Quote from: Thorham;769430
Also, MPEG is useless on my preferred targets (20's and 30's with some fastmem) because they're just not fast enough.
Then make it fast enough. (-;
Quote from: Thorham;769430
What about that GUI system I was talking about? This would be much more interesting for me, because it also requires designing everything. Coding from specs is boring to me, and I want the freedom to do what I want. It also doesn't seem to be a small project.

Real-world software development is hardly ever "doing what you want". GUI systems, in a sense, are not very challenging, nor very demanding (I wrote one for SDL in a matter of weeks, not months). Except that I would never do that in assembly, not even partially, because any other language is also fast enough, and has better means to express the dependencies of the objects. The challenge here is, rather, to come up with a good class design and good documentation to actually make people use it.

So, please, let's come up with some projects that have a challenge in them due to their complexity. A GUI system is not that complex (been there, done that).
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 04:00:18 PM
Quote from: Thomas Richter;769431
So did I when I started. That's not an argument - learn it. You might experience something new.  Then make it fast enough. (-;
Yeah, that probably is a bad argument, but it's just not my cup of tea.

Quote from: Thomas Richter;769431
Real world software development is hardly ever "doing what you want".
It is when you're doing it as a hobby. Not always, of course. You may end up having to write some things for your project that you don't feel like writing.

Quote from: Thomas Richter;769431
GUI systems, in a sense, are not very challenging, and neither very demanding (I wrote one for SDL in a matter of weeks, not months).
I'm talking about the larger projects like Gnome and KDE. These were undoubtedly not written in a couple of weeks.

Quote from: Thomas Richter;769431
So, please, let's come up with some projects that have a challenge in it due to their complexity. A GUI system is not that complex (been there, done that).
Perhaps not very complex, but also not very small, and a good, modern GUI system is quite useful at least.

The thing that's most interesting for me to do that's also complex is writing a new OS from scratch. Should be sufficiently challenging.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 20, 2014, 04:33:23 PM
Quote from: LiveForIt;769411
Well, it has to read the raw data and decode the MFM; only after the MFM is decoded can it know what's on the track and find the block that was requested.

It doesn't need to do that for every block, though: if you request a block from the last track read, it can skip those steps.
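
In other words, a simple track cache. A rough sketch in C (read_and_decode_track() is a hypothetical stand-in for the slow read-plus-MFM-decode path):
Code: [Select]
#include <string.h>
#include <stdint.h>

#define BLOCKS_PER_TRACK 11
#define BLOCK_SIZE       512

extern int read_and_decode_track(int track, uint8_t *buf); /* slow path */

static uint8_t track_cache[BLOCKS_PER_TRACK * BLOCK_SIZE];
static int     cached_track = -1;

int td_read_block(int track, int block, uint8_t *out)
{
    if (track != cached_track) {           /* miss: read + MFM decode */
        if (read_and_decode_track(track, track_cache))
            return -1;
        cached_track = track;
    }
    memcpy(out, &track_cache[block * BLOCK_SIZE], BLOCK_SIZE);
    return 0;
}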
 
Quote from: commodorejohn;769429
And I'll give you the benefit of the doubt and assume that you're a robot from space who does not understand the thing we hu-mans refer to as a "joke."

We humans call it sarcasm. I just thought your mobbing attempt should be debunked for its inaccuracy.
 
Quote from: Thorham;769421
You're right, I haven't. Doesn't mean it's impossible.

There are lots of things that are theoretically possible but practically impossible, usually because people overestimate their abilities and their theory was incomplete.
 
Quote from: Thomas Richter;769417
Actually, no. If you only had a single sync word, then PAULA had only a single chance of finding the sync word per track

Is this a big deal? Why wouldn't it find the sync word? If there are frequent bit errors then the floppy disk is going to fail anyway as there is no redundancy for all the data bits.
 
The only advantage I can think of is that you can start reading sooner, as you don't need to wait for the start of a track, but I'm not sure how much help that is if you need to re-arrange the track in RAM anyway. Especially with the hokum that Kickstart 1.x uses.
 
Quote from: LiveForIt;769411
On the other hand, with no sectors there would be only one CRC for the large block after the 4489, so more data would be lost if there was a read/write error. Maybe dividing it into sectors made it more reliable, I don't know.

You could use multiple CRCs, but if you're expecting errors then you could lose the only 4489. I don't know how well trackdisk copes with damaged sectors, especially if it's the sector number that gets corrupted.
 
Quote from: Thomas Richter;769417
With the relatively short sector gap the Amiga trackdisk layout has, this would be rather impossible. The chance of overwriting the next sector would be very high. For the PCs, the uncertainty in write alignment is compensated with the higher inter-sector gap (i.e. the sector can overflow a little bit behind its natural location, then fills the sector gap without overwriting the next sector header).

Yes, you would need to make the sector gap larger, and I think they would have done that if Paula had had the functionality. In write mode with sync turned on, it does read from the disk and wait for the word sync. However, there is no sector number comparison; my conjecture is that they may have started down this path and given up due to time/available gates rather than a desire to do it all with the CPU.
 
Quote from: Thomas Richter;769417
From the RKRM description, you would have only gotten unaligned MFM data in the buffer after the track gap, and hence would need to re-align manually - which is what they did. However, PAULA is not that stupid.

I'd assumed that the hardware reference manual was written after trackdisk.device was. There will have been documentation, but exactly what we'll never know. The software and hardware engineers had the opportunity to talk to each other about how the hardware worked, and supposedly they did on other occasions. It kinda worked well enough, and fixing it might not have been seen as a priority. The developer might have been arrogant about his ability and never bothered to discuss it with anyone, or he might have been arrogant enough to say that he'd tried making the hardware work and it didn't, so he'd been forced to do it that way, and nobody ever took him on.
 
The Kickstart 1.2 easter egg would suggest that it was released before development moved to Commodore. My guess is that the trackdisk developer didn't transition to Commodore, allowing any misinformation as to why it was coded like that to disappear.
 
Quote from: Thorham;769412
Anyone who says that managing big assembly language projects is impossible, is basically saying that we humans are too damned stupid for that. Speak for yourself, please.

You might want to read this:
 
http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect (http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect)
 
My favourite quotes
 
"One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision."
 
"If you’re incompetent, you can’t know you’re incompetent. […] the skills you need to produce a right answer are exactly the skills you need to recognize what a right answer is."
 
Over the years I've often found people say things are impossible when they are possible, but say they are possible when they are impossible. It's usually down to whether they want to do something rather than any practical reasons.
 
Quote from: Thorham;769434
I'm talking about the larger projects like Gnome and KDE. These were undoubtedly not written in a couple of weeks.

You'd need to design it first, or you'll code yourself into a corner and end up constantly rewriting it all to get some new functionality that you think of tomorrow (and the next day). The problem with doing something as a hobby is that you don't have any external influence. If you don't manage scope then you're going to get bored.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 05:05:48 PM
Quote from: psxphill;769435
You might want to read this:
 
http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
I never said that I can take on large, complex projects in assembly language by myself and get it right, I said that it's not impossible to do large, complex projects in assembly language and do them properly. Although I'm fairly confident, I'd have to try and see for myself if I could do it or not.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 05:15:36 PM
Quote from: psxphill;769435
Is this a big deal? Why wouldn't it find the sync word? If there are frequent bit errors then the floppy disk is going to fail anyway as there is no redundancy for all the data bits.

It's certainly not a big deal, but it makes the average floppy transfer roughly 50% slower. The reason is that, with only a single sync word, PAULA would have to wait on average half a track to get to it: at 300 rpm a full track passes in 200 ms, so that's about 100 ms of extra latency on a 200 ms track read. With a sync on every sector, you miss on average a bit more than half a sector (roughly 200/11/2, i.e. about 9 ms), which is not that bad.

In the end, we do not know why CBM took such decisions.  

For the history lesson on the sync register, I found that apparently trackdisk was written before PAULA was completed (back then called "Portia"), and CBM decided not to update trackdisk for the new features but rather rush it out. AmigaOs was late anyhow. It's also interesting to note that trackdisk in Kickstart 2.0 and above no longer requires user buffers in chip memory. In the worst case, it copies data to a chip-mem buffer, or decodes using the CPU, bypassing the blitter. The track buffer remains, of course, in chip RAM. Thus, the ugly BufMemType hack for the FFS is actually no longer required (and should not be required by any sane device.)
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 20, 2014, 05:45:07 PM
Quote from: Thorham;769437
I never said that I can take on large, complex projects in assembly language by myself and get it right, I said that it's not impossible to do large, complex projects in assembly language and do them properly. Although I'm fairly confident, I'd have to try and see for myself if I could do it or not.

You don't have any idea what your own or anyone else's ability to achieve it is. You also don't have enough experience to evaluate your own or anyone else's results.
 
If you read and understood the link I posted then saying it's possible to do it and being fairly confident you could do it yourself is a bad sign for your ability to actually do it. It will definitely have an effect on how long you think it will take and how long it will actually take.
 
Quote from: Thomas Richter;769438
For the history lesson on the sync register, I found that apparently trackdisk was written before PAULA was completed (back then called "Portia") and CBM decided not to update trackdisk for the new features but rather rush it out. AmigaOs was late anyhow.

Well, that was my initial guess. While AmigaOS was late, so were the chips, although that was partly Commodore themselves switching AGNUS from YUV to RGB colour space. When they were hawking the breadboards around, there was no floppy; sound and RS-232 would have been much more useful than a floppy disk. I got the impression that trackdisk predated dos.library, but I find that kind of history really interesting, so any links etc. would be great.
 
Quote from: Thomas Richter;769438
It's also interesting to note that the trackdisk in Kickstart 2.0 and above no longer requries user buffers in chip memory. In worst case, it copies data to a chip mem buffer, or decodes using the CPU, bypassing the blitter. The track buffer remains, of course, in chip ram. Thus, the ugly BufMemType hack for the FFS is actually no longer required (and should not be required by any sane device.)

I knew that you could use it to read into fast RAM, but my knowledge of (or at least my memory of) how that affects decoding is limited. The MFM decoding using the blitter is pretty insane; they could have implemented one-pass MFM decoding and encoding in the blitter. On a chip-RAM-only 68000 system it's probably still faster than using the CPU to do it, though; on faster CPUs with fast RAM, using a lookup table is probably much better.
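
For reference, the CPU fallback doesn't even need a table. Here is a minimal C sketch of the mask-and-shift decode (which is essentially what the blitter does in two passes), assuming the standard trackdisk layout where each data longword is stored as two MFM longwords, all its odd bits first and then all its even bits:
Code: [Select]
#include <stdint.h>
#include <stddef.h>

/* Decode Amiga trackdisk MFM: data bits sit in the odd bit positions,
   clock bits in the even ones. Masking with 0x55555555 keeps the data
   bits; the odd half is shifted up by one and merged with the even half. */
void mfm_decode(const uint32_t *odd, const uint32_t *even,
                uint32_t *out, size_t nlongs)
{
    const uint32_t MASK = 0x55555555UL;
    for (size_t i = 0; i < nlongs; i++)
        out[i] = ((odd[i] & MASK) << 1) | (even[i] & MASK);
}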
 
I think you'd need to specify MEMF_24BITDMA in BufMemType for a Zorro II SCSI card in a Zorro III system with fast ram, but you might not consider that sane.
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 20, 2014, 06:18:06 PM
Quote from: psxphill;769439
You don't have any idea what your own or anyone elses ability to achieve it is.
I know we can go to the moon and drive remote-controlled vehicles around on Mars, so I know that human beings have the ability to pull off some damn difficult things, and that's all I need to know: that there are possibilities where others see brick walls. If no one thought anything hard was possible, we'd still be stuck in the stone age. Can I do it? Don't know, haven't tried.

Quote from: psxphill;769439
If you read and understood the link I posted then saying it's possible to do it and being fairly confident you could do it yourself is a bad sign for your ability to actually do it.
By "fairly confident" I meant that I'm fairly confident in my programming ability, and I specifically said that I don't know if I could handle a large, complex project in assembly language by myself. Also, notice the word "fairly", implying I know that I have some skill. How much? I don't know.

Anyway, all you're trying to say is how things are impossible, and I refuse to think like that, because I know things are not impossible, even though they may be very difficult.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 20, 2014, 09:03:50 PM
Quote from: psxphill;769439
I got the impression that trackdisk predated dos.library, but I find that kind of history really interesting. So any links etc would be great.
No official links on that, but I have other sources of information. (-: As far as dos.library is concerned, I would say this rather depends on your definition. dos is just a rushed port of Tripos, and I would safely say that Tripos predates any CBM activities on the Amiga. But that's probably not what you are asking? Tripos is from 1979, some parts such as the OFS from 1980, trackdisk from 1985. Porting to the Amiga also happened in 1985, so it's hard to say which one came earlier.
Quote from: psxphill;769439
I think you'd need to specify MEMF_24BITDMA in BufMemType for a Zorro II SCSI card in a Zorro III system with fast ram, but you might not consider that sane.

It is insane. Any driver worth its salt should know which memory it can reach (by DMA or otherwise), and should take appropriate measures to get the data into memory regions it can access, possibly using additional buffers. At the very least, it should not be the business of the *filing system* or the *user program* (shudder) to care about such implementation details. If a special memory type must be used, at least an interface should have existed to negotiate the requirements. Just depending on the user for this is a very lousy hack. Apparently, they didn't know what to do when FFS had direct buffer access and implemented a quick workaround by means of the BUFMEMTYPE. Argh.
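
The "additional buffers" approach is straightforward enough. A hedged sketch in C (dma_reachable() and do_dma_read() are hypothetical stand-ins for the driver's internals, and a real driver would return a proper error code such as TDERR_NoMem instead of -1):
Code: [Select]
#include <exec/types.h>
#include <exec/memory.h>
#include <proto/exec.h>

extern BOOL dma_reachable(APTR buf);              /* hypothetical */
extern LONG do_dma_read(APTR buf, ULONG length);  /* hypothetical */

/* If the caller's buffer can't be reached by DMA, bounce the data
   through chip RAM and copy; slower, but it always works. */
LONG drv_read(APTR userbuf, ULONG length)
{
    LONG err;
    APTR bounce;

    if (dma_reachable(userbuf))
        return do_dma_read(userbuf, length);

    bounce = AllocMem(length, MEMF_CHIP | MEMF_PUBLIC);
    if (bounce == NULL)
        return -1;

    err = do_dma_read(bounce, length);
    if (err == 0)
        CopyMem(bounce, userbuf, length);
    FreeMem(bounce, length);
    return err;
}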
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 20, 2014, 11:30:31 PM
Quote from: Thomas Richter;769449
dos is just a rushed port of Tripos

Yeah, they do seem to have ripped the guts out of it and transplanted exec devices in reasonably well. But the whole BCPL thing is tragic. I don't think anything other than the 1.x C: directory ever used the A5 global vector (which is why those commands were impervious to the powerpackerpatcher and you had to use the ARP equivalents). I guess AROS 68k doesn't implement the global vector either, but I don't know.
 
Quote from: Thomas Richter;769449
It is insane. Any driver worth its money should know which memory it can reach (by DMA or otherwise), and should take appropriate means to get the data into memory regions where it is accessible for it, possibly using additional buffers.

Additional buffers are the wrong way to do it. Ideally you'd be able to ask for memory that both you and another task can access, and there would be some way for trackdisk.device or scsi.device to tell exec what memory it needed; the MMU pages for your task and the other task would then get set up properly so that the memory could be accessed.
 
This would change quite a lot, though; I think with the way AllocMem works you need BufMemType. I'm not that bothered about that; MaxTransfer is much higher up my list of wtf.
Title: Re: newb questions, hit the hardware or not?
Post by: itix on July 21, 2014, 02:36:34 AM
Quote from: psxphill;769454

This would change quite a lot though, I think with the way AllocMem works you need BufMemType. I'm not that bothered about that, MaxTransfer is much higher up my list of wtf.


Wasn't MaxTransfer a workaround for buggy hard disks that could not transfer more than 64K at once?
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 21, 2014, 07:00:08 AM
Quote from: itix;769461
Wasnt MaxTransfer workaround for buggy harddisks that could not transfer more than 64K at once?

You should be able to set MaxTransfer to ffffffff on ATA hard drives. There is a fixed upper limit of 1fffe which is impossible to exceed and should be hard-coded in the device, and you are then supposed to ask the drive how many words to transfer after issuing a command. Commodore don't seem to know about the 1fffe limit and assume the drive will transfer what they request.
 
The most that ATA drives can usually transfer is 1fe00, because a drive always transfers multiples of 200 and 20000 would overflow the count. There is also nothing to stop a drive that can normally transfer 1f800 bytes from only being able to transfer 800 one time, because of temporary buffer constraints or sector remapping. If the drive doesn't transfer as much as you need, then afterwards you're supposed to issue more commands.
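
That is, the driver should own the limit and just loop. A rough sketch in C (ata_read_cmd() is a hypothetical stand-in for the register-level routine):
Code: [Select]
#include <stdint.h>

#define SECTOR_SIZE 512u
#define MAX_SECTORS 255u   /* 255 * 512 = 1fe00, the per-command cap above */

extern int ata_read_cmd(uint32_t lba, uint32_t nsectors, void *buf);

/* Split an arbitrarily large read into command-sized chunks,
   re-issuing commands until everything has been transferred. */
int ata_read(uint32_t lba, uint32_t nsectors, uint8_t *buf)
{
    while (nsectors > 0) {
        uint32_t chunk = (nsectors > MAX_SECTORS) ? MAX_SECTORS : nsectors;
        int err = ata_read_cmd(lba, chunk, buf);
        if (err)
            return err;
        lba      += chunk;
        buf      += chunk * SECTOR_SIZE;
        nsectors -= chunk;
    }
    return 0;
}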
 
There might be hard disks that react differently to how Commodore expected, but those hard disks are operating within specification. Commodore either never read it, or decided it was too hard to implement properly (maybe because it was written in assembler?).
 
Any problems with SCSI drives are likely to be caused by similar bugs in the relevant .device code. It sounds better if you can convince everyone that it's the superiority of the Amiga that requires kludges to work around bugs in cheap hard disks made by lesser people for the inferior PC.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 21, 2014, 09:50:19 AM
Quote from: psxphill;769454
Additional buffers is the wrong way to do it. Ideally you'd be able to ask for memory that you and another task can access and there would be some way of trackdisk.device or scsi.device to tell exec what memory it needed, the mmu pages for your task and the other task would then get setup properly so that the memory could be accessed.
 
This would change quite a lot though, I think with the way AllocMem works you need BufMemType. I'm not that bothered about that, MaxTransfer is much higher up my list of wtf.

Look, the design principle is "the easy things should be easy", and reading a block from a device should be easy. If I - as the caller - first have to make sure to get the proper memory, then the interface is already too complex, which will certainly create bugs from folks not following the interface. I would not have a problem with an interface that provides the *ideal* memory type for those users that want to optimize throughput, but the overall design principle should be that a device can handle whatever the user provides, regardless of the buffer memory type or the transfer size. Everything else is just asking for trouble.
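
Such a negotiation interface could be as small as this (purely hypothetical, just to illustrate the idea; nothing like it exists in the real devices):
Code: [Select]
#include <exec/types.h>
#include <exec/io.h>

/* Hypothetical: the device reports its preferred buffer attributes.
   Callers that care about throughput honour them; everyone else just
   passes any memory and the device copes (bounce buffers, CPU decode). */
struct BufferHints
{
    ULONG bh_MemFlags;     /* e.g. MEMF_CHIP | MEMF_PUBLIC */
    ULONG bh_MaxTransfer;  /* largest efficient single transfer, bytes */
};

LONG QueryBufferHints(struct IORequest *io, struct BufferHints *hints);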
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 21, 2014, 10:46:19 AM
Yeah, MaxTransfer sucks turds. Use IdeFix and never worry about it again.
Title: Re: newb questions, hit the hardware or not?
Post by: psxphill on July 21, 2014, 02:16:42 PM
Quote from: Thomas Richter;769476
I would not have a problem with an interface that provides the *ideal* memory type for such users that want to optimize the throughput, but the overall design principle should be that a device can handle whatever the user provides, regardless of the buffer memory type or the transfer size. Everthing else is just asking for trouble.

I would prefer that you asked for memory that was applicable, rather than adding lots of layers which will just slow everything down when you pass the wrong type of RAM.
 
 
Quote from: Thorham;769442
I know we can go to the moon, and drive remote controllable vehicles around on Mars, so I know that human beings have the ability to pull off some damn difficult things, and that's all I need to know.

Do you know that they spent a lot of money on high-level language development so they didn't have to rely on someone writing complex assembly language for the software of those projects?
Title: Re: newb questions, hit the hardware or not?
Post by: Thorham on July 21, 2014, 03:41:26 PM
Quote from: psxphill;769490
Do you know that they spent a lot of money on high level language development so they didn't have to rely on someone writing a complex assembly language for the software for those projects?
You're missing the point.

All I'm saying is that it's possible to handle a large project in assembly language and get it right. Possible, nothing else. In fact, it probably isn't easy. Large projects aren't easy, and assembly language will obviously not make them any easier. Doesn't mean it's impossible. Difficult != impossible.

Furthermore, 68k development is a hobby. As such, choosing assembly language on a 68k system is something you do because you like working in 68k assembly language, not because it's the least difficult or the fastest to develop in.
Title: Re: newb questions, hit the hardware or not?
Post by: guest11527 on July 21, 2014, 05:49:47 PM
Quote from: psxphill;769490
I would prefer that you asked for memory that was applicable rather than adding lots of layers which will just slow everything down when you pass the wrong type of ram.
 

What's better in the case of a user failing to allocate the buffer from proper memory: a) slowing down the transfer, or b) delivering wrong data and/or crashing the system?

What if that's your credit card data that's being transferred?
Title: Re: newb questions, hit the hardware or not?
Post by: LiveForIt on July 21, 2014, 06:05:17 PM
Quote from: psxphill;769490
I would prefer that you asked for memory that was applicable rather than adding lots of layers which will just slow everything down when you pass the wrong type of ram.


It's more efficient to reuse memory than to allocate it. So if you have an API that allocates the memory for you, you cannot reuse that memory; you have to free it and reallocate it through the API.