@whoosh777
I havent used the blitter for a long time but what happens if you give the blitter an odd address, do you get an exception or just a horrible crash??
No exception. The lowest bit of the address is ignored, so you would get odd_address - 1.
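To make that concrete, here is a minimal sketch in C of the masking described above. The function name is hypothetical and only models the "lowest bit is ignored" behaviour; it is not a real hardware or OS call.

```c
#include <stdint.h>

/* Hypothetical helper (not a real API): models an address whose lowest
   bit is ignored, so an odd address becomes odd_address - 1. */
static uint32_t blitter_effective_address(uint32_t addr)
{
    return addr & ~(uint32_t)1;  /* clear bit 0, keep everything else */
}
```

So an odd pointer such as 0x20001 would be treated as 0x20000, while an even pointer is left unchanged.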
as I said I can turn all the criticisms of my code around into criticisms of the system design, both h/w and OS,
I am sure of that. However, I would still recommend just testing for a NULL return, as documented.
Even if the app crashes, there is no way to "remove task" safely, since there is no resource tracking or separated address spaces.
if you had just memory tracking then you could remove the tasks allocated memory + code + stack,
I'm afraid that won't work. The memory allocated by the task can still be in use by other tasks / processes. Also the program seglist could be used by interrupts, hooks or other processes. This is why you can't free the task memory or unload the seglist.
With such methods all buffering would be left at the lower level (filesystem and device driver), thru several APIs.
... Lots of stuff comparing RAM L1/L2 cache and medium cache ...
correction: closer buffering is to the h/w the faster it is,
see CPU caches for example
I'm afraid this comparison is totally unfair, as typically the medium is tens to hundreds of times slower than memory, not to mention L1/L2. It certainly makes no sense at all to locally cache memory!
Also when the buffering is local it has much better knowledge of the actual buffer usage, so further optimizations are possible that would not be possible at a lower level.
I totally disagree, buffers should be dynamically allocated at the lowest level, preferably not by the programmer, in fact maybe even not by the filesystem but by a lower level still, though integrating the filesystem with the lowest level maybe is the best approach
I totally disagree with you, see below for an example.
Also choice of caching algorithm can make a huge difference this is part of why I dont want it done by the programmer,
if its not done by the programmer then it can be retargetted ie reimplemented
But the programmer does not need to do it, libc does it for him/her.
caches arent everything, filesystem design is equally important at determining speed, a well designed system would be really fast even if the programmer doesnt do any buffering, the subroutine call overhead of fputc( fgetc(infp) , outfp) should be quite tiny because this is such a tiny loop it should entirely be in the memory cache, (with fgetc() copying a byte from a low level buffer and fputc() to a low level buffer) note that not only will the instructions of this be entirely in the instruction caches but the file buffer arrays will also be entirely in data caches, so all round very fast,
I think AmigaOS filesystem is much more efficient than that of Windows XP, its a good filesystem, but it could be a lot better,
Well, apparently you don't know how complex the two APIs are, and how much overhead they cause. Maybe a simple example will clear it up for you.
Let's imagine a simple fputc without any buffering, except at the exec device driver level (which you say will be the most efficient):
- fputc calls Write(fh, &ch, 1); to write the char
- Write calls DoPkt with ACTION_WRITE
- DoPkt sets up a DosPacket with ACTION_WRITE and parameters for the write
- DoPkt sends the DosPacket to the filesystem MsgPort with PutMsg and Waits for the reply
- the filesystem wakes up from Wait() and fetches the DosPacket with GetMsg()
- the filesystem determines that the DosPacket is ACTION_WRITE and processes it
- the filesystem's ACTION_WRITE updates the current block of the file
- the filesystem's ACTION_WRITE uses DoIO with CMD_WRITE (or CMD_WRITE64, or HD_SCSICMD etc) to send an IORequest to the exec device driver
- DoIO uses the device driver's DEV_BEGINIO vector to send the IORequest
- the device driver's DEV_BEGINIO links the IORequest to the device task for processing, and returns
- DoIO's WaitIO waits for the IO to finish
- the device task processes the IORequest, sees that it's CMD_WRITE (or whatever), and updates its buffers (and perhaps does the actual IO)
- when the IO is finished the IORequest is returned (ReplyMsg)
- DoIO's WaitIO call wakes up, and DoIO returns
- the filesystem checks for an IO error and checks io_Actual to see whether the write was successful
- the filesystem's ACTION_WRITE sets dp_Res1 to 1 (1 byte written) and dp_Res2 to 0, and sends the DosPacket back to the caller (DoPkt) with PutMsg()
- DoPkt's Wait wakes up, DoPkt fetches the reply DosPacket with GetMsg, moves dp_Res1 to d0 and dp_Res2 to pr_Result2 (IoErr), and returns
- Write returns with 1 byte written
- fputc returns with 1 byte written
The above sequence is for writing a single byte without local buffering. It involves several task switches (scheduling) and waits. Altogether it is very time consuming and will kill performance, regardless of caches.
Now, if you put the caching in the filesystem, the sequence gets a lot shorter:
- fputc calls Write(fh, &ch, 1); to write the char
- Write calls DoPkt with ACTION_WRITE
- DoPkt sets up a DosPacket with ACTION_WRITE and parameters for the write
- DoPkt sends the DosPacket to the filesystem MsgPort with PutMsg and Waits for the reply
- the filesystem wakes up from Wait() and fetches the DosPacket with GetMsg()
- the filesystem determines that the DosPacket is ACTION_WRITE and processes it
- the filesystem's ACTION_WRITE updates the current block of the file in the cache
- the filesystem's ACTION_WRITE sets dp_Res1 to 1 (1 byte written) and dp_Res2 to 0, and sends the DosPacket back to the caller with PutMsg()
- DoPkt's Wait wakes up, DoPkt fetches the reply DosPacket with GetMsg, moves dp_Res1 to d0 and dp_Res2 to pr_Result2 (IoErr), and returns
- Write returns with 1 byte written
- fputc returns with 1 byte written
It still involves several task switches (scheduling) and waiting.
Now, let's put the cache in fputc:
- fputc puts the char into the local buffer
- fputc returns with 1 byte written
No task switching is involved. Depending on the buffering mode, only a full buffer or a linefeed will actually cause a flush of the cache (Write).
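The difference can be sketched in plain C. This is only an illustration under stated assumptions: my_stream, my_fputc, my_fputc_unbuffered, my_fflush and low_level_write are hypothetical names, BUF_SIZE is an arbitrary choice, and low_level_write merely counts calls where a real libc would go through Write() and the whole packet round trip listed above.

```c
#include <stddef.h>

#define BUF_SIZE 512  /* assumed buffer size, for illustration only */

/* Hypothetical stream: a local buffer plus a counter of how often the
   expensive lower-level write would have been invoked. */
struct my_stream {
    unsigned char buf[BUF_SIZE];
    size_t used;
    unsigned long low_level_writes;
};

/* Stand-in for Write(fh, data, len): it only counts the call. */
static void low_level_write(struct my_stream *s, const void *data, size_t len)
{
    (void)data;
    (void)len;
    s->low_level_writes++;
}

/* Push any buffered bytes down to the lower level. */
static void my_fflush(struct my_stream *s)
{
    if (s->used > 0) {
        low_level_write(s, s->buf, s->used);
        s->used = 0;
    }
}

/* Fully buffered put: normally just a store into the local buffer;
   the lower level is only reached when the buffer is already full. */
static int my_fputc(int ch, struct my_stream *s)
{
    if (s->used == BUF_SIZE)
        my_fflush(s);
    s->buf[s->used++] = (unsigned char)ch;
    return (unsigned char)ch;
}

/* Unbuffered put: every single char goes to the lower level. */
static int my_fputc_unbuffered(int ch, struct my_stream *s)
{
    unsigned char c = (unsigned char)ch;
    low_level_write(s, &c, 1);
    return c;
}
```

With these assumptions, writing 1024 chars through my_fputc followed by one my_fflush reaches the lower level twice, while my_fputc_unbuffered reaches it 1024 times. Line buffering would add a flush on linefeed, which this sketch leaves out.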
If you still fail to see my point, I can't really help it.
so the OS is sensibly not writing to the hardware till the cylinder or track is full
Only because you write full tracks. If you did small writes, it would rewrite the same track several times.