Amiga.org
Amiga Software News => Topic started by: platon42 on October 14, 2007, 11:56:41 PM
-
TLSFMem is an implementation of a very new memory allocation system called TLSF (Two-Level Segregated Fit). TLSF was described in a 2005 paper by the three Italian researchers M. Masmano, I. Ripoll and A. Crespo. Originally designed for real-time operating systems, it performs all allocation and free operations in constant time (O(1)). This is a major improvement over the original AmigaOS memory system, which gets slower as memory becomes fragmented (O(m), where m is the number of fragments).
Moreover, the old AmigaOS allocator uses a first-fit strategy, which causes memory to fragment pretty quickly. TLSF is an exact-fit allocator for memory blocks smaller than 512 bytes and a good-fit allocator for all other sizes: it will always find a free block that is smaller than 103% of the requested size.
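To make the "two-level" idea concrete, here is a minimal C sketch of how a request size could be mapped to a size class. It is not TLSFMem's code (which is 68k assembly and not published). The 16-byte granularity comes from later in this thread; the 32 second-level subdivisions are an assumption, chosen because 1/32 is roughly 3% and so matches the 103% figure. All names (`map_size`, `tlsf_index`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    GRAN_BITS   = 4,                     /* assumed: sizes are multiples of 16 bytes */
    GRANULARITY = 1 << GRAN_BITS,
    SL_BITS     = 5,                     /* assumed: 32 second-level classes         */
    SL_COUNT    = 1 << SL_BITS,
    SMALL_SHIFT = GRAN_BITS + SL_BITS,   /* log2(512)                                */
    SMALL_LIMIT = 1 << SMALL_SHIFT       /* below 512 bytes: one class per size      */
};

struct tlsf_index {
    unsigned fl;   /* first level: which power-of-two range      */
    unsigned sl;   /* second level: which slice inside the range */
};

/* Index of the highest set bit. The loop is bounded by the word width;
 * a real implementation would use a single instruction or a lookup table. */
static unsigned highest_bit(size_t v)
{
    unsigned bit = 0;
    while (v >>= 1)
        bit++;
    return bit;
}

/* Round a size up to the next multiple of GRANULARITY. */
static size_t round_up(size_t size)
{
    return (size + GRANULARITY - 1) & ~(size_t)(GRANULARITY - 1);
}

/* Map a request size to its (first level, second level) class. */
static struct tlsf_index map_size(size_t size)
{
    struct tlsf_index idx;

    assert(size > 0);
    size = round_up(size);

    if (size < SMALL_LIMIT) {
        /* Small sizes: every multiple of 16 has its own class (exact fit). */
        idx.fl = 0;
        idx.sl = (unsigned)(size / GRANULARITY);
    } else {
        /* Larger sizes: each power-of-two range is cut into 32 equal slices,
         * so a slice is at most 1/32 (about 3%) of the sizes it holds. */
        unsigned msb = highest_bit(size);
        idx.fl = msb - SMALL_SHIFT + 1;
        idx.sl = (unsigned)(size >> (msb - SL_BITS)) & (SL_COUNT - 1);
    }
    return idx;
}
```

Under these assumptions a 100-byte request is rounded to 112 bytes and lands in class (0, 7), while a 4224-byte request lands in class (4, 1), the second 128-byte-wide slice of the 4096..8191 range.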
TLSFMem is blindingly fast and will reduce memory fragmentation significantly!
TLSFMem was written in optimized assembly language, but more importantly, it uses these clever constant-time algorithms.
This software comes as freeware, but as I spent a lot of time developing it, you are welcome to donate something.
Download it here (http://www.platon42.de/cgi-local/navbar.pl?0000&download.html#tools).
Chris Hodges
-
This sounds great!!!
So Chris, is this useful in any implementation of the AmigaOS?
classic hardware running 3.x? 2.x?
Amiga Forever? UAE?
Amithlon?
or is it mostly useful in only some cases of the above?
-
Hooray!!!
But is it any better than MemOptimizer?
Or Poolmem?
-
@JosephC
It should be better than both of those. MemOptimizer and PoolMem don't actually replace the memory allocation routines, they just try to reduce the fragmentation.
-
What about the new memory system in OS4? How does it compare with this new system?
Varthall
-
The OS4 memory system claims O(1) performance (*), and an inefficiency of under 112.5%. So this new memory system is seemingly a little more efficient (103%), but not enough to be worth rewriting the whole OS4 memory subsystem for.
(* = I disagree with this, and most likely would disagree with the magical O(1) claim of this new allocator too (once I get around to reading how it works). In OS4's case, you only get O(1) performance if there is a mix of allocations & deallocations, otherwise you'll likely get O(n) performance (but should get O(log(n)) performance if they changed the implementation).)
Still, this is a great thing to see for OS3.x! :-)
-
@Piru
I thought that PoolMem patched AllocMem() to allocate memory in a less fragulated way?
-
Is there any way to rewrite MuForce, MuGuardianAngel, MungWall, etc. to work with the new ChrisHodges(tm) TLSF patch?
-
> Is there any way to rewrite MuForce, MuGuardianAngel, MungWall, etc. to work with the new ChrisHodges(tm) TLSF patch?
I don't see any reason why MuForce and MungWall wouldn't work with this utility.
MuGuardianAngel is only a pointless exercise.
-
What a pleasant surprise from Platon42!! :-) Thank you!
I'm using MemOptimizer currently and i will test this patch, asap.
@platon42
From the docs:
> There is a bug in all known versions of the scsi.device (both IDE and NCR
> SCSI). The devices will allocate an IORequest structure (32 bytes big) and
> then use it as IOStdReq structure (which is 48 bytes big), overwriting
> innocent memory past the 32 allocated bytes. By chance, this seems to have
> no noticeable effect on the standard AmigaOS allocator, but immediately
> kills TLSFMem, as a vital pointer in its internal structures is
> overwritten. Don't get me wrong: This is a bug in the scsi.device and that
> TLSFMem triggers it makes it no bug of TLSFMem.
Hey, do you think that this bug could explain the (urg!..) partition trashing experiences that i had with PoolMem and AllocP?? (No problems with MemOptimizer, so far)
What are the expected consequences of this? I do know that you provide a bug-fix, but i still would like to know.
Edit:
Oh, another (important, i believe) question please: Many ppl like me are using the IDEFix package which patches/replaces the standard scsi.device. What's the case with this one? Is there a need for a fix? Which one from the ScsiBugfix directory?
-
Forgive me for being a bit dense, but I don't quite get it. What does this do to my system? Would I be correct in thinking that it makes memory access faster? Significantly?
--
moto
-
@ChrisH
> I disagree with this, and most likely would disagree with the magical O(1)
> claim of this new allocator too (once I get around to reading how it
> works). In OS4's case, you only get O(1) performance if there is a mix of
> allocations & deallocations, otherwise you'll likely get O(n) performance
> (but should get O(log(n)) performance if they changed the
> implementation).)
You get O(1) guaranteed for each single operation (except for AllocAbs,
which will likely be O(1)). Not O(n), not O(n log n), but O(1). Worst
case. Every case. No mix of operations needed. There are no loops in the
code (except for looping over the existing memory headers, see docs for
detail).
The "inefficiency" of 103% is about the good-fit properties; it
doesn't waste memory (but I don't know about the OS4 allocator).
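How a lookup can avoid walking any list can be sketched in C with two levels of bitmaps, as the TLSF paper describes in general terms. This is an illustrative sketch only, not TLSFMem's code: the 32x32 class layout and all names (`class_map`, `find_class`, ...) are assumptions, and the caller is assumed to have already rounded the request up to the start of a size class.

```c
#include <assert.h>
#include <stdint.h>

enum { FL_COUNT = 32, SL_COUNT = 32 };   /* assumed class layout */

struct class_map {
    uint32_t fl_bitmap;             /* bit f set: first level f has a non-empty class */
    uint32_t sl_bitmap[FL_COUNT];   /* bit s set: class (f, s) holds a free block     */
};

/* Position of the lowest set bit, or -1 for zero. Five fixed steps, no loop;
 * real code would typically use a single CPU instruction instead. */
static int lowest_set_bit(uint32_t v)
{
    int n = 0;
    if (v == 0) return -1;
    if ((v & 0x0000FFFFu) == 0) { n += 16; v >>= 16; }
    if ((v & 0x000000FFu) == 0) { n += 8;  v >>= 8;  }
    if ((v & 0x0000000Fu) == 0) { n += 4;  v >>= 4;  }
    if ((v & 0x00000003u) == 0) { n += 2;  v >>= 2;  }
    if ((v & 0x00000001u) == 0) { n += 1; }
    return n;
}

/* Record that class (fl, sl) now contains at least one free block. */
static void mark_nonempty(struct class_map *m, unsigned fl, unsigned sl)
{
    assert(fl < FL_COUNT && sl < SL_COUNT);
    m->sl_bitmap[fl] |= (uint32_t)1 << sl;
    m->fl_bitmap     |= (uint32_t)1 << fl;
}

/* Find the smallest non-empty class at or above (fl, sl).
 * Returns 1 and fills *out_fl / *out_sl, or 0 if nothing is large enough. */
static int find_class(const struct class_map *m, unsigned fl, unsigned sl,
                      unsigned *out_fl, unsigned *out_sl)
{
    uint32_t sl_candidates;

    assert(fl < FL_COUNT && sl < SL_COUNT);

    /* Same first level: keep only classes at or above sl. */
    sl_candidates = m->sl_bitmap[fl] & (~(uint32_t)0 << sl);
    if (sl_candidates == 0) {
        /* Nothing there: jump to the next non-empty first level, if any. */
        uint32_t fl_candidates = 0;
        if (fl + 1 < FL_COUNT)
            fl_candidates = m->fl_bitmap & (~(uint32_t)0 << (fl + 1));
        if (fl_candidates == 0)
            return 0;
        fl = (unsigned)lowest_set_bit(fl_candidates);
        sl_candidates = m->sl_bitmap[fl];
    }
    *out_fl = fl;
    *out_sl = (unsigned)lowest_set_bit(sl_candidates);
    return 1;
}
```

The work done is a fixed number of mask and bit-scan steps regardless of how many free blocks exist, which is the sense in which the lookup is O(1).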
-
> But is it any better than MemOptimizer?
> Or Poolmem?
Yes.
-
> Is there any way to rewrite MuForce, MuGuardianAngel, MungWall, etc. to work with the new ChrisHodges(tm) TLSF patch?
MuForce magically seems to circumvent the patches (calling the original ones (however it gets their pointers)), and thus will immediately cause havoc on attempting to free TLSF allocated memory.
> I don't see any reason why MuForce and MungWall wouldn't work with this utility.
MungWall trashes the freed memory, but TLSF stores administration data in the freed memory. I still could add intrinsic MungWall functionality to TLSFMem.
-
> Hey, do you think that this bug could explain the (urg!..) partition trashing experiences that i had with PoolMem and AllocP?? (No problems with MemOptimizer, so far)
> What are the expected consequences of this? I do know that you provide a bug-fix, but i still would like to know.
Could be a result indeed. Whatever application gets access to this memory by allocating it first and using it could cause all kinds of weird behaviour -- and as the important fields (io_Data, io_Offset and io_Length) are in the back part of the structure, partition trashing could occur.
> Oh, another (important, i believe) question please: Many ppl like me are using the IDEFix package which patches/replaces the standard scsi.device. What's the case with this one? Is there a need for a fix? Which one from the ScsiBugfix directory?
I don't think that the IDEFix drivers are derived from the Commodore ones, hence they should not have this bug.
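The 32-versus-48-byte overrun described above can be checked with a little arithmetic. The C sketch below does not use the real exec headers; it lists the field sizes of the standard I/O request as they are laid out on a 68k Amiga (4-byte pointers, 20-byte message header) and sums them on any host. The helper names (`layout_size`, `field_offset`, `lies_past_iorequest`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct field {
    const char *name;
    size_t      size;   /* size in bytes on a 68k Amiga */
};

/* Fields of struct IOStdReq in order; the first six make up struct IORequest. */
static const struct field io_std_req[] = {
    { "io_Message", 20 },   /* embedded struct Message    */
    { "io_Device",   4 },
    { "io_Unit",     4 },
    { "io_Command",  2 },
    { "io_Flags",    1 },
    { "io_Error",    1 },   /* struct IORequest ends here */
    { "io_Actual",   4 },
    { "io_Length",   4 },
    { "io_Data",     4 },
    { "io_Offset",   4 },
};

enum { IOREQUEST_FIELDS = 6, IOSTDREQ_FIELDS = 10 };

/* Total size of the first field_count fields. */
static size_t layout_size(size_t field_count)
{
    size_t total = 0;
    size_t i;

    assert(field_count <= IOSTDREQ_FIELDS);
    for (i = 0; i < field_count; i++)
        total += io_std_req[i].size;
    return total;
}

/* Byte offset of a named field, or -1 if the name is unknown. */
static long field_offset(const char *name)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < IOSTDREQ_FIELDS; i++) {
        if (strcmp(io_std_req[i].name, name) == 0)
            return (long)offset;
        offset += io_std_req[i].size;
    }
    return -1;
}

/* 1 if the field starts beyond a 32-byte IORequest allocation, else 0. */
static int lies_past_iorequest(const char *name)
{
    long offset = field_offset(name);

    assert(offset >= 0);
    return (size_t)offset >= layout_size(IOREQUEST_FIELDS);
}
```

With these sizes the two structures come out at 32 and 48 bytes, and io_Length, io_Data and io_Offset sit at offsets 36, 40 and 44, so every write to them lands in the 16 bytes after the allocation, in memory that may belong to someone else.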
-
> Forgive me for being a bit dense, but I don't quite get it. What does this do to my system? Would I be correct in thinking that it makes memory access faster? Significantly?
It doesn't speed up memory access -- this is hardware bound. But programs that use the exec memory functions for reserving and freeing dynamic memory will benefit from the new routines. Significantly.
For example, when I run my backup script on my harddisk, which will use lha to scan a partition with >80000 files, it will create over 5000 memory fragments of 8 bytes size. This sucks and makes my system *crawl*. In the past, I just called AllocFrags at regular intervals which will remove the 8 byte fragments, but after the backup process, my memory is a piece of swiss cheese (and lha needs about a minute to release its memory).
With TLSFMem, this fragmentation is gone, and these tens of thousands of memory operations are sped up significantly.
-
@platon42
Thanks for the reply.
I have found a serious problem with your patch:
Every WOS and PUP program I have tried so far refuses to work when TLSFMem is installed. :-(
Anyone else have this problem?
-
Sounds great! Thanks for this. So would a good test of performance increase be to time how long it takes to create a huge lha file, then use your patch and do the same thing again?
--
moto
-
On my standard A4000/25 (+ DVD and Masoboshi 8 MB RAM card) I get these promising results:
sysspeed vs Ak4_040
LhaCrunch 6.36 (7.95)
LhaTest (=)
LhaDeCrunch 0.76 (1.10)
XPK 14.48 (15.52)
XPKDe 2.14 (2.18)
PPCrunch 13.62 (13.96)
PPDe 0.50 (0.80)
Yes, my system was already optimized, but even so I saw a gain of about 10%.
I also noticed a VERY BIG gain in the intuition library, especially here:
movewin16 40 (18) ((35))
movewin256 27 (3) ((17)) !!! UAU !!!
So, this is GREAT. :rtfm:
-
Are the sources included in the archive?
Dammy
-
I want this patch to become the standard.
But I don't see how that will happen when it can't be run with debugging software such as MuGuardianAngel.
What can we do to make MuGuardianAngel cooperate with TLSFMem?
Please someone must have the answer!
-
@platon42
> Regarding the "inefficiency" of 103% is about the good fit properties, it
> doesn't waste memory (but I don't know about the OS4 allocator).
But the "good fit properties" of 103% means that it uses (up to) 3% more memory than necessary, so it *is* an inefficiency. OS4 has a similar wastage of (up to) 12.5%. BTW, the proper name for this is "internal fragmentation", unless I misunderstand you.
-
@motorollin
> Forgive me for being a bit dense, but I don't quite get it. What does this do to my system?
For some programs it will make NO difference, but for those that allocate lots of (usually small) chunks of memory, it could make a massive difference:
For my own (very old) FolderSync program, I was able to gain a 70 times speed-up by writing my own custom memory allocation system. The reason is that (a) AmigaOS 3.x has a very old & slow memory allocation system, and (b) FolderSync made very many (small) memory allocations.
-
> I have found a serious problem with your patch:
> Every WOS and PUP program I have tried so far refuses to work when TLSFMem is installed.
So.. Is there anyone with a PPC board who confirms this?
-
Interesting, last time I was using my A1200 heavily I'd have to keep running a util to defrag and reclaim the RAM, chip RAM in particular used to lose 200K-400k or so pretty quickly. Think the util I was using was called "memory hogs" or something like that.
Now if you want a perfect memory allocation system, how about if it was divided up into 64KB banks, and each program was given its own 64KB bank. If you had 10 banks that would be 640K which ought to be enough for anyone.. :-P
(Just kidding of course)
-
This sounds like one of those rare bits of brilliance.
I'm keen to get this going as the only reason my A4000/060/CS-Mk3 reboots these days is when memory fragmentation slows it down after 20-30 days uptime.
I can get it to run by booting with no startup, but tools like VirusZ report bus errors. I could live with that, but later in the bootup something on loading Opus5 causes a guru. I can't see anything significant in WBStartup.
Has anyone encountered this? Is there somewhere else I should go for support where I can post better diagnostics?
My best guess, since the bus errors start after SetPatch, is that perhaps it doesn't like this (I can't find a copy of the Phase5 libs just now to test):
> version libs:68060.library full
68060.library 40.18 (12/11/2001)
(c) 1999-2001 The MMU.lib development group, THOR
-
> Are the sources included in the archive?
No. I don't think that ~2000 lines of 68k-asm would be helpful for AROS either. There is that paper and at least two open source implementations of TLSF in C on the internet.
-
> But I don't see how that will happen when it can't be run with debugging software such as MuGuardianAngel.
Well, developers will usually have their "debugging setup". I'm not running MuGuardianAngel during the normal system operation :)
> What can we do to make MuGuardianAngel cooperate with TLSFMem?
MuGuardianAngel *replaces* the exec.library functions, just as TLSFMem does. But as it does, it will bork because the old memory headers are nearly empty (thus no free memory) and a free memory check on the TLSF memory headers will cause a (recoverable) guru, because it doesn't contain any chunks, but the MH_FREE field is not 0. Also I would expect MuGuardianAngel to return MMU page aligned allocations to protect them from harm -- this is currently not supported in TLSFMem.
-
> But the "good fit properties" of 103% means that it uses (up to) 3% more memory than necessary, so it *is* an inefficiency. OS4 has a similar wastage of (up to) 12.5%. BTW, the proper name for this is "internal fragmentation", unless I misunderstand you.
No, that's exactly NOT what it means. Good fit means that *if* there is a memory block of at least that size category, it will find it in O(1) -- otherwise the next bigger block is returned (also in O(1)). If the sizes are exactly the same, it will allocate and eat it; if the free block was bigger, it will split it accordingly AND NO MEMORY IS WASTED. The 103% only means that a "size category" is between 100% and 103% of the requested size, and it will not distinguish between a set of different free blocks within a category. I suggest having a look at the paper describing TLSF.
Good fit means that it will find a free block that has a size /close/ to the requested size. Best/exact fit would mean the algorithm would have to search *all* the free memory chunks for a block that fits exactly.
In any case, if there's only a bigger chunk of free memory than requested, the memory block is split, and the remaining block is still available to the system. No memory is wasted.
Thus TLSFMem does not have internal fragmentation, except that alignment and allocation sizes need to be multiples of 16 bytes. If the OS4 allocator has internal fragmentation and if it's really up to 12.5%, it sucks.
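The split described above can be put into numbers with a small C sketch. It only models the bookkeeping (no real memory is touched) and assumes the 16-byte granularity mentioned in this post; the names `split_block`, `split_result` and `internal_waste` are hypothetical and not taken from TLSFMem.

```c
#include <assert.h>
#include <stddef.h>

enum { GRANULARITY = 16 };   /* allocation sizes are multiples of 16 bytes */

struct split_result {
    size_t allocated;   /* bytes handed to the caller (request rounded up) */
    size_t remainder;   /* bytes returned to the free lists as a new block */
};

/* Round a size up to the next multiple of GRANULARITY. */
static size_t round_up_to_granularity(size_t size)
{
    return (size + GRANULARITY - 1) & ~(size_t)(GRANULARITY - 1);
}

/* Carve a request out of a free block that is at least large enough.
 * Whatever is not needed stays available as a smaller free block. */
static struct split_result split_block(size_t free_block_size, size_t requested)
{
    struct split_result r;
    size_t need = round_up_to_granularity(requested);

    assert(free_block_size % GRANULARITY == 0);
    assert(need <= free_block_size);

    r.allocated = need;
    r.remainder = free_block_size - need;
    return r;
}

/* The only bytes lost inside an allocation are the rounding bytes. */
static size_t internal_waste(size_t requested)
{
    return round_up_to_granularity(requested) - requested;
}
```

For example, a 1000-byte request served from a 1040-byte free block takes 1008 bytes and leaves a 32-byte free block behind; the 8 rounding bytes are the only loss, whichever size category the block was found in.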
-
> For my own (very old) FolderSync program, I was able to gain a 70 times speed-up by writing my own custom memory allocation system. The reason is that (a) AmigaOS 3.x has a very old & slow memory allocation system, and (b) FolderSync made very many (small) memory allocations.
Isn't that what memory pools, introduced with OS 2.0, are for? Small allocations, pooled into puddles? Unfortunately, there are still programs that don't use them, way past the OS 2.0 release date...
-
> I have found a serious problem with your patch:
> Every WOS and PUP program I have tried so far refuses to work when TLSFMem is installed.
> So.. Is there anyone with a PPC board who confirms this?
Seems to be true. I guess PUP and WarpOS have their own allocation routines for PPC, which are of course not patched with TLSFMem, and as TLSFMem grabs most available memory, they will at least suffer from "out of memory" problems.
-
> Isn't that what memory pools, introduced with OS 2.0 is for? Small allocations, pooled into puddles? Unfortunately, there are still programs, that don't use them, way past the OS 2.0 release date...
Bleugh, memory pools are disgusting! You have to manually manage them, rather than having it automatically allocate/choose automatically sized 'pools' for you. i.e. mine simply replaced the existing AllocMem() routines, like yours does (but only for my program) - no hassle.
> If the OS4 allocator has internal fragmentation and if it's really up to 12.5%, it sucks.
Well, that is a lot better than all previous memory allocators, the fragmentation can never exceed 12.5%, and it is apparently used by Linux & other modern OSes.
P.S. Yes, I will look at the TLSF papers when I have time (already downloaded). Pity you didn't link to the paper, as it took a while to find (most places wanted payment!).
-
> No. I don't think that ~2000 lines of 68k-asm would be helpful for AROS either. There is that paper and at least two open source implementations of TLSF in C on the internet.
No help to AROSm68K (http://thenostromo.com/teamaros2/?number=23) Devs?
Dammy
-
@platon42
A pity, because according to AllocatorBenchmark your patch is superior to the others. :-(
-
I have two questions that maybe someone can help me answer:
Q1: When I try to make a back-up of my HD by using MakeCD to copy my hundreds of files, it seems to hang, not doing anything. Is this a memory problem that TLSF can resolve, or is it some other problem?
Q2: I've got the Elbox Fast ATA 1200 MK-III 32-bit High Speed ATA-2 EIDE Controller for the A1200 & the Elbox 4000 MK III for the A4000. Would it be safe to use it with these hardware devices and partitioned HDs without any problems, or will I have to find out for myself?
:-?
-
A1: Probably yes. I had the same problem on my system, and both PoolMem and TLSFMem solved it for me.
A2: Should be 100% safe.
-
I finally got around to trying TLSFMem on my OS3.9.2 WinUAE system, and sadly it crashes my system *immediately*, even when I disable User-Startup (which has a lot of patches). Which is a pity, because I was hoping it would have been a superior alternative to my own (program specific) fast memory allocator...
-
@ChrisH
Check your Startup-Sequence for anything that might be patching your memory allocation routines. MCP? PoolMem? MemSniff? MungWall? Ixemul?