
Author Topic: NetSurf 3.6 web browser released!  (Read 19456 times)

Offline olsen

Re: NetSurf 3.6 web browser released!
« on: November 21, 2016, 09:43:33 AM »
Quote from: chris;816638
It's the price of eliminating memory fragmentation, I guess.


I think that this warrants a thorough analysis. This is the first time the new memory management system has been used, and we barely know enough about its real-life performance, let alone the constraints under which it operates.

I do not know what causes this slowdown, and I did not expect it either. Allocating and releasing memory should require (roughly) the same amount of time, regardless of the size of the allocation (as long as it can be covered by the allocator's page size). There must be more to it.

Quote
I believe that's as fast as it can be, any optimisations will be ones in the browser.

The new memory management (it's called a "slab allocator") is good at keeping fragmentation at bay. But we have barely begun testing it; it's hardly even a week old ;)

There is plenty of room for "tuning" the new memory management. Tuning requires that data is collected on how the memory management is used, for later analysis. To this end there is an interface in the clib2 runtime library which NetSurf uses. It would be nice to allow for memory usage information to be collected through that interface, to be stored in log files.
« Last Edit: November 21, 2016, 09:46:56 AM by olsen »
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #1 on: November 21, 2016, 10:32:18 AM »
Quote from: chris;816647
You can send SLABSTATS to NetSurf's ARexx port to get it to dump them to the log.
Code:
NetSurf -V ram:ns.log
rx "address netsurf 'slabstats'"


Great! I'd like to see samples of these log files, if possible.

Quote
I did find with the log enabled NetSurf was crashing, but I don't think that's related to the stats collection.  I might see about modifying this so the stats are stored in an ARexx stem variable.

I think log files should be stored in a format which makes automated analysis easy, e.g. JSON. I'll work something out.

Quote
This isn't in the 3.6 build as I added it later, I'll need to upload a new test version but I can't do that today.

I did notice that we get thousands of allocations of 32 bytes, and not much above 2K.  I reduced the slab size to 2K (again, after this 3.6 build) and the timings don't look much different.  v3.6 has the slab set at 8K.

Thousands of 32-byte allocations translate into hundreds of slabs. Worst case, an allocation requires the traversal of the entire slab list, only to find that a new slab needs to be added. Deallocation likewise would (worst case) require a traversal of the entire list.

I think I ought to beef up the (not terribly sophisticated) slab statistics some more, so that the number of slab list items checked before an allocation/deallocation is performed is counted.

As for the list traversal effort, that says "hash table" to me :)
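
To illustrate the traversal cost described above, here is a minimal sketch of a linear search through a slab list which also counts how many list items were checked. The structure and function names are hypothetical and not taken from clib2; the real slab descriptor carries more fields.

```c
#include <stddef.h>

/* Hypothetical slab list node: only the fields needed for the search. */
struct slab_node {
    struct slab_node *next;  /* next slab in the singly linked list */
    size_t free_chunks;      /* chunks still available in this slab */
};

/* Walk the list until a slab with a free chunk turns up.
 * *steps receives the number of list nodes inspected, which is the
 * figure an extended statistics counter would record.
 * Returns NULL if every slab is full, i.e. a new slab must be added. */
struct slab_node *find_slab_with_space(struct slab_node *head, size_t *steps)
{
    struct slab_node *node;

    *steps = 0;
    for (node = head; node != NULL; node = node->next) {
        (*steps)++;
        if (node->free_chunks > 0)
            return node;
    }
    return NULL;
}
```

In the worst case (all slabs full) the step count equals the length of the list, so with several hundred slabs every such allocation pays for several hundred comparisons.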
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #2 on: November 21, 2016, 01:23:01 PM »
Quote from: chris;816651
I'll copy the file I have somewhere useful this evening.

Thank you! Please send it to me via e-mail.

Quote
I was getting something like 800 slabs of 8K in total after visiting amiga.org.

I'd say that this number is far too high. Which chunk sizes are used most frequently? Scaling the respective slab size for these might be helpful and might work better in the long term than adding hash tables to speed up allocation/deallocation.
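
The question of which chunk sizes dominate could be answered with a simple histogram. The following is only a sketch: it assumes power-of-two size classes starting at 8 bytes, and the names NUM_CLASSES, size_class(), count_allocation() and busiest_class() are made up for illustration rather than taken from clib2.

```c
#include <stddef.h>

#define NUM_CLASSES 16  /* 8, 16, 32, ... bytes; an assumed range */

/* Index of the smallest power-of-two class (starting at 8 bytes) which
 * can hold 'size'. Sizes beyond the largest class map to the last index. */
size_t size_class(size_t size)
{
    size_t index = 0;
    size_t capacity = 8;

    while (capacity < size && index < NUM_CLASSES - 1) {
        capacity <<= 1;
        index++;
    }
    return index;
}

/* Record one allocation request in the histogram. */
void count_allocation(size_t histogram[NUM_CLASSES], size_t size)
{
    histogram[size_class(size)]++;
}

/* Return the index of the class which saw the most requests. */
size_t busiest_class(const size_t histogram[NUM_CLASSES])
{
    size_t best = 0;
    size_t i;

    for (i = 1; i < NUM_CLASSES; i++) {
        if (histogram[i] > histogram[best])
            best = i;
    }
    return best;
}
```

The busiest classes would be the first candidates for a larger slab size.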

Quote
Some sort of breakdown of size of allocations above the slab size might be useful too.

Yes, this information is completely obscure right now. I'll have to change the data structure which tracks these allocations so that it no longer relies upon AllocVec(). The size information should become part of the management data structure.
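
One way to make the size part of the management data is to store it in a small header in front of each large allocation. This is a sketch only: it uses malloc() as a stand-in for whatever the library calls underneath, and the names large_header, large_alloc(), large_size() and large_free() are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed in front of each large allocation.
 * The extra union members only pad the header so that the payload
 * which follows it stays suitably aligned. */
union large_header {
    size_t size;
    double align_double;
    void *align_pointer;
};

/* Allocate 'size' bytes and remember the size in the header. */
void *large_alloc(size_t size)
{
    union large_header *header;

    if (size > (size_t)-1 - sizeof(*header))
        return NULL;  /* adding the header would overflow */

    header = malloc(sizeof(*header) + size);
    if (header == NULL)
        return NULL;

    header->size = size;
    return header + 1;  /* payload starts right after the header */
}

/* Look up the size recorded for a block returned by large_alloc(). */
size_t large_size(const void *mem)
{
    const union large_header *header = (const union large_header *)mem - 1;
    return header->size;
}

/* Release a block returned by large_alloc(). */
void large_free(void *mem)
{
    if (mem != NULL)
        free((union large_header *)mem - 1);
}
```

With the size stored this way, a statistics pass can sum up or bucket the large allocations without asking the operating system.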
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #3 on: November 22, 2016, 08:26:54 AM »
Quote from: gregthecanuck;816699
@olsen

Has the slab allocator from OS4 been back-ported to 3.x? Is that what you are testing?

No, this is something very different and very much simpler in design. I was surprised that it worked so well after having spent barely a week building and testing it.

Both the AmigaOS4 slab allocator and the one which is part of clib2 (available at https://sourceforge.net/projects/clib2/ and https://github.com/adtools/clib2) are based upon the same 1994/2001 papers written by Jeff Bonwick, who introduced the concept. For reference I looked at the Linux and FreeBSD slab allocators, but then found that the slab allocator used by "memcached" was easiest to understand for me in the context described by the papers.

The papers which I read in addition to the two by Jeff Bonwick strongly supported the idea that using a slab allocator at the application level, e.g. linked against the NetSurf code, made good sense. It was not necessary to have the slab allocator in the operating system to get useful results.

A slab allocator at the operating system level needs to cater for caching requirements, it has to provide for proper data alignment, and it has to watch very carefully how much management overhead it deploys to herd the slabs. A slab allocator at the application level, linked against NetSurf, can ignore some of those constraints, making it simpler in design.

That's what landed in clib2, and I am still working on making it perform better. One week of design and implementation clearly is not sufficient ;)
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #4 on: November 22, 2016, 10:45:30 AM »
Quote from: gregthecanuck;816707
Nice. Thanks for the background information.

Had you considered TLSF?

No, I stopped searching for an alternative solution when the slab allocator appeared to do the job.

Quote
I recall a while ago there was some mud-slinging as to which was "better".

I'll have another look at TLSF. From what I could learn quickly, it is technically more complex than what a slab allocator can get away with. On the other hand, its management overhead (accounting data structures) appears to be much lower, and its smallest usable fragment size is also lower than what my own slab allocator allows for.

However, much of the overhead on the clib2 side comes about because clib2 now supports three different back-ends for allocating and managing memory. The management data structures are not strictly needed for the memory pool and slab allocator back-ends, so there is quite some room for improvement in clib2.

I doubt that I will be replacing the slab allocator in clib2 with TLSF any time soon. Instead, I'll work on improving the clib2 memory management system.
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #5 on: November 22, 2016, 01:31:10 PM »
Quote from: wawrzon;816713
according to wikipedia, the behaviour we are experiencing with slab, may be a known handicap, sorry its only on a german page:

https://de.wikipedia.org/wiki/Slab_allocator

The issues mentioned in this context appear to refer to the kernel implementation of the slab allocator. The kernel slab allocator should care about proper alignment of allocations, so as to avoid friction with multiple processors and non-uniform memory access.

The slab allocator in clib2 sidesteps these issues by mostly ignoring them. Unless I made a mistake, allocations are currently aligned to 64-bit word boundaries because of the chunk allocation granularity (this may change, though). No optimizations for multiprocessing or NUMA are needed.
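
How alignment can fall out of the allocation granularity may be easier to see in code. The sketch below assumes a granularity of 8 bytes (inferred from the 64-bit figure above); CHUNK_GRANULARITY and round_up_to_granularity() are illustrative names, not clib2 identifiers.

```c
#include <stddef.h>

#define CHUNK_GRANULARITY 8  /* assumed: 64 bits, must be a power of two */

/* Round a requested size up to the next multiple of the granularity. */
size_t round_up_to_granularity(size_t size)
{
    return (size + CHUNK_GRANULARITY - 1) & ~(size_t)(CHUNK_GRANULARITY - 1);
}
```

If every chunk size is a multiple of 8 and the first chunk of a slab starts on an 8-byte boundary, then every chunk in that slab starts on an 8-byte boundary as well, without any explicit alignment logic.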

Quote

the mentioned buddy allocator is afaik the default one in bernds ixemul library >6x.x, which may explain, why it performs well in comparison. maybe some more considerations should be spent on that issue?

From what I know buddy allocators are much more complex in operation than the slab allocator. For example, the highly configurable dlmalloc allocator, including documentation comments, is more than 5000 lines long.

In a buddy allocator effort is spent on merging chunks, and depending upon the order in which allocations are being made, buddies may not be released in the order which allows them to be merged, slowly increasing fragmentation over time. dlmalloc is designed to make best-fit allocations, as opposed to first-fit, which contributes to the complexity and the effort spent (first-fit is fast at the expense of quickly increasing fragmentation over time).

By comparison, a slab allocator can deliver both best-fit and first-fit performance at the same time without spending any effort on merging chunks. Furthermore, it can deliver this in nearly constant time, i.e. O(1), discounting that it has to obtain the pages which it manages from somewhere ;)
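
The constant-time claim can be made concrete with a stripped-down, single-slab sketch. It is not the clib2 implementation: it obtains its memory from malloc(), serves one chunk size only, and all names (free_chunk, slab, slab_init() and so on) are invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* A free chunk stores the link to the next free chunk in its own memory. */
struct free_chunk {
    struct free_chunk *next;
};

struct slab {
    void *memory;                  /* the block carved into chunks */
    struct free_chunk *free_list;  /* chunks currently available */
    size_t chunk_size;             /* size of each chunk in bytes */
};

/* Carve one block into 'chunk_count' chunks. Returns 0 on success. */
int slab_init(struct slab *s, size_t chunk_size, size_t chunk_count)
{
    char *base;
    size_t i;

    /* Each chunk must be able to hold (and align) a free list link. */
    if (chunk_size == 0)
        chunk_size = 1;
    chunk_size = (chunk_size + sizeof(void *) - 1) / sizeof(void *) * sizeof(void *);

    if (chunk_count == 0 || chunk_count > (size_t)-1 / chunk_size)
        return -1;

    base = malloc(chunk_size * chunk_count);
    if (base == NULL)
        return -1;

    s->memory = base;
    s->chunk_size = chunk_size;
    s->free_list = NULL;

    /* Thread all chunks onto the free list, lowest address first. */
    for (i = chunk_count; i > 0; i--) {
        struct free_chunk *c = (struct free_chunk *)(base + (i - 1) * chunk_size);
        c->next = s->free_list;
        s->free_list = c;
    }
    return 0;
}

/* O(1): pop the first free chunk, or return NULL if the slab is full. */
void *slab_alloc(struct slab *s)
{
    struct free_chunk *c = s->free_list;

    if (c != NULL)
        s->free_list = c->next;
    return c;
}

/* O(1): push the chunk back. No merging with neighbours is needed. */
void slab_free(struct slab *s, void *mem)
{
    struct free_chunk *c = mem;

    c->next = s->free_list;
    s->free_list = c;
}

/* Return the whole block to the system. */
void slab_destroy(struct slab *s)
{
    free(s->memory);
    s->memory = NULL;
    s->free_list = NULL;
}
```

Because every chunk in the slab has the same size, any free chunk is both the first fit and the best fit, which is why no searching or merging takes place.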
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #6 on: November 22, 2016, 01:33:18 PM »
Quote from: utri007;816715
TLSF didn't help with Netsurf.


Why did it fail?
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #7 on: November 22, 2016, 05:44:21 PM »
Quote from: utri007;816724
I don't know but it didn't help. There was tlsfmen.lha in aminet, wich is now deleted some reason. It was started from shell anytime according to the guide.

Ah, so none of the existing TLSF implementations were linked against NetSurf at any time? I'd say it would not be too difficult to adapt them to work with AmigaOS.

Patching the exec memory management API at runtime would not be quite the same as replacing the memory management for a single application. Once your exec patches have to work for all running programs, it becomes so much harder to diagnose problems and their causes. Your chances of discovery are much better if you can limit the scope.
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #8 on: November 22, 2016, 05:54:45 PM »
Quote from: utri007;816729
Slow down effect is gone, it was actual "show stopper". :) Now browser is actually usefull. Now it loads first page longer than it used to, but lets see if it gets faster.

I'm working on it, or rather, working on the slab allocator in clib2.

The next set of changes is already checked in. The goal is to speed up allocation and deallocation operations so that the number of "slabs" already in use no longer affects the time spent performing these operations.

Chris sent me a log file which detailed that at times there were more than 800 "slabs" in play. I expect that this number must have had a significant negative effect on overall performance.

Let's see how today's changes fare. Unfortunately, the changes currently result in higher memory usage figures. But I think I have a solution for that problem, too.
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #9 on: November 25, 2016, 01:09:48 PM »
Quote from: utri007;816861
I made a quick test, when I had my lunchbreak.

Congratulations again to all involved. :) It is much faster, amiga.org downloads now 28-30 seconds with my 68060 / 66mhz. Was 46-48 seconds. Still not that fast than it used to be20-22 seconds, but with memory fragemention problem it wasn't usefull.

Does the page load time improve if you immediately reload the page (assuming that it is not simply reloaded from the disk cache, but re-fetched from the web site)?

If so, then this may be significant. The "slab allocator" is referred to as a cache in the paper which introduced the concept. For a cache to be helpful it needs to already contain the data that is subsequently drawn from it ;)  Otherwise, the cache first needs to be organized and "primed". This process is sometimes called "warming up".

For the "slab allocator" to warm up it needs to allocate memory for the slabs which it will manage, chopping the slabs into (sometimes) hundreds of chunks. If you start this process cold, the "slab allocator" will first spend a lot of effort allocating memory and getting it ready for use.

This process could be sped up by preallocating slabs when NetSurf starts. However, it won't do just to allocate a bunch of slabs; one also needs to know which memory chunk sizes would be most appropriate.
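
The difference between a cold and a warmed-up cache can be sketched as follows. This is an illustration under simplifying assumptions, not clib2 code: chunks come straight from malloc(), there is one cache per chunk size, and the names chunk_cache, cache_warm_up(), cache_alloc() and cache_release() are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct chunk {
    struct chunk *next;
};

struct chunk_cache {
    size_t chunk_size;        /* payload size served by this cache */
    struct chunk *free_list;  /* chunks ready for immediate reuse */
    size_t cold_allocations;  /* requests which had to go to malloc() */
};

/* A chunk must at least be able to hold the free list link. */
static size_t raw_size(const struct chunk_cache *c)
{
    return c->chunk_size < sizeof(struct chunk) ? sizeof(struct chunk) : c->chunk_size;
}

/* Preallocate up to 'count' chunks; returns how many were added. */
size_t cache_warm_up(struct chunk_cache *c, size_t count)
{
    size_t added;

    for (added = 0; added < count; added++) {
        struct chunk *ch = malloc(raw_size(c));
        if (ch == NULL)
            break;
        ch->next = c->free_list;
        c->free_list = ch;
    }
    return added;
}

/* Serve from the free list if possible, otherwise allocate "cold". */
void *cache_alloc(struct chunk_cache *c)
{
    struct chunk *ch = c->free_list;

    if (ch != NULL) {
        c->free_list = ch->next;
        return ch;
    }
    c->cold_allocations++;
    return malloc(raw_size(c));
}

/* Keep a released chunk around for the next request of this size. */
void cache_release(struct chunk_cache *c, void *mem)
{
    struct chunk *ch = mem;

    if (ch == NULL)
        return;
    ch->next = c->free_list;
    c->free_list = ch;
}
```

Warming up only pays off if chunk_size matches what the program will actually request, which is why usage data has to come first.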
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #10 on: November 27, 2016, 09:24:10 AM »
Quote from: Primax;816887
Some little tests on my Amiga1200 with 1230/50 accelerator:
Netsurf took - with best configuration - 35,6 seconds to load amiga-news.de using the version before, now takes 31,8 seconds. Jumping to aminet.net (35 seconds) and back to amiga-news.de (31,1 seconds). Holding shift and reload it, Netsurf took 30,8 seconds. So I guess if I reload it ten times I can break the 20 seconds mark...:)

If you keep at it, at some point you'll be able to see web pages before you have decided to visit them, which leads to the possibility of being able to read next week's lottery numbers ;)

But seriously, the longer you use the web browser, the better the new memory management system will adapt. When you switch to a new web page, NetSurf has to break up the memory allocated for all the old page's components, then reuse it for the new page.

Imagine smashing up all your crockery after dinner, then glueing it all back together for the next meal. The fragments will become smaller and smaller over time, making it harder to find those which fit together well enough. Larger fragments need to be broken up into smaller pieces to make them fit with the rest.

That's exactly the problem which NetSurf has. The new memory management system helps here by sorting the fragments into bins from which they can be retrieved more easily without having to smash larger fragments. The longer you use it, the better the new memory management system will know which bin sizes are needed the most, and the more readily-available fragments it will keep at hand.
« Last Edit: November 27, 2016, 09:33:17 AM by olsen »
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #11 on: November 27, 2016, 03:10:03 PM »
Quote from: utri007;816943
How it will perform "low memory" situations? I was surfing with my A1200, 68040 and 32mb ram system after Netsurf is loaded there is about 11mb free ram. Ram was eated up quite fast, if surfing to other sites ie. amiga.org - > amigaworld.net -> wikipedia.org. There is no problem is just surfing one site, ie. amiga.org -> forums -> thread -> amiga.org


How much memory is being consumed needs to be measurable. To this end I just added new code to clib2 which produces machine-readable status information in JSON format. I hope that Chris will add support for it in NetSurf, so that snapshots of this status information could be saved and submitted for analysis.
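
For readers wondering what such machine-readable output might look like, here is a small sketch which formats a few counters as one JSON object. The structure and its field names (slab_size, num_slabs, bytes_in_use) are assumptions made for this example and do not reflect the actual output format of clib2.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical snapshot of allocator counters. */
struct slab_stats {
    unsigned long slab_size;     /* size of each slab in bytes */
    unsigned long num_slabs;     /* slabs currently allocated */
    unsigned long bytes_in_use;  /* memory handed out to the program */
};

/* Write the snapshot as a single JSON object into 'buf'.
 * Returns what snprintf() returns: the length the full text needs,
 * which is >= buf_size if the output was truncated. */
int format_slab_stats_json(char *buf, size_t buf_size, const struct slab_stats *stats)
{
    return snprintf(buf, buf_size,
                    "{ \"slab_size\": %lu, \"num_slabs\": %lu, \"bytes_in_use\": %lu }",
                    stats->slab_size, stats->num_slabs, stats->bytes_in_use);
}
```

One object per snapshot keeps the log easy to parse with standard JSON tools.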
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #12 on: November 27, 2016, 07:52:37 PM »
Quote from: chris;816946
@olsen

Added, but I don't think it is working properly: http://www.cy2.uk/tmp/ns-stats.json.gz


Something isn't right. strftime() appears to produce no output whatsoever in the archive, and the vsnprintf() function doesn't even convert %zu output correctly.

The code which prepares the JSON data for output, one line at a time, uses the clib2 vsnprintf() function. There are two tests for it in both the library and the "slab-test.c" program, and they both worked.
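
A quick way to check the "%zu" conversion in a given build is a small self-test along these lines. It is a sketch, not the test from "slab-test.c"; format_line() and zu_conversion_works() are names made up here.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format one line into 'buf' through vsnprintf(). */
int format_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int len;

    va_start(args, fmt);
    len = vsnprintf(buf, size, fmt, args);
    va_end(args);

    return len;
}

/* Returns 1 if "%zu" converts a size_t as C99 requires, 0 otherwise. */
int zu_conversion_works(void)
{
    char buf[32];
    int len = format_line(buf, sizeof(buf), "%zu", (size_t)8192);

    return len == 4 && strcmp(buf, "8192") == 0;
}
```

If this check passes when linked into the test program but fails inside NetSurf, a different vsnprintf() is being linked in.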

Does NetSurf replace vsnprintf() or strftime()? That's my only guess. The output in the archive went completely off the rails...
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #13 on: November 28, 2016, 10:28:13 AM »
Quote from: chris;816955
Ah, looks like strftime gets replaced. Before I change this, do you know if this comment is valid?

Code:
/* Although these platforms might have strftime or strptime they
 *  appear not to support the time_t seconds format specifier.
 */
Yes, this is still valid. clib2 does not support the strftime() "%s" conversion specifier. "%s" appears to be a Unix addition, and it is not part of the C99 specs. I could easily add it, though, if it is needed.

Looking at the source code of NetSurf 3.6, it appears that there is a workaround in place which performs the conversion with snprintf(). Makes you wonder why strftime() is used in the first place ;)
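
For illustration, printing a time_t as seconds since the epoch without strftime() "%s" can look like the sketch below. This is not the NetSurf code; format_epoch_seconds() is a hypothetical helper, and it assumes that the C library's snprintf() supports "%lld".

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Write 't' as a decimal number of seconds into 'buf'.
 * Returns what snprintf() returns. */
int format_epoch_seconds(char *buf, size_t size, time_t t)
{
    /* The cast avoids depending on the exact integer type behind time_t. */
    return snprintf(buf, size, "%lld", (long long)t);
}
```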

As for strptime(), this is not part of the C99 specs either, and quite complex to implement. This seems to be a Unix addition, too. Again, NetSurf has various workarounds for strptime() in place which do the job just fine.

NetSurf replaces neither vsnprintf() nor strftime(), as far as I can tell. There goes my theory as to what interferes with the JSON data generation. I'm really puzzled how this could go so spectacularly awry... Could you check the linker map? It should show exactly where the vsnprintf() and strftime() code comes from.

That said, it's not difficult to generate the same data using the __get_slab_allocations() and __get_slab_usage() API functions in clib2 if you wanted to implement it yourself. I just thought I'd save you the effort and put it into __get_slab_stats().
 

Offline olsen

Re: NetSurf 3.6 web browser released!
« Reply #14 on: November 28, 2016, 08:17:57 PM »
Quote from: chris;816993
@olsen
I'm afraid the map file means nothing to me, but it's here if you want to decode it: http://www.cy2.uk/tmp/map.txt.gz

Thank you, it's not that cryptic, it's just too long ;)

Here are the "interesting" bits (for given values of "interesting"):

-- 8< --

 .text          0x0056a724       0x90 /opt/netsurf/m68k-unknown-amigaos/cross/lib/gcc/m68k-unknown-amigaos/3.4.6/../../../../m68k-unknown-amigaos/lib/libm.a()
                0x0056a724                _vsnprintf

 .text          0x005886e0      0xe98 /opt/netsurf/m68k-unknown-amigaos/cross/lib/gcc/m68k-unknown-amigaos/3.4.6/../../../../m68k-unknown-amigaos/lib/libc.a(time_strftime.o)
                0x00589408                _strftime

-- 8< --

This says that both vsnprintf() and strftime() come from the libm.a and libc.a linker libraries, which are part of clib2, etc. It's odd that the reference to libm.a does not list the source file name (when did you last rebuild all these libraries from scratch?).

Anyhow, these library functions are not getting overridden by NetSurf, libcurl, libssl, libgcc, etc. code.

Which collapses another theory of mine.

What does the code look like which fails to produce the correct JSON data? Have you already checked it in?