
Author Topic: AROS SMP Research: Technical Discussion  (Read 33755 times)


Offline psxphill

Re: AROS SMP Research: Technical Discussion
« Reply #29 from previous page: August 23, 2013, 12:35:10 PM »
Quote from: itix;745940
Oh, btw... you have to consider that other CPUs can call Forbid() too, not just the first one. As more CPUs are added, the chance of being in a forbid state increases.

Tell me how you will find out what percentage of time various software will spend in Forbid() without trying it?
 
Quote from: itix;745940
The problem could be demonstrated using a silly ping-pong task that sends a message back and forth constantly. Because sending a message requires a forbid, that task could easily render the other cores useless. Even when running at low priority it would disrupt higher-priority tasks on other cores, due to the Forbid()/Disable() semantics.

You have always been able to write software for AmigaOS which disturbs high priority tasks. If it turns out that this is a problem that needs solving then you could try changing it so that the CPUs will only ever be running tasks that have the same priority. As high priority tasks are supposed to run for a short period of time, wasting the other cores during that time may not be a big deal. If you have a high priority task on AmigaOS that takes a long time then the system becomes unusable (standard priority tasks like Workbench won't be allowed to run at all). The priority in AmigaOS is quite fine grained (-128 to 127 IIRC), which means you could also derail this by using as many different priorities as possible. But there is no reason why software that can take advantage of SMP shouldn't have limitations (like all tasks that run at the same time having to run at the same priority).
 
I understand about the Forbid() overhead, if something spins in a Forbid()/Permit() call then it could cause problems. But is that something that any software should need/want to do? We pretty much have the source to all AROS software at this point & this isn't going to affect AROS 68k.
 
There is no reason why the number of cores in use at a time couldn't be dynamic, and when you're only using 1 core the Forbid()/Permit() overhead could be reduced to current levels. If the system can detect the situations where the SMP implementation will help and those where it will hurt, then you could always end up benefiting.
 
Technologies like Intel Turboboost benefit from only using as many cores as necessary, i.e. when you're only using 1 core it can boost the clock speed but when you're saturating all cpu cores then it drops back to the default (some chips can sustain constant boosting, but in a laptop you'd want to minimize it for power usage).
« Last Edit: August 23, 2013, 12:44:21 PM by psxphill »
 

Offline Terminills

  • Grand Conspirator
  • Hero Member
  • *****
  • Join Date: Jan 2003
  • Posts: 594
  • Country: 00
  • Thanked: 2 times
Re: AROS SMP Research: Technical Discussion
« Reply #30 on: August 23, 2013, 01:02:41 PM »
Quote from: wawrzon;745932
back to topic please. this is supposed to be technical discussion.

Good point.  Edited my post accordingly. :)
Support AROS sponsor a developer.

edited by mod: this has been addressed
 

Offline ChaosLord

  • Hero Member
  • *****
  • Join Date: Nov 2003
  • Posts: 2608
    • http://totalchaoseng.dbv.pl/news.php
Re: AROS SMP Research: Technical Discussion
« Reply #31 on: August 23, 2013, 01:37:06 PM »
The whole Forbid() Permit() problem only affects old software on the old AmigaOS.

New software written for a new OS, such as a New AROS or new MorphOS can use semaphores to access the various protected OS structures.

So new software on a new OS can make full use of multiple processors.

Someone just has to code up a SMP-friendly new AROS, right?

Then I can start coding up gamez that make use of multiple cores, right?
Wanna try a wonderfull strategy game with lots of handdrawn anims,
Magic Spells and Monsters, Incredible playability and lastability,
English speech, etc. Total Chaos AGA
 

Offline itix

  • Hero Member
  • *****
  • Join Date: Oct 2002
  • Posts: 2380
Re: AROS SMP Research: Technical Discussion
« Reply #32 on: August 23, 2013, 01:38:53 PM »
Quote from: psxphill;745951
Tell me how you will find out what percentage of time various software will spend in Forbid() without trying it?

Getting accurate results can be difficult, but it could be profiled. At least how many calls to Forbid() or Disable() there are per minute...

Quote
You have always been able to write software for AmigaOS which disturbs high priority tasks. If it turns out that this is a problem that needs solving then you could try changing it so that the CPUs will only ever be running tasks that have the same priority. As high priority tasks are supposed to run for a short period of time, wasting the other cores during that time may not be a big deal. If you have a high priority task on AmigaOS that takes a long time then the system becomes unusable (standard priority tasks like Workbench won't be allowed to run at all). The priority in AmigaOS is quite fine grained (-128 to 127 IIRC), which means you could also derail this by using as many different priorities as possible. But there is no reason why software that can take advantage of SMP shouldn't have limitations (like all tasks that run at the same time having to run at the same priority).

I know what you mean. There is hope that at least sometimes some software could run in parallel. Even if you get only +10% instead of +100% it is better than nothing. In the future bottlenecks could be removed one by one.

Quote

I understand about the Forbid() overhead, if something spins in a Forbid()/Permit() call then it could cause problems. But is that something that any software should need/want to do? We pretty much have the source to all AROS software at this point & this isn't going to affect AROS 68k.

Often Forbid() or Disable() is called indirectly. Take this example:

Code:
sillypseudocode()
{
   SetTaskPri(SysBase->ThisTask, -128);
   PutMsg(port, msg);

   /* bounce the same message back to our own port forever */
   while (true)
      PutMsg(port, GetMsg(port));
}

PutMsg() and GetMsg() have a hidden Disable(), but that is fine because on a single-core system high-priority tasks get scheduled as soon as Enable() is called.

On multicore this would steal almost all available CPU time from every core.

A Disable()-free messaging system would solve this problem, but that would have implications for all software. Another solution could be to stop the scheduler from running lower-priority tasks on other cores, as you mentioned.
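
To put rough numbers on the multicore case, here is a back-of-envelope model rather than a measurement. It assumes that a system-wide Disable() stalls every other core for as long as it is held, and that the ping-pong task spends a fraction `f` of wall-clock time inside Disable(). The function name and the model itself are illustrative assumptions, not anything taken from exec.

```c
#include <assert.h>

/*
 * Hypothetical model: one task holds a system-wide Disable() for a
 * fraction `f` (0.0 .. 1.0) of wall-clock time, and every other core
 * stalls while it is held. Returns how much useful work the OTHER
 * cores can still do, in units of "one fully used core".
 */
double usable_other_cores(int ncores, double f)
{
    assert(ncores >= 1 && f >= 0.0 && f <= 1.0);

    /* Each of the remaining cores only makes progress outside Disable(). */
    return (ncores - 1) * (1.0 - f);
}
```

Under this model, with four cores and the ping-pong task inside Disable() half the time (`usable_other_cores(4, 0.5)`), the other three cores get 1.5 cores' worth of work done between them. The real figure depends on how long the hidden Disable() sections actually are, which is exactly what would have to be measured.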

Anyway, I just wanted to point out that the biggest culprit is the OS itself and write some silly example :)
My Amigas: A500, Mac Mini and PowerBook
 

Offline NorthWay

  • Full Member
  • ***
  • Join Date: Jun 2003
  • Posts: 209
Re: AROS SMP Research: Technical Discussion
« Reply #33 on: August 23, 2013, 01:47:56 PM »
Quote from: itix;745954
Getting accurate results can be difficult, but it could be profiled. At least how many calls to Forbid() or Disable() there are per minute...

I seem to remember some _old_ tool that counted OS calls of your choice.
(When I say old I am thinking Fred Fish age.)
 

Offline bloodline

  • Master Sock Abuser
  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 12114
    • http://www.troubled-mind.com
Re: AROS SMP Research: Technical Discussion
« Reply #34 on: August 23, 2013, 01:51:56 PM »
Quote from: itix;745954

Often Forbid() or Disable() is called indirectly. Take this example:

Code:
sillypseudocode()
{
   SetTaskPri(SysBase->ThisTask, -128);
   PutMsg(port, msg);

   /* bounce the same message back to our own port forever */
   while (true)
      PutMsg(port, GetMsg(port));
}




Not quite relevant, but for SillySMP (currently) SysBase->ThisTask doesn't work anymore and you have to use FindTask(NULL).

Offline warpdesign

  • Sr. Member
  • ****
  • Join Date: Feb 2008
  • Posts: 256
    • http://www.warpdesign.fr
Re: AROS SMP Research: Technical Discussion
« Reply #35 on: August 23, 2013, 01:53:27 PM »
@Itix: why do we need to halt multitasking by using Disable()/Enable()?
How do OSes that support real SMP work? I mean: what's the main difference between AmigaOS and a "modern" OS?
 

Offline bloodline

Re: AROS SMP Research: Technical Discussion
« Reply #36 on: August 23, 2013, 02:16:51 PM »
Quote from: warpdesign;745959
@Itix: why do we need to halt multitasking by using Disable()/Enable()?
How do OSes that support real SMP work? I mean: what's the main difference between AmigaOS and a "modern" OS?

One of the key problems with the AmigaOS design is the sheer amount of freedom to access system structures that it allows :) But there are also plenty of design decisions that never considered that the machine would ever have more than one CPU.

Offline psxphill

Re: AROS SMP Research: Technical Discussion
« Reply #37 on: August 23, 2013, 02:31:54 PM »
Quote from: ChaosLord;745953
New software written for a new OS, such as a New AROS or new MorphOS can use semaphores to access the various protected OS structures.

I don't think this should even be considered unless as an absolute last resort. Making that compromise without knowing what the benefits are would be a mistake.
 
Quote from: warpdesign;745959
@Itix: why do we need to halt multitasking by using Disable()/Enable()?

Blame Carl Sassenrath.
 
Quote from: warpdesign;745959
How do OSes that support real SMP work? I mean: what's the main difference between AmigaOS and a "modern" OS?

AmigaOS has a lot of design mistakes in it, which didn't matter so much on a games console from the early 1980s that was only going to be around for a few years.
 
Worrying about not being able to use every ounce of CPU power when using SMP is a mistake. Windows and Linux have high latency on a lot of their API calls.
 
 
Quote from: itix;745954
Getting accurate results can be difficult, but it could be profiled. At least how many calls to Forbid() or Disable() there are per minute...

The number of calls is not the metric you need. It's how long it spends in Forbid(). You could make one call and stay in Forbid() for 99% of the time, or 10 calls and only stay in Forbid() for 1% of the time. This affects how much of each CPU you'll lose. The overhead of stopping and starting each CPU would also need to be taken into account; however, this is even harder to calculate, because unless you've written the code and tested it you don't even know what the overhead will be. Plus, just counting instructions doesn't help, as modern CPUs are way too complex.
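
As a sketch of what "time spent in Forbid()" accounting might look like, the bookkeeping below records both the call count and the ticks spent inside the outermost Forbid()/Permit() pair. Everything here is hypothetical: the struct, the function names and the tick source are assumptions for illustration, and nothing like this exists in exec as far as this thread shows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; not an exec structure. */
struct forbid_stats {
    uint64_t calls;        /* outermost Forbid() calls seen        */
    uint64_t ticks_inside; /* total ticks spent with Forbid() held */
    uint64_t entered_at;   /* tick of the outermost Forbid()       */
    unsigned nest;         /* current nesting depth                */
};

/* Record a Forbid() at time `now` (any monotonic tick counter). */
void stats_forbid(struct forbid_stats *s, uint64_t now)
{
    if (s->nest++ == 0) {
        s->calls++;
        s->entered_at = now;
    }
}

/* Record the matching Permit() at time `now`. */
void stats_permit(struct forbid_stats *s, uint64_t now)
{
    assert(s->nest > 0);
    if (--s->nest == 0)
        s->ticks_inside += now - s->entered_at;
}

/* Percentage of `elapsed` ticks spent inside Forbid(). */
unsigned stats_percent_inside(const struct forbid_stats *s, uint64_t elapsed)
{
    return elapsed ? (unsigned)(s->ticks_inside * 100 / elapsed) : 0;
}
```

Fed one Forbid() held for 990 of 1000 ticks, this reports 1 call and 99%; fed ten Forbid() sections of 1 tick each over the same 1000 ticks, it reports 10 calls and 1%. The call count ranks the two cases the wrong way round, which is the point being made above.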
 
Quote from: itix;745954
Code:
sillypseudocode()
{
   SetTaskPri(SysBase->ThisTask, -128);
   PutMsg(port, msg);

   /* bounce the same message back to our own port forever */
   while (true)
      PutMsg(port, GetMsg(port));
}

I can play too.
 
Code:
sillypseudocode()
{
   Forbid();
   while (true);
}

Sure, there are pathological cases. The easiest way to speed up your program is for the user to not run it.
 
Quote from: itix;745954
Anyway, I just wanted to point out that the biggest culprit is the OS itself and write some silly example :)

But you don't have any idea what the overhead of the biggest culprit is. It depends on how many messages are being processed and what work is done on each message.
 
The whole point of coding it was to avoid the constant arguments based on contrived examples & be able to see how real software that people might want to run will behave. It doesn't matter if it's not perfect, it's research. It could be derailed by something that nobody has considered.
« Last Edit: August 23, 2013, 03:10:46 PM by psxphill »
 

Offline wawrzon

Re: AROS SMP Research: Technical Discussion
« Reply #38 on: August 23, 2013, 02:58:18 PM »
Quote from: NorthWay;745955
I seem to remember some _old_ tool that counted OS calls of your choice.
(When I say old I am thinking Fred Fish age.)


wait a minute. couldnt you come up with what it was? there is tremendous overhead on some aros68k operations as i see on a slow system and it would be great to identify most frequently called functions while it happens without doing profiling job, which im not able to.
 

Offline psxphill

Re: AROS SMP Research: Technical Discussion
« Reply #39 on: August 23, 2013, 03:16:50 PM »
Quote from: wawrzon;745965
wait a minute. couldnt you come up with what it was? there is tremendous overhead on some aros68k operations as i see on a slow system and it would be great to identify most frequently called functions while it happens without doing profiling job, which im not able to.

You could, but it might lead you down the wrong path. If one function is called 1000 times and takes 10ms, and another is called once and takes 100s, then counting calls won't help: it points you at the wrong function.
 
Profiling is the key; bottlenecks often show up in completely unexpected parts of the code. I've seen people spend time optimising code and, when they'd finished, it made no perceivable difference, even though they could measure a 2x speed-up in the function they had worked on.
 
Some of the aros68k problems are caused by the level of abstraction added to the graphics library, which wasn't designed to be as fast as it possibly could be, because an x86 was fast enough that you wouldn't care.
 
Also, due to small or non-existent caches on 68k hardware, it's actually very hard to guess where the delays are going to be. For example, what you consider a good algorithm choice could end up thrashing the cache, while a less optimal design could end up being faster if its memory access patterns suit the cache better. Making aros68k faster will take a lot of research and effort. Just counting calls and then spending ten minutes rewriting the function with the most calls is quite dangerous: it might be slower in all cases except the one you tested, and even if you speed it up it could end up being broken. Admittedly, I'm cynical after watching people do it repeatedly and fail (although they generally get to claim the credit before anyone finds out).
« Last Edit: August 23, 2013, 03:26:09 PM by psxphill »
 

Offline wawrzon

Re: AROS SMP Research: Technical Discussion
« Reply #40 on: August 23, 2013, 03:41:37 PM »
Quote

If you have a function that is called 1000 times which takes 10ms or a function that is called 1 time which takes 100s then it won't help.

yes i know, and especially being not a programmer i dont expect a lot, though being able to see what happens while stall at least may suggest something. but lets not derail the thread.
 

Offline itix

Re: AROS SMP Research: Technical Discussion
« Reply #41 on: August 23, 2013, 09:13:52 PM »
Quote from: psxphill;745962

I can play too.
 
Code:

sillypseudocode()
{
   Forbid();
   while (true);
}

 
Sure, there are pathological cases. The easiest way to speed up your program is for the user to not run it.


You missed one very important difference. My example would run just fine on any traditional non-SMP AmigaOS system. Your example wouldn't.

Change Forbid() to SetTaskPri(task, 127) and you would have an example where SMP is superior to a non-SMP system.
My Amigas: A500, Mac Mini and PowerBook
 

Offline Bif

  • Full Member
  • ***
  • Join Date: Aug 2009
  • Posts: 124
Re: AROS SMP Research: Technical Discussion
« Reply #42 on: August 23, 2013, 09:51:07 PM »
I'm happy to see this work going on.

I've done nothing but write code for SMP (and AMP) game systems for the last 10 years, and almost every day I have to think about parallel programming problems. Based on this experience, my personal opinion is that an Amiga SMP system will perform better than most people expect. That said, I have zero recent programming experience on Amiga so I could also be talking out of my arse.

There's all sorts of talk of Forbid()/Permit()/Enable()/Disable() and messages flying around. I think a key concept in writing a highly efficient CPU-intensive program is to reduce all interactions with the OS as much as possible. This is regardless of what OS you are coding for - OSes always have overhead. If you take something like an MP3 encoder or decoder, how often do you need to interact with the OS and thus get stuck in a Forbid()? If you are smart you will malloc all your memory up front on program startup, so there will only be that initial interfacing with the OS for that. You then probably only need to ask the OS to handle file IO and maybe some output to the console. 99% of your CPU time should be spent on computation, outside of any kind of Forbid(). Now I think that for any type of program where SMP is useful this will generally be true. Of course, you could do a really crap job of writing an MP3 encoder where you read and write 1 byte to files at a time instead of a block of data, and there will be Forbid() calls everywhere. But I'd also bet that program would run slowly on AmigaOS as it is now.

I think the main culprits that will be issuing heaps of Forbid() calls will be programs with intense GUIs. E.g. a paint program. But ... are you going to be running a whole bunch of programs like this at once, or writing a multi-threaded GUI? I would think not too much. Probably something like a movie player in one window while painting in another window could cause some conflict. But worst case I think we are basically serializing all the drawing - even on a proper SMP OS that's probably much the case anyway.

Anyway, I think in particular that if Forbid() can be made to block only when another CPU is already inside a Forbid() call, SMP should give a pretty darn good benefit in those cases that actually need the extra CPU horsepower. If it can't work that way, then certainly one Forbid()-heavy program could lock things down a fair bit and introduce additional overhead that causes a net decrease in performance.
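
A minimal sketch of that "only block if another CPU is already in Forbid()" idea, using a POSIX mutex as a stand-in for whatever lock an SMP exec would use. The names `smp_forbid` and `smp_permit` are made up for illustration. Note the big assumption: unlike classic Forbid(), this does nothing to stop tasks on other cores that never call Forbid() themselves, so it only serialises the Forbid() sections against each other.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical SMP stand-ins for exec's Forbid()/Permit(). */
static pthread_mutex_t forbid_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local unsigned forbid_nest; /* per-thread nesting depth */

void smp_forbid(void)
{
    /* Only the outermost call takes the lock, so nested calls stay
     * cheap and a CPU blocks only while another CPU is inside its
     * own Forbid() section. */
    if (forbid_nest++ == 0)
        pthread_mutex_lock(&forbid_lock);
}

void smp_permit(void)
{
    assert(forbid_nest > 0);   /* unbalanced Permit() is a caller bug */

    /* Release the lock when the outermost section ends. */
    if (--forbid_nest == 0)
        pthread_mutex_unlock(&forbid_lock);
}
```

Code that is outside any Forbid() section never touches the lock, which is the case the MP3 encoder example relies on. Whether old software that expects Forbid() to freeze every other task can live with these weaker semantics is a separate question that this sketch does not answer.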

Very curious to see the results of the experiment, good luck.
 

Offline vidarh

  • Sr. Member
  • ****
  • Join Date: Feb 2010
  • Posts: 409
Re: AROS SMP Research: Technical Discussion
« Reply #43 on: August 23, 2013, 11:43:06 PM »
Quote from: takemehomegrandma;745931
I don't think anyone has claimed that SMP couldn't be done in a situation where the precondition is that SW base is being built explicitly for that system? Rather the opposite, actually; this is a given! What "people" said is that true SMP can't be done without breaking the Amiga compatibility, which I suppose is a more relevant issue on MorphOS/OS4 than on AROS anyway, since the latter has been CPU/ISA agnostic since pretty much the beginning hence most people are happy to either run AROS builds of whatever SW they use, or run it in an UAE environment.


I think the "breaking Amiga compatibility" argument is a bit too dogmatic. None of the modern alternatives, after all, run m68k Amiga software directly other than the 68k port of AROS - in all other instances any of the old software that we don't have sources for is run through an emulator/JIT. That makes a huge difference, in that it is fully possible to detect attempts at accessing system structures etc., which means careful changes can be made while letting the emulator compensate (e.g. by taking a mutex/semaphore before accessing certain values).

The amount of "native" proprietary software for these new OSes where the author isn't still active is vanishingly small. And the open source software for these OSes can be updated reasonably easily. So if SMP support was added to all of these "tomorrow", how many applications would we realistically "lose"? And if done properly, the "worst case" scenario is to disable SMP while running them.

AROS is in the process of breaking all binary compatibility with past AROS versions anyway, so it's perfect timing for the SMP work as all apps will need to be recompiled, but frankly I just don't believe that it's worth being all that concerned about breaking compatibility over this - the systems where we may want to care about compatibility are the classics, and it's not like they are SMP systems anyway...
 

Offline vidarh

Re: AROS SMP Research: Technical Discussion
« Reply #44 on: August 23, 2013, 11:53:13 PM »
Quote from: psxphill;745962
I don't think this should even be considered unless as an absolute last resort. Making that compromise without knowing what the benefits are would be a mistake.


It's not a compromise. It is the clean alternative.

Forbid()/Disable() is the hacky compromise. In a single-CPU system, when done in cases where you're "only" swapping a pointer or two, it can be forgivable. In pretty much all other instances, it is a big, giant, lazy cop-out that made coding a tiny bit easier back in the day compared to using semaphores and mutexes to protect the *specific* structures that an application needs to access safely.
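
To illustrate what protecting a *specific* structure looks like, here is a small sketch of a list that carries its own lock. On AmigaOS/AROS the natural tool would be an exec SignalSemaphore with InitSemaphore()/ObtainSemaphore()/ReleaseSemaphore(); a POSIX mutex is used here purely as a stand-in so the example builds on a hosted system. The `guarded_list` type and its functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared structure that carries its own lock. */
struct node {
    struct node *next;
    int value;
};

struct guarded_list {
    pthread_mutex_t lock;  /* protects only this list */
    struct node *head;
    size_t count;
};

void guarded_list_init(struct guarded_list *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->head = NULL;
    l->count = 0;
}

/* Push a caller-owned node; unrelated tasks and lists keep running. */
void guarded_list_push(struct guarded_list *l, struct node *n)
{
    assert(l != NULL && n != NULL);
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    l->count++;
    pthread_mutex_unlock(&l->lock);
}

/* Pop the most recently pushed node, or NULL if the list is empty. */
struct node *guarded_list_pop(struct guarded_list *l)
{
    pthread_mutex_lock(&l->lock);
    struct node *n = l->head;
    if (n != NULL) {
        l->head = n->next;
        l->count--;
    }
    pthread_mutex_unlock(&l->lock);
    return n;
}
```

Only tasks that touch the same list ever wait on each other here, whereas a Forbid() around the same pointer swap would stop scheduling for everyone. The cost is that every accessor has to go through the lock, which is exactly the discipline that direct access to system structures never enforced.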

Now, I agree that it's not necessary to jump straight into tearing out every Forbid()/Disable() you can find. But we can categorically state that having them there is bad. It's just that for the most part it won't be bad enough to prevent us from seeing some benefit from SMP, and so removing them can wait until there's some SMP support working.

It certainly makes sense to prioritise *where* to clean up Forbid()/Disable() calls first based on profiling which ones actually hurt the most.