Amiga.org

The "Not Quite Amiga but still computer related category" => Amiga Emulation => Topic started by: bloodline on November 24, 2010, 02:31:29 PM

Title: Threaded Code
Post by: bloodline on November 24, 2010, 02:31:29 PM
So here I am thinking about an idea... And it turns out my idea already has a name, "Threaded Code"!

My question is, is there any C/C++ legal way to jump to an address? A computed Goto if you will...

goto *someaddress;
Title: Re: Threaded Code
Post by: ElPolloDiabl on November 24, 2010, 02:51:14 PM
Don't know too much about C, but I read that it explicitly doesn't allow such things.
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 02:55:18 PM
Quote from: ElPolloDiabl;594070
Don't know too much about C, but I read that it explicitly doesn't allow such things.
Yeah... I think I've found a gcc extension that allows it... But I'd prefer something more portable :(
Title: Re: Threaded Code
Post by: skurk on November 24, 2010, 02:57:33 PM
Yes, something like this should work

void (*app)(void) = (void (*)(void))0x123456;

and then later

app();

I can't test this right now, but I'm pretty sure the above would jump to address 0x123456.
Title: Re: Threaded Code
Post by: Karlos on November 24, 2010, 02:59:42 PM
Quote from: bloodline;594067
So here I am thinking about an idea... And it turns out my idea already has a name, "Threaded Code"!

My question is, is there any C/C++ legal way to jump to an address? A computed Goto if you will...

goto *someaddress;


The most obvious legal way to do this is to use a table of function pointers.
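
For reference, a minimal sketch of what a function pointer table might look like in portable C. The three opcodes, the `VM` struct and `run_table` are all made up for illustration and are not taken from anyone's project in this thread:

```c
/* Hypothetical three-op VM: names and opcode values are illustrative only. */
enum { OP_HALT = 0, OP_INC = 1, OP_DOUBLE = 2 };

typedef struct {
    const unsigned char *ip;  /* next opcode to fetch */
    int acc;                  /* single accumulator register */
    int running;              /* cleared by OP_HALT */
} VM;

typedef void (*Handler)(VM *);

static void op_halt(VM *vm)   { vm->running = 0; }
static void op_inc(VM *vm)    { vm->acc += 1; }
static void op_double(VM *vm) { vm->acc *= 2; }

/* The table is indexed directly by opcode value. */
static const Handler handlers[] = { op_halt, op_inc, op_double };

/* Runs until OP_HALT; assumes the program contains only valid opcodes. */
int run_table(const unsigned char *program)
{
    VM vm = { program, 0, 1 };
    while (vm.running) {
        handlers[*vm.ip++](&vm);  /* one indirect call per instruction */
    }
    return vm.acc;
}
```

Every instruction costs one indirect call and return here, which is exactly the overhead being debated in the posts that follow.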
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 02:59:44 PM
Quote from: skurk;594072
Yes, something like this should work

void (*app)(void) = (void (*)(void))0x123456;

and then later

app();

I can't test this right now, but I'm pretty sure the above would jump to address 0x123456.
No, that's a function pointer which will invoke the calling convention (saving registers etc) which would add a lot of overhead to a threaded code Virtual Machine...
Title: Re: Threaded Code
Post by: Karlos on November 24, 2010, 03:03:36 PM
Quote from: bloodline;594074
No, that's a function pointer which will invoke the calling convention (saving registers etc) which would add a lot of overhead to a threaded code Virtual Machine...


Well the gcc extension is as portable as gcc is. If you keep your dubious computed branch target code to just one translation unit, you could always make it an exception in your makefile and have everything else all nice and ANSI.
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 03:06:09 PM
Quote from: Karlos;594073
The most obvious legal way to do this is to use a table of function pointers.
I appreciate the advice (and from skurk too), but a function pointer is just too expensive in this regard... My VM needs to run on my 100MHz ARM M3 microcontroller... We have to think lightweight! But I also need it to be portable so I can test it on my test machine... An x86... So ARM Asm is going to be problematic...
Title: Re: Threaded Code
Post by: skurk on November 24, 2010, 03:08:04 PM
Quote from: bloodline;594074
No, that's a function pointer which will invoke the calling convention (saving registers etc) which would add a lot of overhead to a threaded code Virtual Machine...


Oh, then I misunderstood your question. :-)
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 03:08:11 PM
Quote from: Karlos;594076
Well the gcc extension is as portable as gcc is. If you keep your dubious computed branch target code to just one translation unit, you could always make it an exception in your makefile and have everything else all nice and ANSI.
Yeah, that's almost certainly the most intelligent way to do it! :)
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 03:10:22 PM
Quote from: bloodline;594074
No, that's a function pointer which will invoke the calling convention (saving registers etc) which would add a lot of overhead to a threaded code Virtual Machine...


Quote from: skurk;594078
Oh, then I misunderstood your question. :-)


Not really, your advice was good, I didn't provide enough specification for you! As Karlos has also pointed out function pointers are the legal way to do this... I just don't want to save the registers for every function call!
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 03:39:50 PM
If I'm happy to stay with gcc, then this seems to be the winner:

http://docs.freebsd.org/info/gcc/gcc.info.Labels_as_Values.html

Bit sucky really...
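
A minimal sketch of what that extension looks like in use, assuming a GNU C compiler (gcc or clang). The opcodes and `run_threaded` are hypothetical, and this will not build with a strictly ANSI compiler:

```c
/* GNU C only: relies on the "labels as values" extension
   (&&label yields a void *, and goto *ptr jumps to it). */
enum { OP_HALT = 0, OP_INC = 1, OP_DOUBLE = 2 };

int run_threaded(const unsigned char *ip)
{
    /* One label address per opcode, indexed by opcode value. */
    static void *const dispatch[] = { &&do_halt, &&do_inc, &&do_double };
    int acc = 0;

/* Fetch the next opcode and jump straight to its handler. */
#define NEXT() goto *dispatch[*ip++]

    NEXT();
do_inc:
    acc += 1;
    NEXT();
do_double:
    acc *= 2;
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

Each handler ends by jumping directly to the next one, so there is no call, no return and no central loop. It assumes the program only contains valid opcodes.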
Title: Re: Threaded Code
Post by: commodorejohn on November 24, 2010, 06:14:00 PM
If you're just writing for ARM and testing on x86, couldn't you use a conditionally-compiled bit of assembler like:
Code: [Select]
#ifdef ARM
asm { /* whatever */}
#else
asm { /* x86 equivalent */}
#endif
or somesuch?
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 06:16:46 PM
Quote from: commodorejohn;594103
If you're just writing for ARM and testing on x86, couldn't you use a conditionally-compiled bit of assembler like:
Code: [Select]
#ifdef ARM
asm { /* whatever */}
#else
asm { /* x86 equivalent */}
#endif
or somesuch?
It's not the code paths that I have a problem with... it's maintaining two code paths :lol: and in ASM too... kinda defeats the point of using C in the first place ;)
Title: Re: Threaded Code
Post by: itix on November 24, 2010, 06:38:20 PM
Quote from: bloodline;594081
Not really, your advice was good, I didn't provide enough specification for you! As Karlos has also pointed out function pointers are the legal way to do this... I just don't want to save the registers for every function call!


It is not necessarily saving the registers for every function call.  It is platform specific...
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 07:26:07 PM
Quote from: itix;594110
It is not necessarily saving the registers for every function call.  It is platform specific...
If I have a function that takes no arguments and returns nothing, and acts only on global variables... then will the compiler optimize for not saving any registers? I doubt it :(
Title: Re: Threaded Code
Post by: Karlos on November 24, 2010, 07:48:16 PM
Quote from: bloodline;594117
If I have a function that takes no arguments and returns nothing, and acts only on global variables... then will the compiler optimize for not saving any registers? I doubt it :(


That depends. If you shovel it all into the same translation unit so it can see the scope of everything whilst it is compiling, you'd be surprised...
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 08:13:37 PM
Quote from: Karlos;594120
That depends. If you shovel it all into the same translation unit so it can see the scope of everything whilst it is compiling, you'd be surprised...
Unfortunately the ARM compiler I'm using doesn't allow me access to the asm (I suppose I could disassemble it...) so I can't see what's really going on...

But you are right though, I've seen some pretty incredible optimization done by a C compiler before...

I'm rapidly losing interest in this idea... as my real life workload has just increased :(
Title: Re: Threaded Code
Post by: Trev on November 24, 2010, 09:27:33 PM
setjmp() and longjmp()? "sjlj" exceptions are used in many C++ implementations, including G++ on AmigaOS. In C, see the Protothreads library (more like fibers than threads) for creative uses http://www.sics.se/~adam/pt/. Protothreads is used by uIP.
Title: Re: Threaded Code
Post by: Karlos on November 24, 2010, 09:35:03 PM
Hmm, IIRC, setjmp / longjmp save more state information than a normal function call does, so if the latter are too expensive, the former aren't likely to be any better?
Title: Re: Threaded Code
Post by: Trev on November 24, 2010, 10:19:00 PM
Then you're just talking about a state machine of some sort, e.g. in pseudo code:

Code: [Select]
unsigned thread = 0;
unsigned state0 = 0;
unsigned state1 = 0;
unsigned state2 = 0;
unsigned state3 = 0;

while (1) {
  switch (thread++ % 4)
  {
  case 0:
    switch (state0++ % 2)
    {
      case 0:
        /* do something */
        break; /* yield */
     
      case 1:
        /* continue doing something */
    }
    break;

  case 1:
    switch (state1++ % 2)
    {
      case 0:
        /* do something */
        break; /* yield */
     
      case 1:
        /* continue doing something */
    }
    break;

  case 2:
    switch (state2++ % 2)
    {
      case 0:
        /* do something */
        break; /* yield */
     
      case 1:
        /* continue doing something */
    }
    break;

  case 3:
    switch (state3++ % 2)
    {
      case 0:
        /* do something */
        break; /* yield */
     
      case 1:
        /* continue doing something */
    }
  }
}

That's essentially what Protothreads does, albeit with the overhead of setjmp/longjmp.
Title: Re: Threaded Code
Post by: yakumo9275 on November 24, 2010, 10:40:09 PM
protothreads / coroutines. But mostly, C was not built to do what you want to do. Look at a lot of Forth interpreters and see how they do it, they pioneered the threaded code model, but mostly the ones I've seen use this method are written in assembler.
Title: Re: Threaded Code
Post by: bloodline on November 24, 2010, 11:27:32 PM
Quote from: yakumo9275;594170
protothreads / coroutines. But mostly, C was not built to do what you want to do. Look at a lot of Forth interpreters and see how they do it, they pioneered the threaded code model, but mostly the ones I've seen use this method are written in assembler.
Yeah, just a note to say I'm not talking about Multithreading (I don't want to run more than one task) I'm talking about a method of implementing a Virtual machine using a table of functions/subroutines... What little reading I have done has mostly been around Forth... And yeah.. C really wasn't built for this :(
Title: Re: Threaded Code
Post by: Karlos on November 24, 2010, 11:30:52 PM
Quote from: bloodline;594186
Yeah, just a note to say I'm not talking about Multithreading (I don't want to run more than one task) I'm talking about a method of implementing a Virtual machine using a table of functions/subroutines... What little reading I have done has mostly been around Forth... And yeah.. C really wasn't built for this :(

If it helps, I've built a "virtual processor" using function tables / giant switch case depending on compiler settings as an experiment. Can send you the source if it helps.

PM me if you are interested.
Title: Re: Threaded Code
Post by: bloodline on November 25, 2010, 12:02:23 AM
Quote from: Karlos;594188
If it helps, I've built a "virtual processor" using function tables / giant switch case depending on compiler settings as an experiment. Can send you the source if it helps.

PM me if you are interested.
Very kind, but I already have that! When we last discussed this topic I was convinced that a giant switch case was the way to go and you helpfully sent me your own experiments! My project never progressed very far; it was just too slow for the task I had intended (basic DSP work)...

Now I'm playing with ARM microcontrollers I'm wondering if I can get them to do something rather fun... That is to say replace an ASIC in a circuit... Or maybe even an old 16bit CPU... ;)
Title: Re: Threaded Code
Post by: Karlos on November 25, 2010, 12:04:36 AM
Quote from: bloodline;594200
Very kind, but I already have that! When we last discussed this topic I was convinced that a giant switch case was the way to go and you helpfully sent me your own experiments! My project never progressed very far; it was just too slow for the task I had intended (basic DSP work)...

Now I'm playing with ARM microcontrollers I'm wondering if I can get them to do something rather fun... That is to say replace an ASIC in a circuit... Or maybe even an old 16bit CPU... ;)


Is it the one that has a test virtual program to generate the Mandelbrot set as a PPM file?
Title: Re: Threaded Code
Post by: bloodline on November 25, 2010, 12:19:28 AM
Quote from: Karlos;594202
Is it the one that has a test virtual program to generate the Mandelbrot set as a PPM file?
I don't recall any test code...
Title: Re: Threaded Code
Post by: Trev on November 25, 2010, 12:39:51 AM
This is from the hip and probably dragon-infested, but it looks legal:

Code: [Select]
int thread[] = {
  1, 2, 3, n, 0
};                 /* thread of instructions */

int *ip = thread;  /* initialize instruction pointer */

jmp_buf buf[0x100]; /* one slot per bytecode operation, 0x00..0xff */
jmp_buf top;       /* top-level interpreter */

if (setjmp(buf[0x00])) {
  /* example uses op 0x00 as terminator, so this should never be executed */
  longjmp(top, 1); /* optionally, use second parameter to raise exceptions */
}

if (setjmp(buf[0x01])) {
  /* do op 0x01, manipulate ip (also the stack) as needed */
  longjmp(top, 1); /* optionally, use second parameter to raise exceptions */
}


if (setjmp(buf[0x02])) {
  /* do op 0x02, manipulate ip (also the stack) as needed */
  longjmp(top, 1); /* optionally, use second parameter to raise exceptions */
}

if (setjmp(buf[n])) {
  /* do op n, manipulate ip (also the stack) as needed */
  longjmp(top, 1); /* optionally, use second parameter to raise exceptions */
}

while (*ip) {
  if (!setjmp(top)) {
    longjmp(buf[*ip++], 1);
  }
  else {
    /* process exception */
  }
}

The overhead of setjmp/longjmp is system specific but probably less than a function call.
Title: Re: Threaded Code
Post by: Karlos on November 25, 2010, 01:02:16 AM
Quote from: bloodline;594208
I don't recall any test code...


Then maybe the version you have is old? This one even had a whole bunch of evil C macros that allowed the VM code to be written within the C source:

Code: [Select]



void nativeAllocBuffer(VMCore* vm)
{
  // width/height in r2/r3
  // return buffer in r1
  int w = vm->getReg(_r2).s32();
  int h = vm->getReg(_r3).s32();
  vm->getReg(_r1).pU8() = new uint8[w*h];
  printf("Allocated buffer [%d x %d]\n", w, h);
}

void nativeFreeBuffer(VMCore* vm)
{
  // expects buffer in r1
  delete[] vm->getReg(_r1).pCh();
  vm->getReg(_r1).pCh() = 0;
  printf("Freed buffer\n");
}

void nativeWriteBuffer(VMCore* vm)
{
  // writes buffer in r1 to filename in r16
  // expects width/height in r2/r3
  const char* fileName = vm->getReg(_r16).pCh();
  int w = vm->getReg(_r2).s32();
  int h = vm->getReg(_r3).s32();
  if (fileName) {
    FILE *f = fopen(fileName, "wb");
    fprintf(f, "P5\n%d\n%d\n255\n", w, h);
    fwrite(vm->getReg(_r1).pCh(), 1, w*h, f);
    fclose(f);
    printf("Wrote buffer '%s'\n", fileName);
  }
}

void nativePrintCoords(VMCore* vm)
{
  printf(
    "Coords %4d, %4d (%.6f, %.6f)\n",
    (int)vm->getReg(_r7).s32(),
    (int)vm->getReg(_r5).s32(),
    vm->getReg(_r9).f32(),
    vm->getReg(_r4).f32()
  );
}

_VM_CODE(makeFractal)
{
  // r1 = pixel data address
  // r2 = width in pixels
  // r3 = height in pixels
  // r4 = cY (float pos, starting at yMin)
  // r5 = y (int) pixel
  // r6 = xMin (float)
  // r7 = cX (float pos, starting at xMin)
  // r8 = fStep
  // r9 = x (int) pixel
  // r10 = max iterations (255)

  _save       (_mr1)           // 2 : save r1
  _ldq        (0, _r5)         // 1 : y (r5) = 0
  _ld_16_i32  (255, _r10)      // 2 : max iters

  // y-loop
  _ldq        (0, _r9)         // 1 : x = 0
  _move_32    (_r6, _r7)       // 1 : cX = xMin

  // x-loop                        do {

  _move_32    (_r7, _r11)            // 1 : zx = cX
  _move_32    (_r4, _r12)            // 1 : zy = cY
  _ldq        (0, _r13)              // 1 :  n = 0

                                      // do {

  _move_32    (_r11, _r14)           // 1
  _mul_f32    (_r11, _r14)           // 1 : zx2 = zx*zx
  _move_32    (_r12, _r15)           // 1
  _mul_f32    (_r12, _r15)           // 1 : zy2   = zy*zy

  _move_32    (_r7,  _r16)           // 1 : new_zx = cX
  _add_f32    (_r14, _r16)           // 1 : new_zx += zx2
  _sub_f32    (_r15, _r16)           // 1 : new_zx -= zy2

  _add_f32    (_r15, _r14)           // 1 : r14 = zx*zx + zy*zy (for loop test)

  _move_32    (_r11, _r15)           // 1 : tmp = zx
  _mul_f32    (_r12, _r15)           // 1 : tmp *= zy
  _add_f32    (_r15, _r15)           // 1 : tmp += tmp (tmp = 2*zx*zy)
  _add_f32    (_r4,  _r15)           // 1 : tmp += cY (tmp = 2*zx*zy+cY)

  _move_32    (_r15, _r12)           // 1 : zy = tmp
  _move_32    (_r16, _r11)           // 1 : zx = new_zx
  _addi_16    (1, _r13)              // 2 : n++

  _ld_32_f32  (4.0f, _r16)             // 3
  _bgr_f32    (_r14, _r16, 2)          // 2
  _bls_32     (_r13, _r10, -23)        // 2

  _mul_u16    (_r13, _r13)             // 1
  _st_ripi_8  (_r13, _r1)              // 1 : out = n
  _add_f32    (_r8, _r7)               // 1 : cX += fStep
  _addi_16    (1, _r9)                 // 2 : x += iStep

  _bls_32     (_r9, _r2, -(6+23+3+1))    // 2 : } while (x < width)

  _add_f32    (_r8, _r4)                 // 1 : cY += fStep
  _addi_16    (1, _r5)                   // 2 : y += iStep
  _bls_32     (_r5, _r3, -(5+5+6+23+1))  // 2 : } while (y < height)

  _restore    (_mr1)                     // 1
  _ret
};

_VM_CODE(calculateRanges)
{
  // calculates xMin in r6, xMax in r7, step in r8
  _move_32    (_r5, _r6)
  _sub_f32    (_r4, _r6)         // r6 = r5-r4 (total y range)
  _s32to_f32  (_r2, _r7)         // r7 = (float) r2
  _move_32    (_r7, _r9)
  _mul_f32    (_r6, _r7)         // r7 *= r6
  _s32to_f32  (_r3, _r6)         // r6 = (float) r3
  _div_f32    (_r6, _r7)         // r7 /= r6
  _move_32    (_r7, _r8)
  _div_f32    (_r9, _r8)
  _ld_32_f32  (0.75f, _r6)
  _sub_f32    (_r7, _r6)
  _add_f32    (_r6, _r7)

  _ret
};


_VM_CODE(virtualProgram)          // a vm function
{
  _ld_16_i16  (512, _r2)
  _ld_16_i16  (512, _r3)
  _calln      (nativeAllocBuffer)
  _ld_32_f32  (-1.25f, _r4)       // yMin
  _ld_32_f32  (1.25f, _r5)        // yMax
  _call       (calculateRanges)
  _call       (makeFractal)
  _lda        ("framebuffer.pgm", _r16)
  _calln      (nativeWriteBuffer)
  _calln      (nativeFreeBuffer)
  _ret
};
Title: Re: Threaded Code
Post by: bloodline on November 25, 2010, 01:05:24 AM
Quote from: Karlos;594217
Then maybe the version you have is old? This one even had a whole bunch of evil C macros that allowed the VM code to be written within the C source:

Old! I'll say... We were last talking about this in 2003... Hmmm I like the C macros idea... That allows you to test the functions! :)
Title: Re: Threaded Code
Post by: Karlos on November 25, 2010, 01:08:48 AM
Quote from: bloodline;594218
Old! I'll say... We were last talking about this in 2003... Hmmm I like the C macros idea... That allows you to test the functions! :)


I can send you this version. It basically is a bit crap in that it compiles to a single executable containing the embedded test. The reason being, the final target was for a library which included all the necessary loading/linking stuff. It should compile on any POSIX-compliant system.
Title: Re: Threaded Code
Post by: bloodline on November 25, 2010, 01:13:02 AM
Quote from: Karlos;594219
I can send you this version. It basically is a bit crap in that it compiles to a single executable containing the embedded test. The reason being, the final target was for a library which included all the necessary loading/linking stuff. It should compile on any POSIX-compliant system.
Kind offer, but I don't have much time right now to develop this idea further :) but I am keen to discuss VM techniques so that as time permits I will have a clear idea of the issues and maybe build something! :idea:

Though right now I'm just fighting insomnia... I've got an early start tomorrow and for some reason that means I am unable to get the rest I need :(
Title: Re: Threaded Code
Post by: Karlos on November 25, 2010, 01:25:28 AM
@bloodline

Well there are some things I'd do differently if I was going to do it again. That VM had 16 general purpose 64-bit registers that could contain any 8/16/32/64-bit wide elemental type at once. The opcode defined how they were to be interpreted. Each opcode was (at least) a 2-byte entity, with a byte for the operation and usually a byte that encoded the source and destination register. As such it was a load/store architecture.
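
To make that encoding concrete, here is a rough sketch in plain C. The exact bit layout (source register in the high nibble, destination in the low nibble), the `OP_ADD_S32` value and all the names are assumptions for illustration, not the actual format used by the VM described above:

```c
#include <stdint.h>

/* One 64-bit register, viewed as whichever element type the opcode wants. */
typedef union {
    int8_t   s8;
    int16_t  s16;
    int32_t  s32;
    int64_t  s64;
    float    f32;
    double   f64;
} Reg;

typedef struct {
    uint8_t op;   /* operation byte */
    uint8_t src;  /* source register index, 0..15 */
    uint8_t dst;  /* destination register index, 0..15 */
} Decoded;

/* Assumed layout: byte 0 = operation, byte 1 = (src << 4) | dst. */
Decoded decode(const uint8_t *code)
{
    Decoded d;
    d.op  = code[0];
    d.src = (uint8_t)(code[1] >> 4);
    d.dst = (uint8_t)(code[1] & 0x0F);
    return d;
}

/* Hypothetical opcode value, for illustration only. */
enum { OP_ADD_S32 = 0x10 };

/* Executes one two-byte instruction against a 16-register file;
   unknown opcodes are silently ignored in this sketch. */
void step(Reg regs[16], const uint8_t *code)
{
    Decoded d = decode(code);
    if (d.op == OP_ADD_S32) {
        regs[d.dst].s32 += regs[d.src].s32;  /* dst += src, as 32-bit ints */
    }
}
```

The opcode decides which union member is read, so the register file itself carries no type information.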

This made it easy to design and write, but doing it from scratch, I'd probably go for a stack-frame machine. It wouldn't actually be any slower since the above registers are still memory locations anyway, and if done correctly, would allow you to have as many "registers" as you have local data inside any function. That is to say, I'd use the same register-like topology but have only as many of them in a function context as needed.

I did write some documentation, but it is rather out of date, I expect: http://extropia.co.uk/projects/vm/
Title: Re: Threaded Code
Post by: Trev on November 25, 2010, 05:04:58 AM
My abuse of setjmp/longjmp worked as expected. Add a stack and an op to push values, and you have a very simple (yet poorly designed ;-) virtual machine.
Title: Re: Threaded Code
Post by: ganyaik on November 25, 2010, 06:01:06 AM
Hi,

I did(/am doing) something similar: a VM on a resource constrained system. I made some experiments with jump tables, ifs and such and settled with plain "switch/case".

In my VM the opcode is always the first byte of the instruction, so I can branch on it easily and the C compiler (gcc) generates a nice big jump table. The code that jumps to the case which executes the emulated instruction is just a few instructions to create a pointer, plus an indirect jump in assembly. No register saving and such involved.
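
A stripped-down sketch of that structure, with made-up opcodes (`OP_ADDI` takes a one-byte immediate operand; none of these names come from a real VM):

```c
/* Hypothetical opcodes, for illustration only. */
enum { OP_HALT = 0, OP_ADDI = 1, OP_DOUBLE = 2 };

/* Interprets bytecode until OP_HALT and returns the accumulator,
   or -1 if it meets an opcode it does not know. */
int run_switch(const unsigned char *ip)
{
    int acc = 0;
    for (;;) {
        switch (*ip++) {      /* the opcode is the first byte of each instruction */
        case OP_ADDI:
            acc += *ip++;     /* the operand byte follows the opcode */
            break;
        case OP_DOUBLE:
            acc *= 2;
            break;
        case OP_HALT:
            return acc;
        default:
            return -1;        /* unknown opcode */
        }
    }
}
```

With dense case values like these, a compiler will typically turn the switch into a jump table, though that is up to the compiler and its optimisation settings.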

Hope this helps,
Chris
Title: Re: Threaded Code
Post by: Karlos on November 25, 2010, 11:11:25 AM
^ that's precisely how mine works when compiled with -D_VM_INTERPRETER=_VM_INTERPRETER_SWITCH_CASE. Otherwise it generates a function table.

There is an incomplete 68K version which uses assembler for the core interpreter. In this model, each opcode handler is at a 64-byte boundary relative to a base address and each handler ends with the code required to read the next opcode and calculate which handler to branch to next. The rest of the space is left to implement the handler or jump out to an external block (for the few that did not fit).
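
The address arithmetic behind that layout can be sketched in C. This assumes an unsigned one-byte opcode used directly as the slot index; the real 68K code may index differently, and the function names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 64u  /* each handler occupies one 64-byte slot */

/* Byte offset of an opcode's handler from the handler base address. */
size_t handler_offset(uint8_t opcode)
{
    return (size_t)opcode * SLOT_SIZE;  /* same as opcode << 6 */
}

/* Address of the handler, given the base of the handler block. */
const unsigned char *handler_address(const unsigned char *base, uint8_t opcode)
{
    return base + handler_offset(opcode);
}
```

Because the slot size is a power of two, the lookup is a shift and an add with no table read, at the cost of 16 KiB of handler space if all 256 opcodes are used.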

This design showed a lot of promise, performance wise.