Welcome, Guest. Please login or register.

Author Topic: Threaded Code  (Read 2389 times)

Description:

0 Members and 1 Guest are viewing this topic.

Offline bloodlineTopic starter

  • Master Sock Abuser
  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 12113
    • Show only replies by bloodline
    • http://www.troubled-mind.com
Re: Threaded Code
« Reply #29 from previous page: November 25, 2010, 01:05:24 AM »
Quote from: Karlos;594217
Then maybe the version you have is old? This one even had a whole bunch of evil C macros that allowed the VM code to be written within the C source:

Code: [Select]



void nativeAllocBuffer(VMCore* vm)
{
  // width/height in r2/r3
  // return buffer in r1
  int w = vm->getReg(_r2).s32();
  int h = vm->getReg(_r3).s32();
  vm->getReg(_r1).pU8() = new uint8[w*h];
  printf("Allocated buffer [%d x %d]\n", w, h);
}

void nativeFreeBuffer(VMCore* vm)
{
  // expects buffer in r1
  delete[] vm->getReg(_r1).pCh();
  vm->getReg(_r1).pCh() = 0;
  printf("Freed buffer\n");
}

void nativeWriteBuffer(VMCore* vm)
{
  // writes buffer in r1 to filename in r4
  // expects width/height in r2/r3
  const char* fileName = vm->getReg(_r16).pCh();
  int w = vm->getReg(_r2).s32();
  int h = vm->getReg(_r3).s32();
  if (fileName) {
    FILE *f = fopen(fileName, "wb");
    fprintf(f, "P5\n%d\n%d\n255\n", w, h);
    fwrite(vm->getReg(_r1).pCh(), 1, w*h, f);
    fclose(f);
    printf("Wrote buffer '%s'\n", fileName);
  }
}

void nativePrintCoords(VMCore* vm)
{
  printf(
    "Coords %4d, %4d (%.6f, %.6f)\n",
    (int)vm->getReg(_r7).s32(),
    (int)vm->getReg(_r5).s32(),
    vm->getReg(_r9).f32(),
    vm->getReg(_r4).f32()
  );
}

_VM_CODE(makeFractal)
{
  // r1 = pixel data address
  // r2 = width in pixels
  // r3 = height in pixels
  // r4 = cY (float pos, starting at yMin)
  // r5 = y (int) pixel
  // r6 = xMin (float)
  // r7 = cX (float pos, starting at xMin)
  // r8 = fStep
  // r9 = x (int) pixel
  // r10 = iStep (1)

  _save       (_mr1)           // 2 : save r1
  _ldq        (0, _r5)         // 1 : y (r5) = 0
  _ld_16_i32  (255, _r10)      // 2 : max iters

  // y-loop
  _ldq        (0, _r9)         // 1 : x = 0
  _move_32    (_r6, _r7)       // 1 : cX = xMin

  // x-loop                        do {

  _move_32    (_r7, _r11)            // 1 : zx = cX
  _move_32    (_r4, _r12)            // 1 : zy = cY
  _ldq        (0, _r13)              // 1 :  n = 0

                                      // do {

  _move_32    (_r11, _r14)           // 1
  _mul_f32    (_r11, _r14)           // 1 : zx2 = zx*zx
  _move_32    (_r12, _r15)           // 1
  _mul_f32    (_r12, _r15)           // 1 : zy2   = zy*zy

  _move_32    (_r7,  _r16)           // 1 : new_zx = cX
  _add_f32    (_r14, _r16)           // 1 : new_zx += zx2
  _sub_f32    (_r15, _r16)           // 1 : new_zx -= zy2

  _add_f32    (_r15, _r14)           // 1 : r14 = zx*zx + zy*zy (for loop test)

  _move_32    (_r11, _r15)           // 1 : tmp = zx
  _mul_f32    (_r12, _r15)           // 1 : tmp *= zy
  _add_f32    (_r15, _r15)           // 1 : tmp += tmp2
  _add_f32    (_r4,  _r15)           // 1 : tmp += cY (tmp = 2*zx*zy+cY)

  _move_32    (_r15, _r12)           // 1 : zy = tmp
  _move_32    (_r16, _r11)           // 1 : zx = new_zx
  _addi_16    (1, _r13)              // 2 : n++

  _ld_32_f32  (4.0f, _r16)             // 3
  _bgr_f32    (_r14, _r16, 2)          // 2
  _bls_32     (_r13, _r10, -23)        // 2

  _mul_u16    (_r13, _r13)             // 1
  _st_ripi_8  (_r13, _r1)              // 1 : out = n
  _add_f32    (_r8, _r7)               // 1 : cX += fStep
  _addi_16    (1, _r9)                 // 2 : x += iStep

  _bls_32     (_r9, _r2, -(6+23+3+1))    // 2 : } while (x < width)

  _add_f32    (_r8, _r4)                 // 1 : cY += fStep
  _addi_16    (1, _r5)                   // 2 : y += iStep
  _bls_32     (_r5, _r3, -(5+5+6+23+1))  // 2 : } while (y < height)

  _restore    (_mr1)                     // 1
  _ret
};

_VM_CODE(calculateRanges)
{
  // calculates xMin in r6, xMax in r7, step in r8
  _move_32    (_r5, _r6)
  _sub_f32    (_r4, _r6)         // r6 = r5-r4 (total y range)
  _s32to_f32  (_r2, _r7)         // r7 = (float) r2
  _move_32    (_r7, _r9)
  _mul_f32    (_r6, _r7)         // r7 *= r6
  _s32to_f32  (_r3, _r6)         // r6 = (float) r3
  _div_f32    (_r6, _r7)         // r7 /= r6
  _move_32    (_r7, _r8)
  _div_f32    (_r9, _r8)
  _ld_32_f32  (0.75f, _r6)
  _sub_f32    (_r7, _r6)
  _add_f32    (_r6, _r7)

  _ret
};


_VM_CODE(virtualProgram)          // a vm function
{
  _ld_16_i16  (512, _r2)
  _ld_16_i16  (512, _r3)
  _calln      (nativeAllocBuffer)
  _ld_32_f32  (-1.25f, _r4)       // yMin
  _ld_32_f32  (1.25f, _r5)        // yMax
  _call       (calculateRanges)
  _call       (makeFractal)
  _lda        (&quot;framebuffer.pgm&quot;, _r16)
  _calln      (nativeWriteBuffer)
  _calln      (nativeFreeBuffer)
  _ret
};
Old! I'll say... We were last talking about this in 2003... Hmmm ilike the c macros idea... That allows you to test the functions! :)

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16867
  • Country: gb
  • Thanked: 4 times
    • Show only replies by Karlos
Re: Threaded Code
« Reply #30 on: November 25, 2010, 01:08:48 AM »
Quote from: bloodline;594218
Old! I'll say... We were last talking about this in 2003... Hmmm ilike the c macros idea... That allows you to test the functions! :)


I can send you this version. It basically is a bit crap in that it compiles to a single executable containing the embedded test. The reason being, the final target was for a library which included all the necessary loading/linking stuff. It should compile on any posix compliant system.
int p; // A
 

Offline bloodlineTopic starter

  • Master Sock Abuser
  • Hero Member
  • *****
  • Join Date: Mar 2002
  • Posts: 12113
    • Show only replies by bloodline
    • http://www.troubled-mind.com
Re: Threaded Code
« Reply #31 on: November 25, 2010, 01:13:02 AM »
Quote from: Karlos;594219
I can send you this version. It basically is a bit crap in that it compiles to a single executable containing the embedded test. The reason being, the final target was for a library which included all the necessary loading/linking stuff. It should compile on any posix compliant system.
Kind offer, but I don't have much time right now to develop this idea further :) but I am keen to disccus vm techniques so that as time permits I will have a clear idea of the issues and maybe build something! :idea:

Though right now I'm just fighting insomnia... I've got an early start tomorrow and for some reason that means I am unable to get the rest in need :(

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16867
  • Country: gb
  • Thanked: 4 times
    • Show only replies by Karlos
Re: Threaded Code
« Reply #32 on: November 25, 2010, 01:25:28 AM »
@bloodline

Well there are some things I'd do differently if I was going to do it again. That VM had 16 general purpose 64-bit registers that could contain any 8/16/32/64-bit wide elemental type at once. The opcode defined how they were to be interpreted. Each opcode was (at least) a 2-byte entity, with a byte for the operation and usually a byte that encoded the source and destination register. As such it was a load/store architecture.

This made it easy to design and write, but doing it from scratch, I'd probably go for a stack-frame machine. It wouldn't actually be any slower since the above registers are still memory locations anyway, and if done correctly, would allow you to have as many "registers" as you have local data inside any function. That is to say, I'd use the same register-like topology but have only as many of them in a function context as needed.

I did write some documentation, but it is rather out of date, I expect: http://extropia.co.uk/projects/vm/
int p; // A
 

Offline Trev

  • Zero
  • Hero Member
  • *****
  • Join Date: May 2003
  • Posts: 1550
  • Country: 00
    • Show only replies by Trev
Re: Threaded Code
« Reply #33 on: November 25, 2010, 05:04:58 AM »
My abuse of setjmp/longjmp worked as expected. Add a stack and an op to push values, and you have a very simple (yet poorly designed ;-) virtual machine.
 

Offline ganyaik

  • Newbie
  • *
  • Join Date: Jan 2007
  • Posts: 7
    • Show only replies by ganyaik
Re: Threaded Code
« Reply #34 on: November 25, 2010, 06:01:06 AM »
Hi,

I did(/am doing) something similar: a VM on a resource constrained system. I made some experiments with jump tables, ifs and such and settled with plain "switch/case".

In my VM the opcode is always the first byte of the instruction so I can branch on it easily and the C compiler(gcc) generates a nice big jumptable. The code, that jumps to the case: which executes the emulated instruction is a few instructions to create pointer and an indirect jump in assembly. No register saving and such involved.

Hope this helps,
Chris
 

Offline Karlos

  • Sockologist
  • Global Moderator
  • Hero Member
  • *****
  • Join Date: Nov 2002
  • Posts: 16867
  • Country: gb
  • Thanked: 4 times
    • Show only replies by Karlos
Re: Threaded Code
« Reply #35 on: November 25, 2010, 11:11:25 AM »
^ that's precisely how mine works when compiled with -D_VM_INTERPRETER=_VM_INTERPRETER_SWITCH_CASE. Otherwise it generates a function table.

There is an incomplete 68K version which uses assembler for the core interpreter. In this model, each opcode handler is at a 64-byte boundary relative to a base address and each handler ends with the code required to read the next opcode and calculate which handler to branch to next. The rest of the space is left to implement the handler or jump out to an external block (for the few that did not fit).

This design showed a lot of promise, performance wise.
int p; // A