Author Topic: Zorro III memory card... now with Ethernet (Read 22254 times)

olsen · « **Reply #44 from previous page:** December 19, 2013, 08:52:57 AM »

Quote from: tnt23;754231

Frankly speaking, I don't understand why, for the 16-bit case, there would be any 32-bit transfer at all. Say, if we need to transfer two 16-bit words with respect to both size AND alignment, then it should look like two "move.w (src)+, (dst)+" instructions should it not? The addressing will be done in words, and that's nice. In my perception this is not equal to one "move.l (src)+, (dst)+" instruction as the latter breaks both the size (transferring 32 bits at once) and alignment constraints (crossing the 16 bit boundary).

There are two reasons.

The first is historic: up until very recently (and with the exception of the DKB WildFire, which I believe was capable of 32 bit wide memory access) all Amiga Ethernet hardware was either accessible only through the Zorro II bus, or did not permit 32 bit wide memory access. On the Zorro II bus, a 32 bit wide access will be broken up into two consecutive 16 bit accesses. How this worked out with hardware which could not support 32 bit wide accesses was up to the glue logic on the board.

The second is performance: the ratio of instructions executed vs. the amount of data copied is terrible for "move.w (a0)+,(a1)+", less terrible for "move.l (a0)+,(a1)+" and becomes better if you can leverage "movem.l (a0)+,d1-d7/a2-a6 ; movem.l d1-d7/a2-a6,(a1)+" style copying (better still if you can unroll the copying loop in which movem.l is used).
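To put rough numbers on that ratio, here is a small C sketch of the arithmetic. It only counts bytes moved per instruction and ignores loop overhead and cycle counts entirely; the helper name bytes_per_instruction and the constants are made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: bytes moved per instruction executed,
 * ignoring loop overhead and actual cycle counts. */
static double bytes_per_instruction(unsigned bytes_moved, unsigned instructions)
{
    assert(instructions > 0);
    return (double)bytes_moved / (double)instructions;
}

/* The register list d1-d7/a2-a6 is 7 data + 5 address registers,
 * each holding one 32 bit longword. */
enum {
    MOVEM_REGS  = 7 + 5,
    MOVEM_BYTES = MOVEM_REGS * 4
};
```

With these, bytes_per_instruction(2, 1) covers "move.w", bytes_per_instruction(4, 1) covers "move.l", and bytes_per_instruction(MOVEM_BYTES, 2) covers the movem.l load/store pair: 2, 4 and 24 bytes per instruction respectively.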

I stopped counting execution cycles more than 15 years ago, but I believe that performance of even an unrolled "move.w (a0)+,(a1)+" loop will be quite poor.

Roadshow contains a restricted version of the original, optimized copying function, with the restriction being that only 16 and 32 bit copying operations are used. The goal was to provide better performance than the S2_CopyFromBuff/S2_CopyToBuff callbacks could, and this was done specifically for the "Ariadne".

There is a slow "move.w (a0)+,(a1)+" variant available in Roadshow already. It is enabled by default, but all the example interface configuration files disable it. To switch back to the slow variant, either remove the "copymode=fast" parameter from the respective interface file, or replace it with "copymode=slow".
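For illustration, a strict 16 bit copy of the kind discussed here might look roughly like this in C. This is only a sketch under my own assumptions, not Roadshow's actual code: the function name copy_words_strict is hypothetical, the volatile qualifiers are there on the assumption that they keep the compiler from merging two word accesses into one 32 bit access, and an odd trailing byte is simply left for the caller to deal with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Roadshow or SANA-II.
 * Copies using 16 bit accesses on even addresses only.
 * Returns the number of bytes copied, which is `bytes`
 * rounded down to an even number. */
static size_t copy_words_strict(volatile uint16_t *dst,
                                const volatile uint16_t *src,
                                size_t bytes)
{
    size_t words = bytes / 2;
    size_t i;

    /* Both buffers must start on an even address. */
    assert(((uintptr_t)dst & 1u) == 0);
    assert(((uintptr_t)src & 1u) == 0);

    /* One 16 bit read and one 16 bit write per iteration,
     * the C counterpart of "move.w (a0)+,(a1)+". */
    for (i = 0; i < words; i++)
        dst[i] = src[i];

    return words * 2;
}
```

The point of the design is that the access size never changes, whatever the length or alignment of the buffers, which is the guarantee a 16 bit only hardware buffer would need.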

Quote

That's exactly what I am trying to figure out. It is possible to implement the said contiguous buffer on my card, with the restriction that it should only be accessed in 16-bit units at even addresses. If the S2_CopyFromBuff16/S2_CopyToBuff16 hooks would follow that "move.w (src)+, (dst)+" restriction, everything should work smoothly - and that would eliminate the need for any side buffering, saving both memory and performance.

Now, if the S2_CopyFromBuff16/S2_CopyToBuff16 hooks at some point won't follow the granularity convention and decide to switch to transferring 32 bits at once, that would break the whole idea, I think.

Could be, but then your code needs to be able to handle the regular S2_CopyFromBuff/S2_CopyToBuff callbacks, which are likely going to be much worse in terms of performance. You will always have to be able to provide for a side-buffer, in case S2_CopyFromBuff/S2_CopyToBuff callbacks are invoked and the client offers no alternative callbacks.

Quote

I understand the DMA callbacks idea better now. In fact, I am trying to do exactly that, checking if the DMA hook is available, then asking for the pointer etc. It even seems to work, although it is slow as hell. A lot to check on my side.

So, back to our 16-bit stuff. Do you think it would be feasible to implement that 'strict' behaviour S2_CopyFromBuff16/S2_CopyToBuff16 in Roadshow?

See above: it's already supported.

Quote

UPDATE. I'm afraid I have been terribly wrong: the hardware buffer on my side could only be arranged for long-aligned 16-bit access

If you can make it appear on a 32 bit aligned start address, then testing it with Roadshow's built-in slow 16 bit copy callback might just work out.

tnt23 · « **Reply #45 on:** March 27, 2014, 07:17:48 PM »

Haven't posted for a while. Not much progress on the software side, was building more cards to test.

Plaz · « **Reply #46 on:** March 28, 2014, 02:27:31 AM »

Great work and very nice looking cards. I hope you continue to have success.

Plaz

tnt23 · « **Reply #47 on:** May 29, 2014, 07:31:36 PM »

Got a hold of Cyberstorm MK3 (060@50MHz), benchmarks get way better:

Code: [Select]

NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  947.98 KByte/s
Sending to client, packet size  1k ...  659.21 KByte/s
Receiving from client, packet size  2k ...  1057.02 KByte/s
Sending to client, packet size  2k ...  895.88 KByte/s
Receiving from client, packet size  4k ...  1119.78 KByte/s
Sending to client, packet size  4k ...  1269.84 KByte/s
Receiving from client, packet size  8k ...  1183.08 KByte/s
Sending to client, packet size  8k ...  1380.43 KByte/s
Receiving from client, packet size 16k ...  1212.76 KByte/s
Sending to client, packet size 16k ...  1411.63 KByte/s
Receiving from client, packet size 32k ...  1207.01 KByte/s
Sending to client, packet size 32k ...  1427.19 KByte/s
Done.

HammerD · « **Reply #48 on:** May 29, 2014, 08:53:56 PM »

Quote from: tnt23;765274

Got a hold of Cyberstorm MK3 (060@50MHz), benchmarks get way better:

Code: [Select]

NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  947.98 KByte/s
Sending to client, packet size  1k ...  659.21 KByte/s
Receiving from client, packet size  2k ...  1057.02 KByte/s
Sending to client, packet size  2k ...  895.88 KByte/s
Receiving from client, packet size  4k ...  1119.78 KByte/s
Sending to client, packet size  4k ...  1269.84 KByte/s
Receiving from client, packet size  8k ...  1183.08 KByte/s
Sending to client, packet size  8k ...  1380.43 KByte/s
Receiving from client, packet size 16k ...  1212.76 KByte/s
Sending to client, packet size 16k ...  1411.63 KByte/s
Receiving from client, packet size 32k ...  1207.01 KByte/s
Sending to client, packet size 32k ...  1427.19 KByte/s
Done.

Those are very good transfer rates with the 060 :-)

amigean · « **Reply #49 on:** May 30, 2014, 12:30:50 AM »

this looks very cool - quite an achievement to build this thing from scratch.

I'd love one of them when you're ready to produce them.

freqmax · « **Reply #50 on:** May 30, 2014, 01:31:32 AM »

Regarding networking, it seems all driver architectures suffer from various problems:
* AS225r1 for the A2065 Ethernet uses a hardcoded driver.
* SANA-II suffers from an inefficient buffer handling scheme and lacks proper support for promiscuous and multicast modes.
* Miami Network Interface (MNI) was abandoned without support, and still lacks some Ethernet capabilities.

So making a new bsd socket layer and proper hardware driver abstraction API might be a really good deed. Booting from network is also something that is kind of missing. Btw, did you add 32-bit transfers to speed up things? DMA transfers?

How many layers does your card need? And what was the price to produce just the PCB? (via, tin, lacquer, etc. options?)

tnt23 · « **Reply #51 on:** June 02, 2014, 07:39:35 AM »

Quote from: freqmax;765291

So making a new bsd socket layer and proper hardware driver abstraction API might be a really good deed. Booting from network is also something that is kind of missing.

Probably, but who's gonna make it, and who's gonna write driver replacements for all legacy hardware out there?

Quote

Btw, did you add 32-bit transfers to speed up things? DMA transfers?

No, there are neither 32 bit nor DMA transfers.

Quote

How many layers does your card need? And what was the price to produce just the PCB? (via, tin, lacquer, etc. options?)

The card has 4 layers, and the price of the first (experimental and express) batch was I'd say decent, something like 50 euro per board. Subsequent batches are of course cheaper. I could look up exact PCB parameters like copper thickness, track width and via size, why?

freqmax · « **Reply #52 on:** June 02, 2014, 02:34:39 PM »

Quote from: tnt23;765537

Probably, but who's gonna make it, and who's gonna write driver replacements for all legacy hardware out there?

The important step is to create an infrastructure that others can fill in, not to do it all oneself.

Quote from: tnt23;765537

The card has 4 layers, and the price of the first (experimental and express) batch was I'd say decent, something like 50 euro per board. Subsequent batches are of course cheaper. I could look up exact PCB parameters like copper thickness, track width and via size, why?

Curious for other projects probably involving FPGA, ARM, MIPS etc.

tnt23 · « **Reply #53 on:** June 02, 2014, 08:15:47 PM »

PCB is 4 layers, FR-4 18u copper, 0.2mm/0.2mm tracks, 0.3mm/0.7mm vias, two sided soldering mask, no silk. Continuity check, express production (4-5 days) by local PCB house in Saint Petersburg, so your mileage most likely will vary.

freqmax · « **Reply #54 on:** June 03, 2014, 02:12:01 AM »

Dunno if Saint Petersburg is local though

tnt23 · « **Reply #55 on:** June 03, 2014, 07:29:47 AM »

Quote from: freqmax;765598

Dunno if Saint Petersburg is local though

Unless you're in Florida
