Amiga.org

Amiga computer related discussion => Amiga Hardware Issues and discussion => Topic started by: tnt23 on November 16, 2013, 01:17:15 PM

Title: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 16, 2013, 01:17:15 PM
A new 1.2 revision of Z3SDRAM, an open Zorro III memory card design, now combines up to 64MB of SDRAM and an Ethernet controller (based on the DM9000 chip from Davicom) on one PCB. The new board can also be used in Zorro II configurations, as the rest of the Zorro signals are now routed to the FPGA as well.

(http://farm4.staticflickr.com/3755/10832324855_1e7ebecc82_z.jpg)

The hardware side seems to be functional, and a SANA-II-compatible dm9000.device is in its early development stage (read: I was able to compile the 'hello.c' somehow).

(http://farm4.staticflickr.com/3774/10817665015_0a72110d49_z.jpg)

(http://farm8.staticflickr.com/7377/10870389485_f1ca2a491d_z.jpg)

The project page will hopefully be updated soon: http://code.google.com/p/z3sdram/
Title: Re: Zorro III memory card... now with Ethernet
Post by: som99 on November 16, 2013, 01:44:31 PM
Great news, I'd like one of those when I get my hands on an A4000 040 :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: mechy on November 16, 2013, 01:53:00 PM
Nice project, will poseidon work with it?
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 16, 2013, 02:45:30 PM
Well, since Poseidon is a USB stack and Z3SDRAM has no USB host (yet), I'd say yes, Poseidon would work even without Z3SDRAM :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: Matt_H on November 16, 2013, 05:29:05 PM
Zorro II? Wow. I imagine that's ethernet only (or limited to 8MB of RAM)? Or have you managed some technical wizardry to get more than 8MB out of the Zorro II bus? If so, that's big news! :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 16, 2013, 06:17:19 PM
I doubt I could invent something to exceed the 8M barrier for ZII, but volunteers are welcome to try :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: B00tDisk on November 16, 2013, 06:31:01 PM
Quote from: tnt23;752841
Well, since Poseidon is a USB stack and Z3SDRAM has no USB host (yet), I'd say yes, Poseidon would work even without Z3SDRAM :)

USB over ethernet! (http://www.bb-elec.com/Products/USB-Connectivity/USB-Over-Ethernet/USB-Over-Ethernet-Hub.aspx?gclid=CI6p6__36boCFVRk7AodMB0AxQ&gclid=CI6p6__36boCFVRk7AodMB0AxQ&jadid=22990316046&jap={adposition}&jkId=gpt:pt_71668&js=1&jsid=31200&jt=1&jr=http://www.bb-elec.com/Products/USB-Connectivity/USB-Over-Ethernet/USB-Over-Ethernet-Hub.aspx&gclid=CI6p6__36boCFVRk7AodMB0AxQ)

Never thought I'd see that.

(And I just checked on ebay, you can get them much, much cheaper there.)
Title: Re: Zorro III memory card... now with Ethernet
Post by: Dr.Bongo on November 16, 2013, 08:18:04 PM
Very interesting. Are these for sale?
Title: Re: Zorro III memory card... now with Ethernet
Post by: jkonstan on November 17, 2013, 03:38:10 AM
The sch & pcb files you posted on the Google project page for Z3SDRAM appear to be Altium or PCAD. When you update the project page, are you going to post Gerber files for the PCB as well?
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 17, 2013, 09:23:10 AM
Quote from: Dr.Bongo;752860
Very interesting. Are these for sale?


I have one spare unpopulated PCB left. More PCBs could be ordered for those who don't mind sourcing the components, soldering and JTAGging on their own. Besides, there's no device driver for the network part yet.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 17, 2013, 09:29:43 AM
Quote from: jkonstan;752869
The sch & pcb files you posted on the Google project page for Z3SDRAM appear to be Altium or PCAD. When you update the project page, are you going to post Gerber files for the PCB as well?


The schematics and PCB are in the old PCad 200x format. My local PCB house happily accepts PCad PCB files, saving me the trouble of converting them to Gerber. If someone wants to do the conversion, they are welcome.

Anybody willing to help develop a device driver for the DM9000 is more than welcome, too.
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on November 17, 2013, 11:35:22 AM
Quote from: tnt23;752890
Anybody willing to help develop a device driver for the DM9000 is more than welcome, too.

Now that sounds tempting - if only I hadn't my hands full already :(

How do you feel about open sourcing the resulting driver? As things stand, we do not have enough SANA-II Ethernet driver examples which can be reviewed and adapted by anybody interested in the matter. The networking driver source code, as part of the original SANA-II kit, explains how one might create a SLIP driver, but that is so obsolete it's not even funny any more.
Title: Re: Zorro III memory card... now with Ethernet
Post by: polluks on November 17, 2013, 01:24:00 PM
Quote from: tnt23;752890
The schematics and PCB are in the old PCad 200x format. My local PCB house happily accepts PCad PCB files, saving me the trouble of converting them to Gerber. If someone wants to do the conversion, they are welcome.

Anybody willing to help develop a device driver for the DM9000 is more than welcome, too.

You may take a look at https://en.wikibooks.org/wiki/Aros/Developer/NICDriversDev
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 17, 2013, 01:39:55 PM
For the CS8900 driver I was using the 3C589 sources available on Aminet. Same thing with the DM9000 driver: both are based on (widely vandalized) 3C589 code. I was able to do TX and was fiddling with the RX part under Roadshow just before switching to the DM9000 chip.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 18, 2013, 07:00:18 PM
Quote from: olsen;752892

How do you feel about open sourcing the resulting driver? As things stand, we do not have enough SANA-II Ethernet driver examples which can be reviewed and adapted by anybody interested in the matter. The networking driver source code, as part of the original SANA-II kit, explains how one might create a SLIP driver, but that is so obsolete it's not even funny any more.


I would not mind sharing the driver at all, only there's a bit of a chicken-and-egg problem. To come up with a decent driver I'd like to look at a sample one written by someone else (and preferably not in assembler). It is really great that the 3c589.device sources are available, although the code seems rather complex to me. I think I'm going to hack it quickly to get a ping or DHCP reply before posting.
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on November 20, 2013, 08:50:27 AM
Quote from: tnt23;752975
I would not mind sharing the driver at all, only there's a bit of a chicken-and-egg problem. To come up with a decent driver I'd like to look at a sample one written by someone else (and preferably not in assembler). It is really great that the 3c589.device sources are available, although the code seems rather complex to me. I think I'm going to hack it quickly to get a ping or DHCP reply before posting.
I had a quick look at the 3c589.device source code, and it looks good to me (I'd probably rewrite the basic device I/O code to be more paranoid, though, and the cleanup procedures in case of errors should be reworked). It even has support for the SANA-IIR2 packet filter feature which some drivers choose to omit.

As far as I can tell from my past experience, 3c589.device is a good SANA-II Ethernet device driver design. It may appear to be complex, but this is in fact how you would implement this kind of driver. It properly separates the basic device I/O, the individual hardware units, and the low level hardware access. This is really how it's supposed to be done.

More code documentation would always be nice, but not everybody wants his code to be a didactic example ;)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 21, 2013, 04:21:40 PM
Quote from: olsen;753153
As far as I can tell from my past experience, 3c589.device is a good SANA-II Ethernet device driver design. It may appear to be complex, but this is in fact how you would implement this kind of driver. It properly separates the basic device I/O, the individual hardware units, and the low level hardware access. This is really how it's supposed to be done.

Well, give me some more time and we'll see the difference :) I have just implemented my first /INT2 server (which I am very proud of), and the resulting progress so far is slightly over zero.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 24, 2013, 05:33:22 PM
There are some minor issues, but the basic RX code is working. The SANAUTIL network tool opens the device, requests orphan packets and dumps them.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 26, 2013, 06:32:32 AM
TX code is working now, too. Packets sent from Amiga are seen on the remote side. (Will have to set up a way to grab screenshots instead of using my mobile phone's camera.)

(http://farm6.staticflickr.com/5478/11062719453_3ec4cb09d2_z.jpg)

Remote host (Raspberry Pi running tcpdump):

(http://farm3.staticflickr.com/2860/11062441075_87bd3198a5_z.jpg)

For that reason I bought the Roadshow TCP/IP stack last night :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on November 26, 2013, 08:20:02 AM
Quote from: tnt23;753448
TX code is working now, too. Packets sent from Amiga are seen on the remote side. (Will have to set up a way to grab screenshots instead of using my mobile phone's camera.)

Luxury. When I was a lad, we used to make screenshots by using the tiniest stubs of chewn and grubby crayons. But we were happy then.

Quote

For that reason I bought the Roadshow TCP/IP stack last night :)


I guess it's too late to talk you out of buying Roadshow, but the demo version should be able to handle test setups like these just fine ;)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 26, 2013, 08:50:03 AM
Quote from: olsen;753451

I guess it's too late to talk you out of buying Roadshow, but the demo version should be able to handle test setups like these just fine ;)


Well, I've got enough reasons of my own to reboot without having to deal with the demo's 30-minute expiration :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: Bobo68 on November 26, 2013, 12:07:43 PM
Quote from: tnt23;753448
(Will have to set up a way to grab screenshots instead of using my mobile phone's camera.)


sgrab ?
Title: Re: Zorro III memory card... now with Ethernet
Post by: Themamboman on November 26, 2013, 02:04:17 PM
Can you post a better picture of the actual card? Thanks!
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 26, 2013, 02:07:36 PM
Quote from: Bobo68;753456
sgrab ?


Will take a look. So far I have tried MasterGrabber and GrabScreen, both kinda work, producing IFF files, and I was hoping for JPG/PNG output.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 26, 2013, 02:28:07 PM
Quote from: Themamboman;753467
Can you post a better picture of the actual card? Thanks!


I'm afraid this one is the best I can do without my DSLR (another one taken with flash is even worse). Should you want some particular closeup just let me know :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on November 29, 2013, 04:38:09 PM
Quote from: olsen;753153
I had a quick look at the 3c589.device source code, and it looks good to me (I'd probably rewrite the basic device I/O code to be more paranoid, though, and the cleanup procedures in case of errors should be reworked).

So I've been having fun with sashimi and MiamiDX and at some point tracked down an issue with my driver's code freezing when calling the S2_CopyToBuff hook. I will have a look into the buffer management MiamiDX provides when opening the device, but here's a piece of code (borrowed from the above-mentioned 3c589 sources) that puzzles me.

An opener structure is being allocated:

Code:
/* Set up buffer-management structure and get hooks */

request->ios2_BufferManagement = opener = AllocVec (sizeof (struct Opener), MEMF_PUBLIC);

And then it is being filled using GetTagData () calls. If I get it right, GetTagData () returns either the value of the tag it found or the default value (second param). However, the same (uninitialized) opener->rx_function var is provided as the default value:

Code:
opener->rx_function = (APTR)GetTagData (rx_tags [i], (UPINT)opener->rx_function, tag_list);

This would probably make sense if the AllocVec () call used MEMF_CLEAR in addition to MEMF_PUBLIC, but this is not the case. Even if the allocated memory has been zeroed previously, why not provide NULL as the default value? The very same source uses NULL just a couple of lines further down:

Code:
opener->filter_hook = (APTR)GetTagData (S2_PacketFilter, NULL, tag_list);
opener->dma_tx_function = (APTR)GetTagData (S2_DMACopyFromBuff32, NULL, tag_list);

Or perhaps there is some reason for (not) doing so?

I looked into the cnetdevice assembler sources; there the default value is also NULL:

Code:
; get copyfrom functions:

 move.l  #S2_COPYFROMBUFF,d0
 moveq   #0,d1
 move.l  a2,a0
 jsr     _LVOGetTagData(a6)
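
The lookup behaviour in question can be modelled in a few lines of portable C. This is only a sketch for illustration: `struct tag_item`, `get_tag_data ()` and the tag numbers are made-up stand-ins, not the real utility.library definitions, and it assumes (as described above) that the default is returned unchanged when the tag is absent.

```c
/* Stand-ins for the utility.library types; all names and values here
   are hypothetical and exist only for this illustration. */
struct tag_item {
    unsigned long tag;
    unsigned long data;
};

#define TAG_END_MARK   0UL   /* terminates the list */
#define TAG_COPY_TO    1UL   /* pretend S2_CopyToBuff */
#define TAG_COPY_TO_16 2UL   /* pretend S2_CopyToBuff16 */

/* Return the data of the first matching tag, or def if the tag is absent. */
unsigned long get_tag_data(unsigned long tag, unsigned long def,
                           const struct tag_item *list)
{
    for (; list->tag != TAG_END_MARK; list++) {
        if (list->tag == tag)
            return list->data;
    }
    return def;   /* not found: the caller's default comes straight back */
}
```

If the default passed in is the content of a freshly allocated, uncleared structure field, a missing tag hands exactly that leftover value back to the caller.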

Title: Re: Zorro III memory card... now with Ethernet
Post by: nyteschayde on November 29, 2013, 09:20:51 PM
This is awesome and please don't take this question as a complaint, but with RAM being so cheap, why only 64MB? Is there some limitation that I'm unaware of? Is this so it can be used in Zorro II in addition to Zorro III? I actually am unaware of the limitation per card for these devices in regards to addressable memory.
Title: Re: Zorro III memory card... now with Ethernet
Post by: Vlabguy1 on November 29, 2013, 09:28:44 PM
Very cool..
Title: Re: Zorro III memory card... now with Ethernet
Post by: magnetic on November 30, 2013, 02:06:48 AM
Yes very exciting GOGOGOGOGGOGOGGOGO
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on November 30, 2013, 09:15:00 AM
Quote

...

This would probably make sense if the AllocVec () call used MEMF_CLEAR in addition to MEMF_PUBLIC, but this is not the case. Even if the allocated memory has been zeroed previously, why not provide NULL as the default value? The very same source uses NULL just a couple of lines further down:

Code:

opener->filter_hook = (APTR)GetTagData (S2_PacketFilter, NULL, tag_list);
opener->dma_tx_function = (APTR)GetTagData (S2_DMACopyFromBuff32, NULL, tag_list);


Or perhaps there is some reason for (not) doing so?


As far as I can tell (the callbacks are initialized exactly once), this is risky, and there is no benefit in initializing the callbacks in this manner.

I would not call it a bug, since every client of the SANA-II driver is likely to provide the proper callbacks. But if it does not, for some reason, then the device will crash.

The 3c589.device should be more paranoid, and verify that each parameter provided by the client is sound.

I already put a snapshot of the whole 3c589.device/pccard.library source code into my SVN repository, for rework, but there's been too little time to rework it so far :-(
Title: Re: Zorro III memory card... now with Ethernet
Post by: Bobo68 on November 30, 2013, 09:31:19 AM
Quote from: nyteschayde;753590
This is awesome and please don't take this question as a complaint, but with RAM being so cheap, why only 64MB? Is there some limitation that I'm unaware of? Is this so it can be used in Zorro II in addition to Zorro III? I actually am unaware of the limitation per card for these devices in regards to addressable memory.


there is a limit of chip size
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 01, 2013, 08:32:47 AM
Quote from: nyteschayde;753590
This is awesome and please don't take this question as a complaint, but with RAM being so cheap, why only 64MB? Is there some limitation that I'm unaware of? Is this so it can be used in Zorro II in addition to Zorro III? I actually am unaware of the limitation per card for these devices in regards to addressable memory.


64M was the biggest SDRAM chip I was able to find in TSSOP package. The common approach is to have two or more chips on board, but I wasn't brave enough to route another one. Zorro III itself is able to address more than 1G per card.

Speaking of Zorro II, the limit is 8M there, unless someone comes up with some sort of banking driver or something.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 01, 2013, 10:44:20 AM
Quote from: olsen;753602
As far as I can tell (the callbacks are initialized exactly once), this is risky, and there is no benefit in initializing the callbacks in this manner.

I would not call it a bug, since every client of the SANA-II driver is likely to provide the proper callbacks. But if it does not, for some reason, then the device will crash.

The 3c589.device should be more paranoid, and verify that each parameter provided by the client is sound.

I already put a snapshot of the whole 3c589.device/pccard.library source code into my SVN repository, for rework, but there's been too little time to rework it so far :-(


Well, this indeed does not seem like a bug, at least no one complained so far. I think I understand what this code should do: since the S2_CopyToBuff is obligatory, the RX hook will be set to S2_CopyToBuff first. If the caller provides S2_CopyToBuff16 then the hook will be assigned that new tag value; otherwise, it will stick to S2_CopyToBuff, and so on. That way, as it seems to me, the request will be serviced using the fastest hook the caller provides.

Anyway, the bug on my side was so silly it isn't even worth mentioning. Time to dig into DHCP (and try Sgrab):

(http://farm3.staticflickr.com/2882/11148709935_b41c911712_z.jpg)
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on December 01, 2013, 07:20:47 PM
Quote from: tnt23;753638
Well, this indeed does not seem like a bug, at least no one complained so far. I think I understand what this code should do: since the S2_CopyToBuff is obligatory, the RX hook will be set to S2_CopyToBuff first. If the caller provides S2_CopyToBuff16 then the hook will be assigned that new tag value; otherwise, it will stick to S2_CopyToBuff, and so on.


Yes, that seems to be the intention. However, if S2_CopyToBuff were missing, and S2_CopyToBuff16 were missing, too, then the code will end up using an uninitialized pointer, which should be caught before it happens. Same goes for the S2_CopyFromBuff tags.
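
A rough sketch of what "caught before it happens" could look like, in portable C rather than real driver code. Everything here is hypothetical: `struct tag_item`, `find_tag ()`, `struct opener` and the tag numbers only imitate the pattern from the 3c589 snippet, and the error handling is one possible choice, not what any shipped driver does.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real SANA-II or utility.library types. */
struct tag_item {
    unsigned long tag;
    void *data;
};

#define TAG_LIST_END   0UL
#define TAG_COPY_TO    1UL   /* pretend S2_CopyToBuff */
#define TAG_COPY_TO_16 2UL   /* pretend S2_CopyToBuff16 */

struct opener {
    void *rx_function;
};

/* Same idea as GetTagData (): return def when the tag is not in the list. */
void *find_tag(unsigned long tag, void *def, const struct tag_item *list)
{
    for (; list->tag != TAG_LIST_END; list++) {
        if (list->tag == tag)
            return list->data;
    }
    return def;
}

/* Walk the RX tags in order, each lookup keeping the previous result as
   its default. Returns 0 on success, -1 if no RX callback was supplied. */
int setup_rx_hook(struct opener *op, const struct tag_item *list)
{
    static const unsigned long rx_tags[] = { TAG_COPY_TO, TAG_COPY_TO_16 };
    size_t i;

    op->rx_function = NULL;   /* explicit start value, no reliance on the allocator */
    for (i = 0; i < sizeof rx_tags / sizeof rx_tags[0]; i++)
        op->rx_function = find_tag(rx_tags[i], op->rx_function, list);

    return op->rx_function != NULL ? 0 : -1;   /* refuse to open without a hook */
}
```

With the explicit NULL start value the chaining still picks the last tag the client supplied, and a client that supplies none is rejected at open time instead of crashing on the first received packet.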

Quote

That way, as it seems to me, the request will be serviced using the fastest hook the caller provides.

The purpose of S2_CopyToBuff16 is not to speed up copying. It is the counterpart to the S2_CopyFromBuff16 tag, which is a workaround for a hardware bug. As far as I know this bug only exists in one type of Amiga Ethernet card, which is the original "Ariadne".

There is a bug in how byte-sized Zorro II accesses to the card are handled. These are treated like word-sized accesses, which means that garbage data will go out or come in the high order byte. This isn't much of a problem for reading (if you read a byte from the receive buffer, you'll probably write it back as a byte, too), but if you write bytes to the Ariadne buffer, this will trash half the buffer contents.

The solution is to copy only in word-sized portions to the buffer, or in long-sized portions if possible. For this purpose the ariadne.device allocates a side-buffer, which all writes will go through. First the data will be copied into the side-buffer, then the side-buffer will be copied quickly to the transmit buffer on the card (in long-sized portions). Problem solved, but at the expense of speed.
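
The two-step copy can be sketched roughly like this in portable C. It is not the ariadne.device code: the function name, the buffer size and the use of a plain array in place of the card's transmit buffer are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIDE_BUF_LONGS 380   /* 1520 bytes; assumed to be enough for one frame */

/* Copy len bytes to the (simulated) card buffer using only 32-bit stores.
   Returns the number of longwords written, or 0 if the frame does not fit. */
size_t copy_via_side_buffer(volatile uint32_t *card, const void *frame, size_t len)
{
    static uint32_t side[SIDE_BUF_LONGS];
    size_t longs = (len + 3) / 4;      /* round up to whole longwords */
    size_t i;

    if (longs > SIDE_BUF_LONGS)
        return 0;

    side[longs ? longs - 1 : 0] = 0;   /* give the padding bytes a defined value */
    memcpy(side, frame, len);          /* step 1: any access size, but in normal RAM */
    for (i = 0; i < longs; i++)        /* step 2: long-sized stores to the card only */
        card[i] = side[i];
    return longs;
}
```

The extra pass through the side-buffer is where the speed goes: every transmitted byte is copied twice.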

The S2_CopyFromBuff16 method solves the problem by requiring that the client copies only in word-sized portions (or long-sized portions). As far as I know, no ariadne.device with the S2_CopyToBuff16 method enabled was ever shipped. The ariadne.device supports a different method, which is functionally identical to S2_CopyToBuff16. The tag ID for this method is (S2_Dummy + 1968). I suppose the ariadne.device author (Stefan Sticht, if I remember correctly) may have been born in 1968 ;)

Put another way, no driver is really required to support the S2_CopyFromBuff16 method unless the driver really, really needs it.
Quote

Anyway, the bug on my side was so silly it isn't even worth mentioning. Time to dig into DHCP (and try Sgrab):

(http://farm3.staticflickr.com/2882/11148709935_b41c911712_z.jpg)

Hm... does the DHCP negotiation succeed, eventually? If not, have you tried tcpdump yet?
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 02, 2013, 05:45:22 AM
Quote from: olsen;753651
Yes, that seems to be the intention. However, if S2_CopyToBuff were missing, and S2_CopyToBuff16 were missing, too, then the code will end up using an uninitialized pointer, which should be caught before it happens. Same goes for the S2_CopyFromBuff tags.

The purpose of S2_CopyToBuff16 is not to speed up copying. It is the counterpart to the S2_CopyFromBuff16 tag, which is a workaround for a hardware bug. As far as I know this bug only exists in one type of Amiga Ethernet card, which is the original "Ariadne".

That's fascinating :) One would think the 16/32 buffer management routines were introduced into SANA with performance in mind, not as workarounds for one particular bug.
Quote

Put another way, no driver is really required to support the S2_CopyFromBuff16 method unless the driver really, really needs it.

Since the DM9000 in my design is wired in 16 bits, and I tend to use word accesses wherever possible, using x16 routines would be preferable in my case.
Quote

Hm... does the DHCP negotiation succeed, eventually? If not, have you tried tcpdump yet?


No, DHCP gives up after a one-minute timeout. I suspect there are at least two reasons for that: first, TX queueing is not done properly, and second, there is a good load of KPrintF () calls all over the code, running at 9600 baud by default. If the serial debug routines are blocking, this would also impact timings. I will change the speed to 115200 and fix the TX queueing.
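
To put a number on the serial overhead, here is a back-of-the-envelope helper. It assumes 8N1 framing (10 bits on the wire per character) and that the debug output really does block until the last character is out; both are assumptions, and the function name is made up.

```c
/* Milliseconds needed to push `chars` characters through a serial port,
   assuming 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per character. */
double serial_time_ms(unsigned chars, unsigned baud)
{
    return (double)chars * 10.0 * 1000.0 / (double)baud;
}
```

Under those assumptions an 80-character debug line costs about 83 ms at 9600 baud and about 7 ms at 115200, so a handful of such lines per packet would already be noticeable.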

Haven't tried tcpdump yet, but definitely will :)
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on December 02, 2013, 07:52:30 AM
Quote from: tnt23;753674
That's fascinating :) One would think the 16/32 buffer management routines were introduced into SANA with performance in mind, not as workarounds for one particular bug.

Since the DM9000 in my design is wired in 16 bits, and I tend to use word accesses wherever possible, using x16 routines would be preferable in my case.


I would not recommend it. The 16/32 bit copy functions require that the data being copied is aligned to a particular address boundary, and that in itself is a restriction. That restriction may be necessary (if your hardware chokes on unaligned accesses, which would be rather unfortunate), but it does not produce speed gains. On the contrary: the 68030 would benefit from word-sized access restrictions, but since the Zorro II space is marked as non-cacheable there would be no advantage after all. And on a Zorro III board that question wouldn't even come up.

Sticking with S2_CopyFromBuff/S2_CopyToBuff has no downsides. Any client (e.g. TCP/IP stack) should use optimized copying code which would automatically use long-sized accesses.

So, in a nutshell: your driver should use S2_CopyFromBuff/S2_CopyToBuff and ignore everything else, unless your hardware has very specific requirements for which the 16/32 bit aligned copying functions would solve a really big problem.

The same goes for the S2_DMACopyFromBuff32/S2_DMACopyToBuff32 functions: unless your hardware supports this functionality perfectly (that is, it actually supports DMA to/from arbitrary 32 bit aligned addresses) don't bother implementing it. The benefits of these functions are very, very small if you don't support DMA. You might be able to skip one copying step inside the TCP/IP stack, but the gains are small. One case (perhaps the only case) in which the gains are not so small is the PPPoE driver which I cooked up, and which is practically useless today :(

Quote

No, DHCP gives up after a one-minute timeout. I suspect there are at least two reasons for that: first, TX queueing is not done properly, and second, there is a good load of KPrintF () calls all over the code, running at 9600 baud by default. If the serial debug routines are blocking, this would also impact timings. I will change the speed to 115200 and fix the TX queueing.

Haven't tried tcpdump yet, but definitely will :)


tcpdump is worth a shot if you suspect that traffic has gone missing which should have been processed by the TCP/IP stack. Readability of the output tends to be a rather mixed bag, though, so this might be a good idea only if all other options have been exhausted (or if you create binary capture files and view them in "Wireshark").
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 03, 2013, 05:41:40 AM
Thank you Olsen, after spending some time studying tcpdump and sashimi logs I came to the conclusion that the card simply wasn't picking up the DHCP ACK from the server. No wonder, since the code responsible for multicast/broadcast stuff was, ahem, mostly commented out.

So I went and cowardly let the card accept all and every frame to see if this was an issue. Bingo!
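
For comparison, the usual destination-address check, written as a software filter in portable C. This is a generic Ethernet sketch, not DM9000 register programming and not code from dm9000.device; `frame_wanted ()` and its parameters are invented for illustration. DHCP replies may arrive as broadcast frames, which is why the broadcast case matters here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Decide whether a frame with destination address dst should be accepted
   by a station whose own address is own. */
bool frame_wanted(const uint8_t dst[6], const uint8_t own[6], bool accept_multicast)
{
    static const uint8_t broadcast[6] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };

    if (memcmp(dst, own, 6) == 0)
        return true;                    /* unicast addressed to us */
    if (memcmp(dst, broadcast, 6) == 0)
        return true;                    /* broadcast, e.g. ARP or DHCP replies */
    if ((dst[0] & 0x01) && accept_multicast)
        return true;                    /* group bit set: multicast */
    return false;                       /* someone else's traffic */
}
```

Accepting every frame works as a test, but it also hands the stack all the traffic this check would have dropped.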

(http://farm6.staticflickr.com/5488/11184159354_8863522dbe_z.jpg)

Ping is reporting duplicates, and FTP won't work even in passive mode, but being connected makes me feel better.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 04, 2013, 01:16:50 PM
Quote from: olsen;753681

Sticking with S2_CopyFromBuff/S2_CopyToBuff has no downsides. Any client (e.g. TCP/IP stack) should use optimized copying code which would automatically use long-sized accesses.


In Roadshow, if the COPYMODE=FAST option is set, buffer management will offer S2_CopyFromBuff16. Is there a way to have it also provide S2_CopyToBuff16? I can imagine an environment where using word-sized and word-aligned access would indeed speed things up on the device driver's side compared with S2_CopyFromBuff/S2_CopyToBuff.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 04, 2013, 07:37:20 PM
Quote from: tnt23;753795
In Roadshow, if the COPYMODE=FAST option is set, buffer management will offer S2_CopyFromBuff16. Is there a way to have it also provide S2_CopyToBuff16? I can imagine an environment where using word-sized and word-aligned access would indeed speed things up on the device driver's side compared with S2_CopyFromBuff/S2_CopyToBuff.


Here's what I've been looking into: (http://wiki.amigaos.net/index.php/Revision_3)

Code:

   These are optional callbacks presented to the device with the
   same calling interface as for S2_CopyToBuff or S2_CopyFromBuff,
   respectively. The difference to the original callbacks is the
   required and guaranteed transfer size and alignment for
   accessing the device's buffer for a single piece of a data of
   either 16 or 32 bits, a data word. The copy function called may
   only use 16/32 bit aligned read/write commands of 16/32 bits at
   once to transfer the data words, respectively. If the buffer
   data length is not a multiple of the required data word
   transfer size, the last data word transfer may contain garbage
   padding in either transfer direction.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 09, 2013, 04:32:42 PM
That's what I get with the non-debug version of dm9000.device: an A4000 with 68030/25MHz, 2MB of Chip RAM, 0MB of Fast RAM, and 64MB of Zorro III RAM clocked at 100MHz.

Code:
NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  135.32 KByte/s
Sending to client, packet size  1k ...  7.59 KByte/s
Receiving from client, packet size  2k ...  143.53 KByte/s
Sending to client, packet size  2k ...  149.24 KByte/s
Receiving from client, packet size  4k ...  146.89 KByte/s
Sending to client, packet size  4k ...  151.80 KByte/s
Receiving from client, packet size  8k ...  142.36 KByte/s
Sending to client, packet size  8k ...  156.03 KByte/s
Receiving from client, packet size 16k ...  144.04 KByte/s
Sending to client, packet size 16k ...  155.74 KByte/s
Receiving from client, packet size 32k ...  134.22 KByte/s
Sending to client, packet size 32k ...  157.05 KByte/s
Done.

I wonder if tweaking the priorities of the RX/TX routines would give any boost. I will also try moving to the INT6 chain, although I don't think this will improve things dramatically. The CNet driver is able to squeeze ~500KBytes through the pccard interface, which also shares the INT2 interrupt.

I can use WGET to upgrade the device driver by simply pulling the new version from my PC over HTTP, so I'd say a single TCP connection works more or less stably. (Obviously, rather less.) A mix of WGETs and pings also runs in parallel quite all right, with sanautil on top of that. However, when FTP opens a second socket in passive mode, it never gets the remote directory listing. I can see the listing in the tcpdumped packets, so the device driver probably does something odd to them upon reception.

Oh, and MiamiDX cannot complete DHCP configuration for some reason, as opposed to Roadshow. Perhaps I will need more packet dumping inside the device driver.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 10, 2013, 11:14:32 AM
Have just resolved the FTP issue.

(http://farm8.staticflickr.com/7349/11304080864_ea04b762ab_z.jpg)

This also seems to fix the small packet transfer speed. According to NetIO test, Tx/Rx is around 130K in both directions.
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on December 10, 2013, 12:22:42 PM
Quote from: tnt23;753805
Here's what I've been looking into: (http://wiki.amigaos.net/index.php/Revision_3)

Code:
  These are optional callbacks presented to the device with the
   same calling interface as for S2_CopyToBuff or S2_CopyFromBuff,
   respectively. The difference to the original callbacks is the
   required and guaranteed transfer size and alignment for
   accessing the device's buffer for a single piece of a data of
   either 16 or 32 bits, a data word. The copy function called may
   only use 16/32 bit aligned read/write commands of 16/32 bits at
   once to transfer the data words, respectively. If the buffer
   data length is not a multiple of the required data word
   transfer size, the last data word transfer may contain garbage
   padding in either transfer direction.

I don't know if this has been clarified yet.

The purpose of 16 or 32 bit variants of the S2_CopyToBuff and S2_CopyFromBuff callbacks is to restrict all copying to operations which transfer data in amounts of a specific granularity. In the 16 bit variant, only 16 or 32 bit transfer operations will be used. In the 32 bit variant, only 32 bit transfer operations will be used. By contrast, the S2_CopyToBuff and S2_CopyFromBuff methods will use 8, 16 or 32 bit transfer operations, as necessary.
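
As an illustration of the granularity rule, here is a copy loop that touches both buffers with 16-bit accesses only. It is a portable sketch, not an actual S2_CopyToBuff16 implementation; the function name is invented, and using 16-bit transfers exclusively is a deliberately conservative choice that fits either reading of the 16 bit variant.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes using only 16-bit loads and stores. Both pointers must be
   16-bit aligned. If len is odd, the last transfer carries one extra
   padding byte along. Returns the number of 16-bit words moved. */
size_t copy_words16(volatile uint16_t *dst, const volatile uint16_t *src, size_t len)
{
    size_t words = (len + 1) / 2;   /* round up to whole words */
    size_t i;

    for (i = 0; i < words; i++)
        dst[i] = src[i];            /* one word-sized access on each side */
    return words;
}
```

Rounding the length up is what produces the padding in the last data word when the frame length is not a multiple of the transfer size.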

The S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 callbacks transfer data to a contiguous buffer. If your hardware has no such contiguous buffer to transfer data to, you will have to copy the data to a contiguous side-buffer, which is then given to S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 to process.

It works exactly the same with the S2_CopyToBuff/S2_CopyToBuff16/S2_CopyToBuff32 callbacks, except that the data is transferred in the opposite direction.

You may be able to avoid using a contiguous side-buffer if the TCP/IP stack supports the S2_DMACopyToBuff32 and S2_DMACopyFromBuff32 callbacks. With these callback functions, you may receive a pointer to a contiguous buffer which is at least as large as you requested. You may then access this buffer and directly copy to/from it. Note that you may get a NULL pointer if no such buffer is available, in which case you would need to fall back to calling S2_CopyToBuff or S2_CopyFromBuff, respectively.
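
The fallback logic described above might be sketched in C as follows. This is only an illustration under simplifying assumptions: the struct, the hook types and the demo client are hypothetical, and the real SANA-II callbacks are handed over by the client when it opens the device and do not use these plain C signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the client-supplied hooks. */
typedef void *(*dma_hook_fn)(void *cookie);  /* returns buffer or NULL */
typedef int   (*copy_hook_fn)(void *cookie, const void *from, size_t len);

struct client_hooks {
    dma_hook_fn  dma_copy_to_buff32;  /* NULL if the stack has no DMA hook */
    copy_hook_fn copy_to_buff;        /* regular copy hook, always present */
};

/* Hand a received frame to the client. Returns 1 on success, 0 on failure. */
static int deliver_frame(const struct client_hooks *h, void *cookie,
                         const void *frame, size_t len)
{
    assert(h != NULL && h->copy_to_buff != NULL);

    if (h->dma_copy_to_buff32 != NULL) {
        void *dst = h->dma_copy_to_buff32(cookie);
        if (dst != NULL) {
            memcpy(dst, frame, len);  /* stand-in for the driver's own copy */
            return 1;
        }
        /* NULL pointer: no contiguous buffer available, fall through. */
    }
    return h->copy_to_buff(cookie, frame, len);
}

/* Demo client for illustration only: the cookie owns a small buffer. */
struct demo_cookie {
    uint8_t buf[64];
    int     dma_ok;          /* when 0, the DMA hook returns NULL */
    int     fallback_calls;  /* counts uses of the regular copy hook */
};

static void *demo_dma(void *cookie)
{
    struct demo_cookie *c = cookie;
    return c->dma_ok ? c->buf : NULL;
}

static int demo_copy(void *cookie, const void *from, size_t len)
{
    struct demo_cookie *c = cookie;
    if (len > sizeof c->buf)
        return 0;
    memcpy(c->buf, from, len);
    c->fallback_calls++;
    return 1;
}
```

With the demo client, deliver_frame() uses the regular copy hook while dma_ok is 0 and writes straight into the returned buffer once dma_ok is set.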
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 12, 2013, 07:23:49 AM
Quote from: olsen;754124
I don't know if this has been clarified yet.

The purpose of 16 or 32 bit variants of the S2_CopyToBuff and S2_CopyFromBuff callbacks is to restrict all copying to operations which transfer data in amounts of a specific granularity. In the 16 bit variant, only 16 or 32 bit transfer operations will be used. In the 32 bit variant, only 32 bit transfer operations will be used. By contrast, the S2_CopyToBuff and S2_CopyFromBuff methods will use 8, 16 or 32 bit transfer operations, as necessary.

Frankly speaking, I don't understand why, for the 16-bit case, there would be any 32-bit transfer at all. Say we need to transfer two 16-bit words while respecting both size AND alignment: then it should look like two "move.w (src)+, (dst)+" instructions, should it not? The addressing will be done in words, and that's nice. To my mind this is not equivalent to one "move.l (src)+, (dst)+" instruction, as the latter breaks both the size constraint (transferring 32 bits at once) and the alignment constraint (crossing the 16 bit boundary).

Quote
The S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 callbacks transfer data to a contiguous buffer. If your hardware has no such contiguous buffer to transfer data to, you will have to copy the data to a contiguous side-buffer, which is then given to S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 to process.

That's exactly what I am trying to figure out. It is possible to implement the said contiguous buffer on my card, with the restriction that it should only be accessed 16 bits at a time, at even addresses only. If the S2_CopyFromBuff16/S2_CopyToBuff16 hooks followed that "move.w (src)+, (dst)+" restriction, everything should work smoothly, and that would eliminate the need for any side buffering, saving memory and improving performance.

Now, if the S2_CopyFromBuff16/S2_CopyToBuff16 hooks at some point won't follow the granularity convention and decide to switch to transferring 32 bits at once, that would break the whole idea, I think.

Quote
You may be able to avoid using a contiguous side-buffer if the TCP/IP stack supports the S2_DMACopyToBuff32 and S2_DMACopyFromBuff32 callbacks. With these callback functions, you may receive a pointer to a contiguous buffer which is at least as large as you requested. You may then access this buffer and directly copy to/from it. Note that you may get a NULL pointer if no such buffer is available, in which case you would need to fall back to calling S2_CopyToBuff or S2_CopyFromBuff, respectively.

I understand the DMA callbacks idea better now :) In fact, I am trying to do exactly that: checking if the DMA hook is available, then asking for the pointer etc. It even seems to work, although it is slow as hell. A lot to check on my side.

So, back to our 16-bit stuff. Do you think it would be feasible to implement that 'strict' behaviour for S2_CopyFromBuff16/S2_CopyToBuff16 in Roadshow?


UPDATE. I'm afraid I have been terribly wrong: the hardware buffer on my side could only be arranged for long-aligned 16-bit access :(
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on December 19, 2013, 07:28:12 AM
Quick update regarding performance. This is the best the driver can do so far (stock A4000 with 68030@25MHz I guess? no Fast RAM, Zorro memory running at 120MHz), compiled for 030 with -O3.

(http://farm8.staticflickr.com/7381/11446622684_bce3a06e02_o.png)

On the PC side, netio reports RX faster by ~30K.

With Fast RAM, rx/tx speeds increase slightly by ~50K in both directions. I guess I'll leave it as it is for now, will try various optimizations later.
Title: Re: Zorro III memory card... now with Ethernet
Post by: olsen on December 19, 2013, 08:52:57 AM
Quote from: tnt23;754231
Frankly speaking, I don't understand why, for the 16-bit case, there would be any 32-bit transfer at all. Say we need to transfer two 16-bit words while respecting both size AND alignment: then it should look like two "move.w (src)+, (dst)+" instructions, should it not? The addressing will be done in words, and that's nice. To my mind this is not equivalent to one "move.l (src)+, (dst)+" instruction, as the latter breaks both the size constraint (transferring 32 bits at once) and the alignment constraint (crossing the 16 bit boundary).


There are two reasons.

The first is historic: up until very recently (and with the exception of the DKB WildFire, which I believe was capable of 32 bit wide memory access) all Amiga Ethernet hardware was either accessible only through the Zorro II bus, or did not permit 32 bit wide memory access. On the Zorro II bus, a 32 bit wide access will be broken up into two consecutive 16 bit accesses. How this worked out with hardware which could not support 32 bit wide accesses was up to the glue logic on the board.

The second is performance: the ratio of instructions executed vs. the amount of data copied is terrible for "move.w (a0)+,(a1)+", less terrible for "move.l (a0)+,(a1)+" and becomes better if you can leverage "movem.l (a0)+,d1-d7/a2-a6 ; movem.l d1-d7/a2-a6,(a1)+" style copying (better still if you can unroll the copying loop in which movem.l is used).

I stopped counting execution cycles more than 15 years ago, but I believe that performance of even an unrolled "move.w (a0)+,(a1)+" loop will be quite poor.
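
To illustrate the performance argument in C rather than 68k assembly: a 16 bit loop needs len/2 transfers, a 32 bit loop needs len/4, and unrolling spreads the loop overhead over several transfers. The sketch below is hypothetical (copy_longs_unrolled is not Roadshow's actual routine) and assumes 32 bit aligned buffers and a length that is a multiple of four.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: copy `len` bytes between 32-bit aligned
 * buffers using 32-bit transfers, unrolled four long words per pass. */
static void copy_longs_unrolled(uint32_t *dst, const uint32_t *src, size_t len)
{
    assert(((uintptr_t)dst & 3u) == 0 && ((uintptr_t)src & 3u) == 0);
    assert((len & 3u) == 0);        /* whole long words only, for brevity */

    size_t longs = len / 4;

    while (longs >= 4) {            /* 16 bytes per loop iteration */
        dst[0] = src[0];
        dst[1] = src[1];
        dst[2] = src[2];
        dst[3] = src[3];
        dst += 4;
        src += 4;
        longs -= 4;
    }
    while (longs--)                 /* remaining 0..3 long words */
        *dst++ = *src++;
}
```

How much this gains on real hardware depends on the CPU, the bus and the compiler, so it would need to be measured rather than assumed.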

Roadshow contains a restricted version of the original, optimized copying function, with the restriction being that only 16 and 32 bit copying operations are used. The goal was to provide better performance than the S2_CopyFromBuff/S2_CopyToBuff callbacks could, and this was done specifically for the "Ariadne".

There is a slow "move.w (a0)+,(a1)+" variant available in Roadshow already. It is enabled by default, but all the example interface configuration files disable it. To switch back to the slow variant, either remove the "copymode=fast" parameter from the respective interface file, or replace it with "copymode=slow".

Quote

That's exactly what I am trying to figure out. It is possible to implement the said contiguous buffer on my card, with the restriction that it should only be accessed 16 bits at a time, at even addresses only. If the S2_CopyFromBuff16/S2_CopyToBuff16 hooks followed that "move.w (src)+, (dst)+" restriction, everything should work smoothly, and that would eliminate the need for any side buffering, saving memory and improving performance.

Now, if the S2_CopyFromBuff16/S2_CopyToBuff16 hooks at some point won't follow the granularity convention and decide to switch to transferring 32 bits at once, that would break the whole idea, I think.


Could be, but then your code needs to be able to handle the regular S2_CopyFromBuff/S2_CopyToBuff callbacks, which are likely going to be much worse in terms of performance. You will always have to be able to provide for a side-buffer, in case S2_CopyFromBuff/S2_CopyToBuff callbacks are invoked and the client offers no alternative callbacks.

Quote

I understand the DMA callbacks idea better now :) In fact, I am trying to do exactly that: checking if the DMA hook is available, then asking for the pointer etc. It even seems to work, although it is slow as hell. A lot to check on my side.

So, back to our 16-bit stuff. Do you think it would be feasible to implement that 'strict' behaviour for S2_CopyFromBuff16/S2_CopyToBuff16 in Roadshow?


See above: it's already supported :)

Quote

UPDATE. I'm afraid I have been terribly wrong: the hardware buffer on my side could only be arranged for long-aligned 16-bit access :(


If you can make it appear on a 32 bit aligned start address, then testing it with Roadshow's built-in slow 16 bit copy callback might just work out.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on March 27, 2014, 07:17:48 PM
Haven't posted for a while. Not much progress on the software side, was building more cards to test.

(https://farm8.staticflickr.com/7224/13427198063_93cd8b86d3.jpg)
Title: Re: Zorro III memory card... now with Ethernet
Post by: Plaz on March 28, 2014, 02:27:31 AM
Great work and very nice looking cards. I hope you continue to have success.

Plaz
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on May 29, 2014, 07:31:36 PM
Got hold of a Cyberstorm MK3 (060@50MHz), and the benchmarks get way better:

Code:
NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  947.98 KByte/s
Sending to client, packet size  1k ...  659.21 KByte/s
Receiving from client, packet size  2k ...  1057.02 KByte/s
Sending to client, packet size  2k ...  895.88 KByte/s
Receiving from client, packet size  4k ...  1119.78 KByte/s
Sending to client, packet size  4k ...  1269.84 KByte/s
Receiving from client, packet size  8k ...  1183.08 KByte/s
Sending to client, packet size  8k ...  1380.43 KByte/s
Receiving from client, packet size 16k ...  1212.76 KByte/s
Sending to client, packet size 16k ...  1411.63 KByte/s
Receiving from client, packet size 32k ...  1207.01 KByte/s
Sending to client, packet size 32k ...  1427.19 KByte/s
Done.
Title: Re: Zorro III memory card... now with Ethernet
Post by: HammerD on May 29, 2014, 08:53:56 PM
Quote from: tnt23;765274
Got hold of a Cyberstorm MK3 (060@50MHz), and the benchmarks get way better:

Code:
NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  947.98 KByte/s
Sending to client, packet size  1k ...  659.21 KByte/s
Receiving from client, packet size  2k ...  1057.02 KByte/s
Sending to client, packet size  2k ...  895.88 KByte/s
Receiving from client, packet size  4k ...  1119.78 KByte/s
Sending to client, packet size  4k ...  1269.84 KByte/s
Receiving from client, packet size  8k ...  1183.08 KByte/s
Sending to client, packet size  8k ...  1380.43 KByte/s
Receiving from client, packet size 16k ...  1212.76 KByte/s
Sending to client, packet size 16k ...  1411.63 KByte/s
Receiving from client, packet size 32k ...  1207.01 KByte/s
Sending to client, packet size 32k ...  1427.19 KByte/s
Done.


Those are very good transfer rates with the 060 :-)
Title: Re: Zorro III memory card... now with Ethernet
Post by: amigean on May 30, 2014, 12:30:50 AM
this looks very cool - quite an achievement to build this thing from scratch.

I'd love one of them when you're ready to produce them.
Title: Re: Zorro III memory card... now with Ethernet
Post by: freqmax on May 30, 2014, 01:31:32 AM
Regarding networking, it seems all driver architectures suffer from various bugs:
    * AS225r1 for the A2065 Ethernet uses a hardcoded driver.
    * SANA-II suffers from an inefficient buffer handling scheme and lacks proper support for promiscuous and multicast modes.
    * Miami Network Interface (MNI) was abandoned without support, and it still lacks some Ethernet capabilities.

So making a new bsd socket layer and proper hardware driver abstraction API might be a really good deed. Booting from network is also something that is kind of missing. Btw, did you add 32-bit transfers to speed up things? DMA transfers?

How many layers does your card need? And what was the price to produce just the PCB? (via, tin, lacquer, etc. options?)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on June 02, 2014, 07:39:35 AM
Quote from: freqmax;765291

So making a new bsd socket layer and proper hardware driver abstraction API might be a really good deed. Booting from network is also something that is kind of missing.


Probably, but who's gonna make it, and who's gonna write driver replacements for all legacy hardware out there?

Quote
Btw, did you add 32-bit transfers to speed up things? DMA transfers?


No, there are neither 32 bit nor DMA transfers.

Quote
How many layers does your card need? And what was the price to produce just the PCB? (via, tin, lacquer, etc. options?)


The card has 4 layers, and the price of the first (experimental and express) batch was, I'd say, decent: something like 50 euro per board. Subsequent batches are of course cheaper. I could look up exact PCB parameters like copper thickness, track width and via size. Why do you ask?
Title: Re: Zorro III memory card... now with Ethernet
Post by: freqmax on June 02, 2014, 02:34:39 PM
Quote from: tnt23;765537
Probably, but who's gonna make it, and who's gonna write driver replacements for all legacy hardware out there?


The important step is to create an infrastructure that others can fill in, not to do it all oneself.

Quote from: tnt23;765537
The card has 4 layers, and the price of the first (experimental and express) batch was, I'd say, decent: something like 50 euro per board. Subsequent batches are of course cheaper. I could look up exact PCB parameters like copper thickness, track width and via size. Why do you ask?


Curious for other projects probably involving FPGA, ARM, MIPS etc.
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on June 02, 2014, 08:15:47 PM
The PCB is 4 layers, FR-4, 18 µm copper, 0.2mm/0.2mm tracks, 0.3mm/0.7mm vias, two-sided solder mask, no silkscreen. Continuity check, express production (4-5 days) by a local PCB house in Saint Petersburg, so your mileage will most likely vary.
Title: Re: Zorro III memory card... now with Ethernet
Post by: freqmax on June 03, 2014, 02:12:01 AM
Dunno if Saint Petersburg is local though ;)
Title: Re: Zorro III memory card... now with Ethernet
Post by: tnt23 on June 03, 2014, 07:29:47 AM
Quote from: freqmax;765598
Dunno if Saint Petersburg is local though ;)


Unless you're in Florida :)