Topic: Zorro III memory card... now with Ethernet

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #14 on: December 01, 2013, 08:32:47 AM »
Quote from: nyteschayde;753590
This is awesome and please don't take this question as a complaint, but with RAM being so cheap, why only 64MB? Is there some limitation that I'm unaware of? Is this so it can be used in Zorro II in addition to Zorro III? I actually am unaware of the limitation per card for these devices in regards to addressable memory.


64M was the biggest SDRAM chip I was able to find in a TSSOP package. The common approach is to have two or more chips on board, but I wasn't brave enough to route another one. Zorro III itself is able to address more than 1G per card.

Speaking of Zorro II, the limit there is 8M, unless someone comes up with some sort of banking driver or something.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #15 on: December 01, 2013, 10:44:20 AM »
Quote from: olsen;753602
As far as I can tell (the callbacks are initialized exactly once), this is risky, and there is no benefit in initializing the callbacks in this manner.

I would not call it a bug, since every client of the SANA-II driver is likely to provide the proper callbacks. But if it does not, for some reason, then the device will crash.

The 3c589.device should be more paranoid, and verify that each parameter provided by the client is sound.

I already put a snapshot of the whole 3c589.device/pccard.library source code into my SVN repository, for rework, but there's been too little time to rework it so far :-(


Well, this indeed does not seem like a bug, at least no one has complained so far. I think I understand what this code should do: since S2_CopyToBuff is obligatory, the RX hook will be set to S2_CopyToBuff first. If the caller provides S2_CopyToBuff16 then the hook will be assigned that new tag value; otherwise, it will stick to S2_CopyToBuff, and so on. That way, as it seems to me, the request will be serviced using the fastest hook the caller provides.
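
The selection logic described above, plus the sanity check olsen asks for, could look roughly like the sketch below. It is not the 3c589.device code: the tag IDs, the `CopyFn` signature and all `Sketch*`/`SK_*` names are made up for illustration, and a real driver would use the S2_* constants and callback conventions from the SANA-II headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback shape for this sketch; the real SANA-II callbacks
   have their own calling convention. */
typedef int (*CopyFn)(void *dst, const void *src, size_t len);

/* Stand-in tag IDs; a real driver uses the S2_* constants from the
   SANA-II headers instead. */
enum {
    SK_TAG_END = 0,
    SK_COPY_TO_BUFF,
    SK_COPY_TO_BUFF16,
    SK_COPY_FROM_BUFF,
    SK_COPY_FROM_BUFF16
};

struct SketchTag   { int tag; CopyFn fn; };
struct SketchHooks { CopyFn rx_copy; CopyFn tx_copy; };

/* Example callback so the sketch can be exercised on any host. */
int plain_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 1;
}

/* Prefer the 16-bit variant when offered, fall back to the plain one, and
   report failure (0) if either direction ends up without a callback. */
int select_hooks(const struct SketchTag *tags, struct SketchHooks *out)
{
    CopyFn to = NULL, to16 = NULL, from = NULL, from16 = NULL;

    assert(out != NULL);
    out->rx_copy = NULL;
    out->tx_copy = NULL;
    if (tags == NULL)
        return 0;

    for (; tags->tag != SK_TAG_END; tags++) {
        switch (tags->tag) {
        case SK_COPY_TO_BUFF:     to     = tags->fn; break;
        case SK_COPY_TO_BUFF16:   to16   = tags->fn; break;
        case SK_COPY_FROM_BUFF:   from   = tags->fn; break;
        case SK_COPY_FROM_BUFF16: from16 = tags->fn; break;
        default: break;            /* unknown tags are ignored */
        }
    }

    out->rx_copy = to16   ? to16   : to;
    out->tx_copy = from16 ? from16 : from;
    return out->rx_copy != NULL && out->tx_copy != NULL;
}
```

With this shape, an open request that supplies neither the plain nor the 16-bit callback for a direction is rejected up front instead of leaving a pointer unset.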

Anyway, the bug on my side was so silly it isn't even worth mentioning. Time to dig into DHCP (and try Sgrab):

 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #16 on: December 02, 2013, 05:45:22 AM »
Quote from: olsen;753651
Yes, that seems to be the intention. However, if S2_CopyToBuff were missing, and S2_CopyToBuff16 were missing, too, then the code will end up using an uninitialized pointer, which should be caught before it happens. Same goes for the S2_CopyFromBuff tags.

The purpose of S2_CopyToBuff16 is not to speed up copying. It is the counterpart to the S2_CopyFromBuff16 tag, which is a workaround for a hardware bug. As far as I know this bug only exists in one type of Amiga Ethernet card, which is the original "Ariadne".

That's fascinating :) One would think the 16/32 buffer management routines were introduced into SANA-II with performance in mind, not as workarounds for one particular bug.
Quote

Put another way, no driver is really required to support the S2_CopyFromBuff16 method unless the driver really, really needs it.

Since the DM9000 in my design is wired in 16 bits, and I tend to use word accesses wherever possible, using x16 routines would be preferable in my case.
Quote

Hm... does the DHCP negotiation succeed, eventually? If not, have you tried tcpdump yet?


No, the DHCP gives up after a one-minute timeout. I suspect there are at least two reasons for that: first, the TX queueing is not done properly, and second, there is a good load of KPrintF() calls all over the code, running at 9600 by default. If the serial debug routines are blocking then this would also impact timings. I will change the speed to 115200 and will also fix the TX queueing.
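
To put a number on the serial debug suspicion, here is a back-of-the-envelope sketch. It assumes 8N1 framing (10 bits on the wire per character) and that each debug call blocks until its text has been sent; neither is confirmed in the thread, and `serial_line_ms` is just a helper name for this example.

```c
#include <assert.h>

/* Time in milliseconds to push `chars` characters through a serial port,
   assuming 10 bits on the wire per character (8N1). */
double serial_line_ms(unsigned long baud, unsigned long chars)
{
    assert(baud > 0);
    return (double)chars * 10.0 * 1000.0 / (double)baud;
}
```

Under those assumptions an 80-character debug line costs about 83 ms at 9600 and about 7 ms at 115200, so a handful of such lines per packet could plausibly upset DHCP timing at the slower speed.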

Haven't tried tcpdump yet, but definitely will :)
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #17 on: December 03, 2013, 05:41:40 AM »
Thank you Olsen, after spending some time studying tcpdump and sashimi logs I came to the conclusion that the card simply wasn't picking up the DHCP ACK from the server. No wonder, since the code responsible for multicast/broadcast stuff was, ahem, mostly commented out.

So I went and cowardly let the card accept each and every frame to see if this was the issue. Bingo!
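
For comparison, the decision a non-promiscuous receive path has to make is sketched below in plain C. This only illustrates the filtering rule (own address or broadcast), not how the DM9000's hardware filter is programmed, and `accept_frame` is a hypothetical helper name. If the server's reply arrives as a broadcast frame, as the symptoms above suggest, a filter without the broadcast case would drop it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SK_MAC_LEN = 6 };   /* length of an Ethernet address in bytes */

/* Returns 1 if a frame with destination address `dst` should be passed up
   to the stack, 0 if it should be dropped. Multicast is not handled here;
   a real driver would also check the groups its clients subscribed to. */
int accept_frame(const uint8_t *dst, const uint8_t *own_mac)
{
    static const uint8_t broadcast[SK_MAC_LEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

    assert(dst != NULL && own_mac != NULL);

    if (memcmp(dst, own_mac, SK_MAC_LEN) == 0)
        return 1;                       /* unicast addressed to this card */
    if (memcmp(dst, broadcast, SK_MAC_LEN) == 0)
        return 1;                       /* broadcast, e.g. ARP requests */
    return 0;
}
```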



Ping is reporting duplicates, and FTP won't work even in passive mode, but being connected makes me feel better.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #18 on: December 04, 2013, 01:16:50 PM »
Quote from: olsen;753681

Sticking with S2_CopyFromBuff/S2_CopyToBuff has no downsides. Any client (e.g. TCP/IP stack) should use optimized copying code which would automatically use long-sized accesses.


In Roadshow, if the COPYMODE=FAST option is set, buffer management will offer S2_CopyFromBuff16. Is there a way to have it also provide S2_CopyToBuff16? I can imagine an environment where using word-sized and word-aligned access would indeed speed things up on the device driver's side compared with S2_CopyFromBuff/S2_CopyToBuff.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #19 on: December 04, 2013, 07:37:20 PM »
Quote from: tnt23;753795
In Roadshow, if the COPYMODE=FAST option is set, buffer management will offer S2_CopyFromBuff16. Is there a way to have it also provide S2_CopyToBuff16? I can imagine an environment where using word-sized and word-aligned access would indeed speed things up on the device driver's side compared with S2_CopyFromBuff/S2_CopyToBuff.


Here's what I've been looking into: (http://wiki.amigaos.net/index.php/Revision_3)

Code:

   These are optional callbacks presented to the device with the
   same calling interface as for S2_CopyToBuff or S2_CopyFromBuff,
   respectively. The difference to the original callbacks is the
   required and guaranteed transfer size and alignment for
   accessing the device's buffer for a single piece of a data of
   either 16 or 32 bits, a data word. The copy function called may
   only use 16/32 bit aligned read/write commands of 16/32 bits at
   once to transfer the data words, respectively. If the buffer
   data length is not a multiple of the required data word
   transfer size, the last data word transfer may contain garbage
   padding in either transfer direction.
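
Read literally, the 16-bit variant described in that text amounts to something like the sketch below: every access is an aligned 16-bit read or write, and an odd byte count is rounded up so the last word may carry a padding byte. The function name is hypothetical and this is only one reading of the text, not Roadshow's or any driver's actual copy routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `nbytes` bytes using only aligned 16-bit accesses. An odd length is
   rounded up to whole words, so the final word may contain one byte of
   padding. `volatile` is used so the compiler keeps the accesses 16 bits
   wide instead of merging them into wider ones. */
void copy_words16(volatile uint16_t *dst, const volatile uint16_t *src,
                  size_t nbytes)
{
    size_t nwords = (nbytes + 1) / 2;

    assert(((uintptr_t)dst & 1u) == 0);
    assert(((uintptr_t)src & 1u) == 0);

    while (nwords--)
        *dst++ = *src++;
}
```

Note that the destination therefore needs room for the rounded-up length, which is the "garbage padding" the quoted text warns about.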
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #20 on: December 09, 2013, 04:32:42 PM »
That's what I get with non-debug version of dm9000.device. A4000 with 68030/25MHz and 2MB of Chip RAM, 0MB of Fast RAM, 64MB of Zorro III RAM clocked at 100MHz.

Code:
NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  135.32 KByte/s
Sending to client, packet size  1k ...  7.59 KByte/s
Receiving from client, packet size  2k ...  143.53 KByte/s
Sending to client, packet size  2k ...  149.24 KByte/s
Receiving from client, packet size  4k ...  146.89 KByte/s
Sending to client, packet size  4k ...  151.80 KByte/s
Receiving from client, packet size  8k ...  142.36 KByte/s
Sending to client, packet size  8k ...  156.03 KByte/s
Receiving from client, packet size 16k ...  144.04 KByte/s
Sending to client, packet size 16k ...  155.74 KByte/s
Receiving from client, packet size 32k ...  134.22 KByte/s
Sending to client, packet size 32k ...  157.05 KByte/s
Done.

I wonder if tweaking the priorities of the RX/TX routines would give any boost. I will also try moving to the INT6 chain, although I don't think this will improve things dramatically. The CNet driver is able to squeeze ~500 KByte/s through the pccard interface, which also shares the INT2 interrupt.

I can use WGET to upgrade the device driver by simply pulling the new version from my PC over HTTP. So I judge that a single TCP connection works more or less stably (obviously less). A mix of WGETs and pings also runs in parallel quite all right, with sanautil on top of that. However, when FTP opens a second socket in passive mode it never gets the remote directory listing. I can see the listing in the tcpdumped packets, so the device driver probably does something odd to them upon reception.

Oh, and MiamiDX cannot complete DHCP configuration for some reason, as opposed to Roadshow. Perhaps I will need more packet dumping inside the device driver.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #21 on: December 10, 2013, 11:14:32 AM »
Have just resolved the FTP issue.



This also seems to fix the small packet transfer speed. According to the NetIO test, Tx/Rx is around 130K in both directions.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #22 on: December 12, 2013, 07:23:49 AM »
Quote from: olsen;754124
I don't know if this has been clarified yet.

The purpose of 16 or 32 bit variants of the S2_CopyToBuff and S2_CopyFromBuff callbacks is to restrict all copying to operations which transfer data in amounts of a specific granularity. In the 16 bit variant, only 16 or 32 bit transfer operations will be used. In the 32 bit variant, only 32 bit transfer operations will be used. By contrast, the S2_CopyToBuff and S2_CopyFromBuff methods will use 8, 16 or 32 bit transfer operations, as necessary.

Frankly speaking, I don't understand why, for the 16-bit case, there would be any 32-bit transfer at all. Say, if we need to transfer two 16-bit words with respect to both size AND alignment, then it should look like two "move.w (src)+, (dst)+" instructions, should it not? The addressing will be done in words, and that's nice. In my perception this is not equal to one "move.l (src)+, (dst)+" instruction, as the latter breaks both the size constraint (transferring 32 bits at once) and the alignment constraint (crossing the 16-bit boundary).

Quote
The S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 callbacks transfer data to a contiguous buffer. If your hardware has no such contiguous buffer to transfer data to, you will have to copy the data to a contiguous side-buffer, which is then given to S2_CopyFromBuff/S2_CopyFromBuff16/S2_CopyFromBuff32 to process.

That's exactly what I am trying to figure out. It is possible to implement the said contiguous buffer on my card, with the restriction that it should only be accessed in 16 bits at even addresses. If the S2_CopyFromBuff16/S2_CopyToBuff16 hooks followed that "move.w (src)+, (dst)+" restriction, everything should work smoothly, and that would eliminate the need for any side buffering, saving memory and improving performance.

Now, if the S2_CopyFromBuff16/S2_CopyToBuff16 hooks at some point won't follow the granularity convention and decide to switch to transferring 32 bits at once, that would break the whole idea, I think.

Quote
You may be able to avoid using a contiguous side-buffer if the TCP/IP stack supports the S2_DMACopyToBuff32 and S2_DMACopyFromBuff32 callbacks. With these callback functions, you may receive a pointer to a contiguous buffer which is at least as large as you requested. You may then access this buffer and directly copy to/from it. Note that you may get a NULL pointer if no such buffer is available, in which case you would need to fall back to calling S2_CopyToBuff or S2_CopyFromBuff instead, respectively.

I understand the DMA callbacks idea better now :) In fact, I am trying to do exactly that: checking if the DMA hook is available, then asking for the pointer, etc. It even seems to work, although it is slow as hell. Lots to check on my side.

So, back to our 16-bit stuff. Do you think it would be feasible to implement that 'strict' behaviour for S2_CopyFromBuff16/S2_CopyToBuff16 in Roadshow?
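
The "ask for a pointer, fall back if it is NULL" flow described in the quote can be sketched as follows for the receive direction. All types and names here (`DmaBufFn`, `CopyToFn`, `deliver_frame`, the demo client) are invented for the example; the real S2_DMACopyToBuff32 and S2_CopyToBuff callbacks have their own signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback shapes, not the real SANA-II prototypes. */
typedef void *(*DmaBufFn)(void *cookie, size_t len);
typedef int   (*CopyToFn)(void *cookie, const void *src, size_t len);

/* Try the direct-buffer path first; if the client offers no such hook or
   returns NULL, fall back to the ordinary copy callback. */
int deliver_frame(DmaBufFn dma_hook, CopyToFn copy_hook, void *cookie,
                  const void *frame, size_t len)
{
    assert(copy_hook != NULL);          /* the plain callback is mandatory */

    if (dma_hook != NULL) {
        void *direct = dma_hook(cookie, len);
        if (direct != NULL) {
            memcpy(direct, frame, len); /* write straight into client memory */
            return 1;
        }
    }
    return copy_hook(cookie, frame, len);
}

/* Tiny stand-in for a client, only there to exercise both paths. */
struct DemoClient { unsigned char buf[64]; int allow_dma; int used_copy_hook; };

void *demo_dma(void *cookie, size_t len)
{
    struct DemoClient *c = cookie;
    return (c->allow_dma && len <= sizeof c->buf) ? c->buf : NULL;
}

int demo_copy(void *cookie, const void *src, size_t len)
{
    struct DemoClient *c = cookie;
    if (len > sizeof c->buf)
        return 0;
    memcpy(c->buf, src, len);
    c->used_copy_hook = 1;
    return 1;
}
```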


UPDATE. I'm afraid I have been terribly wrong: the hardware buffer on my side could only be arranged for long-aligned 16-bit access :(
« Last Edit: December 12, 2013, 07:36:10 AM by tnt23 »
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #23 on: December 19, 2013, 07:28:12 AM »
Quick update regarding performance. That's the best of the driver (stock A4000 with 68030@25MHz I guess? no Fast RAM, Zorro memory running at 120MHz), compiled for 030 with -O3.



On the PC side, netio reports RX faster by ~30K.

With Fast RAM, rx/tx speeds increase slightly by ~50K in both directions. I guess I'll leave it as it is for now, will try various optimizations later.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #24 on: March 27, 2014, 07:17:48 PM »
Haven't posted for a while. Not much progress on the software side, was building more cards to test.

 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #25 on: May 29, 2014, 07:31:36 PM »
Got a hold of Cyberstorm MK3 (060@50MHz), benchmarks get way better:

Code:
NETIO - Network Throughput Benchmark, Version 1.32
(C) 1997-2012 Kai Uwe Rommel

UDP server listening.
TCP server listening.
TCP connection established ...
Receiving from client, packet size  1k ...  947.98 KByte/s
Sending to client, packet size  1k ...  659.21 KByte/s
Receiving from client, packet size  2k ...  1057.02 KByte/s
Sending to client, packet size  2k ...  895.88 KByte/s
Receiving from client, packet size  4k ...  1119.78 KByte/s
Sending to client, packet size  4k ...  1269.84 KByte/s
Receiving from client, packet size  8k ...  1183.08 KByte/s
Sending to client, packet size  8k ...  1380.43 KByte/s
Receiving from client, packet size 16k ...  1212.76 KByte/s
Sending to client, packet size 16k ...  1411.63 KByte/s
Receiving from client, packet size 32k ...  1207.01 KByte/s
Sending to client, packet size 32k ...  1427.19 KByte/s
Done.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #26 on: June 02, 2014, 07:39:35 AM »
Quote from: freqmax;765291

So making a new bsd socket layer and proper hardware driver abstraction API might be a really good deed. Booting from network is also something that is kind of missing.


Probably, but who's gonna make it, and who's gonna write driver replacements for all legacy hardware out there?

Quote
Btw, did you add 32-bit transfers to speed up things? DMA transfers?


No, there are neither 32 bit nor DMA transfers.

Quote
How many layers does your card need? And what was the price to produce just the PCB? (via, tin, lacquer, etc. options?)


The card has 4 layers, and the price of the first (experimental and express) batch was, I'd say, decent, something like 50 euro per board. Subsequent batches are of course cheaper. I could look up the exact PCB parameters like copper thickness, track width and via size, why?
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #27 on: June 02, 2014, 08:15:47 PM »
The PCB is 4 layers, FR-4, 18 µm copper, 0.2mm/0.2mm tracks, 0.3mm/0.7mm vias, two-sided solder mask, no silk. Continuity check, express production (4-5 days) by a local PCB house in Saint Petersburg, so your mileage will most likely vary.
 

Offline tnt23Topic starter

  • Full Member
  • ***
  • Join Date: Dec 2005
  • Posts: 195
Re: Zorro III memory card... now with Ethernet
« Reply #28 on: June 03, 2014, 07:29:47 AM »
Quote from: freqmax;765598
Dunno if Saint Petersburg is local though ;)


Unless you're in Florida :)