Amiga.org
		Amiga Hardware News => Topic started by: Argo on March 09, 2007, 02:45:15 AM
		
			
			- 
				ACube Systems has published the Sam440ep memory test on its products page.
 
 http://www.sam440.com/eng/products.html
- 
				Not bad, but today's Athlon X2 with DDR2-800 runs more than 2200 MB/s, and a Core 2 with DDR2-800 runs more than 4000 MB/s. I'm curious about its real performance (using OS4 on the Sam440).
			
- 
				@
 Please note that Sam is an embedded board, not a desktop system.
 I think the benchmarks are very fast for such a board.
 I hope we will see AmigaOS 4 on it, because without OS4 it is unusable. :/
- 
				but today's Athlon X2 with DDR2-800 runs more than 2200 MB/s, and a Core 2 with DDR2-800 runs more than 4000 MB/s. 
 But for those who care, it doesn't run OS4. Neither does Samantha, but Sam running it some day is a lot more probable.
- 
				@AmigaPapst
 
 I hope we will see AmigaOS 4 on it, because without OS4 it is unusable. :/  
 
 That's a bit strong! It is after all an embedded board and, as such, is great for Linux (esp uClinux) etc etc.
 
 I hope they do really well with it because, as with Genesi, good sales outside the Amiga market will inevitably lead to more choice inside the Amiga market.  :-)
 
 These guys are successfully leveraging the embedded market, whereas Eyetech were unlucky/hapless.  8-)
- 
				@AmigaPapst
 
 Please note that Sam is an embedded board, not a desktop system. 
 
 But 99.9% of all Amigans will use it as a desktop system. They are interested in OS4 on desktop platforms, not embedded devices. So what's so wrong about comparing SAM's performance to desktop hardware?
 
 I hope we will see AmigaOS 4 on it, because without OS4 it is unusable. :/ 
 
 Well, it can still be used "for all your embedded ideas" without OS4. :-P
- 
				Not bad, but today's Athlon X2 with DDR2-800 runs more than 2200 MB/s, and a Core 2 with DDR2-800 runs more than 4000 MB/s. 
 
 Athlon 64s do a lot more than 2200 MB/s.
- 
				Put it this way: at 600-and-something MHz running OS4, it's not going to be slow. Look how big Linux can get.
			
- 
				
 Not bad, but today's Athlon X2 with DDR2-800 runs more than 2200 MB/s, and a Core 2 with DDR2-800 runs more than 4000 MB/s.
 
 Athlon 64s do a lot more than 2200 MB/s.
 
 How are you measuring this? The sam board doesn't look too bad to me.
 
 On my 1583 MHz Sempron w/ DDR333:
 REP MOVSD (mem copy) = 330 MB/s
 REP LODSD (mem read) = 650 MB/s
 REP LODSD (mem read in L1 cache) = 3.0 GB/s
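 
 For anyone who wants a comparable figure without writing assembly, here is a rough Python sketch of a copy-bandwidth measurement. It is not the REP MOVSD loop above: `copy_bandwidth_mb_s` and its parameters are names made up for illustration, the buffer size is an arbitrary choice meant to be larger than the caches, and it counts only the bytes written. A STREAM-style Copy figure counts reads plus writes, so it would be twice as large for the same run.

```python
import time


def copy_bandwidth_mb_s(size_bytes=16 * 1024 * 1024, repeats=5):
    """Best-of-N copy rate in MB/s (1 MB = 10**6 bytes), counting bytes written only."""
    # Non-zero fill so the source pages are really backed by memory.
    src = bytearray(b"\xa5") * size_bytes
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # same-length slice assignment: one bulk copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return size_bytes / best / 1e6


if __name__ == "__main__":
    print(f"copy: {copy_bandwidth_mb_s():.0f} MB/s")
```

 Taking the best of several runs follows what STREAM does ("only the *best* time for each is used"), which filters out the first-touch cost of the destination buffer.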
- 
				@humppa
 >But 99.9% of all Amigans will use it as a desktop system. They are interested in OS4 on desktop platforms, not embedded devices. So what's so wrong about comparing SAM's performance to desktop hardware?
 
 Yes, but for most desktop uses on OS4 you don't need more performance at the moment.
 
 >Well, it can be still used "for all your embedded ideas" without OS4.
 
 I don't think so, because for most embedded uses you need a fast OS, like OS4. Linux is too slow on this low-performance machine.
- 
				@DamageX  
 Try using the same stream_d benchmark.
 
 In reference to "sam_memory_test.pdf"
 http://www.sam440.com/eng/images/sam_memory_test.pdf
 
 /usr/checkbench/stream # ./stream_d
 
 This system uses 8 bytes per DOUBLE PRECISION word.
 ...
 Function   Rate (MB/s)   RMS time   Min time   Max time
 Copy:         285.7041     0.1121     0.1120     0.1121
 Scale:        270.4713     0.1184     0.1183     0.1185
 Add:          256.4075     0.1873     0.1872     0.1875
 Triad:        250.5925     0.1916     0.1915     0.1918
 
 ----------------------------------------
 Using a real AMD64 processor, not some Sempron junk.
 
 AMD Opteron 248 @ 2200 MHz with the PathScale EKO compiler.
 
 Compiler: PathScale EKO Compiler Suite, Release 1.1
 Model Name: ASUS SK8N Motherboard, AMD Opteron (TM) Model 248
 CPU: AMD Opteron 248
 CPU MHz: 2200
 FPU: Integrated
 CPU(s) enabled: 1 core, 1 chip, 1 core/chip
 Secondary Cache: 1024KB (I+D) on chip
 Memory: 4x512MB, DDR400, PC3200, Corsair, CL2
 Operating System: SuSE Linux 9.0 (AMD64) 2.4.21-102-default
 
 > pathcc -Ofast -lm stream_d.c second_wall.c -o 1.1ccOfast
 > ./1.1ccOfast
 -------------------------------------------------------------
 This system uses 8 bytes per DOUBLE PRECISION word.
 -------------------------------------------------------------
 Array size = 2000000, Offset = 0
 Total memory required = 45.8 MB.
 Each test is run 10 times, but only
 the *best* time for each is used.
 -------------------------------------------------------------
 Your clock granularity/precision appears to be 1 microseconds.
 Each test below will take on the order of 7885 microseconds.
 (= 7885 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 -------------------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 -------------------------------------------------------------
 Function   Rate (MB/s)   RMS time   Min time   Max time
 Copy:        4613.5614     0.0070     0.0069     0.0072
 Scale:       4520.3330     0.0071     0.0071     0.0071
 Add:         4558.4067     0.0106     0.0105     0.0107
 Triad:       4554.9003     0.0105     0.0105     0.0106
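 
 As a cross-check on these tables, the Rate column can be recomputed from the Min time column. A minimal sketch, assuming the usual STREAM byte counts (Copy and Scale touch two arrays, Add and Triad touch three, 8 bytes per element) and 1 MB = 10^6 bytes for rates. The function names are mine, not taken from stream_d.c. The Sam output earlier does not show its array size, so applying 2000000 to it is an assumption, though the numbers are consistent with it. Printed times are rounded to four decimals, so the results only approximately match the printed rates.

```python
BYTES_PER_WORD = 8  # "This system uses 8 bytes per DOUBLE PRECISION word."

# Number of arrays each kernel reads or writes per element.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """Rate in MB/s (1 MB = 10**6 bytes) from the best time of one kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * array_size
    return 1e-6 * bytes_moved / min_time_s


def total_memory_mb(array_size):
    """Footprint of the three arrays, in units of 2**20 bytes."""
    return 3 * BYTES_PER_WORD * array_size / 1048576


if __name__ == "__main__":
    n = 2_000_000  # "Array size = 2000000"
    print(f"{total_memory_mb(n):.1f}")                    # 45.8
    print(f"{stream_rate_mb_s('Copy', n, 0.0069):.0f}")   # 4638 (Opteron, printed as 4613.5614)
    print(f"{stream_rate_mb_s('Add', n, 0.1872):.1f}")    # 256.4 (Sam, assuming the same array size)
```

 The Opteron figure is off by about half a percent only because 0.0069 is a rounded time; the Sam times are longer, so rounding matters less there.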
 
 
 --------------------
 Using a Sun Fire V40z with 4 x Opteron 2.6GHz processors.
 
 Sun Studio 11 compiler, rates in MB/s.
 Sun Studio: -fast -xarch=amd64a -xvector=simd -xprefetch -xprefetch_level=3
 (setenv PARALLEL 1)
 
 Copy:    4658
 Scale:   4614
 Add:     4628
 Triad:   4627
 
 Sun Studio 11 with the automatic parallelization switch on 4 processors, rates in MB/s.
 
 For automatic parallelization:
 cc -fast -xarch=amd64a -xvector=simd -xprefetch -xprefetch_level=3 -xautopar stream_d.c second.c
 (setenv PARALLEL 4)
 
 Copy:    18120
 Scale:   18108
 Add:     17758
 Triad:   17626
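 
 To put a number on the scaling, the PARALLEL 1 and PARALLEL 4 figures above can be compared directly. A small sketch; `scaling` is just an illustrative helper, and "efficiency" here simply means speedup divided by the number of processors.

```python
# Rates in MB/s copied from the Sun Studio 11 runs above.
serial = {"Copy": 4658, "Scale": 4614, "Add": 4628, "Triad": 4627}        # PARALLEL 1
parallel = {"Copy": 18120, "Scale": 18108, "Add": 17758, "Triad": 17626}  # PARALLEL 4


def scaling(serial_rate, parallel_rate, workers):
    """Return (speedup, efficiency) of the parallel run over the serial one."""
    speedup = parallel_rate / serial_rate
    return speedup, speedup / workers


if __name__ == "__main__":
    for kernel in serial:
        s, e = scaling(serial[kernel], parallel[kernel], 4)
        print(f"{kernel:6s} speedup {s:.2f}x  efficiency {e:.0%}")
```

 On these numbers the four-processor run comes out between about 3.8x and 3.9x the single-processor rate, i.e. roughly 95% to 98% efficiency.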
 
 -----------------------------------------------
 Same machine with the GCC 4.1 compiler, rates in MB/s.
 
 GCC 4.1: -O3 -funroll-all-loops -ffast-math -fpeephole -m64 -mtune=k8 -fprefetch-loop-arrays
 
 Copy:    2766
 Scale:   2745
 Add:     2970
 Triad:   2969
 
 AMD64 goes well with Sun’s Studio 11 and PathScale's EKO compilers.
 
 GCC 4.1 isn't able to exploit this type of scalability, since it doesn't support automatic parallelization or OpenMP.
 
 Note that there are two main types of Sempron: K7-based (supports only a single MCH via the FSB) and K8-based (available with a single MCH via Socket 754 or a dual MCH via Socket AM2/Socket S1).
 
 Sun’s Studio 11 is available as a free download.