In the following, I have compiled the results for a range of easily
accessible benchmarks. I hope that this may give some idea of the
relative performance you can expect from an Amithlon system.

I have not commented on any of the results; I want people to be able to
form their own, objective opinion. There are some "outliers" in there
(i.e. values that are far away from the "usual" and "expected" outcomes).
Be careful to check for the influence of such values on any averages you
might look at. Also, I have occasionally taken the ratios of arithmetic
averages; that is not really a good thing to do, but if I don't, people
will complain and/or suggest a hidden motive. It is better than taking
arithmetic averages of ratios, at least....

I hope this will come in useful to someone,

   Bernie (bmeyer@csse.monash.edu.au)


%%%%%%%%%%%%
% BYTEMARK %
%%%%%%%%%%%%

This is running the Bytemark port straight from Aminet (see
http://www.aminet.net/pub/aminet/util/moni/ByteMark68kPPC.lha
for your own copy). Running nbench.040fpu on Amithlon.

The following are the results as reported on a 1.3GHz Athlon. The
comparison values were taken from Usenet post
<37893DA5.MD-0.198.s.giovenella@spamfree.pn.nettuno.it>
and were measured on an A3000/CSPPC with 66 MHz 060 and 233 MHz 604e.

The comparative MHz columns are not to be taken too seriously, because
performance will not actually increase linearly with CPU speed; memory
bandwidth is also an important factor, and thus an 060 or 604e
overclocked to these frequencies would still not actually get anywhere
near these speeds. (Each gives the clock rate at which the comparison CPU
would match the Amithlon result, assuming performance scaled linearly
with clock speed.)

BYTEmark (tm) Native Mode Benchmark ver. 2 (3/95)

Test                  ! Amithlon 1.3GHz  ! 060-66 ! 604e-233 ! 060MHz ! 604MHz
----------------------!------------------!--------!----------!--------!-------
NUMERIC SORT:         ! Index:  7.360227 !   0.65 !     3.21 !  747.3 !  534.2
STRING SORT:          ! Index:  6.692935 !   0.47 !     1.28 !  939.8 ! 1218.3
BITFIELD:             ! Index:  6.756898 !   1.05 !     4.58 !  424.7 !  343.7
FP EMULATION:         ! Index:  7.119398 !   0.87 !     3.33 !  540.0 !  498.1
FOURIER:              ! Index:  3.872222 !   0.18 !     0.79 ! 1419.8 ! 1142.0
ASSIGNMENT:           ! Index: 12.250218 !   0.94 !     4.77 !  860.1 !  598.3
IDEA:                 ! Index:  8.842450 !   0.65 !     4.99 !  897.8 !  412.8
HUFFMAN:              ! Index: 10.073270 !   0.93 !     4.74 !  714.8 !  495.1
NEURAL NET:           ! Index:  3.154568 !   0.35 !     2.60 !  594.8 !  282.6
LU DECOMPOSITION:     ! Index:  3.849416 !   0.54 !     2.15 !  470.4 !  417.1
=======OVERALL========!==================!========!==========!========!=======
INTEGER INDEX:        !         8.242951 !   0.77 !     3.55 !  706.5 !  541.0
FLOATING-POINT INDEX: !         3.609326 !   0.32 !     1.64 !  744.4 !  512.7
======================!==================!========!==========!========!=======


%%%%%%%%%%%%%%%
% Speedometer %
%%%%%%%%%%%%%%%

This is Speedometer 4.02, running under Shapeshifter 3.10. Screenmode is
1024x768 in 256 colours. The system this is run on is a 1.3GHz Athlon
Thunderbird, with a GeForce2 MX-200 graphics card.

Comparison values for 060/66 were taken from
http://www.amigapro.com/Images/Speedometer.gif
Comparison values for iFusion on CSPPC 604e-200 were taken from
http://www.blittersoft.com/speed.jpg

Once again, don't pay too much attention to the comparative MHz figures,
because even if one could overclock the 060 or 604e to these levels, one
would still be quite short of the actual performance observed under
Amithlon.

The "Disk" test was left out, because Speedometer 4.02 is faulty in the
way it reports relative disk speeds larger than 5 [1].

Test          ! Amithlon 1.3GHz ! 060-66 ! 604e-200 ! 060MHz ! 604MHz
--------------!-----------------!--------!----------!--------!-------
CPU           !           22.78 !   4.24 !     9.71 !  354.5 !  469.2
Math          !          405.78 !  44.47 !   393.90 !  602.2 !  206.0
.....................................................................
KWhet         !           74.05 !  12.90 !   312.10 !  378.8 !   47.4
Dhry          !           36.65 !   2.88 !    12.15 !  839.8 !  603.2
Towers        !           38.23 !   3.55 !    14.20 !  710.7 !  538.4
Quicksort     !           36.12 !   3.62 !    17.37 !  658.5 !  415.8
Bubble Sort   !           37.04 !   3.79 !    11.83 !  645.0 !  626.2
Queens        !           39.35 !   3.16 !    10.34 !  821.8 !  761.1
Puzzle        !           39.06 !   4.71 !    20.16 !  547.3 !  387.5
Permute       !           33.96 !   2.63 !    13.44 !  852.2 !  505.3
Int. Matrix   !           43.82 !   6.44 !    18.64 !  449.0 !  470.1
Sieve         !           45.39 !   4.09 !    14.25 !  732.4 !  637.0
Bench Average !           42.37 !   4.78 !    44.45 !  585.0 !  190.6
.....................................................................
FPU FFT       !           36.20 !   3.29 !     2.09 !  726.2 ! 3464.1
FPU KWhet     !           30.69 !   4.05 !    26.31 !  500.1 !  233.2
FPU Matrix    !           31.64 !   4.25 !    23.55 !  491.3 !  268.7
FPU Average   !           32.85 !   3.86 !    17.32 !  561.6 !  379.3
.....................................................................
8 Bit GFX     !            4.73 !   2.05 !      n/a !        !


%%%%%%%%%%%%%%%%%%%%%%%%%%
% Candy Factory Pro Demo %
%%%%%%%%%%%%%%%%%%%%%%%%%%

This is CandyFactoryPro, Demo version, straight from Aminet. See
http://www.aminet.net/pub/aminet/biz/titan/CandyProDemo.lha
for your own copy.

Once again, this is running on a 1.3GHz Athlon Thunderbird. Screen mode
is 1024x768 in 256 colours. Comparison values for 604e-233 were taken
from http://www.amigapro.com/benchmarks.html

All timings in seconds!

Test         ! Amithlon 1.3GHz ! 604e-233 ! 604MHz
-------------!-----------------!----------!-------
BlueGlow     !           0.12s !    0.16s !    310
Choccy       !           0.15s !    0.17s !    264
ChromeShadow !           0.09s !    0.08s !    207
NiceBrown    !           0.17s !    0.18s !    246
PurpleGoo    !           0.50s !    0.93s !    433
ShinyMetal   !           0.47s !    1.23s !    609
Tinfoil      !           0.11s !    0.09s !    190
Transparent  !           0.09s !    0.08s !    207


%%%%%%%%%%%%
% P96Speed %
%%%%%%%%%%%%

The gfx speedtest program available on Aminet at
http://www.aminet.net/pub/aminet/gfx/board/P96Speed.lha

060-70/A4000T/PIV comparison values for 640x480x8 were taken from the
included archive. 060-50/Mediator/Voodoo3 2000 comparison values for
640x480x8 and 800x600x16 were taken from a message on the Yahoo Mediator
mailing list.

This is again running on a 1.3GHz Athlon Thunderbird, with a GeForce2
MX-200 (that's the cheapest and slowest card in the GeForce2 range, at a
current street price of about US$ 50-60).

Test (640x480x8) !  Amithlon ! 060-70/PIV ! 060-50/Mediator/V3-2000
-----------------!-----------!------------!------------------------
RectFill         !    20,071 !      6,330 !  19,093
RectFillPat.     !     3,772 !      5,911 !  17,152
WritePixel       ! 1,930,706 !    174,161 ! 147,037
WriteChunkyP.    !    11,599 !        685 !     590
WritePixelArr.   !    11,625 !        686 !     579
WritePixelLine   !   227,528 !     25,845 !  21,581
DrawEllipse      !    62,254 !     11,875 !  10,432
DrawCircle       !    59,734 !     12,280 !  11,032
Draw             !    24,210 !      4,447 !  20,619
Draw Hor/Ver     !    87,029 !     27,791 !  21,281
ScrollRasterX    !     1,755 !        403 !   2,251
ScrollRasterY    !     1,817 !        394 !   2,377
PutText          !    60,443 !     11,085 !   9,708
BlitBitMap       !     1,016 !     10,657 !  15,274
BlitBitMapRP     !     1,013 !      8,688 !  10,963
BitMapScale      !       323 !         93 !      94
...................................................................
OpenWindow       !       781 !        156 !     122
MoveWindow       !     4,043 !        398 !     531
SizeWindow       !     1,094 !        124 !     131
CON-Output       !     2,747 !        503 !   1,227
ScreenToFront[2] !        51 !         83 !      60

Test (800x600x16) !  Amithlon ! 060-50/Mediator/V3-2000
------------------!-----------!------------------------
RectFill          !     7,030 !  17,794
RectFillPat.      !     1,505 !  16,801
WritePixel        ! 1,595,647 ! 194,676
WriteChunkyP.     !     3,368 !     288
WritePixelArr.    !     3,367 !     285
WritePixelLine    !   134,264 !  16,158
DrawEllipse       !    58,342 !  11,050
DrawCircle        !    55,532 !  11,610
Draw              !    17,078 !  22,784
Draw Hor/Ver      !    60,072 !  26,385
ScrollRasterX     !       516 !     971
ScrollRasterY     !       522 !   1,002
PutText           !    60,595 !  12,236
BlitBitMap        !       597 !  17,194
BlitBitMapRP      !       593 !  13,037
BitMapScale       !       307 !      29
.......................................................
OpenWindow        !       768 !     158
MoveWindow        !     3,140 !     893
SizeWindow        !     1,053 !     196
CON-Output        !     1,164 !   1,176
ScreenToFront[2]  !        51 !      60


%%%%%%%%%%%
% RC5/OGR %
%%%%%%%%%%%

Yes, I *know* these are not benchmarks. I know these shouldn't even be
considered for benchmarking. But of course, some people *do* use them for
benchmarking (you know who you are!), so I will provide numbers.
Unless you feel an irresistible urge to compare machines based on their
RC5 or OGR scores, you should completely ignore this section!

Results:

* dnetc client 2.8015-469-CTR-01050611:
  Comparison values: <3AFB321E.MD-1.4.4.sgio@erols.com>

  Amithlon, 1.3GHz Athlon TB, RC5:  1.88 MKeys/s
  A3000/CSPPC/604e-280,       RC5:  0.93 MKeys/s
  A3000/CSPPC/060-66,         RC5:  0.20 MKeys/s

  Amithlon, 1.3GHz Athlon TB, OGR:  3.47 MNodes/s
  A3000/CSPPC/604e-280,       OGR:  3.65 MNodes/s
  A3000/CSPPC/060-66,         OGR:  0.42 MNodes/s

* dnetc client 2.8010-461-CPR-00062713:
  Comparison values: http://n0cgi.distributed.net/speed/index.html

  Amithlon, 1.3GHz Athlon TB, RC5:  2.20 MKeys/s
  A3000/CSPPC/604e-200(!),    RC5:  0.66 MKeys/s
  A3000/CSPPC/060-66,         RC5:  0.20 MKeys/s

  Amithlon, 1.3GHz Athlon TB, OGR:  3.42 MNodes/s
  A3000/CSPPC/604e-200(!),    OGR:  1.60 MNodes/s
  A3000/CSPPC/060-66,         OGR:  0.34 MNodes/s


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%      Benchmarks for which I do not have comparison values available       %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Here are some benchmarks which I think are good indications of "real
world stuff", but for which I, as of now, lack (reliable) comparison
values from real 060 and/or 604e Amigas. If you have such results, please
send them my way. Thanks!


%%%%%%%%%%%%%%%%%%%
% Povray/Povbench %
%%%%%%%%%%%%%%%%%%%

This one is fairly memory bandwidth sensitive, and also might thrash the
branch prediction cache on some processors.

Instructions: Download a version of Povray compiled for your
processor/machine. Amiga binaries used to be available at
http://www.amigaworld.com/support/povamiga
I currently can't connect to that host, though.

Then issue the following commands:

   assign povray31: your_path_to_povray
   povray31:
   povray31 -i skyvase.pov +v1 +ft -x +mb25 +a0.300 +j1.000 +r3 -q9
            -w640 -H480 -S1 -E480 -k0.000 -mv2.0 +b1000 +MB25

[Note: I broke that last line for readability; it's all one line!]

Version used here: Version 3.1 SAS/C-Amiga, Copyright 1998

Result:

  Amithlon, 1.3GHz Athlon TB --> 104 seconds


%%%%%%%%%
% bzip2 %
%%%%%%%%%

This one is using a near-state-of-the-art file compressor. bzip2 is a
block-sorting compression program, using the Burrows-Wheeler Transform.
It also does a fair bit of bit fiddling when it comes to entropy-coding
the move-to-front ranks (Yes, I am a compression person. If that was all
incomprehensible to you, don't worry!).

bzip2 is available on Aminet, at
http://www.aminet.net/pub/aminet/util/arc/bzip2.lha
That archive includes versions for 68k, the Phase5 PPC.library, and
WarpOS; thus, it makes an ideal comparison testbed.

Once you have the appropriate version for your machine installed (read
the README! And don't forget that you might need to set a sizable stack,
or things *will* crash!), get
http://byron.csse.monash.edu.au/forsteve/oink.bz2.bin
(which is a 150k file, bzip2'ed from 4MB). Bunzip it with any version you
like (this isn't the test yet), and then simply time how long the
following takes, using the version of bzip2 you installed:

   bzip2 -9 oink

where "oink" is the name you bunzip2'ed the downloaded file to. This
might take several minutes, especially on the 060, so don't be impatient.
This is best done in a ramdisk, so your results are not influenced by
disk I/O.

Result:

  Amithlon 1.3GHz Athlon TB --> 34 seconds


%%%%%%%%%%%%%%
% Voxelspace %
%%%%%%%%%%%%%%

This is a demo that comes with the WarpOS distribution. It can use either
a 68k CPU or a PPC, if available. In both cases, it is executing
hand-written assembler code. The "Freq:" entry in the information box is
a frames-per-second counter.

Some comparison values exist, suggesting around 100fps in 320x200 on a
CSPPC/CVPPC combo. However, that resolution is so silly, it's not worth
talking about (I *do* have a 21" monitor, after all!).

Instructions: Run the "Run Voxelspace P96" or "Run Voxelspace CyberGfx"
script, whichever gives better results on your particular setup. Select
"640x480" for the first test, and "1280x1024" for the second. Leave all
other demo settings at their default values, and (without touching the
mouse or keyboard) note down the "Freq:" field. Then move around a bit
and check out the range of fps you get while in various spots, looking in
various directions (all sorts of cache issues come into play when doing
this).

Results:

  Amithlon 1.3GHz Athlon TB, GF2 MX-200,  640x480,  P96 -> 140fps (130-170fps)
  Amithlon 1.3GHz Athlon TB, GF2 MX-200, 1280x1024, P96 ->  31fps ( 20- 35fps)


%%%%%%%%%
% Quake %
%%%%%%%%%

There are many versions of this game for the Amiga. On Amithlon, we use
Quake68k by Frank Wille and Steffen Haeuser, available from
http://www.aminet.net/pub/aminet/game/shoot/Quake68k.lha

We also have, currently just for our internal use, a version of this in
which three texture mapping routines were replaced with x86 native code.
Doing this took one day (most of which was spent on working out which
routines to target, and where to find suitable x86 replacements for
them).

Frank Wille and Steffen Haeuser's port also comes in various PPC
flavours, and you can also get a GLQuake port which, IIUC, sits on top of
Warp3D.

The "standard" Quake speedtest is to go into full screen mode (no status
bar, just plain, rendered stuff), activate the console (the '~' key will
usually do that) and type

   timedemo demo2

Then deactivate the console, watch until the demo finishes, reactivate
the console, and there will be an fps number. Apparently, Amigactive
instead uses "timedemo demo1". Oh well....

Here are some results (from the usual machine):

  1024x768, pure 68k code, timedemo demo2 : 16.2fps
  1024x768, pure 68k code, timedemo demo1 : 16.7fps
   640x480, pure 68k code, timedemo demo2 : 34.1fps
   640x480, pure 68k code, timedemo demo1 : 34.4fps

  1024x768, some x86 code, timedemo demo2 : 22.2fps
  1024x768, some x86 code, timedemo demo1 : 23.8fps
   640x480, some x86 code, timedemo demo2 : 42.6fps
   640x480, some x86 code, timedemo demo1 : 44.1fps


[1] What happens is that the raw, measured value x is taken and divided
    by 1. If the result is <5, then it gets entered into the record.
    Otherwise, x is divided by two. If that makes the value <5, then it
    is used. Otherwise, division by 3, 4, 5 and so on is performed,
    until finally a value <5 is achieved.

[2] This pretty much is the same as the screen refresh rate as perceived
    by AmigaOS. Unless the value is <50, it has nothing to do with
    actual speed.


%%%%%%%%%%
% Benoit %
%%%%%%%%%%

This is a little fractal renderer that Phase5 included with their CSPPC
cards. It can calculate using either the PPC or the 68k CPU; as Phase5
also distributed source (although not to the binary they included....),
it is quite evident that there are close to no 68k->PPC->68k context
switches when using the PPC to calculate.

The same program is also available from Aminet at
http://www.aminet.net/pub/aminet/biz/p5/Benoit-V2_0.lha

Instructions: Set your screenmode to 1024x768x24 or 1024x768x32 (this is
*very* important, because the size of the Benoit window is calculated as
1/4 the size of the current screensize. If you use different screen
sizes, you cannot compare results!). Then start the Benoit executable; it
will self-time how long it takes for the startup fractal. You can then
change the CPU settings and click "Calculate" again, to have the same
fractal rendered by your other CPU.

Results: (usual machine)

  Startup Fractal, pure 68k code    : 0.30 seconds

We also have a version, currently for internal use only, in which we
compiled the actual calculation engine to x86 code.

  Startup Fractal, x86 calculations : 0.06 seconds

[ Note: Just recompiling the C source to 68k code with gcc reduced the
  68k time to 0.20 seconds, though, so the speedup from going
  gcc-compiled x86 is not quite as massive as it looks ]
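
As a rough illustration of the note above, the speedups implied by the
three Benoit timings can be worked out directly. This is only a sketch in
Python; the labels and the speedup() helper are made up for this example
and are not part of Benoit or Amithlon.

```python
# Benoit startup-fractal timings in seconds, as quoted in the results above.
# The dictionary keys are illustrative labels, not names used by Benoit.
TIMES = {
    "shipped 68k binary": 0.30,
    "gcc-compiled 68k": 0.20,
    "gcc-compiled x86": 0.06,
}


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_s <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_s / candidate_s


if __name__ == "__main__":
    x86 = TIMES["gcc-compiled x86"]
    for name in ("shipped 68k binary", "gcc-compiled 68k"):
        # Compare each 68k timing against the x86 calculation engine.
        print(f"{name} -> x86: {speedup(TIMES[name], x86):.1f}x")
```

Measured against the shipped 68k binary the x86 engine comes out at about
5x, but against the gcc-compiled 68k build it is closer to 3.3x, which is
the fairer like-for-like figure.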