Benchmarks

Linpack
Download the Linpack C source and save it as linpack.c.

Build
root@linaro-alip:~/benchmarks# cc -Ofast -o linpack linpack.c -lm -mcpu=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fno-fast-math
linpack.c: In function ‘main’:
linpack.c:78:14: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result [-Wunused-result]

Results
-mcpu=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fno-fast-math

Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
      16   0.61  88.52%   6.56%   4.92%  37885.057
      32   1.21  85.12%   2.48%  12.40%  41459.119
      64   2.43  93.83%   2.47%   3.70%  37561.254
     128   4.86  91.77%   2.47%   5.76%  38381.368
     256   9.70  92.06%   2.89%   5.05%  38173.000
     512  19.41  91.29%   2.47%   6.23%  38634.432

-mcpu=cortex-a8 -mtune=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations

Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

    Reps Time(s) DGEFA   DGESL  OVERHEAD    KFLOPS
      16   0.53  90.57%   1.89%   7.55%  44843.537
      32   1.05  90.48%   3.81%   5.71%  44390.572
      64   2.13  90.14%   2.35%   7.51%  44615.905
     128   4.23  90.54%   3.07%   6.38%  44390.572
     256   8.46  90.19%   2.84%   6.97%  44672.596
     512  17.03  90.55%   2.76%   6.69%  44250.892
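To put the two flag sets side by side, the KFLOPS columns above can be averaged and compared. The following is a minimal Python sketch for illustration only; the variable names are not from the original page, and the numbers are copied from the two result tables.

```python
from statistics import mean

# KFLOPS column of the two Linpack runs above (16 to 512 reps).
# "baseline" is the first flag set, "tuned" the second (names are illustrative).
baseline = [37885.057, 41459.119, 37561.254, 38381.368, 38173.000, 38634.432]
tuned = [44843.537, 44390.572, 44615.905, 44390.572, 44672.596, 44250.892]

baseline_mean = mean(baseline)
tuned_mean = mean(tuned)
speedup = tuned_mean / baseline_mean

print(f"baseline: {baseline_mean:.0f} KFLOPS")
print(f"tuned:    {tuned_mean:.0f} KFLOPS")
print(f"speedup:  {speedup:.2f}x")
```

On these figures the second flag set comes out roughly 15% faster on average. Each run is a single sample, so small differences between rows are best read as noise.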

Whetstone/Dhrystone
TODO

OpenSSL
How to test
Run openssl speed.

Results
Linaro-alip soft-float

OpenSSL 1.0.1 14 Mar 2012
built on: Tue Aug 21 05:35:49 UTC 2012
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_NO_TLS1_2_CLIENT -DOPENSSL_MAX_TLS1_2_CIPHER_LENGTH=50
The 'numbers' are in 1000s of bytes per second processed.

type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                 0.00         0.00         0.00         0.00         0.00
mdc2                0.00         0.00         0.00         0.00         0.00
md4              4539.13k    23584.98k    68988.33k   133520.04k   184363.69k
md5              5140.49k    17237.58k    46162.43k    79220.05k   100848.98k
hmac(md5)        6296.96k    20580.39k    51788.37k    83395.93k   101282.91k
sha1             5056.81k    15672.85k    36537.09k    54699.01k    64102.40k
rmd160           4733.01k    14162.58k    31460.95k    45231.10k    51950.93k
rc4             67049.00k    74935.98k    78372.86k    79348.39k    79623.51k
des cbc         17689.04k    18793.72k    19138.82k    19248.13k    19292.16k
des ede3         6748.10k     6951.38k     6998.10k     7015.77k     6950.87k
idea cbc            0.00         0.00         0.00         0.00         0.00
seed cbc        20640.20k    21906.09k    22347.52k    22450.52k    22500.69k
rc2 cbc         13089.00k    13998.74k    14224.73k    14294.36k    14164.76k
rc5-32/12 cbc       0.00         0.00         0.00         0.00         0.00
blowfish cbc    26759.62k    29755.75k    30726.06k    30958.59k    31053.14k
cast cbc        25870.12k    28393.51k    29254.23k    29501.78k    29570.39k
aes-128 cbc     19582.69k    20855.45k    21258.07k    21348.35k    21392.04k
aes-192 cbc     16902.33k    17731.03k    18009.26k    18094.42k    18117.97k
aes-256 cbc     14778.66k    15419.55k    15636.82k    15683.58k    15712.26k
camellia-128 cbc   26162.67k    28201.17k    28923.31k    29136.90k    28918.58k
camellia-192 cbc   20555.46k    22316.52k    22863.19k    22990.17k    23046.83k
camellia-256 cbc   20704.67k    22316.39k    22846.72k    23003.48k    23044.10k
sha256           4130.87k     9683.05k    17185.11k    21408.43k    23093.25k
sha512            804.45k     3218.84k     4525.99k     6147.07k     6873.09k
whirlpool        1201.69k     2457.88k     3979.18k     4716.20k     4917.93k
aes-128 ige     18517.42k    19858.50k    20280.58k    20406.61k    20838.75k
aes-192 ige     15950.20k    17003.69k    17323.18k    17393.32k    17408.00k
aes-256 ige     14102.48k    14868.65k    15100.93k    15172.95k    15174.31k
ghash           14806.49k    15383.55k    15564.03k    15625.22k    15652.18k

                  sign    verify    sign/s verify/s
rsa 512 bits 0.002293s 0.000203s    436.1   4920.6
rsa 1024 bits 0.012441s 0.000617s    80.4   1621.2
rsa 2048 bits 0.075263s 0.002055s    13.3    486.7
rsa 4096 bits 0.499048s 0.007148s     2.0    139.9

                  sign    verify    sign/s verify/s
dsa 512 bits 0.002058s 0.002299s    485.9    435.0
dsa 1024 bits 0.006101s 0.006964s   163.9    143.6
dsa 2048 bits 0.020326s 0.023641s    49.2     42.3

                              sign    verify    sign/s verify/s
160 bit ecdsa (secp160r1)  0.0010s   0.0045s    977.2    222.1
192 bit ecdsa (nistp192)  0.0011s   0.0046s    950.8    218.4
224 bit ecdsa (nistp224)  0.0014s   0.0062s    739.1    160.2
256 bit ecdsa (nistp256)  0.0016s   0.0079s    613.0    126.5
384 bit ecdsa (nistp384)  0.0036s   0.0184s    281.4     54.3
521 bit ecdsa (nistp521)  0.0096s   0.0510s    103.9     19.6
163 bit ecdsa (nistk163)  0.0021s   0.0080s    473.6    125.3
233 bit ecdsa (nistk233)  0.0044s   0.0155s    228.5     64.3
283 bit ecdsa (nistk283)  0.0067s   0.0286s    150.2     35.0
409 bit ecdsa (nistk409)  0.0178s   0.0667s     56.3     15.0
571 bit ecdsa (nistk571)  0.0426s   0.1538s     23.5      6.5
163 bit ecdsa (nistb163)  0.0021s   0.0086s    472.9    116.0
233 bit ecdsa (nistb233)  0.0043s   0.0173s    230.3     57.9
283 bit ecdsa (nistb283)  0.0067s   0.0320s    149.7     31.2
409 bit ecdsa (nistb409)  0.0178s   0.0759s     56.1     13.2
571 bit ecdsa (nistb571)  0.0428s   0.1760s     23.3      5.7

                              op     op/s
160 bit ecdh (secp160r1)  0.0038s    264.8
192 bit ecdh (nistp192)  0.0038s    263.9
224 bit ecdh (nistp224)  0.0052s    191.9
256 bit ecdh (nistp256)  0.0066s    151.4
384 bit ecdh (nistp384)  0.0152s     66.0
521 bit ecdh (nistp521)  0.0422s     23.7
163 bit ecdh (nistk163)  0.0040s    253.0
233 bit ecdh (nistk233)  0.0077s    130.0
283 bit ecdh (nistk283)  0.0142s     70.6
409 bit ecdh (nistk409)  0.0331s     30.2
571 bit ecdh (nistk571)  0.0760s     13.2
163 bit ecdh (nistb163)  0.0042s    235.8
233 bit ecdh (nistb233)  0.0085s    117.0
283 bit ecdh (nistb283)  0.0158s     63.1
409 bit ecdh (nistb409)  0.0378s     26.5
571 bit ecdh (nistb571)  0.0879s     11.4

ArchLinux-ARM hard-float

OpenSSL 1.0.1c 10 May 2012
built on: Sat May 12 16:58:09 UTC 2012
options:bn(64,32) md2(int) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -march=armv7-a -mfloat-abi=hard -mfpu=vfpv3-d16 -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -D_FORTIFY_SOURCE=2 -DOPENSSL_NO_TLS1_2_CLIENT -DTERMIO -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.

type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2              1010.38k     2071.59k     2919.79k     3215.08k     3322.59k
mdc2             2238.70k     2724.06k     2915.34k     3063.91k     3044.11k
md4              8261.57k    28911.65k    81103.94k   148492.57k   200026.24k
md5              6456.03k    20979.84k    54995.08k    89176.35k   111244.43k
hmac(md5)        6319.94k    21289.87k    54631.79k    89444.75k   110983.92k
sha1             6633.24k    20150.08k    47302.98k    70280.66k    82581.21k
rmd160           5493.36k    15627.12k    34127.48k    48159.05k    54297.05k
rc4             66233.79k    74331.73k    77396.54k    77693.81k    78829.52k
des cbc         18532.54k    19769.99k    20273.26k    20313.14k    20323.82k
des ede3         7169.37k     7346.22k     7416.20k     7478.53k     7461.20k
idea cbc        15485.30k    16443.47k    16683.46k    16698.54k    16758.24k
seed cbc        20667.10k    22857.28k    23349.77k    23677.72k    23609.99k
rc2 cbc         13686.09k    14637.63k    14956.66k    15077.94k    14912.37k
rc5-32/12 cbc       0.00         0.00         0.00         0.00         0.00
blowfish cbc    27451.80k    30338.11k    31082.36k    31144.23k    31523.17k
cast cbc        27317.50k    30075.42k    31215.36k    31145.33k    31403.65k
aes-128 cbc     35895.60k    40605.48k    43274.31k    43880.05k    44219.12k
aes-192 cbc     30897.64k    35908.66k    37676.55k    38116.09k    38425.23k
aes-256 cbc     27594.10k    31650.74k    33180.37k    33427.34k    33498.80k
camellia-128 cbc   26308.13k    29661.26k    31114.20k    31346.19k    31582.08k
camellia-192 cbc   21422.53k    23395.33k    24418.24k    24554.27k    24599.57k
camellia-256 cbc   21457.81k    23333.78k    24369.82k    24582.58k    24617.25k
sha256          10078.10k    24314.98k    43970.03k    55573.89k    59677.84k
sha512           4133.94k    16576.53k    25365.00k    35504.81k    40002.80k
whirlpool        1216.98k     2492.34k     4065.11k     4781.12k     5059.59k
aes-128 ige     31316.97k    38357.40k    41833.06k    43101.16k    43264.37k
aes-192 ige     27502.07k    34078.09k    36575.95k    37367.50k    37251.58k
aes-256 ige     24869.69k    30543.02k    32333.16k    32777.21k    33054.87k
ghash           52904.92k    62310.47k    66025.17k    66775.11k    66985.81k

                  sign    verify    sign/s verify/s
rsa 512 bits 0.001042s 0.000104s    960.0   9587.7
rsa 1024 bits 0.005983s 0.000327s   167.2   3060.2
rsa 2048 bits 0.038947s 0.001188s    25.7    841.7
rsa 4096 bits 0.280000s 0.004561s     3.6    219.2

                  sign    verify    sign/s verify/s
dsa 512 bits 0.001062s 0.001161s    942.0    861.4
dsa 1024 bits 0.003206s 0.003716s   311.9    269.1
dsa 2048 bits 0.011507s 0.013283s    86.9     75.3

                              sign    verify    sign/s verify/s
160 bit ecdsa (secp160r1)  0.0006s   0.0023s   1620.6    438.5
192 bit ecdsa (nistp192)  0.0008s   0.0033s   1259.9    304.1
224 bit ecdsa (nistp224)  0.0010s   0.0043s    991.3    232.9
256 bit ecdsa (nistp256)  0.0013s   0.0058s    790.4    173.8
384 bit ecdsa (nistp384)  0.0030s   0.0151s    338.2     66.2
521 bit ecdsa (nistp521)  0.0062s   0.0346s    161.1     28.9
163 bit ecdsa (nistk163)  0.0019s   0.0064s    536.1    157.0
233 bit ecdsa (nistk233)  0.0039s   0.0116s    257.6     85.9
283 bit ecdsa (nistk283)  0.0059s   0.0214s    169.2     46.8
409 bit ecdsa (nistk409)  0.0161s   0.0469s     62.0     21.3
571 bit ecdsa (nistk571)  0.0385s   0.1089s     25.9      9.2
163 bit ecdsa (nistb163)  0.0018s   0.0069s    544.3    145.1
233 bit ecdsa (nistb233)  0.0038s   0.0128s    259.9     78.3
283 bit ecdsa (nistb283)  0.0059s   0.0238s    169.3     42.0
409 bit ecdsa (nistb409)  0.0161s   0.0533s     62.1     18.8
571 bit ecdsa (nistb571)  0.0385s   0.1241s     25.9      8.1

                              op     op/s
160 bit ecdh (secp160r1)  0.0019s    515.5
192 bit ecdh (nistp192)  0.0027s    374.3
224 bit ecdh (nistp224)  0.0036s    278.7
256 bit ecdh (nistp256)  0.0049s    203.8
384 bit ecdh (nistp384)  0.0126s     79.2
521 bit ecdh (nistp521)  0.0288s     34.7
163 bit ecdh (nistk163)  0.0031s    319.4
233 bit ecdh (nistk233)  0.0057s    176.9
283 bit ecdh (nistk283)  0.0105s     94.9
409 bit ecdh (nistk409)  0.0231s     43.2
571 bit ecdh (nistk571)  0.0538s     18.6
163 bit ecdh (nistb163)  0.0033s    300.7
233 bit ecdh (nistb233)  0.0063s    158.9
283 bit ecdh (nistb283)  0.0118s     85.1
409 bit ecdh (nistb409)  0.0263s     38.1
571 bit ecdh (nistb571)  0.0615s     16.3
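To compare the two OpenSSL result sets, a throughput row can be split into an algorithm name and its five per-block-size figures, and the 8192-byte columns divided. The Python below is an illustrative sketch: parse_speed_row and the other names are hypothetical helpers, not part of OpenSSL, and the rows are copied from the results above.

```python
def parse_speed_row(line):
    """Split one throughput row of `openssl speed` output into (name, values).

    The last five whitespace-separated fields are the per-block-size figures
    (1000s of bytes per second); everything before them is the algorithm name.
    """
    fields = line.split()
    name = " ".join(fields[:-5])
    values = [float(field.rstrip("k")) for field in fields[-5:]]
    return name, values

# A few rows from the Linaro-alip soft-float results.
soft_rows = [
    "rc4             67049.00k    74935.98k    78372.86k    79348.39k    79623.51k",
    "aes-128 cbc     19582.69k    20855.45k    21258.07k    21348.35k    21392.04k",
    "sha256           4130.87k     9683.05k    17185.11k    21408.43k    23093.25k",
    "sha512            804.45k     3218.84k     4525.99k     6147.07k     6873.09k",
]
# The same rows from the ArchLinux-ARM hard-float results.
hard_rows = [
    "rc4             66233.79k    74331.73k    77396.54k    77693.81k    78829.52k",
    "aes-128 cbc     35895.60k    40605.48k    43274.31k    43880.05k    44219.12k",
    "sha256          10078.10k    24314.98k    43970.03k    55573.89k    59677.84k",
    "sha512           4133.94k    16576.53k    25365.00k    35504.81k    40002.80k",
]

soft = dict(parse_speed_row(row) for row in soft_rows)
hard = dict(parse_speed_row(row) for row in hard_rows)

# Ratio of the 8192-byte column (index 4): hard-float build over soft-float build.
ratios = {name: hard[name][4] / soft[name][4] for name in soft}
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:.2f}x")
```

On these rows rc4 is essentially unchanged, while aes-128 cbc, sha256 and sha512 are two to nearly six times faster. The ArchLinux-ARM compiler line lists -DAES_ASM, -DSHA256_ASM and -DSHA512_ASM, so the gap plausibly comes from the assembly implementations rather than from the float ABI alone; the two builds also differ in OpenSSL version and optimization level.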

SciMark2
Build
wget http://math.nist.gov/scimark2/scimark2_1c.zip
unzip -o scimark2_1c.zip -d scimark2_files
cd scimark2_files/
g++ -o scimark2 -O *.c -mcpu=cortex-a8 -mtune=cortex-a8 -march=armv7-a -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -fomit-frame-pointer -ffast-math -funroll-loops -funsafe-loop-optimizations -fno-tree-vectorize
./scimark2 -large

Results
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)    **
Using      2.00 seconds min time per kenel.
Composite Score:          29.32
FFT            Mflops:    13.57    (N=1048576)
SOR            Mflops:    48.51    (1000 x 1000)
MonteCarlo:    Mflops:    23.30
Sparse matmult Mflops:    34.22    (N=100000, nz=1000000)
LU             Mflops:    26.97    (M=1000, N=1000)
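The SciMark2 composite score is the arithmetic mean of the five kernel scores, which can be checked against the figures above. A small Python sketch (the dictionary name is illustrative):

```python
# Per-kernel Mflops from the SciMark2 -large run above.
kernels = {
    "FFT": 13.57,
    "SOR": 48.51,
    "MonteCarlo": 23.30,
    "Sparse matmult": 34.22,
    "LU": 26.97,
}

composite = sum(kernels.values()) / len(kernels)
print(f"Composite from printed kernel scores: {composite:.2f} Mflops")
```

The result agrees with the reported 29.32 to within 0.01; the small difference is expected because the kernel scores are printed to two decimals while the benchmark averages the unrounded values.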

GPU
Results for X11 libraries and framebuffer libraries may differ.

ioquake3
See ioquake3

es2_gears
X11 libraries:
 * 131 FPS


 * r3p0: 195-200 FPS
 * r3p0: 58-75 FPS - fullscreen (1024x768)

Framebuffer libraries: ?

glx_gears
X11 libraries + mesa:
 * 117 FPS
 * ~25 FPS - fullscreen (1024x768)

glmark2-es2
X11 libraries:

===========================================
    glmark2 2012.08
===========================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-400 MP
    GL_VERSION:    OpenGL ES 2.0
===========================================
[build] use-vbo=false: FPS: 48 FrameTime: 20.833 ms
[build] use-vbo=true: FPS: 55 FrameTime: 18.182 ms
[texture] texture-filter=nearest: FPS: 56 FrameTime: 17.857 ms
[texture] texture-filter=linear: FPS: 56 FrameTime: 17.857 ms
[texture] texture-filter=mipmap: FPS: 57 FrameTime: 17.544 ms
[shading] shading=gouraud: FPS: 50 FrameTime: 20.000 ms
[shading] shading=blinn-phong-inf: FPS: 50 FrameTime: 20.000 ms
[shading] shading=phong: FPS: 47 FrameTime: 21.277 ms
[bump] bump-render=high-poly: FPS: 37 FrameTime: 27.027 ms
[bump] bump-render=normals: FPS: 58 FrameTime: 17.241 ms
[bump] bump-render=height: FPS: 57 FrameTime: 17.544 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 30 FrameTime: 33.333 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 19 FrameTime: 52.632 ms
[pulsar] light=false:quads=5:texture=false: FPS: 59 FrameTime: 16.949 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 16 FrameTime: 62.500 ms
[desktop] effect=shadow:windows=4: FPS: 43 FrameTime: 23.256 ms
Error: Requested MapBuffer VBO update method but GL_OES_mapbuffer is not supported!
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: Unsupported
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 18 FrameTime: 55.556 ms
Error: Requested MapBuffer VBO update method but GL_OES_mapbuffer is not supported!
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: Unsupported
[ideas] speed=duration: FPS: 48 FrameTime: 20.833 ms
[jellyfish] : FPS: 43 FrameTime: 23.256 ms
Error: SceneTerrain requires Vertex Texture Fetch support, but GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS is 0
[terrain] : Unsupported
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 59 FrameTime: 16.949 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 54 FrameTime: 18.519 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 58 FrameTime: 17.241 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 57 FrameTime: 17.544 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 43 FrameTime: 23.256 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 56 FrameTime: 17.857 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 57 FrameTime: 17.544 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 56 FrameTime: 17.857 ms
===========================================
                                  glmark2 Score: 47
===========================================
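In the glmark2 output, FrameTime is simply 1000 / FPS in milliseconds, and the final score looks like the average FPS over the scenes that actually ran. The Python sketch below checks both against the figures above; it assumes the score is a plain average of per-scene FPS, and since it works from the rounded FPS values printed here, it can only approximate what glmark2 computed internally.

```python
# FPS of every scene that ran in the glmark2 output above
# (the three "Unsupported" scenes are left out).
fps = [
    48, 55,          # build
    56, 56, 57,      # texture
    50, 50, 47,      # shading
    37, 58, 57,      # bump
    30, 19,          # effect2d
    59,              # pulsar
    16, 43,          # desktop
    18,              # buffer (update-method=subdata)
    48,              # ideas
    43,              # jellyfish
    59, 54, 58,      # conditionals
    57, 43,          # function
    56, 57, 56,      # loop
]

def frame_time_ms(fps_value):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps_value

average_fps = sum(fps) / len(fps)

print(f"48 FPS -> {frame_time_ms(48):.3f} ms per frame")
print(f"average over {len(fps)} scenes: {average_fps:.2f} FPS")
```

The average of the printed values is a little under 48, in line with the reported score of 47.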

Video decoding
See CedarXVideoRenderingChart

A13 Benchmarks
The A13 needs its own CPU benchmarks because its DDR3 bus is crippled.

A10S Benchmarks
Results should be the same as for the A10.