Skip nop instructions #638

Open
wants to merge 7 commits into
base: master

chfast
Collaborator

chfast commented Nov 10, 2020

This was originally included in #622.

I can't get a good measurement of the performance on Haswell (I will try other architectures later).
But this is rather a no-brainer, as we simply execute fewer interpreter loop iterations and the "internal code" is smaller.
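
To illustrate the idea, here is a minimal sketch and not Fizzy's actual parser. The functions `translate` and `count_dispatched`, the flat byte-vector "internal code", and the restriction to opcodes without immediates are all assumptions made for this example; only the opcode values are taken from the WebAssembly specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values from the WebAssembly specification.
constexpr uint8_t OP_NOP = 0x01;
constexpr uint8_t OP_END = 0x0B;
constexpr uint8_t OP_DROP = 0x1A;
constexpr uint8_t OP_I32_EQZ = 0x45;

// Hypothetical translation step: copies a function body made only of
// immediate-free opcodes into "internal code", leaving out every nop.
std::vector<uint8_t> translate(const std::vector<uint8_t>& body)
{
    std::vector<uint8_t> internal_code;
    internal_code.reserve(body.size());
    for (const uint8_t opcode : body)
    {
        if (opcode == OP_NOP)
            continue;  // A nop has no effect, so it is never emitted.
        internal_code.push_back(opcode);
    }
    return internal_code;
}

// Hypothetical interpreter loop that only counts dispatched instructions
// (one loop iteration per instruction, stopping at the first end).
std::size_t count_dispatched(const std::vector<uint8_t>& internal_code)
{
    std::size_t n = 0;
    for (const uint8_t opcode : internal_code)
    {
        ++n;
        if (opcode == OP_END)
            break;
    }
    return n;
}
```

In this sketch, a body with three `nop`s among six instructions shrinks to three bytes of internal code and takes three loop iterations instead of six, which is the kind of effect the two tables below report.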

Internal code size (opcodes and immediate values)
                            Before      After     Change
--------------------------------------------------------
blake2b                       5467       5431      -0.7%
ecpairing                   356044     355056      -0.3%
keccak256                     9807       9677      -1.3%
memset                        1043       1035      -0.8%
mul256_opt0                   1742       1730      -0.7%
ramanujan_pi                  4922       4828      -1.9%
sha1                          8804       8704      -1.1%
sha256                       14902      14806      -0.6%
taylor_pi                      259        253      -2.3%
micro/eli_interpreter          411        393      -4.4%

Number of instructions executed
                                          Before      After     Change
----------------------------------------------------------------------
blake2b/512_bytes_rounds_1                 46389      44144      -4.8%
blake2b/512_bytes_rounds_16               711684     676709      -4.9%
ecpairing/onepoint                     197487108  195713282      -0.9%
keccak256/512_bytes_rounds_1               44148      43508      -1.4%
keccak256/512_bytes_rounds_16             654045     644700      -1.4%
memset/256_bytes                            4243       3974      -6.3%
memset/60000_bytes                        956413     894533      -6.5%
mul256_opt0/input1                         14755      14647      -0.7%
ramanujan_pi/33_runs                       67378      62112      -7.8%
sha1/512_bytes_rounds_1                    54845      53436      -2.6%
sha1/512_bytes_rounds_16                  773210     753351      -2.6%
sha256/512_bytes_rounds_1                  47934      47823      -0.2%
sha256/512_bytes_rounds_16                670314     669153      -0.2%
taylor_pi/pi_1000000_runs               25000006   24000004      -4.0%
micro/eli_interpreter/exec105               2829       1989     -29.7%
micro/factorial/20                           206        206      -0.0%
micro/fibonacci/24                       1911030    1911030      -0.0%
micro/host_adler32/1                          15         13     -13.3%
micro/host_adler32/1000                    12003      11002      -8.3%
micro/icall_hash/1000_steps                26013      25011      -3.9%
micro/spinner/1                                8          6     -25.0%
micro/spinner/1000                          6002       5001     -16.7%
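
The "Change" column in both tables is consistent with the relative difference (After - Before) / Before, shown as a percentage with one decimal place. The thread does not say how the column was produced, so the helper below (`change_percent`, a name invented here) is only a sketch of that arithmetic.

```cpp
#include <cmath>
#include <cstdint>

// Relative change from `before` to `after` in percent,
// rounded to one decimal place (assumed formula, see lead-in).
double change_percent(uint64_t before, uint64_t after)
{
    const double change = (static_cast<double>(after) - static_cast<double>(before)) /
                          static_cast<double>(before) * 100.0;
    return std::round(change * 10.0) / 10.0;
}
```

For example, `change_percent(5467, 5431)` reproduces the -0.7 reported for the blake2b code size, and `change_percent(2829, 1989)` reproduces the -29.7 reported for `micro/eli_interpreter/exec105`.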

chfast requested review from gumb0 and axic November 10, 2020 21:42
Collaborator Author

chfast commented Nov 10, 2020

Haswell 4 GHz, GCC10 LTO

Benchmark                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/execute/blake2b/512_bytes_rounds_1_mean                     -0.0641         -0.0641            75            70            75            70
fizzy/execute/blake2b/512_bytes_rounds_16_mean                    +0.0177         +0.0177          1040          1058          1040          1058
fizzy/execute/ecpairing/onepoint_mean                             +0.0098         +0.0098        356215        359701        356218        359703
fizzy/execute/keccak256/512_bytes_rounds_1_mean                   -0.2045         -0.2045            96            77            96            77
fizzy/execute/keccak256/512_bytes_rounds_16_mean                  -0.2076         -0.2076          1408          1115          1408          1115
fizzy/execute/memset/256_bytes_mean                               +0.0619         +0.0619             6             6             6             6
fizzy/execute/memset/60000_bytes_mean                             +0.0683         +0.0683          1322          1412          1322          1412
fizzy/execute/mul256_opt0/input1_mean                             -0.0019         -0.0019            24            24            24            24
fizzy/execute/ramanujan_pi/33_runs_mean                           -0.0180         -0.0180           101            99           101            99
fizzy/execute/sha1/512_bytes_rounds_1_mean                        +0.0348         +0.0348            78            81            78            81
fizzy/execute/sha1/512_bytes_rounds_16_mean                       +0.0413         +0.0413          1084          1128          1084          1128
fizzy/execute/sha256/512_bytes_rounds_1_mean                      +0.0064         +0.0064            77            78            77            78
fizzy/execute/sha256/512_bytes_rounds_16_mean                     +0.0072         +0.0072          1061          1069          1061          1069
fizzy/execute/taylor_pi/pi_1000000_runs_mean                      -0.0122         -0.0122         36663         36215         36663         36215
fizzy/execute/micro/eli_interpreter/exec105_mean                  -0.0219         -0.0219             4             4             4             4
fizzy/execute/micro/factorial/20_mean                             +0.0120         +0.0120             1             1             1             1
fizzy/execute/micro/fibonacci/24_mean                             +0.0265         +0.0265          4712          4837          4712          4837
fizzy/execute/micro/host_adler32/1_mean                           +0.0055         +0.0055             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                        +0.0109         +0.0109            28            29            28            29
fizzy/execute/micro/icall_hash/1000_steps_mean                    +0.0372         +0.0372            62            64            62            64
fizzy/execute/micro/spinner/1_mean                                -0.0975         -0.0975             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                             +0.0217         +0.0217             7             8             7             8

Review thread on lib/fizzy/parser_expr.cpp (outdated, resolved)

codecov bot commented Nov 12, 2020

Codecov Report

Merging #638 (2bdbbc5) into master (b58d6e6) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #638   +/-   ##
=======================================
  Coverage   98.36%   98.36%           
=======================================
  Files          69       69           
  Lines        9622     9626    +4     
=======================================
+ Hits         9465     9469    +4     
  Misses        157      157           

Collaborator Author

chfast commented Nov 12, 2020

The last 4 commits are to be squashed, but I'm leaving them as they are for the review period.

Collaborator Author

chfast commented Nov 12, 2020

EPYC 7601 2.2 GHz, GCC 10

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/parse/blake2b_mean                                           +0.0055         +0.0055            53            53            53            53
fizzy/instantiate/blake2b_mean                                     -0.0599         -0.0599            65            61            64            61
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      +0.0087         +0.0087           218           220           218           220
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     +0.0157         +0.0159          3327          3379          3326          3379
fizzy/parse/ecpairing_mean                                         -0.0203         -0.0203          2803          2746          2803          2746
fizzy/instantiate/ecpairing_mean                                   -0.0163         -0.0163          2884          2837          2884          2837
fizzy/execute/ecpairing/onepoint_mean                              -0.0506         -0.0504       1163668       1104783       1163290       1104675
fizzy/parse/keccak256_mean                                         -0.0165         -0.0165            96            94            96            94
fizzy/instantiate/keccak256_mean                                   -0.0116         -0.0116           101           100           101           100
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    +0.0023         +0.0023           261           262           261           262
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   +0.0032         +0.0029          3855          3867          3855          3866
fizzy/parse/memset_mean                                            -0.0238         -0.0238            13            13            13            13
fizzy/instantiate/memset_mean                                      -0.0189         -0.0190            19            18            19            18
fizzy/execute/memset/256_bytes_mean                                -0.0290         -0.0289            20            19            20            19
fizzy/execute/memset/60000_bytes_mean                              -0.0082         -0.0082          4307          4271          4306          4271
fizzy/parse/mul256_opt0_mean                                       +0.0454         +0.0454            17            18            17            18
fizzy/instantiate/mul256_opt0_mean                                 +0.0143         +0.0143            22            23            22            23
fizzy/execute/mul256_opt0/input1_mean                              +0.0085         +0.0087            92            93            92            93
fizzy/parse/ramanujan_pi_mean                                      -0.0040         -0.0041            56            55            56            55
fizzy/instantiate/ramanujan_pi_mean                                -0.0127         -0.0127            62            61            62            61
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.0422         -0.0422           434           415           434           415
fizzy/parse/sha1_mean                                              -0.0080         -0.0080            87            86            87            86
fizzy/instantiate/sha1_mean                                        -0.0043         -0.0043            93            92            93            92
fizzy/execute/sha1/512_bytes_rounds_1_mean                         +0.0539         +0.0539           234           247           234           247
fizzy/execute/sha1/512_bytes_rounds_16_mean                        +0.0553         +0.0553          3272          3453          3272          3453
fizzy/parse/sha256_mean                                            -0.0117         -0.0117           147           145           147           145
fizzy/instantiate/sha256_mean                                      -0.0015         -0.0010           151           151           151           151
fizzy/execute/sha256/512_bytes_rounds_1_mean                       +0.1339         +0.1339           242           275           242           275
fizzy/execute/sha256/512_bytes_rounds_16_mean                      +0.1436         +0.1436          3356          3838          3355          3837
fizzy/parse/taylor_pi_mean                                         -0.0246         -0.0251             6             5             6             5
fizzy/instantiate/taylor_pi_mean                                   -0.0081         -0.0084            11            11            11            11
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       -0.3378         -0.3378        118338         78367        118329         78361
fizzy/parse/micro/eli_interpreter_mean                             -0.0419         -0.0419             8             8             8             8
fizzy/instantiate/micro/eli_interpreter_mean                       -0.0233         -0.0233            14            13            14            13
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.1728         -0.1728            13            10            13            10
fizzy/parse/micro/factorial_mean                                   -0.0806         -0.0807             2             2             2             2
fizzy/instantiate/micro/factorial_mean                             -0.0580         -0.0580             2             2             2             2
fizzy/execute/micro/factorial/20_mean                              -0.0043         -0.0044             1             1             1             1
fizzy/parse/micro/fibonacci_mean                                   -0.0647         -0.0647             3             2             3             2
fizzy/instantiate/micro/fibonacci_mean                             -0.0508         -0.0510             3             3             3             3
fizzy/execute/micro/fibonacci/24_mean                              -0.2370         -0.2370         13666         10427         13665         10426
fizzy/parse/micro/host_adler32_mean                                -0.0591         -0.0591             4             3             4             3
fizzy/instantiate/micro/host_adler32_mean                          -0.0433         -0.0433             7             6             7             6
fizzy/execute/micro/host_adler32/1_mean                            +0.0040         +0.0039             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         +0.0203         +0.0203            57            58            57            58
fizzy/parse/micro/icall_hash_mean                                  -0.0265         -0.0270             7             6             7             6
fizzy/instantiate/micro/icall_hash_mean                            -0.0192         -0.0198            12            12            12            12
fizzy/execute/micro/icall_hash/1000_steps_mean                     +0.0005         +0.0006           122           123           122           122
fizzy/parse/micro/spinner_mean                                     -0.1026         -0.1025             2             2             2             2
fizzy/instantiate/micro/spinner_mean                               -0.0745         -0.0745             2             2             2             2
fizzy/execute/micro/spinner/1_mean                                 -0.0711         -0.0711             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.0071         -0.0071            17            17            17            17
fizzy/parse/stress/guido-fuzzer-find-1_mean                        +0.0295         +0.0295           247           254           247           254
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  +0.0288         +0.0291           288           296           288           296

Collaborator Author

chfast commented Nov 12, 2020

EPYC 7601 2.2 GHz, GCC10 LTO

Benchmark                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
fizzy/parse/blake2b_mean                                           +0.0300         +0.0300            54            55            54            55
fizzy/instantiate/blake2b_mean                                     -0.0325         -0.0318            65            63            65            63
fizzy/execute/blake2b/512_bytes_rounds_1_mean                      +0.0478         +0.0478           210           220           210           220
fizzy/execute/blake2b/512_bytes_rounds_16_mean                     +0.0517         +0.0517          3184          3349          3184          3349
fizzy/parse/ecpairing_mean                                         +0.0415         +0.0414          2791          2907          2791          2907
fizzy/instantiate/ecpairing_mean                                   +0.0492         +0.0493          2887          3029          2886          3028
fizzy/execute/ecpairing/onepoint_mean                              -0.0148         -0.0150       1122826       1106192       1122764       1105866
fizzy/parse/keccak256_mean                                         +0.0217         +0.0220            96            98            96            98
fizzy/instantiate/keccak256_mean                                   +0.0246         +0.0246           101           104           101           104
fizzy/execute/keccak256/512_bytes_rounds_1_mean                    -0.0548         -0.0548           270           255           270           255
fizzy/execute/keccak256/512_bytes_rounds_16_mean                   -0.0580         -0.0580          3995          3764          3995          3763
fizzy/parse/memset_mean                                            +0.0265         +0.0267            13            14            13            14
fizzy/instantiate/memset_mean                                      +0.0078         +0.0078            19            19            19            19
fizzy/execute/memset/256_bytes_mean                                -0.2477         -0.2477            26            20            26            20
fizzy/execute/memset/60000_bytes_mean                              -0.2528         -0.2526          5847          4369          5845          4369
fizzy/parse/mul256_opt0_mean                                       +0.0172         +0.0177            17            18            17            18
fizzy/instantiate/mul256_opt0_mean                                 +0.0149         +0.0149            23            23            23            23
fizzy/execute/mul256_opt0/input1_mean                              -0.0396         -0.0396            84            81            84            81
fizzy/parse/ramanujan_pi_mean                                      +0.0161         +0.0159            56            57            56            57
fizzy/instantiate/ramanujan_pi_mean                                +0.0197         +0.0197            61            63            61            63
fizzy/execute/ramanujan_pi/33_runs_mean                            -0.0980         -0.0980           437           394           437           394
fizzy/parse/sha1_mean                                              +0.0305         +0.0304            87            89            87            89
fizzy/instantiate/sha1_mean                                        +0.0258         +0.0259            93            95            93            95
fizzy/execute/sha1/512_bytes_rounds_1_mean                         +0.1289         +0.1289           220           248           220           248
fizzy/execute/sha1/512_bytes_rounds_16_mean                        +0.1282         +0.1283          3066          3459          3065          3459
fizzy/parse/sha256_mean                                            +0.0368         +0.0369           143           148           143           148
fizzy/instantiate/sha256_mean                                      +0.0303         +0.0304           150           154           150           154
fizzy/execute/sha256/512_bytes_rounds_1_mean                       +0.0317         +0.0317           264           273           264           273
fizzy/execute/sha256/512_bytes_rounds_16_mean                      +0.0315         +0.0315          3673          3789          3673          3788
fizzy/parse/taylor_pi_mean                                         -0.0341         -0.0344             6             5             6             5
fizzy/instantiate/taylor_pi_mean                                   -0.0121         -0.0118            11            11            11            11
fizzy/execute/taylor_pi/pi_1000000_runs_mean                       +0.0045         +0.0046         77897         78248         77885         78243
fizzy/parse/micro/eli_interpreter_mean                             -0.0121         -0.0120             8             8             8             8
fizzy/instantiate/micro/eli_interpreter_mean                       -0.0113         -0.0112            14            13            14            13
fizzy/execute/micro/eli_interpreter/exec105_mean                   -0.3408         -0.3406            13             8            13             8
fizzy/parse/micro/factorial_mean                                   -0.0518         -0.0519             2             2             2             2
fizzy/instantiate/micro/factorial_mean                             -0.0428         -0.0428             2             2             2             2
fizzy/execute/micro/factorial/20_mean                              -0.0070         -0.0070             1             1             1             1
fizzy/parse/micro/fibonacci_mean                                   -0.0619         -0.0620             3             2             3             2
fizzy/instantiate/micro/fibonacci_mean                             -0.0438         -0.0438             3             3             3             3
fizzy/execute/micro/fibonacci/24_mean                              -0.0136         -0.0136         10687         10542         10687         10541
fizzy/parse/micro/host_adler32_mean                                -0.0476         -0.0475             4             3             4             3
fizzy/instantiate/micro/host_adler32_mean                          -0.0364         -0.0363             7             6             7             6
fizzy/execute/micro/host_adler32/1_mean                            +0.0024         +0.0024             0             0             0             0
fizzy/execute/micro/host_adler32/1000_mean                         -0.0030         -0.0030            60            60            60            60
fizzy/parse/micro/icall_hash_mean                                  -0.0225         -0.0225             7             6             7             6
fizzy/instantiate/micro/icall_hash_mean                            -0.0186         -0.0188            12            12            12            12
fizzy/execute/micro/icall_hash/1000_steps_mean                     -0.0121         -0.0121           124           123           124           123
fizzy/parse/micro/spinner_mean                                     -0.0958         -0.0958             2             2             2             2
fizzy/instantiate/micro/spinner_mean                               -0.0624         -0.0624             2             2             2             2
fizzy/execute/micro/spinner/1_mean                                 -0.0933         -0.0933             0             0             0             0
fizzy/execute/micro/spinner/1000_mean                              -0.0235         -0.0235            17            17            17            17
fizzy/parse/stress/guido-fuzzer-find-1_mean                        -0.0205         -0.0205           251           245           251           245
fizzy/instantiate/stress/guido-fuzzer-find-1_mean                  -0.0054         -0.0054           289           288           289           288

Collaborator

gumb0 commented Nov 12, 2020

Test suggestion: a function containing all skipped opcodes; check that it's parsed to a single end.

Collaborator Author

chfast commented Nov 12, 2020

> Test suggestion: a function containing all skipped opcodes; check that it's parsed to a single end.

Added.
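
A test of this shape could look roughly like the sketch below. It is not the test added in this PR: `parse_body` and `body_of_skipped` are hypothetical stand-ins, the sketch ignores immediates and block structure, and the set of skipped opcodes is passed in as a parameter because the thread does not list it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parser: drops every opcode listed in `skipped`
// and copies everything else unchanged.
std::vector<uint8_t> parse_body(
    const std::vector<uint8_t>& body, const std::vector<uint8_t>& skipped)
{
    std::vector<uint8_t> code;
    for (const uint8_t opcode : body)
    {
        if (std::find(skipped.begin(), skipped.end(), opcode) == skipped.end())
            code.push_back(opcode);
    }
    return code;
}

// Builds a body containing each skipped opcode once, terminated by end (0x0B).
std::vector<uint8_t> body_of_skipped(const std::vector<uint8_t>& skipped)
{
    std::vector<uint8_t> body = skipped;
    body.push_back(0x0B);
    return body;
}
```

With `skipped = {0x01}` (only `nop`, an assumption for the example), `parse_body(body_of_skipped(skipped), skipped)` yields the single byte `0x0B`, that is, a lone `end`.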

axic added the "optimization" (Performance optimization) label Mar 29, 2021
Labels
optimization Performance optimization
3 participants