Accelerating scc startup speed with code generation #594

apocelipes · 2025-02-22T19:56:51Z

This PR skips the JSON serialization step and converts the language data directly into Go code, which saves considerable startup time.
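To make the idea concrete, here is a minimal sketch of the two loading strategies. It is not the code in this PR: the `Language` struct is a trimmed-down, hypothetical stand-in for the definitions in processor/structs.go, and `loadFromJSON`, `encodedLanguages` and `generatedLanguages` are names invented for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Language is a hypothetical, trimmed-down stand-in for the real struct.
type Language struct {
	LineComment []string `json:"line_comment"`
	Extensions  []string `json:"extensions"`
}

// encodedLanguages stands in for the embedded base64 JSON blob. A real build
// would embed a precomputed constant; it is computed here only so the sketch
// stays self-contained.
var encodedLanguages = base64.StdEncoding.EncodeToString(
	[]byte(`{"Go":{"line_comment":["//"],"extensions":["go"]}}`))

// loadFromJSON mimics the old startup path: base64-decode, then unmarshal.
func loadFromJSON() (map[string]Language, error) {
	raw, err := base64.StdEncoding.DecodeString(encodedLanguages)
	if err != nil {
		return nil, err
	}
	var db map[string]Language
	if err := json.Unmarshal(raw, &db); err != nil {
		return nil, err
	}
	return db, nil
}

// generatedLanguages mimics generator output: a plain Go literal that needs
// no decoding or parsing when the program starts.
var generatedLanguages = map[string]Language{
	"Go": {LineComment: []string{"//"}, Extensions: []string{"go"}},
}

func main() {
	db, err := loadFromJSON()
	if err != nil {
		panic(err)
	}
	// Both paths yield the same data; only the startup work differs.
	fmt.Println(db["Go"].Extensions[0] == generatedLanguages["Go"].Extensions[0])
}
```

Both paths end up with the same map. The difference is that the literal is built by compiled initialization code rather than by decoding and parsing a string at runtime.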

"scc --languages" is now 40% faster:

Running on a small code base can be 7% faster, and on big code bases the gain is 1~3%:

Pros:

Code changes are more readable
Faster startup
Human readable data (Go code is more readable than base64 encoded JSON)
Lower memory use: the language data contains a large number of repeated string constants, which the compiler can optimize.

Cons:

A 12k-line source file may be unpleasant to see, even if it is generated by a tool.
To generate the code, "scripts/include.go" uses a few tricks, such as copying the structure definitions from "processor/structs.go", which makes the code not particularly elegant.
The binary size increases by 200 KB.
Performance degrades when languageDatabase is not required, because the language data is now always loaded at startup. For example, scc --help is 10% slower than before. However, very few scenarios don't require language data.
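For readers unfamiliar with this kind of generator, the sketch below shows one way to render a map as Go source. It is not the logic in "scripts/include.go": the `generate` function, the one-field `Language` struct, and the emitted package name `processor` are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"sort"
)

// Language is a hypothetical one-field stand-in for the real struct.
type Language struct {
	Extensions []string
}

// generate renders db as Go source declaring a package-level map literal.
func generate(db map[string]Language) ([]byte, error) {
	names := make([]string, 0, len(db))
	for name := range db {
		names = append(names, name)
	}
	// Sort the keys so repeated runs produce identical, reviewable output.
	sort.Strings(names)

	var buf bytes.Buffer
	buf.WriteString("// Code generated by a hypothetical generator. DO NOT EDIT.\n\n")
	buf.WriteString("package processor\n\n") // assumed package name
	buf.WriteString("var languageDatabase = map[string]Language{\n")
	for _, name := range names {
		// %q writes an escaped Go string literal; %#v writes the slice in
		// Go syntax, e.g. []string{"go"}.
		fmt.Fprintf(&buf, "%q: {Extensions: %#v},\n", name, db[name].Extensions)
	}
	buf.WriteString("}\n")

	// format.Source applies gofmt formatting and rejects invalid syntax.
	return format.Source(buf.Bytes())
}

func main() {
	src, err := generate(map[string]Language{
		"Go": {Extensions: []string{"go"}},
		"C":  {Extensions: []string{"c", "h"}},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(src))
}
```

Passing the output through `format.Source` keeps the generated file gofmt-clean and catches syntax errors in the generator early, which helps when the result is a file of several thousand lines.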

If you prefer compressed JSON data, that's fine. This code generator is only my weekend DIY :)

Accelerating scc startup speed with code generation

752ac34

boytertesting (bot) added the labels VH/complexity (Very high complexity) and XL/size (Extra large change) on Feb 22, 2025
