How to Guide the Compiler to Speed up Your Code
Modern compilers are pretty smart, but at times it’s hard for them to figure out the best optimization. Luckily for us, we can guide them.
Modern compilers don’t just translate our code as is from a high-level language to assembly language (or machine instructions). They spend a great deal of time and effort optimizing it to achieve the best performance.
This is of course only the case when the right flags are passed to the compiler. You can also instruct the compiler to optimize for binary size or compilation time instead.
This article focuses on optimising runtime performance.
Disclaimers:
Most of the examples in this article use C++, but I believe the content is useful for everyone.
The content of this article is my own and does not reflect the views of the organisation I work for.
Modern CPUs
Let’s talk a little about modern CPUs. This is often taught at school, but we tend to forget it.
SIMD vs SISD
SISD stands for Single Instruction Stream, Single Data Stream. Typically, a program’s code is executed in sequence, i.e. one instruction after another, each working on a single piece of data. Let’s say we have two arrays, a and b, and we want to write a program that updates each element in a with the following operation:
a[i] = a[i] + b[i];
for each index i in the arrays.
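To make the operation concrete, here is a minimal sketch of it written as a plain scalar loop. The function name add_arrays and the use of std::vector<int> are my own assumptions for illustration; the article does not specify them. Conceptually, each iteration performs one addition on one pair of elements, which matches the SISD picture (an optimizing compiler may still transform this loop behind the scenes).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original article): adds b into a,
// one element per loop iteration, mirroring the SISD mental model.
void add_arrays(std::vector<int>& a, const std::vector<int>& b) {
    // Assumption: both arrays have the same length.
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = a[i] + b[i];  // one addition on one pair of elements
    }
}
```

For arrays of n elements, the loop body runs n times, so in this model the work grows linearly with the array length.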

This is how we often picture our code being executed on the CPU. We also tend to focus on optimizing the asymptotic (big-O) complexity, and yes, that is the right practice.
But modern CPUs can do better!