Software optimization techniques which improve worst-case execution times #2: Knowing the Processor

Continuing our series of blog posts on optimizing embedded software with the aim of improving (i.e. reducing) worst-case execution times, this week we look at processors.
When optimizing code, it is important to have a good understanding of which operations the target processor can perform efficiently.
- Operations on variables of the processor’s native size (e.g. 8-bit, 16-bit or 32-bit) are generally faster than operations on variables of a smaller or larger size. When processing variables smaller than the native size, additional instructions are typically required to clear or sign-extend the top bits in the registers. Using variables larger than the native size, for example 32-bit integers on a 16-bit processor, incurs a large overhead because the overall result must be pieced together from its component parts.
- Operations on unsigned variables are often faster than the same operations on signed variables. Operations on integers are typically much faster than similar operations on floating point values. This is notably the case when software floating-point support is used.
- Multiply and divide operations are typically considerably slower than addition and subtraction.
- Some processors have a “barrel shifter” and can carry out shift operations in a single instruction. Shift by a variable amount is a very slow operation on other simpler processors as it is implemented by a loop in the assembler code.
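The first two points can be sketched in C. This is an illustrative example, not code from the post, and it assumes a 32-bit target; the function names are invented for the sketch. `uint_fast8_t` from `<stdint.h>` asks the toolchain for the fastest type of at least 8 bits, which typically maps to the native word.

```c
#include <stdint.h>

/* Potentially slower: an 8-bit signed counter on a 32-bit core may need
   extra truncation/sign-extension instructions after each update. */
static int8_t count_bytes_narrow(const uint8_t *buf, int8_t len, uint8_t key)
{
    int8_t hits = 0;
    for (int8_t i = 0; i < len; i++) {
        if (buf[i] == key) {
            hits++;
        }
    }
    return hits;
}

/* Usually faster: native-width unsigned types throughout, so the loop
   counter and accumulator live in full registers with no masking. */
static uint32_t count_bytes_native(const uint8_t *buf, uint_fast8_t len, uint8_t key)
{
    uint32_t hits = 0;
    for (uint_fast8_t i = 0; i < len; i++) {
        if (buf[i] == key) {
            hits++;
        }
    }
    return hits;
}
```

Whether the narrow version actually costs extra instructions depends on the core and the compiler, which is exactly why the measurements discussed below matter.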
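The divide and shift points combine neatly: for an unsigned operand, a divide by a power of two is equivalent to a shift by a constant, which a barrel shifter handles in a single instruction. A minimal sketch (illustrative names; modern compilers usually perform this rewrite themselves at any optimization level, so inspect the generated code before doing it by hand):

```c
#include <stdint.h>

/* May expand to a slow divide instruction, or a call to a software
   divide routine on cores with no hardware divider. */
static uint32_t scale_div(uint32_t x)
{
    return x / 8u;
}

/* Equivalent for unsigned values: a single constant shift. */
static uint32_t scale_shift(uint32_t x)
{
    return x >> 3;
}
```

Note that the equivalence holds only for unsigned values; for signed integers, division rounds toward zero while an arithmetic shift rounds toward negative infinity, so the compiler must emit extra fix-up instructions.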
Timing measurements and inspection of the compiler’s assembler output will reveal which “optimizations” actually improve execution times, and help identify those that result in slower code.
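As a rough sketch of such a measurement, the harness below times two candidate implementations of the same computation on a host machine using the standard library’s `clock()`. This is an illustration under stated assumptions: on the real target you would instead read a hardware cycle counter or use a trace unit, since host timings need not reflect target worst-case behaviour.

```c
#include <stdint.h>
#include <time.h>

/* Two candidate implementations of the same computation
   (illustrative: sum of i/8 for i in [0, n)). */
static uint32_t sum_div(uint32_t n)
{
    uint32_t s = 0;
    for (uint32_t i = 0; i < n; i++) {
        s += i / 8u;
    }
    return s;
}

static uint32_t sum_shift(uint32_t n)
{
    uint32_t s = 0;
    for (uint32_t i = 0; i < n; i++) {
        s += i >> 3;
    }
    return s;
}

/* Time one candidate; volatile stops the compiler discarding the call. */
static double seconds(uint32_t (*f)(uint32_t), uint32_t n)
{
    clock_t t0 = clock();
    volatile uint32_t result = f(n);
    (void)result;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Comparing `seconds(sum_div, n)` against `seconds(sum_shift, n)` for a large `n` gives a first indication; the assembler listing then confirms whether the two loops really compile to different instruction sequences.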