In a recent blog post we observed how the presence of advanced hardware features in modern processors makes it more difficult to establish the worst-case execution time (WCET) of an application. Continuing this theme, let’s examine the use of pipelined processor architectures and the effect that this has on WCET in real-time systems.
Pipelined processor architectures explained
The inclusion of a pipeline increases the instruction throughput of a processor by increasing the number of operations that the CPU can perform simultaneously. Typically, a pipelined architecture will divide each instruction into a sequence of steps, where each step can be executed in a single clock cycle. An example of this is the classic RISC pipeline, which divides the execution of an instruction into five separate stages. More modern designs use longer pipelines allowing increased maximum throughput and higher clock speeds.
How do pipelined processor architectures affect WCET in real-time systems?
Although the use of a pipeline increases the overall instruction throughput, it becomes more difficult to predict the length of time that it will take to execute a given instruction. This is particularly the case where the result of one instruction is used as a parameter to the next, as the pipeline must wait until the result is ready before it can continue processing. This is known as a stall in the pipeline.
A major cause of pipeline stalls is branches in the code, which are typically implemented as conditional jumps based on the state of a condition flag or other register value. Until the calculation of the condition is complete, the CPU is unable to fetch the next instruction, and so the pipeline stalls until the branch destination is known.
The length of the stall depends on the number of stages in the pipeline prior to the execution stage, so architectures with longer pipelines generally experience longer stall periods. In code with large numbers of branches, this can result in the pipeline being stalled for a significant proportion of the total execution time, reducing the overall throughput of the pipeline.
Overcoming pipeline stalls with branch prediction algorithms
To combat this problem, modern processors incorporate branch prediction algorithms that aim to reduce the number of pipeline stalls caused by branches. Branch prediction aims to guess the outcome of a branch instruction before it is executed, allowing the CPU to continue executing instructions in the time where it would otherwise experience a pipeline stall. The most naïve form of branch prediction assumes that all branches are always taken: statistically, this approach results in correct prediction of branches half the time. Advanced branch prediction algorithms are able to correctly predict which branch will be taken in excess of 90% of the time.
What effect does branch prediction have on WCET?
The use of branch prediction in CPUs greatly increases the total instruction throughput of the processor, but causes problems for WCET analysis. Because of the delaying effect of a pipeline stall, the execution time of a branch may be considerably longer if the branch predictor is incorrect than it would be if the correct outcome was predicted. This leads to a significant difference between average-case execution and the potential worst-case time. Assuming that the branch prediction is always incorrect for every branch will ensure that the WCET value is not optimistic, but often leads to measurements that are highly pessimistic. Static analysis techniques employ statistical methods to derive an upper bound on the number of branch predictions that will be incorrect, but this is a complex procedure that must be repeated for each analysis.
RapiTime and WCET
For measurement-based techniques, such as end-to-end it becomes increasingly important to ensure that the worst-case path through the code is adequately exercised and that the data used is sufficiently representative to expose all the possible incorrect branch predictions.
RapiTime combines the structural model of the code derived during instrumentation with the timing data it derives from the execution trace. Using this combined data, RapiTime predicts the worst-case path through the code and the worst-case execution time.
RapiTime's approach to testing and the detailed information provided in the RapiTime report can be used to ensure that testing is effective and provide assurance that the calculated WCET value is correct.