Previously, we looked at how WCET (worst-case execution time) is measured in many applications. In this second of three posts, we turn to the limitations of current practices and how they can be overcome through automated timing analysis (specifically, RapiTime).
Last week, we established that the "state of practice" approach to measuring WCET is to:
- Predict worst-case paths through the code;
- Create test cases to execute these worst-case paths;
- Set up measurement;
- Measure the time to execute the test cases (a minimal sketch of these last two steps follows below).
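As a rough illustration of these last two steps, here is a minimal high-water-mark measurement harness in C. Everything here is an illustrative stand-in: on real hardware, read_timer would read a cycle counter or memory-mapped timer rather than clock(), and function_under_test represents the code driven by the test cases:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Stand-in for a platform-specific timer read; real measurement
 * would typically read a hardware cycle counter instead. */
static uint64_t read_timer(void)
{
    return (uint64_t)clock();
}

/* Illustrative function under test: the test-case index selects a
 * path with a different execution time. */
static volatile int sink;
static void function_under_test(int test_case)
{
    for (int i = 0; i < test_case * 1000; i++) {
        sink = i;  /* volatile write keeps the loop from being optimised away */
    }
}

#define NUM_TEST_CASES 100

int main(void)
{
    uint64_t worst = 0;

    for (int tc = 0; tc < NUM_TEST_CASES; tc++) {
        uint64_t start = read_timer();
        function_under_test(tc);
        uint64_t end = read_timer();

        uint64_t elapsed = end - start;
        if (elapsed > worst) {
            worst = elapsed;
        }
    }

    /* The reported "WCET" is only as good as the test cases: if none
     * of them drives the worst-case path, this figure is optimistic. */
    printf("High-water mark: %llu timer ticks\n", (unsigned long long)worst);
    return 0;
}
```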
However, it is difficult and effort-intensive to identify worst-case paths through code because:
- Predicting which parts of the code are responsible for large execution times is difficult:
  - Most of the code will not affect the worst case and can safely be ignored. However, if we don't know which code falls into this category, we can waste time examining it.
  - Some code will affect the worst case only slightly: reducing or increasing its execution time has a marginal effect on the overall WCET.
  - A small part of the code will have a significant effect on the WCET.
- There isn't a straightforward relationship between the source code and the execution time (see the sketch after this list):
  - A simple assignment statement (especially in C++ or Ada) might result in a significant number of operations if it copies a complex data structure;
  - Some complex-looking groups of statements might be aggressively optimised by the compiler, resulting in a small number of machine instructions (in some cases, no instructions at all).
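To make the second point concrete, here is a hedged C sketch of both effects. The types and names (SensorLog, take_snapshot, complex_looking) are purely illustrative, and exactly what the compiler emits depends on the target and optimisation level:

```c
/* A simple-looking assignment that is anything but: copying this
 * struct moves 4 KB of data, typically compiled to a block copy or
 * a call to memcpy. (SensorLog and its size are illustrative.) */
typedef struct {
    double samples[512];  /* 4096 bytes */
} SensorLog;

static SensorLog current, snapshot;

void take_snapshot(void)
{
    snapshot = current;  /* one statement, thousands of bytes copied */
}

/* A complex-looking computation that an optimising compiler can fold
 * to a constant: at -O2 this loop typically disappears entirely and
 * the function compiles to "return 499500". */
int complex_looking(void)
{
    int total = 0;
    for (int i = 0; i < 1000; i++) {
        total += i;
    }
    return total;
}
```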
Because of the massive number of paths through the source code, it is extremely difficult to manually identify which code could be on the worst-case path. As a result, this approach will almost certainly lead to an optimistic WCET, where the reported WCET is less than the "actual" WCET, possibly very much less.
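The scale of the problem is easy to underestimate. In the sketch below (with illustrative step functions), each independent conditional doubles the number of paths, so path counts grow exponentially with code size:

```c
/* Illustrative processing steps with different execution costs. */
static int step_a(void) { return 1; }
static int step_b(void) { return 2; }
static int step_c(void) { return 3; }
static int step_d(void) { return 4; }

/* Four independent conditionals give 2^4 = 16 paths through this one
 * function; n such conditionals give 2^n, so a module with 30 of them
 * has over a billion paths. Picking out the worst-case path by
 * inspection does not scale. */
int process(int a, int b, int c, int d)
{
    int cost = 0;
    if (a) cost += step_a();
    if (b) cost += step_b();
    if (c) cost += step_c();
    if (d) cost += step_d();
    return cost;
}
```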
There are other challenges too. For example, if the unit being measured can be pre-empted by an interrupt or another process, then we are measuring response time rather than execution time (see explaining the difference between execution times and response times). Measuring response time rather than execution time is extremely imprecise because the timings are affected by two unrelated sources of variability:
- Different execution times within the thread of execution being measured;
- The arrival of context switches (either inter- or intra-partition). If the function we're measuring is pre-empted partway through, the time spent executing other threads is included in its measured execution time, as the sketch below illustrates.
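Here is a minimal sketch of why end-to-end timestamps capture response time rather than execution time; read_timer and function_under_test are the same hypothetical platform hooks as in the earlier sketch:

```c
#include <stdint.h>

extern uint64_t read_timer(void);       /* hypothetical platform timer */
extern void function_under_test(void);  /* unit being measured */

/* End-to-end timestamps measure *response* time, not execution time.
 * If an interrupt or higher-priority thread runs between the two timer
 * reads, its execution time is silently folded into the result:
 *
 *   start ---[ f runs ]---[ ISR runs ]---[ f resumes ]--- end
 *
 *   elapsed = f's execution time + ISR time + context-switch overhead
 */
uint64_t measure_once(void)
{
    uint64_t start = read_timer();
    function_under_test();        /* may be pre-empted here... */
    uint64_t end = read_timer();  /* ...so this delta is response time */
    return end - start;
}
```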
When combined, these two sources of timing variability can lead to significant and difficult-to-predict variations in timing measurements, not all of which are attributable to the thread being measured. In part 3, we'll look at how RapiTime gets around these limitations.