Worst-case execution time (WCET) analysis is critical for the verification of safety-critical real-time embedded systems. This analysis is typically performed by instrumenting source code and obtaining timing data from an instrumented build. While this is a robust method for determining WCET metrics, it requires that instrumentation be added to source code, increasing overhead. This is why we have been developing RapiTime, our on-target timing analysis tool, to perform WCET analysis with zero instrumentation.
In our RapiTime Object Code Analyser (ROCA) project, as the name implies, we have been developing a tool that lets us use object code to perform timing analysis with RapiTime. Instead of having to rely on instrumentation to obtain timing metrics, our tool uses branch trace information collected from the CPU during program execution. This lets us eliminate instrumentation overhead entirely when performing timing analysis.
We recently completed the ROCA project, and will be taking the technology further in our current SECT-AIR project. In this post, we’d like to share some details on one of the main difficulties analyzing object code presented us with: tracing control flow.
Object code control flow
We need to know the control flow of a program to determine WCET metrics from it. While a control flow graph can easily be determined from source code, it is a little more difficult to do this from object code. To trace control flow from object code, we use branch trace information to map each assembly instruction to its origin in the source code. This lets us present the program as a call tree, and each function as a loop nesting tree, and these trees are necessary for our analysis (Figure 1).
Figure 1. Using branch trace information to determine WCET.
Working out how to map assembly instructions to the source code proved to be no easy task, however. One of the main reasons for this is that object code is unstructured due to the number of “jump” instructions it contains.
Jump instructions in object code
In source code, the use of conditionals, loops and function calls is the most frequently used method to affect program control flow. Object code, however, mainly executes control flow using “jump” instructions. Compare this for
loop written in C with its object code when compiled for x86:
Source code
void loop_example (void) { for (i = 0; i < 10; i++) { state_machine(); } }
Object code
loop_example: movl $0, i iteration: call state_machine movl i, %eax addl $1, %eax cmpl $9, %eax movl %eax, i jle iteration ret
You may notice that the for
loop has been compiled as a “jump” instruction jle iteration
, preceded by compare ( cmpl
) and add ( addl
) instructions.
In this case, the "jump" instruction clearly executes loop control flow by directly addressing iteration
, but "jumps" can be indirect, addressing code via an address stored in memory. These indirect "jumps" make it difficult to map assembly instructions to the source code. Some other features of compiler optimization also make this mapping difficult, for example tail call optimization.
Tail call optimization
A tail call is a sub-routine call made immediately before a function returns. Calls have high overheads as they add a new stack frame to the call stack. Tail calls can be replaced with “jumps”, and compiler optimization eliminates tail calls to minimize overhead; see the example below:
int state; int (* func) (void); int state_machine (void) { switch (state) { case 1: state = 2; return 0; case 2: state = 3; return 0; default: return func (); } }
A compiler eliminates the high-overhead tail call by replacing the expression with a “jump” to the func
sub-routine:
state_machine: cmpl $2, state jbe switch case_default: movl func, %eax jmp *%eax /* This jump originates from a tail call */ switch: movl state, %eax movl jump_table(,%eax,4), %eax jmp *%eax /* This jump originates from a switch statement */ case_1: movl $2, state ret case_2: movl $3, state ret .section .rodata jump_table: .long case_default .long case_1 .long case_2
The compiler optimized the call to func
into a “jump”. You can see that the jmp *%eax
“jump” instructions in this example are indirect, addressing data stored in memory rather than directly jumping to a code address. This can make it difficult to determine function boundaries, which are needed to determine WCET using RapiTime. It is trivial to see that func
is stored in *%eax
in this example, but this may not always be the case. To determine a call tree around “jump” instructions in object code, we make some assumptions.
We initially assume that “jump” instructions jump to another location within their parent function, as their origin is usually either a switch
statement or a for
or while
loop. Optimized tail calls thus present a challenge; they appear identical to other types of “jump”, so we assume their destination is inside their parent function, but it may not be. In the code above, for example, one jump originates from a tail call to an external function, and the other from a switch
statement.
To solve this problem, we parse trace data in two steps (see Figure 1 above). After parsing the trace once we reclassify instructions to recognize both the type and destination of a jump. The destination indicates whether the “jump” is to within or outside the parent function and lets us complete the call tree and loop nesting tree. Using the information from these trees, we can parse the trace again to determine WCET metrics.
Find out more
We hope you enjoyed this insight into the difficulties of WCET analysis on object code. Stay tuned for more details about object-code tracing in our SECT-AIR project.
In the meantime, if you would like to know more about timing analysis using RapiTime, visit our product page at /products/rapitime.