Multicore systems are the future of DO-178C and ED-12C-certified avionics. With their improved size, weight and power (SWaP) characteristics, and the increasing challenge in sourcing single core processors, it’s inevitable that future developments will need to use multicore processors and have them certified.
But therein lies the challenge – AC 20-193 and AMC 20-193 provide the additional means of compliance for multicore DO-178C avionics, but many questions remain; to state just a few of these:
- How should the guidance be interpreted?
- What methods can you use to achieve the objectives?
- How much effort is required?
- When do you stop testing?
At Rapita Systems, we regularly deliver training and webinars on multicore certification in the US and Europe. In this blog, we’ve collated some of the questions we’re asked most often, along with our responses.
-
Which objectives do I need to meet? Are alternative paths to compliance available?
A(M)C 20-193 specifies the objectives that are expected to demonstrate airworthiness of avionics that include multicore software, based on software DAL. No objectives are indicated for DAL D systems, and only a portion of the objectives are indicated for DAL C systems, but this does not necessarily mean that related activities need not be performed – see the corresponding FAQs below.
In the general case, the simplest path to certification of a multicore processor should be by addressing the applicable A(M)C 20-193 objectives. If you wish to use alternative methods of compliance, you should discuss these with your certification authority.
-
As some A(M)C 20-193 objectives aren’t indicated for DAL C systems, do I not need to perform related activities for DAL C?
It’s true that some A(M)C 20-193 objectives only apply at DAL A/B. However, the MCP_Software_1 objective requires determination of software worst-case execution time (WCET) even for DAL C systems. This will require some understanding of the interference channels on the platform and their impact on the software. You should speak to your certification authority to determine what level of rigor may be necessary for related activities. A white paper on the topic is on a shortlist to be added to a future version of MACH178 Foundations.
-
Which activities are required to certify multicore systems to DAL D?
The A(M)C 20-193 objectives apply to DAL A-C software. However, DO-178C requires robustness testing at DAL D, which would include some level of multicore interference analysis. While this will not require the same rigor as a full platform analysis, some generic interference generators on key resources may be necessary. A white paper on this topic is on a shortlist to be added to a future version of MACH178 Foundations.
-
Is there a difference between what different certification authorities expect in terms of meeting A(M)C 20-193 objectives?
Differences can be seen in how certification authorities interpret what is necessary to meet A(M)C 20-193 objectives. An example of this is interpretation of the MCP_Resource_Usage_4 objective. Even between the organizations and individuals that wrote the guidance, the required activities can range from a demonstration of functionally correct software meeting deadlines to understanding the maximum capacity of all MCP Resources as well as the maximum usage of hosted software. It is important to agree on these concepts early in the certification process.
-
Do the certification authorities agree on the definitions used in the A(M)C 20-193 guidance?
A(M)C 20-193 provides some definitions of terms, for example for “Robust Partitioning” and “Determinism”, which differ from those in other documentation such as DO-178C. As multicore certification is in its infancy, how these definitions are interpreted and their impact on the activities associated with achieving the A(M)C objectives can differ between certification authorities. As such, we strongly recommend establishing and agreeing upon the definition and scope of what is required with your certification authority early in the process.
When developing our MACH178 solutions, we have extensively discussed any terms in A(M)C 20-193 that are open to interpretation or are ambiguous (due to being redefinitions that differ from DO documentation, e.g., “Determinism”) with certification authorities and representatives and used terms accordingly in MACH178 Foundations.
-
Our certification authorities are never satisfied with the evidence provided. How much is enough?
Per the DO-178C definition, it is not possible to demonstrate “determinism” for multicore systems due to their inherent non-deterministic nature. Accordingly, A(M)C 20-193 utilizes the DO-297 definition of determinism: “the ability to produce a predictable outcome” where “the outcome occurs in a specific period of time with repeatability”.
We believe this can be achieved if sensible design decisions are made. We discuss this in the Software Architecture Considerations white paper in MACH178 Foundations. Your approach to demonstrating WCET in a multicore system should be agreed with your certification authority in advance. You can find information that’s helpful for achieving this in the WCET Considerations white paper in MACH178 Foundations.
-
Which activities are required if using a multicore processor with all cores but one deactivated?
Using a multicore processor with only a single core active is a popular approach for organizations making their first efforts towards using multicore processors. In principle, developing software where only a single core is active avoids the need to meet the objectives of A(M)C 20-193, which specifies that applicable Airborne Electronic Hardware guidance (A(M)C 20-152A, DO-254) can be followed to inform how a core should be deactivated.
In practice, however, not all deactivation methods may be available, appropriate, or without unforeseen side effects. For example, we have seen hardware deactivation cause timing penalties due to bus accesses requiring communication from now deactivated cores. Different certification authorities have differing opinions about core deactivation methodologies, which can include requiring the use of redundant methods to deactivate cores.
A detailed white paper on this topic is available in MACH178 Foundations.
-
How much effort does it take to verify multicore software to A(M)C 20-193?
The effort required to meet the objectives of A(M)C 20-193 is heavily dependent on a number of factors, not least the complexity of the platform and software. With the same high-level requirements, the certification effort associated with a project could differ by orders of magnitude depending on the platform, RTOS and configuration used, and the software architecture choices made. For a simple platform with a carefully curated configuration of active devices and interference channel mitigations in place, together with sensibly structured software, a small number of person-years of effort may be required. A complex, no-compromises platform used to host software with a complex architecture may require tens or even hundreds of person-years of effort, and for such a system, there may even be no clear path to certification.
The key to an achievable certification is early planning. Taking multicore certification into account at early design and decision-making stages not only de-risks the certification itself, but will also have massive time and cost saving impacts.
More information on making decisions related to processor, RTOS and software architecture is available in a series of dedicated white papers in MACH178 Foundations, each focused on one of these areas.
-
What role does thermal/environmental testing have in A(M)C 20-193?
Thermal effects can impact any platform, not just multicore platforms, and DO-254 currently has no additional considerations for multicore systems. While a revision to DO-254 is upcoming in which some multicore concerns may be addressed, it may be sensible to assume that on-target testing for A(M)C 20-193 should take environmental conditions into account.
A(M)C 20-193 objectives state that worst-case scenarios are implicit for WCET analysis, which implies ensuring the objectives are met in the worst-case conditions the system will be subjected to. For example, thermal throttling of one core may cause operating frequency changes in other cores.
Dynamic voltage and frequency scaling in general is out of scope of A(M)C 20-193.
-
How much rigor is enough to demonstrate airworthiness for defense avionics? Where do you draw the line?
There is generally a lot more leniency in the defense industry when it comes to adhering to DO or A(M)C 20-193 objectives. It should be noted, however, that interference can have serious impacts on execution time, and may be a performance-limiting factor, not simply a barrier to certification. Generally, military certification authorities may have their own interpretation of the certification objectives, for example the AA-22-01 (USAF) or “Defence Standard 00-970” (UK MAA, though this points to AMC 20-193), but the core activities required to demonstrate airworthiness remain the same.
-
Is the USAF document AA-22-01 equivalent to A(M)C 20-193?
CAST-32A and A(M)C 20-193 were used as guidelines for writing AA-22-01, which will be incorporated in the upcoming MIL-HDBK-516D. Almost all aspects of A(M)C 20-193 are defined by at least one criterion of AA-22-01. Some small clarifications or differences exist, for example AA-22-01 includes guidelines for an engineering margin (maximum 90% utilization), a specification that the lowest levels of cache must be private, and advice for building and loading processes.
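As a rough illustration of the engineering margin mentioned above, the sketch below checks a measured WCET against a time budget with a 90% utilization ceiling. How AA-22-01 defines utilization in detail is not covered here, so the ratio of WCET to budget, the function names and the example figures are all assumptions made for illustration.

```python
def utilization(wcet_us: float, budget_us: float) -> float:
    """Return the fraction of the time budget consumed by the WCET.

    Assumption: utilization is modeled as WCET divided by the allocated budget.
    """
    if budget_us <= 0:
        raise ValueError("budget_us must be positive")
    return wcet_us / budget_us


def within_margin(wcet_us: float, budget_us: float,
                  max_utilization: float = 0.90) -> bool:
    """True if the WCET leaves at least the required engineering margin."""
    return utilization(wcet_us, budget_us) <= max_utilization


# Hypothetical figures: an 850 us WCET in a 1000 us window uses 85% of the budget.
print(within_margin(850.0, 1000.0))
print(within_margin(950.0, 1000.0))
```

The figures are invented; the point is only that the margin check is applied to the WCET that already includes multicore interference.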
-
Do you have any recommendations for structuring plans for A(M)C 20-193?
The complex nature of multicore platforms and the activities associated with addressing A(M)C 20-193 objectives generate many artifacts. As such, it is crucial to structure the planning documents accordingly.
A(M)C 20-193 does not indicate the need for explicit multicore planning documents, so multicore-specific documentation may be incorporated into existing PSAC, SVP and PHAC documents. Alternatively, a multicore version of the corresponding DO-178C documents can be written to allow A(M)C artifacts to be independently generated without impacting the existing documents. In either case, it is crucial that traceability mechanisms are in place. We have seen both approaches to documenting A(M)C 20-193 activities be implemented successfully. Which option is best likely depends on the existing infrastructure and team structure in place.
MACH178 Foundations includes a template Plan for Multicore Aspects of Certification (PMAC) and Multicore Software Verification Plan (MSVP), which structurally follow the PSAC and SVP respectively, to allow easy integration into existing documents or use as standalone documents.
-
How can I ensure that a platform will have the required performance with multicore interference taken into account before verification?
You cannot know the performance of multicore software considering interference without determining and testing the interference channels. However, it is very important to reduce the risk of insufficient performance when selecting a platform and software stack.
These considerations are explained in detail in the MACH178 Foundations white papers Processor Selection Considerations, RTOS Selection Considerations, and Software Architecture Considerations.
-
Is it possible to use multithreading in multicore systems? What is the certification impact?
Simultaneous multithreading is out of scope of A(M)C 20-193 as this is not specifically a multicore issue.
In practice, multithreading adds even more complexity to the testing and reasoning about the complex behaviors of a multicore system. As there is no additional guidance or means of compliance available from A(M)C 20-193, any approaches to making a certification argument should be discussed with your certification authority.
-
Is it possible to use GPUs in multicore systems? What is the certification impact?
In principle, using GPUs as a follower device of a CPU for use with displays is a well understood and established approach, which is not fundamentally different in multicore systems.
However, using a GPU for general purpose GPU compute can put the GPU in scope as a core, requiring similar treatment as a CPU with respect to A(M)C 20-193. The exact use case can change the required activities drastically, particularly as there is a gray area as to how core-like the GPU may be.
A white paper on this topic is on a shortlist to be added to future versions of MACH178 Foundations.
-
Is IMA compatible with multicore processors?
In principle, there is nothing preventing IMA (Integrated Modular Avionics, DO-297) systems from including multicore processors. However, IMA requires robust time partitioning to be in place, and this is only possible when all interference channels are mitigated.
The system integrator should also take care to ensure that even when some applications are causing worst-case interference, other applications are still able to meet all their timing and functional requirements.
This is a significant challenge, as all interference channels must be accounted for across all applications.
There are some concessions that may need to be made, in terms of the requirements on resource usage and/or CPU utilization that new partitions must meet to be accepted. IMA requires the ability to integrate new partitions without the need for the re-acceptance of other partitions, which is incompatible with A(M)C objective MCP_Software_1, which requires that the WCET is calculated in the intended final configuration. A white paper on this topic is on a shortlist to be added to future versions of MACH178 Foundations.
-
What is the impact on certification if applications move between cores?
The movement of applications between cores can be interpreted in two main ways: the dynamic allocation of applications to cores at run time (symmetric multi-processing, SMP), or the use of multiple static schedules that assign an application to different cores.
In general, dynamic (at run time) allocation of applications to cores is not covered by A(M)C 20-193. If you use this approach, you will likely need to agree an approach to demonstrating compliance with your certification authority.
If software is designed with multiple static schedules that can be selected between at run time, this will increase the complexity of A(M)C 20-193 verification. For each schedule, the software would need to meet relevant A(M)C 20-193 objectives such as MCP_Software_1 and MCP_Software_2.
-
How can I ensure that multicore platform components delivered at different time points exhibit the same behavior?
This is not inherently a multicore problem. Most avionics developers have mechanisms in place for the acceptance of new deliveries from suppliers, but these may not cover testing of multicore-specific problems. This is where tools that allow for efficient re-testing and automation are important. Once the on-target platform characterization tests for interference channel characterization and hardware event monitor validation have been implemented, they can be reused to ensure that results are consistent across different batches of platforms.
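A minimal sketch of how reused characterization results might be compared between deliveries is shown below. The test names, the stored values and the 5% relative tolerance are hypothetical; an acceptable tolerance would need to be justified for the specific platform.

```python
def compare_to_baseline(baseline: dict, new_batch: dict,
                        tolerance: float = 0.05) -> dict:
    """Return tests whose result deviates from the baseline.

    Each dict maps a test name to a measured value (for example, cycles).
    A test missing from new_batch is reported with a deviation of None.
    Assumption: a relative deviation above `tolerance` needs investigation.
    """
    deviations = {}
    for test, base in baseline.items():
        if test not in new_batch:
            deviations[test] = None
            continue
        if base == 0:
            # Avoid dividing by zero: any change from a zero baseline is flagged.
            relative = 0.0 if new_batch[test] == 0 else float("inf")
        else:
            relative = abs(new_batch[test] - base) / abs(base)
        if relative > tolerance:
            deviations[test] = relative
    return deviations


# Hypothetical results from two deliveries of the same platform.
baseline = {"ddr_read": 1000.0, "l2_write": 400.0}
new_batch = {"ddr_read": 1020.0, "l2_write": 480.0}
print(compare_to_baseline(baseline, new_batch))
```

Here only `l2_write` is flagged, as its result moved by 20% while `ddr_read` stayed within the assumed tolerance.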
-
How much of a challenge is vendors not having or not sharing information about platform components?
Most suppliers that provide multicore platform components to avionics customers are aware of the need to provide detailed documentation and support. However, there can still be issues with IP blocks that have been embedded on a supplied board, for which the vendor has no access to the documentation from their supplier. In these cases, it can become potentially blocking if the information cannot be obtained, or is difficult to obtain, for example requiring additional legal or procurement costs.
In some cases, the only option for achieving A(M)C 20-193 evidence related to such a device’s interference channels may be to perform testing as a black box or reverse engineering the device in question. Such approaches should be discussed with your certification authority as they may not be viable in all cases. In our MACH178 solution, we recommend a means to de-risk platform selection through the Hardware Resource Identification procedure, using the principles described in the Processor Selection Considerations white paper. Both documents are available in MACH178 Foundations.
-
How do I identify interference channels?
Interference channels can be identified through analysis of the documentation of all resources that are in scope for analysis. This should include detailed analysis of all devices, mechanisms and features to understand the potential interactions each presents for software running concurrently on different cores of the multicore processor. We provide detailed instructions of how to perform this activity in the Interference Channel Identification procedure in MACH178 Foundations.
-
Is a parallel or sequential approach to analyzing resources most efficient?
Whether a parallel or sequential approach to analyzing resources is most efficient likely depends on the configuration of your platform analysis team.
MACH178 Foundations includes details on what should be analyzed and how. This is introduced in a template Multicore Software Verification Plan (MSVP) and described in individual procedures for specific activities, beginning with Hardware Resource Identification.
-
How does deactivating resources impact certification?
To satisfy the objectives of A(M)C 20-193, all active resources should be investigated for interference channels, and characterization should be performed on all identified interference channels that are not verifiably mitigated. Disabling unnecessary resources can remove them, and their associated interference channels, from the scope of analysis early on, which can reduce overall verification effort.
-
Don’t many mitigations put too much trust in the scheduling and partitioning mechanisms?
Your scheduling and partitioning mechanisms must be developed to the highest DAL of any hosted software, so relying on them to provide some mitigation methods is a valid approach to mitigating interference channels.
-
Do you have a list of potential interference channel mitigations?
Interference channel mitigations differ from platform to platform, though some general approaches exist such as using cache partitioning, disabling unnecessary devices and features, and applying good principles for software architecture and schedule design.
Some mitigation strategies are discussed in our joint white paper with Wind River “Mitigation of interference in multicore processors for A(M)C 20-193”. The MACH178 Foundations white papers RTOS Selection Considerations and Software Architecture Considerations provide some considerations on the topic, and a dedicated white paper on the topic is on a shortlist to be added to future versions of MACH178 Foundations.
-
For which objectives are interference generators required?
Interference generators such as Rapita’s RapiDaemons are utilized in all on-target testing activities to characterize interference channels, verify mitigations and calculate software WCET. Interference generators are an important part of constructing “worst-case” scenarios for multicore platforms. As such, they are required for on-target testing and analysis activities to address A(M)C 20-193 objectives MCP_Resource_Usage_3, MCP_Resource_Usage_4 and MCP_Software_1.
-
Can interference generators interfere with each other?
Interference generators such as Rapita’s RapiDaemons can interfere with each other.
The procedures in Rapita’s MACH178 solution, which utilize RapiDaemons to produce on-target verification evidence, account for this.
During Interference Channel Characterization, we specify the selection of a RapiDaemon that is sensitive to an interference channel to use as a victim, as well as aggressive RapiDaemons. Using sensitive and aggressive RapiDaemons together in this way allows measurement of the impact of the interference channel they are designed for. It is possible for aggressive interference generators, including RapiDaemons, to interfere with each other. To determine the full interference profile, we specify testing all possible configurations of aggressive RapiDaemons against a victim RapiDaemon.
During Software Characterization, a similar approach is specified, where the combinations of interference channels are considered and all viable permutations of aggressive RapiDaemons are tested.
Detailed guidance on selecting interference generators is available in the MACH178 Foundations white papers Interference Generator Selection and WCET Considerations.
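To give a feel for how quickly the number of configurations grows, the sketch below enumerates every assignment of aggressor generators to the cores not running the victim. It is not the MACH178 procedure; the generator names and core numbering are hypothetical, and it assumes each non-victim core runs either one aggressor or nothing.

```python
from itertools import product


def aggressor_configurations(aggressors, other_cores):
    """Yield each mapping of non-victim cores to an aggressor (or None for idle).

    The all-idle case is skipped, on the assumption that the victim's
    interference-free baseline is measured separately.
    """
    options = [None, *aggressors]
    for assignment in product(options, repeat=len(other_cores)):
        if all(choice is None for choice in assignment):
            continue
        yield dict(zip(other_cores, assignment))


# Hypothetical setup: victim on core 0, two aggressor types, three other cores.
configs = list(aggressor_configurations(["ddr_read", "ddr_write"], [1, 2, 3]))
print(len(configs))  # 3**3 - 1 = 26 configurations
```

Even this small setup gives 26 runs per victim, which is one reason automation of on-target testing matters.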
-
What qualification level should interference generators be classified at?
The appropriate tool qualification level for interference generators depends on how they are used, and with some use cases, some organizations may argue that tool qualification is not required at all.
RapiDaemon interference generators that are used in the MACH178 solution can be classified as TQL 5 or TQL 4 depending on how they are used. This is discussed in detail in the template Plan for Multicore Aspects of Certification in MACH178 Foundations.
-
What do I need to test for my software?
A(M)C 20-193 includes two software objectives, MCP_Software_1 and MCP_Software_2, which explicitly require calculation of the worst-case execution time and understanding of the data coupling and control coupling of your software, respectively. To achieve these objectives, you will first need to understand the platform and the interference channels within it (MCP_Resource_Usage_3). MCP_Resource_Usage_4 also indicates that the ability of the platform to allocate sufficient resources to the hosted software is verified. MACH178 Foundations provides details of how to achieve the objectives in its procedures, templates and white papers.
-
How do you know you have tested all required permutations of interference channels?
While it may not be necessary to test all permutations of interference channels, as some permutations may never be able to produce a worst-case timing result, a strong justification is required to remove the need to test a permutation. As such, the initial starting point may be to test all permutations of interference channels and the software under test, mapped to all active cores. Where you can justify why specific permutation(s) could never produce a worst-case timing result, for example where interference channels are mutually exclusive, you can remove these permutations from your analysis. Information helpful to this can be found in the MACH178 Foundations white paper WCET Considerations.
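The pruning idea can be sketched as follows: start from every set of interference channels that could be stressed together, then drop the sets containing a pair that has been justified as mutually exclusive. The channel names and the exclusion pair are hypothetical, and the sketch deliberately ignores the mapping of software to cores.

```python
from itertools import combinations


def candidate_channel_sets(channels, mutually_exclusive):
    """Yield every non-empty set of channels that contains no excluded pair.

    `mutually_exclusive` lists pairs of channels for which a justification
    exists that they can never be active at the same time.
    """
    excluded = [frozenset(pair) for pair in mutually_exclusive]
    for size in range(1, len(channels) + 1):
        for combo in combinations(channels, size):
            active = set(combo)
            if any(pair <= active for pair in excluded):
                continue  # contains a pair justified as mutually exclusive
            yield combo


# Hypothetical channels; "l2" and "pcie" are assumed to be mutually exclusive.
sets_to_test = list(candidate_channel_sets(["ddr", "l2", "pcie"],
                                           [("l2", "pcie")]))
print(len(sets_to_test))  # 7 non-empty subsets minus 2 excluded = 5
```

Each exclusion pair in such a list would need its own documented justification.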
-
How can you test software running on a continuous loop expecting external inputs?
This is not inherently a multicore problem – it can also be encountered on single core systems. The additional complexity of high-performance multicore platforms may add to the required effort to construct the test vectors and test environment required for software characterization, which must be sufficient to construct “worst-case” scenarios. This may require external stimulation or inputs that can add to the complexity of the activity as it may make it hard to synchronize events or create other challenges. This is often achievable, but the additional effort and de-risking activities should be considered during planning.
-
Can statistical models be used for multicore WCET analysis? Is it sufficient to be x standard deviations away from the mean?
There are some promising paths for doing WCET analysis for multicore systems that utilize statistical approaches. No statistical methodologies have been well established in terms of certification, so your chosen approach should be agreed upon with your certification authority.
Execution time in a multicore system does not follow a normal distribution, so the standard deviation is not meaningful as a statistical measure. This is due to consecutive executions of software not being independent from each other. One cause of this is it not being possible to know or control the initial state of deeply embedded hardware elements, which may have been left in any number of states by previous executions.
The use of statistical models for WCET analysis is discussed in detail in the white paper WCET Considerations in MACH178 Foundations.
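One simple way to see whether consecutive measurements look independent is to compute their lag-1 autocorrelation, as in the sketch below. This is only an illustrative check, not a certification method: a value far from zero suggests dependence between runs, while a value near zero does not prove independence. The sample values are invented.

```python
from statistics import mean


def lag1_autocorrelation(samples):
    """Return the lag-1 autocorrelation of a sequence of execution times."""
    n = len(samples)
    if n < 3:
        raise ValueError("need at least three samples")
    centre = mean(samples)
    denominator = sum((x - centre) ** 2 for x in samples)
    if denominator == 0:
        return 0.0  # all samples identical: no variation to correlate
    numerator = sum((samples[i] - centre) * (samples[i + 1] - centre)
                    for i in range(n - 1))
    return numerator / denominator


# Invented timings: each run is strongly influenced by the previous one.
drifting = [100, 102, 104, 106, 108, 110]
print(lag1_autocorrelation(drifting))
```

A strongly positive or negative result on real timing data would be a prompt to question any analysis that assumes independent samples.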
-
What is a safe upper bound on software WCET?
This is dependent on your system safety analysis, your safety argument, and your method for WCET analysis. There is no “right answer” and these factors should be discussed with your certification authority. More details on methods for WCET determination can be found in the WCET Considerations white paper in MACH178 Foundations.
-
What feedback have certification authorities given on the MACH178 approach? Is it proven?
The MACH178 approach has been used in a successful certification of multicore software with the Spanish National Institute for Aerospace Technology (INTA), which oversees certification for Spanish military avionics. There have only been very few complete certifications of multicore software for avionics so far, and we are frequently involved in projects at every stage of the process.
-
How many multicore certifications have been completed using Rapita’s approach to A(M)C 20-193 compliance?
The MACH178 approach has been used in a successful certification of multicore software with the Spanish National Institute for Aerospace Technology (INTA), which oversees certification for Spanish military avionics. There have only been very few complete certifications of multicore software for avionics so far, and we are frequently involved in projects at every stage of the process.
-
Are unknown behaviors often discovered during platform analysis? If so, how do you validate them?
We have found unexpected hardware behaviors on every project we’ve worked on. Causes can range from silicon bugs to undocumented IP blocks that have interference channels associated with them. Solutions can range from no action needed to needing to reverse engineer devices to understand their architecture and enable interference channel characterization.
-
How are unexpected behaviors discovered on multicore platforms?
To identify interference channels on a multicore platform, it is necessary to analyze technical documentation that describes the functionality of platform components. This documentation will generally be accurate, descriptive and helpful, but will likely contain some errors, omissions, contradictions and ambiguities. This might include missing or vague information about a feature or definitions that are in contradiction with the actual functionality. In general, issues can often be resolved or clarified through support from the vendor to enable continued analysis.
In our experience, most unexpected behavior is on-target behavior which differs from that expected from an analysis of the documentation. As such, anomalies or errors are generally detected through on-target testing, and they are generally discovered when attempting to stress associated interference channels by targeted testing of a specific device or feature. A DDR interference channel, for example, will require tests to correctly miss L1, L2 and maybe even L3 cache. Depending on the architecture of the system, this may require using different datasets, strides through pages and access patterns.
Performance counters can be used to provide assurance for the correct execution of tests to characterize interference channels. Their use can highlight unexpected behaviors due to deviations from expected test outcomes, which should prompt further investigation and analysis. In an extreme example of this, we discovered and reverse-engineered an entire undocumented cache, which of course had associated interference channels.
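The use of performance counters as a check on test execution can be sketched as a comparison of observed counter values against the range the test was designed to produce. The counter names and ranges below are hypothetical; real events and expected values depend on the platform and on validated hardware event monitors.

```python
def check_counters(observed: dict, expected_ranges: dict) -> dict:
    """Return counters that are missing or outside their expected range.

    `expected_ranges` maps a counter name to an inclusive (low, high) pair.
    Any entry in the result should prompt further investigation.
    """
    anomalies = {}
    for name, (low, high) in expected_ranges.items():
        value = observed.get(name)
        if value is None or not (low <= value <= high):
            anomalies[name] = value
    return anomalies


# Hypothetical DDR test: about 10,000 accesses should miss L2 and reach DDR.
expected = {"l2_miss": (9500, 10500), "ddr_read": (9500, 10500)}
observed = {"l2_miss": 9800, "ddr_read": 120}
print(check_counters(observed, expected))
```

In this invented case the L2 misses are as expected but very few reads reach DDR, the kind of mismatch that could point to an undocumented cache between L2 and memory.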
Multicore certification is a complex topic, and you may have questions that aren't answered above. If you do, you may be able to find answers in our public multicore training courses or MACH178 Foundations, our off-the-shelf guide for multicore DO-178C certification, which includes template plans, procedures and checklists, as well as white papers on specific multicore topics.