CERT C Benchmark

October 27, 2022

Table of Contents

  • What is CERT C & why is it especially relevant to improve code safety, reliability, and security?
  • The Benchmark Breakdown
  • Undefined Behavior-Only Rules
  • All CERT C Rules
  • Medium-High Severity Rules Coverage
  • High Severity Rules Coverage
  • Example of a UB-Causing Rule that Requires Static Analysis to Detect
  • Why Does TrustInSoft Analyzer Stand Out?
  • Continued R&D
  • Conclusion 

What is CERT C & why is it especially relevant to improve code safety, reliability, and security?

The SEI CERT Coding Standards are secure coding standards developed by the Software Engineering Institute of Carnegie Mellon University. They are steadily becoming one of the key industry references for creating safe and secure software. One of these standards is SEI CERT C, which has been updated for C11 but is also applicable to earlier versions of the C language.

CERT C is primarily intended for software developers. However, it is also used by software integrators to define requirements concerning code quality. It is of special interest to developers of high-stakes and critical code, who must build reliable software that is robust and resistant to attack. That is why these standards are increasingly being used as a metric to evaluate the quality of source code.

The goal of the SEI CERT C Coding Standard is to improve the safety, reliability, and security of software systems written in the C language. The standard achieves that goal by providing a set of guidelines for secure coding. These “guidelines” are divided into two subsets: the “rules” and the “recommendations”:

  • The rules are particularly important. They concern software defects that may directly and adversely affect the safety, reliability, or security of a system. Violating a CERT C rule can result in introducing a security flaw into the code – a flaw that is potentially an exploitable vulnerability.
  • CERT C rules are well-defined and formulated in a strict way. This allows for determining rather unambiguously whether a piece of code is conformant with the given rule or not.
  • Recommendations are less clear-cut than the rules. These are suggestions that help improve code quality. Some of them are more generic and incomplete versions of particular rules, while others just describe good coding practices.

For achieving CERT C compliance, adhering to the rules is critical, while following the recommendations is merely optional. Satisfying the CERT C rules is considered very strong evidence that you do not have vulnerabilities in your code.

CERT C defines 120 rules. Each rule, in addition to detailed explanations and rationale, contains at least one pair of C code examples:

  • a non-compliant code example that illustrates a violation of the rule,
  • and a compliant code example – a fixed version of the first code snippet – which is now conformant with the rule.
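
For illustration only (this pair is not taken from the CERT C pages; the function names and buffer size are ours), such a pair typically looks like the following, in the spirit of the string-handling rules:

#include <string.h>

/* Non-compliant: the destination buffer may be too small for the source
   string, so strcpy() can write past the end of buf (undefined behavior). */
void copy_noncompliant(const char *src) {
    char buf[8];
    strcpy(buf, src);  /* possible buffer overflow */
}

/* Compliant: the copy is bounded by the destination size and the result
   is always null-terminated. */
void copy_compliant(const char *src) {
    char buf[8];
    strncpy(buf, src, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
}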

It is expected that a code analyzer raises an alarm on the non-compliant example. It is equally important that a code analyzer does not raise any false alarm on the compliant example: false alarms greatly hinder the usability of code analyzers, wasting time on developer reviews that prove useless after the fact.

Also, not all guidelines (rules and recommendations) are equally important. Each rule has several attached metrics which provide an indication of the consequences of not adhering to the rule, e.g.:

  • Severity (Low/Medium/High): How serious are the consequences of the rule being ignored?
  • Likelihood (Unlikely/Probable/Likely): How likely is it that a flaw introduced by violating the rule can lead to an exploitable vulnerability?

CERT C rules are mostly semantic in nature – they deal with the meaning of the source code and the behavior of programs. Many of them concern undefined behaviors (as defined by the C standard), which should be avoided at all costs. These are unpredictable outcomes that should never be left unexamined because they are inherently dangerous: not only do they have the potential for disaster, but they can also be very difficult to detect. One infamous example of such an undefined behavior is the out-of-bounds array access, addressed by the CERT C rule ARR30-C: Do not form or use out-of-bounds pointers or array subscripts.

Imagine you have a static array int t[42] defined in the program. Its cell t[i] is accessed somewhere in the code. If you want to determine whether this access is valid, you need to know exactly which values this variable i takes at this program point. Obviously, knowing this may be arbitrarily difficult. For example, if you read the program and see that i is an integer constant equal to 13, you know that this access will always be safe. But if i is computed inside two nested for-loops within some complex sorting function, then you are out of luck – you need to understand what the program is doing there in order to determine anything useful. This is the case with most semantic rules – outside of very simple programs, it may be arbitrarily difficult to check if such rules are respected or not.
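
As a rough sketch of both situations (the helper names and the computation below are ours, purely for illustration):

#include <stddef.h>

int t[42];

void simple_case(void) {
    t[13] = 0;  /* always in bounds: 13 < 42 */
}

void hard_case(const int data[], size_t n) {
    /* The index is computed inside two nested loops; whether t[idx]
       stays within bounds can only be decided by understanding what
       this computation actually does. */
    size_t idx = 0;
    for (size_t a = 0; a < n; a++)
        for (size_t b = a + 1; b < n; b++)
            idx += (data[a] > data[b]);  /* counts out-of-order pairs */
    t[idx]++;  /* in bounds only if idx < 42, i.e., only for some inputs */
}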

In contrast, other coding standards are often primarily syntactic – they deal with superficial characteristics of the source code which do not always affect the program’s behavior. Syntactic coding guidelines forbid certain constructions, but a forbidden construction does not systematically indicate a real error.

In MISRA C, for example, it is easier to discern whether a rule is being followed or not simply by observing the source code. There are even some instances where breaking a MISRA C rule is necessary; often, though, the same program behavior can be achieved in another way, using constructions that are not forbidden. For example, the MISRA C rule stating that conversions between pointers to different object types should not be performed forbids constructions like converting an int pointer into a short pointer, e.g., (short*)p where p is an int*. This construction is not dangerous by itself and is frequently used in practice because of the lack of genericity in the C language.
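
A minimal sketch of the kind of construction this rule targets (process_shorts is a hypothetical helper used only for illustration):

#include <stddef.h>

void process_shorts(short *s, size_t n);  /* hypothetical helper */

void reinterpret_buffer(int *p, size_t n) {
    /* Flagged by the MISRA C rule: an int* is converted into a short*.
       The conversion itself is well defined here, but the rule forbids it
       because later accesses through the short* can easily run into
       alignment or aliasing problems. */
    process_shorts((short *)p, 2 * n);
}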

This is not, however, the case with CERT C.

You could not check off boxes one by one from the CERT C guideline list even if you wanted to. It is inherently difficult to say whether a program satisfies the CERT C rules: you cannot manually, or even with most tools, check all the execution paths and all the possible values – and that is what it takes to verify that such a semantic, UB-causing rule is respected.

This is unlike MISRA C, where you can easily check whether the rules are respected because they are mostly syntax rules. You can determine whether a program respects such a rule by looking at its source code yourself or by using any decent static analysis tool. It is important to note, however, that adhering to syntax rules does not mean that there are no vulnerabilities in the code.

Now, with exhaustive static analysis based on formal methods, you can ensure that you’ve met the CERT C rules with mathematical proof.

The Benchmark Breakdown

This benchmark illustrates TrustInSoft Analyzer’s performance in relation to the CERT C rules.

It has two parts:

  • First, we look at a subset containing only the rules that can cause undefined behavior (as defined by the C11 standard). These are the most dangerous coding errors that can result in vulnerabilities.
    We will show the level at which TrustInSoft Analyzer identifies these undefined behaviors in comparison to traditional tools.
  • Second, we will discuss a similar comparison between TrustInSoft Analyzer and traditional tools, but with relation to all the CERT C rules.

We will also compare the results on subsets of CERT C rules of:

  • medium to high severity
  • and high severity.

Find out more about CERT C measurement parameters in the Risk Assessment section.

The performance was calculated in comparison with two traditional static analysis tools, which will hereafter be referred to as Tool A and Tool B.

It is worth underlining that the methodology used in this benchmark was very strict:

  1. A test suite was constructed using code snippets taken directly from the CERT C webpage describing each of the 120 rules. At least one compliant and one non-compliant test case was included for every rule.
  2. These code snippets were run through TrustInSoft Analyzer, Tool A, and Tool B.
  3. The results obtained this way were examined to check:
       • whether a tool correctly identified the errors in the non-compliant code examples,
       • whether a tool correctly marked the compliant code examples as free of errors.

This strict methodology lets us objectively assess how effective each tool is at finding bugs.

Undefined Behavior-Only Rules

Figure 1: CERT C benchmark comparison on undefined behavior rules only (percentages are out of 52 rules).

As you can see in Figure 1, TrustInSoft Analyzer finds the bugs in 87% of the undefined behavior subset of CERT C rules.

Tool A can find the bugs in 55% of the undefined behavior subset.

Tool B can find the bugs in 38% of the undefined behavior subset.

All CERT C Rules

Figure 2: Comparison on all CERT C rules (percentages are out of 120 rules).

As Figure 2 shows, TrustInSoft Analyzer finds bugs in 59% of all CERT C rules.

Tool A finds bugs in 34% of all CERT C rules.

Tool B finds bugs in 28% of all CERT C rules.

Medium-High Severity Rules Coverage

Figure 3: Comparison on Medium and High Severity rules (percentages are out of 70 rules).

For Medium and High Severity rules, as seen in Figure 3, TrustInSoft Analyzer finds bugs in 66% of this subset.

Tool A finds bugs in 32% of the Medium and High Severity rules.

Tool B finds bugs in 27% of the Medium and High Severity rules.

High Severity Rules Coverage

Figure 4: Comparison on High Severity rules (percentages are out of 34 rules).

In Figure 4, we see that TrustInSoft Analyzer finds bugs in 76% of the High Severity rules of the CERT C test suite.

Tool A finds bugs in 40% of the High Severity rules.

Tool B finds bugs in 38% of the High Severity rules.

As you can see in figures 1-4, TrustInSoft Analyzer consistently outperforms traditional static analyzers in finding CERT C rule violations. This helps you ensure that you are in line with the most critical aspect of CERT C compliance.

The gap between TrustInSoft Analyzer and other tools only widens when considering the highest severity rules. That is to say that TrustInSoft Analyzer is better able to find the most serious problems.

Example of a UB-Causing Rule that Requires Static Analysis to Detect

Let us look again at the CERT C rule ARR30-C: Do not form or use out-of-bounds pointers or array subscripts. As mentioned before, one of the sub-cases of violating this rule is using a past-the-end index (i.e., attempting to read or write an array element past the array’s end), and assessing whether this can happen can be arbitrarily difficult. For example, consider this small function:

 
#include <stddef.h>  /* for size_t */

/* Increments cells of t that are selected by values read from t itself. */
void f(int t[], int max) {
    for (size_t i = 0; i < max; i++)
        for (size_t j = 0; j < max; j++)
            t[t[i + j]]++;  /* two potential out-of-bounds accesses */
}
 

There are two operations in this example where a past-the-end index of the array t may be accessed:

  • First, when we access the cell with index i + j.
  • Second, when we access the cell with index t[i + j].

The validity of the first access is simple to assess. The maximum value of i + j in this program is 2 * max - 2, reached when both i and j are equal to max - 1. So, as long as this value stays below the size of the array t (which is unknown inside the function f), the behavior is well defined. In order to reason about this access, it is sufficient to know the size of the array t, know the value of the argument max, and understand for-loops and the addition operator.

The validity of the second access is a much more difficult problem. The maximum value of t[i + j] depends on the contents of the memory. These values can change dynamically during the function’s execution, since a cell of the array t is incremented in every iteration of the inner for-loop. Therefore, reasoning about these values requires not only knowledge of the full contents of the array t before calling function f, but also an understanding of how these contents are changed inside function f.
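
To make this concrete, here is a hypothetical caller (not part of the original example; the array sizes and contents are ours) showing that whether f is well defined depends on both the size of the array and its contents:

#include <stddef.h>

void f(int t[], int max);  /* the function defined above */

int main(void) {
    int t[8] = {0};
    f(t, 3);  /* well defined: i + j never exceeds 4, and the values read
                 from t stay small, so every index remains in bounds */

    int u[8] = {0, 0, 0, 100, 0, 0, 0, 0};
    f(u, 3);  /* undefined behavior: when i + j == 3, u[i + j] is 100,
                 so u[u[i + j]] accesses far past the end of u */
    return 0;
}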

This little example is obviously somewhat artificial, but it showcases the point well. The CERT C rules are mostly inherently semantic rules. Verifying compliance with such rules often requires understanding the program’s semantics – examining superficial code patterns is not enough.

Why Does TrustInSoft Analyzer Stand Out?

TrustInSoft Analyzer stands out among its competitors in this comparison. What allows us to get higher quality results? How do we reach a higher percentage of the rules that have the biggest impact on your code?

The reason for the large gap between the results of TrustInSoft Analyzer and other tools is the huge difference in the underlying technology. Our tool conducts semantic analysis using formal methods to interpret the source code. In other words, it leverages the power of mathematical reasoning on the logic of your program to obtain proven and guaranteed results. Traditional static analysis simply cannot compete with this level of precision because it does not fully understand the semantics of your source code and its behavior.

Thanks to this innovative approach, TrustInSoft Analyzer provides unique results with guaranteed zero false negatives and a very low number of false positives. That means that we are able to find all of the vulnerabilities that we search for without wasting your time on false positives.

TrustInSoft brings you beyond the threshold of compliance with CERT C and into the realm of real results. You can say you are more than just “certified compliant with CERT C” because you are certain, with a mathematical guarantee, that your code is free of vulnerabilities that could be caused by violations of these CERT C rules.

Continued R&D

TrustInSoft Analyzer uses state-of-the-art formal methods technology to reach this level of results. Fine-tuning the tool to cover additional rules involving undefined behaviors that are not currently covered requires further R&D in formal methods, which TrustInSoft’s team is actively pursuing.

Conclusion

TrustInSoft Analyzer outperforms traditional tools on the CERT C rules, allowing for complete detection of the undefined behaviors these rules address and a mathematical guarantee of their absence.

TrustInSoft Analyzer does this while producing no false negatives (we find ALL the undefined behaviors we search for, without restrictions) and few to no false positives. Using formal methods, it goes beyond the scope of traditional static analysis and relies on a deep semantic understanding of the code to perform exhaustive analysis.

TrustInSoft Analyzer guarantees to find everything: all bugs that might violate the CERT C rules within its scope are detected, with no false negatives, no matter the complexity of the program.

To learn more about TrustInSoft Analyzer and how you can use it to write code that conforms to the CERT C guidelines, visit our product page.

If you want to talk to an expert about the CERT C benchmark or see a demo, get in contact with us here.
