Why conventional static analysis and software testing are not ideal for removing security vulnerabilities from low-level code

January 16, 2023

Discover some of the drawbacks of traditional testing methods, and the modern alternatives, in this blog post. For the full white paper, click here.

In the first post of this series, we discussed why ensuring the security of low-level code—the code used in OS kernels, device firmware and other applications that interface directly with hardware—has become critical to the overall cybersecurity of embedded systems.

In this post, we’ll explore why developers of low-level code need to go beyond traditional static analysis and software testing to verify that their applications are free from coding defects that can be exploited by hackers.

This post is Part 2 of a 3-part series derived from TrustInSoft’s latest white paper, “From Bare Metal to Kernel Code: How Exhaustive Static Analysis Can Guarantee Airtight Security in Low-level Software and Firmware.” To obtain a free copy, click here.

At the conclusion of our last post, we asked the question, “How can software development organizations protect their products against (a software hacker’s) exploits?” That is to say, how can we remove all the bugs that leave our low-level code vulnerable to such exploits?

Most software development teams answer this question the way it has been answered for years: with conventional static analysis and software testing. But as you will soon see…

Traditional analysis and testing are not the answer

The two standard approaches to software verification and bug removal are traditional static code analysis and software testing. They remain the methods most software and systems developers rely on today.

Unfortunately, both these methods have shortcomings that are magnified when applied to embedded systems and other low-level code applications.

Drawbacks of traditional static analysis

Unlike applications that run atop an operating system, low-level code doesn’t have the support of an abstracted, generic platform created by an operating system. It must take into account the specifics of the hardware on which it runs and any restrictions that hardware presents, such as power consumption constraints or memory limitations. Code for embedded systems often has to meet very stringent timing requirements as well.

For those reasons, low-level code often can’t conform to coding standards built for upper-layer applications. What’s more, low-level code accesses memory in a manner that is quite different from that of higher-level (abstracted) applications. Traditional code analysis tools are generally not equipped to deal with either of these factors. As a result, they frequently yield a high volume of false positives and false negatives when applied to such code.

False positives

Traditional static analysis is based on a set of rules that the static analysis tool expects code to follow. These rules include standards of what is considered good coding structure. In a static analysis context, a “false positive” occurs when the static analysis tool incorrectly reports that one of its rules was violated.

Since low-level programmers must account for the particulars of their target hardware and tend to stray frequently from the rules of standard coding, low-level code is prone to high volumes of false positives when conventional static analysis tools and techniques are applied to it.
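As a minimal sketch of the kind of code involved, the C fragment below writes to a memory-mapped device register. Everything in it is an assumption made for illustration: the `uart_regs_t` layout, the `UART_TX_READY` bit, the address `0x40001000`, and the `TARGET_HARDWARE` switch are hypothetical and do not describe a real device. The integer-to-pointer cast is deliberate and dictated by the hardware, yet it is the sort of construct rule-based checkers often report, since coding standards such as MISRA C discourage it.

```c
#include <stdint.h>

/* Hypothetical register layout for an imaginary UART (not a real device). */
typedef struct {
    volatile uint32_t status;
    volatile uint32_t data;
} uart_regs_t;

#define UART_TX_READY (1u << 0)

#ifdef TARGET_HARDWARE
/* On a real target the address would come from the datasheet.
   The cast from integer to pointer is intentional here, but it is
   exactly what a generic coding rule tends to flag. */
#define UART0 ((uart_regs_t *)0x40001000u)
#else
/* Host build: simulate the registers in ordinary memory so the
   example can be compiled and run without the hardware. */
uart_regs_t uart0_sim = { UART_TX_READY, 0u };
#define UART0 (&uart0_sim)
#endif

/* Writes one byte if the transmitter is ready.
   Returns 0 on success, -1 if the transmitter is busy. */
int uart_try_send(uart_regs_t *uart, uint8_t byte)
{
    if ((uart->status & UART_TX_READY) == 0u) {
        return -1;
    }
    uart->data = byte;
    return 0;
}
```

A developer reviewing a warning on the `UART0` definition has to confirm that the cast is intended and the address is correct. Repeated across every register in a driver, that review is where the time goes.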

False positives tend to annoy developers, because they slow progress, increase the tedium of the job, and waste precious time. They force programmers to investigate issues that turn out to be unimportant.

Developers get bored very quickly with verifying errors flagged by their static analysis tools. The tedium of spending days investigating large numbers of false alarms can often lead them to dismiss some warnings as false positives when they are, in fact, true bugs. They thus compromise the integrity and security they’ve been trying to build into their system.

False negatives

“False negatives” are real defects, such as undefined behaviors, that the analysis tool misses and therefore does not flag.

Since the structure of low-level code is often complicated due to its hardware constraints, it may contain errors that traditional static analysis tools are not programmed to recognize. Some of these bugs may require a significant amount of calculation to reveal—calculations that are omitted from traditional static analysis tools in the interest of returning results very quickly.
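To illustrate, here is a hedged sketch of a defect that takes value reasoning, not pattern matching, to find. The function names `table_bytes_unchecked` and `table_bytes_checked` are hypothetical, and the sketch assumes a platform where `int` is 32 bits wide. Unsigned wraparound is well-defined in C, so there is no syntactic rule being broken; whether a particular tool reports it depends on how much computation the tool performs on the possible values of `count` and `record_size`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unchecked size computation: the product silently wraps modulo 2^32.
   A buffer allocated with the wrapped size would be far too small. */
uint32_t table_bytes_unchecked(uint32_t count, uint32_t record_size)
{
    return count * record_size;
}

/* Checked variant: refuses the request instead of wrapping.
   Returns true and stores the size in *out only when it fits. */
bool table_bytes_checked(uint32_t count, uint32_t record_size, uint32_t *out)
{
    if (record_size != 0u && count > UINT32_MAX / record_size) {
        return false;
    }
    *out = count * record_size;
    return true;
}
```

With `count` and `record_size` both equal to `0x10000`, the unchecked version returns 0, while the checked version reports failure. For ordinary values such as 1000 records of 16 bytes, both agree on 16000.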

Thus, once you’ve managed to correct all the bugs and verify all the false positives your static analysis tool has found, you may be left with a false sense of security. In reality, this is a very dangerous feeling. Your tool has given you the green light, but there may still be dozens or even hundreds of bugs in your code. Some of them could be very serious.

In a critical embedded system, these unflagged errors—these false negatives—could be disastrous for both the system manufacturer and their customer, as they were in cases like:

The WhatsApp Integer Overflow,[i]
Toyota’s unintended acceleration firmware problem,[ii]
Smiths Medical’s Medfusion 4000 Wireless Syringe Infusion Pump,[iii] and
The Boeing 787 integer overflow error.[iv]

Drawbacks of traditional software testing

Like traditional static analysis, software testing also suffers from two major drawbacks when used to verify low-level code, especially code that must be either highly reliable or highly secure.

The first of these drawbacks is the length of the testing process.

Traditional software testing relies on defining test cases that account for as many operational scenarios as possible. You then run tests until you either (1) cover all your scenarios, or (2) run out of time. The latter tends to be the more frequent case.

For complex code, however, the number of possible test cases—i.e. the number of possible input and state combinations—can be astronomical. Even a vaguely representative subset of those cases could require more time than the project schedule and budget will allow.
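A back-of-the-envelope calculation gives a sense of scale. The figures are assumptions chosen only for illustration: a function taking two independent 32-bit inputs, no internal state, and a test rig that executes one billion test cases per second.

```latex
\begin{align*}
N &= 2^{32} \times 2^{32} = 2^{64} \approx 1.8 \times 10^{19} \ \text{test cases} \\
T &= \frac{N}{10^{9} \ \text{tests/s}} \approx 1.8 \times 10^{10} \ \text{s} \approx 585 \ \text{years}
\end{align*}
```

Any internal state the function depends on multiplies N further.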

The second drawback, highly related to the first, is test case coverage.

You may have an automated test campaign that covers millions of input value combinations, but some issues are particularly difficult to track down. For example, a test with a given input value may pass, while the same test with a similar input value but a different internal state (e.g. different values stored in memory) fails. You can never test every combination because there are simply too many. Even when you stop finding errors, you can never be sure you have tested enough.
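The following sketch shows how state can hide a bug from a test that looks adequate. The `sensor_state_t` type and the `sensor_scale` function are hypothetical, invented for this example, and the bug is deliberate.

```c
#include <stdint.h>

/* Hypothetical calibration state kept in memory between calls. */
typedef struct {
    int16_t offset;
} sensor_state_t;

/* Intended behavior: return raw + offset, saturated to the range 0..255.
   BUG (deliberate): the sum is truncated to 8 bits instead of saturated. */
uint8_t sensor_scale(const sensor_state_t *state, uint8_t raw)
{
    return (uint8_t)(raw + state->offset);
}
```

With `offset` equal to 0, calling `sensor_scale` with a raw value of 200 returns 200, and a test expecting 200 passes. With `offset` equal to 100, the same input returns 44 instead of the intended 255. A campaign that only ever runs with the default calibration would never see the failure.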

So, just as you don’t know how many bugs your static analysis tool failed to flag, you don’t know how many of those bugs also slipped past your testing campaign.

Each of the drawbacks just discussed presents a risk many organizations cannot afford to take. They would be exposing their customers to potential dangers which are difficult to predict. As a consequence, they would be exposing their own company to costly product recalls, prolonged loss of revenue, potential lawsuits, and long-term brand reputation damage.

So, again, how can companies protect themselves? What can they use instead?

We’ll give you the answer in our next post.

In our next post…

We’ll conclude this series by examining why exhaustive static analysis—the method employed and automated by TrustInSoft Analyzer—is ideally suited to ensuring the security of low-level code.

References

[i] Arntz, Pieter, Critical WhatsApp vulnerabilities patched: Check you’ve updated!, Malwarebytes Labs, September 2022.

[ii] Dunn, Michael, Toyota’s killer firmware: Bad design and its consequences, EDN, October 2013.

[iii] ICS Advisory (ICSMA-17-250-02A): Smiths Medical Medfusion 4000 Wireless Syringe Infusion Pump Vulnerabilities (Update A), CISA, September 2017.

[iv] Goodin, Daniel, Boeing 787 Dreamliners contain a potentially catastrophic software bug, Ars Technica, May 2015.