All serious developers and project managers pride themselves on the quality of code they create and maintain. Acutely aware of the disastrous consequences which may arise from unnoticed bugs and vulnerabilities[Heartbleed,Toyota Recall 2018], they put tremendous effort into assuring their software is robust, correct, and secure.
During long busy days, fueled by caffeine rushing through their veins and cool breeze of air conditioning blowing in their faces, they manage to remain calm and steady while navigating through the towering piles of feature requests stacking up in their inboxes. They are careful to never steer into uncharted territory without thorough preparation. They move forward only after they meticulously design necessary test cases first, doing their best to avoid any foreseeable hazards. Then, during long, sleepless nights, in a dim glare of computer screens and soft hum of cooling fans, they do not abandon their posts despite their fatigue. Instead, they stare incessantly at the sea of code they feel responsible for, their sharp eyes scrupulously scanning each line with focus and perseverance, looking for any sign of danger.
These stalwart individuals are compelled to always carry the heavy burden of uncertainty and responsibility. Especially those who work with C/C++, the most fearless and resolute of all developers, accustomed to wrestling an element so incredibly powerful and volatile. This single chilling question is always playing in the back of their heads: “What if I missed something?”. Because no matter how many test suites were prepared and executed, how strict coding guidelines enforced, how advanced sanitizers employed, and how many hours spent carefully reviewing the code, there is simply no guarantee for the results.
Well, the solution we propose in this paper provides guarantees. Strong mathematical guarantees based on formal methods and backed with decades of scientific research. It relies on exact techniques, like abstract interpretation[Abstract Interpretation in a Nutshell], in order to prove properties of programs written in C/C++. No more uncertainty, no more hoping for the best. Instead, hard evidence that allows to actually trust the software for once.
Formal verification is a wholly different paradigm than verification by testing. The level of assurance it provides, concerning both safety and security of the analyzed code, is radically superior compared to other methods. Generally though, formal verification of a program is a difficult, expensive, and time-consuming process which requires skill and experience[seL4,CompCert]. Due to the high cost, in terms of both time and effort, this approach does not suit everyone. According to Wikipedia, formal verification may be as much as 80 percent of the total design cost[Wikipedia: Software verification and validation]. For most, it is only used for particularly critical pieces of code.
TrustInSoft proposes a new way of addressing this issue. We hope to make formal verification more accessible by implementing a hybrid solution. It equips developers with tools, and an accompanying methodology, to gradually lift a C/C++ project from the level of verification by testing to the level of formal verification through leveraging already existing test suites. This solution provides access to the many blessings of state-of-the-art formal verification techniques without necessarily performing a full-scale analysis of the code[applying formal methods]. The goal is to provide developers with great advantages for a fraction of effort.
Systematic approaches to developing correct software
When striving to develop a correct program, writing a set of tests is usually the first solution which comes to mind. And it is a pretty good one! A set of tests is a quick, simple, and effective means to rapidly find the most glaring faults in the code. You can even reverse the order and start by writing tests to engage in test-driven development. This is an excellent strategy to efficiently write both simple and complex programs.
However, a set of manually crafted test cases can only go so far. The problem of building safe and secure software must be eventually tackled in a more systematic manner if the ultimate goal is to achieve a complete and reliable outcome. There are several distinctive perspectives and approaches which allow advancement on the glorious path towards correct code.
Automated testing / test generation
The efficacy of testing itself can be vastly improved using dedicated methodologies and tools. We may employ various techniques of automatic or semi-automatic testing or test generation to greatly enhance test coverage and pertinence. For example, fuzzing is a highly adaptable and efficient method for providing invalid, unexpected, or random inputs for a given program for testing purposes. Using tools that implement such techniques[afl] is an excellent idea which we recommend wholeheartedly.
The important thing to keep in mind is that no matter how advanced a methodology or a tool applied, the result, basically, is more tests and / or better tests. Unfortunately, it is impossible to test every possible scenario of non-trivial program’s execution one by one. There are just too many of them.
Testing is like probing the waters to check for rocks (or sea monsters) under the surface before veering forward. It is definitely better to probe more rather than less. It is more efficient to probe in relevant places rather than employing a random technique. Likewise, it is more prudent to probe more intensely in the areas where more danger possibly exists. Still, regardless of how these checks are completed, there is no way to physically probe the whole sea in such a point-by-point manner. For programming, this creates uncertainty. There is simply no way to tell if a particular neglected spot does not hide some grotesque monstrosity, ready to crush the hull of the ship with its horrific tentacles and gruesomely devour the entire crew with its dreadful yawning maw. On the other hand, no risk, no fun. Arr!
Dynamic checking tools
Now, one step further from testing there is dynamic checking. Tools that implement this approach are known as sanitizers. Verification still happens mostly on the level of executables. Sanitizers determine whether the program is exhibiting any dangerous behavior during its execution. Their added value, on top of traditional testing, is that such tools reveal hidden problems which do not impact test results or directly cause runtime errors.
Two main techniques are employed to track what a program is actually doing. The first technique is to insert additional instrumentation into the original source code during compilation (some sanitizers work simply as compiler extensions, e.g. for Clang we have ASan, UBSan, MSan, and TSan). The second technique is to simulate the program’s execution in a virtual environment controlled and monitored by the sanitizer (e.g. Valgrind can do that).
As both these methods allow for a deeper level of assurance about the correctness of program’s behavior on particular test cases, sanitizers can be roughly summarized as testing on steroids. We perform the same tests as usual, but we discover more problems. We do not only see what is happening on the surface of the water (i.e. the test’s results), we can also peek a little bit below the waves to get an idea of what lurks beneath (and see if some dangerous operations were performed during the test’s execution). Perform the same tests as always, detect ten times more krakens! Nice!
Using sanitizers is a direct improvement over simple testing. As such it is highly recommended! Their main shortcoming is that such tools are still not exhaustive. They can (and will, but hang on, we are going to highlight some example a bit later) miss many problems. Sanitizers are, in essence, just debuggers. They help find and understand faults in the code, but they will never be able to guarantee to eliminate all the faults.
Pattern-matching tools (traditional static analysis)
Another interesting set of tools are all the different linters. Linters work directly on the source code level and use pattern-matching, and other traditional static analysis techniques, in order to find and flag potential problems. Depending on their sophistication, these tools can detect many different classes of issues: obvious programming errors, violations of style guidelines, or usage of suspicious constructs. Using a linter is a great way to improve general code quality and spot bugs. Moreover, this category of tools is pretty handy when striving to reach conformance with a particular coding standard (e.g. Misra, Cert C) especially in the area of style-related rules and recommendations.
The caveat here is buried in the main assumption underlying these tools: the hope that by making code nicer, we will also make it more correct. Of course, making code nicer does not guarantee correctness. Blindly following any coding standard does not guarantee it, either[Safer Language Subsets]. That said, following style guidelines or coding rules and generally writing nice code is not a half bad idea. Keeping everything tidy and manageable may not be the ultimate and complete solution to all problems, but it definitely helps eliminate some immediate errors, and yields undeniable maintenance-related gains in the long run.
Every sailor knows that you should keep your boat clean and orderly. This is not for aesthetic reasons, but for safety! It helps to avoid potential problems, makes those problems easier to spot, and makes it easier to react faster. No, it does not stop a rope from snapping. Yet, it definitely improves a sailor’s chances of noticing that a rope is beginning to fray. This increases the odds of preventing the problem or at least allows for the mitigation of consequences.
Correct from the ground up
A radical approach to eradicating bugs and vulnerabilities is to not make them in the first place. There are several technologies and frameworks that aim at designing and building correct software from the ground up. For example, the SCADE Suite[SCADE] uses a formally-defined, domain-specific language to express what the program should do. It then generates executable C/C++ code that is correct by design. Other noteworthy examples are the B method[B method], used for safety-critical systems, and Cryptol[Cryptol], used for cryptographic algorithms.
Following such an approach is extremely effective and recommended when possible. Unfortunately, it does have many constraints. These techniques are best suited for the development of embedded critical software. They are definitely not applicable to existing programs, at least not unless rewriting them completely is an option.
Full formal verification
Finally, let us get back to the idea of fully verifying the software formally. Formal verification of a piece of code is undeniably superior to both testing and pattern-matching. It provides actual mathematical guarantees concerning its safety and security properties. With a tool like TrustInSoft Analyzer, such an undertaking is within the realm of feasibility (even if it remains a rather ambitious task, especially for a beginner).
Looks like we have cornered ourselves into inventing another nautical metaphor… Well, alrighty then!
Formal verification is much like creating a mathematical model of the ship, analyzing its behavior with respect to the sea, and proving some properties about it (e.g. that it is sea-monster-proof). Which is in fact not so far from what we do when designing boats, cars, bridges, or other objects used by people! Important aspects of such objects, like material endurance versus weight to bear, are mathematically determined and calculated on paper using models based on our understanding of the laws of physics. Doing the same for programs seems nothing but sane and rational.
Additionally, modeling behaviors of programs is a much more precise discipline than modeling behaviors of physical objects. At least in theory, we humans have more mastery over computer science than over physics. Hence such an approach should be natural and more exact than what we can ever hope to achieve in architecture or naval engineering.
The solution we advocate proposes to boldly go in the same general direction, but it strives to eliminate the daunting difficulty of performing a full formal verification. To decrease this difficulty, we will be aiming a bit lower and piggybacking on existing groundwork. Essentially, we narrow the verification’s objectives and we cut down the amount of verification work. Instead of attempting to formally verify a C/C++ program with all the bells and whistles, the goal is to benefit from as many advantages of formal verification as possible with minimal effort. We achieve it by using TrustInSoft Analyzer and following a specific methodology based on leveraging existing test suites. This minimizes the cost and maximizes the gain.
Even taking a single small step toward formal verification of a piece of code is precious. It already unlocks access to results and benefits that are unachievable by simple testing or even dynamic checking. In this approach, formal verification is fueled by the tests. Thus it combines perfectly well with all the test generation techniques, further amplifying their efficiency. In addition, it opens door for potentially continuing the route to complete formal verification.
So, essentially, we are presenting a brand new trade-off between the quality of software verification and the cost of software verification. This solution employs formal methods, immediately outclassing tests and pattern-matching. And it manages to do so without the steep increase in the overall development effort often associated with formal methods.
What is TrustInSoft Analyzer? What does it do?
The TrustInSoft Analyzer is a powerful and versatile tool used for advanced verification of critical C/C++ software. It can determine with mathematical certainty if the provided code is safe and secure by detecting undefined behaviors[Undefined Behavior Guide] or proving the absence of those behaviors.
Here, we use the Analyzer in a very specific way, though. We tailor it for a much narrower and more specific purpose. Thanks to this particular configuration, called the interpreter mode, we manage to circumvent all the inevitable complexity involved when performing a full-blown analysis. We apply the Analyzer almost directly to existing test suites with minimal setup effort.
In fact, this tool is available for evaluation now! An online version, TSnippet, which may be used to analyze snippets of C code, can be found here.
When software is tested, it usually means that the compiled program is executed on a computer, and its actual behavior is compared to how it was expected to behave. The program is fed specific inputs (defined by the test case) and its outputs are checked for errors.
TrustInSoft Analyzer does not execute a compiled program natively on the machine where it runs. Instead, it works directly on the program’s source code level, interpreting it line by line and simulating the program’s behavior in a virtual environment. Based on complete formal understanding of the C/C++ language semantics and a complex model of the computer’s memory, this simulation is mathematically sound.
Thanks to this approach, the program is no longer a black box. All the details and fine points of its internal behavior become observable. Now, not only the program’s outputs can be checked for errors, but everything that happens during the program’s execution can be thoroughly examined and verified for signs of trouble. In difference to the dynamic checkers, this verification is sound and exhaustive. We detect all the faults without fail.
Why is this important? Because programs are vicious little clever beasts. They are fully capable of committing most atrocious, forbidden, and dangerous things. And they can hide their crimes so well that they can go unnoticed for a very long time. Moreover, these monsters do not even feel guilty about it… And then one day, when you go down to the basement and stumble upon all these cadavers stuffed in the freezer, your program just goes “But why are you mad? I was not supposed to do that?”. So, do not trust them. Install cameras in your proverbial basement. Be safe.
In other words (coming back from whatever happened in the last paragraph…), it may happen that the program’s naughty behavior does not cause a runtime error and does not change the test’s result (so is not observable in any way during testing). It may even be that such behavior is masked by the compiler optimization or triggers only on a particular target architecture in specific circumstances (so is not detectable by a sanitizer, either). In this manner, just because of unfavorable circumstances, the problem might get overlooked. And a bug or vulnerability will remain concealed, lurking somewhere in the program, waiting to be exploited.
Writing outside of array’s bounds is a perfect example of such a situation. In some cases, out-of-bound writing will not cause a runtime error and will not alter the program’s output. Maybe it will just change the value of some random object, which very rarely impacts the program’s behavior. Maybe the write will even get completely optimized away during compilation. And then, one day, some unfortunate event or a malevolent hacker discovers a combination of parameters and environment variables which causes this out-of-bound access to be executed and do something consequential and… all hell will break loose.
We can see a good and simple example(borrowed from [Undefined Behavior in 2017]) of such fault in this little piece of code: TSnippet: buffer overflow. Other tools will not warn about this undefined behavior because it gets optimized away before the instrumentation has a chance to run, as we can observe here: Godbolt: buffer overflow missed by ASan. Another example[idem] of a problem which will not be discovered by dynamic checkers is this case of unsequenced access to a memory with two different pointers referencing the same memory location: TSnippet: memory aliases.
TrustInSoft Analyzer can detect if such dangerous behaviors may happen during the program’s execution. In fact, it is capable of doing much more than just detecting them! When Analyzer concludes that such behaviors do not happen, it does not only mean that it cannot find any. It means that it has mathematically guaranteed the total absence of out-of-bound accesses in the context of the given test case. And this is not an uncertain “Well, umm, I searched around, and I could not really find any problems. I guess it is probably all right…” situation. Instead, this is a serious “There are no problems left! We’ve taken care of all of them, you can trust me on that, sir!” situation.
Now, enough talking about the wonderful advantages of TrustInSoft Analyzer (which is easy to use, formally sound, fully exhaustive, soft to the touch, and provides incredibly strong mathematical guarantees). Let us discuss what is necessary to deploy this magnificent tool on an actual C/C++ project.
Source code: complete C/C++ projects
TrustInSoft Analyzer works directly on the source code level. In order to carry out a meaningful analysis of a program, it requires access to all the C/C++ source code that the given project uses or includes. Unfortunately, this constraint restricts the Analyzer applications to complete C/C++ projects. The code of all the dependencies must be available, otherwise the Analyzer is simply unable to provide any immediate service. Compiled binaries are not enough!
However, there are some means that allow circumvention of this restriction if necessary. A good practical solution is, for example, to create an appropriate stub for any function whose source code is not available which is called by the program. This requires some effort and is a little more advanced than running the Analyzer on your code as-is, but that is the price of exhaustiveness. The tool does not make guesses about the source code, it only works within a well-defined context.
Existing test suite
The methodology presented here allows to quickly deploy TrustInSoft Analyzer on a C/C++ project by leveraging an existing test suite. We analyse the program’s behavior by performing abstract interpretation of all the available test cases. Basically, we use existing test drivers directly as analysis drivers. A test suite with significant coverage greatly increases the profitability of this approach.
Know-how: to compile and build the project
When working on a complex multi-file project, TrustInSoft Analyzer needs to know how the whole project is compiled and built. The information concerning which source code files are used, which headers should be included, and what compilation options should be set, is necessary to properly parse and analyze the code.
This know-how might have different shapes and forms, possibly not even particularly structured or formal. Maybe all the instructions about building and compiling are just written in plain English in a single README file. Maybe the project comes with a regular Makefile. Either way, when setting up TrustInSoft Analyzer, all the relevant pieces of information must be extracted from these sources (manually or using some helper tools) and an appropriate configuration file, tis.config, must be produced.
Run, correct, rerun, guarantee!
After proper configuration, the Analyzer can be run on each of the tests which constitute the test suite. If undefined behaviors are detected in some tests, it will provide specific warnings. The underlying program faults should be investigated and corrected before continuing. Then, the Analyzer should be run again and again to discover problems, each of which should be corrected one by one, until no more undefined behaviors are found in the code.
When a certain test passes through the Analyzer without any problem, this means that the corresponding execution path in the source code is guaranteed to be 100% free of undefined behaviors. Mathematically proven, pinky promise!
What comes next?
What happens when the whole test suite passes through without raising any alarms? Well, you can finally sit in that soft, comfortable armchair in front of the fireplace, open that priceless bottle of old bourbon, light that exquisite Cuban cigar, and rest! And try to enjoy some well-deserved peace of mind for a while… Or immediately get back to work, because you hate comfy furniture, open flames, alcohol, smoking, idleness, and generally being relaxed! Arrr!
So, three major axes for improvement present themselves at this point:
First, incorporating TrustInSoft Analyzer in the continuous integration efforts of the project.
This is actually a very easy task, especially as all the difficult groundwork has just been prepared. If you are already performing continuous integration, mixing in the Analyzer should be pretty straightforward. Specific support for continuous integration activities is currently under preparation. Get in touch with us to be part of the beta testing.
Second, extending the existing test suite.
Writing new tests, or generating them using a dedicated tool, will not only augment the test coverage, but also the analysis coverage. All the execution paths will be guaranteed clear of undefined behaviors.
Third, moving towards more complete formal analysis.
With TrustInSoft Analyzer, the existing test drivers can be generalized until the analysis perimeter stretches throughout the whole program, and exhaustive verification is reached.
All serious developers and project managers pride themselves on the quality of code they create and maintain. Acutely aware of the disastrous consequences which may arise from unnoticed bugs and vulnerabilities, they put tremendous effort into assuring their software is robust, correct, and secure.
Testing is not enough to verify software that matters for security and safety. Formal verification can satisfy such high concerns, but requires a significant effort for deployment. It is rarely used on non-critical software. Other techniques (automatic test generation, dynamic checking, pattern-matching static analysis, etc.) are pretty good ways to complement or improve testing, but do not address the core of the problem.
We propose a new hybrid approach, based on deploying TrustInSoft Analyzer on existing C/C++ projects, and leveraging their current test suites for immediate gains. This provides the benefit from the strong mathematical guarantees associated with formal methods, in the whole perimeter covered by the test suite, without investing all the time and effort needed to fully verify a program formally.
Hopefully, such a solution will help make formal verification more accessible and allow more C/C++ developers enjoy its ample advantages. They will start sleeping better at night and living happier lives, relaxed and serene, knowing that their programs are finally free of bugs and vulnerabilities. And what is more gratifying than safe and secure software? A smile of bliss on a developer’s face, of course!
Now, if you would like to try out the TrustInSoft Analyzer, check out TSnippet, the online version of the tool specially adapted for small C code snippets: TSnippet. A tutorial for new users is available, and you can also read this short introductory article by Michele Alberti: TSnippet introduction. If you are satisfied by what you see, and you would like to get more information about implementing the approach described here in your own project, feel free contact us directly at TrustInSoft, we will gladly help.