Upgrade your existing test suites with TrustInSoft

Robust, correct, and secure software.

All serious developers and project managers pride themselves on the quality of the code they create and maintain. Acutely aware of the disastrous consequences that may arise from unnoticed bugs and vulnerabilities[Heartbleed, Toyota Recall 2018], they put tremendous effort into assuring their software is robust, correct, and secure.

A serious developer / project manager looking for signs of danger.

During long busy days, fueled by caffeine rushing through their veins and the cool breeze of air conditioning blowing in their faces, they manage to remain calm and steady while navigating the towering piles of feature requests stacking up in their inboxes. They are careful never to steer into uncharted territory without thorough preparation. They move forward only after meticulously designing the necessary test cases, doing their best to avoid any foreseeable hazards. Then, during long, sleepless nights, in the dim glare of computer screens and the soft hum of cooling fans, they do not abandon their posts despite their fatigue. Instead, they stare incessantly at the sea of code they feel responsible for, their sharp eyes scrupulously scanning each line with focus and perseverance, looking for any sign of danger.

The stormy and treacherous Sea of Code.

These stalwart individuals are compelled to always carry the heavy burden of uncertainty and responsibility. Especially those who work with C/C++, the most fearless and resolute of all developers, accustomed to wrestling with an element so incredibly powerful and volatile. A single chilling question is always playing in the back of their heads: “What if I missed something?”. Because no matter how many test suites were prepared and executed, how strictly coding guidelines were enforced, how advanced the sanitizers employed, and how many hours were spent carefully reviewing the code, there is simply no guarantee of the result.

So fearless and resolute!… Must be a C developer.

Well, the solution we propose in this paper provides guarantees. Strong mathematical guarantees based on formal methods and backed by decades of scientific research. It relies on exact techniques, like abstract interpretation[Abstract Interpretation in a Nutshell], to prove properties of programs written in C/C++. No more uncertainty, no more hoping for the best. Instead, hard evidence that finally makes it possible to actually trust the software.

Formal verification is a wholly different paradigm from verification by testing. The level of assurance it provides, concerning both the safety and the security of the analyzed code, is radically superior to that of other methods. Generally though, formal verification of a program is a difficult, expensive, and time-consuming process which requires skill and experience[seL4, CompCert]. Due to the high cost, in terms of both time and effort, this approach does not suit everyone. According to Wikipedia, formal verification may account for as much as 80 percent of the total design cost[Wikipedia: Software verification and validation]. For most projects, it is reserved for particularly critical pieces of code.

A new way? A glimpse of hope? Sail with us into the glorious sunrise!

TrustInSoft proposes a new way of addressing this issue. We hope to make formal verification more accessible by implementing a hybrid solution. It equips developers with tools, and an accompanying methodology, to gradually lift a C/C++ project from the level of verification by testing to the level of formal verification by leveraging already existing test suites. This solution provides access to the many blessings of state-of-the-art formal verification techniques without necessarily performing a full-scale analysis of the code[applying formal methods]. The goal is to provide developers with great advantages for a fraction of the effort.

Systematic approaches to developing correct software

When striving to develop a correct program, writing a set of tests is usually the first solution which comes to mind. And it is a pretty good one! A set of tests is a quick, simple, and effective means to rapidly find the most glaring faults in the code. You can even reverse the order and start by writing tests to engage in test-driven development. This is an excellent strategy to efficiently write both simple and complex programs.

However, a set of manually crafted test cases can only go so far. The problem of building safe and secure software must eventually be tackled in a more systematic manner if the ultimate goal is a complete and reliable outcome. There are several distinct perspectives and approaches which allow advancement on the glorious path towards correct code.

Automated testing / test generation

Fuzzing, by the Peanuts.

The efficacy of testing itself can be vastly improved using dedicated methodologies and tools. We may employ various techniques of automatic or semi-automatic testing or test generation to greatly enhance test coverage and pertinence. For example, fuzzing is a highly adaptable and efficient method of providing invalid, unexpected, or random inputs to a given program for testing purposes. Using tools that implement such techniques[afl] is an excellent idea which we recommend wholeheartedly.
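To make this concrete, here is a minimal, hypothetical sketch of a fuzz target in the style accepted by coverage-guided fuzzers such as libFuzzer (and AFL++ in its libFuzzer-compatible mode). The parse_packet function and its header layout are invented for illustration; only the LLVMFuzzerTestOneInput entry point is the fuzzers' real convention:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser standing in for the code under test. */
static int parse_packet(const uint8_t *buf, size_t len) {
    if (len < 4)
        return -1;                    /* too short to hold a header */
    return buf[0] == 0x01 ? 0 : -1;   /* invented version-tag check */
}

/* Coverage-guided fuzzers call this entry point over and over with
 * mutated inputs; any crash or sanitizer report flags a bug. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_packet(data, size);
    return 0;
}
```

Built with, e.g., clang -fsanitize=fuzzer,address, the resulting binary explores inputs automatically; note that it still only ever exercises a finite sample of them.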

The important thing to keep in mind is that no matter how advanced the methodology or tool applied, the result is, basically, more tests and/or better tests. Unfortunately, it is impossible to test every possible execution scenario of a non-trivial program one by one. There are just too many of them.

Oops, we must have missed this one during testing…

Testing is like probing the waters to check for rocks (or sea monsters) under the surface before sailing forward. It is definitely better to probe more rather than less. It is more efficient to probe in relevant places than to employ a random technique. Likewise, it is more prudent to probe more intensely in the areas where danger is more likely. Still, regardless of how these checks are completed, there is no way to physically probe the whole sea in such a point-by-point manner. For programming, this creates uncertainty. There is simply no way to tell that a particular neglected spot does not hide some grotesque monstrosity, ready to crush the hull of the ship with its horrific tentacles and gruesomely devour the entire crew with its dreadful yawning maw. On the other hand, no risk, no fun. Arr!

Dynamic checking tools

One step further from testing, there is dynamic checking. Tools that implement this approach are known as sanitizers. Verification still happens mostly at the level of executables. Sanitizers determine whether the program exhibits any dangerous behavior during its execution. Their added value, on top of traditional testing, is that such tools reveal hidden problems which do not impact test results or directly cause runtime errors.

Two main techniques are employed to track what a program is actually doing. The first technique is to insert additional instrumentation into the original source code during compilation (some sanitizers work simply as compiler extensions; e.g. for Clang we have ASan, UBSan, MSan, and TSan). The second technique is to simulate the program’s execution in a virtual environment controlled and monitored by the sanitizer (e.g. Valgrind can do that).
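As a small illustration of the first technique, consider this hedged sketch (the function and values are ours, not from any particular codebase). Compiled normally, an out-of-bounds index may go completely unnoticed; compiled with clang -fsanitize=address, the instrumented binary aborts with a heap-buffer-overflow report the moment the bad access executes:

```c
#include <stdlib.h>

/* Stores v at index i of a freshly allocated array of n ints and
 * returns the value read back. There is deliberately no bounds
 * check: calling it with i >= n, e.g. store_and_read(10, 10, 1),
 * is a heap buffer overflow that ASan's instrumentation detects. */
int store_and_read(size_t n, size_t i, int v) {
    int *a = malloc(n * sizeof *a);
    if (!a)
        return 0;
    a[i] = v;          /* instrumented: ASan checks this access */
    int r = a[i];
    free(a);
    return r;
}
```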

As both these methods allow for a deeper level of assurance about the correctness of the program’s behavior on particular test cases, sanitizers can be roughly summarized as testing on steroids. We perform the same tests as usual, but we discover more problems. We do not only see what is happening on the surface of the water (i.e. the tests’ results), we can also peek a little bit below the waves to get an idea of what lurks beneath (and see if some dangerous operations were performed during a test’s execution). Perform the same tests as always, detect ten times more krakens! Nice!

“This is Yellow Submarine reporting to the Command! We’ve got a visual on a category four or bigger, definitely lurking. Requesting immediate debug!”

Using sanitizers is a direct improvement over simple testing. As such, it is highly recommended! Their main shortcoming is that such tools are still not exhaustive. They can (and will, but hang on, we are going to highlight some examples a bit later) miss many problems. Sanitizers are, in essence, just debuggers. They help find and understand faults in the code, but they can never guarantee the elimination of all faults.

Pattern-matching tools (traditional static analysis)

Another interesting set of tools is the various linters. Linters work directly at the source code level and use pattern-matching, and other traditional static analysis techniques, to find and flag potential problems. Depending on their sophistication, these tools can detect many different classes of issues: obvious programming errors, violations of style guidelines, or usage of suspicious constructs. Using a linter is a great way to improve general code quality and spot bugs. Moreover, this category of tools is pretty handy when striving to reach conformance with a particular coding standard (e.g. MISRA, CERT C), especially in the area of style-related rules and recommendations.

The caveat here is buried in the main assumption underlying these tools: the hope that by making code nicer, we will also make it more correct. Of course, making code nicer does not guarantee correctness. Blindly following any coding standard does not guarantee it, either[Safer Language Subsets]. That said, following style guidelines or coding rules and generally writing nice code is not a half bad idea. Keeping everything tidy and manageable may not be the ultimate and complete solution to all problems, but it definitely helps eliminate some immediate errors, and yields undeniable maintenance-related gains in the long run.

Every sailor knows that you should keep your boat clean and orderly. This is not for aesthetic reasons, but for safety! It helps to avoid potential problems, makes those problems easier to spot, and makes it easier to react faster. No, it does not stop a rope from snapping. Yet, it definitely improves a sailor’s chances of noticing that a rope is beginning to fray. This increases the odds of preventing the problem or at least allows for the mitigation of consequences.

Correct from the ground up

A radical approach to eradicating bugs and vulnerabilities is to not make them in the first place. There are several technologies and frameworks that aim at designing and building correct software from the ground up. For example, the SCADE Suite[SCADE] uses a formally-defined, domain-specific language to express what the program should do. It then generates executable C/C++ code that is correct by design. Other noteworthy examples are the B method[B method], used for safety-critical systems, and Cryptol[Cryptol], used for cryptographic algorithms.

Following such an approach is extremely effective and recommended when possible. Unfortunately, it does have many constraints. These techniques are best suited for the development of embedded critical software. They are definitely not applicable to existing programs, at least not unless rewriting them completely is an option.

Full formal verification

Finally, let us get back to the idea of fully verifying the software formally. Formal verification of a piece of code is undeniably superior to both testing and pattern-matching. It provides actual mathematical guarantees concerning its safety and security properties. With a tool like TrustInSoft Analyzer, such an undertaking is within the realm of feasibility (even if it remains a rather ambitious task, especially for a beginner).

Maybe, you know, use maths instead of YOLO?…

Looks like we have cornered ourselves into inventing another nautical metaphor… Well, alrighty then!

Formal verification is much like creating a mathematical model of the ship, analyzing its behavior with respect to the sea, and proving some properties about it (e.g. that it is sea-monster-proof). Which is in fact not so far from what we do when designing boats, cars, bridges, or other objects used by people! Important aspects of such objects, like material endurance versus weight to bear, are mathematically determined and calculated on paper using models based on our understanding of the laws of physics. Doing the same for programs seems nothing but sane and rational.

Additionally, modeling behaviors of programs is a much more precise discipline than modeling behaviors of physical objects. At least in theory, we humans have more mastery over computer science than over physics. Hence such an approach should be natural and more exact than what we can ever hope to achieve in architecture or naval engineering.

Hybrid solution

The solution we advocate proposes to boldly go in the same general direction, but it strives to eliminate the daunting difficulty of performing a full formal verification. To decrease this difficulty, we will be aiming a bit lower and piggybacking on existing groundwork. Essentially, we narrow the verification’s objectives and we cut down the amount of verification work. Instead of attempting to formally verify a C/C++ program with all the bells and whistles, the goal is to benefit from as many advantages of formal verification as possible with minimal effort. We achieve it by using TrustInSoft Analyzer and following a specific methodology based on leveraging existing test suites. This minimizes the cost and maximizes the gain.

Even taking a single small step toward formal verification of a piece of code is precious. It already unlocks access to results and benefits that are unachievable by simple testing or even dynamic checking. In this approach, formal verification is fueled by the tests. Thus it combines perfectly well with all the test generation techniques, further amplifying their efficiency. In addition, it opens the door to potentially continuing the route to complete formal verification.

So, essentially, we are presenting a brand new trade-off between the quality of software verification and the cost of software verification. This solution employs formal methods, immediately outclassing tests and pattern-matching. And it manages to do so without the steep increase in the overall development effort often associated with formal methods.

What is TrustInSoft Analyzer? What does it do?

The TrustInSoft Analyzer is a powerful and versatile tool used for advanced verification of critical C/C++ software. It can determine with mathematical certainty if the provided code is safe and secure by detecting undefined behaviors[Undefined Behavior Guide] or proving the absence of those behaviors.

Here, we use the Analyzer in a very specific way, though. We tailor it for a much narrower and more specific purpose. Thanks to this particular configuration, called the interpreter mode, we manage to circumvent all the inevitable complexity involved when performing a full-blown analysis. We apply the Analyzer almost directly to existing test suites with minimal setup effort.

In fact, this tool is available for evaluation now! An online version, TSnippet, which may be used to analyze snippets of C code, can be found here.

Difference

When software is tested, it usually means that the compiled program is executed on a computer, and its actual behavior is compared to how it was expected to behave. The program is fed specific inputs (defined by the test case) and its outputs are checked for errors.

TrustInSoft Analyzer does not execute a compiled program natively on the machine where it runs. Instead, it works directly on the program’s source code level, interpreting it line by line and simulating the program’s behavior in a virtual environment. Based on complete formal understanding of the C/C++ language semantics and a complex model of the computer’s memory, this simulation is mathematically sound.

If you don’t get what “complete formal understanding of the C/C++ language semantics” and “complex model of the computer’s memory” is about, don’t worry. Imagine something like on this picture, but blinking and buzzing, and you’re gonna be fine.

Thanks to this approach, the program is no longer a black box. All the details and fine points of its internal behavior become observable. Now, not only can the program’s outputs be checked for errors, but everything that happens during the program’s execution can be thoroughly examined and verified for signs of trouble. Unlike with dynamic checkers, this verification is sound and exhaustive. We detect all the faults without fail.

In The Proverbial Basement No One Can Hear You Scream…

Why is this important? Because programs are vicious, clever little beasts. They are fully capable of committing the most atrocious, forbidden, and dangerous things. And they can hide their crimes so well that they go unnoticed for a very long time. Moreover, these monsters do not even feel guilty about it… And then one day, when you go down to the basement and stumble upon all these cadavers stuffed in the freezer, your program just goes “But why are you mad? I was not supposed to do that?”. So, do not trust them. Install cameras in your proverbial basement. Be safe.

In other words (coming back from whatever happened in the last paragraph…), it may happen that the program’s naughty behavior does not cause a runtime error and does not change the test’s result (so it is not observable in any way during testing). It may even be that such behavior is masked by compiler optimization or triggers only on a particular target architecture in specific circumstances (so it is not detectable by a sanitizer, either). In this manner, just because of unfavorable circumstances, the problem might get overlooked. And a bug or vulnerability will remain concealed, lurking somewhere in the program, waiting to be exploited.

Examples

Writing outside an array’s bounds is a perfect example of such a situation. In some cases, an out-of-bounds write will not cause a runtime error and will not alter the program’s output. Maybe it will just change the value of some random object, in a way that very rarely impacts the program’s behavior. Maybe the write will even get completely optimized away during compilation. And then, one day, some unfortunate event or a malevolent hacker discovers a combination of parameters and environment variables which causes this out-of-bounds access to be executed and do something consequential and… all hell breaks loose.

We can see a good and simple example (borrowed from [Undefined Behavior in 2017]) of such a fault in this little piece of code: TSnippet: buffer overflow. Other tools will not warn about this undefined behavior because it gets optimized away before the instrumentation has a chance to run, as we can observe here: Godbolt: buffer overflow missed by ASan. Another example[idem] of a problem which will not be discovered by dynamic checkers is this case of unsequenced access to memory through two different pointers referencing the same memory location: TSnippet: memory aliases.
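For readers without access to the links, the following sketch is written in the spirit of the cited buffer-overflow example (it is our reconstruction, not the exact snippet). The out-of-bounds store is dead: its value is never read, so an optimizing compiler may delete it before the sanitizer's instrumentation pass runs, leaving ASan nothing to check, even though the source still contains undefined behavior:

```c
#include <stdlib.h>

/* Performs a dead out-of-bounds store and returns 0 on completion.
 * buf has 10 valid indices (0..9); the write to buf[10] is undefined
 * behavior, yet at -O2 it may be removed as a dead store before any
 * sanitizer instrumentation sees it. */
int dead_oob_write(void) {
    char *buf = malloc(10);
    if (!buf)
        return 1;
    buf[10] = 'x';   /* undefined behavior: one past the end, never read */
    free(buf);
    return 0;
}
```

A sound interpreter working at the source level flags the store itself, regardless of what the optimizer later does with it.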

Guarantees

TrustInSoft Analyzer can detect whether such dangerous behaviors may happen during the program’s execution. In fact, it is capable of doing much more than just detecting them! When the Analyzer concludes that such behaviors do not happen, it does not merely mean that it could not find any. It means that it has mathematically guaranteed the total absence of out-of-bounds accesses in the context of the given test case. And this is not an uncertain “Well, umm, I searched around, and I could not really find any problems. I guess it is probably all right…” situation. Instead, this is a serious “There are no problems left! We’ve taken care of all of them, you can trust me on that, sir!” situation.

Deployment

Now, enough talking about the wonderful advantages of TrustInSoft Analyzer (which is easy to use, formally sound, fully exhaustive, soft to the touch, and provides incredibly strong mathematical guarantees). Let us discuss what is necessary to deploy this magnificent tool on an actual C/C++ project.

Should Doge become our Brand Ambassador? Tell us in the comments!

Requirements

Source code: complete C/C++ projects

TrustInSoft Analyzer works directly at the source code level. In order to carry out a meaningful analysis of a program, it requires access to all the C/C++ source code that the given project uses or includes. Unfortunately, this constraint restricts the Analyzer’s applications to complete C/C++ projects. The code of all the dependencies must be available, otherwise the Analyzer is simply unable to provide any immediate service. Compiled binaries are not enough!

However, there are some means of circumventing this restriction if necessary. A good practical solution is, for example, to create an appropriate stub for any function that the program calls but whose source code is not available. This requires some effort and is a little more advanced than running the Analyzer on your code as-is, but that is the price of exhaustiveness. The tool does not make guesses about the source code; it only works within a well-defined context.
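As a hypothetical illustration (the function name and returned value are invented here), suppose the project calls a vendor-supplied read_sensor whose source code ships only as a binary. A hand-written stub gives the analysis a well-defined model of the call to work within:

```c
/* Stub for the vendor function read_sensor, whose source is not
 * available. For the purposes of one analysis, it models the call
 * with a fixed, plausible reading; a more faithful stub could return
 * a whole range of values or validate the channel argument. */
int read_sensor(int channel) {
    (void)channel;   /* this simple model ignores the channel */
    return 42;       /* invented placeholder reading */
}
```

The fidelity of the stub bounds the fidelity of the analysis: a stub that is too narrow may hide behaviors the real function could trigger.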

Existing test suite

The methodology presented here allows quick deployment of TrustInSoft Analyzer on a C/C++ project by leveraging an existing test suite. We analyze the program’s behavior by performing abstract interpretation of all the available test cases. Basically, we use existing test drivers directly as analysis drivers. A test suite with significant coverage greatly increases the profitability of this approach.

Know-how: to compile and build the project

When working on a complex multi-file project, TrustInSoft Analyzer needs to know how the whole project is compiled and built. The information concerning which source code files are used, which headers should be included, and what compilation options should be set, is necessary to properly parse and analyze the code.

This know-how might have different shapes and forms, possibly not even particularly structured or formal. Maybe all the instructions about building and compiling are just written in plain English in a single README file. Maybe the project comes with a regular Makefile. Either way, when setting up TrustInSoft Analyzer, all the relevant pieces of information must be extracted from these sources (manually or using some helper tools) and an appropriate configuration file, tis.config, must be produced.

Run, correct, rerun, guarantee!

After proper configuration, the Analyzer can be run on each of the tests which constitute the test suite. If undefined behaviors are detected in some tests, it will provide specific warnings. The underlying program faults should be investigated and corrected before continuing. Then, the Analyzer should be run again and again to discover problems, each of which should be corrected one by one, until no more undefined behaviors are found in the code.

When a certain test passes through the Analyzer without any problem, this means that the corresponding execution path in the source code is guaranteed to be 100% free of undefined behaviors. Mathematically proven, pinky promise!

100% free of undefined behavior.

What comes next?

After a job well done.

What happens when the whole test suite passes through without raising any alarms? Well, you can finally sit in that soft, comfortable armchair in front of the fireplace, open that priceless bottle of old bourbon, light that exquisite Cuban cigar, and rest! And try to enjoy some well-deserved peace of mind for a while… Or immediately get back to work, because you hate comfy furniture, open flames, alcohol, smoking, idleness, and generally being relaxed! Arrr!

So, three major axes for improvement present themselves at this point:

The famous three major axes for improvement. Ba-dum-tsss…

First, incorporating TrustInSoft Analyzer in the continuous integration efforts of the project.
This is actually a very easy task, especially as all the difficult groundwork has just been prepared. If you are already performing continuous integration, mixing in the Analyzer should be pretty straightforward. Specific support for continuous integration activities is currently under preparation. Get in touch with us to be part of the beta testing.

Second, extending the existing test suite.
Writing new tests, or generating them using a dedicated tool, will not only augment the test coverage, but also the analysis coverage. All the execution paths will be guaranteed clear of undefined behaviors.

Third, moving towards more complete formal analysis.
With TrustInSoft Analyzer, the existing test drivers can be generalized until the analysis perimeter stretches throughout the whole program, and exhaustive verification is reached.

Summary

All serious developers and project managers pride themselves on the quality of code they create and maintain. Acutely aware of the disastrous consequences which may arise from unnoticed bugs and vulnerabilities, they put tremendous effort into assuring their software is robust, correct, and secure.

Testing is not enough to verify software that matters for security and safety. Formal verification can satisfy such high concerns, but requires a significant effort for deployment. It is rarely used on non-critical software. Other techniques (automatic test generation, dynamic checking, pattern-matching static analysis, etc.) are pretty good ways to complement or improve testing, but do not address the core of the problem.

We propose a new hybrid approach, based on deploying TrustInSoft Analyzer on existing C/C++ projects and leveraging their current test suites for immediate gains. This provides the benefit of the strong mathematical guarantees associated with formal methods, across the whole perimeter covered by the test suite, without investing all the time and effort needed to fully verify a program formally.

Hopefully, such a solution will help make formal verification more accessible and allow more C/C++ developers to enjoy its ample advantages. They will start sleeping better at night and living happier lives, relaxed and serene, knowing that their programs are finally free of bugs and vulnerabilities. And what is more gratifying than safe and secure software? A smile of bliss on a developer’s face, of course!

The aftermath: a happy, relaxed, and serene developer.

Now, if you would like to try out TrustInSoft Analyzer, check out TSnippet, the online version of the tool specially adapted for small C code snippets: TSnippet. A tutorial for new users is available, and you can also read this short introductory article by Michele Alberti: TSnippet introduction. If you are satisfied with what you see, and you would like more information about implementing the approach described here in your own project, feel free to contact us directly at TrustInSoft; we will gladly help.

Variadic Functions in TIS Interpreter

Working on improving and applying TIS Interpreter, with funding from the CII, we ended up implementing support for program-defined variadic functions. Recently, while applying TIS Interpreter to the musl standard C library implementation, we found one slight violation of the C standard in the way musl defined functions of the printf family. This post describes program-defined variadic functions and the ways they can be misused, and argues that they are worth supporting in TIS Interpreter.

What are variadic functions and why should we care?

Variadic functions, found in several programming languages, accept a variable number of arguments. As an example, we can imagine a little variadic function add(arg1, arg2, …, argn) that takes some numbers as arguments, and computes their sum, arg1 + arg2 + … + argn, whatever the number of passed arguments actually is.

In C, variadic functions are not that popular, and a seasoned C programmer, anticipating what this is going to entail, would probably avoid specifying and implementing such an add function. Nevertheless, in some cases, the possibility of going beyond functions with a fixed number and type of arguments is convenient. When, in the 1990s, C was still considered and used as a high-level language, variadic functions were a reasonable solution in situations where the flexibility they provided made a difference compared with more cumbersome alternatives. An especially good example of such a situation is input and output formatting using format strings. Because of that, even though a contemporary C programmer would usually avoid defining new variadic functions, many are present in existing C software, notably in legacy code.

The printf function in C

Let us take a closer look at C’s ubiquitous printf function. Its declaration is part of the stdio.h standard library header file and looks as follows:

int printf(const char * restrict format, ...);

The declaration states that the printf function takes one mandatory argument, format, of type const char * restrict, followed by a variable number of additional arguments, indicated by the ellipsis.

The mandatory argument format here is the format string itself. The expected type of the remaining arguments is not established statically on the declaration level, but depends dynamically on the contents of this first argument. In the case of printf, this argument is a format string, and it is the conversion specifiers (the special patterns introduced with the % character) present in this format string that define the types and the number of arguments to be printed on the standard output. The conversion specifier %d expects an int argument and displays it in decimal style, the conversion specifier %c expects a char argument and displays it as an ASCII character, and so on; the list is long. The characters in the format string that are not conversion specifiers simply pass through unchanged.

A simple use of printf could be:

char c = 'X';
int i = 42;
printf("The %c is %d.", c, i);

This code snippet prints “The X is 42.” to the standard output when executed.

In a different context, the same printf function might be passed a double argument and a void* argument. Although this lax type handling has some advantages in the case of formatting functions (the same printf function can be used to print strings, floating-point values, pointers, and integers), major dangers come with it. The main advantage is the flexibility that it offers: without it the whole printf function family could not exist in this form, and producing formatted output in C would be much more tedious. The danger is that this mechanism may give rise to a whole lot of significant type-safety-related problems. If we follow section 7.21.6.1 of the ISO C11 standard dedicated to the fprintf function (of which printf is a specialized version), we discover that both a failure to provide enough arguments and a mismatch in a provided argument’s type cause Undefined Behavior. CWE-134: Use of Externally-Controlled Format String shows that the security risks involved, information leak and, through the %n formatter, foreign code execution, are real and should be taken seriously.
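To make the risk concrete, here is a small sketch of our own (not taken from the standard). snprintf follows the same conversion rules as fprintf, so the same misuses are undefined; the commented-out calls show two classic mistakes, while the function body shows the correct conversion for a double:

```c
#include <stdio.h>

/* Formats a double into buf. The commented-out calls below would be
 * Undefined Behavior under the fprintf rules of C11 7.21.6.1:
 *
 *   snprintf(buf, n, "value=%d", d);      // %d expects int, d is double
 *   snprintf(buf, n, "%s and %s", "hi");  // second %s has no argument
 */
int format_reading(char *buf, size_t n, double d) {
    return snprintf(buf, n, "value=%f", d);   /* %f matches double */
}
```

With d = 3.14 the buffer receives "value=3.140000", since the default %f precision is six digits; neither compiler nor runtime is required to catch the commented-out variants.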

Home-brew variadic functions in C

The printf function is only the most ubiquitous example of a C function that takes a variable number of arguments. The C standard library not only contains several other variadic functions, but also provides the tools necessary for the programmer to define their own variadic functions. The ingredients are available in the stdarg.h header file.

Declaring, calling, and defining variadic functions

We have already seen with printf how to declare a variadic function. After providing a number of mandatory arguments, together with their types, we simply add an ellipsis (i.e. ...) after the last argument, which means additional arguments may follow:

int add(int n, ...);

We also know how to call such functions, as they are in fact called almost exactly like the normal ones. The fixed mandatory arguments are simply followed by the additional variable ones. Therefore the number of passed arguments can differ between different call sites of the same function:

int foo = add(3, 23, 12, 7);
int bar = add(2, 23, 19);

Defining variadic functions is a bit more complicated. The function’s mandatory arguments are accessed just the usual way, but a special mechanism is needed in order to access the variable ones. The stdarg.h header provides four variadic macros: va_start, va_arg, va_end, and va_copy, as well as the type va_list, which together make it possible to cycle through the additional arguments that have been passed to a variadic function:

  • The type va_list holds the necessary information needed by the variadic macros. The traditional name of a variable of this type is ap, for argument pointer.
    Intuitively, we can imagine that an object of this type contains a pointer to the next variadic argument (or a list of remaining variadic arguments). Of course, as the interface is abstract, the actual implementation of this type is not known to the library user. The C standard only describes how it will behave if used correctly.
  • The macro va_start(va_list ap, parmN) initializes the va_list object ap.
    Intuitively, after invoking va_start the ap object points to the first variadic argument passed to the function.
    (Note: The parmN parameter here must be the same as the name of the last non-variadic argument in the function definition, that is, the argument just before the .... The necessity of passing this parameter hints at the low-level details of the implementation.)
  • The macro invocation va_arg(va_list ap, type) expands to an expression with the type and the value of the next variadic argument. Also, each invocation of va_arg on ap modifies ap so that subsequent arguments are returned in sequence. The type provided to the va_arg macro should match the actual type of the next argument.
    Intuitively, va_arg returns the argument that ap points to and makes ap point to the next one.
  • Finally, the macro va_end(va_list ap) deinitializes the ap object. Each va_list object should be deinitialized before the function returns.
  • For simplicity’s sake we will ignore the va_copy macro here, as it is not necessary for the examples in this post.
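One subtlety worth keeping in mind when choosing the type passed to va_arg: variadic arguments undergo the default argument promotions at the call site, so a float is passed as a double, and a char or short as an int. A hedged sketch of our own (the function name average is ours, not from any library):

```c
#include <stdarg.h>

/* Average of n variadic arguments.  Because of the default argument
 * promotions, any float passed by the caller arrives as a double, so
 * va_arg(ap, double) is the correct way to consume it;
 * va_arg(ap, float) would be wrong. */
double average(int n, ...) {
    va_list ap;
    va_start(ap, n);
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += va_arg(ap, double);
    va_end(ap);
    return n > 0 ? sum / n : 0.0;
}
```

A call such as average(2, 1.0f, 2.0f) is fine: both float arguments are promoted to double before the function ever sees them, and the result is 1.5.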

There is no built-in way (such as an additional variadic macro) to know, from the inside of a variadic function, how many arguments have been passed to it, nor of which types. The only way is to provide this information explicitly and consistently at each call site, and to retrieve it inside the variadic function. The usual approach is to encode it using the function's mandatory arguments, exactly as in the case of printf's format string.

A full example

In order to see how this all works, let us implement the closest approximation that can be achieved in C of the add function that sums a variable number of arguments:

int add(int n, ...) {
  va_list ap;
  va_start(ap, n);
  int sum = 0;
  for (; n > 0; n--) {
    int next_arg = va_arg(ap, int);
    sum += next_arg;
  }
  va_end(ap);
  return sum;
}

The internal workings of function add are pretty straightforward:

  1. The beginning:
    • First we declare a va_list object ap and we initialize it using the va_start macro. Now, intuitively, ap points to the first variadic argument of add.
    • Also, we initialize the sum variable to zero.
  2. The middle:
    • n times we consume the next variadic argument, using the va_arg(ap, int) macro. Note that we expect each of the variadic arguments to be of type int, and we expect their number to be at least n.
    • We add the subsequent arguments’ values to the variable sum.
  3. The end:
    • We deinitialize the ap variable using the va_end macro.
    • We return the sum variable’s value, which is now (obviously) equal to the sum of the n variadic arguments.

Now, in order to perform a correct call to add we must at least make sure that:

  • the number of variadic arguments that we pass to the function is not smaller than the value of the argument n passed in the same call (it is all right if more arguments are passed than consumed),
  • and that all these n arguments are of type int.

Examples of some correct calls would be:

add(3, 23, 12, 7); /* == 42 */
add(1, 42); /* == 42 */
add(3, 23, 12, 7, 'x', 13.0, "hello"); /* == 42 */
add(0); /* == 0 */
add(-42); /* == 0 */
Variadic macros and the C standard

Now, please notice the inconspicuous at least, in “we must at least make sure that”, which found its way into the previous paragraph. These are far from the only conditions necessary to call the add function correctly. In fact, the section of the C11 standard concerning variadic functions and variadic macros is complex, and it provides numerous ways to introduce Undefined Behavior into programs that use the stdarg.h library header.

The C11 standard section 7.16, entitled Variable arguments <stdarg.h>, defines the semantics of the four variadic macros, as well as the va_list type, and thus decides how home-brew functions with variable arguments behave, and when they may misbehave. There are many constraints concerning correct usage of these macros, some pretty straightforward and direct, some subtle. Violating most of these constraints seems completely innocuous in practice for all common compilation platforms, while breaking others causes visible problems on some or all common compilation platforms.

Correct sequence of variadic macros invocations

Several constraints govern the order in which the variadic macros are supposed to be invoked on a va_list object. The allowed order is strictly defined by the standard. The following rules paraphrase, in a simplified and more casual way, what the standard says on this subject (note that we omit, again, the va_copy macro):

  • Each va_list object ap begins its life uninitialized.
  • If ap is not initialized it may be initialized using the va_start macro.
  • Once ap has been initialized, the va_arg macro may be invoked on it at most as many times as there are variadic arguments passed to the function.
  • If a va_list variable has been initialized, it must be deinitialized in the same function before the function returns, using the va_end macro.
  • This sequence can be repeated any number of times: after deinitializing ap with va_end we can initialize it again with va_start, and iterate on all the variadic arguments again (from the beginning) with va_arg, and then deinitialize it again with va_end.
  • If any other sequence of events happens, the behavior is undefined.

This ends up being a simple pattern, which resembles a finite state machine that validates how va_start, va_arg, and va_end (invoked on a given va_list variable) are allowed to be interwoven in a single function. Such a machine, for 3 variadic arguments passed in the call, would look like this:

Finite state machine validating variadic macros invocations for 3 arguments passed in the function call
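The same state machine can also be sketched directly in C (this checker is our own illustration, not part of any library): a va_list is either uninitialized or initialized with some number of arguments left to consume, and each macro invocation is only legal from the right state.

```c
#include <stdbool.h>

/* Illustrative model of the allowed macro sequence for one va_list.
 * 'remaining' counts the variadic arguments not yet consumed.
 * Each function returns false where the real macro sequence would
 * be Undefined Behavior. */
typedef struct {
    bool initialized;
    int remaining;
} va_state;

/* va_start: only legal on an uninitialized list. */
bool model_va_start(va_state *s, int n_variadic_args) {
    if (s->initialized) return false;
    s->initialized = true;
    s->remaining = n_variadic_args;
    return true;
}

/* va_arg: only legal while initialized and arguments remain. */
bool model_va_arg(va_state *s) {
    if (!s->initialized || s->remaining == 0) return false;
    s->remaining--;
    return true;
}

/* va_end: only legal on an initialized list; resets it, so a new
 * va_start / va_arg / va_end cycle may begin. */
bool model_va_end(va_state *s) {
    if (!s->initialized) return false;
    s->initialized = false;
    return true;
}
```

For 3 variadic arguments, the sequence start, arg, arg, arg, end is accepted; a fourth va_arg, or a va_arg before va_start, is rejected, mirroring the machine in the figure.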

Consuming too many variadic arguments

The interesting aspect of these rules is that the impact of violating each of them is visible in practice to a very different degree. For example, if we try to consume more variadic arguments than are available in the given call, we will run into trouble quite quickly. Of course, what happens exactly is compiler and platform dependent, but in most cases the practical consequence of the underlying Undefined Behavior will be reading some random data from memory. Let us see a simple program that simulates such a situation:

/* FILE not_enough_args.c */

#include <stdarg.h>
#include <stdio.h>

int f(int fst, ...) {
  va_list ap;
  va_start(ap, fst);
  int x = va_arg(ap, int);
  printf("%d\n", x);
  va_end(ap);
  return x;
}

int main() {
  f(11);
  return 0;
}

In this program, the function f tries to consume one int variadic argument, while the call passes no variadic argument at all. On my machine, compiling it with gcc and then running the result repeatedly prints the following values, which seem pretty random indeed:

1691332760
-262231160
1531375144
-1205239608
133570616
985826840
-1260096984
1604313000
-883866920
-473417640
-121339624
The unpleasant case of the va_end macro

What may be more disturbing, disobeying many of the other mentioned constraints concerning the sequence of variadic macro invocations will usually have no visible effect, either during the program's compilation or during its execution. In particular, all the rules which involve the va_end macro appear to be optional in practice. The va_end macro is translated by most compilers on most architectures to a do-nothing operation (a fact recognised directly in the C standard rationale, section 7.15.1.3: The va_end macro). This macro is supposed to perform a clean-up after initializing and using the va_list object, but in most stdarg.h implementations there is simply nothing to clean up, so the macro is not needed at all and may just be ignored. You might ask, why was it included in the standard in the first place? Following the C rationale again: those implementations that need it probably need it badly is the explanation.
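To make this concrete, here is a deliberately non-conforming sketch of our own: the add function from before, with the va_end invocation forgotten. Per the standard this is Undefined Behavior; on common platforms where va_end expands to a no-op, it nevertheless compiles and runs as if nothing were wrong, which is exactly why such omissions go unnoticed without a tool that checks the rules.

```c
#include <stdarg.h>

/* Non-conforming on purpose: ap is initialized but never passed to
 * va_end before the function returns.  The C standard makes this
 * Undefined Behavior, yet on most implementations va_end does
 * nothing and the omission has no visible effect. */
int add_missing_va_end(int n, ...) {
    va_list ap;
    va_start(ap, n);
    int sum = 0;
    for (; n > 0; n--)
        sum += va_arg(ap, int);
    /* va_end(ap);  <- forgotten! */
    return sum;
}
```

On a typical x86-64 build, add_missing_va_end(3, 23, 12, 7) happily returns 42, and neither the compiler nor the runtime complains.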

The variadic arguments’ types

There are also several constraints which concern the types of the variadic arguments. Basically, the type provided when invoking the va_arg(ap, type) macro should be compatible with the actual type of the next variadic argument (with several well-defined exceptions, see the C11 standard section 7.16.1.1p2 for details). The danger related to the underlying Undefined Behavior is quite serious in this case, and definitely of the harmless-looking, but waiting to stab you in the back one day kind. As the size of C types may vary with the compiler, the compilation options, and the platform, it is not hard to imagine a program that works perfectly well on one configuration, where two theoretically incompatible types happen to align well (and happen to be passed through the function call in exactly the same way), and fails miserably on another configuration, where the same two types in the same circumstances no longer behave identically. There, the variadic argument is recovered incorrectly, and suddenly we have a nice shiny bug…
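The claim about varying sizes is easy to check for oneself; a small probe such as the following (a sketch of ours, whose output depends entirely on the compiler and target: on x86 Linux, for instance, long is typically 4 bytes with -m32 and 8 bytes with -m64):

```c
#include <stdio.h>

/* Print the sizes that determine whether a long / long long mismatch
 * in va_arg happens to be harmless on a given target. */
void print_integer_sizes(void) {
    printf("sizeof(int)       = %zu\n", sizeof(int));
    printf("sizeof(long)      = %zu\n", sizeof(long));
    printf("sizeof(long long) = %zu\n", sizeof(long long));
}
```

Whenever these sizes coincide, a type mismatch in va_arg may be masked; when they differ, the same source code misbehaves.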

Let us look at a simplistic example that showcases the problem. In the following code we call the variadic function f passing two arguments, both of type long and value 42, and then we attempt to consume one variadic argument of size long long:

/* FILE type_mismatch.c */

#include <stdarg.h>
#include <stdio.h>

void f(int hello, ...) {
  va_list ap;
  va_start(ap, hello);
  long long b = va_arg(ap, long long);
  printf("%lld\n", b);
  va_end(ap);
}

int main(void) {
  f(0, (long) 42, (long) 42);
  return 0;
}

On my machine when I compile this source code with gcc using two different target options I get different results upon execution:

  • Option -m64 selects 64-bit x86-64 build. In this case the first variadic argument’s value 42 of type long is read correctly, as the sizes of the types long and long long seem to match: the program prints 42.
  • Option -m32 selects 32-bit i386 build. In this case the argument is read incorrectly: the program prints 180388626474, which is definitely not the value we expected. (That number is 42 × 2^32 + 42: the two 4-byte long arguments were read back together as a single 8-byte long long.)

This example is simplistic indeed, but it shows exactly the disturbing property that we were just talking about: on one configuration it works perfectly fine, and on another it does not. Of course, here it is pretty evident that the two types concerned will not always match and that something might go wrong. However, if this kind of mismatch is well hidden in a much larger program, with many execution paths and #define directives all around the place, the existence of a potential problem will not be so obvious anymore. Furthermore, no amount of testing on a 64-bit x86-64 build will ever throw any doubt at the Works on My Machine certificate that we might have given this piece of code. But compiling and running on a 32-bit system tells a different story.

Support of variadic functions in TIS Interpreter

In TIS Interpreter we have recently implemented support for variadic functions written using the variadic macros from stdarg.h. TIS Interpreter, developed thanks to the funding of the CII and available as Open-Source, is an abstract interpretation tool for C programs, capable of finding and identifying a very large set of Undefined Behaviors and C standard violations. Now, with its newly gained proficiency in this particular domain, it can also discover problems concerning variable argument handling, like the ones mentioned above.

Let us see how TIS Interpreter handles the examples that we have introduced so far:

    • Interpreting the example where too many variadic arguments are consumed produces the following warning:
not_enough_args.c:7:[value] warning: va_arg macro called when all the variadic arguments have been already used up; assert enough arguments
    • When we interpret the example where variadic argument types are not really matching we get:
type_mismatch.c:7:[value] warning: the actual type of the next variadic argument (long) does not match the type provided to the va_arg macro (long long); assert the type of each variadic arguments provided to a function matches the type given to the corresponding call to the va_arg macro
    • And for an example that we did not show explicitly, with a va_end macro invocation removed from an otherwise correct program:
missing_va_end.c:13:[value] warning: local variable ap of type va_list in function add has been initialized using va_start or va_copy macro and has not been deinitialized by a matching va_end macro; assert va_list variable ap has been uninitialized using the va_end macro

All these warnings are extracted from the output of TIS Interpreter when we execute it directly on the code that we have seen here, simply like this:

$ tis-interpreter not_enough_args.c
The case of musl

Recently, the support for variadic functions proved very useful, as we were running musl code through TIS Interpreter. The musl library is a perfect target for TIS Interpreter: it is an important and widely used Open Source component written in C, and it aims “to be correct in the sense of standards-conformance and safety”.

musl is a libc: an implementation of the standard library functionality described in the ISO C and POSIX standards. musl's main objectives are to be lightweight, fast, simple, free, and, as we have already emphasised, correct. It is a key component of Alpine Linux, a security-oriented, small, simple, resource-efficient Linux distribution, very well adapted to use in software containers. These features make Alpine Linux a frequent choice for Docker containers; rumour has it that it is even being considered as the default platform option in the official Docker image library. And, as Docker is an extremely popular (the world's leading, according to its website) software container platform, musl happens to be a pretty widely deployed libc version. Hence our interest in it.

What did we find?

As musl is high-quality software, written with standards-conformance in mind (the POSIX standard for the interface it provides, the C standard for its assumptions with respect to the compiler), we did not expect to find many issues to report. And indeed, we only managed to encounter minor transgressions of the C standard. One of these, which ultimately was deemed important enough to be corrected, was present in the implementation of the printf and scanf variadic functions. In musl, the implementation of these library functions is in fact based on the variadic macros from stdarg.h.

The issue was related to the type and value of the argument passed to the va_arg macro. Consider the program:

char dest[100];
int x = sprintf(dest, "%lld\n", -1LL);

These two lines are correct C. The type of the -1LL argument matches the format specifier %lld. Still, TIS Interpreter emits a warning when using this snippet to drive the sprintf implementation that was in musl at the time.

src/stdio/vfprintf.c:141: warning: although the type provided to the va_arg macro (unsigned long long) and the actual type of the next variadic argument (long long) are corresponding unsigned - signed variants of the same integer type, the actual value of the argument (signed) is negative thus it cannot be represented in the provided (unsigned) type

This warning refers to the Undefined Behavior that we have already come across earlier, described in C11 standard section 7.16.1.1:2:

(…) if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:

  • one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types;
  • one type is pointer to void and the other is a pointer to a character type.

So what happens here exactly? The next variadic argument at this point of the execution is the -1LL constant, a negative value of type long long. Deep inside the sprintf implementation, the va_arg macro expects at this moment an argument of type unsigned long long. Though these two types are not compatible, we fall into the first of the two exception cases: one type is a signed integer type, the other type is the corresponding unsigned integer type. But this use of a type with a different signedness is only valid if the argument's value is representable in both the signed and the unsigned type, which is not the case for -1. Consuming -1LL with va_arg(…, unsigned long long) is undefined. And that is exactly what TIS Interpreter is warning about here.
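The boundary of that exception is easy to demonstrate in a sketch of our own (the function take_unsigned is ours, for illustration only): consuming an int argument through va_arg(ap, unsigned) is well defined as long as the value is non-negative, and becomes Undefined Behavior the moment it is not.

```c
#include <stdarg.h>

/* Consume one variadic argument as unsigned int.  Per C11 7.16.1.1p2
 * this is well defined when the caller passed an int whose value is
 * representable as unsigned (i.e. non-negative).  Passing a negative
 * int here, as musl's implementation effectively did with -1LL and
 * unsigned long long, is Undefined Behavior. */
unsigned take_unsigned(int dummy, ...) {
    va_list ap;
    va_start(ap, dummy);
    unsigned u = va_arg(ap, unsigned);
    va_end(ap);
    return u;
}
```

So take_unsigned(0, 42) is a correct call returning 42, while take_unsigned(0, -1) is undefined, even though on common platforms it "works" and quietly yields UINT_MAX.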

A short investigation led to the va_arg invocation that consumed this variadic argument. It was the one at line 141 of the vfprintf.c file, effectively expecting unsigned long long:

break; case ULLONG: arg->i = va_arg(*ap, unsigned long long);

The cause behind the issue is an optimization in the vfprintf.c file. Two symbols, LONG_IS_INT and ODD_TYPES, are defined conditionally and then employed, using #ifdef directives, to fiddle with the enum type related to handling the conversion specifiers, and with the switch cases which select the correct va_arg invocation in the pop_arg function. Let us see exactly how ODD_TYPES makes us reach the switch case with the unsigned long long type:

First, the symbol ODD_TYPES is defined or not, depending on the representation of certain types on the platform:

#if SIZE_MAX != ULONG_MAX || UINTMAX_MAX != ULLONG_MAX
#define ODD_TYPES
#endif

Then ODD_TYPES decides if LLONG is an actual enumeration tag or just a synonym for the ULLONG tag:

enum {
  /* ... */
  PTR, INT, UINT, ULLONG,
  /* ... */
#ifdef ODD_TYPES
  LLONG, SIZET, IMAX, UMAX, PDIFF, UIPTR,
#else
#define LLONG ULLONG
/* other #define directives here... */
#endif
  /* ... */
};

Finally, inside the pop_arg function's switch statement, the case corresponding to LLONG is conditionally omitted (as LLONG is in this situation just an alias for ULLONG, it would be a de facto second ULLONG case in the switch):

switch (type) {
  /* ... */
  break; case ULLONG: arg->i = va_arg(*ap, unsigned long long);
  /* ... */
#ifdef ODD_TYPES
  break; case LLONG: arg->i = va_arg(*ap, long long);
  /* ... */
#endif

So what purpose did these type-related shenanigans serve? This optimization can shave a few bytes off the compiled code by unifying certain execution paths: if different types have the same underlying representation, they can all be treated in the same way. Unfortunately, as we have just seen, this optimization also introduces Undefined Behavior. Afterwards (see the discussion on the musl mailing list), a cleanup patch was applied by Rich Felker, the primary author of musl, in this commit.

Conclusion

In an ideal world, this kind of optimization would not need to exist at the C level in the first place. If the compiler recognised what was going on in this situation, i.e. that two or more execution paths are equivalent on a given architecture, these two switch cases could be merged at compile-time. Then the programmer could just stick to the C standard, express their intentions, and get an executable as small as when applying the hack discussed above.

Luckily, in these particular circumstances, the efficiency impact of removing this dubious optimization altogether was negligible, so the choice was easy to make. In other cases though, where the difference in efficiency is more substantial, it might be less clear whether sticking to the C standard is worth the price.

As a side-effect, this led to a short discussion about compilers producing efficient code from the type-safe and correct version that musl now uses exclusively. It is a pleasant convergence that a problem uncovered in musl with TIS Interpreter revealed an instance of a compilation challenge that was being worked on at the same time.

Acknowledgments: Shafik Yaghmour provided comments on an early version of this post, and it was edited by Pascal Cuoq. Joakim Sindholt provided the cleanup patch that was merged into the musl tree by Rich Felker.