Beyond the x86_64: developing for embedded systems without development boards

October 20, 2020

Challenges of Cross-Platform Development and Solutions 

Introduction

In embedded software development, one of the first decisions, made even before application development begins, is the choice of a hardware platform. The platform's constraints and capabilities can define what the embedded system is able to do once development has begun.  To stay flexible, so that the software keeps working correctly as hardware and requirements change, it can be beneficial to write the program so that it runs correctly on a variety of platforms.

The Challenges of Cross-Platform Development

Most office computers today run on x86 platforms.  But in embedded systems, developers tend to write code for platforms that differ greatly from this office standard, such as ARM, which is common in smartphones.  Another example is SPARC, an instruction set architecture found in Sun Unix workstations in the 1990s. It is still used today in the space industry through LEON microprocessors, although it is no longer a readily available platform.

Even if your current development project doesn’t involve sending someone into space, developing for platforms that aren’t as readily available as x86 remains a challenge for the embedded community.  It is important to test program execution on each target platform because platforms can differ widely in how they manage and represent data. A program can exhibit undefined behavior on one platform and not on another, and the same variable can be represented differently in memory from one platform to the next.
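As a rough illustration of what "representing data differently" means, the following C sketch reports two properties of whatever platform it is compiled for: the width of unsigned long and the byte order. The function names are hypothetical and are not part of any tool discussed in this article.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the lowest-addressed byte of a 32-bit integer holds its
   least significant byte (little endian), 0 otherwise. */
int is_little_endian(void) {
    uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1); /* read the byte stored at the lowest address */
    return first == 0x04;
}

/* Returns the width of unsigned long in bits on the compilation target. */
unsigned ulong_width_bits(void) {
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Prints a short summary of the target's data model. */
void print_data_model(void) {
    printf("unsigned long: %u bits\n", ulong_width_bits());
    printf("byte order:    %s\n",
           is_little_endian() ? "little endian" : "not little endian");
}
```

Calling print_data_model() only describes the machine the code was built for. Answering the same question for a different target requires that target or a tool that models it.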

 So how can a developer test the execution of their program on a variety of platforms?

The current solution and why it’s inefficient

Developers who want to test the execution of their software on platforms they cannot easily access often turn to development boards. These function like small computers built around a microprocessor, and often include little else to facilitate execution.

Depending on the platform, development boards can be hard to find, especially in the quantities needed for a team of developers.  They can also be expensive as a result, which might deter some developers from testing program execution at all.  Additionally, development boards are not particularly practical to use.  The developer must first compile the program separately for the board's architecture before running it on the board, which is cumbersome and slows down the process. What’s more, since a development board is basically a stripped-down computer consisting essentially of the processor, it may not include other tools such as a debugger.  Debugging is an important part of testing execution on different platforms because, as previously mentioned, undefined behaviors may appear on one platform and not on another.

 If development boards are so inconvenient, why hasn’t a better solution been developed?

The solution you didn’t know about

Beyond its other merits, including strengthening existing software tests to exhaustively detect vulnerabilities in source code, TrustInSoft Analyzer is also able to simulate the execution of programs on different platforms. These include platforms that are difficult to find today and for which procuring a development board could prove difficult.  Instead of using several different development boards to test program execution, developers can test multiple platforms with a single command-line operation in TrustInSoft Analyzer.  This all-in-one solution saves development time: there is no need to wait for development boards to arrive or to compile the program beforehand.  A full list of the platforms currently available for analysis is given in the appendix.

While it’s true that some development boards include debuggers, one of the key benefits of TrustInSoft Analyzer is its interactive GUI. It lets developers navigate through their source code and understand, with helpful commentary from TrustInSoft Analyzer, the root causes of any errors that occur during the execution of the program on different platforms.  Since debugging for embedded platforms can be tricky and differences between platforms can yield very different results, this feature helps ensure that potential problems with the execution are identified and easy to understand.

 Ready to see how it works? 

See for yourself

To show you the benefits of using TrustInSoft Analyzer for this purpose, we are running the code on TSnippet, an easy-to-use, free online version of TrustInSoft Analyzer that can analyze small snippets of code, including for multiple platform types.  You too can follow along with these examples by accessing TSnippet here.  You can also test out your own code on the different platforms already available for analysis.

 Example 1: Critical differences between platforms

Here we are analyzing a program using the popular x86_64 architecture, which you may find in most modern office computers today; for example, Intel chips generally implement x86_64. 

[Figure: TSnippet screenshot of the analysis on x86_64]

Here, unsigned long variables a and b are multiplied and the result, 0x1234002468, is stored in unsigned long c. The space reserved for c in memory is fixed in advance, whatever value it holds during execution. Because the program is being interpreted in a 64-bit platform context, unsigned long c takes 8 bytes of memory on x86_64, so the result of the multiplication can be represented in full.  We can also see that this x86 platform is “little endian”: it stores the variable in memory starting with the least significant byte, in this case 0x68.

 So, we know that the variable unsigned long c receives the mathematical result when the program is executed on an x86_64 platform.

Let’s change the platform in TSnippet to see what happens when the program runs on a new target. Specifically, what changes can TrustInSoft Analyzer detect when moving from a 64-bit architecture to a 32-bit one?

 TSnippet executes the program on sparc_32.  Mathematically, the true result of the unsigned multiplication a * b should be as we saw earlier: 0x1234002468.

However, when we change the platform to a 32-bit one, the multiplication result becomes 0x34002468.  Because unsigned long is only 32 bits wide on this platform, the mathematical result cannot be represented in its entirety.

Unsigned types can only take on non-negative values. One of the various odd rules of the C language is that, for an unsigned type, a result that cannot be represented is not an error. Here the multiplication is unsigned, so this rule applies to the out-of-range result. In C, unsigned values behave rather like a vehicle odometer that rolls over once it passes 99,999 kilometers: an odometer limited to 5 digits cannot show 100,000 and displays 00000 instead. Similarly, since unsigned long is limited to 32 bits on this platform and the mathematical result of the multiplication needs more than 32 bits, the most significant binary digits are lost. In other words, the result is reduced modulo 2^32.
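The rollover can be reproduced on any machine by using fixed-width types as stand-ins for unsigned long on each target. This is only a sketch: the operands in the original screenshot are not reproduced in the text, so the values 0x1234 and 0x1000002 used in the comments are assumptions chosen because their product is 0x1234002468. The function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for `unsigned long` on x86_64, where it is 64 bits wide. */
uint64_t mul_as_64bit_ulong(uint64_t a, uint64_t b) {
    return a * b; /* 0x1234 * 0x1000002 fits: 0x1234002468 */
}

/* Stand-in for `unsigned long` on sparc_32, where it is 32 bits wide.
   Assumes int is no wider than 32 bits, so no promotion to int occurs. */
uint32_t mul_as_32bit_ulong(uint32_t a, uint32_t b) {
    return a * b; /* reduced modulo 2^32: 0x34002468 */
}
```

With the assumed operands, the 64-bit version returns 0x1234002468 and the 32-bit version returns 0x34002468, matching the two results described above.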

[Figure: odometer]

In comparison, signed overflow does not follow this “rolling over” rule. When a signed operation overflows, the behavior is undefined, and it typically goes unnoticed because nothing signals it. TrustInSoft Analyzer, however, is able to detect this kind of bug using its formal methods-based approach.
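Because a signed overflow is not signaled, code that must avoid it has to check before the overflow happens. The sketch below shows one common defensive pattern in C, computing the product in a wider type and testing the range. It illustrates how a programmer might guard a signed multiplication, not how TrustInSoft Analyzer detects the problem. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stores a * b in *out and returns true if the product fits in int32_t.
   Returns false and leaves *out untouched otherwise. The multiplication
   is done in 64 bits, where the product of two 32-bit values always fits. */
bool checked_mul_i32(int32_t a, int32_t b, int32_t *out) {
    int64_t wide = (int64_t)a * (int64_t)b;
    if (wide < INT32_MIN || wide > INT32_MAX) {
        return false; /* the 32-bit signed multiplication would overflow */
    }
    *out = (int32_t)wide;
    return true;
}
```

A caller tests the return value before using the result, so the out-of-range case is handled explicitly instead of silently invoking undefined behavior.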

Here as well, TrustInSoft Analyzer knows that sparc_32 is a big endian platform and thus represents the variable unsigned long c starting with the most significant byte, 0x34.
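The two byte orders can be compared side by side with a small C sketch that writes a value into a byte array in each layout. The helper names are hypothetical, and the code models the memory layout explicitly instead of depending on the byte order of the machine it runs on.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the low `width` bytes (at most 8) of `value` into `out`,
   least significant byte first, as a little endian platform would. */
void store_little_endian(uint64_t value, unsigned char *out, size_t width) {
    for (size_t i = 0; i < width; i++) {
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFF);
    }
}

/* Writes the low `width` bytes (at most 8) of `value` into `out`,
   most significant byte first, as a big endian platform would. */
void store_big_endian(uint64_t value, unsigned char *out, size_t width) {
    for (size_t i = 0; i < width; i++) {
        out[i] = (unsigned char)((value >> (8 * (width - 1 - i))) & 0xFF);
    }
}
```

Storing 0x1234002468 in 8 bytes with store_little_endian gives 68 24 00 34 12 00 00 00, which corresponds to the x86_64 case. Storing 0x34002468 in 4 bytes with store_big_endian gives 34 00 24 68, which corresponds to the sparc_32 case.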

 Example 2: Undefined behaviors that occur with certain supported platforms and not others

 Oftentimes a program is developed with the intent to support multiple platforms beyond the most popular.  Of course, a developer is limited when it comes to trying to support all possible platforms, as different platforms have different attributes, as we clearly saw in example one.  But what if a program is written to support a given platform, but during the execution of this program, an error occurs?

This was the case for bitcoin-core, an open-source project on GitHub. When the project was run through TrustInSoft Analyzer, the analysis identified no undefined behaviors on any of its supported platforms, with the exception of the 16-bit x86_16 architecture. On this supported platform, and only this one, TrustInSoft Analyzer detected two signed overflow undefined behaviors, the first of which you can see with TSnippet here.

[Figure: TSnippet screenshot of the bitcoin-core error]

On the left-hand side of this screenshot is the source code, and on the right-hand side is the internal representation on which the analyzer works. On the x86_16 platform, the variable b is “promoted” to int before anything else happens. This “promotion” operation, mandated by yet another rule of the C standard, is made explicit in the internal representation. On this platform the type int is also only 16 bits wide, which makes the “left shift” operation a signed overflow. On more usual platforms, the type int is wider and the operation is safe. On a very exotic platform where the types char and int are both 16 bits wide, the variable b of type unsigned char would be promoted to unsigned int, and the shift would be safe by virtue of being an unsigned operation. This shows how difficult it can be, when writing C code, to keep in mind what can happen on different platforms.
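The pattern can be sketched in C as follows. This is not the bitcoin-core code, which is only visible in the screenshot. The function names and the shift amount of 8 are assumptions made for illustration. The second function shows one common way to make such a shift independent of the width of int, by converting to an unsigned type before shifting.

```c
#include <stdint.h>

/* Hypothetical helper: hi is promoted to int before the shift.
   Where int is 16 bits wide, hi << 8 overflows for hi >= 0x80,
   which is undefined behavior. Where int is 32 bits it is safe. */
uint16_t pack_bytes_risky(unsigned char hi, unsigned char lo) {
    return (uint16_t)((hi << 8) | lo);
}

/* Same computation with the shift performed on unsigned int.
   unsigned int is at least 16 bits wide, so hi << 8 always fits
   and the operation is an unsigned one on every platform. */
uint16_t pack_bytes_portable(unsigned char hi, unsigned char lo) {
    return (uint16_t)(((unsigned int)hi << 8) | lo);
}
```

On a desktop machine both functions return the same values, which is exactly why this kind of bug is hard to notice without testing on, or simulating, the narrower platform.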

 This error is naturally only a bug if the program was in fact designed to support the platform in question.  This was the case for bitcoin-core, and the bug has since been corrected by the project’s maintainers.

Conclusion

Ensuring the functionality of code on new and different platforms is often considered a hassle, but it doesn’t have to be! TrustInSoft Analyzer is able to virtually simulate a wide range of platforms to help you identify if program execution will vary from platform to platform and if there are any undefined behaviors in changing the platform.  This means you have all of the platforms and analysis tools you need at your fingertips, without needing to compile and run the program on a series of development boards.

Appendix

List of platforms currently available for analysis with TrustInSoft Analyzer: aarch64, aarch64eb, arm eabi, armeb eabi, mips_64, mips_n32, mips_o32, mipsel_64, mipsel_n32, mipsel_o32, ppc_32, ppc_64, rv32ifdq, rv64ifdq, sparc_32, sparc_64, x86_16, x86_16_huge, x86_32, x86_64, x86_win32, x86_win64.  More platforms can be added upon request.

Acknowledgements

Special thanks to Pascal Cuoq, Chief Scientist at TrustInSoft and co-author of this article.
