Trusted-CPP Documentation

Overview

The Trusted-CPP project demonstrates an implementation of the concept of guarantees for safe software development in C++ at the language syntax level, while preserving backward compatibility with existing source code.

This makes it possible to change the approach to software safety from a point-by-point search and fixing of individual bugs and vulnerabilities to guaranteeing their absence in the program source code. In other words, if the code compiled correctly, then certain classes of errors are absent in it, and therefore the vulnerabilities associated with them are absent as well.

The Trusted-CPP project consists of two components. The first component is the header file trusted-cpp.h. It contains template classes and the main settings required for the safe code analyzer. The second component is a static C++ code analyzer implemented as a plugin for the clang compiler.

Moreover, the source code can be compiled by any other compiler, without the plugin, because the plugin is needed only for static analysis and does not modify the executable file generated by the compiler in any way.

Trusted-CPP implementation details:


A simple example to get started

Download the header file, the compiler plugin, and an auxiliary launcher script that simplifies using the plugin.

The helper script trusted-cpp.sh automatically adds clang arguments to load the plugin and pass it command-line arguments, so you don’t have to write something like this every time:

    $ clang++ -Xclang -load -Xclang ./trusted-cpp_clang.so -Xclang -add-plugin -Xclang trust -Xclang -plugin-arg-trust -Xclang verbose example.cpp

and can replace it with a simple call trusted-cpp.sh -trust verbose example.cpp

When the file invalidate.cpp, which contains the following code, is compiled without the plugin, no errors are reported.

    std::vector vect(100000, 0);
    auto x = vect.begin();
    auto &y = vect[0];
    vect = {};
    std::sort(x, vect.end()); // Error
    y += 1; // Error

With the plugin, the following warnings and errors about the invalidation of reference variables are printed:

    $ ./trusted-cpp.sh -std=c++20 -fsyntax-only invalidate.cpp

    ../invalidate.cpp:26:5: warning: using main variable 'vect'
       26 |     vect = {};
          |     ^
    ../invalidate.cpp:31:15: error: Using the dependent variable 'x' after changing the main variable 'vect'!
       31 |     std::sort(x, vect.end()); // Error
          |               ^
    ../invalidate.cpp:36:5: error: Using the dependent variable 'y' after changing the main variable 'vect'!
       36 |     y += 1; // Error
          |     ^
    3 warnings and 2 errors generated.
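
One way to avoid this class of error, shown here as a minimal sketch in standard C++, is to obtain iterators and references only after the last modification of the container. The function name `reset_and_bump` is hypothetical, and whether the plugin accepts exactly this form without diagnostics is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: replaces the contents of 'vect' and then works with it.
// Iterators and references are taken only AFTER the last modification,
// so none of them refers to storage that has been invalidated.
int reset_and_bump(std::vector<int> &vect) {
    vect = {3, 1, 2};                 // modifies the main variable

    auto first = vect.begin();        // dependent variable created afterwards
    std::sort(first, vect.end());     // 'first' is still valid here

    assert(!vect.empty());
    auto &front = vect[0];            // reference into the current storage
    front += 1;
    return front;
}
```

The same reasoning applies to any operation that may reallocate or clear the container: dependent variables created before it should not be used after it.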

Description of the safe C++ development concept

The project is based on the concept of safe memory management from the NewLang (trust-lang) language, which was ported to C++ as a separate memsafe library and later extended, including by implementing part of the STEELMAN requirements adapted for C++.

By the term “safe development in C++” we mean:

Since any solution for safe C++ must be economically viable, it must provide backward compatibility with existing C++ code, detect errors at the compilation stage (i.e., as close as possible to the moment the code is written), and use automatic control at the source level, similar to the safe development facilities built directly into a language.

In other words, if the C++ source code compiled correctly, then it contains no errors, and therefore no vulnerabilities due to:

Guarantees of safe software development at the language syntax level are implemented by the C++ compiler by automatically checking the program source text and imposing restrictions on the use of certain code fragments, which are syntactically correct, but may lead to runtime errors or vulnerabilities.

The analysis of Trusted-CPP source code is performed by the compiler plugin, but the source code itself remains an ordinary C++ program. It can be compiled by any compiler without the plugin, and linters and additional static analyzers can be used with it as well.

Source code analysis in Trusted-CPP is based on marking elements with user-defined C++ attributes, which appeared in C++11. This is very similar to the safety profile proposals P3038 by Bjarne Stroustrup and P3081 by Herb Sutter, but does not require developing a new C++ standard.

At the moment, when the plugin is loaded, the checking of syntactic rules is activated automatically during compilation by means of the built-in function __has_attribute(trust). If the plugin is absent during compilation, the user-defined attributes are disabled by preprocessor macros in order to suppress warnings such as warning: unknown attribute 'trust' ignored [-Wunknown-attributes].
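
The detection mechanism described above can be sketched as follows. This is only an illustration, not the contents of trusted-cpp.h: the names `DEMO_TRUST` and `DEMO_PLUGIN_ACTIVE` are hypothetical, and the attribute spelling `[[trust(...)]]` is an assumption.

```cpp
#include <cassert>

// Treat the attribute as unavailable on compilers without __has_attribute.
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// DEMO_TRUST and DEMO_PLUGIN_ACTIVE are hypothetical names for this sketch.
#if __has_attribute(trust)
#define DEMO_TRUST(...) [[trust(__VA_ARGS__)]]  // assumed attribute spelling
#define DEMO_PLUGIN_ACTIVE 1
#else
#define DEMO_TRUST(...)                         // expands to nothing: no warning
#define DEMO_PLUGIN_ACTIVE 0
#endif

// Without the plugin the marker disappears and this is an ordinary function.
DEMO_TRUST("thread") int worker(int x) { return x + 1; }

// Reports which branch of the detection was taken at compile time.
constexpr bool plugin_active() { return DEMO_PLUGIN_ACTIVE != 0; }

void self_check() { assert(worker(41) == 42); }
```

Because the marker expands to nothing when the attribute is unknown, the same source builds without warnings on any compiler, which matches the behavior described above.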

A fundamental feature of Trusted-CPP is the ability to mark various elements using user-defined C++ attributes not only when defining a class or function, but also at an arbitrary place in the program source text (or even in an external configuration file). This feature makes it possible to mark important classes, functions, or their arguments after they have been defined and without having to change previously written code.


Nominal (named) typedef typing

In C/C++, a typedef declaration creates an alias for an existing type, and the type and its alias are not distinguished during type conversion. Adding nominal typing for typedef prevents the accidental type equivalence that structural typing allows: two variables are type-compatible if and only if their declarations name the same type.

Code example using nominal (named) typedef typing

Safe memory management

Safe memory management is fully compatible with C++ code and STL templates and is implemented by using the following kinds of addressing (address variables):

The main difference between strong and weak references and the corresponding standard templates is how the object is accessed when a reference is dereferenced: a temporary variable is created, and direct access to the data (object) is then obtained through it. A code example is given below.

All other variables that store data by value (variables by value) cannot create references. To create references, you must use a smart pointer (shared or weak reference).
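
The access pattern described above can be approximated with the standard smart pointers. The sketch below uses only `std::shared_ptr` and `std::weak_ptr`, not the Trusted-CPP templates, and the function names `read_through_weak` and `demo_weak_access` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Reads the value behind a weak reference if the object is still alive.
std::optional<int> read_through_weak(const std::weak_ptr<int> &weak) {
    // lock() creates a temporary strong owner; the object cannot be
    // destroyed while 'owner' is in scope.
    if (std::shared_ptr<int> owner = weak.lock()) {
        return *owner;        // direct access goes through the temporary
    }
    return std::nullopt;      // the object has already been destroyed
}

int demo_weak_access() {
    auto strong = std::make_shared<int>(41);
    std::weak_ptr<int> weak = strong;

    std::optional<int> alive = read_through_weak(weak);
    assert(alive.has_value());

    strong.reset();           // the last strong owner is released
    assert(!read_through_weak(weak).has_value());
    return *alive + 1;
}
```

The temporary owner is what keeps the access safe: the data is never reached through the weak reference directly.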

Control of multithreaded access to data

Safe multithreaded programming is the automatic elimination of problems that lead to a “data race condition”.

To minimize logical errors when acquiring a synchronization object (if this is required for variables with controlled multithreaded access), the attempt to acquire access and the dereferencing of a reference are performed as a single operation.

An automatic data access variable is not only a temporary owner of a strong reference. It also owns the inter-thread synchronization object in the style of std::lock_guard: its lifetime is limited to the current scope and is managed by the compiler automatically.
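
The idea of combining lock acquisition and dereferencing into one operation can be sketched in standard C++ as follows. `Guarded` and its nested `Access` class are hypothetical names used for illustration only; they are not the Trusted-CPP templates.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical wrapper: the value can be reached only through lock().
template <typename T>
class Guarded {
public:
    explicit Guarded(T value) : value_(std::move(value)) {}

    // Temporary owner of the lock; the mutex is released when it goes out of scope.
    class Access {
    public:
        Access(std::mutex &m, T &v) : lock_(m), value_(v) {}
        T &operator*() { return value_; }
        T *operator->() { return &value_; }
    private:
        std::unique_lock<std::mutex> lock_;
        T &value_;
    };

    // Acquiring the mutex and obtaining access happen in a single step.
    Access lock() { return Access(mutex_, value_); }

private:
    std::mutex mutex_;
    T value_;
};

int demo_guarded() {
    Guarded<int> counter(1);
    {
        auto access = counter.lock();   // mutex is held from here...
        *access += 41;
        assert(*access == 42);
    }                                   // ...and released here
    return *counter.lock();             // temporary Access lives for this expression
}
```

Since there is no way to reach the value without going through `lock()`, forgetting to take the mutex is ruled out by construction.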

Open and closed variable scopes

The implementation of safe multithreaded programming is based on the STEELMAN requirements using open and closed scopes for external variables. In fact, this is an implementation of item 5G from the STEELMAN requirements, only refined for C++ and OOP.

In open scopes there are no restrictions on the use of external variables, whereas in closed scopes non-local variables must be explicitly imported (listed). The scope and the list of imported external variables for nested scopes are inherited from higher-level scopes until explicitly overridden.

Closed scopes are essentially an inversion of OOP (OOP the other way around): it is not the object’s internal data that is hidden from the external environment, but vice versa. Within the current scope, access to variables from the external environment is restricted and becomes possible only after they are explicitly listed in the program source code. (The same basis is used for pure functions, where the external environment becomes unavailable from the function body.)

Creating a closed scope or redefining it is done using the macro TRUST_USING_EXTERNAL(""), which is applied to the definition of a function, class, class method, or an individual expression. An example of using closed scopes can be seen in the examples below.

For the purposes of safe multithreaded access in C++, the following scheme is implemented:

Attribute marking is done once and is inherited by derived classes. The compiler then automatically tracks correct usage: when an execution thread is created, its body must be marked with the THREAD attribute, and the arguments passed into the thread must be THREADSAFE.

In addition, the thread function becomes closed for access to external variables, and the analyzer (the compiler via the plugin) automatically checks that any external variables imported into a function with the THREAD attribute have the THREADSAFE attribute.

An example of controlling multithreaded access to data is given below

Additional features

Macro restrictions in the exported interface of a C++ module

The differences between the two modes are as follows. Legacy macro processing mode is used for all files except C++ module files (i.e., macro processing happens as before - only at the preprocessor level), whereas the new macro processing mode is intended exclusively for C++ modules, and macro processing itself is performed with AST awareness.

Manual and automatic stack overflow checking

Manual and automatic stack overflow checking is the only functionality that changes the code generated when the program is compiled. For this reason it was split off into a separate project, stack-check, which can be used both together with Trusted-CPP and without it.

Formal proof analysis of program correctness *

In fact, this is an implementation of static checking of dynamic AoRTE (“Absence of Run-Time Errors”) expressions that does not produce false negatives, although false positives are possible. That is, if there are no compilation errors, you can be sure that the checked problems are absent from the code, whereas an indication of a possible error does not always correspond to reality: the tool may be wrong.

Formal analysis does not try to prove the correctness of the program as a whole. It is used only to prove user-defined assertions in different parts of the program and in function calls. Moreover, correctness is proved only to the extent defined by the user, on the assumption that the assertions themselves correctly and fully describe and constrain the program implementation.

Formal proof analysis of program correctness is implemented following the gnatprove principle for the Ada language and uses three macros to define preconditions, postconditions, and assertions: TRUST_ASSERT_PRED(), TRUST_ASSERT_POST(), and TRUST_ASSERT() respectively. This is something between assert and static_assert, which is evaluated during program compilation, but the expression may use non-constant values (non-constant expressions must be computable at the data type level or described in pre- and postconditions).
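
Since this functionality is only planned, the exact form of these macros is not known. As a rough illustration of the kind of contract they are meant to express, the sketch below uses hypothetical `DEMO_PRE`, `DEMO_POST`, and `DEMO_ASSERT` macros that simply map to the runtime `assert`; a static prover would instead try to discharge the same conditions at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the planned macros, checked at run time here.
#define DEMO_PRE(cond)    assert(cond)
#define DEMO_POST(cond)   assert(cond)
#define DEMO_ASSERT(cond) assert(cond)

// Returns the element at 'index'; the precondition rules out out-of-range access.
int element_at(const std::vector<int> &data, std::size_t index) {
    DEMO_PRE(index < data.size());
    return data[index];
}

// Clamps 'value' into [lo, hi]; the postcondition states the guaranteed range.
int clamp_to(int value, int lo, int hi) {
    DEMO_PRE(lo <= hi);
    int result = value < lo ? lo : (value > hi ? hi : value);
    DEMO_POST(lo <= result && result <= hi);
    return result;
}

int demo_contracts() {
    std::vector<int> data{10, 20, 30};
    int idx = clamp_to(7, 0, 2);   // postcondition gives 0 <= idx <= 2
    DEMO_ASSERT(idx >= 0);
    // The precondition of element_at follows from the postcondition above.
    return element_at(data, static_cast<std::size_t>(idx));
}
```

The postcondition of `clamp_to` is exactly what is needed to establish the precondition of `element_at`, which is the kind of chain a prover would follow.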


*) - This functionality is planned for implementation, but is currently paused until the main part of the project is completed

Code examples

Code example using nominal (named) typedef typing

A data type with nominal typing is created with an attribute, which is either applied through the TRUST_NOMINAL macro at the type definition or set for an existing type listed in the TRUST_NOMINAL_TYPES(...) macro.

    typedef int IntType;

    int int_value = 0;
    IntType IntType_value = 0;
    int int_value_cast = IntType_value;  // OK
    IntType IntType_cast = int_value; // OK

    TRUST_NOMINAL typedef int IntSubType; // Nominal typing at type definition time

    IntSubType IntSubType_value = 0; // OK
    int int_value_cast2 = IntSubType_value; // OK
    IntSubType IntSubType_cast = int_value; // ERROR

    IntType IntType_cast2 = IntSubType_value; // ERROR

    TRUST_NOMINAL_TYPES("IntType"); // Nominal typing for an existing type

    IntType IntType_value2 = 0;
    IntType IntType_cast3 = int_value; // ERROR
    IntType IntType_cast4 = IntSubType_value; // ERROR
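
For comparison, a similar distinction in standard C++ without the plugin requires a wrapper type. The sketch below is a plain C++ approximation, not the Trusted-CPP mechanism, and it is stricter than the example above because it also forbids implicit conversion back to `int`. The names `Nominal`, `MetersTag`, and `SecondsTag` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper: each Tag produces a distinct, non-interchangeable type.
template <typename T, typename Tag>
class Nominal {
public:
    explicit constexpr Nominal(T value) : value_(value) {}
    constexpr T get() const { return value_; }
private:
    T value_;
};

struct MetersTag {};
struct SecondsTag {};
using Meters  = Nominal<int, MetersTag>;
using Seconds = Nominal<int, SecondsTag>;

// An ordinary alias stays interchangeable with its underlying type...
typedef int IntType;
static_assert(std::is_same<IntType, int>::value, "alias is the same type");

// ...whereas the wrappers convert neither from int nor from each other.
static_assert(!std::is_convertible<int, Meters>::value, "no implicit int -> Meters");
static_assert(!std::is_convertible<Meters, Seconds>::value, "no Meters -> Seconds");

int total_meters(Meters a, Meters b) { return a.get() + b.get(); }

int demo_nominal() {
    Meters m(40);
    assert(m.get() == 40);
    return total_meters(m, Meters(2));
}
```

The wrapper has to be adopted at every use site, whereas the attribute-based approach leaves existing declarations unchanged.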

Example of using closed scopes

To specify imported variables, you can use a mask for the variable name or the namespace:

    int global = 0;

    int func_default() {
        // Open scopes by default
        return global;
    }

    TRUST_USING_EXTERNAL("") // Forbid access to all external variables
    int func_closed() {
        // In fact, it is a pure function with no side effects.
        return global; // ERROR
    }

    TRUST_USING_EXTERNAL("global") // Access is allowed only to the variable global
    int func_using_external() {
        return global; // OK
    }

Applying safe multithreaded programming

One-time marking of functions and classes that create separate execution threads.

    // Set the 'thread' attribute for the first constructor argument of the std::thread and std::jthread classes
    TRUST_SET_ATTR_ARGS(thread, std::thread, 1);
    TRUST_SET_ATTR_ARGS(thread, std::jthread, 1);

    // Set the 'threadsafe' attribute for all constructor arguments of the std::thread and std::jthread classes
    TRUST_SET_ATTR_ARGS(threadsafe, std::thread::thread, 0);
    TRUST_SET_ATTR_ARGS(threadsafe, std::jthread::jthread, 0);

    // Mark pthread_create arguments with attributes:
    // 'thread' for the third argument and 'threadsafe' for the fourth
    TRUST_SET_ATTR_ARGS(thread, pthread_create, 3);
    TRUST_SET_ATTR_ARGS(threadsafe, pthread_create, 4);

    // Set the 'threadsafe' attribute for thread-safe templates
    TRUST_SET_ATTR(threadsafe, std::atomic);
    TRUST_SET_ATTR(threadsafe, trust::SyncTimedShared);

Code example that controls a thread function against a “race condition”

    uint64_t notrust_count = 0; // Without setting the THREADSAFE attribute
    void *thread_notrust(void *arg) { // Thread without the THREAD attribute
        ++notrust_count;              // Race
        return nullptr;
    }

    std::atomic<uint64_t> trust_count = 0; // Automatic THREADSAFE marking for std::atomic
    TRUST_THREAD void *thread_trust(void *arg) { // Thread function (marked with THREAD attribute)
        trust_count++;
        notrust_count++; // ERROR: Expected attribute 'threadsafe' for 'notrust_count'
        return nullptr;
    }
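
Because the markers are ignored without the plugin, the safe variant of such code can still be built and run with any compiler. The sketch below is self-contained: it defines `TRUST_THREAD` as an empty macro only when the header has not provided it, which is an assumption made to keep the example compilable on its own, and `run_workers` is a hypothetical helper.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: without the real header, an empty macro keeps the sketch compilable.
#ifndef TRUST_THREAD
#define TRUST_THREAD
#endif

std::atomic<std::uint64_t> trust_count{0};   // atomic counter, safe to share

TRUST_THREAD void thread_trust(int iterations) {
    for (int i = 0; i < iterations; ++i)
        ++trust_count;                        // atomic increment, no data race
}

// Hypothetical helper: starts 'threads' workers and returns the final count.
std::uint64_t run_workers(int threads, int iterations) {
    trust_count = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(thread_trust, iterations);
    for (auto &th : pool)
        th.join();
    assert(trust_count.load() ==
           static_cast<std::uint64_t>(threads) * static_cast<std::uint64_t>(iterations));
    return trust_count.load();
}
```

With a plain `uint64_t` counter the final value would not be reliable, which is the situation the THREADSAFE check is meant to reject at compile time.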

Code example of creating threads with control of potential “data race condition” errors


    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_t tid;

    pthread_create(&tid, &attr, thread_notrust, nullptr); // ERROR
    // error: Expected attribute: 'thread' for 3 argument
    //     pthread_create(&tid, &attr, thread_notrust, nullptr);
    //                                 ^

    pthread_create(&tid, &attr, thread_trust, nullptr); // OK

    {
        std::thread t_lambda([&]() {
            for (auto i = 0; i < 10'000'000; ++i)
                ++notrust_count; // ERROR
            // error: Expected attribute 'threadsafe' for 'notrust_count'
            //              ++notrust_count;
            //                ^
        });

        std::thread t_notrust(thread_notrust, nullptr); // ERROR
        // error: Expected attribute: 'thread' for 1 argument
        //        std::thread t_notrust(thread_notrust, nullptr);
        //                              ^

        std::thread t_trust(thread_trust, nullptr);

        t_lambda.join();
        t_notrust.join();
        t_trust.join();
    }

Example of dereferencing a reference and acquiring an access lock for smart pointers


    trust::Shared<int, trust::SyncTimedMutex> var_sync(1); // derived from std::shared_ptr
    trust::Weak< trust::Shared<int, trust::SyncTimedMutex> > var_weak = var_sync.weak(); // derived from std::weak_ptr


    TRUST_THREAD void func_thread(){
        try{
            // Cannot capture into a static variable
            // static auto static_fail1(var_sync.lock());

            auto sync = var_sync.lock();  // or *var_sync
            auto weak = var_weak.lock();  // or *var_weak

            *sync += *weak;

        } catch(...){
            // Handle access lock or reference dereference error
        }
    }

Conclusions and summary

This document gives a brief description of the project, with examples that solve simple, well-understood tasks of safe programming in C++.

The project is not production-ready yet. It most likely has flaws and omissions, since many other interesting and complex situations are not covered here. We will be glad to receive your suggestions if you want to add or improve something.