Rust/C++ Interop Part 1 - Just the Basics

NOTE

This is a blog post covering the content I prepared for the Boulder Rust Meetup.

§ A Collective Craft

Before we talk about how, we must discuss why. I will presume these things:

This is the situation I find myself in. The primary objections I heard about using Rust in projects at work are social and not technical. Here are some that I have heard:

Using one language makes the project more accessible for everyone.

Using multiple languages will make it harder for me to contribute.

If we write it in Rust, it will make hiring people to work on this project challenging.

Research has demonstrated that diversity leads to better outcomes. It is also better for our projects. The underlying sentiment in these statements is the fear of the other. Instead of criticizing coworkers for their fear, I have found it helpful to focus on how building a heterogeneous codebase can lead to a better project.

Programming is a craft, and the quality of the outcome of our work is improved by working together from different perspectives. C++, the projects built in it, and the programmers who use it have value. Rust brings new ideas and perspectives, and building a bridge within our projects can lead to better codebases than they were as homogeneous projects.

Golden Gate Bridge, San Francisco

Writing software is a team sport where we want to welcome a diversity of ideas and approaches to find the best solutions to any given problem. Even if you wanted to rewrite a large C++ project into Rust, that is unlikely to be possible given project timelines and the makeup of your team. If you have a C++ codebase, you likely have C++ programmers as coworkers, and building a bridge will make you more likely to win their support.

§ Code Generation

A handful of well-known projects aim to automate creating bridges to and from C++.

I wanted interfaces involving library types that feel first class in both Rust and C++, and none of these tools met that requirement. At PickNik we write robotics code, and much of our C++ code uses Eigen types. In Rust I wanted to use nalgebra types to represent the same concepts.

§ On the Shoulders of Giants

OptIk is the project I learned much of this from. Look at it for a complete example.

Leonard P. Zakim Bunker Hill Memorial Bridge, Boston

§ System Design

Interop to C++ is done via the classic hourglass approach. We bridge the Rust library to C and create C++ types that safely use the C interface. This is the same way interop to other languages such as Python works.

Hourglass Pattern

To call Rust functions that take Rust objects as arguments, the C++ code needs a way to hold those Rust objects. To do this, we create Rust objects on the heap and leak pointers to them to the C++ code. We also include functions in Rust that destroy these objects, given a pointer to one. On the C++ side, a class holds the opaque pointer to the Rust object as a member and frees it in its destructor. One important reason this is necessary is that allocators and deallocators come in pairs. It is not valid to destroy a Rust object with the C++ deallocator or vice versa.

In cases where we must create C++ library types from Rust library types, such as making an Eigen::Isometry3d from a nalgebra::geometry::Isometry3, we must copy the underlying data instead of sharing the memory. This is because, in C++, we cannot extend a library type to handle the destruction of the underlying memory using a different deallocator.

In the particular case of the Rust homogeneous transform type nalgebra::geometry::Isometry3, the transform can be written as a 4x4 matrix of doubles, which fits in a single array of 16 doubles. A fixed-size array is something that we can pass across the FFI boundary. We’ll take advantage of this to avoid making extra copies or allocations.

Fremont Bridge, Portland

We also need to integrate with a C++ build system. As the C++ code at my work uses CMake, I will link to an example showing how to make this C++ project consumable by other CMake projects.

I will separate my project into two Rust crates (packages) for code layout.

§ Custom Opaque Types

Given this Rust struct and factory function, we must create a C interface.

pub struct Joint {
    name: String,
    parent_link_to_joint_origin: Isometry3<f64>,
}

impl Joint {
    pub fn new() -> Self;
}

Over in robot_joint-cpp, I create a lib.rs with these details.

use robot_joint::Joint;

#[no_mangle]
extern "C" fn robot_joint_new() -> *mut Joint {
    Box::into_raw(Box::new(Joint::new()))
}

#[no_mangle]
extern "C" fn robot_joint_free(joint: *mut Joint) {
    unsafe {
        drop(Box::from_raw(joint));
    }
}

Each function needs the #[no_mangle] attribute to turn off Rust name mangling and extern "C" to give the function the C calling convention. Box::into_raw(Box::new(...)) creates a Rust object on the heap and leaks a pointer to it. Lastly, drop(Box::from_raw(...)) takes that pointer, converts it back into a Box, and destroys it, which must happen exactly once per pointer.
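To make the ownership hand-off concrete, here is a minimal sketch that only uses the standard library. The names Tracked, tracked_new, tracked_free, and drop_count are hypothetical stand-ins for this illustration and are not part of robot_joint. It counts destructor calls to show that the object stays alive while only the raw pointer exists and is dropped once when the pointer is handed back. The null check is an extra guard I am assuming you might want, and the #[no_mangle] attribute is left out because nothing outside this snippet links against it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Tracked` values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a type like `Joint`.
pub struct Tracked {
    pub id: u32,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Allocate with the Rust allocator and hand ownership out as a raw pointer.
// (A real exported function would also carry the no_mangle attribute.)
pub extern "C" fn tracked_new(id: u32) -> *mut Tracked {
    Box::into_raw(Box::new(Tracked { id }))
}

// Take ownership back and drop the value with the Rust allocator.
// A null pointer is treated as a no-op, similar to free(NULL) in C.
pub extern "C" fn tracked_free(ptr: *mut Tracked) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: the caller promises `ptr` came from `tracked_new`
    // and has not been freed before.
    unsafe { drop(Box::from_raw(ptr)) };
}

// Number of `Tracked` destructors that have run.
pub fn drop_count() -> usize {
    DROPS.load(Ordering::SeqCst)
}
```

Calling tracked_new leaves drop_count unchanged, and the matching tracked_free raises it by one. Freeing the same pointer twice would still be undefined behavior, which is why the C++ wrapper nulls out its pointer after a move.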

Next, we create a C++ header robot_joint.hpp.

namespace robot_joint {
namespace rust {
// Opaque type for holding pointer to rust object
struct Joint;
}

class Joint {
  public:
    Joint();
    ~Joint();

    // Disable copy as we cannot safely copy opaque pointers to rust objects.
    Joint(const Joint& other) = delete;
    Joint& operator=(const Joint& other) = delete;

    // Explicit move.
    Joint(Joint&& other);
    Joint& operator=(Joint&& other);

  private:
    rust::Joint* joint_ = nullptr;
};

}  // namespace robot_joint

Here, we create the source file for our C++ interface. Note how we use extern "C" to enable our C++ code to call the C functions from our Rust code. This is something we are manually keeping in sync. Had we used one of the previously linked-to code-generators, we would not have had to do this.

The constructor calls the Rust function that creates the Joint type and stores the pointer in the member joint_. The move constructor and move assignment operator make this C++ type safe to move by never leaving two copies of the internal pointer. Lastly, the destructor frees the Rust object behind joint_ by calling the Rust function, which drops it.

#include "robot_joint.hpp"

extern "C" {
extern robot_joint::rust::Joint* robot_joint_new();
extern void robot_joint_free(robot_joint::rust::Joint*);
}

namespace robot_joint {

Joint::Joint() : joint_(robot_joint_new()) {}

Joint::Joint(Joint&& other) : joint_(other.joint_) {
  other.joint_ = nullptr;
}

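Joint& Joint::operator=(Joint&& other) {
  if (this != &other) {
    // Free the Rust object we currently own before taking over the other one.
    if (joint_ != nullptr) robot_joint_free(joint_);
    joint_ = other.joint_;
    other.joint_ = nullptr;
  }
  return *this;
}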

Joint::~Joint() {
  if (joint_ != nullptr) {
    robot_joint_free(joint_);
  }
}

}  // namespace robot_joint

Lastly, the most challenging part is to make this compatible with CMake projects. I wrote a follow-on blog post about that subject.

§ First-class Library Types

Remember, I said I took the manual approach because I wanted an interface with Eigen types on the C++ side. Here is a simple example of how to accomplish that. Presume we have this Rust function on our Joint type.

impl Joint {
    pub fn calculate_transform(&self, variables: &[f64]) -> Isometry3<f64>;
}

We want to create a C++ interface like this.

class Joint {
  public:
    Eigen::Isometry3d calculate_transform(const Eigen::VectorXd& variables);
};

First, we must create the Rust FFI interface for this function.

use std::ffi::{c_double, c_uint};

#[repr(C)]
struct Mat4d {
    data: [c_double; 16],
}

#[no_mangle]
extern "C" fn robot_joint_calculate_transform(
    joint: *const Joint,
    variables: *const c_double,
    size: c_uint,
) -> Mat4d {
    unsafe {
        let joint = joint.as_ref().expect("Invalid pointer to Joint");
        let variables = std::slice::from_raw_parts(variables, size as usize);
        let transform = joint.calculate_transform(variables);
        Mat4d {
            data: transform.to_matrix().as_slice().try_into().unwrap(),
        }
    }
}

The C types we need for parameters come from the ffi module in the Rust standard library. Before calling the Rust calculate_transform, we must first construct the Rust types from the raw parameters: a reference from the Joint pointer and a slice from the pointer and length.
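One detail worth sketching is the pointer-and-length pair. std::slice::from_raw_parts requires a non-null, aligned pointer even when the length is zero, and I would not assume that an empty vector on the C++ side always hands over a non-null data pointer. The helper below, slice_from_ffi, is a hypothetical name for this illustration, and sum_variables is only a toy consumer. Treating a null pointer as an empty slice is one possible policy, not the only reasonable one.

```rust
use std::ffi::{c_double, c_uint};
use std::slice;

/// Hypothetical helper: view `len` doubles starting at `ptr` as a slice.
/// A null pointer or a zero length yields an empty slice.
///
/// # Safety
/// If `ptr` is non-null it must point to `len` readable, properly aligned
/// doubles that stay valid and unmodified for the lifetime `'a`.
pub unsafe fn slice_from_ffi<'a>(ptr: *const c_double, len: c_uint) -> &'a [c_double] {
    if ptr.is_null() || len == 0 {
        return &[];
    }
    unsafe { slice::from_raw_parts(ptr, len as usize) }
}

/// Toy C-ABI function that consumes the slice by summing its values.
/// (A real exported function would also carry the no_mangle attribute.)
pub extern "C" fn sum_variables(ptr: *const c_double, len: c_uint) -> c_double {
    // SAFETY: the caller upholds the contract documented on `slice_from_ffi`.
    let variables = unsafe { slice_from_ffi(ptr, len) };
    variables.iter().sum()
}
```

The same guard could sit in front of the from_raw_parts call in robot_joint_calculate_transform. Whether a null pointer with a non-zero length should instead be reported as an error is a design choice this sketch does not settle.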

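Interestingly, the return type relies on how fixed-size arrays behave in FFI. A fixed-size array such as [c_double; 16] carries its length in its type, so nothing about its size is stored at runtime, and its layout matches a C array of the same length. C cannot return a bare array by value, but it can return a struct that contains one, so we place the array in a struct and set its memory representation to C.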
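As a small check of that claim, the sketch below declares the same kind of struct and inspects its layout using only the standard library. mat4d_identity and mat4d_layout are hypothetical helpers for this illustration. The point is that the struct is exactly sixteen doubles with no hidden length field, so a C or C++ struct with a double data[16] member describes the same bytes.

```rust
use std::ffi::c_double;
use std::mem::{align_of, size_of};

// Same shape as the Mat4d in the post: sixteen doubles and nothing else.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mat4d {
    pub data: [c_double; 16],
}

// Hypothetical C-ABI function that returns the struct by value.
// (A real exported function would also carry the no_mangle attribute.)
pub extern "C" fn mat4d_identity() -> Mat4d {
    let mut data = [0.0; 16];
    for i in 0..4 {
        // Diagonal entries sit at 0, 5, 10, 15 in either storage order.
        data[i * 4 + i] = 1.0;
    }
    Mat4d { data }
}

// Hypothetical helper: (size, alignment) of Mat4d in bytes.
pub fn mat4d_layout() -> (usize, usize) {
    (size_of::<Mat4d>(), align_of::<Mat4d>())
}
```

On the C++ side, a static_assert(sizeof(Mat4d) == 16 * sizeof(double)) next to the matching struct would be a cheap way to notice if the two definitions ever drift apart.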

Then, we can write a C++ function that calls the C functions.

struct Mat4d {
  double data[16];
};

extern "C" {
extern struct Mat4d robot_joint_calculate_transform(
  const robot_joint::rust::Joint*, const double*, unsigned int);
}

namespace robot_joint {
Eigen::Isometry3d Joint::calculate_transform(const Eigen::VectorXd& variables)
{
  const auto rust_isometry = robot_joint_calculate_transform(
    joint_, variables.data(), static_cast<unsigned int>(variables.size()));
  Eigen::Isometry3d transform;
  // rust_isometry is const, so the map must be over a const matrix.
  transform.matrix() = Eigen::Map<const Eigen::Matrix4d>(rust_isometry.data);
  return transform;
}
}  // namespace robot_joint

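The Rust Mat4d type returned from robot_joint_calculate_transform contains a fixed-size array of sixteen doubles. Eigen::Map lets us view this array as a 4x4 Eigen matrix without copying it, and assigning that view to the matrix of an Isometry3d copies the values into the object we return. This works because nalgebra and Eigen both store matrices in column-major order by default.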
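Both nalgebra's as_slice and Eigen's default storage are column-major, which is why the sixteen doubles can be handed over without reordering. The sketch below spells out that index mapping using plain arrays. idx, translation_matrix, and translation_of are hypothetical names for this illustration and do not come from either library.

```rust
/// Column-major index of the element at (row, col) in a 4x4 matrix
/// stored as sixteen consecutive doubles.
pub fn idx(row: usize, col: usize) -> usize {
    col * 4 + row
}

/// Hypothetical helper: homogeneous transform with an identity rotation
/// and the given translation, laid out in column-major order.
pub fn translation_matrix(x: f64, y: f64, z: f64) -> [f64; 16] {
    let mut m = [0.0; 16];
    for i in 0..4 {
        m[idx(i, i)] = 1.0;
    }
    // The translation occupies the first three rows of the last column.
    m[idx(0, 3)] = x;
    m[idx(1, 3)] = y;
    m[idx(2, 3)] = z;
    m
}

/// Read the translation back out of a column-major 4x4 array.
pub fn translation_of(m: &[f64; 16]) -> [f64; 3] {
    [m[12], m[13], m[14]]
}
```

If one side used row-major storage instead, the translation would land at indices 3, 7, and 11, and the receiving side would read a transposed matrix. A test that sends a pure translation through the bindings is a simple way to catch that kind of mismatch.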

§ Conclusion

Building a bridge that creates excellent C++ and Rust interfaces is more straightforward than many think. You will likely have more trouble convincing your C++-loving coworkers to let you write code in Rust than doing the interop.

Code without tests should be considered broken. To trust all this unsafe C++ and Rust code, we should write tests that exercise all the code paths and run them with sanitizers. In a future post, I’ll show you how to use the excellent C++ Catch2 library to test your C++ bindings with the address and undefined behavior sanitizers. I wrote a follow-on about CMake Integration to explain how to do just that.

Red Cliff Bridge

§ Future Work

I also want to explore the idea of relying primarily on the cxx crate for interop and either building a C++ interface on top of it or extending its macros to handle types like Isometry3. The significant upside is that I can reduce the amount of manually written unsafe code.

Next: C++ Interop Part 2 - CMake

Tags: #Talks #Rust #C++