C++ ABI compatibility.

There’s a lot of discussion among people involved in the C++ standards committee about the need to introduce ABI-breaking changes. There are proposals for introducing epochs, apparently following the Rust language. Some think that we don’t really have much ABI stability to begin with (I’m in this camp, for instance).

This article is a mind dump on this subject and an idea that I have on how we can solve this and other problems.

What is ABI?

The objective of this article is not to define what is and what isn’t an ABI. I don’t think I even have the expertise to go into the ABI in too much detail. To read this article, all you need to know is that the ABI defines how the parts of a program interact with each other.

Among other things, the ABI answers the following questions:

How does one function respond to another?
What does each bit of memory represent?
Where are the parts of a certain structure?
What is inside each type?
How can one part of the program find another part it needs?

Many of those are defined by the target processor architecture, given that C++ aims for the best possible performance. Others might be implementation details of a C++ toolchain, or be heavily influenced by the wording of the standard.
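To make the layout questions a bit more concrete, here is a small sketch that asks the compiler where it placed each member of a struct. The `Sample` type and the `describe_layout` helper are made up for illustration, and the numbers printed depend on the target, so none are promised here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical type, used only to illustrate layout decisions.
struct Sample {
    char   tag;
    int    value;
    double weight;
};

// Prints the size, alignment and member offsets chosen for this target.
inline void describe_layout() {
    // Members of a standard-layout struct are stored in declaration order.
    assert(offsetof(Sample, tag) < offsetof(Sample, value));
    assert(offsetof(Sample, value) < offsetof(Sample, weight));

    std::printf("sizeof(Sample)   = %zu\n", sizeof(Sample));
    std::printf("alignof(Sample)  = %zu\n", alignof(Sample));
    std::printf("offset of tag    = %zu\n", offsetof(Sample, tag));
    std::printf("offset of value  = %zu\n", offsetof(Sample, value));
    std::printf("offset of weight = %zu\n", offsetof(Sample, weight));
}
```

Calling `describe_layout()` on two different targets, or with two different packing options, is a quick way to see that the same source can produce different answers to the questions above.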

The actual ABI expert, Titus Winters, said on this CppCast episode that the ABI is like a network protocol. While most of the network protocols that we use have versioning embedded into them, the ABI does not. The reason is simply that you cannot make every function negotiate a common ABI before each call, as this would be too expensive. But there’s nothing stopping the compilers (and/or linkers) from doing such negotiations.
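As a rough illustration of negotiating once instead of on every call, the sketch below imagines that each binary carries a small version record that is compared a single time when the pieces are joined. Everything in it (`AbiDescriptor`, its fields, `can_link`) is hypothetical; no existing toolchain is claimed to work this way.

```cpp
#include <cassert>

// Hypothetical descriptor that a toolchain could attach to each binary.
struct AbiDescriptor {
    int generation;  // bumped on incompatible changes
    int revision;    // bumped on compatible additions
};

// The check a linker or loader could run once per library, not once per call.
constexpr bool can_link(AbiDescriptor caller, AbiDescriptor callee) {
    return caller.generation == callee.generation &&
           caller.revision <= callee.revision;
}

inline void demo_negotiation() {
    constexpr AbiDescriptor app{2, 1};
    constexpr AbiDescriptor newer_lib{2, 3};
    constexpr AbiDescriptor old_lib{1, 9};

    static_assert(can_link(app, newer_lib), "same generation, newer revision");
    assert(!can_link(app, old_lib));  // a different generation is rejected
}
```

The cost is paid once per library rather than once per function call, which is the property that makes the protocol analogy workable.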

Why is the ABI a problem?

The standard does not mention ABIs, as it is based on an abstract, idealized machine that doesn’t really exist. And yet many (some?) proposals are stopped in their tracks for introducing ABI incompatibilities with older versions. Some changes to the language or to an implementation could force some types to become incompatible with previous versions.

At the ABI level of interoperability, even minute changes to the ABI minutiae may introduce incompatibilities. Keep in mind that those changes might not even show up in actual code; they are just a side effect of how the compiler translates the code. This means that people need to recompile their code to use an updated ABI.
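Here is a minimal sketch of such an invisible change. The two `Account` structs stand for two releases of the same hypothetical library type. Code that reads `account.balance` looks identical against either release, but the compiler turns it into a read from a different offset.

```cpp
#include <cassert>
#include <cstddef>

namespace release_1 {
struct Account {
    int  id;
    long balance;
};
}  // namespace release_1

namespace release_2 {
struct Account {
    int  id;
    long created_at;  // new member inserted in the middle
    long balance;
};
}  // namespace release_2

// Where the generated code looks for `balance` in each release.
constexpr std::size_t balance_offset_v1 = offsetof(release_1::Account, balance);
constexpr std::size_t balance_offset_v2 = offsetof(release_2::Account, balance);

inline bool layout_changed() {
    assert(sizeof(release_2::Account) > sizeof(release_1::Account));
    return balance_offset_v1 != balance_offset_v2;
}
```

A binary compiled against `release_1` that receives a `release_2` object would read `created_at` where it expects `balance`, without any diagnostic.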

C++ implementors would rather not force people to recompile all of their code. There are good reasons to make sure people can still run older code that was generated with older versions of the compiler and linker:

Some proprietary libraries are distributed in binary form and cannot be recompiled.
The original source code may no longer exist, as companies close, lose their data, or employees retire.
The original code no longer compiles with modern versions of the compiler.

So this problem is really about being able to link code that was produced with an older and potentially incompatible version of the compiler. And since this is C++ after all, it is also about being able to run that code with the smallest possible performance impact.

Linking external code

So the impact of the ABI is about linking code, specifically external code. The standard is somewhat silent on linking: its index shows a single instance of "link", and that refers to file-system links.

But even if I could not find a direct mention of the act of linking in the standard, it is implied in many places. The very next item in the index is "linkage" (http://eel.is/c++draft/generalindex#:linkage), and this is a term that is relevant for this discussion. The usual way of telling the compiler that a symbol is not defined here is to declare it with external linkage.

Another common term in the standard that implies a linking stage is "translation unit". So in most (if not all) C++ implementations we have a compilation, or translation, stage followed by a linking stage.
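A small sketch of what this looks like in code. The names `request_count` and `record_request` are invented for the example. In a real project the declarations would sit in a header and the definitions in a separate translation unit; they share one file here only so that the example stands on its own.

```cpp
#include <cassert>

// Declarations: what a translation unit sees when it only uses the symbols.
extern int request_count;  // declared here, defined somewhere else
int record_request();      // non-member functions have external linkage by default

// Definitions: these would normally live in another .cpp file, and the
// linker would connect every use to them by name.
int request_count = 0;
int record_request() { return ++request_count; }

inline void demo_linkage() {
    const int before = request_count;
    assert(record_request() == before + 1);
}
```

The compiler only needs the declarations to generate a call; finding the definitions is left entirely to the linking stage.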

ABI and external linkage

The problem is that traditionally the linker is a generic tool: it doesn’t know anything about C++, or even in what language the bits were written.

Fun fact: the linker calls the section that holds the executable bits .text. The actual text and other initialized data go into other sections, such as .data or .rodata, a confusing nomenclature in my opinion.

Each part that has to be linked together is a symbol. Each symbol has a name, a content and perhaps some other associated metadata. All the objects, variables and functions in your program will, more or less, become one of those symbols.

But in order to be as efficient as possible, the compiler emits the smallest possible number of symbols and metadata. This means that many things work implicitly by convention: the ABI conventions. Data layout, padding and even naming are expected to line up exactly for a linked library to work as expected.

Suppose that your code imports a function with external linkage. The compiler, lacking any other information, assumes that this function follows the same ABI conventions. So when your code needs to call that function, the compiler uses its current ABI convention to invoke it.

Later the linker, which is completely unaware of how the language works, will simply finish connecting the two pieces, blindly. If the imported function was generated with some other conventions, in most cases the link will simply succeed, only for the program to fail later at runtime.
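The failure mode can be simulated in a single file. In this sketch `OldSize` and `NewSize` are hypothetical views of the "same" type as seen by an old binary and by new headers, and `old_library_width` stands in for a function living in the old binary. Passing raw bytes through `memcpy` plays the role of the linker connecting the two sides without checking anything.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// What the old library was compiled against (hypothetical).
struct OldSize { std::int32_t width; std::int32_t height; };
// What the new headers declare for the "same" type (hypothetical).
struct NewSize { std::int32_t height; std::int32_t width; };

static_assert(sizeof(OldSize) == sizeof(NewSize), "same size, different meaning");

// Stands in for the old binary: it interprets the bytes with its own layout.
inline std::int32_t old_library_width(const void* bytes) {
    OldSize view;
    std::memcpy(&view, bytes, sizeof view);
    return view.width;
}

// Stands in for new code calling across the boundary.
inline std::int32_t call_with_new_layout() {
    NewSize size{};
    size.height = 10;
    size.width  = 20;
    const std::int32_t seen = old_library_width(&size);
    assert(seen != size.width);  // everything linked, yet the callee saw the height
    return seen;
}
```

Nothing here fails at compile time or link time; the only symptom is a wrong value at runtime.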

Linking other languages

Now, how can we link and use libraries that were written in other languages? Those other languages aren’t bound by the same conventions that the C++ standard and implementors follow.

The answer for C++ is usually C: the C ABI is treated as a Rosetta stone. The C++ standard makes a special case for that language, and you can declare some symbols as extern "C". This special linkage marker triggers the compiler to switch its ABI expectations and use the assumptions of the C ABI.
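A short sketch of the difference, using made-up function names. With C++ linkage, implementations typically encode the parameter types into each symbol name, which is what lets overloads coexist. With C linkage the symbol follows the platform’s C conventions, so only one function can have a given name.

```cpp
#include <cassert>

// C++ linkage: two overloads, two distinct symbols.
int scale(int value, int factor) { return value * factor; }
double scale(double value, double factor) { return value * factor; }

// C linkage: callable from C, or from any language that speaks the C ABI.
// A second extern "C" function named c_scale would be an error.
extern "C" int c_scale(int value, int factor) { return value * factor; }

inline void demo_c_linkage() {
    assert(scale(2, 3) == 6);
    assert(scale(0.5, 4.0) == 2.0);
    assert(c_scale(2, 3) == 6);
}
```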

So how can all of this solve the ABI compatibility lockdown on the specification? To me the answer is simple: an older ABI is no different from an ABI for a different language like C. We need a way to tell the compiler that some external symbols and types use a different ABI convention. As long as the compiler understands the other convention, it can switch to it and adapt calls that cross the ABI boundary.

One possible way is to use the same strategy that we have today:

extern "C" { // (1)

#include <c_code_header.h>

}

extern "C++@2005" { // (2)

#include <old_code.h>

}

1	The standard way to declare external "C" symbols.
2	Here the special `@2005` tells the compiler that those symbols use a potentially different ABI convention.

In this example the compiler would interpret everything in the included file using the same ABI as if it were 2005 again.

Linking into older standards

But there’s a catch: what if an ABI change makes std::string incompatible with older versions of std::string? How can brand new code link with old code that uses this incompatible version of the same type? This has happened before: old versions of GCC’s C++ standard library had a std::string that used COW [1], which is incompatible with the C++11 standard.

Modern C++ uses namespaces to deal with potential naming clashes. Namespacing those external symbols could disambiguate the two types.

The idea is to introduce special namespaces that isolate the external symbols that follow a different ABI. Objects, types, templates and everything else inside those namespaces would be linked to external symbols following the conventions from the other ABI.
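Standard C++ already has one tool in this spirit: inline namespaces. They let two versions of a type live side by side while the unqualified name picks the current one, and libstdc++ uses the technique for its C++11 std::string (std::__cxx11::basic_string). The sketch below uses a hypothetical `textlib` library with invented layouts; it only shows the naming side and does no ABI translation at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

namespace textlib {  // hypothetical library
namespace abi_2005 {
struct label {       // old layout: pointer plus length
    const char* data;
    std::size_t size;
};
}  // namespace abi_2005

inline namespace abi_current {
struct label {       // new layout
    std::string text;
};
}  // namespace abi_current
}  // namespace textlib

// The plain name resolves to the current version; the old one stays reachable.
static_assert(std::is_same<textlib::label, textlib::abi_current::label>::value, "");
static_assert(!std::is_same<textlib::label, textlib::abi_2005::label>::value, "");

inline std::size_t demo_lengths() {
    textlib::abi_2005::label old_label{"hi", 2};
    textlib::label new_label{"hello"};
    assert(new_label.text == "hello");
    return old_label.size + new_label.text.size();
}
```

Because the namespace name typically ends up in the symbol names, both versions can exist in the same program without clashing at link time.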

Consider the following example that employs the keyword using as a possible syntax to mark a namespace that has an ABI translation:

import std.string;
import std.stream;

// Proposed translation namespace :
namespace old_code using("c++@2005") { // (1)

#include <string> // (2)

}

int main()
{
    std::string hi("hello");
    old_code::std::string world("world"); // (3)
    std::cout << hi << " " << world.c_str() << "\n";
}

1	To help with compatibility, the compiler makes this namespace the root for any code that is declared inside it.
2	This inclusion will find the older version of the string header; this could be GCC’s COW implementation, for instance. The compiler compiles all the code, including the included files, as if the namespace `old_code` was the root. Think of this as a `chroot` in the unix/linux world.
3	The new code will see all the symbols defined in that section as if they were inside the namespace. The compiler could either directly translate the symbols or use some constexpr library to translate them.

As you can see above, the type old_code::std::string is translated into the older version of std::string for the old code. The new code is able to refer to the old code seamlessly, and there’s no risk of clashes.

There is a risk of duplicated symbols, though. This risk can be mitigated by a thin constexpr translation layer and de-duplication at the linking step. The translation would simply use the original symbols in such a way that the linker could collapse them.
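With today’s C++, a translation layer of this kind has to be written by hand. The following is a minimal sketch of the shape it could take. `old_code::string`, `old_code::greeting` and `translation::to_current` are all hypothetical stand-ins, and the legacy layout is invented rather than taken from any real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace old_code {  // stand-in for symbols that follow the older ABI
struct string {       // hypothetical legacy layout
    const char* data;
    std::size_t length;
};
// Pretend this function lives in the old binary.
inline string greeting() { return {"world", 5}; }
}  // namespace old_code

namespace translation {  // hypothetical translation layer
// Converts the legacy representation into the current std::string.
inline std::string to_current(const old_code::string& legacy) {
    assert(legacy.data != nullptr);
    return std::string(legacy.data, legacy.length);
}
}  // namespace translation

inline std::string greet() {
    return "hello " + translation::to_current(old_code::greeting());
}
```

In the idea described above the compiler would generate this kind of adapter from the namespace annotation, instead of someone writing `to_current` for every type that crosses the boundary.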

TLDR.

The ABI situation is more about linking external methods and symbols, specifically those that originated in older versions of the compiler. The current model attempts to interpret old compiled code using the same assumptions made for code that was just compiled. This means that the old code will see the exact same std::string as the new code. That assumption must go, because it implies that newer std::string implementations cannot fix shortcomings of older implementations.

To make this type of linkage easier we need:

A way to isolate the symbols from each other.

This would need to be done in a way that still allows the linker to merge symbols when it makes sense.
A definition of how a translation layer works.

I believe that a well-defined and customisable translation layer could be a way to consolidate foreign language exports. Other language vendors, or binding libraries, could create their own translation layers to help bind with C++ code.
Profit! With a better standard that can break ABI.

In most cases, this whole dance is only necessary when the older code is linked in binary form. I believe that C++ should stay as compatible as possible with older code at the source-code level.

This is not a C++ proposal…

…yet.

This is just an idea; the whole syntax and how it should work is a quick and dirty dump of what I’ve been thinking on this subject. I’m very sure that a lot of hard problems would arise in the process of making this into a full-fledged proposal. But I believe that this is feasible, if the community is excited about it.

I could help make it into a proposal if I come to believe that this is indeed a desired direction. For it to become a proposal I would probably still have to do the following:

Listen to feedback from the community.
Better define how the mappings work.
Define how the type definitions work.

Types defined inside the translated namespace would potentially need to be laid out differently.
Nail down the syntax.

Possibly a shortcut to import/include an external header from a different ABI.

One possibility here is to piggyback on the metaclasses proposal by Herb Sutter, potentially turning this into a library-only proposal.

1. Copy on write

Bogado.net