The C++ Compilation Model

And other Unixy bits & pieces. A tutorial.

To follow along with this tutorial you will need a Unix-like system such as Linux, BSD, Mac OS X, or (on Windows) Cygwin.

A simple C program

We’ll begin with a very simple C (not C++) program. Unless mentioned otherwise, everything we’ll see about C also applies to C++ (which was designed with C compatibility in mind); in due course we will move on to C++ specifics.

Our program, reproduced below in full, is a complete implementation of a read-only “database” of the number of passengers on a set of flights. Take a moment to understand it fully, as we will be using this program throughout this study.

paxCount.c

int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return 0;
}

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1]);
    return 0;
}

paxCount takes one command-line argument: A flight number, like 0 or 1 or 2 (in fact those are the only flights in our database).

The program returns, as its shell exit status, the number of passengers on the specified flight.

Let’s try it:

$ gcc -Wall -Werror paxCount.c && ( ./a.out 1 ; echo $? )
15

Notes:

gcc is the GNU Compiler Collection. It can compile C, Objective-C, and C++, as well as Ada and Fortran.

We used the -Wall flag to enable a broad set of commonly useful compiler warnings (despite its name, it does not enable every warning), and -Werror to treat all warnings as errors (so that we are forced to fix them to get the code to compile!). This is a good habit. You should also consider using -Wextra or -pedantic. To add debugging information, usable by gdb, compile with -g.

The above shell command, up to the “&&”, compiles the code. By default, gcc creates an executable named a.out (you can override this with the -o flag).

If the compilation succeeded, we run the commands inside the parentheses: We run the program we just compiled, passing it the argument “1”; then we display the exit status of the program (which is the number of passengers on flight “1”).

Object files & the linker

Our boss is very impressed by our database of flights. He’s asked us to extract it from the paxCount program so that he can use it in his own programs.

Our plan is to split the program into two separate source files, paxCount.c and paxDB.c, in such a way that paxDB.c is re-usable by other programs.

paxCount.c

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1]);
    return 0;
}

paxDB.c

int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return 0;
}

The command “gcc paxCount.c paxDB.c” will compile both these sources together into a single executable file. Let’s manually step through the operations gcc would perform.

What we usually think of as compilation consists of several stages, some of which we will cover later. For now, we are interested in what we’ll call the actual compilation, followed by linking.

In the compilation stage, the compiler takes each source file (.c) individually, and compiles it into an object file (.o) consisting of machine-language instructions. For example, when compiling paxCount.c to paxCount.o, the compiler doesn’t look at paxDB.c at all. paxDB.c might be on a different drive, or it might not even exist yet, and it wouldn’t make any difference.

Even when you specify all the sources in a single command like “gcc paxCount.c paxDB.c”, gcc will compile each source file individually, to its own object file.

In the linking stage, the linker combines all of the object files into a single executable file.

The compiler comes from your compiler vendor (in this case, GNU), whereas the linker comes with your system (in the case of Linux, the linker is also made by GNU). This implies that the compiler must create object files in a format that the linker will understand (on Linux, and many other systems, this is the “Executable and Linkable Format”, or ELF).

Let’s try to compile our new paxCount.c:

$ gcc -Wall -Werror paxCount.c
paxCount.c: In function ‘main’:
paxCount.c:4: warning: implicit declaration of function ‘getCount’

Oh-oh! The definition of getCount is in a different file (paxDB.c). After compiling paxCount.c we will link it with paxDB.o, but right now we can’t even get this function-call to compile, unless we tell the compiler what the function looks like! (Remember, when compiling paxCount.c, the compiler doesn’t look at paxDB.c.)

To compile paxCount.c we have to declare the function. The declaration tells the compiler that the function exists somewhere else, and what the function’s signature is: Its return type, how many arguments it takes, and their types.

Now our paxCount.c looks like this:

paxCount.c

int getCount(char* flightNumber); /* function declaration */

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1]);
    return 0;
}

$ gcc -Wall -Werror paxCount.c
Undefined symbols:
  "_getCount", referenced from:
      _main in ccjyTfqz.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

This time the compilation to .o was successful, but then gcc tried to do the linking stage, and couldn’t find the function getCount. That’s no surprise, because we didn’t tell the linker about paxDB.

To tell gcc to compile but not link, we pass it the -c flag:

$ gcc -c -Wall -Werror paxCount.c
$ ls paxCount.o
paxCount.o

Finally! Now let’s compile paxDB.c:

$ gcc -c -Wall -Werror paxDB.c
$ ls paxDB.o
paxDB.o

We can use the nm command to view the symbols in each of these object files:

$ nm paxCount.o
                 U _getCount
0000000000000000 T _main
$ nm paxDB.o
0000000000000058 D _flights
0000000000000000 T _getCount

The last column lists the symbols in the object file. The first column shows each symbol’s value (its location or offset within the file) and the second column shows the symbol’s type.

We will investigate the various types later on, but for now all we need to know is that “U” means “undefined” (paxCount.c declared getCount, but didn’t define it).

The other symbols are all defined, so let’s be content knowing that paxCount.o does indeed contain main, and paxDB.o contains flights (the array) and getCount (the function).
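As a rough reading aid for nm output, the sketch below annotates a few declarations with the type letter nm would typically report for them in an unoptimized build. All of the names here are hypothetical and not part of paxDB, and the letters are the ones commonly seen with GNU nm; other toolchains may differ. If the file is compiled as C++, the function names will also appear in a mangled form.

```cpp
// Hypothetical file for illustration only; not part of paxDB.

int initialized = 42;     // defined with a non-zero initializer: typically "D" (data)
int zeroed;               // defined, zero-initialized: typically "B" (bss)
static int hidden = 7;    // internal linkage: typically lower-case "d"

int elsewhere(int);       // declaration only: shows up as "U" if it is actually
                          // called, and not at all if it is never used

static int helper(int x)  // internal linkage: typically lower-case "t"
{
    return x + hidden;
}

int twice(int x)          // defined function: typically "T" (text)
{
    return 2 * helper(x) + initialized + zeroed;
}
```

Compiling this with -c and running nm on the result is a quick way to check which letters your own toolchain uses.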

Now we can tell gcc to link the object files we compiled just before. Note that we use gcc for convenience, but behind the scenes gcc is calling the linker, ld.

$ gcc -Wall -Werror paxCount.o paxDB.o
$ ls a.out
a.out
$ ./a.out 1 ; echo $?
15

Just like our previous monolithic program!

To take advantage of our modular database code, let’s write another program. This one is called “paxCheck”: It takes the flight number on the command line, and returns success if the flight is in the database and has any passengers, or error otherwise (note that to the shell, a zero exit code means success, and non-zero means error). paxCheck.c is very similar to paxCount.c:

paxCheck.c

int getCount(char* flightNumber);

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1]);

    if (count == 0)
        return 1; /* error */
    else
        return 0; /* success */
}

So that we don’t get confused about which “a.out” we are calling, let’s start giving our executables proper names. For brevity we will use a single command to compile paxCheck.c to object code, and then link it with paxDB.o to generate our paxCheck executable. Even though we use a single command, gcc performs these distinct operations sequentially.

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.o
$ ./paxCheck 1 && echo "OK" || echo "ERROR"
OK
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
ERROR

All that hard work paid off! See how easy it was to re-use the flights database code.

Before we move on, let’s ask ourselves: What would happen if we tried to compile and link paxDB.c on its own? paxDB.c doesn’t reference any functions it hasn’t defined itself, so it should work… right?

$ gcc -Wall -Werror paxDB.c
Undefined symbols:
  "_main", referenced from:
      start in crt1.10.6.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

The compilation (to object code) worked, but the linker reported an error. Every C (or C++) program must define a “main” function, which is where the program execution will begin.

Questions:

What have we learned so far that has implications for the parallel compilation of large C/C++ codebases?

Notes:

Other useful tools are file, which displays the type of a file (executable, object, etc); objdump, which displays all sorts of information about an object binary file; and strings, which displays any ASCII strings in a binary file.

The assembler

In the previous section we saw some of the stages of what we usually think of as compilation. We saw that compilation produced an object file from a source code file, and that linking combined one or more object files into a single executable file.

Now we will see that what we called “compilation” in the previous section actually consists of two separate stages: compilation proper and assembly [1].

Compilation converts the C (or C++) code to assembler code; assembly, performed by the assembler, “assembles” the assembler code into object code—machine instructions.

Like the linker, the assembler comes with your system.

To tell gcc to compile but not assemble our code, we use the -S flag. This produces a .s file:

$ gcc -S -Wall -Werror paxCount.c
$ cat paxCount.s
_main:
LFB2:
        pushq   %rbp
LCFI0:
        movq    %rsp, %rbp
LCFI1:
        subq    $32, %rsp
LCFI2:
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
        cmpl    $1, -4(%rbp)
        jle     L2
        movq    -16(%rbp), %rax
        addq    $8, %rax
        movq    (%rax), %rdi
        call    _getCount
        movl    %eax, -20(%rbp)
        jmp     L4
L2:
        movl    $0, -20(%rbp)
L4:
        movl    -20(%rbp), %eax
        leave
        ret

In the assembler code for the main function, above, we can see a call to getCount. Think of this as a placeholder, because getCount is defined somewhere else, and paxCount.o doesn’t know where.

After linking with paxDB, we see that the linker has substituted the actual address of getCount [2] (here we use gdb’s disassemble command to view the assembler code for the final, compiled and linked, file):

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ nm paxCount
0000000100000e9a T _getCount
0000000100000e64 T _main
$ echo "disassemble main" > gdb.instructions
$ gdb -n -batch -x gdb.instructions paxCount
Reading symbols for shared libraries .. done
Dump of assembler code for function main:
0x0000000100000e64 <main+0>:    push   %rbp
0x0000000100000e65 <main+1>:    mov    %rsp,%rbp
0x0000000100000e68 <main+4>:    sub    $0x20,%rsp
0x0000000100000e6c <main+8>:    mov    %edi,-0x4(%rbp)
0x0000000100000e6f <main+11>:   mov    %rsi,-0x10(%rbp)
0x0000000100000e73 <main+15>:   cmpl   $0x1,-0x4(%rbp)
0x0000000100000e77 <main+19>:   jle    0x100000e8e <main+42>
0x0000000100000e79 <main+21>:   mov    -0x10(%rbp),%rax
0x0000000100000e7d <main+25>:   add    $0x8,%rax
0x0000000100000e81 <main+29>:   mov    (%rax),%rdi
0x0000000100000e84 <main+32>:   callq  0x100000e9a <getCount>
0x0000000100000e89 <main+37>:   mov    %eax,-0x14(%rbp)
0x0000000100000e8c <main+40>:   jmp    0x100000e95 <main+49>
0x0000000100000e8e <main+42>:   movl   $0x0,-0x14(%rbp)
0x0000000100000e95 <main+49>:   mov    -0x14(%rbp),%eax
0x0000000100000e98 <main+52>:   leaveq
0x0000000100000e99 <main+53>:   retq
End of assembler dump.

Notes:

  1. You rarely need to think about compilation and assembly as two separate stages, but I’ve included this chapter for completeness. Personally, my main use case for looking at the assembler code is to figure out what optimizations the compiler is and isn’t performing. It can help you win arguments like “should I write

    iterator e = container.end();
    for (iterator i = container.begin(); i != e; ++i)

    instead of

    for (iterator i = container.begin(); i != container.end(); ++i)

    for performance reasons?” (Answer: No. In most cases, with compiler optimizations enabled, the second is just as performant; the first one could actually be incorrect if you modify the container inside the loop.)

  2. As we saw, the linker has replaced all references to a given symbol by that symbol’s address, so we don’t really need the symbols anymore. strip will remove the symbol information from an executable file, reducing the size of the file. However, the symbol information can be used by a debugger, so unless space is at a premium (or you find 0x100000e9a more readable than getCount) don’t strip your binaries!
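To try the experiment from note 1 yourself, here is a minimal sketch of the two loop styles written against std::vector<int>. The function names are hypothetical. Compiling the file with g++ -O2 -S and comparing the assembler code for the two functions is one way to check whether your compiler treats them differently.

```cpp
#include <vector>

// First variant from note 1: end() is cached before the loop starts.
int sumCachedEnd(const std::vector<int>& v)
{
    int total = 0;
    std::vector<int>::const_iterator e = v.end();
    for (std::vector<int>::const_iterator i = v.begin(); i != e; ++i)
        total += *i;
    return total;
}

// Second variant from note 1: end() is called again on every iteration.
int sumFreshEnd(const std::vector<int>& v)
{
    int total = 0;
    for (std::vector<int>::const_iterator i = v.begin(); i != v.end(); ++i)
        total += *i;
    return total;
}
```

Both functions only read the container, so they compute the same sum; the correctness concern in note 1 arises only when the loop body modifies the container.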

Header files & the preprocessor

After we shipped paxCount, our client wants to change the specification! paxCount should distinguish between flights with 0 passengers (returning 0), and flights not in the database (returning -1).

Easy enough. For future flexibility, we decide to add a “default” parameter to getCount:

paxDB.c

int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

paxCount.c

int getCount(char* flightNumber, int deflt);

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

This works pretty much as expected (the shell reports only the low 8 bits of the exit status, so a return value of -1 shows up as 255):

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ ./paxCount 2 ; echo $?
0
$ ./paxCount 5 ; echo $?
255
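The 255 above can be reproduced with a one-line model of what happens to the value returned from main on a POSIX system: only its low 8 bits reach the shell. The helper name asSeenByShell is hypothetical; it is a sketch of the truncation, not a real API.

```cpp
// Models how a POSIX shell sees a program's exit code: only the low
// 8 bits of the value returned from main survive.
// asSeenByShell is a hypothetical helper, used here for illustration.
int asSeenByShell(int exitCode)
{
    return exitCode & 0xFF;
}
```

With this model, asSeenByShell(-1) is 255 and asSeenByShell(15) is 15. It also shows a limit of using the exit status as a data channel: a flight with 256 passengers would be reported as 0.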

However, next morning our automated tests of paxCheck are producing strange results:

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
OK

What’s going on? Flight 5 is not in our database, so the above command should return ERROR, not OK! Let’s have a look at paxCheck.c:

paxCheck.c

int getCount(char* flightNumber);

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1]);

    if (count == 0)
        return 1; /* error */
    else
        return 0; /* success */
}

Oh right! We changed getCount in paxDB.c, but we forgot to update the declaration in paxCheck.c.

Even if you are confused by how this still compiles and runs [1], at least it is clear what the cause of the problem is. It’s easy enough to fix, by correcting the declaration of getCount to match its definition in paxDB.c:

paxCheck.c

int getCount(char* flightNumber, int deflt);

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1]);

    if (count == 0)
        return 1; /* error */
    else
        return 0; /* success */
}

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
paxCheck.c: In function ‘main’:
paxCheck.c:8: error: too few arguments to function ‘getCount’

Aha! Now that the compiler knows the correct prototype for getCount, it can tell that the function call is incorrect, and stop with a compilation error.

The correct program is:

paxCheck.c

int getCount(char* flightNumber, int deflt);

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1], 0);

    if (count == 0)
        return 1; /* error */
    else
        return 0; /* success */
}

$ gcc -Wall -Werror -o paxCheck paxCheck.c paxDB.c
$ ./paxCheck 5 && echo "OK" || echo "ERROR"
ERROR

Well, that works, but imagine that we had 10, 20, or 100 programs all using paxDB.c. Clearly this isn’t a solution that is going to scale. For this very reason, header files were invented.

The idea is that paxDB will provide an official header file, containing only the declaration of getCount. All the programs that use paxDB will include this header file, instead of writing their own declaration of getCount. If the interface ever changes, we only have to change paxDB’s header file, and errors in the other programs won’t slip past the compiler.

paxDB.h

int getCount(char* flightNumber, int deflt);

paxDB.c [2]

#include "paxDB.h"

int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

paxCount.c

#include "paxDB.h"

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

paxCheck.c

#include "paxDB.h"

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1], 0);

    if (count == 0)
        return 1; /* error */
    else
        return 0; /* success */
}

So how exactly does this work? Time to introduce another stage in the compilation process: Preprocessing. Although we have studied it last of all, it is the first stage to take place in the compilation process.

The C preprocessor (cpp) takes our source file, and replaces “#include "paxDB.h"” with the contents of paxDB.h. To see the output of the preprocessor, we use gcc -E:

$ gcc -E -Wall -Werror paxCount.c
# 1 "paxCount.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCount.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxCount.c" 2

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

That output, called a translation unit, is what is fed to the compiler. The lines beginning with “#” are inserted by the preprocessor to help the compiler generate error messages with the correct line numbers, and filenames, of the original files. Unlike the other stages, using the -E flag prints to the standard output rather than to a file.

That’s really all there is to header files: Textual inclusion. There is nothing magic about them. We could call our file paxDB.not-a-header and say #include "paxDB.not-a-header" instead, and it’s still a header file. The .h suffix is just a convention. What the compiler proper sees is exactly the same (bar those “#” comments) as our original paxCount.c (the one that declared getCount directly instead of #includeing the header file).

In spite of their simplicity, header files and the preprocessor provide a powerful mechanism for specifying the interface to a source code library or module [3].

To recap all the stages of compilation, the following commands show all the operations that take place for the command gcc -Wall -Werror -o paxCount paxCount.c paxDB.c [4]:

$ gcc -E -Wall -Werror paxCount.c > paxCount.i
$ gcc -S -Wall -Werror paxCount.i
$ gcc -c -Wall -Werror paxCount.s
$ gcc -E -Wall -Werror paxDB.c > paxDB.i
$ gcc -S -Wall -Werror paxDB.i
$ gcc -c -Wall -Werror paxDB.s
$ gcc -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15

Notes:

  1. The original version of paxCheck.c still compiles, because as far as it knows, getCount will behave exactly as it is declared to (remember, the compiler doesn’t have any information to suggest otherwise). The linking also works, because as far as the linker knows, paxCheck.o asked for a symbol called getCount, and paxDB.o exports just such a symbol (remember, this is C; C++ behaves a little differently in this scenario, but we will get to that later). So main calls getCount, providing just one argument. But getCount expects two arguments, in some particular locations (registers or stack slots, depending on the platform’s calling convention). The first of these locations is correctly populated with the first argument, but the second was never set by main, so it holds whatever happened to be there. Whatever value it contains is very unlikely to be 0, so main returned success. (Formally, calling a function through a mismatched declaration is undefined behavior, so nothing about this outcome is guaranteed.)

  2. Our new version of paxDB.c doesn’t strictly have to include paxDB.h to get the declaration of getCount, because you can write a function definition without a previous declaration. It is a good idea though, because if we change the definition and forget to update the header file, the compiler will detect the mismatch. (Again, in this exact scenario C++ behaves a little differently, as we shall see.)

  3. C has no real facility for making modules or packages, and C++ only gained one with C++20 modules, so for the code in this tutorial header files are the best we can do.

  4. Try this at home: gcc -v -Wall -Werror -o paxCount paxCount.c paxDB.c.
    With the -v flag, gcc will print the commands it executes to run each of the stages of compilation. If you’re using the GNU toolchain, “cpp” is the C preprocessor, “cc1” is the compiler proper, “as” is the assembler, and “collect2” is the linker driver.

Preprocessor macros

The preprocessor can also make other substitutions to your source files—there’s more than just #include. You are probably familiar with #define:

paxCheck.c

#include "paxDB.h"

#define ERROR 1
#define SUCCESS 0

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1], 0);

    if (count == 0)
        return ERROR;
    else
        return SUCCESS;
}

The preprocessor replaces every occurrence of the #defined constants with their textual value:

$ gcc -E -Wall -Werror paxCheck.c
# 1 "paxCheck.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCheck.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxCheck.c" 2

int main(int argc, char** argv)
{
    int count = 0;

    if (argc > 1)
        count = getCount(argv[1], 0);

    if (count == 0)
        return 1;
    else
        return 0;
}

#define can also take parameters, to create preprocessor macros:

paxDB.c

#include "paxDB.h"

int flights[] = { 20, 15, 0 };

#define GET(n) if (flightNumber[0] == #n[0]) return flights[n]

int getCount(char* flightNumber, int deflt)
{
    GET(0);
    GET(1);
    GET(2);
    return deflt;
}

$ gcc -E -Wall -Werror paxDB.c
# 1 "paxDB.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxDB.c"
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 2 "paxDB.c" 2

int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == "0"[0]) return flights[0];
    if (flightNumber[0] == "1"[0]) return flights[1];
    if (flightNumber[0] == "2"[0]) return flights[2];
    return deflt;
}
$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.c
$ ./paxCount 1 ; echo $?
15
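The GET macro relies on the preprocessor’s # operator, which turns a macro argument into a string literal. The sketch below isolates just that step; STR and startsWithDigit2 are hypothetical names, and the static_assert assumes a C++11 (or later) compiler.

```cpp
// STR is a hypothetical helper macro: # turns its argument into a string
// literal, so STR(2) expands to "2".
#define STR(x) #x

// Indexing the literal gives its first character: "2"[0] is '2'.
static_assert(STR(2)[0] == '2', "stringized argument should start with '2'");

// Same comparison as GET(2) in the text, but returning a flag instead of
// a passenger count.
bool startsWithDigit2(const char* flightNumber)
{
    return flightNumber[0] == STR(2)[0];
}
```

This is why the preprocessed output above contains "0"[0], "1"[0] and "2"[0] where the original listing compared against '0', '1' and '2'.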

Preprocessor macros (and constants) don’t have to be all-uppercase; that’s a convention. It’s a useful convention because macros are plain text substitution, so they act very differently to run-time function calls, sometimes in unexpected ways. Much has been written on the evils of preprocessor macros—google it. They are very useful, very occasionally.
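One classic example of a macro behaving differently from a function call is operator precedence after textual substitution. The sketch below uses a deliberately unparenthesized macro; all four names are hypothetical and exist only for this illustration.

```cpp
// Hypothetical macro, deliberately written without parentheses.
#define SQUARE_MACRO(x) x * x

int squareFunction(int x)
{
    return x * x;
}

int viaMacro()
{
    return SQUARE_MACRO(1 + 2);    // expands to 1 + 2 * 1 + 2
}

int viaFunction()
{
    return squareFunction(1 + 2);  // the argument is evaluated first
}
```

viaMacro() evaluates to 5 whereas viaFunction() returns 9. Running gcc -E on the file shows the expanded expression directly.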

Internal vs. external linkage

Remember how we looked at the symbols in paxDB.o using the nm tool:

$ nm paxDB.o
0000000000000058 D _flights
0000000000000000 T _getCount

The D in front of the symbol flights indicates that it is in the data section of the object file. It is an upper-case D indicating that the symbol is external. We say that flights has external linkage. This means that code outside of paxDB.c’s translation unit can access flights by name. This may not be what we want:

paxCount.c

#include "paxDB.h"

extern int flights[];

int main(int argc, char** argv)
{
    flights[1]++;

    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

$ gcc -Wall -Werror -o paxCount paxCount.c paxDB.o
$ ./paxCount 1 ; echo $?
16

The line starting with extern declares that flights is an array of ints defined somewhere else (this line is a declaration rather than a definition because it contains the extern specifier, and it doesn’t have an initializer). So when we linked the two object files together, paxCount.o was able to change the array in paxDB.o!
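The difference between an extern declaration and a definition can also be seen inside a single file. In this sketch, which uses the hypothetical names seats and firstSeatCount, the declaration may be repeated freely, while the definition (the line with the initializer) appears exactly once.

```cpp
extern int seats[];           // declaration: no storage is reserved here
extern int seats[];           // repeating a declaration is allowed

int firstSeatCount()
{
    return seats[0];          // compiles with only the declaration in view
}

int seats[] = { 20, 15, 0 };  // the one definition: storage plus initializer
```

In the paxCount example the definition simply lives in a different translation unit, so it is the linker rather than the compiler that connects the two.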

In paxDB.c we can disable outside access to the flights array by giving it internal linkage [1]. In C, global symbols are given internal linkage with the static keyword [2]:

paxDB.c

#include "paxDB.h"

static int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

Looking at the symbols in paxDB.o, we see that flights is internal (the d is now lower-case):

$ gcc -c -Wall -Werror paxDB.c
$ nm paxDB.o
0000000000000058 d _flights
0000000000000000 T _getCount

Now the code outside of paxDB.c’s translation unit can’t change flights:

$ gcc -c -Wall -Werror paxCount.c
$ gcc -Wall -Werror -o paxCount paxCount.o paxDB.o
Undefined symbols:
  "_flights", referenced from:
      _main in paxCount.o
      _main in paxCount.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

Notes:

  1. Even with internal linkage, you can reference a function or variable if you somehow know its address. Internal linkage simply stops you from referring to it by name (which, for most purposes, is all that really matters).

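  2. This usage of static was deprecated in C++98 and C++03; C++11 removed the deprecation. We will see an alternative soon. For the rules of linkage in C++, see the section on program and linkage ([basic.link], numbered 3.5 in older editions) of the C++ standard.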
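Note 1 can be made concrete with a small sketch. Here flights has internal linkage, yet a function in the same translation unit hands out its address, so outside code can still modify it. The functions leakFlights and getCountByIndex are hypothetical and are not part of the paxDB interface.

```cpp
static int flights[] = { 20, 15, 0 };  // internal linkage, as in paxDB.c

// Hypothetical lookup by index, used only to observe the array.
int getCountByIndex(int i)
{
    return flights[i];
}

// Hypothetical accessor: exposes the address of the internal array.
int* leakFlights()
{
    return flights;
}
```

A caller in another file could write leakFlights()[1]++ and the next getCountByIndex(1) would return 16, even though the name flights is not visible to the linker.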

C++ name mangling

Suppose we want to package up paxDB into an even more useful library. Some clients may want to call getCount with an integer, instead of a character-string. We want to provide both options:

paxDB.h

int getCount(char* flightNumber, int deflt);
int getCount(int flightNumber, int deflt);

paxDB.c

#include "paxDB.h"

static int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

int getCount(int flightNumber, int deflt)
{
    if (flightNumber >= 0 && flightNumber <= 2)
        return flights[flightNumber];
    return deflt;
}

$ gcc -c -Wall -Werror paxDB.c
In file included from paxDB.c:1:
paxDB.h:2: error: conflicting types for ‘getCount’
paxDB.h:1: error: previous declaration of ‘getCount’ was here
paxDB.c:14: error: conflicting types for ‘getCount’
paxDB.c:6: error: previous definition of ‘getCount’ was here

Error! C doesn’t allow function overloading, that is, two functions with the same name but taking different parameters.

This makes sense if you think about the linker’s job: As we saw, the linker has to take a machine instruction like call getCount and replace it with the address of the function getCount. The linker doesn’t have enough information to decide which getCount to use (if more than one were allowed).

Instead of trying to solve this problem in C, let’s start looking at C++. We’ll compile and link C++ code by calling g++. C++ is (mostly) backwards-compatible with C, so we don’t need to change our code, but for clarity we will rename the files to use the .cpp file extension instead of .c.

$ mv paxDB.c paxDB.cpp
$ g++ -c -Wall -Werror paxDB.cpp
$ mv paxCount.c paxCount.cpp
$ g++ -c -Wall -Werror paxCount.cpp
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15

Now let’s investigate how C++ goes about supporting function overloading:

$ nm paxDB.o
0000000000000000 T __Z8getCountPci
0000000000000058 T __Z8getCountii
0000000000000098 d __ZL7flights
$ nm paxCount.o
                 U __Z8getCountPci
0000000000000000 T _main

To tell the two getCount functions apart, C++ uses name mangling to encode a function’s type information into the symbol itself.

The exact format of the name mangling isn’t defined by the C++ standard so it depends on your compiler. Here the two getCounts were mangled into __Z8getCountPci and __Z8getCountii [1]. Note that flights is also mangled, even though there is only one: name mangling is not reserved for overloaded functions (although main, as the nm output shows, is left unmangled).

Let’s look at the assembler code for the main function in paxCount.cpp:

$ g++ -S -Wall -Werror paxCount.cpp
$ cat paxCount.s
_main:
LFB2:
        pushq   %rbp
LCFI0:
        movq    %rsp, %rbp
LCFI1:
        subq    $32, %rsp
LCFI2:
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
        cmpl    $1, -4(%rbp)
        jle     L2
        movq    -16(%rbp), %rax
        addq    $8, %rax
        movq    (%rax), %rdi
        movl    $-1, %esi
        call    __Z8getCountPci
        movl    %eax, -20(%rbp)
        jmp     L4
L2:
        movl    $0, -20(%rbp)
L4:
        movl    -20(%rbp), %eax
        leave
        ret

The compiler generated assembler code using the mangled names. The linker doesn’t know anything about the compiler’s name-mangling scheme; all the linker knows is that there is a call to __Z8getCountPci in paxCount.o, and a matching definition in paxDB.o [2].

The tool c++filt demangles names into a human-readable form [3]:

$ nm paxDB.o | c++filt
0000000000000000 T getCount(char*, int)
0000000000000058 T getCount(int, int)
0000000000000098 d flights

Notes:

  1. For gcc it works as follows: __Z followed by the length of the name (getCount, so 8), followed by the name, followed by the parameters' type information. For the first getCount this is a Pointer to a char, and an int. For the second getCount this is an int and another int.

  2. This implies that the linker can only link object files whose compilers agree on the name-mangling scheme (and, more broadly, on the rest of the C++ binary interface). Even different versions of the same compiler have been known to use different name-mangling schemes.

  3. You can also use nm -C.
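The scheme described in note 1 is simple enough to sketch in code for the one case it covers: a free function outside any namespace. The function mangleSimple is a hypothetical toy, not part of any toolchain, and the parameter codes are passed in ready-made rather than derived from real types. The nm listings above show one more leading underscore than this produces, which appears to be added by the platform to every symbol (compare _main).

```cpp
#include <string>
#include <vector>

// Toy encoder for the simplest case in note 1. mangleSimple is a
// hypothetical name; real manglers handle far more (namespaces,
// templates, repeated types, and so on).
std::string mangleSimple(const std::string& name,
                         const std::vector<std::string>& paramCodes)
{
    // "_Z", then the length of the name, then the name itself...
    std::string out = "_Z" + std::to_string(name.size()) + name;

    // ...then one code per parameter, e.g. "Pc" for char*, "i" for int.
    for (const std::string& code : paramCodes)
        out += code;

    return out;
}
```

For example, mangleSimple("getCount", {"Pc", "i"}) yields _Z8getCountPci, and replacing "Pc" with "i" yields _Z8getCountii.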

Questions:

In the footnotes for chapter 4 (“Header files & the preprocessor”) we stated that C and C++ behave differently when a function definition doesn’t match the function’s declaration. What have we learned in this chapter that explains that statement?

Linking C++ code with C libraries

Suppose that paxDB is supplied by a separate team to ours, and we have no control over it. Furthermore, it is implemented in C and all we have are the header file and object file [1].

paxDB.h

int getCount(char* flightNumber, int deflt);

paxDB.c

#include "paxDB.h"

static int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

However, we want to write our paxCount program in C++, not C.

paxCount.cpp

#include "paxDB.h"

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

What happens if we try to link these as they are?

$ g++ -c -Wall -Werror paxCount.cpp
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
Undefined symbols:
  "getCount(char*, int)", referenced from:
      _main in paxCount.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
$ nm paxDB.o
0000000000000058 d _flights
0000000000000000 T _getCount
$ nm paxCount.o
                 U __Z8getCountPci
0000000000000000 T _main

It should be clear, from the above mismatch, why this didn’t work.

We can fix this situation by declaring that getCount has C linkage:

paxCount.cpp

extern "C" {
#include "paxDB.h"
}

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

extern "C" is a linkage specification. Let’s look at the output of the pre-processor, just to be clear that the point of the linkage specification is to wrap the declaration of getCount:

$ g++ -E -Wall -Werror paxCount.cpp
# 1 "paxCount.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "paxCount.cpp"
extern "C" {
# 1 "paxDB.h" 1
int getCount(char* flightNumber, int deflt);
# 3 "paxCount.cpp" 2
}

int main(int argc, char** argv)
{
    if (argc > 1)
        return getCount(argv[1], -1);
    return 0;
}

Now the linking will succeed:

$ g++ -c -Wall -Werror paxCount.cpp
$ nm paxCount.o
                 U _getCount
0000000000000000 T _main
$ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o
$ ./paxCount 1 ; echo $?
15
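In this scenario we cannot change paxDB.h, but if the library’s maintainers were willing to, a common convention is to put the linkage specification in the header itself, guarded by the __cplusplus macro that only C++ compilers define. The sketch below shows what such a header body might look like; the definition that follows is a stand-in so that the example links on its own, whereas in the text it lives in the C object file paxDB.o.

```cpp
// Sketch of a header body that both C and C++ sources could include.
// A C compiler skips the extern "C" lines; a C++ compiler sees them.
#ifdef __cplusplus
extern "C" {
#endif

int getCount(char* flightNumber, int deflt);

#ifdef __cplusplus
}
#endif

// Stand-in definition (assumption: in the text this comes from paxDB.o).
// It picks up C linkage from the declaration above.
static int flights[] = { 20, 15, 0 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}
```

With a header written this way, C++ callers can #include it directly, without wrapping the #include in extern "C" themselves.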

Notes:

  1. Later on we will see how to package object code into libraries, rather than passing object files around.

Namespaces

Let’s forget about the previous chapter, and assume that once again we have control over the source code of paxDB, and that it is written in C++.

In this chapter we are going to write a cargo database, and a new program that retrieves either the passenger information, or the cargo information, based on a command-line flag. It’s all much simpler than it sounds because our cargo database only stores the number of containers on each flight, and, like paxDB, it is read-only and only has the information for 3 flights.

cargoDB.h

int getCount(char* flightNumber, int deflt);

cargoDB.cpp

#include "cargoDB.h"

static int flights[] = { 0, 8, 9 };

int getCount(char* flightNumber, int deflt)
{
    if (flightNumber[0] == '0') return flights[0];
    if (flightNumber[0] == '1') return flights[1];
    if (flightNumber[0] == '2') return flights[2];
    return deflt;
}

As you surely noticed, the interface and implementation of the cargo database are identical to the passenger database. Only the data is different (flights 0, 1 and 2 have 0, 8 and 9 cargo containers, compared to 20, 15 and 0 passengers, respectively).

However, since we chose the same name for our interface (getCount) we won’t be able to use the cargo database together with the passenger database; both versions of getCount take exactly the same arguments, so even with name mangling there will be a conflict. For example if we try to link cargoDB and paxDB with paxCount:

    $ g++ -c -Wall -Werror paxDB.cpp
    $ g++ -c -Wall -Werror cargoDB.cpp
    $ g++ -c -Wall -Werror paxCount.cpp
    $ g++ -Wall -Werror -o paxCount paxCount.o paxDB.o cargoDB.o
    ld: duplicate symbol getCount(char*, int)in cargoDB.o and paxDB.o
    collect2: ld returned 1 exit status

This error is due to the one definition rule: “Every program shall contain exactly one definition of every non-inline function or object that is used in that program” (quoted from the C++ standard, section 3.2).

Namespaces to the rescue! We can group related functions and data in a namespace, and we can disambiguate between the different getCounts by using the appropriate namespace name:

paxDB.h

    namespace pax {
        int getCount(char* flightNumber, int deflt);
    }

paxDB.cpp

    #include "paxDB.h"

    static int flights[] = { 20, 15, 0 };

    int pax::getCount(char* flightNumber, int deflt) {
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

cargoDB.h

    namespace cargo {
        int getCount(char* flightNumber, int deflt);
    }

cargoDB.cpp

    #include "cargoDB.h"

    static int flights[] = { 0, 8, 9 };

    int cargo::getCount(char* flightNumber, int deflt) {
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

See below how the C++ compiler incorporates the namespace names into its name mangling. Note also that the two different flights arrays don’t conflict, because they have internal linkage so they are local to their translation unit.

    $ g++ -c -Wall -Werror paxDB.cpp
    $ nm paxDB.o
    0000000000000058 d __ZL7flights
    0000000000000000 T __ZN3pax8getCountEPci
    $ g++ -c -Wall -Werror cargoDB.cpp
    $ nm cargoDB.o
    0000000000000058 d __ZL7flights
    0000000000000000 T __ZN5cargo8getCountEPci

Now we can write our new program, flightInfo. It takes two command-line parameters: A flag (c or p) indicating whether to look up cargo or passenger data; and a flight number.

flightInfo.cpp

    #include "cargoDB.h"
    #include "paxDB.h"

    int main(int argc, char** argv) {
        if (argc > 2) {
            if (*argv[1] == 'c') return cargo::getCount(argv[2], -1);
            if (*argv[1] == 'p') return pax::getCount(argv[2], -1);
        }
        return 0;
    }

    $ g++ -Wall -Werror -o flightInfo flightInfo.cpp cargoDB.o paxDB.o
    $ ./flightInfo c 1 ; echo $?
    8
    $ ./flightInfo p 1 ; echo $?
    15

Unnamed namespaces

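We mentioned in chapter 6 that C++ deprecated usage of the static keyword for internal linkage (presumably because static is also used for so many other things). The deprecation dates from the C++98 and C++03 standards; C++11 later withdrew it, so static remains valid for this purpose.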

Deprecated features are not removed altogether, so our programs still behave correctly. But can we achieve the same effect without using deprecated features? We could try something like this:

paxDB.cpp

    #include "paxDB.h"

    namespace nobodywilleverguessthisname {
        int flights[] = { 20, 15, 0 };
    }

    int pax::getCount(char* flightNumber, int deflt) {
        if (flightNumber[0] == '0') return nobodywilleverguessthisname::flights[0];
        if (flightNumber[0] == '1') return nobodywilleverguessthisname::flights[1];
        if (flightNumber[0] == '2') return nobodywilleverguessthisname::flights[2];
        return deflt;
    }

We can make usage of flights within paxDB.cpp slightly less cumbersome with a using declaration:

paxDB.cpp

    #include "paxDB.h"

    namespace nobodywilleverguessthisname {
        int flights[] = { 20, 15, 0 };
    }

    int pax::getCount(char* flightNumber, int deflt) {
        using nobodywilleverguessthisname::flights;
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

This simply allows pax::getCount to refer to flights without having to specify the full namespace name each time.

If the namespace had multiple entities inside it, then instead of writing a separate using declaration for each entity, we could make the entire namespace’s contents visible with a using directive:

paxDB.cpp

    #include "paxDB.h"

    namespace nobodywilleverguessthisname {
        int flights[] = { 20, 15, 0 };
    }

    int pax::getCount(char* flightNumber, int deflt) {
        using namespace nobodywilleverguessthisname;
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

    $ g++ -S -Wall -Werror paxDB.cpp
    $ cat paxDB.s
    __ZN3pax8getCountEPci:
    LFB2:
        pushq   %rbp
    LCFI0:
        movq    %rsp, %rbp
    LCFI1:
        movq    %rdi, -8(%rbp)
        movl    %esi, -12(%rbp)
        movq    -8(%rbp), %rax
        movzbl  (%rax), %eax
        cmpb    $48, %al
        jne     L2
        movl    __ZN27nobodywilleverguessthisname7flightsE(%rip), %eax
        movl    %eax, -16(%rbp)
        jmp     L4

Note that the compiler, thanks to the using directive, converted references to flights into nobodywilleverguessthisname::flights.
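To contrast the two forms when a namespace holds more than one entity, here is a small self-contained sketch. The namespace fleet, its members seats and crew, and both function names are hypothetical, invented for this illustration; they are not part of paxDB.

```cpp
namespace fleet {                    // hypothetical namespace, for illustration only
    int seats[] = { 20, 15, 0 };
    int crew[]  = { 4, 3, 2 };
}

// A using declaration introduces exactly one name into the current scope.
int seatsOnFlight(int i) {
    using fleet::seats;              // only 'seats' can be used unqualified here
    return seats[i];                 // 'crew' would still have to be fleet::crew
}

// A using directive makes every name in the namespace visible.
int peopleOnFlight(int i) {
    using namespace fleet;           // both 'seats' and 'crew' are visible
    return seats[i] + crew[i];
}
```

Both forms are placed inside a function body here, as in paxDB.cpp, so their effect ends with the function.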

    $ g++ -c -Wall -Werror paxDB.cpp
    $ nm paxDB.o
    0000000000000058 D __ZN27nobodywilleverguessthisname7flightsE
    0000000000000000 T __ZN3pax8getCountEPci

Note that flights has external linkage again, but we hope that no-one outside of its translation unit will be able to access it, because nobodywilleverguessthename of the namespace it’s in. This might work, but there are two potential problems: The caller could find out the namespace name (even without the source code, all they’d need is to run nm on the object file) and be able to modify flights; or some other unrelated translation unit might use the same namespace name with a flights inside it, and the two would clash.
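The first problem can be sketched in a single file. In a real program the second half would live in a different translation unit; it is shown in the same file here only so that the example is self-contained. The function name tamper is hypothetical.

```cpp
// --- what paxDB.cpp defines ---
namespace nobodywilleverguessthisname {
    int flights[] = { 20, 15, 0 };
}

// --- what any other translation unit could write once its author has
//     seen the namespace name (for example in nm's output) ---
namespace nobodywilleverguessthisname {
    extern int flights[];            // declaration only; refers to the array above
}

// Hypothetical function: overwrites the passenger count for flight 0
// and returns the new value.
int tamper() {
    nobodywilleverguessthisname::flights[0] = 99;
    return nobodywilleverguessthisname::flights[0];
}
```

Nothing in the language stops the second declaration, because a name with external linkage is by definition reachable from other translation units.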

The alternative recommended by the C++ standard is the unnamed (or anonymous) namespace:

paxDB.cpp

    #include "paxDB.h"

    namespace {
        int flights[] = { 20, 15, 0 };
    }

    int pax::getCount(char* flightNumber, int deflt) {
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

cargoDB.cpp

    #include "cargoDB.h"

    namespace {
        int flights[] = { 0, 8, 9 };
    }

    int cargo::getCount(char* flightNumber, int deflt) {
        if (flightNumber[0] == '0') return flights[0];
        if (flightNumber[0] == '1') return flights[1];
        if (flightNumber[0] == '2') return flights[2];
        return deflt;
    }

Unnamed namespaces have an implicit using directive placed in the enclosing scope (here, the translation unit’s global scope). Depending on your compiler implementation, names inside an unnamed namespace will be given internal linkage; or the compiler will generate a namespace name that is guaranteed to be unique to the translation unit. (Since C++11 the standard itself specifies internal linkage.)

    $ g++ -c -Wall -Werror paxDB.cpp
    $ nm paxDB.o
    0000000000000058 d __ZN12_GLOBAL__N_17flightsE
    0000000000000000 T __ZN3pax8getCountEPci
    $ g++ -c -Wall -Werror cargoDB.cpp
    $ nm cargoDB.o
    0000000000000058 d __ZN12_GLOBAL__N_17flightsE
    0000000000000000 T __ZN5cargo8getCountEPci

It seems our compiler chooses the internal linkage method, with the same generated name for both unnamed namespaces.1

c++filt knows about namespaces too:

    $ nm paxDB.o | c++filt
    0000000000000058 d (anonymous namespace)::flights
    0000000000000000 T pax::getCount(char*, int)
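An unnamed namespace can hold more than variables. The sketch below keeps a type and a helper function private to the translation unit as well; static could be applied to the function and the array, but not to the type definition. The names Flight, table, find and passengersOn are hypothetical and chosen only for this illustration.

```cpp
namespace {
    // Everything declared here is local to this translation unit.
    struct Flight { char id; int passengers; };   // hypothetical type

    const Flight table[] = { {'0', 20}, {'1', 15}, {'2', 0} };

    // Returns a pointer to the matching entry, or a null pointer if none matches.
    const Flight* find(char id) {
        for (unsigned i = 0; i < sizeof table / sizeof table[0]; ++i)
            if (table[i].id == id) return &table[i];
        return 0;
    }
}

// Only this function is visible to other translation units.
int passengersOn(const char* flightNumber, int deflt) {
    const Flight* f = find(flightNumber[0]);
    return f ? f->passengers : deflt;
}
```

Running nm on the resulting object file should list passengersOn as the only global (upper-case T) text symbol.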

For the record, the compiler I used to generate this material is:

    $ g++ --version
    GCC 4.2.1 (Apple Inc. build 5664)
    Copyright (C) 2007 Free Software Foundation, Inc.

Notes:

  1. Presumably this compiler gives internal linkage when possible. There are some cases (in C++03, types or variables used as template arguments) where the name must have external linkage, even though you want to disallow access outside of its translation unit. See “Why is an unnamed namespace used instead of static?” from the Comeau C++ FAQ.

Questions:

Does it make any sense to put an unnamed namespace in a header file? Consider the following program. What will main return?

counter.h

    namespace {
        int counter = 0;
    }

    void count();

counter.cpp

    #include "counter.h"

    void count() { ++counter; }

main.cpp

    #include "counter.h"

    int main() {
        count();
        return counter;
    }

Include guards

Include guards are placed around the contents of a header file to prevent the contents being seen twice by the compiler:

paxDB.h

    // The guard name avoids double underscores, which are reserved
    // for the compiler and standard library.
    #ifndef PAX_DB_H
    #define PAX_DB_H

    namespace pax {
        int getCount(char* flightNumber, int deflt);
    }

    #endif

These prevent the preprocessor from outputting the contents between #ifndef and #endif a second time when the header file is included more than once in the same translation unit.

This is more likely to happen on large codebases, where a .cpp file includes many header files, and one of those in turn includes a header file already included earlier on.

Include guards do not affect the inclusion into separate translation units, so they won’t help if you are seeing duplicate symbol errors at link time.
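To see what the guard buys us, the sketch below pastes the contents of a hypothetical header, flight.h, twice into one file, which is what the compiler would see if the header were included twice. The Flight struct, the guard name FLIGHT_H and the helper function are made up for this example. A struct is used because repeating a function declaration such as getCount is harmless, whereas repeating a type definition is an error.

```cpp
// ---- first inclusion of the hypothetical flight.h ----
#ifndef FLIGHT_H
#define FLIGHT_H
struct Flight { int passengers; };
#endif

// ---- second inclusion: FLIGHT_H is already defined, so the
//      preprocessor skips everything up to the matching #endif ----
#ifndef FLIGHT_H
#define FLIGHT_H
struct Flight { int passengers; };   // without the guard this is a redefinition error
#endif

// Hypothetical helper so that the file contains something to call.
int passengersOnDemoFlight() {
    Flight f = { 15 };
    return f.passengers;
}
```

Deleting the two #ifndef/#define/#endif groups makes the compiler reject the file, because it then sees struct Flight defined twice.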

Static libraries

Multiple object files can be packaged together into a single archive called a static library.

The tool for this is ar:

    $ ar -r libFlightDBs.a paxDB.o cargoDB.o
    $ nm libFlightDBs.a | c++filt
    libFlightDBs.a(paxDB.o):
    0000000000000058 d (anonymous namespace)::flights
    0000000000000000 T pax::getCount(char*, int)
    libFlightDBs.a(cargoDB.o):
    0000000000000058 d (anonymous namespace)::flights
    0000000000000000 T cargo::getCount(char*, int)

As a library supplier, you would deliver the archive file together with the relevant header files (paxDB.h and cargoDB.h).

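The linker will look inside archive files specified with the -l flag (-lFlightDBs refers to libFlightDBs.a), and it looks for them in the locations specified with the -L flag. Unlike straight object files specified on the command line, which are always linked in whole, the linker only pulls in the archive members (object files) that define symbols actually used.1 In the example below, paxDB.o is linked in, flights array and all, but cargoDB.o is not: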

    $ g++ -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
    $ nm paxCount | c++filt
    0000000100001068 d (anonymous namespace)::flights
    0000000100000e4c T pax::getCount(char*, int)
    0000000100000ea4 T _main

Notes:

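  1. When linking several such libraries, and one library references symbols defined in another library, the order you specify the libraries on the command line matters. If library A refers to symbols in library B, the linker needs to have processed A before it gets to B, so A must be listed first. For the same reason, GNU ld generally expects a library to be listed after the source or object files that use it.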

Shared libraries

When a library is used by many different programs (think, for example, of the C POSIX library), copying the used functions into each executable program is an inefficient use of disk and memory.

Functions in shared libraries aren’t linked into an executable program directly; instead, the linker generates code that, at run time, will look up the address of the shared library’s symbols. The run-time overhead is minimal (only one extra jump, via a jump table containing the addresses of all shared library symbols used by the program).
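The indirection can be pictured with ordinary function pointers. This is only an analogy written in portable C++, not what the linker actually emits: the names libraryGetCount, JumpTable, bindSymbols and callGetCount are hypothetical, and a real loader fills in addresses from the shared library rather than from a function in the same file.

```cpp
// Stand-in for a function that lives in the shared library.
int libraryGetCount(const char* flightNumber, int deflt) {
    return flightNumber[0] == '1' ? 15 : deflt;
}

// One slot per shared-library symbol that the program uses.
struct JumpTable {
    int (*getCount)(const char*, int);
};

JumpTable table = { 0 };             // the slot is empty until "load time"

// Conceptually what the dynamic loader does: resolve each slot once.
void bindSymbols() {
    table.getCount = &libraryGetCount;
}

// Conceptually what a call site does: one extra jump through the table.
int callGetCount(const char* flightNumber, int deflt) {
    return table.getCount(flightNumber, deflt);
}
```

Because callers only ever go through the table, the library function can be placed at a different address, or replaced by a newer version, without the callers being recompiled.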

At run time, only one copy of the shared library needs to be loaded in memory, regardless of how many different programs are using it. Another advantage is that a shared library can be upgraded independently of the programs that use it (as long as the library’s interface hasn’t changed).

To generate a shared library, the object files must be compiled with the -fPIC option, which tells gcc to generate position-independent code (so that, for example, function calls won’t depend on the function definition being at a particular position in memory).

    $ g++ -c -fPIC -Wall -Werror paxDB.cpp
    $ g++ -c -fPIC -Wall -Werror cargoDB.cpp

To build the shared library, we use gcc’s -shared flag. Depending on your system, you might need to pass certain options directly to the linker; if so, use gcc’s -Wl,option flag, which will pass option as a command-line parameter to the linker. See man ld for linker-specific options.

    $ g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
    $ nm libFlightDBs.so | c++filt
    0000000000001014 d (anonymous namespace)::flights
    0000000000001008 d (anonymous namespace)::flights
    0000000000000e50 T pax::getCount(char*, int)
    0000000000000ea8 T cargo::getCount(char*, int)
    0000000000000000 t __mh_dylib_header
                     U dyld_stub_binder

After we compile a program that uses pax::getCount from the shared library, we can see that the function definition isn’t included in the program binary:

    $ g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
    $ nm paxCount | c++filt
                     U pax::getCount(char*, int)
    0000000100000ee4 T _main

When we execute the program, the OS first invokes the dynamic linker (or loader), which loads the required shared libraries. The dynamic loader searches for libraries in standard locations like ‘/usr/lib’, as well as (on Linux) the directories specified by the environment variable LD_LIBRARY_PATH.1 In my environment (OS X) this is DYLD_LIBRARY_PATH instead:

    $ DYLD_LIBRARY_PATH=. ./paxCount 1 ; echo $?
    15

On Linux, ldd (list dynamic dependencies) will print the shared libraries required by the program (on OS X use otool instead):

    $ otool -L paxCount
    paxCount:
            libFlightDBs.so (compatibility version 0.0.0, current version 0.0.0)
            /usr/lib/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.9.0)
            /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.1)

Notes:

  1. But LD_LIBRARY_PATH/DYLD_LIBRARY_PATH can be problematic. See “Why LD_LIBRARY_PATH is bad”. Common solutions are to install your shared library into one of the system default locations, or to hard-code the full run-time path of the library into the executable, using the linker’s -R or -rpath option.

Makefiles

No discussion of C++ compilation would be complete without mentioning make (though we will only touch on it briefly).

A makefile contains a set of rules. Each rule specifies a target (or multiple targets), prerequisites, and a recipe (a shell command) for generating the target from its prerequisites. For example:

makefile

    paxCount: paxCount.cpp paxDB.h libFlightDBs.so
    	g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp

    libFlightDBs.so: paxDB.o cargoDB.o
    	g++ -shared -fPIC -o $@ $^

    paxDB.o cargoDB.o: %.o: %.cpp %.h
    	g++ -c -fPIC -Wall -Werror $<

    clean:
    	rm paxCount libFlightDBs.so paxDB.o cargoDB.o

The first rule specifies how to build the paxCount executable (the recipe should look very familiar to you). Each recipe line must begin with a tab character (not spaces). The recipe will be re-run whenever paxCount.cpp, paxDB.h or libFlightDBs.so changes (make compares the timestamps of the prerequisites against the timestamp of the target).

The second rule specifies how to build the shared library. It uses make’s automatic variables, where $@ means the name of the target and $^ means the names of all the prerequisites with spaces between them.

The third rule is a pattern rule. Its effect is the same as specifying separate rules for each of the object files: A rule with target paxDB.o and prerequisites paxDB.cpp and paxDB.h; and another similar rule for cargoDB.o. It uses the automatic variable $< which means the name of the first prerequisite.

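The final rule specifies how to remove all generated files. It has no prerequisites, so its recipe will be run whenever you specify the target name (make clean), provided no file named clean exists in the directory; declaring the target as phony with .PHONY: clean removes that caveat.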

If we type make paxCount, make will figure out from the prerequisites that it needs to build libFlightDBs.so; and to build that, it needs to build paxDB.o and cargoDB.o.

    $ make paxCount
    g++ -c -fPIC -Wall -Werror paxDB.cpp
    g++ -c -fPIC -Wall -Werror cargoDB.cpp
    g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
    g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp

If we run make again, it won’t do anything because none of the source files have changed. But if they have changed, make will rebuild only the targets affected:

    $ make paxCount
    make[1]: `paxCount' is up to date.
    $ touch paxDB.cpp
    $ make paxCount
    g++ -c -fPIC -Wall -Werror paxDB.cpp
    g++ -shared -fPIC -o libFlightDBs.so paxDB.o cargoDB.o
    g++ -fPIC -Wall -Werror -o paxCount -L. -lFlightDBs paxCount.cpp
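The decision make takes for each target can be sketched as a small function. This is a simplified model and not make’s actual implementation: the function name needsRebuild is hypothetical, and timestamps are plain integers rather than real file modification times.

```cpp
#include <cstddef>
#include <vector>

// Returns true if the recipe for a target should be run.
//   targetExists - whether the target file is present
//   targetTime   - its modification time (ignored if the target is absent)
//   prereqTimes  - modification times of all its prerequisites
bool needsRebuild(bool targetExists, long targetTime,
                  const std::vector<long>& prereqTimes) {
    if (!targetExists) return true;              // nothing has been built yet
    for (std::size_t i = 0; i < prereqTimes.size(); ++i)
        if (prereqTimes[i] > targetTime)         // a prerequisite is newer
            return true;
    return false;                                // target is up to date
}
```

After touch paxDB.cpp, the source file is newer than paxDB.o, so the check for paxDB.o succeeds; rebuilding paxDB.o makes it newer than libFlightDBs.so, and the same reasoning carries on up to paxCount, which matches the three commands in the transcript above.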

In large projects, tracking the prerequisites of a .cpp file manually becomes impossible (every header included by the file, directly or indirectly, is a prerequisite). gcc -M will generate a list of prerequisites in makefile format (-MM does the same but leaves out system headers):

    $ g++ -M paxDB.cpp
    paxDB.o: paxDB.cpp paxDB.h

(Integrating this output with the project’s makefiles is beyond the scope of this tutorial; see “Generating Prerequisites Automatically” in the GNU make manual.)

Further reading: