10: Separate Compilation

UC Irvine - Fall ‘22 - ICS 45C

Last updated on Oct 17, 2022

Quick list of things I want to talk about:

libraries
.h and .cpp files
#ifdef protection
namespaces

Expanded notes:

Now that we can break down our code into smaller functions, you might end up with some functions that can be used in a bunch of different projects.

If that happens, instead of copying and pasting that same function to all your .cpp files, you can create a library that you can simply include into your new source files.

The way this is usually done is by having a header file (.h) and a source file (.cpp). They usually will have the same name and should live in the same directory, but you will include only the header file.

For the rest of this note, let’s say you created a function that searches for an element inside an array and returns its index or -1 if it’s not there:

int find(int array[], int size, int lookup_value) {
  for (int i=0; i < size; i++) {
    // Compare each element in the array with the value we're searching.
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return -1;  // to indicate we didn't find the value.
}

So we have a first function that might be helpful, but we’ll probably create more later on. So we decide to call our files helper.h and helper.cpp.

Header files (.h)

The header file contains what someone who wants to use your library needs to know. For the example function above, a user needs to know what the function expects and what it returns. How we implement the search doesn’t matter to them, only that we return the index given those values.

So in header files you’ll usually find type definitions (structs/unions/classes, which we’ll see later on) and function declarations. Function declarations are usually given as prototypes.

Function prototypes

A function prototype (also called a function declaration) is just the signature of a function without its implementation. For the function we defined above, both of the following are valid prototypes:

// Option 1: using only types.
int find(int[], int, int);

// Option 2: types and variable names.
int find(int array[], int size, int lookup_value);

Keep in mind that you’ll only use one of them!
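To see what a prototype buys you, here is a minimal single-file sketch. It assumes a hypothetical helper called contains (not part of these notes) that calls find before the body of find has appeared; this only compiles because the prototype comes first.

```cpp
// Prototype: name, parameter types, and return type, but no body.
int find(int array[], int size, int lookup_value);

// Hypothetical helper (not from the notes). It can call find() here because
// the prototype above has already told the compiler what find() looks like.
bool contains(int array[], int size, int lookup_value) {
  return find(array, size, lookup_value) != -1;
}

// The definition can come later in the file (or in another source file).
int find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return -1;  // to indicate we didn't find the value.
}
```

If you delete the prototype line, the call inside contains should no longer compile, because at that point the compiler has never heard of find.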

Besides that, the user should also know when the value was not found. So we can also add a constant to the header file:

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;
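As a rough illustration of why the named constant helps, the sketch below shows a caller comparing the result against NOT_FOUND instead of a bare -1. The describe_lookup function is a made-up example, and everything is kept in one file here only to keep the sketch short.

```cpp
#include <string>

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;

int find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;
}

// Hypothetical caller: it never needs to know that NOT_FOUND happens to be -1.
std::string describe_lookup(int array[], int size, int lookup_value) {
  int index = find(array, size, lookup_value);
  if (index == NOT_FOUND) {
    return "not found";
  }
  return "found at index " + std::to_string(index);
}
```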

Example header file

Here’s an example header file combining the above:

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);

Source files (.cpp)

The source files are the actual implementation of the code. If you provided a prototype to a user in the header file, here you will implement the code that makes that function work.

First you will include the header file you’re implementing, and then define the functions’ bodies.

Any external libraries that you need only for the implementation also go in the source file. If you need any auxiliary functions, they can be defined directly here. Anything you include or define only inside the source file is not exposed through the header, so you control what’s accessible to users and what’s not.
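One common way to keep an auxiliary function private is to mark it static in the source file. The sketch below squeezes the header part and the source part into a single listing for brevity; is_match is a hypothetical helper invented for this example, not something from the notes.

```cpp
// ---- What would go in helper.h: the public interface ----
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);

// ---- What would go in helper.cpp: implementation details ----

// Hypothetical auxiliary function. `static` gives it internal linkage, so it
// is only usable from inside this source file.
static bool is_match(int element, int lookup_value) {
  return element == lookup_value;
}

int find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    if (is_match(array[i], lookup_value)) {
      return i;
    }
  }
  return NOT_FOUND;
}
```

Users who include helper.h only see NOT_FOUND and find; is_match never appears in the header, so it is not part of what the library promises.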

Example source file

A valid source file (helper.cpp) for our example function above would be this:

#include "helper.h"

int find(int array[], int size, int lookup_value) {
  for (int i=0; i < size; i++) {
    // Compare each element in the array with the value we're searching.
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;
}

First, it includes the definitions (i.e., the header file we’re implementing). Note the quotes "" instead of angle brackets <>; we’ll discuss why we use them in the next section.

By doing that we have the definitions available and any structures/variables defined.

Then, we actually have the code for each function prototype we defined in our header file.

Including a user-defined library

Assuming you have created the files as above, you can test if they work. For example, you can create a file called test_main.cpp with this code:

#include <iostream>
#include "helper.h"

using namespace std;

int main() {
  int values[]{1,2,3,4,5,6,7,8,9,10};
  cout << find(values, 10, 6);

  return 0;
}

Note that, again, we use quotes around helper.h. The difference is that:

when you use quotes, the compiler typically looks first next to the file doing the include (your “local files”) and then falls back to the directories of the libraries you have installed;
when you use angle brackets, it only looks in the directories of the libraries you have installed.

However, if you try compiling this code like this:

g++ test_main.cpp -std=c++11 -Wall -Wextra -Wpedantic -Werror -o test_main

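You’ll get an error saying either that the symbol is undefined or that the template couldn’t be matched. Both of these mean the same thing: the declaration from helper.h was found, but the implementation of find in helper.cpp was never handed to the compiler, so the linker has nothing to connect the call to.

That’s because we didn’t tell the compiler where to look! So…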

Compiling multiple files

If you’re trying to compile multiple files like above, you’ll need to give all of them to the compiler. For the previous example, we have test_main.cpp and helper.h/.cpp. So we need to give all source files to the compiler:

g++ test_main.cpp helper.cpp -std=c++11 -Wall -Wextra -Wpedantic -Werror -o test_main

The order here doesn’t matter, because each source file is compiled and then everything is linked together before looking for the main function. So you can only have one main function across all of these files!

Multiple inclusion protection

If you create a very helpful library (e.g., iostream), you can imagine that many files would try to use it. If that happens, the same header can reach a single source file more than once, directly and through other headers that include it, and the compiler doesn’t know that they are the same thing.

So what can happen is that the compiler says some name is being redefined, because two includes each try to define that variable/function with the same name.

Let’s see that happening! Create a second helper file helper2.h like this¹:

#include "helper.h"

int my_var = 2;

and change your test_main.cpp to be like this:

#include <iostream>
#include "helper.h"
#include "helper2.h"

using namespace std;

int main() {
  int values[]{1,2,3,4,5,6,7,8,9,10};
  cout << find(values, 10, my_var);

  return 0;
}

You should get an error when compiling:

In file included from test_main.cpp:3:
In file included from ./helper2.h:1:
./helper.h:6:11: error: redefinition of 'NOT_FOUND'
const int NOT_FOUND = -1;
          ^
test_main.cpp:2:10: note: './helper.h' included multiple times, additional include site here
#include "helper.h"
         ^
./helper2.h:1:10: note: './helper.h' included multiple times, additional include site here
#include "helper.h"
         ^
./helper.h:6:11: note: unguarded header; consider using #ifdef guards or #pragma once
const int NOT_FOUND = -1;
          ^
1 error generated.

What happened here is that we created the NOT_FOUND variable once when we included helper.h directly, and then a second time when helper2.h included it again.

The way to solve this is by using an include guard. An include guard is a compile-time (preprocessor) if statement that makes sure the header’s contents are only included once per source file.

As an example, let’s see a modified version of our helper.h:

#ifndef __MYHELPER_H__
#define __MYHELPER_H__

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);

#endif  // __MYHELPER_H__

Let’s take a look at the changes.

We added a #define __MYHELPER_H__. We’ve seen #define before; it defines a compile-time name (a preprocessor macro). In this case, we don’t want any value in it, we just want that name to exist at compilation time.
#ifndef is a special compile-time check that decides whether certain pieces of code are passed on to the compiler or skipped. Here, we are saying that if __MYHELPER_H__ has not been defined, we want to include this code. The block ends at the #endif directive, so the entire file is protected by it.

After the first include, we will have defined __MYHELPER_H__. So when we try to include it again, nothing will happen, because we are preventing that code from being “re-included”. One caveat: names containing a double underscore are reserved for the compiler and the standard library, so in your own projects a plain name such as MYHELPER_H is the safer choice.

After making these changes, you can try compiling again and it should work!
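To watch the guard work without juggling several files, the sketch below pastes the guarded header contents twice into one listing, which is roughly what the preprocessor sees after a double include. The macro name MYHELPER_H_DEMO is made up for this example.

```cpp
// First "inclusion": MYHELPER_H_DEMO is not defined yet, so this block is kept.
#ifndef MYHELPER_H_DEMO
#define MYHELPER_H_DEMO
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);
#endif  // MYHELPER_H_DEMO

// Second "inclusion": the macro now exists, so the preprocessor skips
// everything up to #endif and NOT_FOUND is not defined a second time.
#ifndef MYHELPER_H_DEMO
#define MYHELPER_H_DEMO
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);
#endif  // MYHELPER_H_DEMO

// Implementation, as it would appear in helper.cpp.
int find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;
}
```

Removing the #ifndef/#define/#endif lines from both copies should bring back the redefinition error for NOT_FOUND.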

Namespaces

Now we have a nice helper library that has an include guard, and all looks fine. But find is a very common name, and other libraries might have different functions with that same name that do other things.

To solve this problem, we can use namespaces. Namespaces let you define the scope of your functions under an identifiable name.

For example, in this case we could create a namespace called MyCustomHelper. But for different projects you could have more specific namespaces, like ParkingLotManager or FoodOrdering.
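As a small sketch of the name clash being avoided, the two namespaces below each contain their own find. The namespace names are the ones mentioned above, but the functions inside them are hypothetical and exist only to put two different finds side by side.

```cpp
namespace ParkingLotManager {

// Hypothetical: returns the index of the spot holding car_id, or -1 if the
// car is not parked here.
int find(int spots[], int size, int car_id) {
  for (int i = 0; i < size; i++) {
    if (spots[i] == car_id) {
      return i;
    }
  }
  return -1;
}

}  // namespace ParkingLotManager

namespace FoodOrdering {

// Hypothetical: returns true when order_id is among the pending orders.
bool find(int orders[], int size, int order_id) {
  for (int i = 0; i < size; i++) {
    if (orders[i] == order_id) {
      return true;
    }
  }
  return false;
}

}  // namespace FoodOrdering
```

Both functions share the name find and the same parameter types, which would be an error in a single scope. Because each lives in its own namespace, a caller picks one explicitly with ParkingLotManager::find(...) or FoodOrdering::find(...).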

Using namespaces on header vs source files

In header files, you usually create namespaces surrounding all declarations.

Revisiting our previous example:

#ifndef __MYHELPER_H__
#define __MYHELPER_H__

namespace MyCustomHelper {

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);

}  // namespace MyCustomHelper

#endif  // __MYHELPER_H__

Then, in the source file, you would specify the namespace in the function names using NAMESPACE::function.

Revisiting our previous example:

#include "helper.h"

int MyCustomHelper::find(int array[], int size, int lookup_value) {
  for (int i=0; i < size; i++) {
    // Compare each element in the array with the value we're searching.
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;
}

By doing that, we are specifying when our find function should be used, and that’s only when the user wants to use the find from MyCustomHelper namespace.

MyCustomHelper is probably a bad name for a namespace since it doesn’t tell us what it’s being used for, but you could make it more meaningful like the examples above (ParkingLotManager, FoodOrdering).

Revisiting `using namespace std;`

Ok, now that we know what a namespace is and what it’s used for, let’s revisit this instruction.

So far, in pretty much all our code we add using namespace std; at the top after the includes. What this does is tell the compiler to also look inside the std (standard) namespace when it sees a name without a namespace in front of it, so all functions/variables/etc. defined in that namespace are accessible directly, without the std:: prefix.

But since we might start creating our own namespaces, we can either modify that instruction or use fully-specified names.

For example, you can define which names you’re using:

using std::cout;
using std::cin;
using MyCustomHelper::find;

Which tells the compiler that when you say cout, you actually mean std::cout. cin maps to std::cin, and find maps to MyCustomHelper::find.

You could also just write std::cout everywhere you need to print something. That way you’re not asking the compiler to figure out namespaces for you; you’re specifying them yourself. This is useful when you need to use functions with the same name from different namespaces, so you can specify MyCustomHelper::find, OtherNamespace::find, and so on.
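A real case of two same-named functions is the standard library’s own std::find (from the algorithm header), which takes a range rather than an array and a size. The sketch below uses fully-specified names for both; the two wrapper functions are hypothetical and only there to show the calls.

```cpp
#include <algorithm>

namespace MyCustomHelper {

const int NOT_FOUND = -1;

int find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;
}

}  // namespace MyCustomHelper

// Hypothetical wrapper: explicitly asks for our own find.
int index_with_custom_find(int array[], int size, int lookup_value) {
  return MyCustomHelper::find(array, size, lookup_value);
}

// Hypothetical wrapper: explicitly asks for the standard library's find.
bool present_with_std_find(int array[], int size, int lookup_value) {
  // std::find searches the range [array, last) and returns a pointer to the
  // first match, or `last` when there is no match.
  int* last = array + size;
  return std::find(array, last, lookup_value) != last;
}
```

With the namespace written out at each call, there is no question about which find is meant, even though both are available in the same file.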

So after adding namespaces to helper.h/.cpp, we can modify our test_main.cpp like this:

#include <iostream>
#include "helper.h"

using std::cout;
using MyCustomHelper::find;

int main() {
  int values[]{1,2,3,4,5,6,7,8,9,10};
  cout << find(values, 10, 2);

  return 0;
}

Since we’re not using other couts or finds, it’s fine to have the using declarations at the top.

Nested namespaces

It’s possible to have nested namespaces too. You would just create them one inside the other, and then specify each level.

For example:

namespace NS1 {
namespace NS2 {
namespace NS3 {

bool my_function() {
  return true;
}

}  // namespace NS3
}  // namespace NS2
}  // namespace NS1

bool is_true = NS1::NS2::NS3::my_function();

So you could have something meaningful like UCI::ACM::Comp1::sort() if you want to define a custom sort without mixing up with any existing ones.
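Long chains like these get tedious to type. One option, sketched below, is a namespace alias, which gives the whole chain a shorter local name. The alias name Deep and the function call_through_alias are made up for this example.

```cpp
namespace NS1 {
namespace NS2 {
namespace NS3 {

bool my_function() {
  return true;
}

}  // namespace NS3
}  // namespace NS2
}  // namespace NS1

// Hypothetical alias: Deep now refers to the same namespace as NS1::NS2::NS3.
namespace Deep = NS1::NS2::NS3;

// Calls the exact same function as NS1::NS2::NS3::my_function().
bool call_through_alias() {
  return Deep::my_function();
}
```

The alias does not create a new namespace; it is only another way to spell the existing one.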

Partial compilation

Sometimes you might want to compile a file without your project being completed yet. For example, if you wrote a library without a main function, you can compile just that part and combine it with the rest later on. This is done using the -c flag:

g++ file1.cpp -std=c++11 -Wall -Wextra -Wpedantic -Werror -c

This should create a file named file1.o.

Later on, when you actually want your final binary, you can use these .o files instead of the regular .cpp ones:

g++ file1.o file2.o file3.o -std=c++11 -Wall -Wextra -Wpedantic -Werror -o my_binary

This can come in handy when you have many files and only a few of them change. In such cases, you can keep partially compiled objects for all files and replace only the ones where the source was modified. But at that point, you probably should be looking into some build environment (e.g., make, cmake, bazel).

Complete example

helper.h:

#ifndef __MYHELPER_H__
#define __MYHELPER_H__

namespace MyCustomHelper {

// Return value that indicates lookup_value was not found.
const int NOT_FOUND = -1;
int find(int array[], int size, int lookup_value);

}  // namespace MyCustomHelper

#endif  // __MYHELPER_H__

helper.cpp:

#include "helper.h"

int MyCustomHelper::find(int array[], int size, int lookup_value) {
  for (int i = 0; i < size; i++) {
    // Compare each element in the array with the value we're searching.
    if (array[i] == lookup_value) {
      return i;
    }
  }
  return NOT_FOUND;  // to indicate we didn't find the value.
}

test_main.cpp:

#include <iostream>

#include "helper.h"

using MyCustomHelper::find;
using std::cout;

int main() {
  int values[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
  cout << find(values, 10, 5);

  return 0;
}

References

Comic reference: https://xkcd.com/303/

¹ This is a header-only library: everything we need is here, and there is no source file! For such libraries, we don’t need to add anything to the list of files when compiling.