
Speeding up build times with Master/Unity/Bulk Builds

C++ is great for all sorts of things, but sooner or later the project starts suffering from long build times. There are lots of tips for reducing them; however, most will take many person-hours to put into practice (e.g. fixing the dependencies, removing Boost, fixing your template recursion). One solution is fairly easy to adopt. Let’s look at how unity builds helped decrease our build times in Diamond Digger Saga here at King.

First, unity builds have nothing to do with the Unity engine. Second, they are sometimes called bulk or master builds. A unity build uses the preprocessor to unify many translation units (.cpp files) into a single one. To understand this better, let’s take a quick look at how C++ goes from lines of code to an executable:

[Figure: build pipeline for the foobar example]

The trivial example builds a ‘foobar’ executable from two files (foo.cpp and bar.cpp) that both include a set of headers. When the preprocessor runs, each .cpp file is converted into a .i file: a copy of the .cpp file with the contents of the .h files pasted in and all preprocessor directives (#pragma, #define, #ifdef, etc.) applied. The compiler is then invoked, turning each .i file into an .obj file. The linker then takes the collection of .obj files and links them into one executable.

In a unity build we ask the preprocessor to do more work so that we can have the compiler and linker do less. Under a unity build our example would look like this:

[Figure: the foobar example as a unity build]

It’s easy to see why this is faster! The preprocessor and the compiler each run one fewer time, and the linker has one fewer file to link. While the resulting master.cpp file is bigger, in reality the .cpp file usually contributes a very small percentage of the lines of code in the resulting .i file (the file the compiler has to process).
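To make the idea concrete: a master file is nothing more than a .cpp file that #includes other .cpp files. The sketch below builds the text of such a file from a list of source names. The function name makeMasterSource is a placeholder of ours, and generating the text in C++ is just for illustration; it is not the tooling used on the game.

```cpp
#include <string>
#include <vector>

// Builds the text of a master (unity) source file: one #include line per
// original .cpp file, in the order given. The name makeMasterSource is a
// hypothetical helper, not part of any real tool.
std::string makeMasterSource(const std::vector<std::string>& sources) {
    std::string text;
    for (const std::string& source : sources) {
        text += "#include \"" + source + "\"\n";
    }
    return text;
}
```

For the foobar example, makeMasterSource({"foo.cpp", "bar.cpp"}) produces a two-line master.cpp that includes foo.cpp and then bar.cpp. Only master.cpp is handed to the compiler; foo.cpp and bar.cpp are no longer compiled on their own.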

Real-world example

Let’s look at one of our games and compare the unity solution to the non-unity one. The names have been changed, but the content is real. We look at the lines of code (LOC) in both the .cpp file and the .i file produced after the preprocessor does its magic.

File      LOC (.cpp)    LOC (.i)    Percentage of the master
master             -      215140    100%
foo              720      184980     86%
bar             1660      201720     94%
mar               24      107816     50%
far               96      146846     68%
bob               38      108010     50%
bing              30      142239     66%
test              76      108290     50%
bot               36      116062     54%
del               40      107822     50%
ser              100      108215     50%
ver                9      107803     50%

With the loose build the compiler has to compile 1439 kLOC. With the unity build it needs to compile just 215 kLOC. That is roughly a seven-fold improvement. What is amazing is that even a simple nine-line .cpp file is transformed into a 107803-line .i file. That’s a lot of expansion of #includes.
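The totals can be checked directly from the table. The short sketch below sums the .i line counts of the eleven loose files and divides by the master's count; the constant and function names are ours, chosen only for this illustration.

```cpp
#include <numeric>
#include <vector>

// Preprocessed (.i) line counts copied from the table, master excluded.
const std::vector<long> kLooseLoc = {
    184980, 201720, 107816, 146846, 108010, 142239,
    108290, 116062, 107822, 108215, 107803};

// Line count of the single preprocessed master file.
const long kMasterLoc = 215140;

// Total lines the compiler sees in the loose (non-unity) build.
long looseTotal() {
    return std::accumulate(kLooseLoc.begin(), kLooseLoc.end(), 0L);
}

// How many times more lines the loose build compiles than the unity build.
double looseToUnityRatio() {
    return static_cast<double>(looseTotal()) / kMasterLoc;
}
```

The sum comes to 1439803 lines and the ratio to about 6.7, which is where the seven-fold figure comes from.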

Now you may be thinking: that’s great for a full rebuild, but my use case is touching one .cpp file, and compiling 100k lines is faster than compiling 200k! That is true, but the startup cost of the compiler process is a big portion of the time, and then there is the linker step. The linker has to combine all the .obj files into one .exe. Part of this is fulfilling the ‘one definition rule’: each symbol (class, enum, variable, etc.) must be made unique, which is at best an O(N^2) operation. A unity build has far fewer object files. Our game has ~200 source files and nine master files, and 40000 is much, much slower than 81 (200^2 vs 9^2). In other words, the time you save by compiling fewer LOC in the loose build you waste in linking.

How many master files?

Finding the right balance is important. The trick is keeping the workload for the compiler and linker in balance. It is also important to note that most computers have at least four cores now, which means having one master file will result in 75% of your cores sitting idle. Some experimentation is needed to find a good balance. A quick rule of thumb is to start with 8-12 master files and adjust as needed.

How you make the master files is also important: you want to group like with like to minimize the final number of lines of code. As most of the code comes from the #includes, grouping source files that share the same set of #includes will generate the most savings. In other words: put AI code together, UI code together, rendering together, etc. Also, put the audio code together. I only mention this as audio always seems to get forgotten!
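As a rough sketch of grouping like with like, the function below buckets source paths by their leading directory, on the assumption (ours, not stated above) that the directory layout mirrors the subsystems, e.g. ai/, ui/, audio/. Each bucket would then become one master file. The function name and the "misc" fallback are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Groups source paths by their leading directory, so that
// "ai/pathfinder.cpp" lands in the "ai" group. Assumes '/' separators.
std::map<std::string, std::vector<std::string>> groupBySubsystem(
    const std::vector<std::string>& paths) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const std::string& path : paths) {
        const std::string::size_type slash = path.find('/');
        // Files with no directory part go into a catch-all "misc" group.
        const std::string key =
            (slash == std::string::npos) ? "misc" : path.substr(0, slash);
        groups[key].push_back(path);
    }
    return groups;
}
```

If a directory turns out much larger than the others, it may be worth splitting it across two master files so that the cores stay evenly loaded.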

What’s the downside?

If it’s so great, why isn’t everyone using it, right? Well, there are some dangers! First, it requires good design and discipline. Having all of the sources together means that everything is in the same translation unit, so one can’t hide one’s privates. To hide the privates completely, one can use named namespaces. Compiler errors also become a bit obscure. We identify three major downsides: duplicate symbols, #defines, and poor include hygiene. Let’s look at each one in turn.

Duplicate symbols

Take these files:

// Foo.cpp
const char* const ACTION = "action";

// Bar.cpp
const char* const ACTION = "action";

In the unity build this causes a duplicate symbol problem, as the master file will have two definitions of ACTION. Note: it only works in the loose build because a const object at namespace scope has internal linkage in C++, so each translation unit gets its own private copy.
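One possible fix, following the named-namespace idea mentioned earlier, is to wrap each file's private constants in a namespace named after the file. The sketch below shows both files' contents side by side, as they would appear inside a master file. The namespace names foo_detail and bar_detail and the two accessor functions are ours, invented for the illustration.

```cpp
#include <string>

// Contents that would live in Foo.cpp: the constant sits in a namespace
// named after the file, so it cannot clash with Bar.cpp's constant.
namespace foo_detail {
const char* const ACTION = "action";
}
std::string fooAction() { return foo_detail::ACTION; }

// Contents that would live in Bar.cpp.
namespace bar_detail {
const char* const ACTION = "action";
}
std::string barAction() { return bar_detail::ACTION; }
```

An unnamed namespace would not help here: within one translation unit all unnamed namespaces are the same namespace, so the two definitions would still collide once the files share a master.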

#defines

Defines are especially tricky. There are more #defines than #undefs in an average code base. As such, combining translation units will cause #define pollution.

For example, say windows.h defines this:

#define PlaySound PlaySoundA

And foo.cpp has:

sounds->PlaySound(Id);

The compiler will tell you something like ‘PlaySoundA’: is not a member of ‘Sounds’, and the cause will take a long time to find!
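A minimal reproduction of this, with one way to defuse it, might look like the sketch below. The Sounds class and the playThroughPointer helper are hypothetical stand-ins, and the #define line imitates what the platform header does rather than including the real header.

```cpp
// Hypothetical game-side class, seen by the compiler before the macro exists.
struct Sounds {
    int PlaySound(int id) { return id; }  // returns the id purely for demonstration
};

// Stand-in for the macro that a later file in the master drags in
// through a platform header.
#define PlaySound PlaySoundA

// Without this #undef, the call below would expand to
// sounds->PlaySoundA(id) and fail to compile.
#undef PlaySound

int playThroughPointer(Sounds* sounds, int id) {
    return sounds->PlaySound(id);  // refers to Sounds::PlaySound again
}
```

The trade-off is that code after the #undef can no longer reach the platform function through the macro name and has to call PlaySoundA explicitly, so this is a judgment call rather than a universal fix.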

Poor include hygiene

When adding code to a fifth file, the compiler will already know about all of the #includes that were part of files 1 through 4. This makes it very likely that one will forget to add the extra #include at the top of file 5, which will cause problems if your master files are ever reordered.

Summary

In short, going to a master build file system will be a giant improvement in your build time for very little effort. Most likely a few hours will be required to rename the duplicate global symbols, cry about the macros in your code base, and find some amazing code horror stories. In our test case the build went from an average of 3.5 minutes to 30 seconds. As a last piece of advice, writing a script that generates your master files is highly recommended (if you work here, feel free to use ours). The question is: how many new features can you add now that you’re not waiting for things to compile?

Tomasz Niwinski

About Tomasz Niwinski

Tom has been making games for almost two decades and continues to have the privilege of working on some great games with some outstanding teams. Following the scent of adventure, he left AAA console games on the wet coast(*) of Canada for the sunny shores of Barcelona (the only change for the family was “everything”; though he continues to be a strict part time vegetarian). When not in front of the computer, he is failing at KOM’ing Strava segments around Catalunya. (*) While Vancouver also happens to be on the west coast, the rain is what gives it its name.
