Distributed software compilation for the Raspberry Pi

Lead Image © Brett Giza, 123RF.com

Slices of Pi

Distributed compiling with distcc offloads the CPU-intensive compilation tasks from the Raspberry Pi to other computers, saving you days of time and frustration.

The Pi is wonderful and all, but it is not really ideal for compiling. Try to build anything more complex than a "Hello World" program, and you will lock it up for hours. However, Raspbian runs compiled programs, so how did they get there? You can compile programs for the Raspberry Pi in several ways and, interestingly, they all involve removing the Rasp Pi's hardware from the equation.

One option is to compile using a tool chain, which is exactly what it says on the box: a series of tools you chain together on a regular non-Rasp Pi computer and through which you pipe the source code of a program. Out the other end pops the compiled version of the program ready for your Pi. The Raspberry Pi Foundation distributes an official tool chain [1].

I can think of two problems with tool chains. First, you need to replicate the Rasp Pi environment by copying over directories to the machine that is going to do the compiling. This, in itself, is not too difficult, but if you come across an unexpected, unmet dependency while compiling, you then have to go back to your Pi, install the packages you need, copy over the directories again, and restart the compile … and you have to do this every time the compile borks.

Second, you run into the problem of not actually being able to test your program until you copy it onto the Pi and try it out. If you skip a file by accident or fudge the install, you can spend hours or days trying to figure out what you missed.

Your second option is to use a virtual machine. You can't get Raspbian running on VirtualBox (VirtualBox only does x86, not ARM, architectures), but it does work on Qemu [2]. The idea is you decompress a Raspbian image file, mount it, and run it as a virtual SD card on Qemu. The biggest problem with this method is that virtual machines tend to be sloooooow, and resource hungry! Even on a modern multicore machine, you're going to be sucking up at least one core and, to make the experience less painful, probably two. Therefore, while compiling (which is a CPU-intensive task), you will seriously hamper any other heavy-duty activities, like playing a video game, using a design program, or watching a high-resolution movie.

The third option is distributed compiling, and this is the most intriguing of them all. The idea here is that the Rasp Pi works as a master and farms out the job of compiling to other computers on the network (aka nodes). The Rasp Pi "thinks" it is compiling locally, and all unmet dependencies are dealt with directly on the running Pi – no going back and forth.

Once set up, the nodes doing the real heavy lifting can be headless, so they can be idle print servers, file servers, or whatever you have lying around in your office or home. Even old computers can do the job decently well. This means you don't have to tie up your own computer in a CPU-intensive task. Of course, you can use your own computer as a node, but you don't have to. Because the nodes only do one thing – compile – and don't need to run a virtual environment, they are fast, or at least much faster than using a virtual machine.

Finally, there's an app for that: It's called distcc [3], and it is available in the Raspbian repositories and for most other Linux distributions. (See also the "Aim of the Game" box.)

Aim of the Game

This article came about because I wanted to port a program I had written for the Arduino 101 in a previous article to the Rasp Pi. In that article [4], I demoed how to use the gyroscope on the Arduino 101 by waving it around to move the model of a 3D helicopter on the screen of my laptop. However, as a contributor to a magazine that has Raspberry Pi in its name, I was bugged by the fact that the bits and pieces didn't work on the Rasp Pi.

Unfortunately, Panda3D [5], one of the cornerstones of the project, has no native package for the Rasp Pi. Compiling Panda3D on the Pi is nearly impossible because of the resources it sucks up in the process. I tried once, and it took more than 24 hours to reach about 30%, and then it just stopped, locking up the Pi completely.

Looking for ways to compile Panda3D led me down the rabbit hole of distributed compiling. And here we are. Although this article might seem a bit dry at the beginning, stick with me: The payoff is pretty great and includes cool things such as animated 3D graphics and gesture-controlled devices.

distcc on the Pi

Installing distcc on the Rasp Pi is straightforward:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install distcc

Configuring is a bit more complicated. First, you have to edit distcc's /etc/distcc/hosts file. This file contains the list of names or IPs on which distcc will compile. Comment out the line that says +zeroconf and add the IPs of your nodes, one per line. For example, I am only going to use one node, a quad-core i5 that serves as a printer/scanner server on my home network and lives at 192.168.1.24. This machine is idle most of the time, so it is ideal. Adding the line

192.168.1.24

allows distcc to use that computer as a compile node.

If you have several nodes, distcc will try the first one in the list; if that doesn't work or is too busy, it will move on to the next, then the next, and the next. This means that if you have a number of computers you can use as nodes, it is a good idea to put the fastest or least busy nodes at the top. If you do want to include your personal computer (i.e., the one on which you regularly work), you might want to put it toward the bottom of the list. If no nodes are available, distcc will try to compile your program locally using the local compile tools.
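
As a rough sketch of that ordering, the following snippet writes a hosts file with the preferred node first. Only 192.168.1.24 comes from my network; the other two addresses, the write_hosts helper, and the scratch file name hosts.example are made up for illustration. For real use, you would write to /etc/distcc/hosts as root.

```shell
#!/bin/sh
# Sketch: write a distcc hosts file, one node per line, most preferred first.
# HOSTS_FILE is a scratch path (an assumption for this demo); the real file
# is /etc/distcc/hosts and needs root to edit.
HOSTS_FILE="${HOSTS_FILE:-${TMPDIR:-/tmp}/hosts.example}"

write_hosts() {
    out="$1"; shift
    # Keep the zeroconf line commented out, as described above.
    printf '%s\n' '# +zeroconf' > "$out"
    for node in "$@"; do
        printf '%s\n' "$node" >> "$out"
    done
}

# 192.168.1.24 is the article's node; .30 and .10 are hypothetical.
# The everyday desktop (.10 here) goes last.
write_hosts "$HOSTS_FILE" 192.168.1.24 192.168.1.30 192.168.1.10
cat "$HOSTS_FILE"
```

Reordering the arguments is all it takes to change which node distcc tries first.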

The distcc program creates a directory, /usr/lib/distcc, which it fills with dummy compilers, soft links that actually point to the distcc executable. To make sure you always compile using distcc, you want to put the path to that directory at the beginning of your Raspbian $PATH environment variable. Do that by adding the line

PATH=/usr/lib/distcc:$PATH

to the end of the /etc/profile file. This ensures that, when the time to compile comes, Raspbian will first look into the distcc directory before it looks anywhere else.

To activate the change, type:

. /etc/profile

You can check that everything is okay by typing:

echo $PATH

This should show /usr/lib/distcc at the beginning of the list of directories.

Use the which tool to check that Raspbian is picking up the correct compiler (i.e., the distcc dummy compiler):

$ which gcc
/usr/lib/distcc/gcc
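
If you would rather script that check, a minimal sketch along these lines compares the first PATH entry with the distcc directory. The function name first_path_entry is my own invention, not part of distcc.

```shell
#!/bin/sh
# Print the first directory of a PATH-style, colon-separated string.
first_path_entry() {
    printf '%s\n' "${1%%:*}"
}

if [ "$(first_path_entry "$PATH")" = "/usr/lib/distcc" ]; then
    echo "OK: /usr/lib/distcc comes first in PATH"
else
    echo "Warning: PATH starts with $(first_path_entry "$PATH")"
fi
```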

Although not strictly necessary, you can include the following variables in your .bashrc file, assuming you are the user who is going to do the compiling:

export DISTCC_BACKOFF_PERIOD=0
export DISTCC_IO_TIMEOUT=3000
export DISTCC_SKIP_LOCAL_RETRY=1

The DISTCC_BACKOFF_PERIOD variable tells distcc how long (in seconds) it should wait after a node fails before trying that node again. If you set it to 0, distcc will try again immediately. The DISTCC_IO_TIMEOUT variable tells distcc how long (also in seconds) to wait for a node that doesn't respond before quitting with a timeout error. Finally, DISTCC_SKIP_LOCAL_RETRY=1 tells distcc not to try to compile locally when a remote compile fails. As mentioned before, the Pi is bad at compiling, so this is probably a sensible setting.
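
One detail worth checking: distcc reads these settings from its environment, so a variable has to be exported before any program started from the shell can see it. The following sketch, which uses a plain sh child process as a stand-in for distcc, shows the difference between a plain assignment and an exported one.

```shell
#!/bin/sh
# Start from a clean slate so the demo does not depend on earlier settings.
unset DISTCC_BACKOFF_PERIOD DISTCC_IO_TIMEOUT

DISTCC_BACKOFF_PERIOD=0          # plain shell variable: not passed to children
export DISTCC_IO_TIMEOUT=3000    # exported: passed to children

# The child shell stands in for distcc here.
report=$(sh -c 'echo "backoff=${DISTCC_BACKOFF_PERIOD:-unset} timeout=${DISTCC_IO_TIMEOUT:-unset}"')
echo "$report"
```

Only the exported variable reaches the child process, so the safest habit is to put export in front of each line in .bashrc.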

These variables don't work on distcc versions earlier than 3.2, and, at the time of writing, the version in Raspbian's repository is 3.1. However, some day it will be updated, and then you'll be ready!

Installing on Nodes

Now you have to configure what distcc calls the hosts – that is, the nodes (in my case, "node" in the singular) on which you will be compiling. My computer at IP 192.168.1.24 is a Debian machine, so I access it and install distcc onto it:

apt-get install distcc

When you do that, distcc actually installs two bits of software. You already saw how to configure the client bit on your Rasp Pi in the previous section, but now you need the daemon component, a program that runs as a server in the background on each of your nodes and listens for compile requests from the Pi.

The first thing to do is modify the /etc/default/distcc file as root by changing the line that says

STARTDISTCC="false"

to:

STARTDISTCC="true"

This change makes sure you can start the distcc daemon and that it starts again every time you reboot the node. The next line to change is

ALLOWEDNETS="127.0.0.1"

to:

ALLOWEDNETS="192.168.1.0/24"

If your network is like mine, with IPs that go from 192.168.1.1 to 192.168.1.254, this makes sure that the whole network is covered. My Rasp Pi, which has currently been assigned 192.168.1.111, will be able to pass on compile tasks to the node.

If your IPs are something different (e.g., 192.168.0.1 to 192.168.0.254), you would use:

ALLOWEDNETS="192.168.0.0/24"

If you have configured your Pi to have a static address (e.g., 192.168.1.31), you could use

ALLOWEDNETS="192.168.1.31"

and only allow compiling from the Pi.
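
To see why 192.168.1.0/24 covers my Pi but a bare address admits only one machine, here is a small sketch of the arithmetic behind the /24 notation. The helper names ip_to_int and in_cidr are hypothetical, and this is not how distcc itself implements the check; it only illustrates the idea of masking off the host bits before comparing addresses.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 (a.b.c.d/bits or a bare address).
in_cidr() {
    case "$2" in
        */*) net_addr=${2%/*}; bits=${2#*/} ;;
        *)   net_addr=$2;      bits=32 ;;      # bare address: exact match only
    esac
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$net_addr")
    # Keep the top "bits" bits, clear the rest.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

for ip in 192.168.1.111 192.168.0.5; do
    if in_cidr "$ip" 192.168.1.0/24; then
        echo "$ip is inside 192.168.1.0/24"
    else
        echo "$ip is outside 192.168.1.0/24"
    fi
done
```

With /24, only the first three numbers of the address have to match, which is why every machine from 192.168.1.1 to 192.168.1.254 gets through.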

Continuing down in the file, the last thing you need to change is the line that says

LISTENER="127.0.0.1"

to:

LISTENER="0.0.0.0"

This will ensure that distcc accepts connections from other machines on the network and not just from the node itself.
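
If you have several nodes to set up, you might prefer to script these three edits. The following is only a sketch: configure_node is a hypothetical helper, the demo runs on a scratch file rather than the real /etc/default/distcc, and it assumes the file contains the three lines exactly as quoted above.

```shell
#!/bin/sh
# Apply the three edits described above to a distcc defaults file.
# $1 = file to edit, $2 = network (or single address) allowed to connect
configure_node() {
    conf="$1"; nets="$2"
    sed -e 's|^STARTDISTCC=.*|STARTDISTCC="true"|' \
        -e "s|^ALLOWEDNETS=.*|ALLOWEDNETS=\"$nets\"|" \
        -e 's|^LISTENER=.*|LISTENER="0.0.0.0"|' \
        "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Demo on a scratch file that holds the three default lines.
demo=$(mktemp)
printf '%s\n' 'STARTDISTCC="false"' \
              'ALLOWEDNETS="127.0.0.1"' \
              'LISTENER="127.0.0.1"' > "$demo"
configure_node "$demo" "192.168.1.0/24"
cat "$demo"
```

On a real node, you would run configure_node as root against /etc/default/distcc, ideally after making a backup copy.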

Current versions of Debian use systemd, so to get distcc started immediately, type

systemctl start distcc

as root. To check that everything is working as it should, use:

systemctl status distcc

You should see output something like that shown in Figure 1.

Figure 1: Distcc running as a daemon on a node.

The next step is to install the Rasp Pi tool chain. I know what I said before, but you are going to be using the specially tailored ARM compilers that come with it.

Make a directory in your home directory (I called mine RPiTC) and download the Rasp Pi tool chain into it:

mkdir -p ~/RPiTC
cd ~/RPiTC
git clone https://github.com/raspberrypi/tools.git --depth=1

This grabs the latest version directly from the Raspberry Pi Foundation's repository. Next, open /etc/init.d/distcc and add the path to the tool chain's compiler collection to the PATH variable

PATH=/home/<your_user>/RPiTC/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin:$PATH

and then have systemd reread the script and restart the distcc daemon so that it picks up the new path:

systemctl daemon-reload
systemctl restart distcc

The distcc from the Rasp Pi is going to come looking for executable compilers called cpp, gcc, c++, g++, and so on, but if you look in tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin, you'll see compilers called arm-linux-gnueabihf-cpp, arm-linux-gnueabihf-gcc, and so on. To keep the Rasp Pi distcc from bailing, create some soft links so that it finds what it's looking for:

cd ~/RPiTC/tools/arm-bcm2708/gcc-linaro-arm-linux-gnueabihf-raspbian/bin
ln -s arm-linux-gnueabihf-c++ c++
ln -s arm-linux-gnueabihf-cpp cpp
ln -s arm-linux-gnueabihf-g++ g++
ln -s arm-linux-gnueabihf-gcc gcc

This makes sure the above-named compilers exist, even though they are really pointing to the ARM equivalents.
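
Because the links have to be created on every node, a loop saves some typing. The sketch below wraps the four ln commands in a hypothetical helper called link_compilers and tries it out on a scratch directory filled with empty stand-in files; on a real node, you would pass the tool chain's bin directory instead.

```shell
#!/bin/sh
# Create unprefixed links (gcc, g++, ...) next to the prefixed ARM compilers.
# $1 = tool chain bin directory, $2 = compiler name prefix
link_compilers() {
    for tool in c++ cpp g++ gcc; do
        if [ -e "$1/$2$tool" ]; then
            # Relative link, so it keeps working if the directory is moved.
            ln -sf "$2$tool" "$1/$tool"
        else
            echo "skipping $tool: $1/$2$tool not found" >&2
        fi
    done
}

# Demo on a scratch directory with empty stand-in files.
demo_dir=$(mktemp -d)
for tool in c++ cpp g++ gcc; do
    : > "$demo_dir/arm-linux-gnueabihf-$tool"
done
link_compilers "$demo_dir" arm-linux-gnueabihf-
ls -l "$demo_dir"
```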

You have to do all of the above for each node. When you're done, trying to compile anything that is CPU-intensive on the Pi will result in it being shipped off to the nodes, at which point they will take over.
