Caius :: Blog :: Compiling at Hyperspeed

September 12, 2007

After I started working with C++, I found that compile times increased drastically. Code that makes heavy use of templates, as with the Standard Template Library, compiles incredibly slowly. My codebase used to compile in about 30 seconds, but after I rewrote substantial portions in C++ that grew to almost three minutes. As we all know, drastic times call for drastic measures, so I'm going to delve into some of the solutions I've started using. You can benefit from these even if you're using C, as long as you're using the GNU Compiler Collection (GCC).

ccache
ccache, which grew out of an earlier tool called compilercache, is a program that caches the output of your compiles, so that when you recompile your code it can pull from the cache the results for files that are unchanged since the last compile. By default this cache is stored in a directory called .ccache in your home directory.

Using it is amazingly simple. You just run gcc/g++ through it. So if you compile a file called myfile.c you simply type: ccache gcc -c myfile.c (ccache caches compile steps, not linking, hence the -c). The first time you compile a piece of code the cache is populated, so obviously this may be slightly slower than a normal compile, but not by much. You can also use it in a Makefile by prefixing gcc or g++ with ccache. In my example, the three minutes are reduced to 15 seconds! No, I'm not bluffing.
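As a rough sketch of the above, assuming gcc is on your PATH and a myfile.c sits in the current directory (both are assumptions for illustration), the following compiles the file twice and then prints the cache statistics. It falls back to plain gcc when ccache isn't installed.

```shell
#!/bin/sh
# Pick the compiler command: route gcc through ccache when ccache is
# installed, otherwise fall back to plain gcc so the build still works.
if command -v ccache >/dev/null 2>&1; then
    CC="ccache gcc"
else
    CC="gcc"
fi
echo "Compiler command: $CC"

# Compile the same file twice (myfile.c is an assumed example file).
# With ccache in front, the second compile can be served from the cache.
if [ -f myfile.c ] && command -v gcc >/dev/null 2>&1; then
    $CC -c myfile.c -o myfile.o
    $CC -c myfile.c -o myfile.o
    # Print hit/miss counters; only meaningful when ccache is in use.
    case "$CC" in
        ccache*) ccache -s ;;
    esac
fi
```

After the second compile, the hit counter reported by ccache -s should have gone up.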

Keep in mind that it can only pull unchanged files from the cache, so if you change a header file, all files that include it will be recompiled normally.

If your Linux/Unix distro doesn't come with ccache, you can get it here: http://ccache.samba.org/

distcc
Now we're getting a little more advanced. distcc is a distributed compiling system. For this you need at least two computers, connected on a high speed line. After installing and configuring this software, you can distribute your compiling jobs across multiple computers. And it's fairly simple to set up too.

You can think of this as a master/slave server setup. One of the computers controls the entire job, while the others simply listen, receive, compile and send back the results. After setting up the configuration, you start the daemon, called distccd, which then runs in the background listening for compile requests.

The other program you will use is called distcc, and it is used just like ccache, i.e. you prefix gcc or g++ with it. At this point, if you simply type make it will only compile locally. However, make has an option called -j, which starts multiple compile processes at once. So if you type 'make -j4' it will start four compile processes at once. distcc picks these up and distributes them among the computers it's configured to know about. Using my desktop and laptop together, my compile time is almost cut in half.
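A minimal sketch of that setup, assuming one helper machine at the made-up address 192.168.0.2 on a 192.168.0.0/24 LAN, and a Makefile that honours the CC and CXX variables. DISTCC_HOSTS is the environment variable distcc reads to learn which machines it may send jobs to.

```shell
#!/bin/sh
# Hypothetical helper address; replace it with a machine on your own LAN.
HELPER="192.168.0.2"

# On each helper machine, start the daemon and allow clients from the LAN
# (the network range is an assumption for this example):
#   distccd --daemon --allow 192.168.0.0/24

# On the controlling machine, list the machines that may receive jobs.
DISTCC_HOSTS="localhost $HELPER"
export DISTCC_HOSTS

# Number of parallel compile jobs, matching the make -j4 example.
JOBS=4

# Only start the build if distcc is installed and a Makefile is present.
if command -v distcc >/dev/null 2>&1 && [ -f Makefile ]; then
    make -j"$JOBS" CC="distcc gcc" CXX="distcc g++"
fi
```

Passing CC and CXX on the make command line keeps the Makefile itself untouched, so the project still builds on machines without distcc.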

distcc can use two transports: plain TCP and ssh. ssh is safer, but a little slower because it must encrypt the data first. So if you're compiling in your private LAN you should prefer plain TCP. It's also possible to compile over the internet, but then you will need fast connections between all involved computers to avoid bottlenecks.
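In the host list, the two transports are told apart by how each host is written: a bare hostname means a plain TCP connection, while a leading @ (optionally preceded by a user name) selects ssh. A small sketch, with a made-up helper address and a hypothetical distcc_hosts function:

```shell
#!/bin/sh
# Hypothetical helper address; replace it with a machine on your own LAN.
HELPER="192.168.0.2"

# Print a DISTCC_HOSTS value for the requested transport ("tcp" or "ssh").
# Anything other than "ssh" falls through to the plain TCP form.
distcc_hosts() {
    case "$1" in
        ssh) echo "localhost @$HELPER" ;;
        *)   echo "localhost $HELPER" ;;
    esac
}

# Use plain TCP on a trusted private LAN.
DISTCC_HOSTS=$(distcc_hosts tcp)
export DISTCC_HOSTS
echo "DISTCC_HOSTS=$DISTCC_HOSTS"
```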

distcc works with C, C++, Objective-C and Objective-C++. Oh, and I want to kill one myth right away: distcc is not a compiler; it's a front end for GCC. The final thing I want to say about distcc concerns compiler versions. Although I haven't tested this myself, it's reported to work even if the computers involved have different versions of GCC, as long as they're binary compatible. I expect this is more often the case with C than with C++.

For more information on setting up the system, see here: http://distcc.samba.org/ 

The big question
Can ccache and distcc be combined? Yes! In fact, they cooperate incredibly well. It is recommended to run ccache first, like this: ccache distcc gcc. According to the distcc FAQ ccache will work better this way.
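A small sketch of wiring this into a build, assuming a Makefile that honours the CC variable. It adds each helper only if it is installed, and keeps ccache in front of distcc as described above. ccache's documentation also describes a CCACHE_PREFIX variable for this kind of chaining; it isn't used here.

```shell
#!/bin/sh
# Start from the plain compiler and prepend each helper that is installed.
CC="gcc"

# distcc goes directly in front of the compiler...
if command -v distcc >/dev/null 2>&1; then
    CC="distcc $CC"
fi

# ...and ccache goes in front of everything, so cache hits never
# touch the network at all.
if command -v ccache >/dev/null 2>&1; then
    CC="ccache $CC"
fi

echo "Compiler command: $CC"

# Only start the build if a Makefile is present in the current directory.
if [ -f Makefile ]; then
    make -j4 CC="$CC"
fi
```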

So, with these tools in your arsenal you can leave slow compiling in the dust. Happy coding, and I'll hopefully be back soon with another blog. 

Keywords: c, c++, ccache, Compiling, distcc, gcc, network

Posted by Caius


Comments

  1. We (the folks at The FUSS Project) had tried adding ccache to the SmaugFUSS makefile for a bit, but in the end it was decided that this was not a good thing. Samson could probably tell you more about why.

    Conner Destron on Wednesday, 12 September 2007, 19:12 MDT

  2. Interesting. I've actually been using it for a while. Haven't had any issues with it.

    Caius on Wednesday, 12 September 2007, 19:16 MDT

  3. Although, generally speaking, I wouldn't add it to a publicly available package like the FUSS bases myself. It makes it slightly less portable by default. People who want to use it can add it anyway.

    Caius on Wednesday, 12 September 2007, 19:19 MDT

  4. Going back and searching The FUSS Project, it appears that there were several concerns over the 6 months or so (ending almost two years ago now) that we'd experimented with it, and at least one of them did get addressed. Kerberus appears to have had the least trouble with it. I myself experienced what appeared to be a problem with it insisting on using the same cache whether I was compiling my dev port or my live port.

    Conner Destron on Wednesday, 12 September 2007, 23:36 MDT

  5. I'm using different user accounts for the development and running, so that is probably why I haven't run into that problem.

    However, as with any problem, there is always a solution. I'll take a closer look at it. 

    Caius on Thursday, 13 September 2007, 02:28 MDT

  6. If you can find the solution to that one, combined with Kerberus' solution, I'd very happily go back to using ccache for my mud. :)

    Conner Destron on Thursday, 13 September 2007, 20:48 MDT

  7. Ok, here's a simple way of solving this. When ccache runs, it checks the environment variable CCACHE_DIR to find out where the cache is supposed to be. By default this is $HOME/.ccache.

    So the solution is to change this variable in your Makefile.

    Insert:

    export CCACHE_DIR:=./.ccache

    before ccache is called in the Makefile. In this particular case the cache will be stored in the same directory as the Makefile instead of the top level of your home directory. This only changes the variable in the context of the Makefile, so nothing else is affected.

    There is one limitation here. The standard ccache options in the shell, like ccache -s for statistics or ccache -C to clear the cache, won't see this cache unless you set CCACHE_DIR there too, because ccache would once again look in $HOME/.ccache, which is not where you have put your cache. If you need these options to work, you could also call them from the Makefile by adding new rules.

    Personally I rarely use these options, and if you don't use them either, this will solve your problem.
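One way around this limitation, sketched here with a hypothetical helper function named project_ccache, is to set CCACHE_DIR just for the single ccache command. It assumes you run it from the directory that holds the Makefile.

```shell
#!/bin/sh
# Hypothetical helper: run ccache against the cache stored next to the
# Makefile. CCACHE_DIR is set only for this one command, so the rest of
# the shell session keeps using the default cache location.
project_ccache() {
    CCACHE_DIR="$PWD/.ccache" ccache "$@"
}

# Example calls (run from the Makefile's directory):
#   project_ccache -s    # show statistics for the project cache
#   project_ccache -C    # clear the project cache
```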

    Cheers! 

    Caius on Friday, 14 September 2007, 04:22 MDT

  8. http://www.fussproject.org/index.php?a=topic&t=487

    The basic gist of the problem for me was the fact that the cache ate up so much space, and even with the size limiting settings it's not appropriate for use in a shared hosting environment where package sizes are limited.

    It's also not very well suited in my experience to using multiple ports in the same user account, which would still be a common scenario with muds that don't necessarily have a tight quota.

    I've had much better luck with Makefile dependency checking and physical hardware upgrades. I might try to implement distcc though to take advantage of the other server. 

    Samson on Friday, 14 September 2007, 08:12 MDT

  9. For my own mud, Caius, that might work out, as I self-host and don't use features like those anyway. Between the space limiting settings and the environment variable, that might make ccache worthwhile after all, though I am currently using the Makefile dependency checking that Samson mentioned, so I have to wonder how the two will interact. The main problem I had with it before was the fact that I have both of my ports under the same user account.

    Conner Destron on Friday, 14 September 2007, 09:32 MDT
