Unconstant Conjunction A personal blog

An Autoconf Primer for R Package Authors

Have you ever noticed something like the following when you’re installing an R package?

checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
...
configure: creating ./config.status
config.status: creating src/Makevars

On Windows, users install pre-built packages, but on other platforms (Linux, and macOS whenever a package is installed from source), packages are built from source, including any native C, C++, or Fortran code. Often it’s enough to use the default R settings to build these packages, but sometimes you need to know a bit more about a user’s system in order to get things working correctly.

For this reason, R permits packages to have a ./configure shell script to make these checks before a package is installed.

Far and away the most common reason you’d need a ./configure script is to check for the presence of a system library that your package needs. You can’t be sure in advance what this library will be called – it may have a different name on Linux and macOS, for example – or where it will be located on a user’s machine, so you need at least a little bit of logic to try and track it down.
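
To make that concrete, the core of such a check is usually a tiny test program that the script tries to compile and link. Below is a minimal sketch in plain shell. The can_link helper is a hypothetical name, and the sketch assumes a C compiler is reachable as cc (or via $CC) and that mktemp is available.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a program that calls function $1
# can be compiled and linked against library $2 (passed as -l$2).
can_link() {
  func="$1"; lib="$2"
  tmpdir=$(mktemp -d) || return 1
  # Declare the symbol ourselves so that no header is needed; only the
  # link step matters for this probe.
  cat > "${tmpdir}/conftest.c" <<EOF
char ${func}(void);
int main(void) { return ${func}(); }
EOF
  ${CC:-cc} ${CPPFLAGS} ${CFLAGS} -o "${tmpdir}/conftest" \
    "${tmpdir}/conftest.c" ${LDFLAGS} "-l${lib}" >/dev/null 2>&1
  status=$?
  rm -rf "${tmpdir}"
  return ${status}
}

# Probe for zlib via its 'deflate' function and report the outcome.
if can_link deflate z; then
  echo "checking for deflate in -lz... yes"
else
  echo "checking for deflate in -lz... no"
fi
```

A real script would also have to try alternative library names and locations, which is where the amount of hand-written logic starts to grow.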

Some package authors like to write ./configure scripts by hand (for example, data.table and protolite). But many package authors instead opt to use Autoconf, which is an arcane but highly useful language and tool for generating ./configure scripts that will work on most systems. It also comes with a ton of built-in macros for the kinds of checks you are likely to make.

It is Autoconf that generates output like the snippet at the start of the post.

Unfortunately, there are few resources out there on how to write Autoconf scripts for R packages.[1] Partly as a result, these scripts can be messy and inconsistent in practice. This post aims to provide the boilerplate you need to get started with some “best practices” in place.

Getting Started

To use Autoconf, you’ll generally need the following:

  • A configure.ac script.
  • Template files, most often a src/Makevars.in.
  • Various entries in your .gitignore and .Rbuildignore.
  • Optionally, a ./cleanup script to delete various intermediate files.

First, create a src/Makevars.in file with the following contents:[2]

PKG_CPPFLAGS = -I. @PKG_CPPFLAGS@
PKG_LIBS = @PKG_LIBS@

This is a template file that Autoconf will fill in to produce a final Makevars file telling R how to compile your package correctly. The template itself must ship with the package, because ./configure reads it at install time; it is the generated src/Makevars that you should add to both your .gitignore and your .Rbuildignore.
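
To see what “fill in” means here, the substitution can be imitated with sed. This is only a sketch of the idea, not what Autoconf literally runs: the fill_template helper and the example flag values are made up for illustration.

```shell
#!/bin/sh
# Rough stand-in for the substitution step: replace each @VAR@ placeholder
# in a template with the value of the shell variable of the same name.
# Values must not contain '|', which is used as the sed delimiter here.
fill_template() {
  template="$1"; output="$2"
  sed -e "s|@PKG_CPPFLAGS@|${PKG_CPPFLAGS}|g" \
      -e "s|@PKG_LIBS@|${PKG_LIBS}|g" \
      "${template}" > "${output}"
}

# Example values; a real configure script would discover these.
PKG_CPPFLAGS="-I/usr/local/include"
PKG_LIBS="-lz"

# Write the two-line template from above into a scratch directory.
workdir=$(mktemp -d)
printf 'PKG_CPPFLAGS = -I. @PKG_CPPFLAGS@\nPKG_LIBS = @PKG_LIBS@\n' \
  > "${workdir}/Makevars.in"

fill_template "${workdir}/Makevars.in" "${workdir}/Makevars"
cat "${workdir}/Makevars"
# PKG_CPPFLAGS = -I. -I/usr/local/include
# PKG_LIBS = -lz
```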

Second, create a file called configure.ac like the one below:

AC_INIT([<package>],[<version>])

# Find the compiler and compiler flags used by R.
: ${R_HOME=`R RHOME`}
if test -z "${R_HOME}"; then
  echo "could not determine R_HOME"
  exit 1
fi
CC=`"${R_HOME}/bin/R" CMD config CC`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`

# Search for a system library, in this case the zlib compression library (which
# contains the function 'deflate'). On success, AC_SEARCH_LIBS prepends the
# required flag (here '-lz') to ${LIBS}.
AC_SEARCH_LIBS([deflate], [z], [], [AC_MSG_ERROR([The zlib library is required.])])
AC_CHECK_HEADERS([zlib.h], [], [AC_MSG_ERROR([The zlib library headers are required.])])

# Write the flags into the src/Makevars file. ${ac_cv_search_deflate} is not
# appended because it may hold the literal text 'none required'.
AC_SUBST([PKG_CPPFLAGS], ["${PKG_CPPFLAGS}"])
AC_SUBST([PKG_LIBS], ["${LIBS} ${PKG_LIBS}"])
AC_CONFIG_FILES([src/Makevars])
AC_OUTPUT

echo "
  --------------------------------------------------
  Configuration for ${PACKAGE_NAME} ${PACKAGE_VERSION}

    cppflags: ${CPPFLAGS} ${PKG_CPPFLAGS}
    libs:     ${PKG_LIBS}

  --------------------------------------------------
"

I was not kidding about the “arcane” bit. But you can now run the command autoconf (or, better yet, autoreconf) in your package directory to generate a configure script that R can understand.
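
A quick way to check the result is to run the generated script yourself and look at the src/Makevars it writes. The helper below is a sketch: regenerate_configure is a hypothetical name, and it calls plain autoconf, for which autoreconf can be substituted.

```shell
#!/bin/sh
# Hypothetical helper: regenerate ./configure for the package in $1 (default:
# the current directory), run it, and print the resulting src/Makevars.
regenerate_configure() {
  pkg_dir="${1:-.}"
  if [ ! -f "${pkg_dir}/configure.ac" ]; then
    echo "no configure.ac found in ${pkg_dir}" >&2
    return 1
  fi
  if ! command -v autoconf >/dev/null 2>&1; then
    echo "autoconf is not installed" >&2
    return 2
  fi
  # The subshell keeps the caller's working directory unchanged.
  (
    cd "${pkg_dir}" &&
    autoconf &&
    ./configure &&
    cat src/Makevars
  )
}
```

Calling regenerate_configure path/to/package should print the familiar “checking for ...” lines followed by the filled-in Makevars, or stop at the first step that fails.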

Now, the contents of this generated script are scary and near-unreadable, but that doesn’t really matter, since you will never edit the file by hand anyway. It is good practice to check both the configure.ac and configure files into version control, and in fact R CMD check will even warn you if you forget.

Finally, you might also notice that autoreconf produces a bunch of intermediate files that you will want to add to your .gitignore:

autom4te.cache
config.log
config.status

This should give you a functional Autoconf setup that will check for the zlib library on the user’s system and link it to your compiled code.
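
The ignore entries can be added by hand, or with a small script like the sketch below. The add_line and add_ignores helpers are hypothetical names, and the .Rbuildignore patterns are an assumption about what is sensible to exclude (that file holds regular expressions rather than plain file names).

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
add_line() {
  file="$1"; line="$2"
  touch "${file}"
  grep -qxF -- "${line}" "${file}" || printf '%s\n' "${line}" >> "${file}"
}

# Register Autoconf's intermediate files in both ignore files of the
# package directory given as $1 (default: the current directory).
add_ignores() {
  pkg_dir="${1:-.}"
  for entry in autom4te.cache config.log config.status; do
    add_line "${pkg_dir}/.gitignore" "${entry}"
  done
  # .Rbuildignore entries are regular expressions matched against paths.
  for pattern in '^autom4te\.cache$' '^config\.log$' '^config\.status$'; do
    add_line "${pkg_dir}/.Rbuildignore" "${pattern}"
  done
}
```

Running add_ignores from the package root is safe to repeat, since existing entries are left alone.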

More Common Patterns

1. Using C++ Instead of C

For a project using C++ code instead of C – for example, if you are using Rcpp – you can make the following small adjustment. Replace

CC=`"${R_HOME}/bin/R" CMD config CC`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`

in the example above with

CXX=`"${R_HOME}/bin/R" CMD config CXX`
CXXFLAGS=`"${R_HOME}/bin/R" CMD config CXXFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`
AC_LANG(C++)
AC_PROG_CPP

Autoconf will then run automated checks for C++ features (instead of C). Note that it is important to set CXX, CXXFLAGS and CPPFLAGS before you make these checks, to ensure that Autoconf will check these features using the C++ toolchain that R will actually use to compile your package.

2. Finding Libraries Using pkg-config

Most Linux distributions[3] come with a program called pkg-config that allows you to query installed libraries for the compiler and linker flags needed to build against them.

I would suggest using pkg-config by default, falling back on AC_CHECK_HEADERS otherwise. pkg-config ships an Autoconf macro, PKG_PROG_PKG_CONFIG (defined in its pkg.m4 file), that will set $PKG_CONFIG if the program is available, so you can set up these checks fairly easily:

have_zlib=no
ZLIB_CXXFLAGS=""
ZLIB_LIBS="-lz"

PKG_PROG_PKG_CONFIG

if test -n "${PKG_CONFIG}"; then
  AC_MSG_CHECKING([pkg-config for zlib])
  if $PKG_CONFIG --exists zlib; then
    have_zlib=yes
    ZLIB_CXXFLAGS=`"${PKG_CONFIG}" --cflags zlib`
    ZLIB_LIBS=`"${PKG_CONFIG}" --libs zlib`
  fi
  AC_MSG_RESULT([${have_zlib}])
fi

if test "x${have_zlib}" = xno; then
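  AC_CHECK_HEADERS([zlib.h], [have_zlib=yes], [AC_MSG_ERROR(
    [The zlib library headers are required.]
  )])
fi

# Pass along both the compile flags and the link flags that were found.
AC_SUBST([PKG_CPPFLAGS], ["${PKG_CPPFLAGS} ${ZLIB_CXXFLAGS}"])
AC_SUBST([PKG_LIBS], ["${LIBS} ${PKG_LIBS} ${ZLIB_LIBS}"])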
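
Stripped of the Autoconf macros, the “ask pkg-config, otherwise fall back” logic is small enough to try out directly in a shell. This is a sketch rather than a drop-in replacement: lib_flags is a hypothetical helper, and the fallback value is whatever default you would otherwise hard-code.

```shell
#!/bin/sh
# Print linker flags for a library: ask pkg-config when it is available and
# knows the module, otherwise fall back to a caller-supplied default.
lib_flags() {
  module="$1"; fallback="$2"
  pkg_config="${PKG_CONFIG:-pkg-config}"
  if command -v "${pkg_config}" >/dev/null 2>&1 &&
     "${pkg_config}" --exists "${module}" 2>/dev/null; then
    "${pkg_config}" --libs "${module}"
  else
    printf '%s\n' "${fallback}"
  fi
}

# Prints pkg-config's answer for zlib, or "-lz" when pkg-config cannot help.
lib_flags zlib -lz
```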

3. Showing Users What They Need to Install

The examples above will print

The zlib library headers are required.

in case of failure. This isn’t a terribly helpful message for users, so an emerging best practice for R packages is to provide more actionable messages explaining what users need to install on various systems to get things working.

For example, you could remove the error message from the AC_CHECK_HEADERS call above and add the following block below it:

if test "x${have_zlib}" = xno; then
  AC_MSG_FAILURE([
  ---------------------------------------------
   'zlib' and its header files are required.

   Please install:

   * zlib1g-dev (on Debian and Ubuntu)
   * zlib-devel (on Fedora, CentOS and RHEL)
   * zlib (via Homebrew on macOS)
   * libz1 (on Solaris)

   and try again.

   If you believe this library is installed on your system but
   this script is simply unable to find it, you can specify the
   include and lib paths manually:

   R CMD INSTALL ${PACKAGE_NAME} \\
     --configure-vars='LIBS=-L/path/to/libs CPPFLAGS=-I/path/to/headers'
  ---------------------------------------------])
fi

A few quick searches should turn up the various package names on each platform. The ones in the example above are the main platforms that CRAN checks against, although instructions for Solaris are seldom worth the bother.

4. Using a Cleanup Script

R permits packages to have a ./cleanup script. From Writing R Extensions:

Under a Unix-alike only, an executable (Bourne shell) script cleanup is executed as the last thing by R CMD INSTALL if option --clean was given, and by R CMD build when preparing the package for building from its source.

This can be a useful way to automatically clean up generated Autoconf files. Here is one to get you started:

#!/bin/sh

rm -f config.*
rm -f src/Makevars
rm -rf autom4te.cache

5. Windows Support

Windows users generally install binary packages, and will not run the ./configure script during installation. However, you might want to help out the CRAN maintainers who create these binary packages by creating a static, Windows-oriented src/Makevars.win file:

PKG_CPPFLAGS = -I.
PKG_LIBS = -lz

This will obviously break when it cannot find zlib, but we’re punting that responsibility to the binary package maintainers in this case.

Further Reading

If you’d like to dig deeper, the following popular R packages all have Autoconf scripts that I have looked to for inspiration in the past:

  • The git2r package checks for libgit2, and if none is found, falls back on a bundled version of that library instead.

  • The RProtoBuf package checks not only for the Protobuf library, but also the protoc compiler needed to generate some of the package’s C++ headers.

  • The rJava package allows for discovering (and further customising, via user-supplied flags) many features of the R <-> Java bridge.

  • The sf package checks for some complex and demanding system dependencies.

  • The uuid package checks for some system libraries, but also checks if the socket address structure on that platform contains a particular field.

And, for resources on Autoconf more broadly, check out the official documentation and Autotools Mythbuster, which has a more example-driven approach.


  1. I once tried to get a use_autoconf() function incorporated into the usethis package, but Hadley Wickham thought that too few users would benefit for it to be worth maintaining.

  2. If you have an existing src/Makevars, it should be fairly clear how to move it to src/Makevars.in and add the right templating syntax.

  3. pkg-config is also available for macOS, although it is not installed by default. Other Unix-based systems (such as Solaris and the BSDs) usually have pkg-config as well.
