R: are *apply loops faster than for loops?

In a high-level language such as R or Python, it is typically a good idea to avoid loops whenever possible, since explicit looping in such languages can be very time-consuming.

However, sometimes loops cannot be avoided. In R, in addition to traditional looping constructs such as the for and while loop, the *apply() functions provide looping functionality in a single function call. It is often suggested to avoid traditional looping constructs in favor of these *apply() functions, the claim being that a single call to an *apply() function is faster than running through a for or while loop. To investigate this claim, I will run a small benchmark: I will time a (trivial) loop that returns the vector of integers 1 to n, for different values of n, using four different variants of running a loop in R:

  • Using a for loop: the return vector is dynamically extended while iterating through the loop. I call this variant FOR_LOOP_DYN_ALLOC.
  • Using a for loop: the memory for the return vector is pre-allocated before iterating through the loop. I call this variant FOR_LOOP_PRE_ALLOC.
  • Using R’s sapply() function.
  • Using R’s lapply() function, followed by a call to unlist() to return a vector instead of a list.

Below, you can find an R function for timing these loops, for a vector of different loop sizes N and a number of replicates nrep to average the execution time over.


r_loops <- function(
    N, # The different loop sizes to try
    nrep # The number of replicates to average the execution time over
) {
    # Average elapsed times in milliseconds; system.time()[3] below is the
    # elapsed time in seconds, hence the factor 1e3
    timings <- 1e3 * data.frame(
        # for loop, growing the return vector one element at a time
        FOR_LOOP_DYN_ALLOC = rowMeans(replicate(nrep, {
            unlist(lapply(N, function(n) {
                system.time({
                    x <- integer(0)
                    for (i in 1:n) {
                        x <- c(x, i)
                    }
                })[3]
            }))
        })),
        # for loop, writing into a pre-allocated return vector
        FOR_LOOP_PRE_ALLOC = rowMeans(replicate(nrep, {
            unlist(lapply(N, function(n) {
                system.time({
                    x <- integer(n)
                    for (i in 1:n) {
                        x[i] <- i
                    }
                })[3]
            }))
        })),
        # single call to sapply(), which returns a vector directly
        SAPPLY = rowMeans(replicate(nrep, {
            unlist(lapply(N, function(n) {
                system.time({
                    x <- sapply(1:n, function(i) {
                        i
                    })
                })[3]
            }))
        })),
        # single call to lapply(), flattened to a vector with unlist()
        UNLIST_LAPPLY = rowMeans(replicate(nrep, {
            unlist(lapply(N, function(n) {
                system.time({
                    x <- unlist(lapply(1:n, function(i) {
                        i
                    }))
                })[3]
            }))
        }))
    )

    matplot(
        N,
        timings,
        type = "l",
        lty = 1,
        lwd = 2,
        col = rainbow(ncol(timings)),
        log = "xy",
        xlab = "size of loop",
        ylab = "execution time (milliseconds)",
        main = sprintf("Average of %i replicates", nrep)
    )
    legend(
        "topleft",
        lty = 1,
        lwd = 2,
        col = rainbow(ncol(timings)),
        legend = colnames(timings)
    )

}

Let’s run this for relatively short loops and 10 replicates:


png("short_r_loops.png")
r_loops(2^(7:15), 10)
dev.off()

This is what the result looks like:

[Figure: short_r_loops.png]
Short loops: execution time in milliseconds of different types of looping in R, as a function of the size of the loop.

It seems that for these relatively short loops, using for is actually quite comparable to using sapply(). The lapply() variant seems a bit faster than sapply(), probably because sapply() is essentially a wrapper around lapply() and thus adds some overhead. The for loop with a dynamically extended return vector (FOR_LOOP_DYN_ALLOC) is by far the worst performer.
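That wrapper relationship is easy to check directly: with the default simplify = TRUE, sapply() amounts to lapply() followed by a simplification step (simplify2array()).


x1 <- sapply(1:5, function(i) i^2)
x2 <- simplify2array(lapply(1:5, function(i) i^2))
identical(x1, x2) # TRUE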

Let’s try some longer loops (omitting FOR_LOOP_DYN_ALLOC, because of the poor performance; using only three replicates):
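For reference, the corresponding call could look something like the following; the loop sizes here are only an illustration, and the FOR_LOOP_DYN_ALLOC entry would first have to be removed from the data frame in r_loops().


png("long_r_loops.png")
r_loops(2^(16:22), 3) # illustrative loop sizes; 3 replicates
dev.off()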

[Figure: long_r_loops.png]
Long loops: execution time in milliseconds of different types of looping in R, as a function of the size of the loop.

Now this is interesting: it seems that for long loops, intelligent use of for (pre-allocating memory) is actually faster than both of the *apply() functions! My guess is that, when calling an *apply() function, the size of the return object is dynamically increased during the iteration. For long loops and larger amounts of memory, this takes more time than filling a pre-allocated vector in a for loop.
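If that guess is correct, vapply() would make an interesting fifth variant: it is told the type and size of each individual result up front, so R can allocate the full return vector before iterating. A sketch of the corresponding timing entry, to be added alongside the other entries in the data frame above:


VAPPLY = rowMeans(replicate(nrep, {
    unlist(lapply(N, function(n) {
        system.time({
            # each result is declared to be a single integer
            x <- vapply(1:n, function(i) i, integer(1))
        })[3]
    }))
}))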

In any case, the above two plots show that *apply() is not necessarily faster than for. An important reason for using the *apply() functions may instead be that they fit the functional programming paradigm better, in which everything is done through function calls and side effects are kept to a minimum: the *apply() functions are essentially 'functionals' that take a function f as an input argument. Variables defined within f are local to f, and variables defined outside f cannot be modified inside f (except using the special scoping assignment operator <<-). This is what makes calls to *apply() functions very different from running for loops.
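A small example makes this difference concrete (the variable total is just for illustration):


total <- 0
for (i in 1:3) {
    total <- total + i # the loop body runs in the calling environment
}
total # 6

total <- 0
invisible(lapply(1:3, function(i) {
    total <- total + i # assigns to a local copy; the outer total is untouched
}))
total # still 0

total <- 0
invisible(lapply(1:3, function(i) {
    total <<- total + i # <<- modifies total in the enclosing environment
}))
total # 6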