by Guangming Lang
2 min read

Categories

  • r

This post contains my solution to one of the programming assignments from the R Programming course on coursera. If you want to solve it yourself, please don’t read any further.

When running time consuming computations, it’s good to cache the results so that you can look them up later instead of computing them again. For example, maxtrix inversion is usually costly, especially when running inside of a loop. The following functions can compute and cache the inverse of a matrix.

  1. makeCacheMatrix(): creates a special “matrix” object that can cache its inverse.
  2. cacheSolve(): computes the inverse of the “matrix” returned by makeCacheMatrix(). If the inverse has already been calculated and the matrix has not changed, it’ll retrieves the inverse from the cache directly.
makeCacheMatrix <- function(x = matrix()) {
        ## @x: a square invertible matrix
        ## return: a list containing functions to
        ##              1. set the matrix
        ##              2. get the matrix
        ##              3. set the inverse
        ##              4. get the inverse
        ##         this list is used as the input to cacheSolve()
        
        inv = NULL
        set = function(y) {
                # use `<<-` to assign a value to an object in an environment 
                # different from the current environment. 
                x <<- y
                inv <<- NULL
        }
        get = function() x
        setinv = function(inverse) inv <<- inverse 
        getinv = function() inv
        list(set=set, get=get, setinv=setinv, getinv=getinv)
}

cacheSolve <- function(x, ...) {
        ## @x: output of makeCacheMatrix()
        ## return: inverse of the original matrix input to makeCacheMatrix()
        
        inv = x$getinv()
        
        # if the inverse has already been calculated
        if (!is.null(inv)){
                # get it from the cache and skips the computation. 
                message("getting cached data")
                return(inv)
        }
        
        # otherwise, calculates the inverse 
        mat.data = x$get()
        inv = solve(mat.data, ...)
        
        # sets the value of the inverse in the cache via the setinv function.
        x$setinv(inv)
        
        return(inv)
}

To test out these functions. I wrote a function called test(), which takes in any invertible matrix, calculates its inverse twice using the above functions, and prints out the times it takes for both runs. The first run should take longer than the second run because it actually calculates the inverse while the second run only does a look-up from the cache.

test = function(mat){
        ## @mat: an invertible matrix
        
        temp = makeCacheMatrix(mat)
        
        start.time = Sys.time()
        cacheSolve(temp)
        dur = Sys.time() - start.time
        print(dur)
        
        start.time = Sys.time()
        cacheSolve(temp)
        dur = Sys.time() - start.time
        print(dur)
}

Let’s try it on a matrix of 1000 rows and 1000 columns filled with normal random numbers.

set.seed(1110201)
r = rnorm(1000000)
mat1 = matrix(r, nrow=1000, ncol=1000)
test(mat1)
## Time difference of 1.946601 secs
## getting cached data
## Time difference of 0.0005111694 secs

The time it took for the first run is 2.02 seconds, and for the second run, 0.000498 seconds. This is a 99.8% decrease. Now imagine if you had to run the computation in a loop of 1000 iterations, without caching, it’s going to take you about 34 minutes, with caching, only 0.08 seconds. This is the power of caching!