RFP - Part2: R Lists

Master R

By Guangming Lang Comment

Previously, we discussed R vectors. We now turn to R lists. Like a vector, a R list can also grow or shrink. But unlike a vector, a R list can contain any R objects. For example, vectors, lists, functions, or environments can all be the elements of a list, and it’s perfectly okay to mix them in the same list.

The empty list

The empty list, with syntax list(), has 0 elements. Indeed, if we call length(list()), we get 0.

Non-empty lists

We can make a list with list(e1, ..., en) where each expression1 e is evaluated to a R object. It’s more common to make a list with c(e1, e2), called called “e1 combined with e2,” where e1 evaluates to a “list” and e2 evaluates to another “list.” The result is a new list that starts with the elements in e1 followed by the elements in e2. In the simpler case when e1 evaluates to a single value like 5, 1.2, “hello”, TRUE or NA, c(e1, e2) can also be called “e1 consed onto e2.” The result is then a new list that starts with the value of e1 followed by the elements in e2. Here’re some examples of lists:

list(1L, 2L, 3L) # each element is an integer
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
list(c(1, 2), c("gm", "lang")) # each element is a length-2 vector of diff. type
## [[1]]
## [1] 1 2
## 
## [[2]]
## [1] "gm"   "lang"
list(list(1, 2), TRUE, function(x) x+1, environment()) # 1st element is a list, 3rd element is a function, 4th element is an env object
## [[1]]
## [[1]][[1]]
## [1] 1
## 
## [[1]][[2]]
## [1] 2
## 
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## function (x) 
## x + 1
## 
## [[4]]
## <environment: R_GlobalEnv>
# cons 9 onto the list(3, 6)
c(9, list(3, 6)) 
## [[1]]
## [1] 9
## 
## [[2]]
## [1] 3
## 
## [[3]]
## [1] 6
# list(1, list(2, 3)) is different from list(1, 2, 3)
!identical(list(1, list(2, 3)), list(1, 2, 3))
## [1] TRUE
# list(1, list(2, 3)) is different from list(list(1, 2), 3))
!identical(list(1, list(2, 3)), list(list(1, 2), 3))
## [1] TRUE

How to use lists

Once again, powered by recursion, we only need three simple operations when working with lists:

  1. Check if a list is empty.
  2. Get the first element of a list, raising an exception if the list is empty.
  3. Get the tail of a list without its first element, raising an exception if the list is empty.

Like in the case of vectors, R doesn’t provide these operations perfectly out of the box. So we need to write them ourselves:

# Returns TRUE if a list is empty, FALSE otherwise.
is_empty = function(xs) { # xs: a list 
        length(xs) == 0
}

# Get the first element of a list, raising an exception if the list is empty.
hd = function(xs) { # xs: a list 
        if (is_empty(xs)) stop("List is empty.")
        else xs[[1]] 
}

# Get the tail of a list without its first element, raising an exception if the list is empty.
tl = function(xs) { # xs: a list 
        if (is_empty(xs)) stop("List is empty.")
        else xs[-1] 
}

Having defined is_empty(), hd() and tl(), we can use them inside of recursive functions to perform complex operations on lists. For example, given a list of integer pairs, we can write a function to sum them up.

sum_pair_list = function(xs) {
        # xs: a list of interger pairs represented by length-2 vectors c(i1, i2)
        if (is_empty(xs)) 0
        else hd(xs)[1] + hd(xs)[2] + sum_pair_list(tl(xs))
}
sum_pair_list(list())
## [1] 0
sum_pair_list(list(c(1, 2), c(3, 4)))
## [1] 10

Or we can write functions to put the first or second element of each pair into separate lists.

firsts = function(xs) {
        # xs: a list of interger pairs represented by length-2 vectors c(i1, i2)
        if (is_empty(xs)) list()
        else c(hd(xs)[1], firsts(tl(xs)))
}

seconds = function(xs) {
        # xs: a list of interger pairs represented by length-2 vectors c(i1, i2)
        if (is_empty(xs)) list()
        else c(hd(xs)[2], seconds(tl(xs)))
}

xs = list(c(1, 2), c(3, 4))
firsts(xs)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 3
seconds(xs)
## [[1]]
## [1] 2
## 
## [[2]]
## [1] 4

It should make us cringe that firsts() and seconds() look almost identical yet we wrote them as two different functions. We’ll learn how to fix it later. Notice that is_empty(), hd() and tl() defined above are very similarly to the ones defined in part1 of this series when we discussed R vectors. This is because lists are just a special kind of vectors in R. To see this, realize that another way of declaring an empty list is vector("list").

identical(vector("list"), list())
## [1] TRUE
  1. The word ‘expression’ here is being used in the sense of expressions in any programming language or mathematical expressions. Do NOT confuse it with the R expression object. 

If you enjoyed this post, get updates. It's FREE