Become Great at R

RFP - Part 9: Local Functions and Local Variables

A common thing to do in functional programming is to create, within an (outer) function, a local (inner) function that uses other variables in scope. Let me give you an example. The function countup_from1() uses a local helper function count() to accumulate results recursively. Its argument x is used directly inside count().
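The excerpt names countup_from1() and count() but does not show their bodies, so the following is only a guess at what such a function might look like. The parameter name `from` and the recursion itself are assumptions, and x is assumed to be a positive whole number.

```r
# Hypothetical reconstruction; the body is an assumption, not the post's code.
countup_from1 <- function(x) {
  # count() is local to countup_from1() and reads x from the enclosing scope.
  count <- function(from) {
    if (from == x) {
      from                       # base case: we have reached x
    } else {
      c(from, count(from + 1))   # accumulate the result recursively
    }
  }
  count(1)
}

result <- countup_from1(5)
```

Note that x is never passed to count(). The inner function finds it through lexical scoping, which is the point the paragraph makes.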

RFP - Part 8: R Functions

Functions play a quintessential role in R. John Chambers said that “in R: everything that happens is a function call.” Before diving into R functions, I want to explain the mathematical concept of “function” because it will help us understand R functions.

RFP - Part 7: Well-behaved R Functions

Functions are a big topic in R programming. I’ll spend several blog posts talking about them. But before I dive into R functions with full force, I want to show you how nice it is to work with them. Consider the following code:

RFP - Part 6: R <<- Assignment Operator

If you have been following this R Functional Programming (RFP) series¹, you know by now we have discussed:

¹ If you haven’t, I recommend reading the linked posts first.

RFP - Part5: Immutability in R

Do not confuse variable reassignment (or rebinding) with mutation. Reassignment in R changes the binding. (See the second diagram for an example.) Mutation changes the referenced object itself. R supports only limited mutation by default, and base R objects are mostly immutable. As a result, R code often behaves the way we’d expect it to behave mathematically. This lets the programmer focus on the mathematical or statistical problem at hand without being distracted by the “computer side of things.” Indeed, if you’re trained as a mathematician or statistician and have no programming experience, you’ll likely find it easier to write R code than code in most mainstream programming languages such as Python, Ruby, Java, C++, or Rust.
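The difference between rebinding and mutation can be sketched in a few lines. The variable names and values are made up for illustration, and the environment example is an addition that shows one of the places where base R does mutate in place.

```r
# Illustrative values only; none of these names come from the post.
x <- c(1, 2, 3)
y <- x            # y is bound to the same values as x
y[1] <- 100       # R modifies a copy; the vector bound to x is untouched
x_after <- x      # still 1 2 3

x <- c(10, 20)    # rebinding: x now names a different vector

# Environments are one of the few base R objects that mutate in place.
e1 <- new.env()
e2 <- e1                         # both names refer to the same environment
assign("n", 1, envir = e2)       # a change made through e2 ...
n_seen <- get("n", envir = e1)   # ... is visible through e1
```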

RFP - Part4: R Variable Shadowing

Now we know that R variables are objects, and that they can be manipulated, assigned any R object, and reassigned.¹ (Detailed discussion here.) We’ll now discuss variable shadowing and how it differs from variable reassignment.
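A minimal sketch of the distinction, using a made-up function f() and made-up values: shadowing introduces a new binding in an inner scope, while reassignment rebinds a name in the scope where it already lives.

```r
x <- 1

f <- function() {
  x <- 2          # creates a local x that shadows the outer x inside f()
  x
}

inside <- f()     # 2: the local binding wins inside the function
outside <- x      # 1: the outer binding was never touched

x <- 3            # reassignment: the same outer name is rebound
```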

¹ The terms “variable (re)assignment” and “variable (re)binding” are used interchangeably throughout this series because we use them for their mathematical meaning: to (re)associate a symbol with a value.

RFP - Part3: R Variables

Previously, we discussed two data structures in R: vectors and lists. If you have programmed in another language before, you have probably already sensed that R vectors and lists are a tiny bit different from what you’re used to. This is to be expected. In general, there are nuanced differences between R data structures and those of many other languages.

RFP - Part2: R Lists

Previously, we discussed R vectors. We now turn to R lists. Like a vector, an R list can grow or shrink. But unlike a vector, an R list can contain any R object. For example, vectors, lists, functions, and environments can all be elements of a list, and it’s perfectly okay to mix them in the same list.
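As a small sketch of a mixed list that grows and shrinks (the name `items` and its contents are made up for illustration):

```r
items <- list(
  1:3,                    # an integer vector
  list("nested", TRUE),   # another list
  sum,                    # a function
  new.env()               # an environment
)
n_start <- length(items)          # four elements of four different kinds
total <- items[[3]](items[[1]])   # call the stored function on the stored vector

items[[5]] <- "appended"          # grow by assigning one past the end
n_grown <- length(items)

items[[2]] <- NULL                # shrink: assigning NULL removes the element
n_shrunk <- length(items)
```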

RFP - Part1: R Vectors

The simplest data structure in R is the vector. A vector is one dimensional and can be imagined as a sequence of blocks containing values:
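In code, a rough sketch of the same idea might look like this. The name `v` and its values are arbitrary.

```r
v <- c(10, 20, 30)        # three "blocks", each holding one value
n <- length(v)            # number of blocks
second <- v[2]            # blocks are indexed from 1
flat <- is.null(dim(v))   # TRUE: a plain vector carries no dim attribute
longer <- c(v, 40)        # combining yields a new, longer vector
```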

Accurate Calculation of Years between Dates in R

When doing feature engineering, it’s common to turn dates into numbers by calculating time differences. For example, given a date of birth, you may want to calculate age. Given a sign-up date and a churn date, you may want to calculate the days to churn. Depending on the situation, sometimes you want the time difference in years, and sometimes you want it in months, weeks, or days; sometimes you want the years, months, or weeks rounded to whole numbers, and other times you may want to keep the decimals for more accuracy. For example, suppose we want to calculate the number of years (without rounding) between two dates. There are three ways we can go about it:
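One common approximation, not necessarily among the three ways the post goes on to list, divides the day count by 365.25. A minimal base-R sketch with made-up dates:

```r
# Hypothetical dates chosen for illustration.
start <- as.Date("2000-01-01")
end   <- as.Date("2020-01-01")

days <- as.numeric(end - start)   # subtracting Dates gives a difference in days
years_approx <- days / 365.25     # assumes an average year of 365.25 days
```

Because 365.25 is only an average year length, this is an approximation, and the result can drift slightly from the calendar answer depending on how many leap days fall in the interval.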