by Guangming Lang
2 min read

Categories

  • r

Previously, I wrote about the intuitions behind skewness and kurtosis. Today I’m going to do some calculations using the daily adjusted close prices of the S&P500 and NovaGold between January 4, 2007 and October 10, 2014. You can download the data from yahoo and follow along. First of all, I defined some helper functions, which can also be used in other contexts to improve code organization and efficiency.

## helper functions
as_date = function(x, ...) { 
        # Converts numeric or character vectors to dates
        # 
        # Args:
        #       x   : a numeric or character vector
        #       ... : other arguments like format = "%Y%m%d"
        # Returns:
        #       a vector of dates
        dts.char = unique(x)
        if (is.numeric(dts.char)) dts = as.Date(as.character(dts.char), ...)
        else dts = as.Date(dts.char, ...)
        
        # match returns a vector of positions of first matches of elts of x in 
        # dts.char, and because dts.char is unique, match here really finds, 
        # for any element in x, the index of the element in dts.char that 
        # matches it.
        dts[match(x, dts.char)] 
}

calc_cc_ret = function(df) {
        library(dplyr)
        a = mutate(df, Date = as_date(Date, format = "%m/%d/%y")) 
        a = arrange(a, Date)
        a = mutate(a, Pt = Adj.Close, Pt.minus1 = lag(Adj.Close))
        a = a[-1, ]
        mutate(a, rt = 100*log(Pt / Pt.minus1))
}

calc_shape_char = function(x) {
        library(PerformanceAnalytics)
        c(mean=mean(x), sd=sd(x), skewness=skewness(x), 
          excess.kurtosis=kurtosis(x))        
}

plt_density = function(df) {
        library(ggplot2)
        ggplot(df, aes(x=ret, color=label)) + geom_density() + theme_bw() + 
                labs(x = "daily continuously compounded return (%)")
}

Next, I loaded the data into R and calculated the daily continuously compounded returns for each stock.

proj_path = "~/Communities/masterr/_knitr/data"
ng_path = file.path(proj_path, "ng.csv")
sp500_path = file.path(proj_path, "sp500.csv")
ng = read.csv(ng_path, header=T, stringsAsFactors=F)
sp500 = read.csv(sp500_path, header=T, stringsAsFactors=F)

# calculate continuously compounded returns
ng = calc_cc_ret(ng)
sp500 = calc_cc_ret(sp500)

Next, I calculated the 4 shape characteristics of these daily returns for each stock.

# calcuate shape characteristics
(ng.shape = calc_shape_char(ng$rt))
## Error in library(PerformanceAnalytics): there is no package called 'PerformanceAnalytics'
(sp500.shape = calc_shape_char(sp500$rt))
## Error in library(PerformanceAnalytics): there is no package called 'PerformanceAnalytics'

Note the S&P500 had a mild negative skewness while NovaGold had a big negative skewness. Remember the normal distribution has a skewness of 0. In addition, both stocks had excess kurtosis comparing to the normal distribution. In particular, NovaGold had an excess kurtosis of 85.8, which made it much more likely to experience wild price swings than both the S&P500 and a normal distribution would.

Next, I simulated normal data using the corresponding mean and sd of the daily returns of each stock respectively. I then plot the densities of each stock along side with their normal counterparts.

set.seed(1020)
norm.ng = rnorm(nrow(ng), mean=ng.shape["mean"], sd=ng.shape["sd"])
## Error in rnorm(nrow(ng), mean = ng.shape["mean"], sd = ng.shape["sd"]): object 'ng.shape' not found
norm.sp500 = rnorm(nrow(sp500), mean=sp500.shape["mean"], sd=sp500.shape["sd"])
## Error in rnorm(nrow(sp500), mean = sp500.shape["mean"], sd = sp500.shape["sd"]): object 'sp500.shape' not found
library(tidyr)
ret.ng = data.frame(ng = ng$rt, normal = norm.ng)
## Error in data.frame(ng = ng$rt, normal = norm.ng): object 'norm.ng' not found
ret.ng = ret.ng %>% gather(label, ret, ng:normal)
## Error in eval(lhs, parent, parent): object 'ret.ng' not found
ret.sp500 = data.frame(sp500 = sp500$rt, normal = norm.sp500)
## Error in data.frame(sp500 = sp500$rt, normal = norm.sp500): object 'norm.sp500' not found
ret.sp500 = ret.sp500 %>% gather(label, ret, sp500:normal)
## Error in eval(lhs, parent, parent): object 'ret.sp500' not found
# plot
plt_density(ret.ng)
## Error in ggplot(df, aes(x = ret, color = label)): object 'ret.ng' not found
plt_density(ret.sp500)
## Error in ggplot(df, aes(x = ret, color = label)): object 'ret.sp500' not found

Note, both NovaGold and the S&P500 have longer and fatter tails than the normal curve.