Moving mean as a function in dplyr

Question

I'd like to write a function that calculates the moving mean over a variable number of preceding observations and for different variables. Take this as mock data:

df = expand.grid(site = factor(seq(10)),
                 year = 2000:2004,
                 day = 1:50)
df$temp = rpois(nrow(df), 5)

Calculating this for one variable and a fixed window works. For example, this computes the mean temperature over the previous 5 days (the lag excludes the current day):

library(dplyr)
library(zoo)

df <- df %>% 
  group_by(site, year) %>% 
  arrange(site, year, day) %>% 
  mutate(almost_avg = rollmean(x = temp, 5, align = "right", fill = NA)) %>%
  mutate(avg = lag(almost_avg, 1))
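
To make the two steps concrete, here is a minimal sketch on a toy vector (the vector x and a window of 3 are made up for illustration, not taken from the data above). It shows the right-aligned rolling mean and then the one-row shift:

```r
library(zoo)

x <- c(1, 2, 3, 4, 5, 6)

# Right-aligned rolling mean over 3 values, padded with NA at the start
almost_avg <- rollmean(x, k = 3, align = "right", fill = NA)
# NA NA 2 3 4 5

# Shift by one position so each value summarises only earlier observations
avg <- dplyr::lag(almost_avg, 1)
# NA NA NA 2 3 4
```

The first usable value therefore appears one position after the window is full.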

So far so good. Wrapping this in a function, however, fails.

avg_last_x <- function(dataframe, column, last_x) {

  dataframe <- dataframe %>% 
    group_by(site, year) %>% 
    arrange(site, year, day) %>% 
    mutate(almost_avg = rollmean(x = column, k = last_x, align = "right", fill = NA)) %>%
    mutate(avg = lag(almost_avg, 1))

  return(dataframe)
}

avg_last_x(dataframe = df, column = "temp", last_x = 10)

I get this error:

Error in mutate_impl(.data, dots) : k <= n is not TRUE 
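
A likely reading of this message, assuming the string passed as column reaches rollmean() unchanged: n is the length of x, and a string is a vector of length 1, so a window of 10 cannot fit. A minimal base-R sketch of the difference (df_small is a made-up data frame for illustration):

```r
df_small <- data.frame(temp = c(4, 6, 5, 7, 3))
column <- "temp"

# The string itself is a character vector with a single element
length(column)
# 1

# The column it names has one element per row
length(df_small[[column]])
# 5
```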

I understand this is probably related to the evaluation mechanism in dplyr, but I can't get it fixed.

Thanks in advance for your help.


Tags: R | dplyr | moving-average | nse   2017-01-07 11:01

Answers (1)

  1. 2017-01-07 14:01

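    This should fix it. The column name arrives as a string, so it has to be turned into a symbol with as.name() and spliced into the expression with lazyeval's interp():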

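    library(dplyr)
    library(zoo)
    library(lazyeval)

    avg_last_x <- function(dataframe, column, last_x) {
      dataframe %>%
        group_by(site, year) %>%
        arrange(site, year, day) %>%
        # interp() replaces the placeholder c with the column named in `column`
        mutate_(almost_avg = interp(~rollmean(x = c, k = last_x, align = "right",
                                              fill = NA), c = as.name(column)),
                # shift by one row so the current day is excluded
                avg = ~lag(almost_avg, 1))
    }

    avg_last_x(dataframe = df, column = "temp", last_x = 10)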
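
Since mutate_() and lazyeval were deprecated in later dplyr releases, the same idea can be sketched with the .data pronoun instead. This assumes a dplyr version that provides .data (0.7.0 or later). The function name avg_last_x_tidy and the small mock data are made up for illustration and are not part of the original answer:

```r
library(dplyr)
library(zoo)

# Hypothetical tidy-eval variant; `column` is still passed as a string
avg_last_x_tidy <- function(dataframe, column, last_x) {
  dataframe %>%
    group_by(site, year) %>%
    arrange(site, year, day) %>%
    mutate(
      # .data[[column]] picks the column whose name is stored in `column`
      almost_avg = rollmean(x = .data[[column]], k = last_x,
                            align = "right", fill = NA),
      # shift by one row so each day's average covers only earlier days
      avg = dplyr::lag(almost_avg, 1)
    )
}

# Small deterministic mock data so the result is easy to check by eye
small <- expand.grid(site = factor(1:2), year = 2000:2001, day = 1:6)
small$temp <- as.numeric(small$day)

res <- avg_last_x_tidy(small, column = "temp", last_x = 3)
```

With the data from the question, the equivalent call would be avg_last_x_tidy(df, column = "temp", last_x = 10).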
    