add dynamic dof selection for ljung_box feature for both single and multiple models #143

Open
ghost opened this issue Aug 6, 2021 · 0 comments


ghost commented Aug 6, 2021

Unless I am mistaken, the ljung_box feature requires the dof and lag arguments to be specified manually whenever you want something other than the defaults, which are 0 and 1 respectively. This is a problem when your mable contains models with different numbers of estimated parameters, because dof should then differ from model to model. In that case, you'd want the ljung_box feature to calculate the statistic and p-value separately for each model, using that model's own dof.
Example:
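For reference, here is the usual textbook form of the Ljung-Box statistic, which shows where dof enters. The symbols are my own notation, not taken from the package source: T is the number of residuals, r_k the lag-k residual autocorrelation, \ell the lag argument and K the dof argument.

```latex
Q^{*} = T(T+2)\sum_{k=1}^{\ell}\frac{r_k^{2}}{T-k},
\qquad
Q^{*} \;\sim\; \chi^{2}_{\ell-K} \quad \text{under the null of no residual autocorrelation}
```

The statistic itself does not depend on K; only the reference distribution, and therefore the p-value, changes. This is why a single dof shared across models with different parameter counts can give misleading p-values.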

# packages used below (library(fpp3) would attach all of these at once)
library(dplyr)
library(tsibble)
library(tsibbledata)
library(fable)
library(feasts)

# subset data for training
train <- aus_production %>%
  filter_index("1992 Q1" ~ "2006 Q4")

# create models
beer_fit <- train %>%
  model(
    Mean = MEAN(Beer),
    `Seasonal naïve` = SNAIVE(Beer)
  )

# check how many estimated parameters each model has, if any. Only `Mean` will show
# as having at least 1 parameter
beer_fit %>%
  tidy() %>%
  group_by(.model) %>%
  count()

# get ljung box information
beer_fit %>%
  augment() %>%
  features(.innov, ljung_box)

Note that the last command computes the ljung_box statistic with dof = 0 for both models, when Mean should use dof = 1 and only Seasonal naïve should use dof = 0.
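To show how much the choice of dof can matter, here is a small sketch using base R's Box.test(), whose fitdf argument plays the same role as dof. The residuals are simulated stand-ins, not the innovations from the models above, so the numbers are purely illustrative.

```r
# Illustration only: simulated residuals, not output from beer_fit.
set.seed(1)
resid <- rnorm(60)

# Same series and lag, differing only in the degrees-of-freedom adjustment.
lb_dof0 <- Box.test(resid, lag = 10, type = "Ljung-Box", fitdf = 0)
lb_dof1 <- Box.test(resid, lag = 10, type = "Ljung-Box", fitdf = 1)

# The test statistic is identical; the chi-squared df (lag - fitdf) and
# hence the p-value are what change.
data.frame(
  fitdf     = c(0, 1),
  statistic = c(unname(lb_dof0$statistic), unname(lb_dof1$statistic)),
  df        = c(unname(lb_dof0$parameter), unname(lb_dof1$parameter)),
  p_value   = c(lb_dof0$p.value, lb_dof1$p.value)
)
```

With the statistic held fixed, removing a degree of freedom lowers the p-value, so using dof = 0 for a model that estimated parameters makes the test less likely to reject than it should be.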

I believe this can be fixed with a relatively simple mapply() call. The following is just a rough draft and could obviously be improved on:

ljung_box_mult <- function(dat, lag = 10) {

  # One row per model with its number of estimated coefficients (0 if none).
  # This counts rows of tidy() per .model, so it assumes a single series.
  input <- dat %>%
    augment() %>%
    as_tibble() %>%
    distinct(.model) %>%
    left_join(count(tidy(dat), .model), by = ".model") %>%
    mutate(n = coalesce(n, 0L))

  # Run the test one model at a time, passing that model's own dof.
  output <- mapply(function(x, y) {
    dat %>%
      select(all_of(x)) %>%
      augment() %>%
      features(.innov, ljung_box, lag = lag, dof = y)
  }, input$.model, input$n, SIMPLIFY = FALSE)

  bind_rows(output)
}

beer_fit %>%
  ljung_box_mult()

If I am a bonehead and there is a way to do this already please let me know. If not, then I am open to suggestions on how this can be implemented/improved upon.
