---
title: "``horseshoe ``"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(brms)
```
Set up a horseshoe prior in brms
#### Description
Function used to set up a horseshoe prior for population-level effects in brms. The function does not evaluate its arguments; it exists purely to help set up the model.
#### Usage
<pre><code>
horseshoe(df = 1, scale_global = 1, df_global = 1, scale_slab = 2,
df_slab = 4, par_ratio = NULL, autoscale = TRUE)
</code></pre>
#### Arguments
* ``df``: Degrees of freedom of the student-t prior of the local shrinkage parameters. Defaults to 1.
* ``scale_global``: Scale of the student-t prior of the global shrinkage parameter. Defaults to 1. In linear models, ``scale_global`` will internally be multiplied by the residual standard deviation parameter ``sigma``.
* ``df_global``: Degrees of freedom of the student-t prior of the global shrinkage parameter. Defaults to 1.
* ``scale_slab``: Scale of the student-t prior of the regularization parameter. Defaults to 2.
* ``df_slab``: Degrees of freedom of the student-t prior of the regularization parameter. Defaults to 4.
* ``par_ratio``: Ratio of the expected number of non-zero coefficients to the expected number of zero coefficients. If specified, ``scale_global`` is ignored and internally computed as ``par_ratio / sqrt(N)``, where ``N`` is the total number of observations in the data.
* ``autoscale``: Logical; indicates whether the horseshoe prior should be scaled using the residual standard deviation ``sigma`` if possible and sensible. Defaults to TRUE. Autoscaling is not applied for distributional parameters or when the model does not contain the parameter ``sigma``.
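The relationship between ``par_ratio`` and the global scale described above can be sketched in base R. This is only an illustration of the stated formula ``par_ratio / sqrt(N)``; the helper name ``implied_scale_global`` and the example numbers (5 of 50 coefficients expected to be non-zero, 100 observations) are assumptions made for this sketch and are not part of brms.

```r
# Hypothetical helper (not a brms function): global scale implied by par_ratio,
# following the formula par_ratio / sqrt(N) given in the argument description.
implied_scale_global <- function(par_ratio, n_obs) {
  stopifnot(par_ratio > 0, n_obs > 0)
  par_ratio / sqrt(n_obs)
}

# Assumed example values: 5 of 50 coefficients expected to be non-zero.
p_nonzero <- 5
p_total <- 50
n_obs <- 100

# Ratio of expected non-zero to expected zero coefficients.
par_ratio <- p_nonzero / (p_total - p_nonzero)

scale_global <- implied_scale_global(par_ratio, n_obs)
print(scale_global)
```

With these assumed numbers the implied global scale is far smaller than the default of 1, which is why specifying ``par_ratio`` usually leads to stronger shrinkage than the default setting.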
#### Value
A character string obtained by ``match.call()`` with additional arguments.
#### Examples
```{r}
set_prior(horseshoe(df = 3, par_ratio = 0.1))
```
#### Details
The horseshoe prior is a special shrinkage prior initially proposed by Carvalho et al. (2009). It is symmetric around zero, with fat tails and an infinitely large spike at zero. This makes it well suited to sparse models that have many regression coefficients of which only a minority are non-zero.
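The spike at zero and the fat tails can be seen in a small base R simulation. This is a minimal sketch of the basic horseshoe (normal coefficients with half-Cauchy local scales), not the code brms generates; the global scale is held fixed at 1 and the thresholds 0.05 and 5 are arbitrary choices made for illustration.

```r
set.seed(1)
n_draws <- 1e5

# Half-Cauchy local scales: absolute value of a student-t draw with 1 df.
lambda <- abs(rt(n_draws, df = 1))

# Assumption for this sketch: the global scale is fixed at 1.
tau <- 1

# Horseshoe draws versus standard normal draws for comparison.
beta_hs <- rnorm(n_draws, mean = 0, sd = tau * lambda)
beta_norm <- rnorm(n_draws, mean = 0, sd = 1)

# Share of draws very close to zero (the spike) ...
near_zero <- c(horseshoe = mean(abs(beta_hs) < 0.05),
               normal    = mean(abs(beta_norm) < 0.05))

# ... and share of draws far from zero (the fat tails).
far_out <- c(horseshoe = mean(abs(beta_hs) > 5),
             normal    = mean(abs(beta_norm) > 5))

print(near_zero)
print(far_out)
```

Compared with the standard normal, the horseshoe draws should show more mass both very close to zero and far out in the tails.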
The horseshoe prior can be applied to all population-level effects at once (excluding the intercept) by using ``set_prior("horseshoe(1)")``. The ``1`` implies that the student-t prior of the local shrinkage parameters has 1 degree of freedom. This may, however, lead to an increased number of divergent transitions in Stan. Accordingly, increasing the degrees of freedom to slightly higher values (e.g., 3) may often be a better option, although the prior no longer resembles a horseshoe in this case.
Further, the scale of the global shrinkage parameter plays an important role in the amount of shrinkage applied. It defaults to 1, but this may result in too little shrinkage (Piironen & Vehtari, 2016). It is thus possible to change the scale using the argument ``scale_global`` of the horseshoe prior, for instance ``horseshoe(1, scale_global = 0.5)``. In linear models, ``scale_global`` will internally be multiplied by the residual standard deviation parameter ``sigma``. See Piironen and Vehtari (2016) for recommendations on how to properly set the global scale. The degrees of freedom of the global shrinkage prior may also be adjusted via the argument ``df_global``.
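How the global scale affects the amount of shrinkage can be illustrated with prior draws in base R. This is a rough sketch under simplifying assumptions (half-Cauchy local and global scales, ``sigma`` treated as a known constant); the function ``draw_beta`` is a hypothetical helper for this example and does not reproduce the exact parameterization used internally by brms.

```r
set.seed(2)
n_draws <- 1e5

# Shared random draws, so that only the global scale differs between settings.
z        <- rnorm(n_draws)              # standard normal innovations
lambda   <- abs(rt(n_draws, df = 1))    # local shrinkage scales
tau_unit <- abs(rt(n_draws, df = 1))    # global shrinkage scale with unit scale

# Hypothetical helper: prior draws of a coefficient for a given global scale.
# In linear models the global scale is multiplied by sigma (assumed known here).
draw_beta <- function(scale_global, sigma = 1) {
  z * lambda * (scale_global * sigma * tau_unit)
}

# Median absolute coefficient under three global scales.
scales <- c(1, 0.5, 0.1)
median_abs <- sapply(scales, function(s) median(abs(draw_beta(s))))
names(median_abs) <- scales

print(median_abs)
```

Because the global scale multiplies every coefficient's prior standard deviation, halving ``scale_global`` halves the typical prior magnitude of all coefficients in this sketch.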
Piironen and Vehtari (2017) recommend specifying the ratio of the expected number of non-zero coefficients to the expected number of zero coefficients, ``par_ratio``, rather than ``scale_global`` directly. As proposed by Piironen and Vehtari (2017), an additional regularization is applied that only affects non-zero coefficients. The amount of regularization can be controlled via ``scale_slab`` and ``df_slab``. To make sure that shrinkage can equally affect all coefficients, predictors should be on the same scale. Generally, models with horseshoe priors are more likely than other models to have divergent transitions, so increasing ``adapt_delta`` from 0.8 to values closer to 1 will often be necessary.
See the documentation of ``brm`` for instructions on how to increase ``adapt_delta``.
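The effect of the additional regularization can be sketched in base R using the regularized local scale from Piironen and Vehtari (2017), where the squared local scale is replaced by `c^2 * lambda^2 / (c^2 + tau^2 * lambda^2)`. In this sketch the slab width `c` is fixed at 2 for simplicity instead of being given its own prior, and ``regularized_sd`` is a hypothetical helper; the parameterization used internally by brms may differ.

```r
# Hypothetical helper: prior standard deviation of a coefficient under the
# regularized horseshoe, for local scale lambda, global scale tau and slab width slab.
regularized_sd <- function(lambda, tau, slab) {
  lambda_tilde <- sqrt(slab^2 * lambda^2 / (slab^2 + tau^2 * lambda^2))
  tau * lambda_tilde
}

# Assumed example values: small to very large local scales, fixed global scale.
lambda <- c(0.1, 1, 10, 1000)
tau <- 0.1

plain_sd <- tau * lambda                          # unregularized horseshoe
reg_sd   <- regularized_sd(lambda, tau, slab = 2) # regularized horseshoe

print(rbind(plain = plain_sd, regularized = reg_sd))
```

Coefficients with small local scales are left essentially unchanged, while the prior standard deviation of coefficients with very large local scales is capped near the slab width. This is the sense in which the regularization only affects non-zero coefficients.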
#### References
Carvalho, C. M., Polson, N. G., & Scott, J. G. (2009). Handling sparsity via the horseshoe. In International Conference on Artificial Intelligence and Statistics (pp. 73-80).

Piironen, J., & Vehtari, A. (2016). On the Hyperprior Choice for the Global Shrinkage Parameter in the Horseshoe Prior. https://arxiv.org/pdf/1610.05559v1.pdf

Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. https://arxiv.org/abs/1707.01694
#### See Also
set_prior