Last updated: 2017-07-18
Code version: 49915de
Dirichlet adaptive shrinkage (dash) performs Bayesian adaptive shrinkage to produce refined probability estimates for compositional data. Assume the counts of A, C, G, T at one position follow a multinomial distribution: \((n_A,n_C,n_G,n_T)\sim \text{Multinomial}(n;\,p_A,p_C,p_G,p_T)\), where \(n=n_A+n_C+n_G+n_T\) and \(p_A+p_C+p_G+p_T=1\), and that the probability vector follows a mixture of symmetric Dirichlet distributions: \(p=(p_A,p_C,p_G,p_T)\sim \sum_{k=1}^K\pi_k\,\text{Dirichlet}(\alpha_k,\ldots,\alpha_k)\).

The component Dirichlet distributions in the mixture have varying degrees of concentration, from infinity down to less than 1. A concentration of Inf corresponds to a point mass at the uniform vector \((\frac{1}{4},\frac{1}{4},\frac{1}{4},\frac{1}{4})\). Here, the \(\alpha_k,\ k=1,\ldots,K\), are chosen to be \(Inf, 100, 50, 20, 10, 5, 2, 1, 0.5, 0.1\). To give the model a shrinkage effect, the prior on \(\pi_k,\ k=1,\ldots,K\), is proportional to \((10,1,1,1,1,1,1,1,1,1)\), which puts the most weight on the point-mass component.

The posterior mean of \(p\) is the estimator. For example, \(\hat p_A=\sum_{k=1}^K\hat\pi_k\,\hat p_{Ak}\), where \(\hat p_{Ak}=\frac{n_A+\alpha_k}{n+4\alpha_k}\) is the posterior mean of \(p_A\) under the \(k\)-th Dirichlet component. Hence, \[\hat p_A=\sum_{k=1}^K\hat\pi_k\left[\frac{n_A}{n}\cdot\frac{n}{n+4\alpha_k}+\frac{\alpha_k}{4\alpha_k}\cdot\frac{4\alpha_k}{n+4\alpha_k}\right]\] \[=\frac{n_A}{n}\sum_{k=1}^K\hat\pi_k\frac{n}{n+4\alpha_k}+\frac{1}{4}\sum_{k=1}^K\hat\pi_k\frac{4\alpha_k}{n+4\alpha_k}.\]

So \(\hat p_A\) is a weighted average of the observed proportion \(\frac{n_A}{n}\) and the prior mean \(\frac{1}{4}\); the two weights sum to 1 because \(\sum_{k=1}^K\hat\pi_k=1\). When \(n\to \infty\), \(\hat p_A\to \frac{n_A}{n}\) (for components with finite \(\alpha_k\)); when \(n\to 0\), \(\hat p_A\to \frac{1}{4}\). If the posterior weight concentrates on a single component (\(\hat\pi_k\to1\)), then \(\hat p_A\to \frac{1}{4}\) as \(\alpha_k\to\infty\), and \(\hat p_A\to \frac{n_A}{n}\) as \(\alpha_k\to0\).
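The closed-form estimator above can be checked numerically with a few lines of R. This is only a minimal sketch, not the dash implementation: the function name `shrink_position` and the example counts are hypothetical, and the mixture weights \(\hat\pi_k\) are taken as given (here simply set to the normalized prior weights) rather than estimated from data.

```r
# Hypothetical helper: posterior-mean estimate at one position for given weights.
# counts : counts (n_A, n_C, n_G, n_T)
# alpha  : concentration parameters alpha_k (Inf is allowed)
# pi_hat : mixture weights, assumed already estimated, summing to 1
shrink_position <- function(counts, alpha, pi_hat) {
  stopifnot(length(alpha) == length(pi_hat), abs(sum(pi_hat) - 1) < 1e-8)
  n <- sum(counts)
  d <- length(counts)  # 4 categories for A, C, G, T
  # Per-component weight on the observed proportion, n / (n + 4 * alpha_k);
  # this is 0 for alpha_k = Inf, i.e. the point mass ignores the data.
  w_data <- n / (n + d * alpha)
  lambda <- sum(pi_hat * w_data)  # total weight on the observed proportion
  obs <- if (n > 0) counts / n else rep(1 / d, d)
  # Weighted average of observed proportion and prior mean 1/4
  lambda * obs + (1 - lambda) * rep(1 / d, d)
}

alpha  <- c(Inf, 100, 50, 20, 10, 5, 2, 1, 0.5, 0.1)
pi_hat <- c(10, rep(1, 9)) / 19         # assumption: weights equal to the prior
counts <- c(A = 8, C = 1, G = 1, T = 0) # made-up example counts
est <- shrink_position(counts, alpha, pi_hat)
print(est)
```

With all weight on the `Inf` component the function returns 1/4 for every base, and with all weight on a small \(\alpha_k\) it stays close to the raw proportions, which matches the limiting behaviour described above.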
sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.0 backports_1.0.5 magrittr_1.5 rprojroot_1.2
[5] tools_3.4.0 htmltools_0.3.5 yaml_2.1.14 Rcpp_0.12.11
[9] stringi_1.1.5 rmarkdown_1.6 knitr_1.15.1 git2r_0.18.0
[13] stringr_1.2.0 digest_0.6.12 evaluate_0.10
This R Markdown site was created with workflowr