ShrinkBayes
This site corresponds to the papers:
Van de Wiel MA, Neerincx M, Buffart TE, Sie D, Verheul HMW (2014). ShrinkBayes: a versatile R-package for analysis of count-based sequencing data in complex study designs. BMC Bioinformatics. 15(1):116.
Van de Wiel MA, De Menezes RX, Siebring E, Van Beusechem VW (2013). Analysis of small-sample clinical genomics studies using multi-parameter shrinkage: application to high-throughput RNA interference screening. BMC Med Genom 6 (Suppl 2), S1.
Van de Wiel MA, Leday GGR, Pardo L, Rue H, Van der Vaart AW, Van Wieringen WN (2012). Bayesian analysis of RNA sequencing data by estimating multiple shrinkage priors. Biostatistics 14, 113-128.
Why use ShrinkBayes?
- It has been demonstrated to be more reproducible than most other approaches, in particular for small samples.
- It is built upon INLA and is hence much faster than MCMC.
- Allows for (zero-inflated) counts and Gaussian data [hence it can be applied to (mi)RNAseq, CAGE, mRNA/miRNA microarray, HT RNAi, etc.]
- Very versatile in terms of study designs: it works in a GLM context and allows for random effects.
- Provides Bayesian FDR and lfdr estimates.
- Accommodates a variety of priors, including mixture and nonparametric ones.
- Enables multi-parameter shrinkage
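To make the relation between lfdr and Bayesian FDR concrete, here is a minimal base-R sketch. It relies on the standard fact that the Bayesian FDR of a selected set equals the average local fdr of its members. The lfdr values are made up and the helper `bfdr_from_lfdr` is hypothetical; it is not a ShrinkBayes function and may differ from what the package does internally.

```r
# Made-up local fdr values (posterior null probabilities) for 6 features.
lfdr <- c(0.40, 0.01, 0.90, 0.05, 0.20, 0.02)

# Hypothetical helper (not part of ShrinkBayes): for each feature, the
# Bayesian FDR of the set of all features with an lfdr at most as large,
# computed as the running mean of the sorted lfdr values.
bfdr_from_lfdr <- function(lfdr) {
  ord <- order(lfdr)
  running_mean <- cumsum(lfdr[ord]) / seq_along(lfdr)
  out <- numeric(length(lfdr))
  out[ord] <- running_mean   # put results back in the original feature order
  out
}

bfdr <- bfdr_from_lfdr(lfdr)

# Features selected when controlling the Bayesian FDR at 0.1.
selected <- which(bfdr <= 0.1)
print(data.frame(lfdr = lfdr, bfdr = bfdr))
print(selected)
```

Because the Bayesian FDR averages over the selected set, a feature with a moderately large lfdr can still be selected when the features ranked above it have very small lfdr values.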
R Package (AVAILABLE FROM GITHUB since 20/4/2016)
Note: if you can choose between Windows and Unix/Linux, opt for the latter. ShrinkBayes runs more efficiently under Unix/Linux than under Windows.
NOTE: when running ShrinkBayes you may see *** WARNINGS *** from INLA (e.g. on eigenvalues, on convergence, or even something like 18500 Aborted...). They currently cannot be suppressed, because they are produced by C code. Please ignore them.
Installation instructions
For Windows users: PLEASE shut down Windows Error Reporting.
Windows XP: Windows key + Pause/Break, Advanced, Error reporting, Completely (no critical errors either).
Windows 7 (and other): see shutdown error reporting
ShrinkBayes depends on the following packages (see below for installation):
INLA (which requires packages sp and pixmap), snowfall, VGAM, mclust, logcondens, Iso, XML, rgl [All available from CRAN]
Steps:
1. install.packages(c("sp","pixmap", "snowfall", "VGAM", "mclust", "logcondens", "Iso","XML","rgl"), repos="http://cran.r-project.org")
Unix/Linux: if you can't install "XML" or "rgl", try
sudo apt-get build-dep r-cran-xml
sudo apt-get build-dep r-cran-rgl
2. source("http://www.math.ntnu.no/inla/givemeINLA.R")
[or, if you installed INLA before 01/10/2012, upgrade it by using inla.upgrade()]
3. library(devtools)
install_github("markvdwiel/ShrinkBayes")
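Before loading ShrinkBayes it may help to check which of the dependencies listed above are actually installed. The sketch below uses only base R. The helper name `missing_deps` is hypothetical and not part of ShrinkBayes or INLA.

```r
# Dependencies as listed on this page (INLA itself is installed in step 2).
deps <- c("sp", "pixmap", "snowfall", "VGAM", "mclust",
          "logcondens", "Iso", "XML", "rgl", "INLA")

# Hypothetical helper: returns the names of packages that are not installed.
# It only inspects the library paths and does not load any package.
missing_deps <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

still_missing <- missing_deps(deps)
if (length(still_missing) > 0) {
  message("Not yet installed: ", paste(still_missing, collapse = ", "))
} else {
  message("All listed dependencies were found.")
}
```

If packages are reported as missing, repeat the corresponding step above before calling install_github.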
#### IMPORTANT NOTICE !!!!! ####
ShrinkBayes does NOT perform internal NORMALIZATION. Here's a solution using edgeR's TMM normalization.
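library(edgeR)
dge <- DGEList(counts=mydat) #mydat: matrix of raw counts (features in rows, samples in columns)
cnf <- calcNormFactors(dge,method="TMM")
normfac <- cnf$samples$norm.factors #the normalization factors are stored here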
libsize <- colSums(mydat)
rellibsize <- libsize/exp(mean(log(libsize))) #relative library size
nf <- normfac * rellibsize #final normalization factor including library size
SOLUTION 1: Produce normalized counts and apply ShrinkBayes on those.
mydatnorm <- round(sweep(mydat, 2, nf, "/"))
SOLUTION 2: Leave the counts as they are (so apply ShrinkBayes to mydat), but add sample-specific offsets to the model by specifying
myoffsets <- log(nf)
form <- ~ 1 + group + offset(myoffsets)
The second solution is the preferred one, but the first may be useful
when you wish to use the normalized counts for other purposes as well.
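The two solutions can be tried out on toy data with base R alone. The sketch below makes several assumptions: the count matrix is made up, the TMM factors are simply set to 1 so that edgeR is not needed, and an ordinary Poisson glm for a single feature stands in for the ShrinkBayes fit, purely to show what the offset does.

```r
# Toy count matrix: 3 features (rows) x 4 samples (columns); values are made up.
mydat <- matrix(c( 10,  20,  30,  40,
                    5,  10,  15,  20,
                  100, 200, 300, 400),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("tag", 1:3), paste0("s", 1:4)))
group <- factor(c("A", "A", "B", "B"))

# Assumption: TMM factors equal to 1 for every sample (edgeR not used here).
normfac <- rep(1, ncol(mydat))

libsize <- colSums(mydat)                        # 115 230 345 460
rellibsize <- libsize / exp(mean(log(libsize)))  # scaled to geometric mean 1
nf <- normfac * rellibsize                       # final normalization factor

# Solution 1: normalized (rounded) counts.
mydatnorm <- round(sweep(mydat, 2, nf, "/"))

# Solution 2: keep the raw counts and use log(nf) as a sample-specific offset.
myoffsets <- log(nf)

# Stand-in for the model fit: a plain Poisson GLM for one feature.
d <- data.frame(y = mydat["tag1", ], group = group, off = myoffsets)
fit <- glm(y ~ 1 + group + offset(off), family = poisson, data = d)

print(mydatnorm)
print(coef(fit))
```

In this toy matrix every count is proportional to the library size of its sample, so the normalized counts are constant within each row and the estimated group effect in the offset model is essentially zero.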
#### END OF NOTICE ####
RNA-seq data
Full data sets are available from the ReCount web site. Below: data as used in the ShrinkSeq paper.
Balanced split.
Four unbalanced splits
The small set of the first split is used as an example in the Supplementary Material of the Biostatistics paper.