ShrinkBayes

This site corresponds to the papers:

Van de Wiel MA, Neerincx M, Buffart TE, Sie D, Verheul HMW (2014). ShrinkBayes: a versatile R-package for analysis of count-based sequencing data in complex study designs. BMC Bioinformatics. 15(1):116.

Van de Wiel MA, De Menezes RX, Siebring E, Van Beusechem VW (2013). Analysis of small-sample clinical genomics studies using multi-parameter shrinkage: application to high-throughput RNA interference screening. BMC Med Genom 6 (Suppl 2), S1.

Van de Wiel MA, Leday GGR, Pardo L, Rue H, Van der Vaart AW, Van Wieringen WN (2012). Bayesian analysis of RNA sequencing data by estimating multiple shrinkage priors. Biostatistics 14, 113-128.

Why use ShrinkBayes?

Demonstrated to be more reproducible than most other approaches, in particular for small samples.
Built on INLA, and hence much faster than MCMC.
Allows for (zero-inflated) count and Gaussian data, so it can be applied to (mi)RNAseq, CAGE, mRNA/miRNA microarray, HT RNAi, etc.
Very versatile in terms of designs: works in a GLM context and allows for random effects.
Provides Bayesian FDR and lfdr estimates.
Accommodates a variety of priors, including mixture and nonparametric ones.
Enables multi-parameter shrinkage.
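
To make the relation between lfdr and Bayesian FDR (BFDR) concrete, here is a minimal base-R sketch of the generic definition: the BFDR of a selected list is the average lfdr of the features in that list. This is not ShrinkBayes' own implementation, and the lfdr values are made up for illustration.

```r
# Hypothetical local false discovery rates for six features (made-up values).
lfdr <- c(0.01, 0.40, 0.03, 0.90, 0.10, 0.02)

# Rank features from most to least promising.
ord <- order(lfdr)

# BFDR of the top-k list = running mean of the k smallest lfdr values.
bfdr_sorted <- cumsum(lfdr[ord]) / seq_along(ord)

# Map the values back to the original feature order.
bfdr <- numeric(length(lfdr))
bfdr[ord] <- bfdr_sorted

# Features that can be selected while keeping the BFDR at or below 0.05.
selected <- which(bfdr <= 0.05)
print(selected)
```

Because the running mean of sorted values never decreases, thresholding bfdr always selects a top-ranked set of features.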

R Package (AVAILABLE FROM GITHUB since 20/4/2016)
Note: if you have the choice between Windows and Unix/Linux, opt for the latter. ShrinkBayes runs more efficiently under Unix/Linux than under Windows.
NOTE: when running ShrinkBayes you may see *** WARNINGS *** from INLA (e.g. on eigenvalues, on convergence, or even something like 18500 Aborted...). They currently cannot be suppressed, because they are produced by C code. Please ignore them.

Installation instructions
For Windows users: PLEASE shut down Windows Error Reporting. Windows XP: Windows key + Pause/Break, Advanced, Error reporting, Completely (no critical errors either). Windows 7 (and others), see: shutdown error reporting

ShrinkBayes depends on the following packages (see below for installation):
INLA (which requires packages sp and pixmap), snowfall, VGAM, mclust, logcondens, Iso, XML, rgl [All available from CRAN]

Steps:
1. install.packages(c("sp","pixmap", "snowfall", "VGAM", "mclust", "logcondens", "Iso","XML","rgl"), repos="http://cran.r-project.org")

Unix/Linux: if "XML" or "rgl" fails to install, try
sudo apt-get build-dep r-cran-xml
sudo apt-get build-dep r-cran-rgl

2. source("http://www.math.ntnu.no/inla/givemeINLA.R")
[or if you installed INLA before 01/10/2012 you should upgrade by using inla.upgrade() ]

3. library(devtools)
install_github("markvdwiel/ShrinkBayes")
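
Before or after running the steps above, it can help to check which dependencies R can actually find. The following is a minimal sketch; missing_pkgs is a hypothetical helper written for this page, not part of ShrinkBayes. It only looks up installed packages and does not load them.

```r
# CRAN dependencies listed above; INLA is installed separately in step 2.
deps <- c("sp", "pixmap", "snowfall", "VGAM", "mclust",
          "logcondens", "Iso", "XML", "rgl")

# Hypothetical helper: return the names of packages that are not installed.
# system.file() returns "" when a package cannot be found.
missing_pkgs <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                  logical(1))
  pkgs[!found]
}

# Lists whatever still needs to be installed (empty if all is in place).
print(missing_pkgs(c(deps, "INLA", "devtools")))
```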

#### IMPORTANT NOTICE !!!!! ####
ShrinkBayes does NOT perform internal NORMALIZATION. Here's a solution using edgeR's TMM normalization.

library(edgeR)
cnf <- calcNormFactors(DGEList(counts=mydat),method="TMM") #mydat: matrix of raw counts (features x samples)
normfac <- cnf$samples$norm.factors #the normalization factors are stored here
libsize <- colSums(mydat)
rellibsize <- libsize/exp(mean(log(libsize))) #relative library size
nf <- normfac * rellibsize #final normalization factor including library size
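
To illustrate what the relative library size does, here is a small base-R sketch on a made-up count matrix. The matrix toy and its values are assumptions for illustration only; the TMM factors are set to 1 as a placeholder so that the example runs without edgeR.

```r
# Toy count matrix: 3 features (rows) x 4 samples (columns); values are made up.
# matrix() fills column by column, so each group of three numbers is one sample.
toy <- matrix(c(10, 20,  30,
                20, 40,  60,
                 5, 10,  15,
                40, 80, 120),
              nrow = 3,
              dimnames = list(paste0("f", 1:3), paste0("s", 1:4)))

libsize <- colSums(toy)                 # total count per sample
geomean <- exp(mean(log(libsize)))      # geometric mean of the library sizes
rellibsize <- libsize / geomean         # relative library size

# Placeholder for the TMM factors (in practice taken from edgeR).
normfac <- rep(1, ncol(toy))
nf <- normfac * rellibsize

print(round(nf, 3))
```

Dividing by the geometric mean centres the relative library sizes so that their product equals 1. The factors then rescale samples relative to each other without changing the overall count level.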

SOLUTION 1: Produce normalized counts and apply ShrinkBayes on those.

mydatnorm <- round(sweep(mydat, 2, nf, "/"))

SOLUTION 2: Leave the counts as they are (so apply ShrinkBayes to mydat), but add sample-specific offsets to the model by specifying

myoffsets <- log(nf)
form <- ~ 1 + group + offset(myoffsets)

The second solution is the preferred one, but the first may be useful when you want to use the normalized counts for other purposes as well.
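
The effect of the offset can be seen in a minimal sketch with an ordinary Poisson GLM from base R. This is not ShrinkBayes itself, which fits its models through INLA with shrinkage priors. The counts, groups and normalization factors below are made up, chosen so that both groups have the same underlying rate and differ only in sequencing depth.

```r
# Made-up normalization factors and counts for one feature in six samples.
# The underlying rate is 20 in every sample: counts = 20 * nf.
nf     <- c(0.5, 0.5, 1, 2, 2, 1)
counts <- 20 * nf
group  <- factor(rep(c("A", "B"), each = 3))

# Without normalization the depth difference shows up as a group effect.
fit_raw <- glm(counts ~ 1 + group, family = poisson)

# SOLUTION 2: keep the raw counts, add log(nf) as a sample-specific offset.
myoffsets  <- log(nf)
fit_offset <- glm(counts ~ 1 + group + offset(myoffsets), family = poisson)

# SOLUTION 1 for comparison: normalized (rounded) counts.
normcounts <- round(counts / nf)

print(coef(fit_raw)["groupB"])     # spurious effect caused by depth alone
print(coef(fit_offset)["groupB"])  # essentially zero once the offset is used
```

With the offset, the model works on the scale of counts per unit of normalization factor while the likelihood still uses the original integer counts. Rounding normalized counts, as in the first solution, discards that information.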

#### END OF NOTICE ####

RNA-seq data
Full data sets are available from the ReCount web site. Below: data as used in the ShrinkSeq paper.
Balanced split.
Four unbalanced splits

The small set of the first split is used as an Example in the Supplementary Material of the Biostatistics paper.