Bioconductor software has become a standard tool for the analysis and comprehension of data from high-throughput genomics experiments. Its application spans a broad field of technologies used in contemporary molecular biology. In this volume, the authors present a collection of case studies that apply Bioconductor tools to the analysis of microarray gene expression data.
Topics covered include:
* import and preprocessing of data from various sources
* statistical modeling of differential gene expression
* biological metadata
* application of graphs and graph rendering
* machine learning for clustering and classification problems
* gene set enrichment analysis
Each chapter of this book describes an analysis of real data using a hands-on, example-driven approach. Short exercises help in the learning process and invite more advanced consideration of key topics.
The book is a dynamic document. All the code shown can be executed on a local computer, and readers are able to reproduce every computation, figure, and table.
Table of contents:
* The ALL data set
* R and Bioconductor introduction
* Processing Affymetrix expression data
* Two color arrays
* Fold changes, log-ratios, background correction, shrinkage estimation and variance stabilization
* Easy differential expression
* Differential expression
* Annotation and metadata
* Supervised machine learning
* Unsupervised machine learning
* Using graphs for interactome data
* Graph layout
* Gene set enrichment analysis
* Hypergeometric testing used for gene set enrichment analysis
* Solutions to exercises
* References
* Index
From the reviews:
"This work has extended R substantially and is an important tool for research. ... All the code, including solutions to the exercises, is available for downloading on the Web and - this is well worth mentioning - it runs straight out of the box. ... The book describes various analyses, provides the code for them and discusses the output. This makes for an easy read and anyone who works through the book will gain confidence that they can carry out analysis on their own data. The discussion of analysis is generally sound and practical. In particular the interpretation of the results of clustering is more sensible than you often see. ... This book is strongly recommended for learning more about Bioconductor." (Antony Unwin, Journal of Statistical Software, January 2009, Volume 29, Book Review 1)
"The readership of this book will be specialized but the text deserves to be read more widely within the statistics and computer science communities as there is much to interest the inquiring mind. ... Exercises for private study and their solutions are provided as an integral part of the text." (C.M. O'Brien, International Statistical Review, 2009, 77, 1)
"One of the great advantages of the R language is its dynamic nature, where code and other resources are continuously generated in order to address novel analytical challenges. Microarray gene expression data present such a challenge, and the Bioconductor project has risen over the years to become the foremost central repository of R-implemented approaches for such data. However, while individual packages within Bioconductor are usually well documented, it is often hard to know which packages to use in what circumstances, especially when tools from several packages are best used in concert. This text aims to fill that void by offering a collection of case studies derived from the authors' own Bioconductor courses, covering the topics of processing raw intensities; correcting for background noise and variation across chips; differential expression analysis; machine learning for clustering and classification; graph creation; and gene set enrichment. ... All in all, this text is an excellent, well-written reference for many of the common tasks that arise during the analysis of microarray gene expression datasets, as implemented by Bioconductor. It is well worth the modest sum required for its purchase." (The American Statistician, May 2010, Vol. 64, No. 2)