Like a data-guzzling turbo engine, advanced data mining has been powering post-genome biological studies for two decades. Reflecting this growth, Biological Data Mining presents comprehensive data mining concepts, theories, and applications in current biological and medical research. Each chapter is written by a distinguished team of interdisciplinary data mining researchers who cover state-of-the-art biological topics.
The first section of Biological Data Mining discusses challenges and opportunities in analyzing and mining biological sequences and structures to gain insight into molecular functions. The second section addresses emerging computational challenges in interpreting high-throughput Omics data. Biological Data Mining then describes the relationships between data mining and related areas of computing, including knowledge representation, information retrieval, and data integration for structured and unstructured biological data. The last part explores emerging data mining opportunities for biomedical applications.
Biological Data Mining examines the concepts, problems, progress, and trends in developing and applying new data mining techniques to the rapidly growing field of genome biology. By studying the concepts and case studies presented, readers will gain significant insight and develop practical solutions for similar biological data mining projects in the future.
SEQUENCE, STRUCTURE, AND FUNCTION
Consensus Structure Prediction for RNA Alignments
Junilda Spirollari and Jason T.L. Wang
Invariant Geometric Properties of Secondary Structure Elements in Proteins
Matteo Comin, Concettina Guerra, and Giuseppe Zanotti
Discovering 3D Motifs in RNA
Alberto Apostolico, Giovanni Ciriello, Christine E. Heitsch, and Concettina Guerra
Protein Structure Classification Using Machine Learning Methods
Yazhene Krishnaraj and Chandan Reddy
Protein Surface Representation and Comparison: New Approaches in Structural Proteomics
Lee Sael and Daisuke Kihara
Advanced Graph Mining Methods for Protein Analysis
Yi-Ping Phoebe Chen, Jia Rong, and Gang Li
Predicting Local Structure and Function of Proteins
Huzefa Rangwala and George Karypis
GENOMICS, TRANSCRIPTOMICS, AND PROTEOMICS
Computational Approaches for Genome Assembly Validation
Jeong-Hyeon Choi, Haixu Tang, Sun Kim, and Mihai Pop
Mining Patterns of Epistasis in Human Genetics
Jason H. Moore
Discovery of Regulatory Mechanisms from Gene Expression Variation by eQTL Analysis
Yang Huang, Jie Zheng, and Teresa M. Przytycka
Statistical Approaches to Gene Expression Microarray Data Preprocessing
Megan Kong, Elizabeth McClellan, Richard H. Scheuermann, and Monnie McGee
Application of Feature Selection and Classification to Computational Molecular Biology
Paola Bertolazzi, Giovanni Felici, and Giuseppe Lancia
Statistical Indices for Computational and Data-Driven Class Discovery in Microarray Data
Raffaele Giancarlo, Davide Scaturro, and Filippo Utro
Computational Approaches to Peptide Retention Time Prediction for Proteomics
Xiang Zhang, Cheolhwan Oh, Catherine P. Riley, Hyeyoung Cho, and Charles Buck
FUNCTIONAL AND MOLECULAR INTERACTION NETWORKS
Inferring Protein Functional Linkage Based on Sequence Information and Beyond
Li Liao
Computational Methods for Unraveling Transcriptional Regulatory Networks in Prokaryotes
Dongsheng Che and Guojun Li
Computational Methods for Analyzing and Modeling Biological Networks
Nataša Pržulj and Tijana Milenković
Statistical Analysis of Biomolecular Networks
Jing-Dong J. Han and Chris J. Needham
LITERATURE, ONTOLOGY, AND KNOWLEDGE INTEGRATION
Beyond Information Retrieval: Literature Mining for Biomedical Knowledge Discovery
Javed Mostafa, Kazuhiro Seki, and Weimao Ke
Mining Biological Interactions from Biomedical Texts for Efficient Query Answering
Muhammad Abulaish, Lipika Dey, and Jahiruddin
Ontology-Based Knowledge Representation of Experiment Metadata in Biological Data Mining
Richard H. Scheuermann, Megan Kong, Carl Dahlke, Jennifer Cai, Jamie Lee, Yu Qian, Burke Squires, Patrick Dunn, Jeff Wiser, Herb Hagler, Barry Smith, and David Karp
Redescription Mining and Applications in Bioinformatics
Naren Ramakrishnan and Mohammed J. Zaki
GENOME MEDICINE APPLICATIONS
Data Mining Tools and Techniques for Identification of Biomarkers for Cancer
Mick Correll, Simon Beaulah, Robin Munro, Jonathan Sheldon, Yike Guo, and Hai Hu
Cancer Biomarker Prioritization: Assessing the in vivo Impact of in vitro Models by in silico Mining of Microarray Database, Literature, and Gene Annotation
Chia-Ju Lee, Zan Huang, Hongmei Jiang, John Crispino, and Simon Lin
Biomarker Discovery by Mining Glycomic and Lipidomic Data
Haixu Tang, Mehmet Dalkilic, and Yehia Mechref
Data Mining Chemical Structures and Biological Data
Glenn J. Myatt and Paul E. Blower
Jake Y. Chen is an assistant professor of informatics at Indiana University, an assistant professor of computer science at Purdue University, and director of the Indiana Center for Systems Biology and Personalized Medicine.
Stefano Lonardi is an associate professor of computer science and engineering at the University of California, Riverside.
"The book will be useful to those interested in applying data mining to biology. Specialists in interdisciplinary areas will also find the book helpful. Despite the diversity of the topics presented, the editors manage to maintain homogeneity throughout the book. I recommend this book as a valuable resource on biological data mining. The chapters offer a wealth of useful information [...]"
– Computing Reviews, January 2011
"[...] Chen and Lonardi present in this book a showcase of successful recent projects in the research area where biology, computer science, and statistics intersect. The editors have done a good job of pulling together the work of over 80 authors into a well-typeset product with high-resolution graphics and even several diagrams of proteins. [...] The authors leave no stone unturned in terms of topics and techniques. [...] There is a veritable alphabet soup of special software employed [...] there is something for everyone with an interest in bioinformatics in this book. Make sure your library has a copy, or that you buy one for yourselves.
– International Statistical Review (2010), 78, 3