Data Mining with R: Learning with Case Studies, Second Edition uses practical examples to illustrate the power of R and data mining. Providing an extensive update to the best-selling first edition, this new edition is divided into two parts. The first part features introductory material, including a new chapter that introduces data mining, to complement the existing introduction to R. The second part contains the case studies, whose R code has been substantially revised to bring it up to date with recent R packages.
Data Mining with R does not assume any prior knowledge of R. Readers who are new to R and data mining should be able to follow the case studies, which are designed to be self-contained so that the reader can start with any of them.
Data Mining with R is accompanied by a set of freely available R source files that can be obtained at the book's web site. These files include all the code used in the case studies, and they facilitate the "do-it-yourself" approach followed in the book.
Designed for users of data analysis tools, as well as researchers and developers, Data Mining with R should be useful for anyone interested in entering the "world" of R and data mining.
Introduction
- How to Read This Book
- A Short Introduction to R
- A Short Introduction to MySQL
Predicting Algae Blooms
- Problem Description and Objectives
- Data Description
- Loading the Data into R
- Data Visualization and Summarization
- Unknown Values
- Obtaining Prediction Models
- Model Evaluation and Selection
- Predictions for the 7 Algae
Predicting Stock Market Returns
- Problem Description and Objectives
- The Available Data
- Defining the Prediction Tasks
- The Prediction Models
- From Predictions into Actions
- Model Evaluation and Selection
- The Trading System
Detecting Fraudulent Transactions
- Problem Description and Objectives
- The Available Data
- Defining the Data Mining Tasks
- Obtaining Outlier Rankings
Classifying Microarray Samples
- Problem Description and Objectives
- The Available Data
- Gene (Feature) Selection
- Predicting Cytogenetic Abnormalities
Bibliography
Index
Index of Data Mining Topics
Index of R Functions
Luís Torgo is an associate professor in the Department of Computer Science at the University of Porto in Portugal. He teaches Data Mining in R in the NYU Stern School of Business’ MS in Business Analytics program. An active researcher in machine learning and data mining for more than 20 years, Dr. Torgo is also a researcher in the Laboratory of Artificial Intelligence and Data Analysis (LIAAD) of INESC Porto LA.