A Hands-On Way to Learning Data Analysis
Part of the core of statistics, linear models are used to make predictions and explain the relationship between the response and the predictors. Understanding linear models is crucial to a broader competence in the practice of statistics. Linear Models with R, Second Edition explains how to use linear models in physical science, engineering, social science, and business applications. The book incorporates several improvements that reflect how the world of R has greatly expanded since the publication of the first edition.
New to the Second Edition:
- Reorganized material on interpreting linear models, which distinguishes the main applications of prediction and explanation and introduces elementary notions of causality
- Additional topics, including QR decomposition, splines, additive models, Lasso, multiple imputation, and false discovery rates
- Extensive use of the ggplot2 graphics package in addition to base graphics
Like its widely praised, best-selling predecessor, this edition seamlessly combines statistics and R to give a coherent exposition of the practice of linear modeling. The text offers up-to-date insight on essential data analysis topics, from estimation, inference, and prediction to missing data, factorial models, and block designs. Numerous examples illustrate how to apply the different methods using R.
Introduction
Before You Start
Initial Data Analysis
When to Use Linear Modeling
History
Estimation
Linear Model
Matrix Representation
Estimating β
Least Squares Estimation
Examples of Calculating β̂
Example
QR Decomposition
Gauss–Markov Theorem
Goodness of Fit
Identifiability
Orthogonality
Inference
Hypothesis Tests to Compare Models
Testing Examples
Permutation Tests
Sampling
Confidence Intervals for β
Bootstrap Confidence Intervals
Prediction
Confidence Intervals for Predictions
Predicting Body Fat
Autoregression
What Can Go Wrong with Predictions?
Explanation
Simple Meaning
Causality
Designed Experiments
Observational Data
Matching
Covariate Adjustment
Qualitative Support for Causation
Diagnostics
Checking Error Assumptions
Finding Unusual Observations
Checking the Structure of the Model
Discussion
Problems with the Predictors
Errors in the Predictors
Changes of Scale
Collinearity
Problems with the Error
Generalized Least Squares
Weighted Least Squares
Testing for Lack of Fit
Robust Regression
Transformation
Transforming the Response
Transforming the Predictors
Broken Stick Regression
Polynomials
Splines
Additive Models
More Complex Models
Model Selection
Hierarchical Models
Testing-Based Procedures
Criterion-Based Procedures
Summary
Shrinkage Methods
Principal Components
Partial Least Squares
Ridge Regression
Lasso
Insurance Redlining—A Complete Example
Ecological Correlation
Initial Data Analysis
Full Model and Diagnostics
Sensitivity Analysis
Discussion
Missing Data
Types of Missing Data
Deletion
Single Imputation
Multiple Imputation
Categorical Predictors
A Two-Level Factor
Factors and Quantitative Predictors
Interpretation with Interaction Terms
Factors with More than Two Levels
Alternative Codings of Qualitative Predictors
One Factor Models
The Model
An Example
Diagnostics
Pairwise Comparisons
False Discovery Rate
Models with Several Factors
Two Factors with No Replication
Two Factors with Replication
Two Factors with an Interaction
Larger Factorial Experiments
Experiments with Blocks
Randomized Block Design
Latin Squares
Balanced Incomplete Block Design
Appendix: About R
Bibliography
Index