# A Reference List for Statistics

The following books are useful for learning and reference. Some overlap in content, and they vary in difficulty.

## Mathematical Statistics

- Introduction to Mathematical Statistics by Robert Hogg, et al.
- In All Likelihood by Yudi Pawitan
- The Matrix Cookbook by Kaare Petersen and Michael Pedersen
- Matrix Algebra: Theory, Computations, and Applications in Statistics by James Gentle
- A First Look at Rigorous Probability Theory by Jeffrey Rosenthal

## Bayesian Statistics

- Doing Bayesian Data Analysis by John Kruschke
- Bayesian Data Analysis by Andrew Gelman, et al.
- Statistical Rethinking by Richard McElreath

## Modeling

- How to create a model
    - Regression Modeling Strategies by Frank Harrell
    - Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill
    - The Book of Why by Judea Pearl and Dana Mackenzie
    - Causal Inference by Miguel A. Hernán and James M. Robins
    - Statistical Issues in Drug Development by Stephen Senn
    - Clinical Prediction Models by Ewout Steyerberg
    - Uncertainty by William Briggs
    - Regression Analysis: A Constructive Critique by Richard Berk

- Implementing specific models
    - General
        - Applied Linear Statistical Models by Michael Kutner, et al.
        - Categorical Data Analysis by Alan Agresti
        - Linear Mixed Models: A Practical Guide Using Statistical Software by Brady West, et al.
        - Extending the Linear Model with R by Julian Faraway

    - Multivariate
        - Methods of Multivariate Analysis by Alvin Rencher and William Christensen
        - An Introduction to Applied Multivariate Analysis with R by Brian Everitt and Torsten Hothorn
        - Multivariate Data Analysis by Joseph Hair, et al.

    - Statistical Learning
        - An Introduction to Statistical Learning by Gareth James, et al.
        - Applied Predictive Modeling by Max Kuhn and Kjell Johnson

    - Survival Analysis
        - Survival Analysis: A Self-Learning Text by David Kleinbaum and Mitchel Klein
        - Modeling Survival Data: Extending the Cox Model by Terry Therneau and Patricia Grambsch

    - Time Series
        - Forecasting: Principles and Practice by Rob Hyndman and George Athanasopoulos

    - Quantile Regression
        - Handbook of Quantile Regression by Roger Koenker, et al.

    - Missing Data
        - Flexible Imputation of Missing Data by Stef van Buuren

## Design of Experiments

- Statistics for Experimenters: Design, Innovation, and Discovery by George Box, et al.
- Design and Analysis of Experiments by Douglas Montgomery
- The Design of Experiments: Statistical Principles for Practical Applications by Roger Mead
- Design and Analysis of Experiments with R by John Lawson

## Programming

- R
    - Advanced R by Hadley Wickham
    - The Art of R Programming by Norman Matloff
    - R for Data Science by Hadley Wickham and Garrett Grolemund
    - Software for Data Analysis by John Chambers
    - Extending R by John Chambers
    - R Packages by Hadley Wickham

- Reproducible Documents
    - R Markdown by Yihui Xie, et al.
    - Dynamic Documents with R and knitr by Yihui Xie
    - bookdown: Authoring Books and Technical Documents with R Markdown by Yihui Xie
    - Reproducible Research with R and RStudio by Christopher Gandrud
    - LaTeX and Friends by Marc van Dongen
    - More Math Into LaTeX by George Grätzer

- Python
    - Python for Data Analysis by Wes McKinney
    - Python Data Science Handbook by Jake VanderPlas
    - Think Python by Allen Downey
    - Automate the Boring Stuff with Python by Al Sweigart

- SQL and Databases
    - The Language of SQL by Larry Rockoff
    - Data Analysis Using SQL and Excel by Gordon Linoff
    - Data Modeling Essentials by Graeme Simsion and Graham Witt

- C++
    - C++ Primer by Stanley Lippman, et al.
    - Effective Modern C++ by Scott Meyers
    - Seamless R and C++ Integration with Rcpp by Dirk Eddelbuettel

- C
    - C Programming Absolute Beginner’s Guide by Greg Perry and Dean Miller
    - C Programming: A Modern Approach by K. N. King
    - Modeling with Data: Tools and Techniques for Scientific Computing by Ben Klemens

## Data Visualization

- Visualizing Data by William Cleveland
- ggplot2 by Hadley Wickham
- The Grammar of Graphics by Leland Wilkinson, et al.
- Exploratory Data Analysis by John Tukey
- Data Visualization by Kieran Healy

## Sampling and Surveys

- Sampling by Steven Thompson
- Survey Sampling by Leslie Kish
- Applied Survey Data Analysis by Steven Heeringa, et al.
- The Survey Research Handbook by Pamela Alreck and Robert Settle

## Measurement

- An Introduction to Error Analysis by John Taylor

## Mathematical Background

- Linear Algebra by David Poole
- Calculus by Morris Kline
- Book of Proof by Richard Hammack

## Other

- Ethics
    - On Being a Scientist: A Guide to Responsible Conduct in Research by The National Academies

- Communication
    - Handbook of Writing for the Mathematical Sciences by Nicholas Higham
    - The Craft of Scientific Writing by Michael Alley
    - Writing Science: How to Write Papers That Get Cited and Proposals That Get Funded by Joshua Schimel
    - The Craft of Scientific Presentations by Michael Alley

- Statistical
    - Breakthroughs in Statistics by Samuel Kotz, et al.
        - Also see volumes 2 and 3

- Ecological Methodology by Charles Krebs
- Mixed Effects Models and Extensions in Ecology with R by Alain Zuur, et al.
- Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving by Deborah Nolan and Duncan Temple Lang
- Optimal Design of Experiments: A Case Study Approach by Peter Goos and Bradley Jones
- Practical Data Science with R by Nina Zumel and John Mount
- The Geometry of Multivariate Statistics by Thomas Wickens
- Computer Age Statistical Inference by Bradley Efron and Trevor Hastie


*Last updated 2018-08-07*