# Appendix

## Appendix A. Statistics in R Cheat Sheet

This cheat sheet suggests which statistical test to perform depending on the nature of the variables of interest. R functions for each test are also listed, some of which require add-on packages.

### For quantitative variables

• If you want to test whether a mean differs from a known value, perform a one-sample t-test.
• e.g., t.test(x, mu = 999)
• If you want to test whether the means of two independent groups differ, perform a two-sample t-test.
• e.g., t.test(x, y)
• If you want to test the mean difference between paired observations that are not independent, perform a paired t-test.
• e.g., t.test(x, y, paired = TRUE)
• If you want to find the sample size needed to detect a given difference, perform a power analysis.
• e.g., power.t.test()
• If you want to test whether the variances of two populations differ, perform an F-test for variance.
• e.g., var.test(x, y)
• If you want to see how correlated two quantitative variables are, calculate the Pearson correlation coefficient.
• e.g., cor(x, y) or cor.test(x, y)
• If you want to predict a quantitative variable using information from a different quantitative variable (the independent variable), perform a simple linear regression.
• e.g., lm()
• If you want to predict a quantitative variable using information from multiple quantitative variables (the independent variables), perform a multiple linear regression.
• e.g., lm()
• From the leaps package: regsubsets()
• From the GGally package: ggpairs()
• From the car package: vif()
• If you want to compare the mean of a quantitative variable across two or more treatment levels, perform an analysis of variance.
• e.g., lm(); pairwise.t.test()
• From the agricolae package: LSD.test()
• If you want to predict a quantitative variable at one or more treatment levels and a quantitative covariate, perform an analysis of covariance.
• e.g., lm(); pairwise.t.test()
• From the agricolae package: LSD.test()
• If you want to predict a quantitative variable using information from a different quantitative variable (the independent variable) with both fixed and random effects, fit a linear mixed model.
• From the lme4 package: lmer()
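
As a rough illustration of the base R calls listed above, the sketch below runs several of them on the built-in `mtcars` and `sleep` data sets. The data sets, the reference value of 20, and the effect size used in the power analysis are assumptions chosen only for demonstration. Package functions such as `regsubsets()`, `vif()`, and `lmer()` are left out so that the example runs without extra installs.

```r
# Illustrative only: built-in data sets stand in for your own data.

# One-sample t-test: does mean mpg differ from an assumed reference value of 20?
one_sample <- t.test(mtcars$mpg, mu = 20)

# Two-sample t-test (Welch by default): mpg by transmission type (am = 0 or 1)
two_sample <- t.test(mpg ~ am, data = mtcars)

# Paired t-test: the same 10 subjects measured under two drugs
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]
paired <- t.test(drug1, drug2, paired = TRUE)

# Power analysis: per-group sample size for an assumed difference of 1 unit,
# a standard deviation of 2, a 5% significance level, and 80% power
pwr <- power.t.test(delta = 1, sd = 2, sig.level = 0.05, power = 0.80)

# F-test for equality of two variances
f_test <- var.test(mpg ~ am, data = mtcars)

# Pearson correlation with a significance test and confidence interval
r_test <- cor.test(mtcars$mpg, mtcars$wt)

# Simple and multiple linear regression
fit_simple <- lm(mpg ~ wt, data = mtcars)
fit_multi  <- lm(mpg ~ wt + hp, data = mtcars)

# Analysis of variance: cylinder count treated as a treatment factor
fit_aov   <- lm(mpg ~ factor(cyl), data = mtcars)
aov_table <- anova(fit_aov)
pairwise  <- pairwise.t.test(mtcars$mpg, mtcars$cyl)  # Holm adjustment by default

# Analysis of covariance: treatment factor plus a quantitative covariate
fit_ancova <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
```

Printing any of these objects (for example `one_sample` or `summary(fit_multi)`) shows the test statistic, degrees of freedom, and p-value.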

### For proportions

• If you want to test whether a proportion differs from a known proportion, perform a one-sample test for proportion.
• e.g., binom.test() or prop.test()
• If you want to compare one proportion to another proportion, perform a two-sample test for proportion.
• e.g., prop.test()
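
The proportion tests above might be called as in the following sketch. The counts (42 and 58 successes out of 100 trials each) and the null proportion of 0.5 are hypothetical values used only to show the arguments.

```r
# Hypothetical data: 42 successes in 100 trials, tested against p = 0.5
one_exact  <- binom.test(x = 42, n = 100, p = 0.5)  # exact binomial test
one_approx <- prop.test(x = 42, n = 100, p = 0.5)   # chi-square approximation

# Hypothetical two-group data: 42/100 successes versus 58/100 successes
two_sample <- prop.test(x = c(42, 58), n = c(100, 100))
```

`binom.test()` is exact and suits small samples, while `prop.test()` relies on a large-sample approximation and also handles the two-sample case.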

### For categorical variables

• If you want to test if there is a relationship across categories, or see if the categories are independent, perform a chi-square test.
• e.g., chisq.test()
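
A minimal sketch of the chi-square test of independence follows, assuming the built-in `HairEyeColor` table as example data (collapsed over sex to give a hair-by-eye contingency table).

```r
# Collapse the built-in 3-way table to a 4 x 4 hair-by-eye contingency table
hair_eye <- margin.table(HairEyeColor, c(1, 2))

# Chi-square test of independence between hair colour and eye colour
chi <- chisq.test(hair_eye)

chi$expected  # expected counts under independence, useful for checking assumptions
```

A small p-value in `chi$p.value` suggests that the two categorical variables are not independent.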

### For binary variables

• If you want to predict a binary variable (e.g., yes/no) using information from one or more quantitative/categorical variables (the independent variables), perform a logistic regression.
• e.g., glm(family = "binomial")
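
As one possible illustration, the sketch below fits a logistic regression to the built-in `mtcars` data, predicting transmission type (`am`, coded 0/1) from vehicle weight. The choice of data and predictor is an assumption made for demonstration.

```r
# Logistic regression: binary response am (0 = automatic, 1 = manual) on weight
fit_logit <- glm(am ~ wt, data = mtcars, family = "binomial")

# Predicted probability of a manual transmission for a hypothetical car with wt = 3
p_manual <- predict(fit_logit, newdata = data.frame(wt = 3), type = "response")

# Coefficients are on the log-odds scale; exponentiate to get odds ratios
odds_ratios <- exp(coef(fit_logit))
```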

### For multinomial variables

• If you want to predict an unordered multinomial variable (e.g., three or more responses) using information from one or more quantitative/categorical variables (the independent variables), perform multinomial logistic regression.
• From the nnet package: multinom()
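
A hedged sketch of `multinom()` is shown below, assuming the built-in `iris` data with the three-level `Species` factor as the unordered response. The nnet package ships with standard R distributions but must be loaded first.

```r
library(nnet)

# Multinomial logistic regression: three-level response on two predictors
# (trace = FALSE suppresses the iteration log)
fit_multinom <- multinom(Species ~ Sepal.Length + Sepal.Width,
                         data = iris, trace = FALSE)

# One row of coefficients per non-reference level of the response
coefs <- coef(fit_multinom)

# Predicted class probabilities for the first few observations
probs <- predict(fit_multinom, newdata = head(iris), type = "probs")
```

The first factor level (here `setosa`) serves as the reference category, so each coefficient row compares one other level against it.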

### For ordinal variables

• If you want to predict an ordered multinomial variable (e.g., three or more responses) using information from one or more quantitative/categorical variables (the independent variables), perform an ordinal regression.
• From the MASS package: polr()
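
The following sketch fits a proportional-odds model with `polr()`, assuming the `housing` data that ships with MASS, where satisfaction (`Sat`) is an ordered factor with levels Low, Medium, and High and `Freq` holds the cell counts.

```r
library(MASS)

# Ordinal (proportional-odds) regression on grouped data;
# Hess = TRUE stores the Hessian so summary() can report standard errors
fit_ord <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
                data = housing, Hess = TRUE)

fit_ord$zeta   # cut-points between adjacent response categories
coef(fit_ord)  # slopes on the log-odds scale, shared across cut-points
```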

### For integers

• If you have non-negative integers (e.g., 0, 1, 2, 3, …) and you want to predict an integer using one or more quantitative/categorical variables (the independent variables), perform count regression (e.g., Poisson, negative binomial).
• e.g., glm(family = "poisson") or glm.nb() from the MASS package
• If you have non-negative integers with many zeros, and you want to predict an integer using one or more quantitative/categorical variables (the independent variables), perform zero-inflated count regression (e.g., zero-inflated Poisson or zero-inflated negative binomial).
• e.g., zeroinfl() from the pscl package
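
A minimal sketch of Poisson and negative binomial regression follows, assuming the `quine` data from MASS (days absent from school) as the count response. The zero-inflated model is omitted here because it requires the separately installed pscl package.

```r
library(MASS)

# Poisson regression: note the lowercase family name
fit_pois <- glm(Days ~ Eth + Sex + Age + Lrn, data = quine, family = "poisson")

# Negative binomial regression: estimates an extra dispersion parameter (theta)
fit_nb <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine)

# Compare the two fits; a lower AIC indicates the better-supported model
aic_table <- AIC(fit_pois, fit_nb)
```

When the counts are overdispersed (variance well above the mean), the negative binomial model is usually the safer choice of the two.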