
Regression Techniques for Predictive Modeling
Buckle up! This is going to be a long one.

Model Tuning and Overfitting
An overview of the model tuning process, including data splitting, resampling techniques, and recommendations for choosing parameters and models.

Data Pre-Processing
How predictors are encoded, a process called feature engineering, has a significant impact on model performance (for example, combining predictors or taking ratios of them). This post covers unsupervised approaches to data preprocessing.

Introduction and Overview of Content

Unsupervised Learning
In unsupervised learning, there is no response variable. Instead, we're looking to find subgroups among variables or observations, discover interesting things about the measurements, or visualize the data informatively. Two common methods are principal components analysis (for data visualization/preprocessing) and clustering (for discovering unknown subgroups).

Support Vector Machines
Support vector machines (SVMs) are often considered one of the best 'out of the box' classifiers. The simple maximal margin classifier can be generalized to the support vector classifier, which can be further generalized to the support vector machine.

Tree-Based Methods
Decision trees, which divide the predictor space into regions, are simple and useful for interpretation. Their predictive power can be improved with bagging, random forests, and boosting.

Moving Beyond Linearity
Linear models can have significant limitations in terms of predictive power, because the linearity assumption is often a poor approximation. Methods such as polynomial regression, step functions, regression and smoothing splines, local regression, and generalized additive models can help us flexibly model nonlinear relationships.

Linear Model Selection and Regularization
Alternatives to least squares fitting, such as subset selection, shrinkage (ridge regression, lasso), and dimension reduction techniques (principal components analysis, partial least squares), can improve prediction accuracy and model interpretability.

Resampling Methods
Resampling methods provide tools for model assessment (evaluating a model's performance) and model selection (optimally adjusting model flexibility).

Classification
Classification models are used to predict a categorical response.

Linear Regression
Linear regression is a simple approach for predicting a quantitative response, and is the foundation of many other types of statistical learning. It is especially convenient for establishing the strength of the relationship between specific variables and an output, and is easily interpretable.

An Introduction to Statistical Learning

Economics/Statistics projects from my undergraduate degree

Graphing in R for Effective Communication
In which we review graphing options in ggplot2 that allow you to communicate results effectively, including labels, annotations, scales, zooming, and themes.

Creating Simple Documents with R Markdown
In which we review the basics of R Markdown files, including the YAML header, code chunk options, and formatting options.

Model Building in R
In which we explore the basics of modeling as an exploratory tool through recording and graphing predictions and residuals, variable interactions, and transformations.

Programming in R: Iteration
In which we explore the basics of iteration through the lenses of functional and imperative programming by examining for loops, map functions, and more.

Programming in R: Pipes, Functions, and Vectors
In which we explore the basics of programming in R by examining common programming tools, data structures, and strategies to help effectively analyze data.

Data Wrangling in R: Dates and Times
In which we review basic information about dates and times, including how to create them, how they are represented, and what you can do with them.

Data Wrangling in R: Factors
In which we review some basic information about factors, a type of data used to work with categorical variables.

Data Wrangling in R: Strings
In which we dive into string manipulation, with a focus on regular expressions.

High-Level Data Wrangling in R: Imports, Pivots, and Joins
In which we go over importing data into R, working with 'tidy data,' manipulating single tables, and joining related tables.

Basic Data Exploration in R
In which we examine some common practical examples of data exploration: observing variation and covariation with histograms, boxplots, scatterplots, and heat maps.

Basic Data Transformation in R
In which we review the fundamentals of transforming data in R, using six key functions in the dplyr package: filter, arrange, select, mutate, summarize, and group_by.

Basic Data Visualization in R
In which we review the fundamentals of creating graphs in R with ggplot2.