• Regression Techniques for Predictive Modeling

    Buckle up! This is going to be a long one.

    May 18, 2020 - 23 minute read -
    academic
  • Model Tuning and Overfitting

    An overview of the model tuning process, including data splitting, resampling techniques, and recommendations for choosing parameters and models.

    May 13, 2020 - 7 minute read -
    academic
  • Data Pre-Processing

    How the predictors are encoded, known as feature engineering, has a significant impact on model performance (e.g. predictor combinations or ratios). This post covers unsupervised approaches to data pre-processing.

    May 12, 2020 - 9 minute read -
    academic
  • Introduction and Overview of Content

    May 11, 2020 - 2 minute read -
    academic
  • Unsupervised Learning

    In unsupervised learning, there is no response variable. Instead, we're looking to find subgroups among variables or observations, discover interesting things about the measurements, or visualize the data informatively. Two common methods are principal components analysis (for data visualization/pre-processing) and clustering (for discovering unknown subgroups).

    May 4, 2020 - 13 minute read -
    academic
  • Support Vector Machines

    Support vector machines (SVMs) are often considered one of the best 'out of the box' classifiers. The simple maximal margin classifier can be generalized to the support vector classifier, which can be further generalized to the support vector machine.

    May 2, 2020 - 13 minute read -
    academic
  • Tree-Based Methods

    Decision trees, which divide the predictor space into regions, are simple and useful for interpretation. Their predictive power can be improved with bagging, random forests, and boosting.

    May 1, 2020 - 13 minute read -
    academic
  • Moving Beyond Linearity

    Linear models can have significant limitations in terms of predictive power, because the linearity assumption is often a poor approximation. Methods such as polynomial regression, step functions, regression and smoothing splines, local regression, and generalized additive models can help us flexibly model non-linear relationships.

    April 30, 2020 - 14 minute read -
    academic
  • Linear Model Selection and Regularization

    Alternative fitting methods to least squares, such as subset selection, shrinkage (ridge regression, lasso), and dimension reduction techniques (principal components analysis, partial least squares), can improve prediction accuracy and model interpretability.

    April 27, 2020 - 20 minute read -
    academic
  • Resampling Methods

    Resampling methods provide tools for model assessment (evaluating a model's performance) and model selection (optimally adjusting model flexibility).

    April 25, 2020 - 7 minute read -
    academic
  • Classification

    Classification models are used to predict a categorical response.

    April 24, 2020 - 12 minute read -
    academic
  • Linear Regression

    Linear regression is a simple approach for predicting a quantitative response, and is the foundation of many other types of statistical learning. It is especially convenient for establishing the strength of the relationship between specific variables and an output, and is easily interpretable.

    April 22, 2020 - 10 minute read -
    academic
  • An Introduction to Statistical Learning

    April 21, 2020 - 3 minute read -
    academic
  • Economics/Statistics projects from my undergraduate degree

    March 10, 2020 - 2 minute read -
    academic
  • Graphing in R for Effective Communication

    In which we review graphing options in ggplot2 that allow you to communicate results effectively, including labels, annotations, scales, zooming, and themes.

    March 5, 2020 - 9 minute read -
    academic r
  • Creating Simple Documents with R Markdown

    In which we review the basics of R Markdown files, including the YAML header, code chunk options, and formatting options.

    March 4, 2020 - 4 minute read -
    academic r
  • Model Building in R

    In which we explore the basics of modeling as an exploratory tool through recording and graphing predictions and residuals, variable interactions, and transformations.

    February 28, 2020 - 11 minute read -
    academic r
  • Programming in R: Iteration

    In which we explore the basics of iteration through the lenses of functional and imperative programming by examining for loops, map functions, and more.

    February 25, 2020 - 12 minute read -
    academic r
  • Programming in R: Pipes, Functions, and Vectors

    In which we explore the basics of programming in R by examining common programming tools, data structures, and strategies to help effectively analyze data.

    February 25, 2020 - 8 minute read -
    academic r
  • Data Wrangling in R: Dates and Times

    In which we review basic information about dates and times, including how to create them, how they are represented, and what you can do with them.

    February 22, 2020 - 7 minute read -
    academic r
  • Data Wrangling in R: Factors

    In which we review some basic information about factors, a type of data used to work with categorical variables.

    February 21, 2020 - 9 minute read -
    academic r
  • Data Wrangling in R: Strings

    In which we dive into string manipulation, with a focus on regular expressions.

    February 18, 2020 - 8 minute read -
    academic r
  • High-Level Data Wrangling in R: Imports, Pivots, and Joins

    In which we go over importing data into R, working with 'tidy data,' manipulating single tables, and joining related tables.

    February 17, 2020 - 8 minute read -
    academic r
  • Basic Data Exploration in R

    In which we examine some common practical examples of data exploration: observing variance and co-variance with histograms, boxplots, scatterplots, and heat maps.

    February 16, 2020 - 7 minute read -
    academic r
  • Basic Data Transformation in R

    In which we review the fundamentals of transforming data in R, using six key functions in the dplyr package: filter, arrange, select, mutate, summarize, and group_by.

    February 15, 2020 - 13 minute read -
    academic r
  • Basic Data Visualization in R

    In which we review the fundamentals of creating graphs in R with ggplot2.

    February 12, 2020 - 12 minute read -
    academic r