SheriannMc

Sheriann McLarty

Recently Published

Homework 9
This homework focuses on tuning and evaluating nonlinear regression models using both simulated and real-world datasets. We explore the impact of variable correlation on feature importance (8.1–8.3), investigate bias in tree-based models (8.4–8.6), and finally apply model tuning techniques to the Chemical Manufacturing Process dataset (8.7). Across the exercises, we rely heavily on the caret framework for resampling, model training, and performance evaluation. Techniques like bagging, boosting, and SVMs are compared using RMSE and R² to identify the most effective approach. Along the way, we also evaluate how model interpretation changes when predictors are duplicated or correlated.
Homework 8 - Nonlinear Regression with Friedman Data
In this assignment, I explored nonlinear regression models using two different datasets.
Homework 7 - Exercise 6.2 & 6.3: Chemical Manufacturing Yield Prediction
This assignment explores predictive modeling strategies in two real-world scenarios: drug permeability and chemical manufacturing. Both problems involve high-dimensional, noisy data and require balancing model performance with interpretability.
Sentiment Analysis with Tidytext – Custom Corpus
This analysis builds on the example provided in Text Mining with R by Julia Silge and David Robinson, Chapter 2 (Sentiment Analysis).
Data 607 Assignment 9 Sheriann McLarty
New York Times APIs
Data 624 Project 1
Forecasting ATM
Forecasting Homework 5
This report applies ETS (Exponential Smoothing State Space) models to a variety of real-world time series using the fpp3 framework. Key techniques include smoothing parameter interpretation, STL decomposition, and model evaluation across trend and seasonality.
Forecasting Homework 3
This report explores various forecasting techniques applied to Australian economic and financial datasets. By implementing different time series models, we analyze trends and seasonality and forecast future values. Our objective is to evaluate the effectiveness of models such as NAIVE, SNAIVE, and RW with drift to provide insights into population growth, industry production, and retail behavior.
HW 6: Forecasting - Exercises 9.1 to 9.8
This assignment explores key time series modeling concepts using the fpp3 framework, focusing on stationarity, transformation, differencing, and ARIMA modeling. Exercises 9.1 through 9.8 from Forecasting: Principles and Practice (3rd ed.) are covered using real-world datasets such as Amazon stock prices, Australian air passengers, and US GDP.
Database Analysis with R
Test R Markdown file to confirm a successful SQL connection and data retrieval from the skills insights database.
Chess Tournament Data Analysis
This project was about turning messy text data into something structured and insightful. I was given a .txt file containing results from a chess tournament, and the goal was to extract meaningful data (each player’s name, state, total number of points, pre-tournament rating, and the average pre-rating of all their opponents) and generate a clean .csv that could be used for analysis or imported into a SQL database.
Data 624 Homework 4 - Analyzing Glass & Soybean Data
This analysis shows how we explored, cleaned, transformed, and modeled the Glass and Soybean datasets to make predictions.
The normal distribution Lab 4
In this lab, you’ll investigate the probability distribution most central to statistics: the normal distribution. If you are confident that your data are nearly normal, that opens the door to many powerful statistical methods. Here, we’ll use the graphical tools of R to assess the normality of our data and learn how to generate random numbers from a normal distribution.
Probability Lab 3
The Hot Hand
Inference for categorical data Lab 6
In this lab, we will explore and visualize the data using the tidyverse suite of packages, and perform statistical inference using infer. The data can be found in the companion package for OpenIntro resources, openintro.
Working with XML and JSON in R Assignment 7
This document demonstrates how to read and compare data from HTML, XML, and JSON files in R. The dataset contains information about three fiction books.
Flight Delay Analysis and Forecasting Assignment 5
This analysis examines flight delays from Alaska Airlines and AM West. We clean the data, explore trends, compare airline performance, and attempt to forecast future delays. Due to dataset limitations, alternative approaches are also discussed.
ARIMA, SARIMA, and ETS Comparison
This document compares ARIMA, SARIMA, and ETS models to determine the best approach for forecasting US employment trends in the Leisure & Hospitality sector.
Data 624 Homework 2
Box-Cox Transformations, Seasonal Naïve Forecasts and Residual Analysis
Data 606 Lab 2
Introduction to Data
Data 607 Wk 3 Assignment
Visually Analyzing Regex and Transforming Data
Data 624 HW 1
Forecasting Homework 1
Data 606 Lab Report 1
Introduction to R
Data 607 Assignment 1
This analysis is based on the FiveThirtyEight article ["We Compiled Demographic Data On Every Candidate Running For The House And Senate In 2022"](https://fivethirtyeight.com/features/2022-candidates-race-data/).