Recently Published

League of Legends Champion Clustering and Dimension Reduction Analysis
This project applies unsupervised learning to a dataset of League of Legends champion base statistics sourced from Kaggle (Cute Dango, League of Legends Champions dataset, available at kaggle.com/datasets/cutedango/league-of-legends-champions) to discover whether champions naturally cluster into distinct statistical archetypes, and which features drive those groupings. The workflow combines three complementary approaches: hard clustering (K-Means, hierarchical) to identify stable, discrete champion archetypes; soft clustering (Fuzzy C-Means) to quantify champion hybridity, that is, how strongly each champion belongs to one archetype versus another; and dimensionality reduction (PCA, MDS, UMAP, t-SNE, SOM) to visualize the structure of the feature space and validate clustering results across multiple independent methods.
Epi553_lab06_ShahiSuruchi
This is the completed sixth lab for the class Epi 553.
Text Classification Using ChatGPT
This project analyzes a complete archive of 90,000+ posts made by Donald Trump on X (Twitter) and Truth Social from 2009–2026. Using R, Python, and Bash, I clean, filter, and structure the data to focus exclusively on original, text-based posts—excluding reposts, quotes, links, images, and videos—to isolate Trump’s direct public communication. The objective is to examine rhetorical patterns and topic trends over time through large-scale text classification. I analyze the frequency of themes such as religion, immigration, education, and economics; measure sentiment toward groups defined by ethnicity, religion, and sexual orientation; track posting behavior; and investigate which linguistic and structural features are associated with viral posts. Overall, the project integrates data wrangling, analysis, and computational text modeling to study the evolution and impact of modern political discourse.
Lab 2 - Finish
Assignment 6
5B Elo rating system: Expected score formula
DATA 607 Assignment 5B: Elo expected score calculation using the chess tournament data set that was cleaned from a text file in Project 1.
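For readers unfamiliar with the expected score formula named in the title, here is a minimal sketch in Python. It uses the standard Elo formula, E_A = 1 / (1 + 10^((R_B - R_A) / 400)). The function names and the example ratings are illustrative assumptions and are not taken from the assignment, which may organize the calculation differently.

```python
from typing import Iterable


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A in one game against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def expected_total(player_rating: float, opponent_ratings: Iterable[float]) -> float:
    """Sum of expected scores over a series of games (hypothetical helper)."""
    return sum(expected_score(player_rating, r) for r in opponent_ratings)


if __name__ == "__main__":
    # Made-up ratings, for illustration only.
    print(expected_score(1500, 1500))
    print(expected_total(1600, [1400, 1600, 1800]))
```

Comparing a player's actual tournament score with the total returned by `expected_total` is one way to see who over- or under-performed relative to their pre-tournament rating.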
5A Simpson Paradox: Airline Delays
DATA 607 Assignment 5A analysis
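As a rough illustration of the paradox named in the title, the sketch below uses made-up flight counts (not the assignment's data) for two hypothetical airlines, A and B, in two hypothetical cities, X and Y. Airline A has the lower delay rate in each city, yet the higher delay rate overall, because most of its flights are in the city where delays are more common.

```python
# Hypothetical counts: airline -> city -> (delayed flights, total flights).
flights = {
    "A": {"X": (10, 100), "Y": (150, 500)},
    "B": {"X": (75, 500), "Y": (35, 100)},
}


def city_rate(airline: str, city: str) -> float:
    """Delay rate for one airline in one city."""
    delayed, total = flights[airline][city]
    return delayed / total


def overall_rate(airline: str) -> float:
    """Delay rate for one airline pooled across all cities."""
    delayed = sum(d for d, _ in flights[airline].values())
    total = sum(t for _, t in flights[airline].values())
    return delayed / total


if __name__ == "__main__":
    for city in ("X", "Y"):
        print(city, city_rate("A", city), city_rate("B", city))
    print("overall", overall_rate("A"), overall_rate("B"))
```

The reversal comes entirely from the unequal mix of flights per city, which is why a per-city comparison and a pooled comparison can point in opposite directions.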
Analysis of Prostate Data
Chapter 3 of RMRWR
Document