Recently Published

CJS 310 Quiz 1
Discussion 5
Tarea 1 (Assignment 1)
C.U. 199912
Parts 1 and 2 together on NKTCL GSE318371, using Seurat to get top genes before machine learning
Parts 1 and 2 cover reading a large raw file with Seurat and creating the massive, layered object of matrices, lists, strings, and other structures needed to run analyses within Seurat. More to come on the unsupervised machine-learning algorithms (PCA, t-SNE, UMAP, and clustering with K-nearest neighbors) used to find top genes that predict aggressive Natural Killer/T-cell Lymphoma pathology from our database, as a step toward building a model that predicts EBV and EBV-associated pathologies such as this one, mononucleosis, and multiple sclerosis, along with fibromyalgia and Lyme disease.
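The pipeline sketched above follows the standard Seurat unsupervised workflow. A minimal R sketch of those steps is below; the input file name, dimensions, and resolution are assumptions for illustration, not the author's actual files or settings:

```r
library(Seurat)

# Assumed input: a gene-by-cell counts matrix prepared from the GEO raw files
# ("GSE318371_counts.rds" is a hypothetical path)
counts <- readRDS("GSE318371_counts.rds")

# Build the layered Seurat object and run the standard preprocessing steps
obj <- CreateSeuratObject(counts = counts, project = "NKTCL",
                          min.cells = 3, min.features = 200)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj, nfeatures = 2000)
obj <- ScaleData(obj)
obj <- RunPCA(obj)

# K-nearest-neighbors graph and clustering, then the 2-D embeddings
obj <- FindNeighbors(obj, dims = 1:20)
obj <- FindClusters(obj, resolution = 0.5)
obj <- RunUMAP(obj, dims = 1:20)
obj <- RunTSNE(obj, dims = 1:20)

# Top marker genes per cluster, candidate features for downstream ML
markers <- FindAllMarkers(obj, only.pos = TRUE, min.pct = 0.25)
head(markers)
```

The dims and resolution values are common defaults from Seurat tutorials and would need tuning for this dataset.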
Loan Approval Optimization: Re-evaluating Automatic Denials
Executive Summary
Our existing algorithmic approach to prescreening loan applicants automatically denies 100% of applicants with a prior history of default. This policy is overly restrictive from both a quantitative and a business perspective: many of these individuals have better credit scores and lower leverage ratios than applicants who are regularly approved. This report follows the Data Science lifecycle (Modules 1-5) to analyze the issue and deploy an alternative strategy: a predictive, proxy-target model that identifies high-potential applicants within this previously excluded population for manual review.

M1: Business Understanding
The Problem: The strict business rule (a decision-tree outcome) automatically denies the 22,858 applicants who have a prior default. Because this criterion universally drove denial in the historical data, any model trained naively on the full dataset learns it as an unbreakable rule.
The Question: How can we identify credit-worthy applicants among those with a prior default and refer them for secondary, manual review, thereby increasing overall revenue without disproportionately raising default risk?
The Solution Path: Because the historical data contains zero approvals for the "Prior Default" group, we will use Proxy Target Modeling. We will train an algorithm exclusively on applicants with no prior defaults to learn what a "good" applicant looks like, then apply that scoring mechanism back to the "Prior Default" group to identify strong candidates for reconsideration.
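The solution path above can be sketched in R. This is a minimal illustration only: the data frame `apps` and column names such as `prior_default`, `approved`, `credit_score`, `leverage_ratio`, and `income` are assumptions, not the report's actual schema or model:

```r
# Assumed data frame `apps` with one row per historical applicant
no_default  <- subset(apps, prior_default == 0)
had_default <- subset(apps, prior_default == 1)

# Train only on the no-default population, where both approvals and
# denials exist, so the model learns what a "good" applicant looks like
proxy_model <- glm(approved ~ credit_score + leverage_ratio + income,
                   data = no_default, family = binomial)

# Score the automatically denied group with the same model and flag
# the strongest candidates for secondary, manual review
had_default$score <- predict(proxy_model, newdata = had_default,
                             type = "response")
review_queue <- had_default[order(-had_default$score), ]
head(review_queue[review_queue$score > 0.8, ])
```

The 0.8 cutoff is an arbitrary placeholder; in practice it would be set against the business's risk tolerance.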
CCA
Unsupervised Learning Data Prep: Beginning Errors with the Seurat Library on NKTCL GSE318371
This RPubs document walks through beginner errors in using Seurat to handle raw, unsupervised gene-expression data, where the Seurat objects created carry many layers. It shows what happens when you extract just the data frames of counts and fragments, keyed by cell barcodes, into one large table: that table loses the attached hidden layers of gene names and other important metadata that Seurat needs to run PCA, K-nearest-neighbors clustering, UMAP, and t-SNE to find top clusters. This is Part 1; a Part 2 precedes it. This version is edited and cleaned up, and the previous version has been removed.
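The contrast described above, a bare counts table versus the full layered object, can be illustrated with a short sketch. The object name `obj` is hypothetical, and `slot = "counts"` reflects the Seurat v4 interface (v5 uses `layer = "counts"` via `LayerData()`):

```r
library(Seurat)

# Pulling out just the counts yields a plain sparse matrix: rows are
# genes, columns are cell barcodes, but the assay metadata,
# variable-feature lists, and dimensional reductions are left behind
counts_only <- GetAssayData(obj, slot = "counts")
dim(counts_only)

# The full Seurat object keeps those hidden layers, which the
# downstream steps (RunPCA, FindNeighbors, RunUMAP, RunTSNE) require
str(obj, max.level = 2)
```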
Confidence Intervals
This data dive examines how Game Score relates to statistical “unexpectedness” in NBA performances. By creating derived variables and analyzing their relationships, I evaluate correlation, outliers, and confidence intervals while also reflecting on how documentation and metric construction affect interpretation. The analysis highlights the importance of understanding how performance metrics are calculated before drawing conclusions.
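The correlation and confidence-interval checks described above can be sketched in R with base `stats` functions. The data frame `games` and the derived `unexpectedness` column are stand-ins for the post's actual variables:

```r
# Assumed data frame `games` with per-game NBA stats, including a
# Game Score column and a derived "unexpectedness" metric
ct <- cor.test(games$game_score, games$unexpectedness)
ct$estimate   # Pearson correlation between the two metrics
ct$conf.int   # 95% confidence interval for that correlation

# 95% confidence interval for mean Game Score (one-sample t interval)
t.test(games$game_score)$conf.int
```

`cor.test` reports the interval for the correlation itself, which is often more informative here than an interval for either metric's mean alone.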