Recently Published
Impacts of Fracking on Water Resources
An infographic on the damage fracking causes to water
Visualizing Customer Sentiment Patterns in Product Reviews
This report analyzes customer sentiment patterns in Flipkart product reviews using data visualization techniques in R. The project includes multiple visualizations to examine sentiment distribution, product ratings, review activity, and customer feedback trends across products. Interactive and static visualizations were created using ggplot2 and plotly as part of the Data Visualization Capstone project.
Exoplanet_Data_Analysis
This is an exoplanet data analysis project that uses R to clean the dataset, explore planetary and stellar patterns, and build predictive models for astronomical insights.
PAF 516 Final Dashboard - Christopher Ogino
Statistics for Data Science (229711) - Chapter 6: Data Preprocessing
This chapter dives into the "engine room" of Data Science: Preprocessing. Students will learn that the quality of a model is determined long before it is trained, focusing on the critical steps required to turn messy, real-world data into a "model-ready" format.
Core Topics covered:
Why Preprocessing Matters
Handling Missing Data
Outlier Detection and Treatment
Data Transformation
Encoding Categorical Variables
Feature Scaling
Data Integration and Reshaping
Chapter Lab Activity: Full Preprocessing Pipeline with msleep
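As a rough illustration of several of the topics listed above (missing data, outlier detection, feature scaling, categorical encoding), here is a minimal sketch in Python using only the standard library. It is not taken from the chapter, which appears to work with the msleep dataset; the function names and the choice of median imputation, the 1.5 x IQR rule, min-max scaling, and one-hot encoding are assumptions made for the example.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_outlier_flags(values, k=1.5):
    """Return True for each value outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(labels):
    """Encode labels as 0/1 indicator rows, one column per sorted category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]

# Hypothetical toy columns, invented for this example
hours = [10, 11, None, 13, 14, 15, 16, 17, 200]
diet = ["herbi", "carni", "omni", "herbi", "carni", "omni", "herbi", "carni", "omni"]

filled = impute_median(hours)
flags = iqr_outlier_flags(filled)
kept = [v for v, is_outlier in zip(filled, flags) if not is_outlier]
scaled = min_max_scale(kept)
encoded = one_hot(diet)
```

Dropping flagged outliers, as done here, is only one possible treatment; capping or transforming the values are common alternatives.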
Statistics for Data Science (229711) - Chapter 5: Data Sampling Techniques
This chapter addresses the foundational question of data science: "How do we ensure our data truly represents the world?" It explores the mechanics of selection, the math of sample size, and the power of computational resampling.
Core Topics covered:
Why Sampling Matters
Probability Sampling Methods
Non-Probability Sampling Methods
Sample Size Determination
Sampling Bias and Common Pitfalls
Bootstrap Resampling
Evaluating Sample Quality
Chapter Lab Activity: Exploring Sampling with nhanes-Style Data
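Two of the topics listed above, sample size determination and bootstrap resampling, lend themselves to a short worked sketch. The Python below is not from the chapter; it assumes the standard sample size formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2, and a percentile bootstrap interval for the mean. The function names, default values, and toy data are hypothetical.

```python
import math
import random
from statistics import mean

def sample_size_for_proportion(margin, p=0.5, z=1.96):
    """Smallest n so a proportion estimate has the given margin of error.

    Uses n = z^2 * p * (1 - p) / margin^2, rounded up.
    p = 0.5 is the conservative (largest n) choice; z = 1.96 is roughly 95% confidence.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    means = sorted(
        mean(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return means[lo_idx], means[hi_idx]

# Sample size for a 5-point margin of error at roughly 95% confidence
n_needed = sample_size_for_proportion(margin=0.05)

# Hypothetical measurements, invented for this example
measurements = [118, 122, 125, 130, 131, 135, 140, 142, 150, 165]
ci_low, ci_high = bootstrap_mean_ci(measurements)
```

The percentile method is the simplest bootstrap interval; other variants (for example, bias-corrected intervals) exist and may behave better on skewed data.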