Recently Published
Predictive Modeling of House Prices in King County
My client needed a reliable model to predict house prices in King County, USA, to assist in investment decisions and property valuation. The goal was to identify which property features most influenced pricing and to build a model that could generalize well to unseen data.
The main challenges were multicollinearity among the numeric variables and outliers in the price and square-footage distributions. I addressed these by conducting a thorough Exploratory Data Analysis (EDA), using correlation analysis to remove redundant predictors, and transforming skewed variables.
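The correlation filtering and skew transformation described above can be sketched roughly as follows. This is an illustration in Python rather than the R code used in the project; the 0.8 threshold, the column names, and the toy values are all assumptions made up for the example, and the greedy filter is only one simple way to remove redundant predictors.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(columns, threshold=0.8):
    """Greedy filter: keep a column only if its absolute correlation with
    every column already kept is below the threshold (assumed value)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def log_transform(values):
    """Natural-log transform, a common choice for right-skewed positive data."""
    return [math.log(v) for v in values]

# Hypothetical toy data; names and numbers are placeholders, not project data.
toy = {
    "living_area": [1180, 2570, 770, 1960, 1680, 5420],
    "above_ground_area": [1180, 2170, 770, 1050, 1680, 3890],
    "bedrooms": [3, 3, 2, 4, 3, 4],
}
kept_columns = drop_correlated(toy, threshold=0.8)
log_price = log_transform([221900, 538000, 180000, 604000, 510000, 1225000])
print(kept_columns)
```

The order of the columns matters in a greedy filter like this: the first of two highly correlated predictors is the one that survives.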
I developed two models: Multiple Linear Regression for interpretability and Random Forest for predictive accuracy. The Random Forest model performed better, achieving a substantially lower RMSE than the linear regression.
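For reference, RMSE (root mean squared error) is the square root of the average squared difference between actual and predicted values, so a lower value means predictions sit closer to the observed prices. The sketch below is in Python rather than the R used in the project, and the prices and predictions in it are invented placeholders, not results from either model.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Placeholder values for illustration only; not output from the project.
actual_prices = [300000, 450000, 520000]
predictions_model_a = [320000, 400000, 560000]  # hypothetical model A
predictions_model_b = [305000, 440000, 530000]  # hypothetical model B

print(rmse(actual_prices, predictions_model_a))
print(rmse(actual_prices, predictions_model_b))
```

Because the errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones, which matters for price data with outliers.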
This project showcased my ability to clean data, engineer features, compare models, and explain results clearly using R and relevant packages.
Classifying Barbell Lifting Techniques Using Random Forests and Sensor Data
This project is part of the Coursera Practical Machine Learning course. It involves building a predictive model to classify how individuals perform barbell lifts using data collected from accelerometers on various parts of the body. The project demonstrates the application of machine learning techniques to real-world sensor data.
Simulation of Exponential Distribution
Investigating the exponential distribution by simulation in R and comparing the distribution of its sample means with what the Central Limit Theorem predicts
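A minimal sketch of this kind of simulation, in Python rather than the R used in the report. The rate of 0.2, the sample size of 40, and the 1000 repetitions are assumed values chosen for illustration; the report's actual settings are not stated here. For an exponential distribution with rate λ, the mean and standard deviation are both 1/λ, so the Central Limit Theorem predicts sample means centered near 1/λ with standard deviation close to (1/λ)/√n.

```python
import math
import random
import statistics

def simulate_sample_means(rate, n, n_sims, seed=0):
    """Draw n_sims samples of size n from Exponential(rate) and
    return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(n_sims):
        sample = [rng.expovariate(rate) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

# Assumed settings for illustration.
rate, n, n_sims = 0.2, 40, 1000
means = simulate_sample_means(rate, n, n_sims)

theoretical_mean = 1 / rate
theoretical_sd = (1 / rate) / math.sqrt(n)

print("simulated mean of sample means:", statistics.mean(means))
print("theoretical mean:", theoretical_mean)
print("simulated sd of sample means:", statistics.stdev(means))
print("theoretical sd:", theoretical_sd)
```

Plotting a histogram of `means` against a normal curve with the theoretical mean and standard deviation is the usual next step for judging how close the distribution is to normal.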
Quiz 3 Statistical Inference
Johns Hopkins Data Science Coursera Course
Data Analysis
Tools