Recently Published
Flexdashboard Report: Analysis of NYC Flights 2013
The flights dataset (shipped with the nycflights13 package in R, and also used in Python/pandas tutorials) is one of the best-known datasets for learning data science. Thanks to our best professor, Dr. Solym Manou-Abi.
Statistical model for newborn weight prediction
This project builds a statistical model to predict newborn birth weight using clinical data from 2,500 cases across three hospitals. Key predictors include gestational age, maternal smoking, infant sex, and biometric measures.
The final regression model shows good performance (R² ≈ 0.79), addressing heteroscedasticity with log-transformation and incorporating interactions and non-linear effects. Diagnostic tests support model reliability.
The model aids early identification of at-risk newborns, informs prenatal care (notably smoking cessation), and helps optimize neonatal resource planning. Limitations in extreme value predictions are noted, suggesting future validation and richer longitudinal data integration.
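As a rough illustration of the modelling approach summarised above (log-transformed response, an interaction, and a non-linear term), here is a minimal sketch in Python with NumPy. It is not the project's actual model: the data are synthetic, and the variable names (`gest_weeks`, `smoker`), the coefficients used to simulate the data, and the choice of a quadratic term and a smoking-by-gestational-age interaction are all assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in data; the project's clinical dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 500
gest_weeks = rng.uniform(32, 42, n)            # hypothetical gestational age (weeks)
smoker = rng.integers(0, 2, n).astype(float)   # hypothetical maternal smoking flag
gc = gest_weeks - 37.0                         # centre age to reduce collinearity

# Simulated birth weight (grams) with multiplicative noise, so that the
# variance grows with the mean on the raw scale (heteroscedasticity).
# All coefficients below are invented for the simulation.
log_mean = 8.1 + 0.08 * gc - 0.002 * gc**2 - 0.06 * smoker - 0.01 * smoker * gc
weight_g = np.exp(log_mean + rng.normal(0.0, 0.1, n))

# Design matrix: intercept, linear and quadratic age terms (non-linear effect),
# smoking main effect, and smoking x age interaction.
X = np.column_stack([np.ones(n), gc, gc**2, smoker, smoker * gc])

# Fit ordinary least squares on the log of the response; the log transform
# stabilises the variance of the residuals.
y = np.log(weight_g)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# R^2 on the log scale.
resid = y - X @ beta
r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

print("coefficients:", np.round(beta, 4))
print("R^2 (log scale):", round(float(r2), 3))
```

Note that predictions on the original gram scale require back-transforming with `np.exp`, and that R² computed on the log scale is not directly comparable to R² on the raw scale.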
Project 2 PCA
PCA
Project 1 Clustering
Clustering
Document
This project applies a pure dimensionality-reduction approach, t-SNE, to the Wine Quality dataset, which consists of physicochemical measurements of red wine samples and the corresponding quality ratings assigned by expert tasters. The objective is to explore whether wines with similar quality scores exhibit similar physicochemical characteristics when projected into a two-dimensional space.
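A projection of this kind might be sketched as follows in Python with scikit-learn. This is only an illustrative outline, not the project's code: the feature matrix is random synthetic data standing in for the wine measurements, and the sample size, `perplexity` value, and standardisation step are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder: 60 samples x 11 features, standing in for the
# physicochemical variables; quality scores are random integers here.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 11))
quality = rng.integers(3, 9, size=60)

# Standardise features so that no single variable dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

# Project to two dimensions. Quality is NOT given to t-SNE; it is kept
# aside only to colour or label the points afterwards.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
embedding = tsne.fit_transform(X_scaled)

print(embedding.shape)
```

Plotting `embedding[:, 0]` against `embedding[:, 1]`, coloured by `quality`, then shows whether similar scores sit close together. Because t-SNE preserves local neighbourhoods rather than global distances, the spacing between clusters in such a plot should be read with caution.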
Behavioral Patterns in Cell Phone Reviews (Association Rules)
USL Association Rules Project