ANyombayire

Allen Nyombayire Mukundwa

Recently Published

A Behavioral Churn Analysis
An end-to-end unsupervised learning project using K-Means Clustering and PCA to identify high-risk consumer personas. Features a "Lift Analysis" using the Apriori algorithm to uncover behavioral signatures that lead to service friction and churn.
Uncovering Consumer Purchasing Patterns in Fast Fashion: A Market Basket Analysis of H&M Transaction Data
This project analyzes 300,000 H&M transactions using the Apriori algorithm to identify how customers bundle products. By using Support, Confidence, and Lift metrics, the research categorizes items into "Anchor" staples and "Target" coordinated sets. The results, visualized through high-contrast network graphs, reveal that certain fashion pairings exhibit lift values over 5,000, offering a significant opportunity for optimized digital cross-selling and store layout strategies.
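As a rough illustration of the three metrics named above, the following sketch computes support, confidence, and lift for item pairs by brute-force counting. It is not the project's code and it skips Apriori's candidate pruning. The function name `pair_metrics` and the toy baskets are invented for the example.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(transactions):
    """Return support, confidence and lift for every ordered item pair A -> B."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate items within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]      # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # P(rhs | lhs) / P(rhs)
            rules[(lhs, rhs)] = {
                "support": support,
                "confidence": confidence,
                "lift": lift,
            }
    return rules


# Hypothetical toy baskets, not taken from the H&M data.
transactions = [
    ["bikini top", "bikini bottom"],
    ["bikini top", "bikini bottom", "t-shirt"],
    ["t-shirt", "jeans"],
    ["t-shirt"],
    ["jeans", "socks"],
    ["t-shirt", "socks"],
]
rules = pair_metrics(transactions)

# Hypothetical arithmetic: two items that each appear in 50 of 300,000
# baskets and always appear together. Lift reduces to n / count, about 6,000.
n, count = 300_000, 50
rare_lift = (count / n) / ((count / n) * (count / n))
```

The last lines suggest why lift can reach the thousands in a large transaction set: when two rare items are almost always bought together, the denominator (the co-occurrence expected under independence) becomes very small.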
Clustering Asian Countries Based on Life Expectancy & Socioeconomic Indicators
This project explores the hidden structures within global health data by focusing on 46 Asian countries. Using the WHO Life Expectancy dataset, I applied various unsupervised learning techniques to group countries based on metrics like life expectancy, schooling, and adult mortality. Key features of this report include:
Data Selection: Analysis focused on the year 2014 to ensure data completeness for critical variables like Alcohol consumption and Schooling.
Methodology: A comparison of partitioning methods (K-means and PAM) and Hierarchical Clustering (Ward's method) to identify stable country groupings.
Advanced Visualization: Use of dimensionality reduction techniques (MDS, t-SNE, and UMAP) to project complex, multi-dimensional data into intuitive 2D maps.
Findings: The results reveal three distinct clusters: a "High-Performing" group (e.g., Japan, Singapore), an "Emerging" middle-income group (e.g., China, Thailand), and a "Challenged" group (e.g., Afghanistan, Yemen).
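The practical difference between K-means and PAM is that PAM picks actual observations (medoids) as cluster centres, while K-means uses averaged centroids. The sketch below is a simplified swap-only version of k-medoids, written under stated assumptions: it starts from the first `k` points instead of PAM's BUILD phase, uses Euclidean distance, and runs on made-up 2D points, not the WHO data. The names `pam_sketch` and `total_cost` are hypothetical.

```python
import math
from itertools import product


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam_sketch(points, k):
    """Greedy swap-based k-medoids; returns (medoid indices, labels)."""
    medoids = list(range(k))  # naive start (assumption); real PAM uses BUILD
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid with each non-medoid point.
        for pos, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(points, trial)
            if cost < best - 1e-12:  # accept only strict improvements
                medoids, best, improved = trial, cost, True
    labels = [
        min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
        for p in points
    ]
    return medoids, labels


# Made-up points forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = pam_sketch(points, k=2)
```

Because each medoid is a row of the input, it can be read off directly as a representative case, which is what makes PAM results easy to interpret in terms of named countries.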
The Geometry of Global Diets: Non-linear Dimension Reduction and Cluster Validation
This study performs a high-dimensional analysis of global food supply patterns using FAOSTAT data to identify distinct culinary archetypes. By moving beyond traditional geographic or economic groupings, this research utilizes advanced unsupervised learning techniques to categorize 207 nations based on 118 different food variables. Key Methodological Highlights:
Cluster Tendency Validation: Verified the non-random nature of the dataset using a Hopkins Statistic of 1.0 and VAT (Visual Assessment of Cluster Tendency) plots.
Multi-Dimensional Projection: Compared linear dimensionality reduction (PCA) with advanced non-linear techniques, including t-SNE and UMAP, to reveal complex cultural dietary "neighborhoods".
Robust Clustering: Employed PAM (Partitioning Around Medoids) to identify four stable dietary archetypes: Western Industrialized, Cereal-based, Starchy-Root Subsistence, and Mediterranean/Diverse profiles.
Findings: The analysis successfully identifies representative "medoid" entities such as the European Union (27) for Western diets and Net Food Importing Developing Countries for subsistence diets, providing a data-driven framework for understanding global nutrition transitions and food security.
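For readers unfamiliar with the Hopkins statistic, here is a minimal sketch of one common formulation. It assumes the convention in which values near 0.5 suggest spatial randomness and values near 1 suggest clustering. Implementations differ (some raise distances to a power or report the complement), so this is not necessarily the variant used in the study. The function name `hopkins`, the sample size `m`, and the synthetic points are all assumptions.

```python
import math
import random


def hopkins(points, m=None, seed=0):
    """Hopkins statistic H = sum(u) / (sum(u) + sum(w)).

    u: distances from m uniform random points (drawn in the data's
       bounding box) to their nearest real point.
    w: distances from m sampled real points to their nearest other
       real point.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])
    m = m or max(1, n // 10)  # sample about 10% of the data by default
    lows = [min(p[d] for p in points) for d in range(dim)]
    highs = [max(p[d] for p in points) for d in range(dim)]

    sample_idx = rng.sample(range(n), m)
    w = [
        min(math.dist(points[i], points[j]) for j in range(n) if j != i)
        for i in sample_idx
    ]

    u = []
    for _ in range(m):
        q = [rng.uniform(lows[d], highs[d]) for d in range(dim)]
        u.append(min(math.dist(q, p) for p in points))

    return sum(u) / (sum(u) + sum(w))


# Synthetic data (not FAOSTAT): two tight blobs versus uniform noise.
rng = random.Random(42)
clustered = [
    (rng.gauss(cx, 0.1), rng.gauss(cy, 0.1))
    for cx, cy in [(0, 0), (10, 10)]
    for _ in range(50)
]
uniform = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(100)]

h_clustered = hopkins(clustered)
h_uniform = hopkins(uniform)
```

On the tightly clustered sample the statistic should land close to 1, while the uniform sample should give a clearly lower value, typically around 0.5.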