Recently Published

Retrieving the mRNA from cDNA of MS patient ID_REF and forming amino acid sequences
I was not able to use Bioconductor to map the ID_REF barcodes to gene names, which I wanted in order to explore known genes among the top 41 genes found to predict with 100% accuracy whether a sample comes from an MS patient or a healthy control. Instead, I reviewed how a protein is produced: DNA is transcribed into mRNA, and the mRNA is then translated into amino acids, one triplet codon at a time, by transfer RNA at the ribosome. Gaps of 1-2 nucleotides were left unpaired with a codon, so gsub wasn't the best choice. These barcodes are fragments, though, so the gaps may be deletions, insertions, or translocations in genes that are associated with MS risk or, better yet, are found in people with MS.
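The codon-by-codon reading described above can be sketched as follows. This is a minimal illustration in Python rather than the R used in the post, and it is not the author's code. It assumes the fragment is given as the coding (sense) strand, so the mRNA is obtained by replacing T with U. The function names `cdna_to_mrna` and `translate` are hypothetical. Rather than dropping the 1-2 trailing nucleotides that do not complete a codon, the sketch returns them separately.

```python
from itertools import product

# Standard genetic code in the conventional base ordering (U, C, A, G).
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def cdna_to_mrna(cdna):
    """Assumption: the input is the coding strand, so only T -> U is needed."""
    return cdna.upper().replace("T", "U")


def translate(mrna, frame=0):
    """Translate an mRNA string codon by codon, starting at offset `frame`.

    Returns (protein, leftover), where leftover holds the 0-2 trailing
    nucleotides that do not form a complete codon. Codons containing
    unexpected characters (for example N) are reported as "X".
    """
    seq = mrna[frame:]
    usable = len(seq) - len(seq) % 3
    protein = "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )
    return protein, seq[usable:]


# Made-up fragment for illustration only, not data from the post.
protein, leftover = translate(cdna_to_mrna("ATGGCCTAAGT"))
print(protein, leftover)
```

Calling `translate` with `frame` set to 0, 1, and 2 gives the three forward reading frames, which is one way to check whether a leftover of 1-2 nucleotides is only a frame offset in a fragment.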
Document
VPN Day 2
Document
Ireland's Median Rainfall in January (1850–2014)
This blog post uses R code to analyse precipitation measurements taken in January in Ireland for the years 1850 - 2024. An interactive map of weather stations is included for a user-friendly experience. The R code is explained, along with a discussion of the patterns found and the limitations.
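As a rough idea of the central calculation, the sketch below groups January readings by year and takes the median of each group. It is written in Python rather than the R used in the post. The record layout `(year, month, rainfall_mm)`, the function name `january_median_by_year`, and the sample values are all assumptions made for illustration, not the post's data.

```python
from collections import defaultdict
from statistics import median


def january_median_by_year(records):
    """Return {year: median January rainfall}, skipping missing readings.

    `records` is an iterable of (year, month, rainfall_mm) tuples, where
    rainfall_mm may be None when a station has no reading.
    """
    by_year = defaultdict(list)
    for year, month, rainfall_mm in records:
        if month == 1 and rainfall_mm is not None:
            by_year[year].append(rainfall_mm)
    return {year: median(values) for year, values in sorted(by_year.items())}


# Invented readings from several hypothetical stations.
sample = [
    (1850, 1, 110.0), (1850, 1, 90.0), (1850, 1, 130.0),
    (1850, 2, 70.0),                       # not January, ignored
    (1851, 1, 80.0), (1851, 1, None), (1851, 1, 100.0),
]
print(january_median_by_year(sample))
```

The median is less sensitive than the mean to a single very wet station, which is presumably why it suits a summary across many stations.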
Clustering Asian Countries Based on Life Expectancy & Socioeconomic Indicators
This project explores the hidden structures within global health data by focusing on 46 Asian countries. Using the WHO Life Expectancy dataset, I applied various unsupervised learning techniques to group countries based on metrics like life expectancy, schooling, and adult mortality. Key features of this report include: Data Selection: Analysis focused on the year 2014 to ensure data completeness for critical variables like Alcohol consumption and Schooling. Methodology: A comparison of partitioning methods (K-means and PAM) and Hierarchical Clustering (Ward's method) to identify stable country groupings. Advanced Visualization: Use of dimensionality reduction techniques—MDS, t-SNE, and UMAP—to project complex, multi-dimensional data into intuitive 2D maps. Findings: The results reveal three distinct clusters: a "High-Performing" group (e.g., Japan, Singapore), an "Emerging" middle-income group (e.g., China, Thailand), and a "Challenged" group (e.g., Afghanistan, Yemen).
Production models - JABBA (Part 1)
Input data, develop priors, model fitting, diagnostics, sensitivity analysis
The Geometry of Global Diets: Non-linear Dimension Reduction and Cluster Validation
This study performs a high-dimensional analysis of global food supply patterns using FAOSTAT data to identify distinct culinary archetypes. By moving beyond traditional geographic or economic groupings, this research utilizes advanced unsupervised learning techniques to categorize 207 nations based on 118 different food variables. Key Methodological Highlights: Cluster Tendency Validation: Verified the non-random nature of the dataset using a Hopkins Statistic of 1.0 and VAT (Visual Assessment of Cluster Tendency) plots. Multi-Dimensional Projection: Compared linear dimensionality reduction (PCA) with advanced non-linear techniques, including t-SNE and UMAP, to reveal complex cultural dietary "neighborhoods". Robust Clustering: Employed PAM (Partitioning Around Medoids) to identify four stable dietary archetypes: Western Industrialized, Cereal-based, Starchy-Root Subsistence, and Mediterranean/Diverse profiles. Findings: The analysis successfully identifies representative "medoid" entities such as the European Union (27) for Western diets and Net Food Importing Developing Countries for subsistence diets—providing a data-driven framework for understanding global nutrition transitions and food security.
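The idea behind PAM, mentioned above, is that each cluster is represented by an actual observation (the medoid), and medoids are swapped with non-medoids whenever that lowers the total distance from every point to its nearest medoid. The sketch below is a simplified Python illustration, not the study's code. It starts from the first `k` points instead of using PAM's BUILD phase, and it accepts the first improving swap rather than searching for the best one. The function names and the toy points are assumptions.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Simplified k-medoids by greedy swapping.

    Returns (medoids, labels): medoids are indices into `points`, and
    labels[i] is the position in `medoids` of the medoid nearest point i.
    """
    medoids = list(range(k))  # naive initialisation (assumption, not BUILD)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for h in range(len(points)):
                if h in medoids:
                    continue
                # Try replacing the i-th medoid with point h.
                candidate = medoids[:i] + [h] + medoids[i + 1:]
                cost = total_cost(points, candidate)
                if cost < best:
                    medoids, best, improved = candidate, cost, True
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


# Toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = pam(toy, 2)
print(medoids, labels)
```

Because a medoid is a real row of the data, it can be reported as the representative entity of its cluster, in the way the study names representative entities for its dietary archetypes.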