Recently Published
NextWordpredictor
TextPredictor is a Shiny web app that predicts the next word in a user-entered phrase using unigram, bigram, and trigram frequency tables with a backoff strategy. The system is fast, lightweight, and demonstrates practical applications of R, Shiny, and NLP for text prediction. A 5-slide RPubs presentation explains the model, data, and app functionality.
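As a rough illustration of the backoff idea described above, here is a minimal Python sketch (the app itself is built with R and Shiny). The function names, the toy corpus, and the plain "most frequent continuation" rule are assumptions made for this example; the app's actual tables and any scoring or smoothing it uses are not shown in the description.

```python
from collections import Counter, defaultdict


def build_tables(tokens):
    """Count unigrams, bigrams and trigrams from a list of tokens."""
    unigrams = Counter(tokens)
    bigrams = defaultdict(Counter)   # previous word -> counts of next word
    trigrams = defaultdict(Counter)  # previous two words -> counts of next word
    for w1, w2 in zip(tokens, tokens[1:]):
        bigrams[w1][w2] += 1
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigrams[(w1, w2)][w3] += 1
    return unigrams, bigrams, trigrams


def predict_next(phrase, unigrams, bigrams, trigrams):
    """Back off from trigram to bigram to unigram until a match is found."""
    words = phrase.lower().split()
    if len(words) >= 2 and tuple(words[-2:]) in trigrams:
        return trigrams[tuple(words[-2:])].most_common(1)[0][0]
    if words and words[-1] in bigrams:
        return bigrams[words[-1]].most_common(1)[0][0]
    # Nothing matched: fall back to the most frequent word overall.
    return unigrams.most_common(1)[0][0]


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat the cat sat on the sofa the dog ate".split()
tables = build_tables(corpus)

print(predict_next("I saw the cat", *tables))  # trigram match on ("the", "cat")
print(predict_next("a big dog", *tables))      # backs off to the bigram table
print(predict_next("hello world", *tables))    # backs off to the unigram table
```

Checking the longest context first and shortening it only on a miss is what keeps lookups cheap: each prediction is at most three dictionary lookups.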
Assessment of wellness programmes
This project evaluates the financial impact of a workplace wellness programme using data from a randomised controlled trial (RCT). The analysis begins by cleaning the dataset and renaming variables for clarity. It then explores the data through summary statistics and visualisations, highlighting a potential selection bias: healthier employees are more likely to enrol. To address this, the study contrasts a standard OLS regression, which misleadingly suggests cost savings because of the "healthy user" effect, with an instrumental variables (IV) regression. The IV analysis uses the random assignment of the programme as an instrument to isolate the causal effect of participation, and finds that the wellness programme has no statistically significant effect on employee medical spending.
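The contrast between OLS and IV can be sketched with simulated data. Everything in the Python example below is an assumption made for illustration: the data-generating process, the coefficients, and the variable names are invented and are not the project's dataset or results. The true programme effect is set to zero by construction, healthier people are more likely to enrol, and only randomly assigned employees can enrol.

```python
import random


def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simulate(n=20000, seed=0):
    """Simulated trial (hypothetical): z = assignment, d = enrolment, y = spending."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        health = rng.gauss(0, 1)                  # unobserved confounder
        assigned = 1 if rng.random() < 0.5 else 0  # random assignment (instrument)
        # Only assigned employees can enrol, and healthier ones enrol more often.
        enrolled = assigned if health + rng.gauss(0, 1) > 0 else 0
        # True effect of enrolment on spending is 0 by construction.
        spending = 100 - 20 * health + 0.0 * enrolled + rng.gauss(0, 5)
        z.append(assigned)
        d.append(enrolled)
        y.append(spending)
    return z, d, y


z, d, y = simulate()

# OLS slope of spending on enrolment: biased by the "healthy user" effect.
ols_slope = cov(d, y) / cov(d, d)

# IV (Wald) estimate using assignment as the instrument for enrolment.
iv_estimate = cov(z, y) / cov(z, d)

print(f"OLS slope:   {ols_slope:.2f}")
print(f"IV estimate: {iv_estimate:.2f}")
```

In this setup the OLS slope comes out clearly negative even though enrolment does nothing, while the IV estimate stays close to zero, because assignment is unrelated to health.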
The Sample Distribution
I use simulations to show how the spread of the sampling distribution of the mean changes with sample size, and that this spread is the standard error of the mean.
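A minimal Python sketch of this kind of simulation follows. The normal population with mean 50 and standard deviation 10, the sample sizes, and the number of repetitions are assumptions chosen for the example. It compares the standard deviation of simulated sample means with the theoretical standard error, sigma / sqrt(n).

```python
import random
import statistics


def sd_of_sample_means(n, reps=2000, mu=50.0, sigma=10.0, seed=1):
    """Draw `reps` samples of size `n` and return the SD of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)


SIGMA = 10.0  # assumed population standard deviation

# Map each sample size to (simulated spread, theoretical standard error).
results = {
    n: (sd_of_sample_means(n, sigma=SIGMA), SIGMA / n ** 0.5)
    for n in (4, 16, 64)
}

for n, (simulated, theoretical) in results.items():
    print(f"n={n:3d}  simulated={simulated:.3f}  sigma/sqrt(n)={theoretical:.3f}")
```

Quadrupling the sample size should roughly halve the spread of the sample means, which is the 1/sqrt(n) behaviour of the standard error.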
Advanced Syllabus: Classification Systems and Information Theory in Data Mining
Statistics and Analysis of Large Volumes of Data
GEOS372 Final Project: Global Filming Locations
Map of filming locations for 5,000+ of the most popular films and TV shows from The Movie Database, with locations sourced from Wikidata and Wikipedia!
Geostatistics Module
Climate analysis on an avocado farm:
-Exploratory analysis
-Creation of geodata
-Construction of the semivariogram
-Fitting the semivariogram to theoretical models
-Application of the kriging methodology
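The semivariogram step in the list above can be sketched in Python as follows. This is an illustration under assumptions, not the module's own code: the function name, the three sample points, and the binning scheme are invented for the example. It uses the classical estimator, gamma(h) = sum of (z_i - z_j)^2 over all pairs in a distance bin, divided by twice the number of pairs.

```python
import math
from itertools import combinations


def empirical_semivariogram(points, bin_width, n_bins):
    """Classical empirical semivariogram.

    points: list of (x, y, value) tuples.
    Returns a list of (bin centre, gamma, pair count) for non-empty bins,
    where bin k covers distances in [k * bin_width, (k + 1) * bin_width).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (x1, y1, v1), (x2, y2, v2) in combinations(points, 2):
        k = int(math.hypot(x1 - x2, y1 - y2) // bin_width)
        if k < n_bins:
            sums[k] += (v1 - v2) ** 2
            counts[k] += 1
    return [
        ((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Hypothetical measurements at three locations along a line.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
variogram = empirical_semivariogram(samples, bin_width=1.0, n_bins=3)
print(variogram)  # -> [(1.5, 1.25, 2), (2.5, 4.5, 1)]
```

A theoretical model (for example spherical or exponential) would then be fitted to these binned values before kriging; that fitting step is not shown here.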
Mapping the Level of Socioeconomic Development of Regencies/Cities in East Java in 2021 Using Multidimensional Scaling (MDS)
Socioeconomic conditions in East Java in 2021 reflect differences in welfare between regencies/cities, as seen in the variation in the percentage of the population living in poverty, regional GDP, life expectancy, mean years of schooling, and per capita expenditure. Some regions show higher economic capacity and quality of life, while others still need attention in poverty alleviation and in improving access to social services. MDS is applied to map how similar the regions' socioeconomic characteristics are, so that relationships between regions, and groups of regions with similar conditions, can be identified visually.
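As an illustration of what MDS does, here is a minimal Python sketch of classical (metric) MDS. It is written under assumptions: the description does not say which MDS variant or software the project uses, and the four example points are invented. The sketch double-centres the squared distance matrix and extracts the leading eigenvectors by power iteration with deflation, using only the standard library.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Embed n items in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J, with J the centring matrix.
    b = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [math.sin(i + 1 + d) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Hypothetical regions described by two indicators; distances are Euclidean.
regions = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (0.0, 1.0)]
dist = [[math.hypot(a[0] - c[0], a[1] - c[1]) for c in regions] for a in regions]
embedding = classical_mds(dist)
for point in embedding:
    print([round(x, 3) for x in point])
```

The embedding is only determined up to rotation and reflection, so the printed coordinates need not match the inputs; what is preserved is the set of pairwise distances, which is why nearby points on the map can be read as regions with similar conditions.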
GEOS372 Final Project: Rough Work (1)
Rough work for the project, i.e. an earlier version