Recently Published
SwiftKey Word Predictor Presentation
This presentation pitches a data product designed for the Coursera Data Science Capstone. It outlines the development of a next-word prediction algorithm using N-gram modeling and a "Stupid Backoff" strategy. The project demonstrates the full data science pipeline: from exploratory analysis of a 4-million-line corpus (Twitter, News, Blogs) to the deployment of a reactive Shiny application. The focus is on balancing computational speed with predictive accuracy for mobile-first environments.
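To make the "Stupid Backoff" idea concrete, here is a minimal sketch in Python. It is not the presentation's own implementation (which is not shown here). The function names, the toy corpus, the trigram limit, and the backoff factor of 0.4 are all assumptions chosen for illustration. The idea: score a candidate word by its relative frequency after the longest matching context, and multiply by a fixed penalty each time the context has to be shortened.

```python
from collections import Counter

ALPHA = 0.4  # backoff penalty; 0.4 is a commonly used value, assumed here


def build_counts(sentences, max_n=3):
    """Count every n-gram of length 1..max_n (whitespace tokens, lowercased)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def score(counts, context, word, total_unigrams):
    """Stupid Backoff score of `word` following `context` (a sequence of tokens)."""
    context = tuple(context)
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Relative frequency of the n-gram given its context.
            return penalty * counts[ngram] / counts[context]
        context = context[1:]  # drop the oldest word and back off
        penalty *= ALPHA
    # No context matched: fall back to the unigram frequency.
    return penalty * counts[(word,)] / total_unigrams


def predict_next(counts, text, max_n=3, top_k=3):
    """Return the top_k highest-scoring next words for the given text."""
    tokens = text.lower().split()
    context = tokens[-(max_n - 1):] if max_n > 1 else []
    vocab = [gram[0] for gram in counts if len(gram) == 1]
    total = sum(counts[(w,)] for w in vocab)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(vocab, key=lambda w: (-score(counts, context, w, total), w))
    return ranked[:top_k]


corpus = ["i love data science", "i love data analysis", "i love coffee"]
counts = build_counts(corpus)
print(predict_next(counts, "i love"))
```

The scores are not normalized probabilities, which is what keeps the method cheap: it needs only count lookups, a property that suits the speed-versus-accuracy trade-off described above.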
Coursera Data Science Capstone: Milestone Report
This milestone report presents an initial exploration of the text data used in the Coursera Data Science Capstone project. The purpose of this analysis is to understand the structure, size, and characteristics of the data before building a text prediction model. The report summarizes key statistics of the datasets and outlines the planned approach for developing a predictive algorithm and an interactive Shiny application.
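As a rough illustration of the kind of summary statistics such a report might contain, the sketch below computes line and word counts for a list of text lines. The function name `summarize` and the choice of statistics are assumptions, not taken from the report itself.

```python
def summarize(lines):
    """Basic size statistics for a list of text lines (whitespace tokenization)."""
    word_counts = [len(line.split()) for line in lines]
    return {
        "lines": len(lines),
        "words": sum(word_counts),
        "max_words_per_line": max(word_counts, default=0),
        "mean_words_per_line": sum(word_counts) / len(lines) if lines else 0.0,
    }


print(summarize(["a short tweet", "a somewhat longer news sentence"]))
```

In practice each source (Twitter, news, blogs) would be read from its file and summarized separately, so the three can be compared side by side.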