RPubs

by RStudio

Recently Published

Activity M1.1: Programming Fundamentals

By SofiaMendezFranco

Develop basic skills for handling information and commands in R.

2 months ago

Document

By Sayed_abbas

This is the first assignment

2 months ago

SwiftKey Capstone: Exploratory Data Analysis of HC Corpora

By chidemannie

This project is part of the Johns Hopkins University Data Science Specialization Capstone. The objective is to explore large-scale English text datasets (blogs, news, and Twitter) and build the foundation for a next-word prediction model similar to those used in mobile smart keyboards. The analysis includes: basic dataset summaries (file size, line counts, maximum line length); exploratory analysis of text structure; sampling and cleaning strategies suitable for large corpora; word frequency analysis (unigrams and bigrams); distribution of words per line; vocabulary coverage analysis (50% and 90% token coverage); and a preliminary modeling strategy for n-gram backoff prediction. The results highlight differences between text sources (short Twitter messages vs. long blog entries), motivate efficient sampling techniques, and inform the design of a responsive Shiny application for deployment. The next phase of the capstone will implement an optimized n-gram model with a backoff strategy and deploy it via Shiny for real-time next-word prediction.

2 months ago

Next Word Prediction App

By Anujsharmadev

This project demonstrates a next-word prediction model built on a bigram language model implemented in R.

2 months ago

TEST

By SofiaMendezFranco

2 months ago

Crypto

By Aakash1982

My report

2 months ago

Similarity Ratings Analyses

By fotisfotiadis

Scripts for the analyses reported in the manuscript "Stimulus Presentation Duration Affects Category-Learning Accuracy" by Fotis A. Fotiadis, Iris Antonatou Stamatopoulou, and Argiro Vatakis

2 months ago