
nomarpicasso

Ramon Rodriguez-Santana

Recently Published

Estimating the Size of a Smoking Population Using the Mark–Recapture Method in R
The Mark–Recapture method, originally developed in ecology to estimate wildlife populations, provides a statistically rigorous alternative for estimating the size of partially observed human populations. Here is an example demonstrating how to apply the Mark–Recapture method to estimate the size of a smoking population using simulated data.
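A minimal sketch of the idea in base R with simulated data; the two "capture" sources and the Chapman variant of the Lincoln-Petersen estimator shown here are illustrative choices, not necessarily the exact ones used in the post.

```r
# Simulate a "true" smoking population and two partially overlapping data sources
set.seed(123)
N_true  <- 10000                           # true (unknown) number of smokers
smokers <- seq_len(N_true)
source1 <- sample(smokers, 900)            # smokers "marked" in the first source
source2 <- sample(smokers, 1100)           # smokers captured in the second source
m <- length(intersect(source1, source2))   # recaptured (appear in both sources)

# Chapman estimator (a less biased variant of Lincoln-Petersen)
N_hat <- (length(source1) + 1) * (length(source2) + 1) / (m + 1) - 1
N_hat
```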
Greenfield Analysis (GFA) using a center-of-gravity (CoG) approach to identify 4 optimal health care facility locations in Connecticut
Greenfield Analysis (GFA) is a facility location optimization technique used to identify the most suitable placement of new service centers, warehouses, or healthcare facilities when no prior infrastructure constraints exist. In this analysis, we applied a center-of-gravity (CoG) approach to identify four optimal facility locations in Connecticut.
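As a rough illustration (not the post's exact workflow), a weighted k-means on demand-point coordinates can act as a multi-site center-of-gravity. The demand points and weights below are simulated placeholders.

```r
# Simulated demand points (e.g., patient locations) with visit-volume weights
set.seed(10)
demand <- data.frame(
  lon    = runif(500, -73.7, -71.8),   # approximate longitude range for Connecticut
  lat    = runif(500,  41.0,  42.05),  # approximate latitude range for Connecticut
  weight = rpois(500, 20)
)

# Weighted k-means: replicate each point by its weight, then find 4 centers
expanded <- demand[rep(seq_len(nrow(demand)), demand$weight), c("lon", "lat")]
km <- kmeans(expanded, centers = 4, nstart = 50)
km$centers   # candidate facility coordinates (centers of gravity)
```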
Comparing the Historical Limits Method (HLM) and Negative Binomial (NB) Regression for Detecting Quarterly Infectious Disease Outbreaks in R
When detecting HCV outbreaks, HLM will alert you if the current quarter’s case count is much higher than the average of previous quarters, without adjusting for changes in population size or time trends. In contrast, NB regression can model expected case counts based on factors such as time, region, population, and seasonality, providing a more accurate expected count along with confidence intervals.
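A minimal sketch of the NB side using {MASS}; the simulated quarterly counts, population offset, and seasonal terms below are placeholders for the real surveillance data used in the post.

```r
library(MASS)

# Simulated quarterly counts with trend, seasonality, and a population offset
set.seed(42)
quarter <- 1:40
season  <- factor(rep(1:4, 10))
pop     <- seq(100000, 120000, length.out = 40)
mu      <- exp(-9 + 0.01 * quarter + 0.1 * as.numeric(season) + log(pop))
cases   <- rnbinom(40, mu = mu, size = 5)
dat     <- data.frame(cases, quarter, season, pop)

fit <- glm.nb(cases ~ quarter + season + offset(log(pop)), data = dat)

# Expected count and approximate 95% interval for the most recent quarter
pred <- predict(fit, newdata = dat[40, ], type = "link", se.fit = TRUE)
exp(pred$fit + c(estimate = 0, lower = -1.96, upper = 1.96) * pred$se.fit)
```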
Time-Space Alert Detection (TSAD) in R
This tutorial provides a practical guide to using the R programming language to detect infectious disease outbreaks through time–space analysis. You’ll learn how to use R’s powerful data-handling capabilities and functions to identify unusual increases in specific diagnoses.
OneR (One Rule) classification model in R
The OneR (One Rule) classification model is a simple yet effective rule-based machine learning algorithm. It generates one rule for each feature in the dataset and then selects the rule with the lowest error rate for classification. Despite its simplicity, OneR performs surprisingly well on many classification tasks, often competing with more complex machine learning models.
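A quick sketch with the {OneR} package on the built-in iris data (not the post's dataset); optbin() discretizes the numeric predictors before the one-rule model is fit.

```r
library(OneR)

set.seed(1)
iris_binned <- optbin(iris)                     # optimal binning of numeric predictors
idx   <- sample(nrow(iris_binned), 0.7 * nrow(iris_binned))
model <- OneR(Species ~ ., data = iris_binned[idx, ], verbose = TRUE)

pred <- predict(model, iris_binned[-idx, ])
eval_model(pred, iris_binned[-idx, "Species"])  # confusion matrix and accuracy
```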
Forecasting time-series data using 19 {forecast} package algorithms in R
Forecasting plays a vital role in identifying patterns within time-series data and supporting informed decision-making across diverse fields such as business, healthcare, economics, and environmental studies.
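One of the workhorse algorithms from {forecast}, shown on the built-in AirPassengers series as a stand-in for the post's data.

```r
library(forecast)

fit <- auto.arima(AirPassengers)   # automatic ARIMA order selection
fc  <- forecast(fit, h = 24)       # forecast 24 months ahead
autoplot(fc)
accuracy(fit)                      # in-sample accuracy measures
```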
Machine Learning Regression Models in R
Regression analysis in machine learning (ML) is a method used to examine the connection between independent variables and a dependent variable. This type of analysis is known as predictive modeling, in which an algorithm or method is used to predict continuous outcomes. Here are the steps for conducting a ML regression analysis and deploying the final selected model.
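A compact sketch of one such workflow using {caret} with cross-validation on mtcars; the two models shown (linear regression and a regression tree) are placeholders for whichever algorithms the post compares.

```r
library(caret)

ctrl <- trainControl(method = "cv", number = 5)

set.seed(7)
fit_lm   <- train(mpg ~ ., data = mtcars, method = "lm",    trControl = ctrl)
set.seed(7)
fit_cart <- train(mpg ~ ., data = mtcars, method = "rpart", trControl = ctrl)

summary(resamples(list(LM = fit_lm, CART = fit_cart)))  # cross-validated RMSE / R-squared
predict(fit_lm, newdata = head(mtcars))                 # use the selected model
```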
Using the {tidycensus} package in R to estimate the populations of four Connecticut Hispanic subgroups
{tidycensus} enables users to interact with specific US Census Bureau data APIs. It returns tidyverse-compatible data frames and can optionally attach geographic boundary data for mapping.
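A sketch of the kind of call involved. The ACS variable codes below (B03001_004 through B03001_007 for Mexican, Puerto Rican, Cuban, and Dominican origin) and the survey year are my assumptions, not necessarily those used in the post, and a Census API key is required.

```r
library(tidycensus)
# census_api_key("YOUR_KEY", install = TRUE)   # one-time setup

hisp_vars <- c(Mexican     = "B03001_004",     # assumed ACS variable codes
               PuertoRican = "B03001_005",
               Cuban       = "B03001_006",
               Dominican   = "B03001_007")

ct_hisp <- get_acs(geography = "county", variables = hisp_vars,
                   state = "CT", year = 2021, survey = "acs5")
head(ct_hisp)   # estimate and margin of error per county and subgroup
```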
Hierarchical cluster analysis of prescription counts by town of residence in R
Hierarchical cluster analysis example in R.
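In base R, the core of such an analysis looks roughly like this, with simulated prescription counts standing in for the real town-level data.

```r
# Simulated prescription counts by town (rows = towns, columns = drug classes)
set.seed(1)
counts <- matrix(rpois(40, 20), nrow = 10,
                 dimnames = list(paste0("Town_", 1:10), paste0("Drug_", 1:4)))

d  <- dist(scale(counts))            # Euclidean distance on standardized counts
hc <- hclust(d, method = "ward.D2")  # Ward's linkage
plot(hc)                             # dendrogram
cutree(hc, k = 3)                    # assign towns to 3 clusters
```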
K-Means and K-Medoids clustering of prescription counts by Connecticut towns
Steps on how to perform K-Means and K-Medoids clustering. K-means clustering is an unsupervised machine learning algorithm that identifies groups in unlabeled data. K-medoids is also an unsupervised method for clustering unlabeled data; it is an improved version of the K-Means algorithm, designed mainly to reduce sensitivity to outliers.
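A minimal sketch of both methods, using {cluster} for PAM (k-medoids) on simulated town-level counts.

```r
library(cluster)

set.seed(1)
towns  <- data.frame(rx_count = rpois(50, 30), rate_per_10k = runif(50, 5, 60))
scaled <- scale(towns)

km <- kmeans(scaled, centers = 3, nstart = 25)   # k-means
pm <- pam(scaled, k = 3)                         # k-medoids (PAM), more outlier-robust

table(kmeans = km$cluster, pam = pm$clustering)  # compare cluster assignments
```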
Create maps in R by merging shapefile with data source file
Create {ggplot2} and {tmap} maps in R by merging a shapefile with a data source file.
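The general pattern looks like this; the file names, join key, and fill variable are hypothetical placeholders.

```r
library(sf)
library(dplyr)
library(ggplot2)

towns <- st_read("ct_towns.shp")             # hypothetical shapefile
rates <- read.csv("od_rates_by_town.csv")    # hypothetical data source file

map_data <- left_join(towns, rates, by = "town_name")   # assumed common key

ggplot(map_data) +
  geom_sf(aes(fill = rate)) +
  scale_fill_viridis_c() +
  theme_minimal()
```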
Local Moran and Local Getis-Ord Maps of OD Deaths by Town of Residence using the {rgeoda} package in R
Using the {rgeoda} package to create Local Moran and Local Getis-Ord Maps.
Create publication-ready analytical and summary tables using {gtsummary} package in R
The {gtsummary} package offers a stylish and adaptable way to produce publication-ready analytical and summary tables.
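A small example using the trial dataset that ships with {gtsummary} (not the post's data).

```r
library(gtsummary)
library(dplyr)

trial %>%
  select(age, grade, response, trt) %>%
  tbl_summary(by = trt, missing = "no") %>%   # summary statistics by treatment arm
  add_p() %>%                                 # add p-values for group comparisons
  bold_labels()
```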
Build a HeatMap in R using the {leaflet} and {leaflet.extras} packages
Creating a HeatMap from 2012 Starbucks locations in CT, MA and RI.
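The core pattern with {leaflet} and {leaflet.extras}; the built-in quakes data stands in here for the Starbucks location file.

```r
library(leaflet)
library(leaflet.extras)

# quakes (built-in) stands in for the 2012 Starbucks coordinates
leaflet(quakes) %>%
  addTiles() %>%
  addHeatmap(lng = ~long, lat = ~lat, intensity = ~mag,
             blur = 20, radius = 15)
```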
Detecting anomalies in your data using Benford’s Law and R
Analyzing large amounts of data in search of anomalies can be a frustrating task. You need techniques that let you quickly evaluate data in a way that highlights potential anomalies and keeps you from conducting meaningless analyses.
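The idea in base R: compare observed first-digit frequencies against Benford's expected proportions log10(1 + 1/d). The simulated amounts below are a stand-in for real transaction or claims data.

```r
set.seed(1)
amounts <- rlnorm(5000, meanlog = 6, sdlog = 1.5)          # simulated dollar amounts

first_digit <- floor(amounts / 10^floor(log10(amounts)))   # leading significant digit
observed <- prop.table(table(factor(first_digit, levels = 1:9)))
expected <- log10(1 + 1 / (1:9))                           # Benford proportions

round(cbind(observed = as.numeric(observed), expected = expected), 3)
chisq.test(table(factor(first_digit, levels = 1:9)), p = expected)
```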
Impute missing values using {missForest} package in R
In this presentation we will learn how to impute missing values using the {missForest} package. This package uses a random forest trained on the observed values of a data matrix to predict the missing values. It can be used to impute continuous and/or categorical data.
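A short sketch on iris with artificially introduced missingness.

```r
library(missForest)

set.seed(81)
iris_mis <- prodNA(iris, noNA = 0.1)   # introduce 10% missing values at random

imp <- missForest(iris_mis)
head(imp$ximp)    # imputed data set
imp$OOBerror      # out-of-bag imputation error (NRMSE for numeric, PFC for factors)
```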
Time series forecasting using {modeltime} in R
Here are the instructions on how to perform classical time series analysis and machine learning modeling in one framework.
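A rough outline of that framework on the m4_monthly example series from {timetk}; the models and data used in the post may differ.

```r
library(modeltime)
library(tidymodels)
library(timetk)

series <- m4_monthly %>% filter(id == "M750") %>% select(date, value)
splits <- initial_time_split(series, prop = 0.9)

model_arima <- arima_reg() %>%
  set_engine("auto_arima") %>%
  fit(value ~ date, data = training(splits))

modeltime_table(model_arima) %>%
  modeltime_calibrate(new_data = testing(splits)) %>%
  modeltime_forecast(new_data = testing(splits), actual_data = series) %>%
  plot_modeltime_forecast(.interactive = FALSE)
```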
Model deployment using {plumber} package in R
This presentation shows the steps on how to deploy a GLM (Logistic Regression) machine learning model created in R via a Plumber API.
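In outline, a Plumber API file exposes the model behind an HTTP endpoint. The file names, endpoint, parameter, and saved model object below are hypothetical, and the prediction line is commented out in favor of a placeholder so the sketch runs on its own.

```r
# plumber.R --------------------------------------------------------------
# model <- readRDS("logit_model.rds")   # previously trained GLM (hypothetical file)

#* Predict the probability of the outcome for a given age
#* @param age:numeric
#* @get /predict
function(age) {
  newdata <- data.frame(age = as.numeric(age))
  # list(prob = predict(model, newdata, type = "response"))
  list(prob = 0.5)   # placeholder response
}

# launch.R ---------------------------------------------------------------
# plumber::plumb("plumber.R")$run(port = 8000)
# then: curl "http://localhost:8000/predict?age=45"
```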
Compare 13 models and select the best using the {caret} R package
Compare the estimated accuracy of different machine learning algorithms (models) and select the most accurate model for your predictive analytics project. When working on a machine learning project, you often have several good models to choose from, and each candidate model needs to be measured for accuracy. To select the best and final model(s), you should use several different methods to estimate the accuracy of your machine learning models. Here are the steps on how to select the best and final model(s) using the Vertical Box-and-Whisker Plot method.
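A condensed version of that workflow on the iris data; the post compares 13 models, and only three are shown here to keep the sketch short.

```r
library(caret)

ctrl <- trainControl(method = "cv", number = 10)

set.seed(7)
fit_lda  <- train(Species ~ ., data = iris, method = "lda",   trControl = ctrl)
set.seed(7)
fit_cart <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)
set.seed(7)
fit_knn  <- train(Species ~ ., data = iris, method = "knn",   trControl = ctrl)

results <- resamples(list(LDA = fit_lda, CART = fit_cart, KNN = fit_knn))
summary(results)
bwplot(results)   # box-and-whisker comparison of resampled accuracy
```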
Using the {rayshader} package to visualize 3D maps in R
{rayshader} is an open-source package for producing 2D and 3D data visualizations in R.
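A tiny example using the built-in volcano elevation matrix.

```r
library(rayshader)

hill <- sphere_shade(volcano, texture = "desert")   # shade the elevation matrix

plot_map(hill)                        # 2D hillshaded map
plot_3d(hill, volcano, zscale = 10)   # interactive 3D scene (opens an rgl window)
render_snapshot()                     # capture the current 3D view
```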
Using the AutoML {forester} package for Tree-based Models
The {forester} package is an AutoML tool in R for tabular data regression and binary classification tasks. It wraps up all machine learning processes into a single train() function.
Spatial statistical modeling and prediction using the {spmodel} package in R
{spmodel} is an R package used to fit, summarize, and predict from a variety of spatial statistical models applied to point-referenced or areal (lattice) data. Parameters are estimated using various methods, including likelihood-based optimization and weighted least squares based on variograms.
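A minimal sketch with simulated point-referenced data; the column names, covariance type, and prediction location are illustrative assumptions.

```r
library(spmodel)

# Simulated point-referenced data (hypothetical)
set.seed(2)
dat   <- data.frame(x = runif(100), y = runif(100), elev = rnorm(100))
dat$z <- 2 + 0.5 * dat$elev + rnorm(100, sd = 0.3)

fit <- splm(z ~ elev, data = dat, spcov_type = "exponential",
            xcoord = x, ycoord = y)
summary(fit)

predict(fit, newdata = data.frame(x = 0.5, y = 0.5, elev = 0))
```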
Using the {sociome} package to identify high deprivation areas in Connecticut.
The ADI scores shown here identify areas of deprivation and affluence within Connecticut communities. Organizations implementing overdose, HIV, and hepatitis C prevention interventions can use this information to identify high-deprivation areas in Connecticut and are encouraged to focus their prevention efforts there.
Exploring data using the vtree package
The {vtree} package is a tool for calculating and displaying variable trees.
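A quick illustration on a small made-up data frame.

```r
library(vtree)

# Made-up data: sex and smoking status for 100 people
set.seed(5)
df <- data.frame(
  sex    = sample(c("F", "M"),    100, replace = TRUE),
  smoker = sample(c("Yes", "No"), 100, replace = TRUE, prob = c(0.2, 0.8))
)

vtree(df, "sex smoker")   # nested counts and percentages of smoking status by sex
```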
Using the {sf} package to count points in polygons via a spatial join
Here are the steps for counting spatial points (i.e., longitude/latitude coordinates) within polygons using the {sf} package.
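The general pattern; the file names and the town_name column are hypothetical.

```r
library(sf)
library(dplyr)

towns  <- st_read("ct_towns.shp")                    # polygon layer (hypothetical file)
points <- read.csv("events.csv") %>%                 # columns lon, lat (hypothetical file)
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  st_transform(st_crs(towns))                        # match the polygon CRS

# Count the points falling inside each polygon
towns$n_events <- lengths(st_intersects(towns, points))

# Equivalent result via a spatial join
st_join(points, towns) %>% st_drop_geometry() %>% count(town_name)
```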
Finding variable importance in a logistic regression model in R
Steps on how to assess variable importance in a logistic regression analysis.
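One common approach is the absolute z-statistic per coefficient, which {caret}'s varImp() reports for glm objects; mtcars is used here as a stand-in dataset.

```r
library(caret)

df    <- mtcars
df$am <- factor(df$am, labels = c("automatic", "manual"))

fit <- glm(am ~ mpg + wt + hp, data = df, family = binomial)

varImp(fit)                  # absolute z-statistic for each predictor
summary(fit)$coefficients    # estimates, standard errors, z-values, p-values
```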
Binary Logistic Regression in R
A logistic regression is used to predict a class (or category) variable (y) based on one or more predictor variables (x).
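The canonical call in base R, using mtcars as an example.

```r
# vs (engine shape, 0/1) as the class to predict from two predictors
fit <- glm(vs ~ mpg + wt, data = mtcars, family = binomial)
summary(fit)

# Predicted probability of vs = 1 for a new observation
predict(fit, newdata = data.frame(mpg = 21, wt = 2.8), type = "response")
```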
Exploratory data analysis (EDA) using {summarytools}
The {summarytools} package allows you to quickly create an EDA report in R.
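For example:

```r
library(summarytools)

dfSummary(iris)          # whole-data-frame summary printed to the console
view(dfSummary(iris))    # render the same summary as an HTML report
freq(iris$Species)       # frequency table for a single variable
descr(iris)              # descriptive statistics for the numeric variables
```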