RPubs

by RStudio

Recently Published

Data 622 Assignment 2

By Darwhin88

Recommending a classifier

about 1 year ago

The most selective colleges (left) & most accepted (up)

By JamesLavin

Colleges toward the left edge of this graph are the most selective in America (1st principal component). I believe accepted applicants are most likely to enroll at schools toward the top (2nd principal component).

about 1 year ago

Case 17

By rbartolo

Binomial Distribution

about 1 year ago

R basic

By xiaoqqjun

Advice for beginners

about 1 year ago

Module 3 - In Class Activity 8

By rburke2024

**Module 3: Moneyball and The Power of Sports Analytics in Baseball** **// In-class activity # 8: Predicting the Number of Runs**

about 1 year ago

Module 3 - In Class Activity 7

By rburke2024

**Module 3: Moneyball and The Power of Sports Analytics in Baseball** **// In-class activity # 7: Predicting the Number of Wins**

about 1 year ago

Module 3 - In Class Activity 7 and 8

By rburke2024

**Module 3: Moneyball and The Power of Sports Analytics in Baseball** **// In-class activity # 7: Predicting the Number of Wins** **// In-class activity # 8: Predicting the Number of Runs**

about 1 year ago

Data 622 Assignment 2

By Tillmawitz

Introduction

In machine learning, experimentation is the systematic process of designing, executing, and analyzing different configurations to identify the settings that perform best on a given task. Experimentation is learning by doing: you systematically change parameters, evaluate the results with metrics, and compare approaches, refining models through controlled experiments to improve their performance. The key is to modify only one or a few variables at a time, so that the impact of each change on model performance can be isolated. In this assignment you will conduct at least 6 experiments. In real life, data scientists run anywhere from a dozen to hundreds of experiments (depending on the dataset and problem domain).

Assignment

This assignment consists of conducting at least two (2) experiments for each of three algorithms: Decision Trees, Random Forest, and AdaBoost. That is, at least six (6) experiments in total (3 algorithms x 2 experiments each). For each experiment you will define what you are trying to achieve (before each run), conduct the experiment, and at the end review how the experiment went. These experiments will allow you to compare algorithms and choose the optimal model. Using the dataset and EDA from the previous assignment, perform the following:

Algorithm Selection

You will perform experiments using the following algorithms:

- Decision Trees
- Random Forest
- AdaBoost

Experiment

For each of the algorithms above, perform at least two (2) experiments. In a typical experiment you should:

- Define the objective of the experiment (hypothesis)
- Decide what will change and what will stay the same
- Select the evaluation metric (what you want to measure)
- Perform the experiment
- Document the experiment so you can compare results (track progress)

Variations

There are many things you can vary between experiments. Here are some examples:

- Data sampling (feature selection)
- Data augmentation, e.g., regularization, normalization, scaling
- Hyperparameter optimization (you decide: random search, grid search, etc.)
- Decision Tree breadth and depth (this is an example of a hyperparameter)
- Evaluation metrics, e.g., accuracy, precision, recall, F1-score, AUC-ROC
- Cross-validation strategy, e.g., holdout, k-fold, leave-one-out
- Number of trees (for ensemble models)
- Train-test split: using different data splits to assess the model's ability to generalize
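The workflow described above (state a hypothesis, change one thing at a time, pick a metric, record the outcome) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the assignment: the `Experiment` record, the helpers `one_at_a_time` and `k_fold_indices`, and the parameter names `max_depth` and `min_samples_leaf` are all assumptions chosen for the example, and no model is actually trained here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Experiment:
    """One planned run: what is tested, what changes, and how it is scored."""
    algorithm: str
    hypothesis: str
    params: dict
    changed: str                     # the single parameter that differs from the baseline
    metric: str = "f1"
    result: Optional[float] = None   # to be filled in after the run


def one_at_a_time(algorithm, baseline, variations, metric="f1"):
    """Build experiments that each change exactly one baseline parameter."""
    experiments = []
    for name, values in variations.items():
        for value in values:
            if value == baseline[name]:
                continue  # same as the baseline, so there is nothing to compare
            experiments.append(Experiment(
                algorithm=algorithm,
                hypothesis=f"Changing {name} from {baseline[name]} to {value} improves {metric}",
                params={**baseline, name: value},
                changed=name,
                metric=metric,
            ))
    return experiments


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical baseline and variations for a decision tree (names are assumptions).
baseline = {"max_depth": 5, "min_samples_leaf": 1}
plan = one_at_a_time("decision_tree", baseline,
                     {"max_depth": [3, 5, 10], "min_samples_leaf": [1, 5]})
folds = list(k_fold_indices(10, 3))

for exp in plan:
    print(exp.algorithm, exp.changed, exp.params)
```

Each entry in `plan` differs from the baseline in exactly one parameter, so any change in the chosen metric can be attributed to that parameter. The folds from `k_fold_indices` are contiguous and unshuffled; in practice the data would usually be shuffled (and possibly stratified) first.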

about 1 year ago

Assignment 4

By sreeja05

about 1 year ago

DsLabs

By LA_NUIT

A look into stars

about 1 year ago

Plot

By Y-l-c

about 1 year ago

Plot

By Y-l-c

about 1 year ago

