Recently Published
{Rilostat} and {tidyverse}
Analysis of the Kaitz index at the international level using data from the International Labour Organization (ILO) through the {Rilostat} package in R. We explore the relationship between the minimum wage and the average wage across countries, applying data manipulation techniques with tidyverse and visualizations with ggplot2 for a clear and effective understanding.
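As a rough illustration of the quantity discussed in this entry, the Kaitz index is commonly taken to be the ratio of the minimum wage to the average wage. The sketch below is not the author's code (the original analysis is in R with {Rilostat}); the function name `kaitz_index`, the country labels, and all wage figures are hypothetical placeholders.

```python
# Kaitz index sketch: minimum wage divided by average wage.
# All figures below are made-up placeholders, not ILO data.


def kaitz_index(minimum_wage: float, average_wage: float) -> float:
    """Return the ratio of the minimum wage to the average wage."""
    if average_wage <= 0:
        raise ValueError("average_wage must be positive")
    return minimum_wage / average_wage


# Hypothetical monthly wages in local currency units: (minimum, average).
wages = {
    "Country A": (500.0, 1250.0),
    "Country B": (900.0, 1500.0),
    "Country C": (300.0, 1200.0),
}

kaitz = {
    country: kaitz_index(minimum, average)
    for country, (minimum, average) in wages.items()
}

# Rank countries from the highest ratio to the lowest.
ranking = sorted(kaitz, key=kaitz.get, reverse=True)

for country in ranking:
    print(f"{country}: {kaitz[country]:.2f}")
```

A higher ratio means the minimum wage sits closer to the average wage; comparing ratios rather than raw wages is what makes the index usable across countries with different currencies.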
mtcars Linear Regression
Using R to analyze how fuel efficiency relates to vehicle weight.
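A minimal sketch of the kind of fit this entry describes: a one-predictor least-squares regression of miles per gallon on weight. It is written in plain Python rather than R, the helper `fit_line` is a hypothetical name, and the six (weight, mpg) pairs are illustrative values only, not the full mtcars dataset.

```python
# Simple least-squares fit of fuel efficiency (mpg) on vehicle weight.
from statistics import mean

# Illustrative (weight in 1000 lb, miles per gallon) pairs only.
weight = [2.620, 2.875, 2.320, 3.215, 3.440, 3.460]
mpg = [21.0, 21.0, 22.8, 21.4, 18.7, 18.1]


def fit_line(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(weight, mpg)

# Residuals of the fitted line; they sum to (numerically) zero for OLS.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(weight, mpg)]

# Predicted mpg for a hypothetical 3000 lb vehicle.
predicted = intercept + slope * 3.0

print(f"intercept={intercept:.2f}, slope={slope:.2f}, predicted={predicted:.2f}")
```

The sign of the slope is the quantity of interest here: a negative value indicates that heavier vehicles in the sample tend to have lower fuel efficiency.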
P4. Charts
From the course SOC294, PUCP 2025-1. Includes charts: i) pie, ii) boxplot, iii) histogram, iv) customization with ggplot2. Based on an adapted dataset from Latinobarómetro 2022.
Introduction to R
Basic Aspects of the R Language
Data Mining - Predicting Credit Approval with Decision Trees
Business Intelligence Course
Analysis of Incomplete Data
This work formalizes a framework for handling missing data, progressing from mechanism identification (via Little's MCAR test) to optimized treatment strategies, including Multiple Imputation for uncertainty quantification and EM algorithms for multivariate missingness. The structured approach balances theoretical rigor with practical implementation across data patterns.
⦿ 1. Problem Identification
◈ The Impact of Missing Data
▣ Biased parameter estimates
▣ Reduced statistical power
▣ Compromised generalizability
◈ Limitations of Traditional Methods
▣ Listwise deletion: Inefficient and often invalid
▣ Ad hoc fixes: May introduce new biases
⦿ 2. Theoretical Foundations
◈ Missing Data Patterns
▣ Monotone vs. arbitrary patterns
◈ Missing Data Mechanisms
▣ MCAR (Missing Completely at Random)
▣ MAR (Missing at Random)
▣ MNAR (Missing Not at Random)
◈ Diagnostic Tools
▣ Little's MCAR Test (formal hypothesis testing)
⦿ 3. Basic Solutions
◈ Single Imputation Techniques
▣ Mean/median imputation (with caveats)
▣ Random hot-deck imputation
◈ Preliminary Analysis
▣ Bootstrapping for robust parameter estimation
⦿ 4. Advanced Methods
◈ Multiple Imputation
▣ Creates multiply-imputed datasets
▣ Accounts for imputation uncertainty
◈ Model-Based Approaches
▣ Maximum Likelihood Estimation (MLE)
▣ Expectation-Maximization (EM) Algorithm
◆ Bivariate EM applications
◆ Extension to multivariate data
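Two items from the outline above can be sketched briefly: the caveat attached to mean imputation (it shrinks the variance of the variable) and the way multiple imputation carries imputation uncertainty into the final estimate. The pooling step below uses Rubin's rules, which is the standard approach but is an assumption about what the original work does; the function names and all numbers are hypothetical.

```python
# Mean imputation and Rubin's rules, on made-up numbers.
from statistics import mean, variance


def mean_impute(values):
    """Replace each None with the mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def pool_rubin(estimates, within_variances):
    """Pool m point estimates and their variances using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    w = mean(within_variances)         # average within-imputation variance
    b = variance(estimates)            # between-imputation variance (m - 1 denominator)
    total = w + (1 + 1 / m) * b        # total variance of the pooled estimate
    return q_bar, total


# Hypothetical sample with two missing entries.
sample = [4.0, None, 6.0, 8.0, None, 10.0]
observed = [v for v in sample if v is not None]
imputed = mean_impute(sample)

var_observed = variance(observed)
var_imputed = variance(imputed)

# Hypothetical estimates of one parameter from m = 3 imputed datasets,
# each with its own estimated sampling variance.
estimates = [5.0, 5.4, 4.6]
within = [0.20, 0.22, 0.18]
pooled, total_var = pool_rubin(estimates, within)

print(f"variance before/after mean imputation: {var_observed:.2f} / {var_imputed:.2f}")
print(f"pooled estimate: {pooled:.2f}, total variance: {total_var:.3f}")
```

Mean imputation leaves the mean unchanged but pulls the variance down, which is why the outline lists it with caveats. In the pooling step, the total variance exceeds the average within-imputation variance whenever the imputed datasets disagree, and that excess is the imputation uncertainty.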
Predicting Customer Detractors (Part 1): Analyzing Contextual Factors Via Logistic Regression
This case study aims to identify key factors that influence a customer's likelihood to recommend the company after interacting with customer service.
Methodology: The project utilizes a comprehensive analytical approach, including:
- Data Simulation & Cleaning: Creating and preparing the dataset for analysis.
- Exploratory Data Analysis: Using data visualization (e.g., heatmaps) and descriptive statistics to uncover patterns across multiple interacting factors.
- Statistical Modeling: Evaluating different regression models (linear, ordinal, binomial) to predict a customer's likelihood to recommend the company.
- Simulation-Based Recommendations: Using model predictions to evaluate the impact of different actions.
- Reusable Functions: The creation of functions to automate procedures.
Tools & Libraries: R with a focus on libraries such as car, VGAM, ordinal, psych, vcd, coefplot, ggplot2, tidyr, dplyr, openxlsx, and readxl.
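The binomial modeling and simulation steps listed above can be sketched in a few lines. This is not the case study's code, which is in R. The detractor cutoff of 6 on a 0-10 scale follows the usual Net Promoter convention and is an assumption, and the predictors `wait_minutes` and `resolved`, the coefficient values, and the scores are all hypothetical.

```python
# Detractor labeling and a logistic prediction with made-up coefficients.
import math


def is_detractor(score: int, cutoff: int = 6) -> int:
    """Return 1 if a 0-10 recommendation score is at or below the cutoff, else 0."""
    return 1 if score <= cutoff else 0


def logistic(z: float) -> float:
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def detractor_probability(wait_minutes: float, resolved: int, coef: dict) -> float:
    """Predicted probability of being a detractor under the given coefficients."""
    z = (
        coef["intercept"]
        + coef["wait_minutes"] * wait_minutes
        + coef["resolved"] * resolved
    )
    return logistic(z)


# Hypothetical survey scores and the binary outcome a binomial model would use.
scores = [10, 3, 7, 6, 9]
labels = [is_detractor(s) for s in scores]

# Hypothetical fitted coefficients on the log-odds scale.
coef = {"intercept": -1.0, "wait_minutes": 0.08, "resolved": -1.5}

# Simulate one action: resolving the issue, holding wait time at 20 minutes.
baseline = detractor_probability(wait_minutes=20, resolved=0, coef=coef)
improved = detractor_probability(wait_minutes=20, resolved=1, coef=coef)

print(f"labels: {labels}")
print(f"P(detractor) unresolved: {baseline:.2f}, resolved: {improved:.2f}")
```

Comparing predicted probabilities for two scenarios that differ in a single input is one simple way to express the impact of an action, which is the idea behind the simulation-based recommendations.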