Recently Published
MSP4092: Biomathematics: quantitative aspects of life: Lecture 4: Graphical method (Chapter 7)
Batschelet, E. (1978) Introdução à matemática para biocientistas. Translation of the 2nd ed. São Paulo: EDUSP and Rio de Janeiro: Interciência.
Brazilian GDP
carry over
Brazilian GDP
contribution to annual growth
A Validation-Based Model Selection Strategy for Breast Cancer Diagnosis Using Logistic Regression
This analysis investigates the use of logistic regression models to predict malignancy in breast cancer based on tumor characteristics derived from digitized medical images. Using the Breast Cancer Wisconsin (Diagnostic) dataset, we:
Conducted exploratory data analysis to visualize key variables.
Performed feature selection using both statistical significance and multicollinearity checks.
Split the dataset into 60% training, 20% validation, and 20% test sets to evaluate generalization.
Fitted multiple logistic regression models, refining them iteratively based on AIC, deviance, accuracy, and ROC/AUC.
Identified a final model with three key predictors: texture_mean, concavity_mean, and radius_mean.
Validated the final model on the test set, achieving strong predictive performance with high sensitivity, specificity, and an AUC of 0.974.
Visualized the effect of predictors on malignancy probability using ggplot2, pROC, and ggpmisc.
This project demonstrates the importance of model validation, feature interpretability, and visualization in clinical predictive modeling, and offers a reproducible pipeline for diagnostic model development using logistic regression.
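The 60/20/20 split and the AUC metric mentioned above can be illustrated with a minimal sketch. This is not the author's code: the original analysis appears to rely on R packages such as pROC, while this example uses only the Python standard library. The function names split_60_20_20 and auc, and the seed parameter, are hypothetical names chosen for illustration.

```python
import random


def split_60_20_20(rows, seed=0):
    """Shuffle rows and split them into 60% train, 20% validation, 20% test.

    Hypothetical helper: the source only states the proportions,
    not how the split was performed.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 6 // 10  # integer arithmetic avoids float rounding
    n_val = n * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case (label 1)
    scores higher than a randomly chosen negative case (label 0);
    ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

In a workflow like the one described, candidate models would be compared on the validation portion, and the test portion would be scored only once, for the final model. The pairwise AUC here is quadratic in the number of cases, which is acceptable for a dataset of a few hundred rows.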
Delta Tours
Revised Website