Recently Published
Document
HW 4
Linear Regression and Its Cousins
This project analyzes high-dimensional regression techniques using two datasets: the Tecator meat spectroscopy data and a pharmaceutical permeability dataset. For the Tecator data, five regression methods (PCR, PLS, Ridge, Lasso, and Elastic Net) were compared on predicting moisture and fat content from 100 spectroscopy measurements, with PLS emerging as the best performer using 18 components. Principal Component Analysis revealed that the effective dimensionality of the spectroscopy data is much lower than its 100 original variables suggest, with 95% of the variance captured by just a few components. The permeability analysis used molecular fingerprints to predict drug permeability, comparing seven methods, including PLS, PCR, regularization techniques, KNN, and SVM, after filtering out near-zero-variance predictors. The PLS model tuned by cross-validation demonstrated strong predictive performance, though the results suggest it should be used for screening compounds rather than as a full replacement for laboratory experiments.
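The point about effective dimensionality can be illustrated with a minimal sketch. It does not use the Tecator data or reproduce the project's results: it assumes synthetic "spectra" built from three hidden factors plus a little noise, and the names `make_synthetic_spectra` and `components_for_variance` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def make_synthetic_spectra(n_samples=200, n_channels=100, n_latent=3,
                           noise=0.01, seed=0):
    """Build synthetic, highly correlated 'spectra' (an assumption, not Tecator).

    Each sample is a mix of a few latent factors spread across many
    channels, plus small independent noise.
    """
    rng = np.random.default_rng(seed)
    scores = rng.normal(size=(n_samples, n_latent))
    loadings = rng.normal(size=(n_latent, n_channels))
    return scores @ loadings + noise * rng.normal(size=(n_samples, n_channels))


def components_for_variance(X, threshold=0.95):
    """Return (k, cum): the smallest number of principal components whose
    cumulative explained-variance ratio reaches `threshold`, and the full
    cumulative ratio curve."""
    Xc = X - X.mean(axis=0)                  # center each channel
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    var = s ** 2                             # proportional to component variances
    cum = np.cumsum(var) / var.sum()
    # First index where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(cum)), cum


X = make_synthetic_spectra()
k, cum = components_for_variance(X, threshold=0.95)
print(f"{X.shape[1]} channels, {k} component(s) reach 95% of the variance")
```

On real spectroscopy data the same function could be applied to the predictor matrix after loading it. How many components to keep for prediction is a separate question, usually settled by cross-validation rather than by a variance threshold alone.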
Payroll - Phase 2
Programming Class - Phase 2