JanisCorona

Janis Harris

Recently Published

Fibromyalgia RNA-Seq Gene Expression Analysis 12 samples Bootstrap Random Forest Model
This project begins an analysis of gene expression data on trigger-point myofascial pain, a chronic pain condition whose clinical signs and symptoms resemble fibromyalgia. The data come from a study of 5 healthy controls and 7 myofascial pain patients that helped researchers understand how a drug whose name starts with 'dex' helps with chronic pain. The dataset includes both fragments per kilobase of transcript per million mapped reads (FPKM) and counts, each normalized from the high-throughput fastp data collected. This study is a stepping stone toward connecting major illnesses associated with Epstein-Barr virus (EBV), such as multiple sclerosis, mononucleosis, Hodgkin's disease, and fibromyalgia. Eventually, it may help explain how changes in the body shift transcription so that some genes are expressed more and others less, across the 58,000 genes in this study. Details are in the document.
Lyme Disease Top Features in Predicting State of illness
This project uses the R packages tidyr, dplyr, caret, and kernlab to manipulate data from NCBI gene studies. Six models are fit with 10-fold cross-validation and compared on accuracy: KNN, rpart, random forest, linear discriminant analysis, and support vector machines with radial and linear kernels. Summary results are then shown. The plots did not display properly in knitr and LaTeX, so they were block-commented out. The top genes appear to be involved in upregulation of lipid regulators, DNA repair, and bile production (to digest more fats and cholesterol), along with downregulated mitotic activity in cell replication. The data span acute infection through chronic infection up to six months. There are only 86 samples, and the data are not balanced for chronic infection. Accuracy could be improved with further tuning and better model parameters. For four classes the best model was rpart, but see the notes in the document for why.
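To make the evaluation scheme concrete, here is a minimal sketch of 10-fold cross-validated accuracy. It is not the author's caret code: it is written in plain Python, uses a 1-nearest-neighbor classifier as a stand-in for the six models, and runs on synthetic two-class data rather than the NCBI gene data. The function names (`k_fold_indices`, `predict_1nn`, `cross_val_accuracy`) and the synthetic dataset are assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train_x, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def cross_val_accuracy(xs, ys, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for fold in folds:
        held_out = set(fold)
        # Train on every sample that is not in the current fold.
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        correct = sum(predict_1nn(train_x, train_y, xs[i]) == ys[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k


# Synthetic stand-in data (assumption): 86 samples, 2 well-separated classes,
# 3 features per sample. This is NOT the gene expression data.
rng = random.Random(42)
ys = [i % 2 for i in range(86)]
xs = [[rng.gauss(5.0 * y, 1.0) for _ in range(3)] for y in ys]

accuracy = cross_val_accuracy(xs, ys, k=10)
print(f"10-fold CV accuracy: {accuracy:.3f}")
```

Each sample is held out exactly once, so every accuracy figure is measured on data the model did not see during fitting. With only 86 samples and unbalanced classes, per-fold accuracy can vary considerably, which is one reason to read the averaged figure with caution.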