Darren

Recently Published

RNN with GRU
One common issue with Recurrent Neural Networks is exploding or vanishing gradients: during backpropagation, the gradient flowing back through the previous state (h(t-1)) and its weight can grow extremely large or, more commonly, shrink to practically zero. Since our sequences are long (1,440 data points), this needs to be addressed. One way to address it is a Gated Recurrent Unit (GRU), which changes the hidden state equation to include an update gate and a reset gate. The update gate balances how much of the new hidden state to incorporate compared to the previous hidden state, and the reset gate controls how much past information to forget when computing the new hidden state; a sketch of a single GRU step is shown below. The rest of this model ran the same way as our vanilla RNN, with 10 epochs, measuring MSE and MAE.
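For concreteness, here is a minimal NumPy sketch of one GRU step. The weight and bias names (W_z, U_z, and so on) are hypothetical and follow the standard GRU formulation, not necessarily the exact parameterization used by the library in the post.

```python
import numpy as np

def sigmoid(a):
    # Squash gate pre-activations into (0, 1)
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, U_z, b_z, W_r, U_r, b_r, W_h, U_h, b_h):
    # Update gate: how much of the candidate state to blend in
    z = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)
    # Reset gate: how much past information to forget
    r = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)
    # Candidate hidden state, computed from the (partially reset) past
    h_tilde = np.tanh(W_h @ x_t + U_h @ (r * h_prev) + b_h)
    # Blend the previous state and the candidate according to the update gate
    return (1 - z) * h_prev + z * h_tilde
```

When z is near zero, the previous hidden state is carried forward almost unchanged, which is what helps gradients survive long sequences like ours.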
NHANES Vanilla RNN
A Recurrent Neural Network (RNN) is a model that takes in functional data one time point (t) at a time, computing a hidden state h(t) from the current input x(t) and the previous hidden state h(t-1), each multiplied by a learned weight matrix (W), shifted by a learned bias term (b), and passed through a tanh activation:

h(t) = tanh(W_h · h(t-1) + W_x · x(t) + b)

Scalar, non-functional data is input as a flat vector and passed through a ReLU activation, since that relationship is simpler. We trained the model using a validation split of 0.2, meaning that the RNN cycles through 80% of the training data, updating the model's weights as it goes, and then uses the remaining 20% of the training data to measure performance via mean squared error (MSE) and mean absolute error (MAE). The RNN then adjusts the weights by backpropagation and repeats this process; each pass is called an epoch. In each epoch, the model uses the Adam (Adaptive Moment Estimation) optimizer, which does particularly well with noisy or high-dimensional data. Our model typically stopped improving after about 8 epochs, so we set the RNN to run 10 epochs, with the idea that this would play a role similar to 10-fold cross-validation. A sketch of this setup appears below.

Variable importance is found through a permutation-based loop: for each variable, its values are randomly shuffled across all subjects, breaking any true association with the outcome. The shuffled data is passed through the trained model, and the increase in RMSE is recorded; a larger increase indicates greater importance of that variable.
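Below is a minimal sketch of this architecture and the permutation-importance loop, assuming Keras. The array names (X_func, X_scalar, y), layer sizes, and the scalar-variable count are illustrative assumptions, not the exact configuration from the post.

```python
import numpy as np
from tensorflow.keras import layers, Model

n_timepoints, n_scalar = 1440, 10  # 1,440 functional points; scalar count is illustrative

# Functional branch: h(t) = tanh(W_h * h(t-1) + W_x * x(t) + b)
func_in = layers.Input(shape=(n_timepoints, 1), name="functional")
h = layers.SimpleRNN(32, activation="tanh")(func_in)

# Scalar branch: flat vector passed through a ReLU layer
scalar_in = layers.Input(shape=(n_scalar,), name="scalar")
s = layers.Dense(16, activation="relu")(scalar_in)

out = layers.Dense(1)(layers.concatenate([h, s]))
model = Model([func_in, scalar_in], out)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
# model.fit([X_func, X_scalar], y, epochs=10, validation_split=0.2)

def permutation_importance(model, X_func, X_scalar, y, seed=0):
    """Increase in RMSE when each scalar variable is shuffled across subjects."""
    rng = np.random.default_rng(seed)
    base = np.sqrt(np.mean((model.predict([X_func, X_scalar]).ravel() - y) ** 2))
    importances = {}
    for j in range(X_scalar.shape[1]):
        X_perm = X_scalar.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the true association
        rmse = np.sqrt(np.mean((model.predict([X_func, X_perm]).ravel() - y) ** 2))
        importances[j] = rmse - base
    return importances
```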
NHANES fSIR
Functional Sliced Inverse Regression (fSIR) is a dimension reduction technique that separates the training data into H slices, grouping subjects by their outcome, in this case Systolic Blood Pressure. The inverse regression of the functional data given Systolic Blood Pressure is then used to build and train the model. The model produces several sufficient predictors that summarize the functional data, each capturing a decreasing share of its relationship with the outcome. We found that the model performed best using two of these sufficient predictors and H = 5 slices. As with all our models, we used 10-fold cross-validation: the data is split into ten folds, the model trains on nine of them and is tested on the remaining one, and the process repeats so that each fold serves as the test set once. We also smoothed the functional data using 20 basis functions, consistent with our other models. To measure performance, we trained a linear regression model using the sufficient predictors and the scalar variables; a from-scratch sketch of the slicing step is shown below.
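To make the slicing idea concrete, here is a small NumPy sketch of sliced inverse regression applied to basis coefficients of the smoothed curves. The names (coefs for the 20 basis coefficients per subject, sbp for Systolic Blood Pressure) are hypothetical, and this is one common way to implement the functional version, not necessarily the exact fSIR routine used in the post.

```python
import numpy as np

def sir_sufficient_predictors(X, y, n_slices=5, n_components=2):
    """Sliced inverse regression on a matrix X of basis coefficients."""
    n, p = X.shape
    # Standardize (whiten) the predictors using the covariance of X
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-10))) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Group subjects into H slices by sorting on the outcome (SBP)
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the slice means of the standardized predictors
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original scale, give the
    # directions; projecting onto them yields the sufficient predictors
    _, eigvecs = np.linalg.eigh(M)
    top = eigvecs[:, ::-1][:, :n_components]
    directions = inv_sqrt @ top
    return (X - mean) @ directions

# Hypothetical usage: feed these into a linear regression with the scalar covariates
# suff = sir_sufficient_predictors(coefs, sbp, n_slices=5, n_components=2)
```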