Archive: Two-layer neural net
Note: Originally posted April 7th, 2021, this is a post in the archived Deep Learning for Computer Vision series (cs231n).
- Browse the full cs231n series.
- See the source code in the two layer net notebook.
New comments are found exclusively in info boxes like this one.
I referenced the following course notes to complete this assignment: optimization-2, neural-networks-1, and neural-networks-2.
In a slightly confusing move, the toy example in this assignment has you increase dimensionality between layers. This is consistent with the discussion in this neural-networks-1 section, but it goes against the samples in neural-networks-2 and the visuals in Wikipedia’s CNN article. All of this supports my preconceived notion that increasing dimensionality isn’t productive. I guess the point at this early stage is that it’s a technically valid way to construct layers in a neural network (though prone to overfitting).
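To make the shape expansion concrete, here is a minimal NumPy sketch of a toy two-layer forward pass whose hidden layer is wider than its input. The specific sizes (4 inputs, 10 hidden units, 3 classes) are assumptions chosen to illustrate the point, not necessarily the notebook’s exact values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, H, C = 5, 4, 10, 3          # samples, input dim, hidden dim, classes (assumed toy sizes)

X = rng.standard_normal((N, D))
W1 = 1e-1 * rng.standard_normal((D, H))   # (4, 10): expands dimensionality
b1 = np.zeros(H)
W2 = 1e-1 * rng.standard_normal((H, C))   # (10, 3): projects back down to class scores
b2 = np.zeros(C)

hidden = np.maximum(0, X @ W1 + b1)       # ReLU activations, shape (N, 10)
scores = hidden @ W2 + b2                 # class scores, shape (N, 3)
print(hidden.shape, scores.shape)         # (5, 10) (5, 3)
```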
At this point, I have implemented the code that calculates scores, loss, and gradients properly. Next up is defining the training logic again and what appears to be some hyperparameter tuning at the end. As before, now that the gradients (with backpropagation) are handled, I should be through the hard part.
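For reference, the scores/loss/gradients step looks roughly like the sketch below: softmax loss with L2 regularization, then backprop through affine → ReLU → affine. This is my own minimal version under those assumptions, not the assignment’s exact code, and the function name and shapes are illustrative.

```python
import numpy as np

def loss_and_grads(X, y, W1, b1, W2, b2, reg=1e-3):
    """Softmax loss + L2 regularization and backprop for a two-layer ReLU net."""
    N = X.shape[0]

    # Forward pass: affine -> ReLU -> affine gives class scores.
    hidden = np.maximum(0, X @ W1 + b1)
    scores = hidden @ W2 + b2

    # Softmax loss, shifted for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    loss += reg * ((W1 * W1).sum() + (W2 * W2).sum())

    # Backward pass: chain rule from the scores back to each parameter.
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1.0
    dscores /= N
    dW2 = hidden.T @ dscores + 2 * reg * W2
    db2 = dscores.sum(axis=0)
    dhidden = dscores @ W2.T
    dhidden[hidden <= 0] = 0.0            # ReLU gate: no gradient where the unit was off
    dW1 = X.T @ dhidden + 2 * reg * W1
    db1 = dhidden.sum(axis=0)
    return loss, {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}
```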
Update (2021): The assignment is complete. With a 2-layer fully-connected neural net I get 52% validation accuracy and 51.7% test accuracy. Both meet the suggested targets (‘good’ accuracies are >48%, and the instructor’s best classifier hits >52% validation accuracy).
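The training and tuning that produced those numbers followed the usual pattern: minibatch SGD over a small grid of learning rates and regularization strengths, keeping whichever setting scores best on the validation set. The sketch below reuses loss_and_grads from the previous sketch; the synthetic data stands in for CIFAR-10, and every grid value is an assumption for illustration, not the setting I actually used.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, C, N = 20, 50, 3, 200                         # assumed toy sizes
X_tr, y_tr = rng.standard_normal((N, D)), rng.integers(0, C, N)
X_val, y_val = rng.standard_normal((50, D)), rng.integers(0, C, 50)

def train(lr, reg, iters=200, batch=32):
    """Minibatch SGD on the toy data; returns validation accuracy."""
    W1 = 1e-2 * rng.standard_normal((D, H)); b1 = np.zeros(H)
    W2 = 1e-2 * rng.standard_normal((H, C)); b2 = np.zeros(C)
    for _ in range(iters):
        idx = rng.integers(0, N, batch)             # sample a minibatch (with replacement)
        _, g = loss_and_grads(X_tr[idx], y_tr[idx], W1, b1, W2, b2, reg)
        W1 -= lr * g["W1"]; b1 -= lr * g["b1"]      # vanilla SGD update
        W2 -= lr * g["W2"]; b2 -= lr * g["b2"]
    hidden = np.maximum(0, X_val @ W1 + b1)
    return ((hidden @ W2 + b2).argmax(axis=1) == y_val).mean()

# Tiny grid sweep: keep the (accuracy, lr, reg) tuple with the best accuracy.
best = max((train(lr, reg), lr, reg)
           for lr in (1e-2, 1e-1) for reg in (1e-4, 1e-3))
print("best val acc %.2f at lr=%g reg=%g" % best)
```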
I’ve begun the higher-level features notebook (A1Q5). When that’s done I’ll move straight into assignment 2.