Deep Learning
Deep Neural Networks
A neural network is "deep" if it has more than one hidden layer. Deep neural networks (DNNs) act as feature extractors: each layer detects something more complex than the previous one. For instance, early layers can detect edges, later layers certain curves, and the final layers entire objects.
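As a minimal sketch (PyTorch assumed; the layer sizes 784, 128, 64, and 10 are arbitrary placeholders), a network with two hidden layers already counts as deep:

```python
import torch.nn as nn

# A minimal "deep" network: two hidden layers instead of one.
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),  # first hidden layer: low-level features
    nn.Linear(128, 64), nn.ReLU(),   # second hidden layer: combinations of those features
    nn.Linear(64, 10),               # output layer (e.g., 10 classes)
)
```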
Convolutional Neural Networks
With standard image flattening, all the spatial information is lost (which pixels are neighbors, etc.). This can be improved by using a kernel, a small matrix that slides across the image one pixel at a time (the convolution operation). Each output value then carries information about the neighborhood of the corresponding pixel.
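A minimal NumPy sketch of the convolution operation (stride 1, no padding; the kernel values here are just an illustrative edge detector, not taken from the notes above):

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` across `image` one pixel at a time (no padding)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel summarizes a neighborhood of the input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: a 3x3 edge-detecting kernel applied to a random 8x8 "image".
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]])
feature_map = convolve2d(np.random.rand(8, 8), edge_kernel)  # shape (6, 6)
```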
Convolution layers can be stacked to extract more complex structures in the image. An example of the layers in a convolutional neural network (CNN), sketched in code after the list, is:
- input
- convolution + ReLU
- pooling
- convolution + ReLU
- pooling
- ...
- flatten
- fully connected
- softmax
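A minimal PyTorch sketch of this stack (the channel counts, the assumed 28x28 single-channel input, and the 10-class output are placeholders, not prescribed above):

```python
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),   # convolution + ReLU
    nn.MaxPool2d(2),                                         # pooling
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # convolution + ReLU
    nn.MaxPool2d(2),                                         # pooling
    nn.Flatten(),                                            # flatten
    nn.Linear(32 * 7 * 7, 10),                               # fully connected (28x28 input -> 7x7 after two poolings)
    nn.Softmax(dim=1),                                       # softmax
)
```

In practice the softmax is often folded into the loss function instead (e.g., nn.CrossEntropyLoss expects raw logits).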
Overfitting can result in incorrect classification on new data if the CNN memorizes particular pixel patterns in the training set rather than learning general features.
Transfer learning
Since training a CNN from scratch can take a long time, one can take the feature extraction weights from a different model, modify the final classification layers, and achieve a trained model much faster.
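A sketch of this idea using a pretrained ResNet-18 from torchvision (an assumed choice of source model; the `weights` argument syntax varies by torchvision version):

```python
import torch.nn as nn
from torchvision import models

# Load a model pretrained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature-extraction weights.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classification layer for the new task (e.g., 5 classes).
model.fc = nn.Linear(model.fc.in_features, 5)  # only this layer will train
```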
Downscaling Images
Since each convolution layer produces many feature maps, downscaling them reduces the amount of memory (and computation) needed.
Max pooling operation
A small window scans across the image, and at each step only the maximum pixel value is kept, downscaling the image.
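A minimal NumPy sketch, assuming non-overlapping 2x2 windows (i.e., stride equal to the window size):

```python
import numpy as np

def max_pool2d(image, size=2):
    """Downscale by taking the max over non-overlapping size x size windows."""
    h, w = image.shape
    h, w = h - h % size, w - w % size               # trim edges that don't fit
    windows = image[:h, :w].reshape(h // size, size, w // size, size)
    return windows.max(axis=(1, 3))                 # max over each window

# A 4x4 image becomes 2x2, keeping the strongest value per window.
img = np.arange(16).reshape(4, 4)
print(max_pool2d(img))  # [[ 5  7], [13 15]]
```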
Deep Q Networks
Approximating Q values
For systems where there are a lot of input states, a neural network can be used to approximate Q values for a given state. Providing a state $s$ to the network yields one output value per action: $Q(s, a_1), Q(s, a_2), \ldots, Q(s, a_n)$.
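A minimal sketch of such a Q-network (PyTorch assumed; `state_dim`, `n_actions`, and the hidden sizes are placeholders):

```python
import torch.nn as nn

state_dim, n_actions = 4, 2  # placeholder dimensions

# State in, one Q value per action out.
q_net = nn.Sequential(
    nn.Linear(state_dim, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, n_actions),  # Q(s, a_1), ..., Q(s, a_n)
)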
Training the model
Instead of having one model, we keep two copies: the target DNN and the policy DNN. The target network provides stable Q-value targets for the loss that is backpropagated through the policy network. Every once in a while, the policy network's weights are copied into the target network.
For a loss function, we can take the squared difference between the target and predicted Q values.
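Putting the pieces together, a hedged sketch (PyTorch assumed; `q_net` is the placeholder network from the previous sketch, and `gamma` is the usual discount factor):

```python
import copy
import torch
import torch.nn as nn

policy_net = q_net                       # placeholder network from above
target_net = copy.deepcopy(policy_net)   # frozen copy, synced periodically

def dqn_loss(states, actions, rewards, next_states, dones, gamma=0.99):
    # Q values predicted by the policy network for the actions taken.
    q_pred = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Targets computed with the (fixed) target network.
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * q_next * (1 - dones)
    # Squared difference between target and predicted Q values.
    return nn.functional.mse_loss(q_pred, q_target)

# Periodic sync: target_net.load_state_dict(policy_net.state_dict())
```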
Limitations
DQNs are only well-suited for discrete action spaces: the network outputs one Q value per action, so continuous actions cannot be enumerated. Additionally, acting greedily over Q values yields a deterministic policy, so DQNs cannot model stochastic policies.
Policy Gradient Methods
To achieve a stochastic policy, we can optimize the policy directly: the network outputs the probability of taking each action given the input state.
For a continuous action space, we can instead have the neural network output parameters for a certain distribution (e.g., the mean and variance of a Gaussian).
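A sketch of both cases (PyTorch assumed; sizes are placeholders, and the Gaussian head with a learned log standard deviation is one common choice, not prescribed above):

```python
import torch
import torch.nn as nn

state_dim, n_actions = 4, 2  # placeholder dimensions

# Discrete actions: output a probability per action (stochastic policy).
discrete_policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.ReLU(),
    nn.Linear(64, n_actions),
    nn.Softmax(dim=-1),
)

# Continuous actions: output distribution parameters instead
# (here, the mean of a Gaussian plus a learned log-std).
class GaussianPolicy(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.mean = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim),
        )
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, state):
        return self.mean(state), self.log_std.exp()  # mean, std
```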
Given the policy's action probabilities, we can use the standard policy gradient (REINFORCE) loss, $L(\theta) = -\sum_t \log \pi_\theta(a_t \mid s_t)\, G_t$, where $G_t$ is the discounted return from time step $t$.
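A minimal sketch of that loss for the discrete case (PyTorch assumed; `policy_net` is the softmax policy sketched earlier, and `returns` holds precomputed $G_t$ values):

```python
import torch
from torch.distributions import Categorical

def reinforce_loss(policy_net, states, actions, returns):
    """REINFORCE loss: -sum_t log pi(a_t | s_t) * G_t."""
    probs = policy_net(states)          # action probabilities per state
    dist = Categorical(probs)
    log_probs = dist.log_prob(actions)  # log pi(a_t | s_t)
    # Minimizing this increases the probability of high-return actions.
    return -(log_probs * returns).sum()
```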