Description
The best way to go about learning object detection is to implement the algorithms by yourself, from scratch. This is exactly what we'll do in this tutorial.
Summary
- Object detection is a domain that has benefited immensely from the recent developments in deep learning.
- Generally, stride of any layer in the network is equal to the factor by which the output of the layer is smaller than the input image to the network.
- In YOLO v3 (and it's descendants), the way you interpret this prediction map is that each cell can predict a fixed number of bounding boxes.
- We divide the input image into a grid just to determine which cell of the prediction feature map is responsible for prediction Anchor Boxes It might make sense to predict the width and the height of the bounding box, but in practice, that leads to unstable gradients during training.