Description
Posted by Artsiom Ablavatski and Marat Dukhan, Software Engineers, Google Research. On-device inference of neural networks enables a variety of real-time applications, like pose estimation and background blur, in a low-latency and privacy-conscious way.
Summary
- On-device inference of neural networks enables a variety of real-time applications, like pose estimation and background blur, in a low-latency and privacy-conscious way.
- In modern on-device inference engines, like XNNPACK, the implementation of 1x1 convolutions, as well as other operations in deep learning models, relies on the HWC tensor layout, in which the tensor dimensions correspond to the height, width, and channels (e.g., red, green, or blue) of the input image (see the sketch after this list).
- Compared with the dense model, the sparse model improved inference speed by a factor of two while achieving landmark quality identical to the distilled model.
- The processing time of the dense model is roughly 2x that of the sparse or distilled models.
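The sketch below is a minimal illustration, in plain NumPy with made-up tensor sizes, of why the HWC layout is convenient for 1x1 convolutions; it is not XNNPACK's actual kernel code. In HWC, the channel values of each pixel are contiguous in memory, so a 1x1 convolution reduces to one matrix multiplication over the channel dimension, and zeroing entries of that weight matrix (sparsity) directly removes multiply-accumulates.

```python
import numpy as np

# Input activation tensor in HWC layout: height x width x channels.
H, W, C_in, C_out = 8, 8, 16, 32
x = np.random.rand(H, W, C_in).astype(np.float32)

# 1x1 convolution weights: a single C_in -> C_out linear map applied at every pixel.
w = np.random.rand(C_in, C_out).astype(np.float32)

# Flatten the spatial dimensions into rows; each row is one pixel's channel vector.
# The 1x1 convolution is then a single dense matrix multiplication.
y = x.reshape(H * W, C_in) @ w      # shape: (H*W, C_out)
y = y.reshape(H, W, C_out)          # back to HWC layout

print(y.shape)  # (8, 8, 32)
```

Because the per-pixel work is just this C_in x C_out matrix product, pruning the weight matrix lets an inference engine skip the zeroed multiplications, which is consistent with the roughly 2x speedup reported for the sparse model.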