How to compress a neural network. An introduction to weight pruning

By Medium - 2020-10-08

Description

Modern state-of-the-art neural network architectures are HUGE. For instance, you have probably heard about GPT-3, OpenAI’s newest revolutionary NLP model, capable of writing poetry and interactive…

Summary

To give you a perspective about how large this number is, consider the following.
However, this is just the tip of the iceberg.
During their experiments pruning LeNet for MNIST classification, they found that a significant portion of the weights can be removed without a noticeable increase in the loss.
Here, the student model sees not only the training data of the large model but also new data, on which it is fitted to approximate the output of the teacher.
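
As a rough illustration of the pruning idea mentioned in the summary, here is a minimal sketch of magnitude-based pruning. It assumes the weights with the smallest absolute values are the ones removed; the summary does not say which criterion the article uses. The function name `magnitude_prune` and the `sparsity` parameter are hypothetical, chosen only for this example.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `weights` is a flat list of floats and `sparsity` is the fraction of
    weights to remove (both are illustrative assumptions, not taken from
    the article).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Number of weights to zero out.
    k = int(sparsity * len(weights))

    # Indices ordered by ascending absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])

    # Keep the surviving weights, replace pruned ones with zero.
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    layer = [0.1, -2.0, 0.05, 3.0]
    print(magnitude_prune(layer, sparsity=0.5))
```

In practice a pruned network is usually evaluated again, and often fine-tuned, to confirm that the loss has not increased noticeably.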

Topics

Machine_Learning (0.39)
NLP (0.13)
Backend (0.08)

Similar Articles

FastFormers: 233x Faster Transformers inference on CPU

By Medium - 2020-11-04

Since the birth of BERT, Transformers have dominated NLP in nearly every language-related task, whether it is Question-Answering, Sentiment Analysis, Text classification or Text…

Facebook’s Prophet + Deep Learning = NeuralProphet

By Medium - 2020-12-10

While learning about time series forecasting, sooner or later you will encounter the vastly popular Prophet model, developed by Facebook. It gained lots of popularity due to the fact that it provides…

BERT, RoBERTa, DistilBERT, XLNet: Which one to use

By KDnuggets - 2021-01-20

Lately, varying improvements over BERT have been shown — and here I will contrast the main similarities and differences so you can choose which one to use in your research or application.

Interpretability in Machine Learning: An Overview

By The Gradient - 2020-11-21

A broad overview of the sub-field of machine learning interpretability; conceptual frameworks, existing research, and future directions.

3 deep learning mysteries: Ensemble, knowledge- and self-distillation

By Microsoft Research - 2021-01-19

Microsoft and CMU researchers begin to unravel 3 mysteries in deep learning related to ensemble, knowledge distillation & self-distillation. Discover how their work leads to the first theoretical proo ...

Clothes Classification with the DeepFashion Dataset and Fastai

By Medium - 2021-02-02

How to outperform the benchmark in clothes recognition with fastai and DeepFashion Dataset. How to use fastai models in PyTorch. Code, explanation, evaluation on the user data.

Feedback

Let us know what you think about this newsletter, or if you want to add new topics or keywords.

contact@velasticity.com

Bookmarks

Latest Readings in NLP

By Medium - 2021-03-15

17 types of similarity and dissimilarity measures used in data science

By Medium - 2021-03-13

Spark it up a notch. Nitty-gritty details of Apache Spark

By Medium - 2021-01-20

Responsible AI at Facebook. Joaquin Quiñonero-Candela on the TDS

By Medium - 2021-03-14

Non-Linear Augmentations For Deep Learning

By Spreadmind Blog - 2016-10-25

Für Coaches: In 3 Schritten mehr Sichtbarkeit und Reichweite über das Internet (For coaches: more visibility and reach via the internet in 3 steps)

By KDnuggets - 2021-03-14

How to Speed up Pandas by 4x with one line of code

By SearchDataManagement - 2021-03-14

ChaosSearch looks to bring order to data lakes

By KDnuggets - 2021-03-14

Introduction to Data Engineering

By Medium - 2021-03-05

Responsible Machine Learning with Error Analysis

By GitHub - 2021-03-13

facebookresearch/flores

By Selbstmanagement - 2021-03-14

By datasciencecentral - 2021-03-15

FinTech: How AI is Improving This Industry

By KDnuggets - 2021-03-12

Must Know for Data Scientists and Data Analysts: Causal Design Patterns

By Medium - 2021-03-10

(Deep) House: Making AI-Generated House Music

By Medium - 2021-03-12

High Number of Unique Values and Tree-Based Models

By Google AI Blog - 2021-03-12

LEAF: A Learnable Frontend for Audio Classification

By Medium - 2020-12-31

Why do I have a data science blog? 7 benefits of sharing your code

By KDnuggets - 2021-03-12

DBSCAN Clustering Algorithm in Machine Learning

By Stanford School of Engineering - 2021-03-12

Dan Jurafsky: How AI is changing our understanding of language

By reddit - 2021-03-12

r/MachineLearning - [N] Legal NLP Dataset With Over 13,000 Annotations Released

By IoT Agenda - 2021-03-14

Prepare for IoT's role in U.S. CMMC compliance

By Medium - 2021-03-12

This is why your deep learning models don’t work on another microscopy scanner

By Medium - 2021-03-08

The Playbook to Monitor Your Model’s Performance in Production

By Medium - 2020-12-03

Calculating Document Similarities using BERT, word2vec, and other models

By datasciencecentral - 2021-03-15

Best Naming Conventions When Writing Python Code

By reddit - 2021-03-12

r/MachineLearning - [D] Why is tensorflow so hated on and pytorch is the cool kids framework?

By Medium - 2020-12-01

Most Important IT Side Skill, Regex

By Medium - 2021-03-12

Introduction to hierarchical clustering (Part 3 — Spatial clustering)

By Medium - 2021-02-15

10 Hyper-parameter Tuning Libraries

By datasciencecentral - 2021-03-15

How the Blend of Artificial Intelligence and Big Data Is Helping Industries During The Pandemic

By Medium - 2020-10-16

How ‘Copy-and-Paste’ is embedded in CNNs for Image Inpainting — Review: Shift-Net: Image Inpainting via Deep Feature Rearrangement

By Medium - 2021-02-28

Intro to Regularization With Ridge And Lasso Regression with Sklearn

By datasciencecentral - 2021-03-14

Artificial Intelligence in the Content Marketing Landscape

By Medium - 2021-03-12

“Multi-Page” Apps Done Right via Heroku & HTML

By datasciencecentral - 2021-03-14

Interesting AI papers published in 2020

By Medium - 2021-03-12

Software Engineering Best Practices for Data Scientists

By SearchUnifiedCommunications - 2021-03-14

Virtual visits to mature in 2021

By Medium - 2020-12-01

Ridgeline Plots: The Perfect Way to Visualize Data Distributions with Python

By datasciencecentral - 2021-03-14

Fraudulent Covid-19 Data and Benford's Law