Description
How can we build machines with human-level intelligence? There’s a limit to how far the field of AI can go with supervised learning alone. Here's why...
Summary
- Self-supervised learning: rather than relying on contrastive methods, NLP models use a predictive architecture in which the model directly produces a prediction for y.
- One starts from a complete segment of text y, then corrupts it, e.g., by masking some words, to produce the observation x.
- The corrupted input is fed to a large neural network that is trained to reproduce the original text y.
- An uncorrupted text will be reconstructed as itself (low reconstruction error), while a corrupted text will be reconstructed as an uncorrupted version of itself (large reconstruction error).
- With a properly trained model, as the latent variable varies over a given set, the output prediction varies over the set of plausible predictions compatible with the input x. Latent-variable models can be trained with contrastive methods.
- The volume of the set over which the latent variable can vary limits the volume of outputs that take low energy.
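The mask-corrupt-reconstruct loop and the energy view above can be sketched in a few lines. This is a toy illustration only: the "model" below just fills masks with the most frequent token (a real system trains a large neural network on this objective), and the names `corrupt`, `reconstruct`, `reconstruction_energy`, and `latent_energy` are hypothetical helpers, not anything from the original post.

```python
import random

MASK = "[MASK]"

def corrupt(tokens, p=0.3, rng=None):
    # Corrupt the text segment y by masking some words, producing the observation x.
    rng = rng or random.Random(0)
    return [MASK if rng.random() < p else t for t in tokens]

def reconstruct(x, counts):
    # Toy stand-in for the trained network: fill every mask with the most
    # frequent token seen in the corpus. Purely illustrative.
    best = max(counts, key=counts.get)
    return [best if t == MASK else t for t in x]

def reconstruction_energy(x, counts):
    # Energy = how much the model changes its input when reconstructing it.
    y_hat = reconstruct(x, counts)
    return sum(a != b for a, b in zip(y_hat, x))

def latent_energy(candidates_from_z, y):
    # Latent-variable flavor: the energy of y is its best match over the set of
    # predictions obtained as the latent variable z ranges over a given set.
    # Enlarging that set enlarges the volume of outputs that take low energy.
    return min(sum(a != b for a, b in zip(c, y)) for c in candidates_from_z)

y = "the cat sat on the mat".split()
counts = {}
for t in y:
    counts[t] = counts.get(t, 0) + 1

x = corrupt(y, p=0.5)
print(reconstruction_energy(y, counts))  # uncorrupted text reconstructs as itself: energy 0
print(reconstruction_energy(x, counts))  # corrupted text gets its masks rewritten: higher energy
```

With this toy scoring, an uncorrupted segment passes through unchanged (low energy), while a corrupted one is rewritten at every masked position (energy grows with the number of masks), mirroring the reconstruction-error contrast described above.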