Utility is in the Eye of the User: A Critique of NLP Leaderboards

By arXiv.org - 2020-10-01

Description

Benchmarks such as GLUE have helped drive advances in NLP by incentivizing the creation of more accurate models. While this leaderboard paradigm has been remarkably successful, a historical focus on p ...

Summary

Abstract: While this leaderboard paradigm has been remarkably successful, a historical focus on performance-based evaluation has been at the expense of other qualities that the NLP community values in models, such as compactness, fairness, and energy efficiency.
We frame both the leaderboard and NLP practitioners as consumers and the benefit they get from a model as its utility to them.

Topics

UX (0.18)
NLP (0.15)
Backend (0.05)

Similar Articles

mT5: A massively multilingual pre-trained text-to-text transformer

By arXiv.org - 2020-10-23

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, ...

An Empirical Study of Pre-trained Transformers for Arabic Information Extraction

By arXiv.org - 2020-10-08

Multilingual pre-trained Transformers, such as mBERT (Devlin et al., 2019) and XLM-RoBERTa (Conneau et al., 2020a), have been shown to enable effective cross-lingual zero-shot transfer. However, t ...

Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation

By arXiv.org - 2020-10-06

Literary tropes, from poetry to stories, are at the crux of human imagination and communication. Figurative language such as a simile goes beyond plain expressions to give readers new insights and inspi ...

Tired of Topic Models? Clusters of Pretrained Word Embeddings Make for Fast and Good Topics too!

By arXiv.org - 2020-10-13

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in t ...

Code and Named Entity Recognition in StackOverflow

By arXiv.org - 2020-10-14

There is an increasing interest in studying natural language and computer code together, as large corpora of programming texts become readily available on the Internet. For example, StackOverflow curr ...

COMETA: A Corpus for Medical Entity Linking in the Social Media