Description
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools - huggingface/datasets
Summary
- README.md: 🤗 Datasets is a lightweight library providing two main features. First, one-line dataloaders: with a simple command like squad_dataset = load_dataset("squad"), get any of the public datasets ready to use in a dataloader for training/evaluating an ML model (NumPy/Pandas/PyTorch/TensorFlow/JAX). Second, efficient data pre-processing. The library also gives access to the pair of a benchmark dataset and a benchmark metric, for instance for benchmarks like SQuAD or GLUE.
- Dataset: the user-facing dataset object is not a tf.data.Dataset but a built-in, framework-agnostic dataset class with methods inspired by what we like in tf.data (like a map() method).
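
To make the idea of a framework-agnostic dataset class with a tf.data-style map() more concrete, here is a toy sketch in plain Python. It is an illustration only: TinyDataset, its column layout, and the n_chars field are hypothetical names invented for this example, and none of this is the library's actual Dataset implementation.

```python
from typing import Any, Callable, Dict, List


class TinyDataset:
    """Hypothetical toy class; NOT the Dataset class from the datasets library."""

    def __init__(self, columns: Dict[str, List[Any]]):
        # Store data column by column, using only built-in Python types,
        # so no ML framework is needed to hold or inspect it.
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = {name: list(values) for name, values in columns.items()}

    def __len__(self) -> int:
        return len(next(iter(self.columns.values()), []))

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # A row is a plain dict mapping column name to value.
        return {name: values[index] for name, values in self.columns.items()}

    def map(self, fn: Callable[[Dict[str, Any]], Dict[str, Any]]) -> "TinyDataset":
        # Apply fn to each row and merge the returned fields into that row,
        # returning a new dataset and leaving the original untouched.
        rows = [{**self[i], **fn(self[i])} for i in range(len(self))]
        names = rows[0].keys() if rows else self.columns.keys()
        return TinyDataset({name: [row[name] for row in rows] for name in names})


# Made-up example rows, not taken from any real dataset.
questions = TinyDataset({"question": ["Who wrote Hamlet?", "What is GLUE?"]})
with_length = questions.map(lambda row: {"n_chars": len(row["question"])})
print(with_length[0])
```

The sketch treats map() as a pure transformation that returns a new dataset rather than mutating the old one, which is the tf.data convention the summary alludes to. The real library is used differently in practice: a dataset is typically obtained with load_dataset(...) and transformed with its own map() method.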