Code and Named Entity Recognition in StackOverflow

By arXiv.org - 2020-10-14

Description

There is an increasing interest in studying natural language and computer code together, as large corpora of programming texts become readily available on the Internet. For example, StackOverflow curr ...

Summary

  • Abstract: In this paper, we introduce a new named entity recognition (NER) corpus for the computer programming domain, consisting of 15,372 sentences annotated with 20 fine-grained entity types.
  • Our SoftNER model incorporates a context-independent code token classifier with corpus-level features to improve the BERT-based tagging model.
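The idea of a context-independent code token classifier driven by corpus-level features can be illustrated with a toy sketch. This is not the SoftNER implementation: the function names, the frequency-ratio feature, and the 0.5 threshold are all assumptions chosen for illustration. It only shows what "context-independent" means here, namely that the decision depends on the token and on corpus statistics, never on the surrounding words.

```python
from collections import Counter


def code_ratio_table(code_tokens, prose_tokens):
    """Hypothetical corpus-level feature: the fraction of a token's
    occurrences that fall inside code rather than prose."""
    code_counts = Counter(code_tokens)
    prose_counts = Counter(prose_tokens)
    vocab = set(code_counts) | set(prose_counts)
    return {
        tok: code_counts[tok] / (code_counts[tok] + prose_counts[tok])
        for tok in vocab
    }


def is_code_token(token, ratios, threshold=0.5):
    """Context-independent decision: uses only the token itself and the
    corpus statistics. Unseen tokens default to 'not code'."""
    return ratios.get(token, 0.0) > threshold


# Toy corpus (made up for this example).
CODE_TOKENS = ["df", "merge", "(", ")", "list", "df"]
PROSE_TOKENS = ["the", "list", "of", "list", "items"]
RATIOS = code_ratio_table(CODE_TOKENS, PROSE_TOKENS)

print(is_code_token("df", RATIOS))    # seen only in code
print(is_code_token("list", RATIOS))  # seen mostly in prose
```

In a tagging model, a per-token flag like this could be supplied as an extra input alongside the contextual embeddings; how SoftNER actually combines the two is described in the paper itself.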

 

Topics

  1. NLP (0.35)
  2. Backend (0.12)
  3. Machine_Learning (0.08)

Similar Articles