LanguageNet (English) is a collection of sentence-level paraphrases from Twitter, gathered by linking tweets through shared URLs. It is the largest such corpus to date, with 51,524 human-annotated sentence pairs: 42,200 for training and 9,324 for testing. It can grow by about 30,000 new sentential paraphrases per month at ~70% precision. One year of data is now available: 2,869,657 candidate pairs.
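The URL-linking idea can be illustrated with a minimal sketch. It is not the corpus authors' pipeline: the function name `candidate_pairs`, the `(text, url)` record layout, and the sample tweets are all hypothetical, and a real system would still need filtering and human annotation to separate true paraphrases from mere candidates.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(tweets):
    """Pair up distinct tweet texts that share the same URL.

    `tweets` is an iterable of (text, url) tuples; this layout is an
    assumption made for illustration, not the corpus format.
    """
    by_url = defaultdict(list)
    for text, url in tweets:
        # Skip exact duplicates (e.g. retweets) under the same URL.
        if text not in by_url[url]:
            by_url[url].append(text)

    pairs = []
    for url, texts in by_url.items():
        # Every unordered pair of distinct texts is a paraphrase candidate.
        for first, second in combinations(texts, 2):
            pairs.append((first, second, url))
    return pairs


# Hypothetical sample data, invented for this example.
sample = [
    ("Storm knocks out power across the city", "http://example.com/a"),
    ("City loses power after the storm", "http://example.com/a"),
    ("New phone released today", "http://example.com/b"),
]

for first, second, url in candidate_pairs(sample):
    print(f"{url}: {first!r} <-> {second!r}")
```

Pairs produced this way are only candidates. The ~70% precision quoted above implies that a substantial share of automatically collected pairs are not true paraphrases, which is why the 51,524 annotated pairs are reported separately from the 2,869,657 candidate pairs.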

License


  • Unknown
