JParaCrawl is a parallel corpus for English-Japanese, a language pair for which publicly available parallel corpora are still limited. It was constructed by broadly crawling the web and automatically aligning parallel sentences. The corpus contains over 8.7 million sentence pairs.
Source: JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus