We describe pke, an open-source, Python-based keyphrase extraction toolkit.
The Turkish Wikipedia Named-Entity Recognition and Text Categorization (TWNERTC) dataset is a collection of automatically categorized and annotated sentences obtained from Wikipedia.
Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifiers, and SVMs.
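As an illustration of what a term weighting scheme computes, here is a minimal sketch of TF-IDF, one common scheme. The source does not say which schemes are studied, so the choice of TF-IDF, the function name `tfidf`, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: relative term frequency times log(N / df).
    Other schemes differ in both factors.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "the match ended in a draw".split(),
    "the court ruled on the case".split(),
    "the team won the match".split(),
]
weights = tfidf(docs)
```

In this toy corpus a term that occurs in every document, such as "the", gets weight zero, while rarer terms get larger weights. Distance-based classifiers such as kNN or centroid-based classifiers operate directly on these vectors, which is one reason the weighting choice can matter.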
A recently introduced text classifier, called SS3, has obtained state-of-the-art performance on CLEF's eRisk tasks.
A recently introduced classifier, called SS3, has been shown to be well suited to early risk detection (ERD) problems on text streams.
The Tsetlin Machine either performs on par with or outperforms all of the evaluated methods on both the 20 Newsgroups and IMDb datasets, as well as on a non-public clinical dataset.
Another limitation of GCNs when used on graph-based text representation tasks is that they do not consider the order of the nodes in the graph.
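The order limitation can be seen in a small sketch. It assumes the text graph is an undirected word co-occurrence graph and replaces a real GCN layer with plain mean aggregation over each node's neighborhood (no learned weights or nonlinearity). Neither the graph construction nor the helper names `cooccurrence_graph` and `propagate` come from the source; they are hypothetical choices for illustration.

```python
def cooccurrence_graph(tokens, window=1):
    """Build an undirected word graph: nodes are unique words,
    edges link distinct words that occur within `window` positions."""
    edges = set()
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if word != tokens[j]:
                edges.add(frozenset((word, tokens[j])))
    return set(tokens), edges


def propagate(nodes, edges, features):
    """One simplified propagation step: each node's new vector is the
    mean of its own and its neighbors' feature vectors."""
    out = {}
    for node in nodes:
        neighbors = sorted(n for e in edges if node in e for n in e if n != node)
        group = [node] + neighbors
        dim = len(features[node])
        out[node] = [
            sum(features[g][k] for g in group) / len(group) for k in range(dim)
        ]
    return out


# Hypothetical 2-dimensional word features.
features = {"dog": [1.0, 0.0], "bites": [0.0, 1.0], "man": [1.0, 1.0]}

a = propagate(*cooccurrence_graph("dog bites man".split()), features)
b = propagate(*cooccurrence_graph("man bites dog".split()), features)
```

Under these assumptions the two sentences produce the same graph, so `a` and `b` are identical even though the sentences mean different things. Neighborhood aggregation is a symmetric function of the neighbor set, so any sequence information must be supplied some other way.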