no code implementations • EMNLP (insights) 2021 • Zixin Tang, Prasenjit Mitra, David Reitter
Using the essay portion of the International Corpus Network of Asian Learners of English (ICNALE) and the TOEFL11 corpus, we fine-tuned BERT-based neural language models to predict English learners’ native languages.
1 code implementation • LREC 2022 • Nan Zhang, Shomir Wilson, Prasenjit Mitra
Therefore, we propose the first title-text dataset on web documents that incorporates a wide variety of domains to facilitate downstream training.
1 code implementation • 11 Apr 2024 • Ali Al-Lawati, Elsayed Eshra, Prasenjit Mitra
Trajectory generation is an important task in movement studies; it circumvents the privacy, ethical, and technical challenges of collecting real trajectories from the target population.
1 code implementation • 23 Mar 2024 • Nan Zhang, Connor Heaton, Sean Timothy Okonsky, Prasenjit Mitra, Hilal Ezgi Toraman
To mitigate this gap, we present the Printed English and Chemical Equations (PEaCE) dataset, containing both synthetic and real-world records, and evaluate the efficacy of transformer-based OCR models when trained on this resource.
Optical Character Recognition (OCR)
1 code implementation • 6 Mar 2024 • Suhan Cui, Prasenjit Mitra
To reduce human intervention and improve the framework design, we propose an automated approach named AutoDP, which can search for the optimal configuration of task grouping and architectures simultaneously.
no code implementations • 15 Jan 2024 • Saptarshi Sengupta, Shreya Ghosh, Prasenjit Mitra, Tarikul Islam Tamiti
Sentiment Analysis (SA) refers to the task of associating a view polarity (usually, positive, negative, or neutral; or even fine-grained such as slightly angry, sad, etc.)
no code implementations • 15 Jan 2024 • Saptarshi Sengupta, Connor Heaton, Prasenjit Mitra, Soumalya Sarkar
Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer-based language models, it has come a long way toward being solved.
no code implementations • 30 Dec 2023 • Ali Al-Lawati, Elsayed Eshra, Prasenjit Mitra
Trajectory generation is an important concern in pedestrian, vehicle, and wildlife movement studies.
1 code implementation • 3 Nov 2023 • Nan Zhang, Yusen Zhang, Wu Guo, Prasenjit Mitra, Rui Zhang
In this paper, we investigate and improve faithfulness in summarization on a broad range of medical summarization tasks.
no code implementations • 25 Oct 2023 • Saptarshi Sengupta, Connor Heaton, Shreya Ghosh, Preslav Nakov, Prasenjit Mitra
Domain adaptation, the process of training a model in one domain and applying it to another, has been extensively explored in machine learning.
no code implementations • 21 Jul 2023 • Matthew Hines, Gregory Glatzer, Shreya Ghosh, Prasenjit Mitra
The interaction between elephants and their environment has profound implications for both ecology and conservation strategies.
no code implementations • 24 Jun 2023 • Shreya Ghosh, Saptarshi Sengupta, Prasenjit Mitra
In this paper, we lay out a vision for analysing semantic trajectory traces and generating synthetic semantic trajectory data (SSTs) using a generative language model.
no code implementations • 5 Jun 2023 • Jakob Hederich, Shreya Ghosh, Zeyu He, Prasenjit Mitra
We introduce NightPulse, an interactive tool for Night-time light (NTL) data visualization and analytics, which enables researchers and stakeholders to explore and analyze NTL data with a user-friendly platform.
no code implementations • 5 Nov 2022 • Amogh Subbakrishna Adishesha, Lily Jakielaszek, Fariha Azhar, Peixuan Zhang, Vasant Honavar, Fenglong Ma, Chandra Belani, Prasenjit Mitra, Sharon Xiaolei Huang
Specifically, we pose the problem of predicting topic tags or keywords that describe the future information needs of users based on their profiles, traces of their online interactions within the community (past posts, replies) and the profiles and traces of online interactions of other users with similar profiles and similar traces of past interaction with the target users.
1 code implementation • 1 Mar 2022 • Scott Pezanowski, Prasenjit Mitra, Alan M. MacEachren
We present GeoMovement, a system that is based on combining machine learning and rule-based extraction of movement-related information with state-of-the-art visualization techniques.
no code implementations • LREC 2020 • Scott Pezanowski, Prasenjit Mitra
Analyzing the geographic movement of humans, animals, and other phenomena is a growing field of research.
no code implementations • 24 Jan 2022 • Chen Wu, Sencun Zhu, Prasenjit Mitra
Federated Learning (FL) is designed to protect the data privacy of each client during the training process by transmitting only models instead of the original data.
no code implementations • 12 Jan 2022 • Scott Pezanowski, Alan M. MacEachren, Prasenjit Mitra
Understanding movement described in text documents is important since text descriptions of movement contain a wealth of geographic and contextual information about the movement of people, wildlife, goods, and much more.
no code implementations • 5 Nov 2021 • Gregory Glatzer, Prasenjit Mitra, Johnson Kinyua
We explore the use of clustering to identify locations of interest to African Elephants in regions of Sub-Saharan Africa.
1 code implementation • 11 Sep 2021 • Connor Heaton, Prasenjit Mitra
Major League Baseball (MLB) has a storied history of using statistics to better understand and discuss the game of baseball, with an entire discipline of statistics dedicated to the craft, known as sabermetrics.
1 code implementation • NAACL (sdp) 2021 • Athar Sefid, Jian Wu, Prasenjit Mitra, Lee Giles
Presentation slides describing the content of scientific and technical papers are an efficient and effective way to present that work.
no code implementations • 28 Oct 2020 • Chen Wu, Xian Yang, Sencun Zhu, Prasenjit Mitra
To minimize the pruning influence on test accuracy, we can fine-tune after pruning, and the attack success rate drops to 6.4%, with only a 1.7% loss of test accuracy.
no code implementations • 27 Aug 2020 • Connor T. Heaton, Prasenjit Mitra
Seeing the related endeavors, we set out to repurpose the relevancy annotations for TREC-COVID tasks to identify journal articles in CORD-19 which are relevant to the key questions posed by CORD-19.
1 code implementation • 25 Aug 2020 • Athar Sefid, Clyde Lee Giles, Prasenjit Mitra
We introduce an extractive method that will summarize long scientific papers.
no code implementations • 28 Jun 2020 • Xianfeng Tang, Huaxiu Yao, Yiwei Sun, Yiqi Wang, Jiliang Tang, Charu Aggarwal, Prasenjit Mitra, Suhang Wang
Pseudo labels increase the chance of connecting to labeled neighbors for low-degree nodes, thus reducing the biases of GCNs from the data perspective.
no code implementations • 10 Jun 2020 • Xianfeng Tang, Yozen Liu, Neil Shah, Xiaolin Shi, Prasenjit Mitra, Suhang Wang
In this paper, we study a novel problem of explainable user engagement prediction for social network apps.
no code implementations • 22 Nov 2019 • Xianfeng Tang, Huaxiu Yao, Yiwei Sun, Charu Aggarwal, Prasenjit Mitra, Suhang Wang
Thus, jointly modeling local and global temporal dynamics is very promising for MTS forecasting with missing values.
1 code implementation • 8 Oct 2019 • Rajeev Bhatt Ambati, Saptarashmi Bandyopadhyay, Prasenjit Mitra
In this paper, we propose a method based on extracting the highlights of a document: key concepts conveyed in a few sentences.
1 code implementation • 20 Aug 2019 • Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, Suhang Wang
To optimize PA-GNN for a poisoned graph, we design a meta-optimization algorithm that trains PA-GNN to penalize perturbations using clean graphs and their adversarial counterparts, and transfers such ability to improve the robustness of PA-GNN on the poisoned graph.
Ranked #25 on Node Classification on Pubmed
1 code implementation • 20 Jun 2019 • Athar Sefid, Jian Wu, Allen C. Ge, Jing Zhao, Lu Liu, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles
We introduce a system designed to match scholarly document entities with noisy metadata against a reference dataset.
no code implementations • 9 Jan 2019 • Agnese Chiatti, Dolzodmaa Davaasuren, Nilam Ram, Prasenjit Mitra, Byron Reeves, Thomas Robinson
A significant proportion of individuals' daily activities is experienced through digital devices.
no code implementations • 5 Oct 2016 • Koustav Rudra, Siddhartha Banerjee, Niloy Ganguly, Pawan Goyal, Muhammad Imran, Prasenjit Mitra
The use of microblogging platforms such as Twitter during crises has become widespread.
no code implementations • 4 Oct 2016 • Dat Tien Nguyen, Shafiq Joty, Muhammad Imran, Hassan Sajjad, Prasenjit Mitra
During natural or man-made disasters, humanitarian response organizations look for useful information to support their decision-making processes.
no code implementations • 22 Sep 2016 • Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama
The most informative and well-formed sub-graph obtained by integer linear programming (ILP) is selected to generate a one-sentence summary for each topic segment.
no code implementations • 22 Sep 2016 • Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama
The sentences in the most important document are aligned to sentences in other documents to generate clusters of similar sentences.
no code implementations • 22 Sep 2016 • Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama
Automatic summarization techniques on meeting conversations developed so far have been primarily extractive, resulting in poor summaries.
no code implementations • 12 Aug 2016 • Dat Tien Nguyen, Kamela Ali Al Mannai, Shafiq Joty, Hassan Sajjad, Muhammad Imran, Prasenjit Mitra
The current state-of-the-art classification methods require a significant amount of labeled data specific to a particular event for training, as well as substantial feature engineering, to achieve the best results.
1 code implementation • LREC 2016 • Muhammad Imran, Prasenjit Mitra, Carlos Castillo
Microblogging platforms such as Twitter provide active communication channels during mass convergence and emergency events such as earthquakes and typhoons.
no code implementations • 17 Feb 2016 • Muhammad Imran, Prasenjit Mitra, Jaideep Srivastava
Scarcity of labeled data causes poor performance in machine training.
no code implementations • AAAI 2015 • Wenyi Huang, Zhaohui Wu, Chen Liang, Prasenjit Mitra, C. Lee Giles
It is not always easy for knowledgeable researchers to give an accurate citation context for a cited paper or to find the right paper to cite given a context.