1 code implementation • EMNLP (CINLP) 2021 • Fiona Anting Tan, Devamanyu Hazarika, See-Kiong Ng, Soujanya Poria, Roger Zimmermann
Scarcity of annotated causal texts leads to poor robustness when training state-of-the-art language models for causal sentence classification.
no code implementations • 2 Apr 2024 • Yunshan Ma, Yingzhi He, Wenjun Zhong, Xiang Wang, Roger Zimmermann, Tat-Seng Chua
However, cross-item relations remain under-explored in current multimodal pre-trained models.
no code implementations • 6 Feb 2024 • Kun Wang, Hao Wu, Guibin Zhang, Junfeng Fang, Yuxuan Liang, Yuankai Wu, Roger Zimmermann, Yang Wang
In this paper, we address the issue of modeling and estimating changes in the state of the spatio-temporal dynamical systems based on a sequence of observations like video frames.
no code implementations • 18 Jan 2024 • Yutong Xia, Runpeng Yu, Yuxuan Liang, Xavier Bresson, Xinchao Wang, Roger Zimmermann
Graph Neural Networks (GNNs) have become the preferred tool to process graph data, with their efficacy being boosted through graph data augmentation techniques.
no code implementations • 26 Oct 2023 • Junfeng Hu, Xu Liu, Zhencheng Fan, Yuxuan Liang, Roger Zimmermann
Based on this proposal, we introduce Unified Spatio-Temporal Diffusion Models (USTD) to address the tasks uniformly within the uncertainty-aware diffusion framework.
no code implementations • 22 Oct 2023 • Yibo Yan, Haomin Wen, Siru Zhong, Wei Chen, Haodong Chen, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
To answer the questions, we leverage the power of Large Language Models (LLMs) and introduce the first-ever LLM-enhanced framework that integrates the knowledge of textual modality into urban imagery profiling, named LLM-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining (UrbanCLIP).
1 code implementation • 15 Oct 2023 • Xu Liu, Junfeng Hu, Yuan Li, Shizhe Diao, Yuxuan Liang, Bryan Hooi, Roger Zimmermann
To address these issues, we propose UniTime for effective cross-domain time series learning.
no code implementations • 9 Oct 2023 • Li Li, You Qin, Wei Ji, Yuxiao Zhou, Roger Zimmermann
Panoptic Scene Graph Generation (PSG) involves the detection of objects and the prediction of their corresponding relationships (predicates).
no code implementations • 29 Sep 2023 • Wei Ji, Li Li, Hao Fei, Xiangyan Liu, Xun Yang, Juncheng Li, Roger Zimmermann
Referring Image Understanding (RIS) has been extensively studied over the past decade, leading to the development of advanced algorithms.
1 code implementation • 13 Sep 2023 • Zhenguang Liu, Xinyang Yu, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang
We theoretically analyzed the mutual information between the label and the disentangled features, arriving at a loss that maximizes the extraction of task-relevant information from the original feature.
no code implementations • 3 Sep 2023 • Haomin Wen, Youfang Lin, Lixia Wu, Xiaowei Mao, Tianyue Cai, Yunfeng Hou, Shengnan Guo, Yuxuan Liang, Guangyin Jin, Yiji Zhao, Roger Zimmermann, Jieping Ye, Huaiyu Wan
An emerging research area within these services is service Route&Time Prediction (RTP), which aims to estimate the future service route as well as the arrival time of a given worker.
1 code implementation • 26 Aug 2023 • Qiu Zhou, Jinming Cao, Hanchao Leng, Yifang Yin, Yu Kun, Roger Zimmermann
This indicates that the combination of 3D object detection and 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
no code implementations • 19 Aug 2023 • Yichen Zhang, Yifang Yin, Ying Zhang, Zhenguang Liu, Zheng Wang, Roger Zimmermann
Early detection of dysplasia of the cervix is critical for cervical cancer treatment.
1 code implementation • 28 Jul 2023 • Li Li, Wei Ji, Yiming Wu, Mengze Li, You Qin, Lina Wei, Roger Zimmermann
To ensure consistency and accuracy during the transfer process, we propose to measure the invariance of representations in each predicate class, and learn unbiased prototypes of predicates with different intensities.
Ranked #3 on Panoptic Scene Graph Generation on PSG Dataset
no code implementations • 18 Jul 2023 • Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann
While recent video-based methods utilizing video tubelets have shown promising results, we argue that the effective modeling of spatial and temporal context plays a more significant role than the choice between clip tubelets and video tubelets.
no code implementations • 7 Jul 2023 • Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann
Street-view imagery provides us with novel experiences to explore different places remotely.
Ranked #3 on Image-Based Localization on CVACT
no code implementations • 19 Jun 2023 • Lixia Wu, Haomin Wen, Haoyuan Hu, Xiaowei Mao, Yutong Xia, Ergang Shan, Jianbin Zhen, Junhong Lou, Yuxuan Liang, Liuqing Yang, Roger Zimmermann, Youfang Lin, Huaiyu Wan
In this paper, we introduce LaDe, the first publicly available last-mile delivery dataset with millions of packages from the industry.
1 code implementation • NeurIPS 2023 • Xu Liu, Yutong Xia, Yuxuan Liang, Junfeng Hu, Yiwei Wang, Lei Bai, Chao Huang, Zhenguang Liu, Bryan Hooi, Roger Zimmermann
To mitigate these limitations, we introduce the LargeST benchmark dataset.
1 code implementation • 30 May 2023 • Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Hongyang Chen, Yu Zheng, Roger Zimmermann
We study the task of spatio-temporal extrapolation that generates data at target locations from surrounding contexts in a graph.
1 code implementation • 17 Mar 2023 • Chunyi Li, May Lim, Abdelhak Bentaleb, Roger Zimmermann
In today's Internet, HTTP Adaptive Streaming (HAS) is the mainstream standard for video streaming, which switches the bitrate of the video content based on an Adaptive BitRate (ABR) algorithm.
1 code implementation • 31 Jan 2023 • Haomin Wen, Youfang Lin, Yutong Xia, Huaiyu Wan, Qingsong Wen, Roger Zimmermann, Yuxuan Liang
Spatio-temporal graph neural networks (STGNN) have emerged as the dominant model for spatio-temporal graph (STG) forecasting.
no code implementations • 30 Jan 2023 • Xu Liu, Yuxuan Liang, Chao Huang, Hengchang Hu, Yushi Cao, Bryan Hooi, Roger Zimmermann
Spatio-temporal graph neural networks (STGNN) have become the most popular solution to traffic forecasting.
no code implementations • ICCV 2023 • Yifang Yin, Wenmiao Hu, Zhenguang Liu, Guanfeng Wang, Shili Xiang, Roger Zimmermann
Source-free domain adaptive semantic segmentation has gained increasing attention recently.
1 code implementation • 29 Nov 2022 • Yuxuan Liang, Yutong Xia, Songyu Ke, Yiwei Wang, Qingsong Wen, Junbo Zhang, Yu Zheng, Roger Zimmermann
Air pollution is a crucial issue affecting human health and livelihoods, as well as one of the barriers to economic and social growth.
1 code implementation • NAACL 2022 • Devamanyu Hazarika, Yingting Li, Bo Cheng, Shuai Zhao, Roger Zimmermann, Soujanya Poria
In this work, we hope to address that by (i) proposing simple diagnostic checks for modality robustness in a trained multimodal model.
1 code implementation • ACL 2022 • Abhinav Ramesh Kashyap, Devamanyu Hazarika, Min-Yen Kan, Roger Zimmermann, Soujanya Poria
Automatic transfer of text between domains has become popular in recent times.
no code implementations • 2 Apr 2022 • Yichen Zhang, Yifang Yin, Ying Zhang, Roger Zimmermann
Contrastive self-supervised learning has attracted significant research attention recently.
1 code implementation • 9 Dec 2021 • Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan
While transformers have shown great potential on video recognition with their strong capability of capturing long-range dependencies, they often suffer from high computational costs induced by self-attention over the huge number of 3D tokens.
no code implementations • 16 Sep 2021 • Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Yifang Yin, Ying Zhang, Roger Zimmermann
Sensors are the key to sensing the environment and imparting benefits to smart cities in many aspects, such as providing real-time air quality information throughout an urban area.
1 code implementation • 26 Aug 2021 • Xu Liu, Yuxuan Liang, Chao Huang, Yu Zheng, Bryan Hooi, Roger Zimmermann
In view of this, one may ask: can we leverage the additional signals from contrastive learning to alleviate data scarcity, so as to benefit STG forecasting?
1 code implementation • 5 Jul 2021 • Meng-Jiun Chiou, Henghui Ding, Hanshu Yan, Changhu Wang, Roger Zimmermann, Jiashi Feng
Given input images, scene graph generation (SGG) aims to produce comprehensive, graphical representations describing visual relationships among salient objects.
Ranked #2 on Unbiased Scene Graph Generation on Visual Genome
1 code implementation • 25 May 2021 • Meng-Jiun Chiou, Chun-Yu Liao, Li-Wei Wang, Roger Zimmermann, Jiashi Feng
Detecting human-object interactions (HOI) is an important step toward a comprehensive visual understanding of machines.
Ranked #3 on Human-Object Interaction Anticipation on VidHOI
no code implementations • 31 Dec 2020 • Rajaswa Patil, Yaman Kumar Singla, Rajiv Ratn Shah, Mika Hama, Roger Zimmermann
While there has been significant progress towards modelling coherence in written discourse, the work in modelling spoken discourse coherence has been quite limited.
no code implementations • 21 Dec 2020 • Yaman Kumar, Swati Aggarwal, Debanjan Mahata, Rajiv Ratn Shah, Ponnurangam Kumaraguru, Roger Zimmermann
In this paper, we present a fast, scalable, and accurate approach towards automated Short Answer Scoring (SAS).
1 code implementation • 30 Oct 2020 • Aradhya Neeraj Mathur, Devansh Batra, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann
We also release several datasets to test computer vision video generation models on their speech understanding.
no code implementations • NAACL 2021 • Abhinav Ramesh Kashyap, Devamanyu Hazarika, Min-Yen Kan, Roger Zimmermann
Domain divergence plays a significant role in estimating the performance of a model in new domains.
no code implementations • 19 Oct 2020 • Shagun Uppal, Sarthak Bhagat, Devamanyu Hazarika, Navonil Majumdar, Soujanya Poria, Roger Zimmermann, Amir Zadeh
Deep Learning and its applications have cascaded impactful research and development with a diverse range of modalities present in the real-world data.
1 code implementation • 10 Sep 2020 • Meng-Jiun Chiou, Roger Zimmermann, Jiashi Feng
Visual relationship detection aims to reason over relationships among salient objects in images, which has drawn increasing attention over the past few years.
no code implementations • 25 Aug 2020 • Dilruk Perera, Roger Zimmermann
The abundance of information in web applications makes recommendation essential for users as well as applications.
no code implementations • 25 Aug 2020 • Dilruk Perera, Roger Zimmermann
Existing models (1) fail to capture complex non-linear relationships in user interactions, and (2) are designed for offline settings and hence are not updated online with incoming interactions to capture the dynamics in the recommender environment.
no code implementations • 25 Aug 2020 • Dilruk Perera, Roger Zimmermann
The solution first learns historical user models in the target network by aggregating user preferences from multiple source networks.
no code implementations • 25 Aug 2020 • Dilruk Perera, Roger Zimmermann
The resultant user preferences are used in a Siamese network based neural recommender architecture.
1 code implementation • 6 Aug 2020 • Meng-Jiun Chiou, Zhenguang Liu, Yifang Yin, An-An Liu, Roger Zimmermann
In this paper, we propose a novel neural-network-based architecture, Graph Location Networks (GLN), to perform infrastructure-free, multi-view image-based indoor localization.
no code implementations • 12 Jun 2020 • Dhruva Sahrawat, Yaman Kumar, Shashwat Aggarwal, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
To close the gap between speech understanding and multimedia video applications, in this paper we present initial experiments that model the perception of visual speech and show its use case in video compression.
2 code implementations • 7 May 2020 • Devamanyu Hazarika, Roger Zimmermann, Soujanya Poria
In this paper, we aim to learn effective modality representations to aid the process of fusion.
no code implementations • 4 Nov 2019 • Jagriti Sikka, Kushal Satya, Yaman Kumar, Shagun Uppal, Rajiv Ratn Shah, Roger Zimmermann
Predicting the runtime complexity of program code is an arduous task.
no code implementations • 19 Oct 2019 • Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann
In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings.
1 code implementation • 11 Oct 2019 • Devamanyu Hazarika, Soujanya Poria, Roger Zimmermann, Rada Mihalcea
We propose an approach, TL-ERC, where we pre-train a hierarchical dialogue model on multi-turn conversations (source) and then transfer its parameters to a conversational emotion classifier (target).
Ranked #19 on Emotion Recognition in Conversation on DailyDialog
1 code implementation • ACL 2019 • Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria
As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows.
no code implementations • 28 Jun 2019 • Yaman Kumar, Rohit Jain, Khwaja Mohd. Salik, Rajiv Ratn Shah, Yifang Yin, Roger Zimmermann
The model takes silent videos as input and produces speech as the output.
1 code implementation • 5 Jun 2019 • Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria
As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows.
1 code implementation • 29 Jan 2019 • Yaman Kumar, Dhruva Sahrawat, Shubham Maheshwari, Debanjan Mahata, Amanda Stent, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
To solve this problem, we present a novel approach to zero-shot learning by generating new classes using Generative Adversarial Networks (GANs), and show how the addition of unseen class samples increases the accuracy of a VSR system by a significant margin of 27% and allows it to handle speaker-independent out-of-vocabulary phrases.
no code implementations • 2 Dec 2018 • Nupur Baghel, Yaman Kumar, Paavini Nanda, Rajiv Ratn Shah, Debanjan Mahata, Roger Zimmermann
There has been an upsurge in the number of people participating in challenges made popular through social media channels.
1 code implementation • EMNLP 2018 • Devamanyu Hazarika, Soujanya Poria, Rada Mihalcea, Erik Cambria, Roger Zimmermann
Emotion recognition in conversations is crucial for building empathetic machines.
Ranked #50 on Emotion Recognition in Conversation on IEMOCAP
no code implementations • 23 Sep 2018 • Raghav Kapoor, Yaman Kumar, Kshitij Rajput, Rajiv Ratn Shah, Ponnurangam Kumaraguru, Roger Zimmermann
In multilingual societies like the Indian subcontinent, the use of code-switched languages is very popular and convenient for users.
no code implementations • 16 Jul 2018 • Mayank Meghawat, Satyendra Yadav, Debanjan Mahata, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann
In this work, we propose a multimodal dataset consisting of content, context, and social information for popularity prediction.
no code implementations • 16 Jul 2018 • Debanjan Mahata, John Kuriakose, Rajiv Ratn Shah, Roger Zimmermann, John R. Talburt
Keyword extraction is a fundamental task in natural language processing that facilitates mapping of documents to a concise set of representative single and multi-word phrases.
1 code implementation • NAACL 2018 • Devamanyu Hazarika, Soujanya Poria, Prateek Vij, Gangeshwar Krishnamurthy, Erik Cambria, Roger Zimmermann
Aspect-based Sentiment Analysis is a fine-grained task of sentiment classification for multiple aspects in a sentence.
1 code implementation • NAACL 2018 • Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann
Emotion recognition in conversations is crucial for the development of empathetic machines.
Ranked #52 on Emotion Recognition in Conversation on IEMOCAP
no code implementations • NAACL 2018 • Debanjan Mahata, John Kuriakose, Rajiv Ratn Shah, Roger Zimmermann
Keyphrase extraction is a fundamental task in natural language processing that facilitates mapping of documents to a set of representative phrases.
1 code implementation • COLING 2018 • Devamanyu Hazarika, Soujanya Poria, Sruthi Gorantla, Erik Cambria, Roger Zimmermann, Rada Mihalcea
The literature in automated sarcasm detection has mainly focused on lexical, syntactic and semantic-level analysis of text.
Ranked #1 on Sarcasm Detection on SARC (all-bal)