no code implementations • 7 Aug 2024 • Xijun Wang, Dongshan Ye, Chenyuan Feng, Howard H. Yang, Xiang Chen, Tony Q. S. Quek
Image semantic communication (ISC) has garnered significant attention for its potential to achieve high efficiency in visual content transmission.
no code implementations • 16 Jun 2024 • Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha
AUTOHALLUSION enables us to create new benchmarks at minimal cost and thus overcomes the fragility of hand-crafted benchmarks.
no code implementations • 4 Apr 2024 • Tianrui Guan, Ruiqi Xian, Xijun Wang, Xiyang Wu, Mohamed Elnoor, Daeun Song, Dinesh Manocha
We present AGL-NET, a novel learning-based method for global localization using LiDAR point clouds and satellite maps.
no code implementations • 15 Mar 2024 • Xijun Wang, Santiago López-Tapia, Alice Lucas, Xinyi Wu, Rafael Molina, Aggelos K. Katsaggelos
To reduce these artifacts and enhance the perceptual quality of the results, in this paper, we propose a general method that can be effectively used in most GAN-based super-resolution (SR) models by introducing essential spatial information into the training process.
no code implementations • 12 Feb 2024 • Xijun Wang, Santiago López-Tapia, Aggelos K. Katsaggelos
Atmospheric turbulence, a common phenomenon in daily life, is primarily caused by the uneven heating of the Earth's surface.
1 code implementation • 18 Dec 2023 • Decheng Liu, Xijun Wang, Chunlei Peng, Nannan Wang, Ruiming Hu, Xinbo Gao
Adversarial attacks involve adding perturbations to the source image to cause misclassification by the target model, which demonstrates the potential of attacking face recognition models.
no code implementations • 13 Dec 2023 • Xijun Wang, Junbang Liang, Chun-Kai Wang, Kenan Deng, Yu Lou, Ming Lin, Shan Yang
In this work, we propose an efficient Video-Language Alignment (ViLA) network.
Ranked #1 on Video Question Answering on STAR Benchmark
no code implementations • 26 Oct 2023 • Xiang Chen, Zhiheng Guo, Xijun Wang, Howard H. Yang, Chenyuan Feng, Junshen Su, Sihui Zheng, Tony Q. S. Quek
Future wireless communication networks are in a position to move beyond data-centric, device-oriented connectivity and offer intelligent, immersive experiences based on task-oriented connections, especially in the context of the thriving development of pre-trained foundation models (PFM) and the evolving vision of 6G native artificial intelligence (AI).
6 code implementations • CVPR 2024 • Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou
Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs.
Ranked #1 on Visual Question Answering (VQA) on HallusionBench
no code implementations • 17 Aug 2023 • Xijun Wang, Anqi Liang, Junbang Liang, Ming Lin, Yu Lou, Shan Yang
Based on this notion, we propose a compatibility learning framework, a category-aware Flexible Bidirectional Transformer (FBT), for visual "scene-based set compatibility reasoning" with the cross-domain visual similarity input and auto-regressive complementary item generation.
no code implementations • 14 Aug 2023 • Xijun Wang, Xiaojie Chu, Chunrui Han, Xiangyu Zhang
This paper presents a module, Spatial Cross-scale Convolution (SCSC), which is verified to be effective in improving both CNNs and Transformers.
no code implementations • 25 May 2023 • Xijun Wang, Dongyang Liu, Meina Kan, Chunrui Han, Zhongqin Wu, Shiguang Shan
Distillation then begins in an online manner, and the teacher is only allowed to express solutions within the aforementioned subspace.
no code implementations • 21 May 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Fuxiao Liu, Dinesh Manocha
In practice, we observe a 3.17-10.2% accuracy improvement on the aerial video datasets (Okutama, NECDrone), which consist of scenes with single-agent and multi-agent actions.
Ranked #1 on Action Recognition on Okutama-Action
no code implementations • 8 May 2023 • Xijun Wang, Santiago López-Tapia, Aggelos K. Katsaggelos
We use the learned information to further condition the diffusion model.
no code implementations • 7 May 2023 • Xijun Wang, Aggelos K. Katsaggelos
To better learn these action category queries, we exploit not only the features of the current input video but also the correlation between different videos, through a novel video-specific action category query learner that works with a query similarity loss.
Weakly-supervised Temporal Action Localization
1 code implementation • 14 Apr 2023 • Ruiqi Xian, Xijun Wang, Divya Kothandaraman, Dinesh Manocha
Our algorithm utilizes the motion bias within aerial videos, which enables the selection of motion-salient frames.
Ranked #1 on Action Recognition on UAV-Human
1 code implementation • ICCV 2023 • Tianrui Guan, Aswath Muthuselvam, Montana Hoover, Xijun Wang, Jing Liang, Adarsh Jagan Sathyamoorthy, Damon Conover, Dinesh Manocha
We present CrossLoc3D, a novel 3D place recognition method that solves a large-scale point matching problem in a cross-source setting.
Ranked #1 on 3D Place Recognition on CS-Campus3D
no code implementations • 30 Mar 2023 • Yuxuan Zhang, Chao Xu, Howard H. Yang, Xijun Wang, Tony Q. S. Quek
This paper proposes a client selection (CS) method to tackle the communication bottleneck of federated learning (FL) while concurrently coping with FL's data heterogeneity issue.
1 code implementation • 5 Mar 2023 • Ruiqi Xian, Xijun Wang, Dinesh Manocha
We present a novel approach for action recognition in UAV videos.
Ranked #2 on Action Recognition on UAV-Human
no code implementations • 2 Mar 2023 • Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso M. de Melo, Stephen M. Nogar, Aniket Bera, Dinesh Manocha
We propose a novel approach for aerial video action recognition.
Ranked #1 on Action Recognition on RoCoG-v2
1 code implementation • 21 Mar 2022 • Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha
Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent (which is typically small) from the background.
Ranked #1 on Action Recognition on UAV Human
no code implementations • 16 Sep 2021 • Rohan Chandra, Xijun Wang, Mridul Mahajan, Rahul Kala, Rishitha Palugulla, Chandrababu Naidu, Alok Jain, Dinesh Manocha
We present a new traffic dataset, METEOR, which captures traffic patterns and multi-agent driving behaviors in unstructured scenarios.
no code implementations • 13 Apr 2021 • Chao Xu, Yiping Xie, Xijun Wang, Howard H. Yang, Dusit Niyato, Tony Q. S. Quek
cost), by integrating R-learning, a tabular reinforcement learning (RL) algorithm tailored for maximizing the long-term average reward, and traditional DRL algorithms, initially developed to optimize the discounted long-term cumulative reward rather than the average one.
no code implementations • CVPR 2021 • Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun
More gracefully, our DRConv transfers the increasing channel-wise filters to the spatial dimension with a learnable instructor, which not only improves the representation ability of convolution but also maintains the computational cost and translation invariance of standard convolution.
Ranked #19 on Semantic Segmentation on MCubeS
no code implementations • CVPR 2019 • Xijun Wang, Meina Kan, Shiguang Shan, Xilin Chen
Benefiting from its great success on many tasks, deep learning is increasingly used on low-computational-cost devices, e.g., smartphones and embedded devices.