no code implementations • 18 May 2025 • Yuwei Zhang, Wenhao Yu, Shangbin Feng, Yifan Zhu, Letian Peng, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang
WikiDYK contains 12,290 facts and 77,180 questions, which is also seamlessly extensible with future updates from Wikipedia editors.
no code implementations • 28 Mar 2025 • Changchang Sun, Gaowen Liu, Charles Fleming, Yan Yan
Conditional diffusion models have gained increasing attention owing to their impressive results in cross-modal synthesis, where strong alignment between the conditioning input and the generated output can be achieved by training a time-conditioned U-Net augmented with a cross-attention mechanism.
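As a rough illustration of how such conditioning is typically wired (not this paper's specific architecture), the sketch below shows a minimal cross-attention block in which U-Net feature tokens attend to conditioning embeddings; all shapes and module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Minimal cross-attention: image feature tokens (queries) attend to
    conditioning embeddings (keys/values), e.g. text-encoder outputs."""

    def __init__(self, feat_dim: int, cond_dim: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(feat_dim)
        self.attn = nn.MultiheadAttention(
            embed_dim=feat_dim, num_heads=num_heads,
            kdim=cond_dim, vdim=cond_dim, batch_first=True)

    def forward(self, x, cond):
        # x:    (batch, num_image_tokens, feat_dim)
        # cond: (batch, num_cond_tokens, cond_dim)
        attended, _ = self.attn(self.norm(x), cond, cond)
        return x + attended  # residual connection

# Toy usage: 64 image tokens conditioned on 16 text tokens.
block = CrossAttentionBlock(feat_dim=256, cond_dim=512)
out = block(torch.randn(2, 64, 256), torch.randn(2, 16, 512))
print(out.shape)  # torch.Size([2, 64, 256])
```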
no code implementations • 17 Mar 2025 • Tong Zhou, Shijin Duan, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu
Pre-trained models are valuable intellectual property, capturing both domain-specific and domain-invariant features within their weight spaces.
1 code implementation • 14 Mar 2025 • YiWei Chen, Yuguang Yao, Yihua Zhang, Bingquan Shen, Gaowen Liu, Sijia Liu
While current alignment strategies primarily rely on supervised safety fine-tuning with curated datasets, we identify a fundamental limitation we call the "safety mirage," in which supervised fine-tuning inadvertently reinforces spurious correlations between superficial textual patterns and safety responses rather than fostering deep, intrinsic mitigation of harm.
no code implementations • 12 Mar 2025 • Yuwei Zhang, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang
Interestingly, we observe that the internal attention weights from the generated CoT tokens can effectively ground implicit facts, even when these facts are not explicitly recalled.
1 code implementation • 19 Feb 2025 • Shijin Duan, Yejia Liu, Gaowen Liu, Ramana Rao Kompella, Shaolei Ren, Xiaolin Xu
Vector Symbolic Architecture (VSA) is emerging in machine learning due to its efficiency, but it is hindered by issues of hyperdimensionality and accuracy.
no code implementations • 18 Feb 2025 • Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao, Shreedhar Jangam, Jayanth Srinivasa, Gaowen Liu, Dawn Song, Xin Eric Wang
The rapid development of large reasoning models, such as OpenAI-o3 and DeepSeek-R1, has led to significant improvements in complex reasoning over non-reasoning large language models (LLMs).
no code implementations • 12 Feb 2025 • Quan Xiao, Hui Yuan, A F M Saif, Gaowen Liu, Ramana Kompella, Mengdi Wang, Tianyi Chen
Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains.
1 code implementation • 8 Feb 2025 • Venkatesh Mishra, Bimsara Pathiraja, Mihir Parmar, Sat Chidananda, Jayanth Srinivasa, Gaowen Liu, Ali Payani, Chitta Baral
We then develop an LLM-based automated evaluation framework to identify reasoning errors and evaluate the performance of LLMs.
1 code implementation • 21 Dec 2024 • Changchang Sun, Ren Wang, Yihua Zhang, Jinghan Jia, Jiancheng Liu, Gaowen Liu, Sijia Liu, Yan Yan
Machine unlearning (MU), which seeks to erase the influence of specific unwanted data from already-trained models, is becoming increasingly vital in model editing, particularly to comply with evolving data regulations like the "right to be forgotten".
no code implementations • 9 Dec 2024 • Yanbo Xu, Jayanth Srinivasa, Gaowen Liu, Shubham Tulsiani
Score distillation of 2D diffusion models has proven to be a powerful mechanism to guide 3D optimization, for example enabling text-based 3D generation or single-view reconstruction.
1 code implementation • 5 Dec 2024 • Jiangweizhi Peng, Zhiwei Tang, Gaowen Liu, Charles Fleming, Mingyi Hong
Our method introduces a novel optimization framework that leverages both the continuous prompt embedding and the injected noise trajectory in the sampling process to generate safe images.
no code implementations • 29 Nov 2024 • Wenfang Sun, Yingjun Du, Gaowen Liu, Cees G. M. Snoek
We tackle the problem of quantifying the number of objects generated by a text-to-image model.
no code implementations • 27 Nov 2024 • Haomin Zhuang, Yihua Zhang, Kehan Guo, Jinghan Jia, Gaowen Liu, Sijia Liu, Xiangliang Zhang
As MoE LLMs are celebrated for their exceptional performance and highly efficient inference processes, we ask: How can unlearning be performed effectively and efficiently on MoE LLMs?
no code implementations • 26 Oct 2024 • Yingjun Du, Gaowen Liu, Yuzhang Shang, Yuguang Yao, Ramana Kompella, Cees G. M. Snoek
This paper introduces prompt diffusion, which uses a diffusion model to gradually refine the prompts to obtain a customized prompt for each sample.
1 code implementation • 6 Oct 2024 • Han Yang, Kun Su, Yutong Zhang, Jiaben Chen, Kaizhi Qian, Gaowen Liu, Chuang Gan
We introduce a music-motion parallel generation scheme that unifies all music and motion generation tasks into a single transformer decoder architecture with a single training task of music-motion joint generation.
no code implementations • 18 Jul 2024 • Sheng-Yao Kuan, Jen-Hao Cheng, Hsiang-Wei Huang, Wenhao Chai, Cheng-Yen Yang, Hugo Latapie, Gaowen Liu, Bing-Fei Wu, Jenq-Neng Hwang
In the domain of autonomous driving, the integration of multi-modal perception techniques based on data from diverse sensors has demonstrated substantial progress.
no code implementations • 15 Jul 2024 • Ziheng Chen, Yue Song, Xiao-Jun Wu, Gaowen Liu, Nicu Sebe
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
1 code implementation • 8 Jul 2024 • Xintong Li, Jinya Jiang, Ria Dharmani, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang
We study open-world multi-label text classification under extremely weak supervision (XWS), where the user only provides a brief description for classification objectives without any labels or ground-truth label space.
1 code implementation • 3 Jul 2024 • Weitai Kang, Gaowen Liu, Mubarak Shah, Yan Yan
Specifically, we propose the Multi-layer Multi-task Encoder-Decoder as the target grounding stage, where we learn a regression query and multiple segmentation queries that ground the target by box regression and segmentation, respectively, in each decoding layer.
1 code implementation • 1 Jul 2024 • Nick John Eliopoulos, Purvish Jajal, James C. Davis, Gaowen Liu, George K. Thiravathukal, Yung-Hsiang Lu
For similar latency (within 5.2% or 7 ms) across devices, we achieve 78.6%-84.5% ImageNet1K accuracy, while the state-of-the-art, Token Merging, achieves 45.8%-85.4%.
1 code implementation • 12 Jun 2024 • Jiabao Ji, Yujian Liu, Yang Zhang, Gaowen Liu, Ramana Rao Kompella, Sijia Liu, Shiyu Chang
To achieve both goals, a mainstream class of LLM unlearning methods introduces an optimization framework that combines two objectives: maximizing the prediction loss on the forget documents while minimizing it on the retain documents. This formulation, however, suffers from two challenges: degenerated output and catastrophic forgetting.
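A minimal sketch of that two-term objective, assuming a Hugging Face-style causal LM that returns `.loss` when `labels` are provided; the weighting and exact loss terms in the paper may differ.

```python
def unlearning_loss(model, forget_batch, retain_batch, alpha: float = 1.0):
    """Two-objective unlearning: push the prediction loss UP on forget
    documents while keeping it LOW on retain documents. Batches are dicts
    with `input_ids`, `attention_mask`, and `labels`."""
    forget_loss = model(**forget_batch).loss   # term we want to maximize
    retain_loss = model(**retain_batch).loss   # term we want to minimize
    # Minimizing (-forget_loss) is equivalent to maximizing forget_loss.
    return -forget_loss + alpha * retain_loss
```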
1 code implementation • 30 May 2024 • Yuchi Liu, Jaskirat Singh, Gaowen Liu, Ali Payani, Liang Zheng
Specifically, we employ a hierarchy of LLMs, first constructing a prompt with precise instructions and accurate wording in a hierarchical manner, and then using this prompt to generate the final answer to the user query.
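As a loose illustration of such a two-stage pipeline (a prompt-writer call followed by an answer call), the sketch below uses a hypothetical `call_llm` helper standing in for whatever chat-completion API is actually used.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a chat-completion API call."""
    raise NotImplementedError

def hierarchical_answer(user_query: str) -> str:
    # Stage 1: a prompt-writer model turns the raw query into a precise,
    # well-worded instruction with explicit constraints and output format.
    meta_prompt = (
        "Rewrite the following user request as a precise instruction with "
        "explicit steps, constraints, and the desired output format:\n\n"
        + user_query
    )
    refined_prompt = call_llm(meta_prompt)
    # Stage 2: the refined prompt is used to generate the final answer.
    return call_llm(refined_prompt)
```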
no code implementations • CVPR 2024 • Yuzhang Shang, Dan Xu, Gaowen Liu, Ramana Rao Kompella, Yan Yan
Moreover, we introduce a knowledge distillation mechanism to correct the direction of information flow in backward propagation.
1 code implementation • 18 Apr 2024 • Jiabao Ji, Bairu Hou, Zhen Zhang, Guanhua Zhang, Wenqi Fan, Qing Li, Yang Zhang, Gaowen Liu, Sijia Liu, Shiyu Chang
Although large language models (LLMs) have achieved significant success, their vulnerability to adversarial perturbations, including recent jailbreak attacks, has raised considerable concerns.
1 code implementation • CVPR 2024 • Matteo Farina, Massimiliano Mancini, Elia Cunegatti, Gaowen Liu, Giovanni Iacca, Elisa Ricci
In this challenging setting, the transferable representations already encoded in the pretrained model are a key aspect to preserve.
1 code implementation • 7 Apr 2024 • Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, Gaowen Liu, Hugo Latapie, Jhih-Ciang Wu, Jenq-Neng Hwang, Hong-Han Shuai, Wen-Huang Cheng
Monocular 3D object detection (Mono3D) holds noteworthy promise for autonomous driving applications owing to the cost-effectiveness and rich visual context of monocular camera sensors.
1 code implementation • 2 Apr 2024 • Sihao Hu, Tiansheng Huang, Gaowen Liu, Ramana Rao Kompella, Fatih Ilhan, Selim Furkan Tekin, Yichang Xu, Zachary Yahn, Ling Liu
The development of game agents holds a critical role in advancing towards Artificial General Intelligence.
no code implementations • 31 Mar 2024 • Wenfang Sun, Yingjun Du, Gaowen Liu, Ramana Kompella, Cees G. M. Snoek
Additionally, we propose an assembly that merges the segmentation maps from the various subclass descriptors to ensure a more comprehensive representation of the different aspects in the test images.
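One simple way to realize such an assembly, assuming each subclass descriptor yields a per-pixel score map for the same parent class, is a per-pixel maximum (or mean) over the subclass maps; the aggregation rule below is an illustrative assumption, not necessarily the paper's exact choice.

```python
import numpy as np

def assemble_subclass_maps(subclass_maps: np.ndarray, how: str = "max") -> np.ndarray:
    """Merge per-subclass segmentation score maps into one parent-class map.

    subclass_maps: (num_subclasses, H, W) array of scores in [0, 1].
    Returns an (H, W) map covering all subclass aspects of the parent class."""
    if how == "max":
        return subclass_maps.max(axis=0)   # a pixel counts if ANY subclass fires
    return subclass_maps.mean(axis=0)      # softer aggregation

# Toy usage: three subclass descriptors merged over an 8x8 image.
parent_map = assemble_subclass_maps(np.random.rand(3, 8, 8))
print(parent_map.shape)  # (8, 8)
```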
no code implementations • 18 Mar 2024 • Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou
Generating unbounded 3D scenes is crucial for large-scale scene understanding and simulation.
1 code implementation • 14 Mar 2024 • Fangqiang Ding, Yunzhou Zhu, Xiangyu Wen, Gaowen Liu, Chris Xiaoxuan Lu
Designing egocentric 3D hand pose estimation systems that can perform reliably in complex, real-world scenarios is crucial for downstream applications.
1 code implementation • 7 Mar 2024 • Kaiwen Cai, Zhekai Duan, Gaowen Liu, Charles Fleming, Chris Xiaoxuan Lu
Recent advancements in Vision-Language (VL) models have sparked interest in their deployment on edge devices, yet challenges in handling diverse visual modalities, manual annotation, and computational constraints remain.
1 code implementation • 19 Feb 2024 • Yihua Zhang, Chongyu Fan, Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jiancheng Liu, Gaoyuan Zhang, Gaowen Liu, Ramana Rao Kompella, Xiaoming Liu, Sijia Liu
The technological advancements in diffusion models (DMs) have demonstrated unprecedented capabilities in text-to-image generation and are widely used in diverse applications.
1 code implementation • 15 Feb 2024 • Letian Peng, Yuwei Zhang, Zilong Wang, Jayanth Srinivasa, Gaowen Liu, Zihan Wang, Jingbo Shang
This work aims to build a text embedder that can capture characteristics of texts specified by user instructions.
no code implementations • CVPR 2024 • Yuzhang Shang, Gaowen Liu, Ramana Rao Kompella, Yan Yan
We aim to calibrate the quantized activations by maximizing the mutual information between the pre- and post-quantized activations.
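As a hedged sketch of what maximizing that mutual information can look like in a calibration loop, the snippet below uses a generic histogram-based plug-in MI estimate and sweeps the clipping range of a uniform quantizer; neither the estimator nor the quantizer is claimed to be the paper's.

```python
import numpy as np

def mutual_information(x: np.ndarray, x_q: np.ndarray, bins: int = 64) -> float:
    """Plug-in MI estimate between full-precision activations x and their
    quantized counterparts x_q, computed from a joint histogram."""
    joint, _, _ = np.histogram2d(x.ravel(), x_q.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def quantize(x: np.ndarray, num_bits: int = 4, clip: float = 3.0) -> np.ndarray:
    """Uniform symmetric quantizer; the clipping range is the calibration knob."""
    scale = clip / (2 ** (num_bits - 1) - 1)
    q = np.clip(np.round(x / scale), -2 ** (num_bits - 1), 2 ** (num_bits - 1) - 1)
    return q * scale

# Calibration: keep the clipping range that preserves the most information.
acts = np.random.randn(10_000)
best_clip = max(np.linspace(0.5, 4.0, 8),
                key=lambda c: mutual_information(acts, quantize(acts, clip=c)))
print(best_clip)
```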
1 code implementation • 4 Nov 2023 • Zhuoshi Pan, Yuguang Yao, Gaowen Liu, Bingquan Shen, H. Vicky Zhao, Ramana Rao Kompella, Sijia Liu
This is because the art necessitates modifications to the diffusion training and sampling procedures.
no code implementations • 2 Nov 2023 • Jen-Hao Cheng, Sheng-Yao Kuan, Hugo Latapie, Gaowen Liu, Jenq-Neng Hwang
CenterRadarNet achieves the state-of-the-art result on the K-Radar 3D object detection benchmark.
1 code implementation • ICCV 2023 • Yuzhang Shang, Bingxin Xu, Gaowen Liu, Ramana Kompella, Yan Yan
Inspired by this causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, which eliminates the reliance on data by approaching an equilibrium of causality-driven intervened distributions.
1 code implementation • 6 Sep 2023 • Sanjana Vijay Ganesh, Yanzhao Wu, Gaowen Liu, Ramana Kompella, Ling Liu
Object tracking is an important functionality of edge video analytics systems and services.
no code implementations • 15 Aug 2023 • Peihao Chen, Xinyu Sun, Hongyan Zhi, Runhao Zeng, Thomas H. Li, Gaowen Liu, Mingkui Tan, Chuang Gan
We study the task of zero-shot vision-and-language navigation (ZS-VLN), a practical yet challenging problem in which an agent learns to navigate following a path described by language instructions without requiring any path-instruction annotation data.
2 code implementations • CVPR 2024 • Ziheng Chen, Yue Song, Gaowen Liu, Ramana Rao Kompella, XiaoJun Wu, Nicu Sebe
Besides, our framework offers a novel intrinsic explanation for the most popular LogEig classifier in existing SPD networks.
1 code implementation • NeurIPS 2023 • Jinghan Jia, Jiancheng Liu, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, Sijia Liu
We show in both theory and practice that model sparsity can boost the multi-criteria unlearning performance of an approximate unlearner, closing the approximation gap, while continuing to be efficient.
no code implementations • 27 Jan 2023 • Bin Duan, Keshav Bhandari, Gaowen Liu, Yan Yan
Moreover, we present a novel Siamese representation Learning framework for Omnidirectional Flow (SLOF) estimation, which is trained in a contrastive manner via a hybrid loss that combines siamese contrastive and optical flow losses.
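A hedged sketch of such a hybrid objective, combining an InfoNCE-style Siamese contrastive term over two augmented views with a simple end-point-error flow term; the exact losses and weighting used in SLOF are not reproduced here.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(z1, z2, flow_pred, flow_gt, temperature: float = 0.1, lam: float = 0.5):
    """z1, z2: (B, D) embeddings of two augmented views of the same clip.
    flow_pred, flow_gt: (B, 2, H, W) optical flow fields."""
    # Siamese contrastive term: matching views in the batch are positives.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                  # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    contrastive = F.cross_entropy(logits, targets)

    # Optical-flow term: average end-point error against the target flow.
    epe = torch.norm(flow_pred - flow_gt, dim=1).mean()

    return lam * contrastive + epe
```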
1 code implementation • 15 Jan 2023 • Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, Ling Liu
Instead of having every sample go through all DNN layers during prediction, EENet learns an early-exit scheduler that can intelligently terminate inference early for predictions in which the model already has high confidence.
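The general early-exit mechanism the sentence describes can be sketched as below: run the backbone stage by stage and stop at the first exit head whose confidence clears its threshold. The modules and the simple max-softmax confidence rule are illustrative assumptions, not EENet's learned scheduler.

```python
import torch

@torch.no_grad()
def early_exit_predict(blocks, exit_heads, thresholds, x):
    """blocks: list of backbone stages; exit_heads: one classifier per stage;
    thresholds: per-exit confidence thresholds. Assumes batch size 1 and
    4D conv features. Returns (logits, index_of_exit_taken)."""
    h = x
    for i, (block, head, tau) in enumerate(zip(blocks, exit_heads, thresholds)):
        h = block(h)
        logits = head(h.mean(dim=(2, 3)))               # global-average-pooled features
        confidence = logits.softmax(dim=-1).max().item()
        if confidence >= tau:                           # confident enough: exit here
            return logits, i
    return logits, len(blocks) - 1                      # fell through to the last exit
```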
no code implementations • 7 Aug 2022 • Keshav Bhandari, Bin Duan, Gaowen Liu, Hugo Latapie, Ziliang Zong, Yan Yan
Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature.
no code implementations • 7 Jul 2021 • Gaowen Liu, Hao Tang, Hugo Latapie, Jason Corso, Yan Yan
In particular, we propose a novel Bi-directional Spatial Temporal Attention Fusion Generative Adversarial Network (STA-GAN) to learn both spatial and temporal information to generate egocentric video sequences from the exocentric view.
no code implementations • 11 Feb 2021 • Hugo Latapie, Ozkan Kilic, Gaowen Liu, Yan Yan, Ramana Kompella, Pei Wang, Kristinn R. Thorisson, Adam Lawrence, Yuhong Sun, Jayanth Srinivasa
This paper introduces a new metamodel-based knowledge representation that significantly improves autonomous learning and adaptation.
no code implementations • 8 Feb 2020 • Gaowen Liu, Hao Tang, Hugo Latapie, Yan Yan
In this paper, we investigate exocentric (third-person) view to egocentric (first-person) view image generation.
1 code implementation • 2 Aug 2019 • Hao Tang, Dan Xu, Gaowen Liu, Wei Wang, Nicu Sebe, Yan Yan
In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation.