no code implementations • 10 Dec 2024 • Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, BoWen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang
The field of video generation has made remarkable advancements, yet there remains a pressing need for a clear, systematic recipe that can guide the development of robust and scalable models.
1 code implementation • 31 Oct 2024 • Qiming Wu, Xiaohan Chen, Yifan Jiang, Zhangyang Wang
Besides, we also extend LIP to compressive sensing image reconstruction, where a pre-trained GAN generator is used as the prior (in contrast to untrained DIP or deep decoder), and confirm its validity in this setting too.
no code implementations • 14 Oct 2024 • Dejia Xu, Yifan Jiang, Chen Huang, Liangchen Song, Thorsten Gernoth, Liangliang Cao, Zhangyang Wang, Hao Tang
Recent studies have attempted to incorporate camera control into the generation process, but their results are often limited to simple trajectories or lack the ability to generate consistent videos from multiple distinct camera paths for the same scene.
1 code implementation • 26 Sep 2024 • Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
Our experiments reveal that all LLMs are vulnerable to RED QUEEN ATTACK, reaching 87. 62% attack success rate on GPT-4o and 75. 4% on Llama3-70B.
1 code implementation • 6 Sep 2024 • Koen Kraaijveld, Yifan Jiang, Kaixin Ma, Filip Ilievski
Then, we develop COLUMBUS, a synthetic benchmark that applies the task pipeline to create QA sets with text and icon rebus puzzles based on publicly available collections of compounds and common phrases.
no code implementations • 30 Aug 2024 • Winston Chen, Yifan Jiang, William Stafford Noble, Yang Young Lu
We further address the challenges of using off-the-shelf interaction importance measures by proposing a calibration procedure that refines these measures to maintain the desired FDR.
no code implementations • 17 Jul 2024 • Yifan Jiang, Qingqing Wu, Wen Chen, Hongxun Hui
Specifically, our approach designs the UAV trajectory by solving two subproblems: the candidate trajectory optimization problem and the energy-aware backup trajectory optimization problem.
no code implementations • 9 Jun 2024 • Ruiqi Liu, Shuang Zheng, Qingqing Wu, Yifan Jiang, Nan Zhang, Yuanwei Liu, Marco Di Renzo, and George C. Alexandropoulos
Reconfigurable Intelligent Surfaces (RISs) are a novel form of ultra-low power devices that are capable to increase the communication data rates as well as the cell coverage in a cost- and energy-efficient way.
no code implementations • 22 Apr 2024 • Yifan Jiang, Filip Ilievski, Kaixin Ma
In this paper, we split the original benchmark to also support fine-tuning setting and present SemEval Task 9: BRAIN-TEASER(S), the first task at this competition designed to test the system's reasoning and lateral thinking ability.
1 code implementation • 21 Apr 2024 • Yifan Jiang, Jiarui Zhang, Kexuan Sun, Zhivar Sourati, Kian Ahrabian, Kaixin Ma, Filip Ilievski, Jay Pujara
Further analysis of perception questions reveals that MLLMs struggle to comprehend the visual features (near-random performance) and even count the panels in the puzzle ( <45%), hindering their ability for abstract reasoning.
no code implementations • 7 Mar 2024 • Zezheng Feng, Yifan Jiang, Hongjun Wang, Zipei Fan, Yuxin Ma, Shuang-Hua Yang, Huamin Qu, Xuan Song
Recent achievements in deep learning (DL) have shown its potential for predicting traffic flows.
no code implementations • 22 Jan 2024 • Kian Ahrabian, Zhivar Sourati, Kexuan Sun, Jiarui Zhang, Yifan Jiang, Fred Morstatter, Jay Pujara
While large language models (LLMs) are still being adopted to new domains and utilized in novel applications, we are experiencing an influx of the new generation of foundation models, namely multi-modal large language models (MLLMs).
no code implementations • 8 Jan 2024 • Yifan Jiang, Qingqing Wu, Wen Chen, Kaitao Meng
Specifically, a weighted sum of predicted posterior Cram\'er-Rao bound (PCRB) for object relative position and velocity estimation is minimized by optimizing the UAV trajectory, where an efficient solution is obtained based on the successive convex approximation method.
no code implementations • 4 Jan 2024 • Elia Peruzzo, Vidit Goel, Dejia Xu, Xingqian Xu, Yifan Jiang, Zhangyang Wang, Humphrey Shi, Nicu Sebe
Recently, several works tackled the video editing task fostered by the success of large-scale text-to-image generative models.
1 code implementation • CVPR 2024 • Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
We propose PAIR Diffusion a generic framework that enables a diffusion model to control the structure and appearance properties of each object in the image.
no code implementations • 13 Dec 2023 • Liangchen Song, Liangliang Cao, Jiatao Gu, Yifan Jiang, Junsong Yuan, Hao Tang
In this work, we propose that by incorporating correspondence regularization into diffusion models, the process of 3D editing can be significantly accelerated.
1 code implementation • 1 Dec 2023 • Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang
Novel view synthesis from limited observations remains an important and persistent task.
2 code implementations • 8 Oct 2023 • Yifan Jiang, Filip Ilievski, Kaixin Ma, Zhivar Sourati
The success of language models has inspired the NLP community to attend to tasks that require implicit and complex reasoning, relying on human-like commonsense mechanisms.
no code implementations • 5 Oct 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Hanwen Jiang, Dejia Xu, Zehao Zhu, Dilin Wang, Zhangyang Wang
To address this challenge, we introduce PF-GRT, a new Pose-Free framework for Generalizable Rendering Transformer, eliminating the need for pre-computed camera poses and instead leveraging feature-matching learned directly from data.
no code implementations • 4 Oct 2023 • Yifan Jiang, Hao Tang, Jen-Hao Rick Chang, Liangchen Song, Zhangyang Wang, Liangliang Cao
Although the fidelity and generalizability are greatly improved, training such a powerful diffusion model requires a vast volume of training data and model parameters, resulting in a notoriously long time and high computational costs.
no code implementations • 2 Oct 2023 • Zhivar Sourati, Filip Ilievski, Pia Sommerauer, Yifan Jiang
However, while cognitive theories of analogy often focus on narratives and study the distinction between surface, relational, and system similarities, existing work in natural language processing has a narrower focus as far as relational analogies between word pairs.
1 code implementation • 28 Aug 2023 • Prateek Chhikara, Dhiraj Chaurasia, Yifan Jiang, Omkar Masur, Filip Ilievski
Food computing has emerged as a prominent multidisciplinary field of research in recent years.
1 code implementation • NeurIPS 2023 • Xingjian Bai, Guangyi He, Yifan Jiang, Jan Obloj
To evaluate the distributional robustness of neural networks, we propose a first-order AA algorithm and its multi-step version.
1 code implementation • 25 May 2023 • Zhiwen Fan, Panwang Pan, Peihao Wang, Yifan Jiang, Dejia Xu, Hanwen Jiang, Zhangyang Wang
To mitigate this issue, we propose a general paradigm for object pose estimation, called Promptable Object Pose Estimation (POPE).
1 code implementation • 16 May 2023 • Yifan Jiang, Shane Steinert-Threlkeld
Feature attribution aims to explain the reasoning behind a black-box model's prediction by identifying the impact of each feature on the prediction.
1 code implementation • NeurIPS 2023 • Zhendong Wang, Yifan Jiang, Yadong Lu, Yelong Shen, Pengcheng He, Weizhu Chen, Zhangyang Wang, Mingyuan Zhou
We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models.
1 code implementation • 26 Apr 2023 • Yifan Jiang, Filip Ilievski, Kaixin Ma
Stories about everyday situations are an essential part of human communication, motivating the need to develop AI agents that can reliably understand these stories.
1 code implementation • NeurIPS 2023 • Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou
Patch Diffusion meanwhile improves the performance of diffusion models trained on relatively small datasets, $e. g.$, as few as 5, 000 images to train from scratch.
1 code implementation • 30 Mar 2023 • Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi
We propose PAIR Diffusion, a generic framework that can enable a diffusion model to control the structure and appearance properties of each object in the image.
no code implementations • 26 Feb 2023 • Yifan Jiang, Han Chen, Hanseok Ko
In this paper, we introduce a novel data augmentation method for skeleton-based action recognition tasks, which can effectively generate high-quality and diverse sequential actions.
no code implementations • CVPR 2023 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang
In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360deg views that corresponds well with the given reference image.
1 code implementation • 29 Nov 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang
In this work, we study the challenging task of lifting a single image to a 3D object and, for the first time, demonstrate the ability to generate a plausible 3D object with 360{\deg} views that correspond well with the given reference image.
no code implementations • CVPR 2023 • Yifan Jiang, Peter Hedman, Ben Mildenhall, Dejia Xu, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue
Neural Radiance Fields (NeRFs) are a powerful representation for modeling a 3D scene as a continuous function.
no code implementations • 17 Oct 2022 • Dejia Xu, Peihao Wang, Yifan Jiang, Zhiwen Fan, Zhangyang Wang
We answer this question by proposing an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
no code implementations • 10 Oct 2022 • Han Chen, Yifan Jiang, Hanseok Ko
Graph convolutional networks (GCNs), which can model the human body skeletons as spatial and temporal graphs, have shown remarkable potential in skeleton-based action recognition.
1 code implementation • 19 Sep 2022 • Zhiwen Fan, Peihao Wang, Yifan Jiang, Xinyu Gong, Dejia Xu, Zhangyang Wang
Our framework, called NeRF with Self-supervised Object Segmentation NeRF-SOS, couples object segmentation and neural radiance field to segment objects in any view within a scene.
1 code implementation • 11 Sep 2022 • Pangbo Ban, Yifan Jiang, Tianran Liu, Shane Steinert-Threlkeld
To what extent do pre-trained language models grasp semantic knowledge regarding the phenomenon of distributivity?
1 code implementation • 27 Apr 2022 • Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang
To study the motion features in the latent space of StyleGAN, in this paper, we hypothesize and demonstrate that a series of meaningful, natural, and versatile small, local movements (referred to as "micromotion", such as expression, head movement, and aging effect) can be represented in low-rank spaces extracted from the latent space of a conventionally pre-trained StyleGAN-v2 model for face generation, with the guidance of proper "anchors" in the form of either short text or video clips.
1 code implementation • 5 Apr 2022 • Zhiwen Fan, Yifan Jiang, Peihao Wang, Xinyu Gong, Dejia Xu, Zhangyang Wang
Representing visual signals by implicit representation (e. g., a coordinate based deep network) has prevailed among many vision tasks.
1 code implementation • 2 Apr 2022 • Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense covers largely prohibits its wider applications.
no code implementations • 11 Mar 2022 • Yifan Jiang, Zezheng Feng, Hongjun Wang, Zipei Fan, Xuan Song
TrafPS consists of three layers, from data process to results computation and visualization.
no code implementations • 17 Jan 2022 • Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang
To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.
no code implementations • 2 Jan 2022 • Yifan Jiang, Bartlomiej Wronski, Ben Mildenhall, Jonathan T. Barron, Zhangyang Wang, Tianfan Xue
These spatially-varying kernels are produced by an efficient predictor network running on a downsampled input, making them much more efficient to compute than per-pixel kernels produced by a full-resolution image, and also enlarging the network's receptive field compared with static kernels.
no code implementations • 9 Dec 2021 • Yifan Jiang, Xinyu Gong, Junru Wu, Humphrey Shi, Zhicheng Yan, Zhangyang Wang
Efficient video architecture is the key to deploying video recognition systems on devices with limited computing resources.
no code implementations • 19 Nov 2021 • Han Chen, Yifan Jiang, Hanseok Ko
Due to the fast processing-speed and robustness it can achieve, skeleton-based action recognition has recently received the attention of the computer vision community.
no code implementations • 13 Oct 2021 • Han Chen, Yifan Jiang, Hanseok Ko, Murray Loew
Automatic segmentation of infected regions in computed tomography (CT) images is necessary for the initial diagnosis of COVID-19.
no code implementations • 29 Sep 2021 • Qiming Wu, Xiaohan Chen, Yifan Jiang, Pan Zhou, Zhangyang Wang
Drawing inspirations from the recently prosperous research on lottery ticket hypothesis (LTH), we conjecture and study a novel “lottery image prior” (LIP), stated as: given an (untrained or trained) DNN-based image prior, it will have a sparse subnetwork that can be training in isolation, to match the original DNN’s performance when being applied as a prior to various image inverse problems.
1 code implementation • ICCV 2021 • Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang
Image harmonization aims to improve the quality of image compositing by matching the "appearance" (\eg, color tone, brightness and contrast) between foreground and background images.
1 code implementation • 1 Aug 2021 • Zeyuan Chen, Yifan Jiang, Dong Liu, Zhangyang Wang
We present \underline{C}oordinated \underline{E}nhancement for \underline{R}eal-world \underline{L}ow-light Noisy Images (CERL), that seamlessly integrates light enhancement and noise suppression parts into a unified and physics-grounded optimization framework.
no code implementations • NeurIPS 2021 • Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva
The self-attention-based model, transformer, is recently becoming the leading backbone in the field of computer vision.
Ranked #29 on Efficient ViTs on ImageNet-1K (with DeiT-S)
1 code implementation • 22 Apr 2021 • Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li, Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Celine Lin
The promise of Deep Neural Network (DNN) powered Internet of Thing (IoT) devices has motivated a tremendous demand for automated solutions to enable fast development and deployment of efficient (1) DNNs equipped with instantaneous accuracy-efficiency trade-off capability to accommodate the time-varying resources at IoT devices and (2) dataflows to optimize DNNs' execution efficiency on different devices.
no code implementations • ICLR 2021 • Tianjian Meng, Xiaohan Chen, Yifan Jiang, Zhangyang Wang
Unrolling is believed to incorporate the model-based prior with the learning capacity of deep learning.
10 code implementations • NeurIPS 2021 • Yifan Jiang, Shiyu Chang, Zhangyang Wang
Our vanilla GAN architecture, dubbed TransGAN, consists of a memory-friendly transformer-based generator that progressively increases feature resolution, and correspondingly a multi-scale discriminator to capture simultaneously semantic contexts and low-level textures.
Ranked #14 on Image Generation on STL-10
no code implementations • 1 Feb 2021 • Yifan Jiang, Han Chen, David K. Han, Hanseok Ko
To compensate for the sparseness of labeled data, the proposed method utilizes a large amount of synthetic COVID-19 CT images and adjusts the networks from the source domain (synthetic data) to the target domain (real data) with a cross-domain training mechanism.
no code implementations • 23 Nov 2020 • Han Chen, Yifan Jiang, Murray Loew, Hanseok Ko
In this paper, we propose an unsupervised domain adaptation based segmentation network to improve the segmentation performance of the infection areas in COVID-19 CT images.
no code implementations • 16 Aug 2020 • Xinyu Gong, Wuyang Chen, Yifan Jiang, Ye Yuan, Xian-Ming Liu, Qian Zhang, Yuan Li, Zhangyang Wang
Such simplification limits the fusion of information at different scales and fails to maintain high-resolution representations.
no code implementations • 29 Jul 2020 • Yifan Jiang, Han Chen, Murray Loew, Hanseok Ko
However, training a deep-learning model requires large volumes of data, and medical staff faces a high risk when collecting COVID-19 CT data due to the high infectivity of the disease.
2 code implementations • ICCV 2019 • Xinyu Gong, Shiyu Chang, Yifan Jiang, Zhangyang Wang
Neural architecture search (NAS) has witnessed prevailing success in image classification and (very recently) segmentation tasks.
Ranked #22 on Image Generation on STL-10
8 code implementations • 17 Jun 2019 • Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, Zhangyang Wang
Deep learning-based methods have achieved remarkable success in image restoration and enhancement, but are they still competitive when there is a lack of paired training data?
no code implementations • ICLR 2019 • Ningyuan Zheng, Yifan Jiang, Dingjiang Huang
In this paper we try to address the discrete nature of software environment with an intermediate, differentiable simulation.
2 code implementations • 5 Apr 2018 • Xi Ouyang, Yu Cheng, Yifan Jiang, Chun-Liang Li, Pan Zhou
The results show that our framework can smoothly synthesize pedestrians on background images of variations and different levels of details.
Ranked #2 on Scene Text Recognition on MSDA