1 code implementation • 18 Dec 2024 • Zhuo Cao, Bingqing Zhang, Heming Du, Xin Yu, Xue Li, Sen Wang
For short-moment retrieval, FlashVTG increases mAP to 125% of previous SOTA performance.
Ranked #1 on Highlight Detection on TvSum
1 code implementation • 22 Nov 2024 • Xin Yu, Ze Yuan, Yuan-Chen Guo, Ying-Tian Liu, Jianhui Liu, Yangguang Li, Yan-Pei Cao, Ding Liang, Xiaojuan Qi
Instead, we focus on the fundamental problem of learning in the UV texture space itself.
no code implementations • 15 Nov 2024 • Peter St. John, Dejun Lin, Polina Binder, Malcolm Greaves, Vega Shah, John St. John, Adrian Lange, Patrick Hsu, Rajesh Illango, Arvind Ramanathan, Anima Anandkumar, David H Brookes, Akosua Busia, Abhishaike Mahajan, Stephen Malina, Neha Prasad, Sam Sinai, Lindsay Edwards, Thomas Gaudelet, Cristian Regep, Martin Steinegger, Burkhard Rost, Alexander Brace, Kyle Hippe, Luca Naef, Keisuke Kamata, George Armstrong, Kevin Boyd, Zhonglin Cao, Han-Yi Chou, Simon Chu, Allan dos Santos Costa, Sajad Darabi, Eric Dawson, Kieran Didi, Cong Fu, Mario Geiger, Michelle Gill, Darren Hsu, Gagan Kaushik, Maria Korshunova, Steven Kothen-Hill, Youhan Lee, Meng Liu, Micha Livne, Zachary McClure, Jonathan Mitchell, Alireza Moradzadeh, Ohad Mosafi, Youssef Nashed, Yuxing Peng, Sara Rabhi, Farhad Ramezanghorbani, Danny Reidenbach, Camir Ricketts, Brian Roland, Kushal Shah, Tyler Shimko, Hassan Sirelkhatim, Savitha Srinivasan, Abraham C Stern, Dorota Toczydlowska, Srimukh Prasad Veccham, Niccolò Alberto Elia Venanzi, Anton Vorontsov, Jared Wilber, Isabel Wilkinson, Wei Jing Wong, Eva Xue, Cory Ye, Xin Yu, Yang Zhang, Guoqing Zhou, Becca Zandstein, Christian Dallago, Bruno Trentini, Emine Kucukbenli, Saee Paliwal, Timur Rvachov, Eddie Calleja, Johnny Israeli, Harry Clifford, Risto Haukioja, Nicholas Haemel, Kyle Tretina, Neha Tadimeti, Anthony B Costa
We introduce the BioNeMo Framework to facilitate the training of computational biology and chemistry AI models across hundreds of GPUs.
1 code implementation • 9 Nov 2024 • Fatemeh Shiri, Xiao-Yu Guo, Mona Golestan Far, Xin Yu, Gholamreza Haffari, Yuan-Fang Li
We believe our benchmark dataset and in-depth analyses can spark further research on LMMs spatial reasoning.
no code implementations • 25 Oct 2024 • Xin Shen, Heming Du, Hongwei Sheng, Shuyun Wang, Hui Chen, Huiqiang Chen, Zhuojie Wu, Xiaobiao Du, Jiaying Ying, Ruihan Lu, Qingzheng Xu, Xin Yu
Experiment results indicate that MM-WLAuslan is a challenging ISLR dataset, and we hope this dataset will contribute to the development of Auslan and the advancement of sign languages worldwide.
1 code implementation • 25 Oct 2024 • Xin Shen, Lei Shen, Shaozu Yuan, Heming Du, Haiyang Sun, Xin Yu
In this work, we introduce a Diverse Sign Language Translation (DivSLT) task, aiming to generate diverse yet accurate translations for sign language videos.
1 code implementation • 11 Oct 2024 • Xin Yu, Zelin He, Ying Sun, Lingzhou Xue, Runze Li
FedProx is a simple yet effective federated learning method that enables model personalization via regularization.
1 code implementation • 2 Oct 2024 • Xiaobiao Du, Yida Wang, Xin Yu
3) Built on top of our multi-view regulated training, we further propose a cross-ray densification strategy, densifying more Gaussian kernels in the ray-intersect regions from a selection of views.
Ranked #1 on Novel View Synthesis on Mip-NeRF 360
no code implementations • 30 Sep 2024 • Bingqing Zhang, Zhuo Cao, Heming Du, Xin Yu, Xue Li, Jiajun Liu, Sen Wang
Text-Video Retrieval (TVR) methods typically match query-candidate pairs by aligning text and video features in coarse-grained, fine-grained, or combined (coarse-to-fine) manners.
no code implementations • 22 Sep 2024 • Huafeng Qin, Hongyu Zhu, Xin Jin, Xin Yu, Mounim A. El-Yacoubi, Shuqiang Yang
First, we define a supernet and propose a global and local alternate Neural Architecture Search method to search the optimal architecture alternately with a differentiable neural architecture search.
no code implementations • 20 Sep 2024 • Feng Qiu, Wei zhang, Chen Liu, Rudong An, Lincheng Li, Yu Ding, Changjie Fan, Zhipeng Hu, Xin Yu
In the facial animation transfer component, we propose a novel Expression-driven Multi-avatar Animator, which first maps expressive semantics to the facial control parameters of 3D avatars and then imposes perceptual constraints between the input and output images to maintain expression consistency.
no code implementations • 14 Sep 2024 • Suzhen Wang, Yifeng Ma, Yu Ding, Zhipeng Hu, Changjie Fan, Tangjie Lv, Zhidong Deng, Xin Yu
To address this challenge, we propose a one-shot style-controllable talking face generation method that can obtain speaking styles from reference speaking videos and drive the one-shot portrait to speak with the reference speaking styles and another piece of audio.
1 code implementation • 13 Sep 2024 • Zhi Chen, Tianqi Wei, Zecheng Zhao, Jia Syuen Lim, Yadan Luo, Hu Zhang, Xin Yu, Scott Chapman, Zi Huang
In modern agriculture, precise monitoring of plants and fruits is crucial for tasks such as high-throughput phenotyping and automated harvesting.
1 code implementation • 6 Sep 2024 • Tianqi Wei, Zhi Chen, Xin Yu, Scott Chapman, Paul Melloy, Zi Huang
(2) Image source: Unlike typical datasets that contain images from laboratory settings, PlantSeg primarily comprises in-the-wild plant disease images.
no code implementations • 27 Aug 2024 • Tianqi Wei, Zhi Chen, Xin Yu
Plant disease recognition is a critical task that ensures crop health and mitigates the damage caused by diseases.
no code implementations • 6 Aug 2024 • Tianqi Wei, Zhi Chen, Zi Huang, Xin Yu
Motivated by this observation, we propose an in-the-wild multimodal plant disease recognition dataset that contains the largest number of disease classes but also text-based descriptions for each disease.
no code implementations • 24 Jul 2024 • Chen Liu, Wei zhang, Feng Qiu, Lincheng Li, Xin Yu
To advance this, the 7th Affective Behavior Analysis in-the-wild (ABAW) competition establishes two tracks: i. e., the Multi-task Learning (MTL) Challenge and the Compound Expression (CE) challenge based on Aff-Wild2 and C-EXPR-DB datasets.
1 code implementation • 24 Jul 2024 • Xiaobiao Du, Haiyang Sun, Ming Lu, Tianqing Zhu, Xin Yu
With this dataset, we make the generative model more robust to cars.
no code implementations • 11 Jul 2024 • Pu Feng, Junkang Liang, Size Wang, Xin Yu, Xin Ji, Yiting Chen, Kui Zhang, Rongye Shi, Wenjun Wu
This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution.
no code implementations • 21 Jun 2024 • Jia Syuen Lim, Zhuoxiao Chen, Mahsa Baktashmotlagh, Zhi Chen, Xin Yu, Zi Huang, Yadan Luo
We demonstrate the effectiveness of DiPEx through extensive class-agnostic OD and OOD-OD experiments on MS-COCO and LVIS, surpassing other prompting methods by up to 20. 1% in AR and achieving a 21. 3% AP improvement over SAM.
no code implementations • 18 Jun 2024 • Xin Yu, Qi Yang, Han Liu, Ho Hin Lee, Yucheng Tang, Lucas W. Remedios, Michael E. Kim, Rendong Zhang, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure.
no code implementations • 18 Jun 2024 • Shuting Wang, Xin Yu, Mang Wang, WeiPeng Chen, Yutao Zhu, Zhicheng Dou
These ranked documents sufficiently cover various query aspects and are aware of the generator's preferences, hence incentivizing it to produce rich and comprehensive responses for users.
no code implementations • 7 Jun 2024 • Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu
(1) \textbf{High-Volume}: 2, 500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) \textbf{High-Quality}: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) \textbf{High-Diversity}: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark.
no code implementations • 22 May 2024 • Senmao Tian, Haoyu Gao, Gangyi Hong, Shuyun Wang, JingJie Wang, Xin Yu, Shunli Zhang
Minor variations in silhouette sequences can be diminished in the network's intermediate layers due to the accumulation of quantization errors.
no code implementations • 1 May 2024 • Yuran Zhu, Guanhua Wang, Yuning Gu, Walter Zhao, Jiahao Lu, Junqing Zhu, Christina J. MacAskill, Andrew Dupuis, Mark A. Griswold, Dan Ma, Chris A. Flask, Xin Yu
We present the first dynamic and multi-parametric approach for quantitatively tracking contrast agent transport in the mouse brain using 3D MRF.
1 code implementation • 21 Apr 2024 • Huiqiang Chen, Tianqing Zhu, Xin Yu, Wanlei Zhou
Current research centres on efficient unlearning to erase the influence of data from the model and neglects the subsequent impacts on the remaining data.
no code implementations • 16 Mar 2024 • Wei zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu
Affective Behavior Analysis aims to facilitate technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do.
no code implementations • 11 Jan 2024 • Peng Dai, Feitong Tan, Xin Yu, Yifan Peng, yinda zhang, Xiaojuan Qi
Virtual environments (VEs) are pivotal for virtual, augmented, and mixed reality systems.
no code implementations • 5 Jan 2024 • Ho Hin Lee, Adam M. Saunders, Michael E. Kim, Samuel W. Remedios, Lucas W. Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A. Mawn, Tonia S. Rex, Kevin L. Schey, Blake E. Dewey, Jeffrey M. Spraggins, Jerry L. Prince, Yuankai Huo, Bennett A. Landman
These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference.
no code implementations • CVPR 2024 • Zhipeng Hu, Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Changjie Fan, Xiaowei Zhou, Xin Yu
This limitation leads to the Janus problem where multi-faced 3D models are generated under the guidance of such diffusion models.
no code implementations • CVPR 2024 • Chen Liu, Peike Patrick Li, Qingtao Yu, Hongwei Sheng, Dadong Wang, Lincheng Li, Xin Yu
Considering that pixel-level annotations are difficult to achieve in some complex scenes we also provide the bounding boxes to indicate the sounding regions.
1 code implementation • CVPR 2024 • Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu
In the editing stage we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts.
no code implementations • 15 Dec 2023 • Tianhao Peng, Wenjun Wu, Haitao Yuan, Zhifeng Bao, Zhao Pengrui, Xin Yu, Xuetao Lin, Yu Liang, Yanjun Pu
To address this limitation, this paper presents GraphRARE, a general framework built upon node relative entropy and deep reinforcement learning, to strengthen the expressive capability of GNNs.
Ranked #3 on Node Classification on Cornell
no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
Recent advancements in text-to-3D generation technology have significantly advanced the conversion of textual descriptions into imaginative well-geometrical and finely textured 3D objects.
no code implementations • 12 Dec 2023 • Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu
OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.
1 code implementation • 4 Dec 2023 • Yiyun Zhang, Zijian Wang, Yadan Luo, Xin Yu, Zi Huang
Existing Building Damage Detection (BDD) methods always require labour-intensive pixel-level annotations of buildings and their conditions, hence largely limiting their applications.
1 code implementation • 1 Dec 2023 • Yunjie Wu, Yapeng Meng, Zhipeng Hu, Lincheng Li, Haoqian Wu, Kun Zhou, Weiwei Xu, Xin Yu
In the editing stage, we first employ a pre-trained diffusion model to update facial geometry or texture based on the texts.
1 code implementation • 8 Nov 2023 • Shikai Fang, Xin Yu, Zheng Wang, Shibo Li, Mike Kirby, Shandian Zhe
To generalize Tucker decomposition to such scenarios, we propose Functional Bayesian Tucker Decomposition (FunBaT).
no code implementations • 30 Oct 2023 • Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi
In this paper, we re-evaluate the role of classifier-free guidance in score distillation and discover a surprising finding: the guidance alone is enough for effective text-to-3D generation tasks.
1 code implementation • 16 Oct 2023 • Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang
This paper investigates a more practical and challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aimed at acquiring informative point clouds with new concepts.
no code implementations • 15 Oct 2023 • Hongyu Fu, Xin Yu, Lincheng Li, Li Zhang
Existing volumetric neural rendering techniques, such as Neural Radiance Fields (NeRF), face limitations in synthesizing high-quality novel views when the camera poses of input images are imperfect.
no code implementations • 9 Oct 2023 • Hu Zhang, Xin Shen, Heming Du, Huiqiang Chen, Chen Liu, Hongwei Sheng, Qingzheng Xu, MD Wahiduzzaman Khan, Qingtao Yu, Tianqing Zhu, Scott Chapman, Zi Huang, Xin Yu
In the wheat nutrient deficiencies classification challenge, we present the DividE and EnseMble (DEEM) method for progressive test data predictions.
1 code implementation • 30 Sep 2023 • Ho Hin Lee, Quan Liu, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman
We hypothesize that deformable convolution can be an exploratory alternative to combine all advantages from the previous operators, providing long-range dependency, adaptive spatial aggregation and computational efficiency as a foundation backbone.
1 code implementation • 29 Sep 2023 • Shibo Li, Xin Yu, Wei Xing, Mike Kirby, Akil Narayan, Shandian Zhe
To overcome this problem, we propose Multi-Resolution Active learning of FNO (MRA-FNO), which can dynamically select the input functions and resolutions to lower the data cost as much as possible while optimizing the learning efficiency.
2 code implementations • 19 Sep 2023 • Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan, Fan Yang, Fei Deng, Feng Wang, Feng Liu, Guangwei Ai, Guosheng Dong, Haizhou Zhao, Hang Xu, Haoze Sun, Hongda Zhang, Hui Liu, Jiaming Ji, Jian Xie, Juntao Dai, Kun Fang, Lei Su, Liang Song, Lifeng Liu, Liyun Ru, Luyao Ma, Mang Wang, Mickel Liu, MingAn Lin, Nuolan Nie, Peidong Guo, Ruiyang Sun, Tao Zhang, Tianpeng Li, Tianyu Li, Wei Cheng, WeiPeng Chen, Xiangrong Zeng, Xiaochuan Wang, Xiaoxi Chen, Xin Men, Xin Yu, Xuehai Pan, Yanjun Shen, Yiding Wang, Yiyu Li, Youxin Jiang, Yuchen Gao, Yupeng Zhang, Zenan Zhou, Zhiying Wu
Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering.
1 code implementation • 17 Sep 2023 • Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, Leon Y. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
We further evaluate our method's capability to harmonize longitudinal positional variation on 1033 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset, which contains longitudinal single abdominal slices, and confirmed that our method can harmonize the slice positional variance in terms of visceral fat area.
1 code implementation • 8 Sep 2023 • Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Shunxing Bao, Yuankai Huo, Bennett A. Landman
Subsequently, the model is finetuned with 45 T1w 3D volumes from Open Access Series Imaging Studies (OASIS) where both 133 whole brain classes and TICV/PFV labels are available.
no code implementations • 2 Sep 2023 • Qingtao Yu, Heming Du, Chen Liu, Xin Yu
CIP-WPIS leverages pretrained knowledge embedded in the 2D foundation model SAM and 3D geometric prior to achieve accurate point-wise instance labels from the bounding box annotations.
1 code implementation • 25 Aug 2023 • Zhipeng Hu, Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Changjie Fan, Xiaowei Zhou, Xin Yu
This limitation leads to the Janus problem, where multi-faced 3D models are generated under the guidance of such diffusion models.
1 code implementation • ICCV 2023 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi
In this work, we focus on synthesizing high-quality textures on 3D meshes.
no code implementations • 20 Aug 2023 • Chen Liu, Peike Li, Hu Zhang, Lincheng Li, Zi Huang, Dadong Wang, Xin Yu
In a nutshell, our BAVS is designed to eliminate the interference of background noise or off-screen sounds in segmentation by establishing the audio-visual correspondences in an explicit manner.
no code implementations • 31 Jul 2023 • Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu
However, we observed that prior arts are prone to segment a certain salient object in a video regardless of the audio information.
no code implementations • 30 Jul 2023 • Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Jie Luo, Wenjun Wu
In addition, the proposed framework is model-agnostic and can be applied to most of the current MARL algorithms.
no code implementations • 24 Jun 2023 • Shuai Zhou, Tianqing Zhu, Dayong Ye, Xin Yu, Wanlei Zhou
Hence, in this paper, we propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
1 code implementation • 19 Jun 2023 • Simin Li, Yifan Zhong, Jiarong Liu, Jianing Guo, Siyuan Qi, Ruixiao Xu, Xin Yu, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yujing Hu, Bo An, Xianglong Liu, Yaodong Yang
Finally, we generalize a unified template for MaxEnt algorithmic design named \emph{Maximum Entropy Heterogeneous-Agent Mirror Learning} (MEHAML), which provides any induced method with the same guarantees as HASAC.
Multi-agent Reinforcement Learning reinforcement-learning +2
no code implementations • 2 Jun 2023 • Yinchi Zhou, Ho Hin Lee, Yucheng Tang, Xin Yu, Qi Yang, Shunxing Bao, Jeffrey M. Spraggins, Yuankai Huo, Bennett A. Landman
Briefly, DEEDs affine and non-rigid registration are performed to transfer patient abdominal volumes to a fixed high-resolution atlas template.
1 code implementation • 30 May 2023 • Xingqun Qi, Chen Liu, Lincheng Li, Jie Hou, Haoran Xin, Xin Yu
In this work, we propose EmotionGesture, a novel framework for synthesizing vivid and diverse emotional co-speech 3D gestures from audio.
no code implementations • 10 May 2023 • Hongwei Sheng, Xin Yu, Feiyu Wang, MD Wahiduzzaman Khan, Hexuan Weng, Sahar Shariflou, S. Mojtaba Golzan
Both of the evaluations support its effectiveness in facilitating the observation of SVPs.
1 code implementation • CVPR 2023 • Peng Dai, yinda zhang, Xin Yu, Xiaoyang Lyu, Xiaojuan Qi
Rendering novel view images is highly desirable for many applications.
no code implementations • 1 Apr 2023 • Yifeng Ma, Suzhen Wang, Yu Ding, Bowen Ma, Tangjie Lv, Changjie Fan, Zhipeng Hu, Zhidong Deng, Xin Yu
Leveraging the proposed dataset, we introduce a CLIP-based style encoder that projects natural language-based descriptions to the representations of expressions.
2D Semantic Segmentation task 3 (25 classes) Talking Head Generation
1 code implementation • CVPR 2023 • Haoqian Wu, Zhipeng Hu, Lincheng Li, Yongqiang Zhang, Changjie Fan, Xin Yu
Inverse rendering methods aim to estimate geometry, materials and illumination from multi-view RGB images.
Ranked #2 on Surface Normals Estimation on Stanford-ORB
no code implementations • ICCV 2023 • Ming Wang, Xianda Guo, Beibei Lin, Tian Yang, Zheng Zhu, Lincheng Li, Shunli Zhang, Xin Yu
This is the first framework on gait recognition that is designed to focus on the extraction of dynamic features.
no code implementations • 23 Mar 2023 • Huajie Chen, Tianqing Zhu, Yuan Zhao, Bo Liu, Xin Yu, Wanlei Zhou
By avoiding high-frequency artifacts and manipulating the frequency distribution of the embedded feature map, LIDS achieves improved robustness against attacks that distort the high-frequency components of container images.
2 code implementations • 10 Mar 2023 • Ho Hin Lee, Quan Liu, Shunxing Bao, Qi Yang, Xin Yu, Leon Y. Cai, Thomas Li, Yuankai Huo, Xenofon Koutsoukos, Bennett A. Landman
We hypothesize that convolution with LK sizes is limited to maintain an optimal convergence for locality learning.
1 code implementation • CVPR 2023 • Xingqun Qi, Chen Liu, Muyi Sun, Lincheng Li, Changjie Fan, Xin Yu
Considering the asymmetric gestures and motions of two hands, we introduce a Spatial-Residual Memory (SRM) module to model spatial interaction between the body and each hand by residual learning.
no code implementations • 22 Feb 2023 • Shannan Guan, Xin Yu, Wei Huang, Gengfa Fang, Haiyan Lu
Our DMMG consists of a viewpoint variation min-max game and an edge perturbation min-max game.
1 code implementation • 7 Feb 2023 • Simin Li, Jun Guo, Jingqiao Xiu, Yuwei Zheng, Pu Feng, Xin Yu, Aishan Liu, Yaodong Yang, Bo An, Wenjun Wu, Xianglong Liu
To achieve maximum deviation in victim policies under complex agent-wise interactions, our unilateral attack aims to characterize and maximize the impact of the adversary on the victims.
1 code implementation • 23 Jan 2023 • Yadan Luo, Zhuoxiao Chen, Zijian Wang, Xin Yu, Zi Huang, Mahsa Baktashmotlagh
To alleviate the high annotation cost in LiDAR-based 3D object detection, active learning is a promising solution that learns to select only a small portion of unlabeled data to annotate, without compromising model performance.
no code implementations • 19 Jan 2023 • Junyang Cai, Khai-Nguyen Nguyen, Nishant Shrestha, Aidan Good, Ruisen Tu, Xin Yu, Shandian Zhe, Thiago Serra
One surprising trait of neural networks is the extent to which their connections can be pruned with little to no effect on accuracy.
1 code implementation • 3 Jan 2023 • Yifeng Ma, Suzhen Wang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Zhidong Deng, Xin Yu
In a nutshell, we aim to attain a speaking style from an arbitrary reference speaking video and then drive the one-shot portrait to speak with the reference speaking style and another piece of audio.
no code implementations • CVPR 2023 • Heming Du, Lincheng Li, Zi Huang, Xin Yu
In HiNL, we propose a History-aware State Estimation (HaSE) module to alleviate the impacts of dominant historical states on the current state estimation.
no code implementations • 6 Dec 2022 • Hao Zeng, Wei zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu
Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.
2 code implementations • 6 Dec 2022 • Wenbo Li, Xin Yu, Kun Zhou, Yibing Song, Zhe Lin, Jiaya Jia
To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of GANs with the prediction tractability of probabilistic models.
1 code implementation • 30 Nov 2022 • Qi Yang, Xin Yu, Ho Hin Lee, Leon Y. Cai, Kaiwen Xu, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Sokratis Makrogiannis, Luigi Ferrucci, Bennett A. Landman
The proposed pipeline is effective and robust in extracting muscle groups on 2D single slice CT thigh images. The container is available for public use at https://github. com/MASILab/DA_CT_muscle_seg
no code implementations • 15 Nov 2022 • Heming Du, Chen Liu, Ming Wang, Lincheng Li, Shunli Zhang, Xin Yu
We measure the uncertainty and predict the match status of the recognition results, and thus determine whether the probe is an OOG query. To the best of our knowledge, our method is the first attempt to tackle OOG queries in gait recognition.
no code implementations • 25 Oct 2022 • Zhipeng Hu, Wei zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu
We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities.
no code implementations • 23 Oct 2022 • Shibo Li, Jeff M. Phillips, Xin Yu, Robert M. Kirby, Shandian Zhe
However, this method only queries at one pair of fidelity and input at a time, and hence has a risk to bring in strongly correlated examples to reduce the learning efficiency.
1 code implementation • 16 Oct 2022 • Ho Hin Lee, Yucheng Tang, Han Liu, Yubo Fan, Leon Y. Cai, Qi Yang, Xin Yu, Shunxing Bao, Yuankai Huo, Bennett A. Landman
We evaluate our proposed approach on multi-organ segmentation with both non-contrast CT (NCCT) datasets and the MICCAI 2015 BTCV Challenge contrast-enhance CT (CECT) datasets.
1 code implementation • 14 Oct 2022 • Ruifei He, Shuyang Sun, Xin Yu, Chuhui Xue, Wenqing Zhang, Philip Torr, Song Bai, Xiaojuan Qi
Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images.
no code implementations • 13 Oct 2022 • Yuxin Mao, Zhexiong Wan, Yuchao Dai, Xin Yu
Single image blind deblurring is highly ill-posed as neither the latent sharp image nor the blur kernel is known.
1 code implementation • 29 Sep 2022 • Ping Liu, Xin Yu, Joey Tianyi Zhou
In this work, we first introduce a meta knowledge representation method that extracts meta knowledge from distributed clients.
1 code implementation • 28 Sep 2022 • Xin Yu, Qi Yang, Yucheng Tang, Riqiang Gao, Shunxing Bao, LeonY. Cai, Ho Hin Lee, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
External experiments on 20 subjects from the Baltimore Longitudinal Study of Aging (BLSA) dataset that contains longitudinal single abdominal slices validate that our method can harmonize the slice positional variance in terms of muscle and visceral fat area.
1 code implementation • 28 Sep 2022 • Xin Yu, Qi Yang, Yinchi Zhou, Leon Y. Cai, Riqiang Gao, Ho Hin Lee, Thomas Li, Shunxing Bao, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Zizhao Zhang, Yuankai Huo, Bennett A. Landman, Yucheng Tang
Transformer-based models, capable of learning better global dependencies, have recently demonstrated exceptional representation learning capabilities in computer vision and medical image analysis.
no code implementations • 28 Sep 2022 • Xin Yu, Yucheng Tang, Qi Yang, Ho Hin Lee, Riqiang Gao, Shunxing Bao, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman
Metabolic health is increasingly implicated as a risk factor across conditions from cardiology to neurology, and efficiency assessment of body composition is critical to quantitatively characterizing these relationships.
no code implementations • 7 Aug 2022 • Yujiao Shi, Xin Yu, Shan Wang, Hongdong Li
The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images.
1 code implementation • 5 Aug 2022 • Feng Zhu, Zongxin Yang, Xin Yu, Yi Yang, Yunchao Wei
In this work, we propose a new online VIS paradigm named Instance As Identity (IAI), which models temporal information for both detection and tracking in an efficient way.
2 code implementations • 2 Aug 2022 • Beibei Lin, Shunli Zhang, Ming Wang, Lincheng Li, Xin Yu
GFR extractor aims to extract contextual information, e. g., the relationship among various body parts, and the mask-based LFR extractor is presented to exploit the detailed posture changes of local regions.
1 code implementation • 20 Jul 2022 • Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Jiajun Shen, Jia Li, Xiaojuan Qi
With the rapid development of mobile devices, modern widely-used mobile phones typically allow users to capture 4K resolution (i. e., ultra-high-definition) images.
Ranked #1 on Image Restoration on UHDM
1 code implementation • 19 Jul 2022 • Haitian Zeng, Xin Yu, Jiaxu Miao, Yi Yang
We propose MHR-Net, a novel method for recovering Non-Rigid Shapes from Motion (NRSfM).
no code implementations • 16 Jun 2022 • Zhimin Li, Shusen Liu, Xin Yu, Kailkhura Bhavya, Jie Cao, Diffenderfer James Daniel, Peer-Timo Bremer, Valerio Pascucci
We decomposed and evaluated a set of critical geometric concepts from the common adopted classification loss, and used them to design a visualization system to compare and highlight the impact of pruning on model performance and feature representation.
no code implementations • 7 Jun 2022 • Aidan Good, Jiaqi Lin, Hannah Sieg, Mikey Ferguson, Xin Yu, Shandian Zhe, Jerzy Wieczorek, Thiago Serra
In this work, we study such relative distortions in recall by hypothesizing an intensification effect that is inherent to the model.
1 code implementation • ICLR 2021 • Hehe Fan, Xin Yu, Yuhang Ding, Yi Yang, Mohan Kankanhalli
Then, a spatial convolution is employed to capture the local structure of points in the 3D space, and a temporal convolution is used to model the dynamics of the spatial regions along the time dimension.
Ranked #3 on 3D Action Recognition on NTU RGB+D
no code implementations • 12 May 2022 • Ho Hin Lee, Yucheng Tang, Riqiang Gao, Qi Yang, Xin Yu, Shunxing Bao, James G. Terry, J. Jeffrey Carr, Yuankai Huo, Bennett A. Landman
In this paper, we propose a novel unsupervised approach that leverages pairwise contrast-enhanced CT (CECT) context to compute non-contrast segmentation without ground-truth label.
1 code implementation • CVPR 2022 • Peng Dai, Xin Yu, Lan Ma, Baoheng Zhang, Jia Li, Wenbo Li, Jiajun Shen, Xiaojuan Qi
Moire patterns, appearing as color distortions, severely degrade image and video qualities when filming a screen with digital cameras.
1 code implementation • 26 Mar 2022 • Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li
We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
1 code implementation • 9 Mar 2022 • Xin Yu, Thiago Serra, Srikumar Ramalingam, Shandian Zhe
We propose a tractable heuristic for solving the combinatorial extension of OBS, in which we select weights for simultaneous removal, as well as a systematic update of the remaining weights.
1 code implementation • 8 Mar 2022 • Ming Wang, Beibei Lin, Xianda Guo, Lincheng Li, Zheng Zhu, Jiande Sun, Shunli Zhang, Xin Yu
ECM consists of the Spatial-Temporal feature extractor (ST), the Frame-Level feature extractor (FL) and SPB, and has two obvious advantages: First, each branch focuses on a specific representation, which can be used to improve the robustness of the network.
no code implementations • 8 Mar 2022 • Chuanfu Shen, Beibei Lin, Shunli Zhang, George Q. Huang, Shiqi Yu, Xin Yu
Also, we design an Inception-like ReverseMask Block, which has three branches composed of a global branch, a feature dropping branch, and a feature scaling branch.
Ranked #4 on Gait Recognition on OUMVLP
no code implementations • 4 Mar 2022 • Xin Yu, Yucheng Tang, Yinchi Zhou, Riqiang Gao, Qi Yang, Ho Hin Lee, Thomas Li, Shunxing Bao, Yuankai Huo, Zhoubing Xu, Thomas A. Lasko, Richard G. Abramson, Bennett A. Landman
Efficiently quantifying renal structures can provide distinct spatial context and facilitate biomarker discovery for kidney morphology.
no code implementations • 23 Dec 2021 • Guangming Yao, Hongzhi Wu, Yi Yuan, Lincheng Li, Kun Zhou, Xin Yu
In this paper, we present a novel double diffusion based neural radiance field, dubbed DD-NeRF, to reconstruct human body geometry and render the human body appearance in novel views from a sparse set of images.
no code implementations • 6 Dec 2021 • Suzhen Wang, Lincheng Li, Yu Ding, Xin Yu
Hence, we propose a novel one-shot talking face generation framework by exploring consistent correlations between audio and visual motions from a specific speaker and then transferring audio-driven motion fields to a reference image.
1 code implementation • 16 Oct 2021 • Xin Yu, Jeroen van Baar, Siheng Chen
We use a coarse graph, derived from a dense graph, to estimate the human's 3D pose, and the dense graph to estimate the 3D shape.
Ranked #269 on 3D Human Pose Estimation on Human3.6M
1 code implementation • ICCV 2021 • Jing Zhang, Deng-Ping Fan, Yuchao Dai, Xin Yu, Yiran Zhong, Nick Barnes, Ling Shao
In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
no code implementations • 7 Sep 2021 • Minghui Zhang, Xin Yu, Hanxiao Zhang, Hao Zheng, Weihao Yu, Hong Pan, Xiangran Cai, Yun Gu
Compared to other state-of-the-art transfer learning methods, our method accurately segmented more bronchi in the noisy CT scans.
no code implementations • ICCV 2021 • Haitian Zeng, Yuchao Dai, Xin Yu, Xiaohan Wang, Yi Yang
As NRSfM is a highly under-constrained problem, we propose two new pairwise regularization to further regularize the reconstruction.
no code implementations • 2 Aug 2021 • Yang Zhang, Xin Yu, Xiaobo Lu, Ping Liu
Specifically, we design a novel cross-modal transformer module for facial priors estimation, in which an input face and its landmark features are formulated as queries and keys, respectively.
1 code implementation • 20 Jul 2021 • Suzhen Wang, Lincheng Li, Yu Ding, Changjie Fan, Xin Yu
As this keypoint based representation models the motions of facial regions, head, and backgrounds integrally, our method can better constrain the spatial and temporal consistency of the generated videos.
1 code implementation • CVPR 2021 • Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang
First, we propose a complementary cascaded network architecture, namely CCN, to remove rain streaks and raindrops in a unified framework.
Ranked #5 on Video deraining on Video Waterdrop Removal Dataset
no code implementations • 3 Jun 2021 • Ho Hin Lee, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Leon Y. Cai, Lucas W. Remedios, Bennett A. Landman, Yuankai Huo
Medical image segmentation, or computing voxelwise semantic masks, is a fundamental yet challenging task to compute a voxel-level semantic mask.
1 code implementation • 31 May 2021 • Yuan Gan, Yawei Luo, Xin Yu, Bang Zhang, Yi Yang
In this paper, we investigate the task of hallucinating an authentic high-resolution (HR) human face from multiple low-resolution (LR) video snapshots.
no code implementations • ICLR 2021 • Heming Du, Xin Yu, Liang Zheng
In this paper, we introduce a Visual Transformer Network (VTNet) for learning informative visual representation in navigation.
1 code implementation • 16 Apr 2021 • Lincheng Li, Suzhen Wang, Zhimeng Zhang, Yu Ding, Yixing Zheng, Xin Yu, Changjie Fan
To be specific, our framework consists of a speaker-independent stage and a speaker-specific stage.
no code implementations • CVPR 2021 • Zongxin Yang, Xin Yu, Yi Yang
In the first step, the framework learns to segment objects from real and synthetic data in a weakly-supervised fashion, and the segmentation masks will act as a prior for pose estimation.
1 code implementation • CVPR 2021 • Yujiao Shi, Hongdong Li, Xin Yu
We then warp and aggregate source view pixels to synthesize a novel view based on the estimated source-view visibility and target-view depth.
no code implementations • ICCV 2021 • Peike Li, Xin Yu, Yi Yang
By iteratively updating the latent representations and our decoder, our DAP-FSR will be adapted to the target domain, thus achieving authentic and high-quality upsampled HR faces.
no code implementations • CVPR 2021 • Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li
Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions.
1 code implementation • 2 Mar 2021 • Yujiao Shi, Dylan Campbell, Xin Yu, Hongdong Li
Specifically, we observe that when a 3D point in the real world is visible in both views, there is a deterministic mapping between the projected points in the two-view images given the height information of this 3D point.
1 code implementation • NeurIPS 2021 • Thiago Serra, Xin Yu, Abhinav Kumar, Srikumar Ramalingam
We can compress a rectifier network while exactly preserving its underlying functionality with respect to a given input domain if some of its neurons are stable.
1 code implementation • 3 Feb 2021 • Yuhang Ding, Xin Yu, Yi Yang
Thus, it is more desirable to employ only a few labeled data in pursuing high segmentation performance.
no code implementations • 22 Jan 2021 • Gerard Kennedy, Zheyu Zhuang, Xin Yu, Robert Mahony
Object pose estimation from a single RGB image is a challenging problem due to variable lighting conditions and viewpoint changes.
1 code implementation • ICCV 2021 • Yuhang Ding, Xin Yu, Yi Yang
In this work, we propose a Region-aware Fusion Network (RFNet) that is able to exploit different combinations of multi-modal data adaptively and effectively for tumor segmentation.
Ranked #75 on Semantic Segmentation on NYU Depth v2
no code implementations • 10 Dec 2020 • Jing Zhang, Yuchao Dai, Xin Yu, Mehrtash Harandi, Nick Barnes, Richard Hartley
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy.
no code implementations • ICCV 2021 • Beibei Lin, Shunli Zhang, Xin Yu
Towards this goal, we take advantage of both global visual information and local region details and develop a Global and Local Feature Extractor (GLFE).
2 code implementations • NeurIPS 2020 • Dongxu Li, Chenchen Xu, Xin Yu, Kaihao Zhang, Ben Swift, Hanna Suominen, Hongdong Li
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.
no code implementations • 4 Oct 2020 • Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam
We propose a novel technique to register sparse 3D scans in the absence of texture.
1 code implementation • ECCV 2020 • Heming Du, Xin Yu, Liang Zheng
Aiming to improve these two components, this paper proposes three complementary techniques, object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).
1 code implementation • 1 Jul 2020 • Yizhak Ben-Shabat, Xin Yu, Fatemeh Sadat Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, Stephen Gould
The availability of a large labeled dataset is a key requirement for applying deep learning methods to solve various computer vision tasks.
1 code implementation • CVPR 2020 • Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li
Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (e. g., satellite) images.
1 code implementation • CVPR 2020 • Jing Zhang, Xin Yu, Aixuan Li, Peipei Song, Bowen Liu, Yuchao Dai
In this paper, we propose a weakly-supervised salient object detection model to learn saliency from such annotations.
no code implementations • CVPR 2020 • Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li
To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features.
no code implementations • CVPR 2020 • Yang Zhang, Ivor Tsang, Yawei Luo, Changhui Hu, Xiaobo Lu, Xin Yu
This paper proposes a Copy and Paste Generative Adversarial Network (CPGAN) to recover authentic high-resolution (HR) face images while compensating for low and non-uniform illumination.
no code implementations • 10 Feb 2020 • Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li
In this paper, we aim to reduce such errors by incorporating the distances between pixels and keypoints into our objective.
no code implementations • 9 Feb 2020 • Yang Zhang, Ivor W. Tsang, Jun Li, Ping Liu, Xiaobo Lu, Xin Yu
The coarse-level FHnet generates a frontal coarse HR face and then the fine-level FHnet makes use of the facial component appearance prior, i. e., fine-grained facial components, to attain a frontal HR face image with authentic details.
1 code implementation • NeurIPS 2019 • Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li
The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama.
Ranked #5 on Image-Based Localization on VIGOR Cross Area
3 code implementations • 24 Oct 2019 • Dongxu Li, Cristian Rodriguez Opazo, Xin Yu, Hongdong Li
Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performances in large scale scenarios.
Ranked #4 on Sign Language Recognition on WLASL100
1 code implementation • 11 Jul 2019 • Yujiao Shi, Xin Yu, Liu Liu, Tong Zhang, Hongdong Li
This paper proposes a novel Cross-View Feature Transport (CVFT) technique to explicitly establish cross-view domain transfer that facilitates feature alignment between ground and aerial images.
no code implementations • 13 Jun 2019 • Siddhant Ranade, Xin Yu, Shantnu Kakkar, Pedro Miraldo, Srikumar Ramalingam
In contrast to correspondence based methods, we take a different viewpoint and formulate the sparse 3D registration problem based on the constraints from the intersection of line segments from adjacent scans.
2 code implementations • CVPR 2019 • Yurun Tian, Xin Yu, Bin Fan, Fuchao Wu, Huub Heijnen, Vassileios Balntas
Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors.
no code implementations • 7 Apr 2019 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
We develop an Identity-preserving Face Recovery from Portraits (IFRP) method that utilizes a Style Removal network (SRN) and a Discriminative Network (DN).
no code implementations • 7 Apr 2019 • Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
%Our method can recover high-quality photorealistic faces from unaligned portraits while preserving the identity of the face images as well as it can reconstruct a photorealistic face image with a desired set of attributes.
1 code implementation • 12 Mar 2019 • Liyuan Pan, Richard Hartley, Cedric Scheerlinck, Miaomiao Liu, Xin Yu, Yuchao Dai
Based on the abundant event data alongside a low frame rate, easily blurred images, we propose a simple yet effective approach to reconstruct high-quality and high frame rate sharp videos.
1 code implementation • CVPR 2019 • Liyuan Pan, Cedric Scheerlinck, Xin Yu, Richard Hartley, Miaomiao Liu, Yuchao Dai
In this paper, we propose a simple and effective approach, the \textbf{Event-based Double Integral (EDI)} model, to reconstruct a high frame-rate, sharp video from a single blurry frame and its event data.