no code implementations • EMNLP (SpLU) 2020 • Chao Xu, Emmanuelle-Anna Dietz Saldanha, Dagmar Gromann, Beihai Zhou
We propose an automated spatial semantic analysis (ASSA) framework building on grammar and cognitive linguistic theories to identify spatial entities and relations, bringing together methods of spatial information extraction and cognitive frameworks on spatial language.
no code implementations • LT4HALA (LREC) 2022 • Bin Li, Yiguo Yuan, Jingya Lu, Minxuan Feng, Chao Xu, Weiguang Qu, Dongbo Wang
This paper presents the results of the First Ancient Chinese Word Segmentation and POS Tagging Bakeoff (EvaHan), which was held at the Second Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) 2022, in the context of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022).
no code implementations • 20 Nov 2024 • Ning Ding, Yehui Tang, Haochen Qin, Zhenli Zhou, Chao Xu, Lin Li, Kai Han, Heng Liao, Yunhe Wang
This is made possible by utilizing an alternative feature-transformation method to replace the linear projection of fully-connected layers.
no code implementations • 5 Nov 2024 • Chao Xu, Xijia Tang, Guoqing Liu, Yuhua Qian, Chenping Hou
It enables the model to effectively track the evolving data distributions.
no code implementations • 2 Nov 2024 • Zekun Hong, Shinya Sugiura, Chao Xu, Lajos Hanzo
A precoded orthogonal time frequency space (OTFS) modulation scheme relying on faster-than-Nyquist (FTN) transmission over doubly selective fading channels is proposed, which enhances the spectral efficiency and improves the Doppler resilience.
no code implementations • 31 Oct 2024 • Xiang Deng, Youxin Pang, Xiaochen Zhao, Chao Xu, Lizhen Wang, Hongjiang Xiao, Shi Yan, Hongwen Zhang, Yebin Liu
This paper introduces Stereo-Talker, a novel one-shot audio-driven human video synthesis system that generates 3D talking videos with precise lip synchronization, expressive body gestures, temporally consistent photo-realistic quality, and continuous viewpoint control.
no code implementations • 9 Oct 2024 • Jiajia Huang, Haoran Zhu, Chao Xu, Tianming Zhan, Qianqian Xie, Jimin Huang
To overcome these challenges, this study introduces AuditWen, an open-source audit LLM built by fine-tuning Qwen on instruction data constructed from the audit domain.
no code implementations • 7 Oct 2024 • Chonghao Zhong, Chao Xu
Therefore, we propose TeX-NeRF, a 3D reconstruction method using only infrared images, which introduces the object material emissivity as a prior, preprocesses the infrared images using Pseudo-TeX vision, and maps the temperatures (T), emissivities (e), and textures (X) of the scene into the saturation (S), hue (H), and value (V) channels of the HSV color space, respectively.
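A minimal sketch of the channel mapping described above, assuming normalized per-pixel temperature, emissivity, and texture maps (array names and the normalization are illustrative, not the paper's exact preprocessing):

```python
import numpy as np
from matplotlib.colors import hsv_to_rgb

def pseudo_tex(temperature, emissivity, texture):
    """Map per-pixel T/e/X maps into the S/H/V channels of an HSV image.

    All inputs are HxW float arrays; each is rescaled to [0, 1] here so the
    stacked result is a valid HSV image.
    """
    norm = lambda a: (a - a.min()) / (a.max() - a.min() + 1e-8)
    h = norm(emissivity)   # emissivity  -> hue
    s = norm(temperature)  # temperature -> saturation
    v = norm(texture)      # texture     -> value
    return hsv_to_rgb(np.stack([h, s, v], axis=-1))  # HxWx3 pseudo-color image
```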
2 code implementations • 27 Sep 2024 • Bin Wang, Chao Xu, Xiaomeng Zhao, Linke Ouyang, Fan Wu, Zhiyuan Zhao, Rui Xu, Kaiwen Liu, Yuan Qu, FuKai Shang, Bo Zhang, Liqun Wei, Zhihao Sui, Wei Li, Botian Shi, Yu Qiao, Dahua Lin, Conghui He
Document content analysis has been a crucial research area in computer vision.
no code implementations • 19 Aug 2024 • Chao Xu, Ang Li, Linghao Chen, Yulin Liu, Ruoxi Shi, Hao Su, Minghua Liu
The diffusion model is trained to jointly predict surrogate representations for camera poses and multi-view images of the object under known poses, integrating all information from the input sparse views.
no code implementations • 19 Aug 2024 • Minghua Liu, Chong Zeng, Xinyue Wei, Ruoxi Shi, Linghao Chen, Chao Xu, Mengqi Zhang, Zhaoning Wang, Xiaoshuai Zhang, Isabella Liu, Hongzhi Wu, Hao Su
The input normal maps can be predicted by 2D diffusion models, significantly aiding in the guidance and refinement of the geometry's learning.
no code implementations • 18 Aug 2024 • Chao Xu, Mingze Sun, Zhi-Qi Cheng, Fei Wang, Yang Liu, Baigui Sun, Ruqi Huang, Alexander Hauptmann
For the former, we propose to pre-train on data regarding a fixed identity with neutral emotion and defer the incorporation of customizable conditions (identity and emotion) to the fine-tuning stage, which is boosted by our novel X-Adapter for parameter-efficient fine-tuning.
no code implementations • 12 Aug 2024 • Phuc V. Trinh, Shinya Sugiura, Chao Xu, Lajos Hanzo
This model facilitates adaptive ORIS beam-width control through linear, quadratic, and focusing phase shifts, which are capable of effectively mitigating the detrimental effects of beam broadening and pointing errors (PE).
1 code implementation • 11 Jul 2024 • Yushuo Chen, Zerong Zheng, Zhe Li, Chao Xu, Yebin Liu
We present a novel pipeline for learning high-quality triangular human avatars from multi-view videos.
1 code implementation • 30 Jun 2024 • Yuchuan Tian, Jianhong Han, Hanting Chen, Yuanyuan Xi, Guoyang Zhang, Jie Hu, Chao Xu, Yunhe Wang
We have conducted experiments on Instruct-IPT to demonstrate the effectiveness of our method on manifold tasks, and we have effectively extended our method to diffusion denoisers as well.
Ranked #1 on Single Image Desnowing on CSD
2 code implementations • 17 Jun 2024 • Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao
Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data.
1 code implementation • 12 Jun 2024 • Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, LiMin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai
In this paper, we introduce OmniCorpus, a 10 billion-scale image-text interleaved dataset.
no code implementations • 4 Jun 2024 • Conghui He, Wei Li, Zhenjiang Jin, Chao Xu, Bin Wang, Dahua Lin
The dispersion of data sources and diversity of data formats often lead to inefficiencies in data retrieval and processing, significantly impeding the progress of AI research and applications.
1 code implementation • 28 May 2024 • Botao He, Ze Wang, Yuan Zhou, Jingxi Chen, Chahat Deep Singh, Haojia Li, Yuman Gao, Shaojie Shen, Kaiwei Wang, Yanjun Cao, Chao Xu, Yiannis Aloimonos, Fei Gao, Cornelia Fermuller
These event cameras' output is dependent on both motion and texture.
1 code implementation • 4 May 2024 • Yuchuan Tian, Zhijun Tu, Hanting Chen, Jie Hu, Chao Xu, Yunhe Wang
Diffusion Transformers (DiTs) introduce the transformer architecture to diffusion tasks for latent-space image generation.
1 code implementation • 25 Apr 2024 • Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang
Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.
Ranked #11 on Visual Question Answering on MM-Vet v2
1 code implementation • 23 Apr 2024 • Bin Wang, Zhuangcheng Gu, Guang Liang, Chao Xu, Bo Zhang, Botian Shi, Conghui He
To better utilize the UniMER dataset, the paper proposes a Universal Mathematical Expression Recognition Network (UniMERNet), tailored to the characteristics of formula recognition.
no code implementations • 12 Apr 2024 • Siming Shan, Pengkai Wang, Song Chen, Jiaxu Liu, Chao Xu, Shengze Cai
The use of machine learning in fluid dynamics is becoming more common to expedite the computation when solving forward and inverse problems of partial differential equations.
1 code implementation • 1 Apr 2024 • Jiazheng Xing, Chao Xu, Yijie Qian, Yang Liu, Guang Dai, Baigui Sun, Yong liu, Jingdong Wang
However, existing diffusion-based methods suffer from uncontrollable clothing identity and training inefficiency: they struggle to maintain the clothing identity even with full-parameter training, which significantly hinders their widespread application.
no code implementations • 28 Mar 2024 • Mingze Sun, Chao Xu, Xinyu Jiang, Yang Liu, Baigui Sun, Ruqi Huang
Furthermore, we introduce the HoCo holistic communication dataset, which is a valuable resource for future research.
3 code implementations • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).
Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)
no code implementations • CVPR 2024 • Jianping Jiang, Xinyu Zhou, Bingxuan Wang, Xiaoming Deng, Chao Xu, Boxin Shi
Experiments on real-world data demonstrate that EvRGBHand can effectively solve the challenging issues of using either type of camera alone by retaining the merits of both, and shows the potential of generalizing to outdoor scenes and to another type of event camera.
1 code implementation • CVPR 2024 • Chao Xu, Yang Liu, Jiazheng Xing, Weida Wang, Mingze Sun, Jun Dan, Tianxin Huang, Siyuan Li, Zhi-Qi Cheng, Ying Tai, Baigui Sun
In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and Imagining, into the task of high-fidelity diverse talking faces generation from a single audio.
no code implementations • 28 Feb 2024 • Chu Zhou, Minggui Teng, Xinyu Zhou, Chao Xu, Boxin Shi
However, since the on-chip micro-polarizers block part of the light, the sensor often requires a longer exposure time, and the captured polarized images are prone to motion blur caused by camera shake, leading to noticeable degradation in the computed DoP and AoP.
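For background, the DoP and AoP mentioned above are conventionally computed from a four-orientation polarization sensor with the standard Stokes-vector formulas (textbook material, not this paper's deblurring method):

```python
import numpy as np

def dop_aop(i0, i45, i90, i135, eps=1e-8):
    """Degree and angle of linear polarization from intensities captured
    behind 0/45/90/135-degree micro-polarizers."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                         # horizontal/vertical difference
    s2 = i45 - i135                       # diagonal difference
    dop = np.sqrt(s1 ** 2 + s2 ** 2) / (s0 + eps)
    aop = 0.5 * np.arctan2(s2, s1)
    return dop, aop
```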
1 code implementation • 8 Feb 2024 • Dongyang Liu, Renrui Zhang, Longtian Qiu, Siyuan Huang, Weifeng Lin, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao, Peng Gao
We propose SPHINX-X, an extensive Multimodality Large Language Model (MLLM) series developed upon SPHINX.
Ranked #13 on Video Question Answering on MVBench
no code implementations • CVPR 2024 • Xiang Deng, Zerong Zheng, Yuxiang Zhang, Jingxiang Sun, Chao Xu, Xiaodong Yang, Lizhen Wang, Yebin Liu
This paper focuses on advancing the applicability of human avatar learning methods by proposing RAM-Avatar, which learns a Real-time photo-realistic Avatar that supports full-body control from Monocular videos.
no code implementations • CVPR 2024 • Yunkai Tang, Chengxuan Zhu, Renjie Wan, Chao Xu, Boxin Shi
Among the numerous efforts towards digitally recovering the physical world, Neural Radiance Fields (NeRFs) have proved effective in most cases.
no code implementations • CVPR 2024 • Yifei Xia, Chu Zhou, Chengxuan Zhu, Minggui Teng, Chao Xu, Boxin Shi
The removal of atmospheric turbulence is crucial for long-distance imaging.
no code implementations • CVPR 2024 • Xinyu Zhou, Peiqi Duan, Boyu Li, Chu Zhou, Chao Xu, Boxin Shi
In this paper, we leverage the event camera to facilitate the separation of direct and global components, enabling high-quality video-rate separation.
1 code implementation • CVPR 2024 • Yuchuan Tian, Hanting Chen, Chao Xu, Yunhe Wang
Alternatively, we leverage the flexibility of graphs and propose the Image Processing GNN (IPG) model to break the rigidity that dominates previous SR methods.
Ranked #9 on Image Super-Resolution on Urban100 - 4x upscaling
no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao
We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for establishing modern LLMs, namely, PanGu-$\pi$.
1 code implementation • NeurIPS 2023 • Yuchuan Tian, Hanting Chen, Tianyu Guo, Chao Xu, Yunhe Wang
To this end, we propose a Rank-based PruninG (RPG) method to maintain the ranks of sparse weights in an adversarial manner.
no code implementations • CVPR 2024 • Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su
Recent advancements in open-world 3D object generation have been remarkable, with image-to-3D methods offering superior fine-grained control over their text-to-3D counterparts.
no code implementations • 12 Nov 2023 • Chao Xu, Yu Yang, Rongzhao Wang, Guan Wang, Bojia Lin
Multi-Stage Classifier (MSC) - several classifiers working sequentially in an arranged order, with a partial classification decision made at each step - is widely used in industrial applications for various resource-limitation reasons.
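A minimal sketch of the multi-stage idea described above: each stage either commits to a confident decision or defers to the next (typically heavier) classifier. The models, thresholds, and sklearn-style predict_proba interface are illustrative assumptions.

```python
import numpy as np

def multi_stage_predict(x, stages, thresholds):
    """Run classifiers in their arranged order; return early once a stage is
    confident enough, otherwise let the final stage decide.

    stages:     models exposing predict_proba(X) -> (n_samples, n_classes)
    thresholds: per-stage minimum confidence required to stop early
    """
    label = None
    for clf, thr in zip(stages, thresholds):
        proba = clf.predict_proba(np.asarray([x]))[0]
        label = int(np.argmax(proba))
        if proba[label] >= thr:   # partial decision made at this stage
            return label
    return label                   # last stage always decides
```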
no code implementations • 24 Oct 2023 • Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, WangMeng Zuo, Yiwen Guo, Zhaopeng Meng
By incorporating auxiliary information from CLIP and utilizing prompt fine-tuning, we effectively eliminate noisy samples from the clean set and mitigate confirmation bias during training.
1 code implementation • 23 Oct 2023 • Ruoxi Shi, Hansheng Chen, Zhuoyang Zhang, Minghua Liu, Chao Xu, Xinyue Wei, Linghao Chen, Chong Zeng, Hao Su
We report Zero123++, an image-conditioned diffusion model for generating 3D-consistent multi-view images from a single input view.
2 code implementations • 26 Sep 2023 • Pan Zhang, Xiaoyi Dong, Bin Wang, Yuhang Cao, Chao Xu, Linke Ouyang, Zhiyuan Zhao, Haodong Duan, Songyang Zhang, Shuangrui Ding, Wenwei Zhang, Hang Yan, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang
We propose InternLM-XComposer, a vision-language large model that enables advanced image-text comprehension and composition.
Ranked #9 on Visual Question Answering (VQA) on InfiMM-Eval
1 code implementation • 28 Aug 2023 • Yang Liu, Cheng Yu, Lei Shang, Yongyi He, Ziheng Wu, Xingjun Wang, Chao Xu, Haoyu Xie, Weida Wang, Yuze Zhao, Lin Zhu, Chen Cheng, Weitao Chen, Yuan YAO, Wenmeng Zhou, Jiaqi Xu, Qiang Wang, Yingda Chen, Xuansong Xie, Baigui Sun
In this paper, we present FaceChain, a personalized portrait generation framework that combines a series of customized image-generation models and a rich set of face-related perceptual understanding models (e.g., face detection, deep face embedding extraction, and facial attribute recognition) to tackle the aforementioned challenges and to generate truthful personalized portraits, with only a handful of portrait images as input.
1 code implementation • 21 Aug 2023 • Conghui He, Zhenjiang Jin, Chao Xu, Jiantao Qiu, Bin Wang, Wei Li, Hang Yan, Jiaqi Wang, Dahua Lin
The rise in popularity of ChatGPT and GPT-4 has significantly accelerated the development of large models, leading to the creation of numerous impressive large language models (LLMs) and multimodal large language models (MLLMs).
no code implementations • 3 Aug 2023 • Jiazheng Xing, Chao Xu, Mengmeng Wang, Guang Dai, Baigui Sun, Yong liu, Jingdong Wang, Jian Zhao
To tackle these issues, we introduce MA-FSAR, a framework that employs the Parameter-Efficient Fine-Tuning (PEFT) technique to enhance the CLIP visual encoder in terms of action-related temporal and semantic representations.
1 code implementation • NeurIPS 2023 • Minghua Liu, Chao Xu, Haian Jin, Linghao Chen, Mukund Varma T, Zexiang Xu, Hao Su
Single image 3D reconstruction is an important but challenging task that requires extensive knowledge of our natural world.
1 code implementation • 1 Jun 2023 • Ning Ding, Yehui Tang, Zhongqian Fu, Chao Xu, Kai Han, Yunhe Wang
We present a new learning paradigm in which the knowledge extracted from large pre-trained models is utilized to help models like CNNs and ViTs learn enhanced representations and achieve better performance.
3 code implementations • 29 May 2023 • Yuchuan Tian, Hanting Chen, Xutao Wang, Zheyuan Bai, Qinghua Zhang, Ruifeng Li, Chao Xu, Yunhe Wang
Recent releases of Large Language Models (LLMs), e.g., ChatGPT, are astonishing at generating human-like texts, but they may impact the authenticity of texts.
no code implementations • 4 May 2023 • Chao Xu, Shaoting Zhu, Junwei Zhu, Tianxin Huang, Jiangning Zhang, Ying Tai, Yong liu
More specifically, given a textured face as the source and the rendered face projected from the desired 3DMM coefficients as the target, our proposed Texture-Geometry-aware Diffusion Model decomposes the complex transfer problem into a multi-conditional denoising process, where a Texture Attention-based module accurately models the correspondences between appearance and geometry cues contained in the source and target conditions, and incorporates extra implicit information for high-fidelity talking face generation.
no code implementations • CVPR 2023 • Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong liu
Specifically, we supplement the emotion style in text prompts and use an Aligned Multi-modal Emotion encoder to embed the text, image, and audio emotion modality into a unified space, which inherits rich semantic prior from CLIP.
no code implementations • 17 Apr 2023 • Jiaxu Liu, Song Chen, Shengze Cai, Chao Xu
In this paper, we investigate a distributed aggregative optimization problem in a network, where each agent has its own local cost function which depends not only on the local state variable but also on an aggregated function of state variables from all agents.
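The problem class described above is commonly written in the following generic form (symbols are illustrative; the paper's exact assumptions on $f_i$ and the aggregation map may differ):

$$
\min_{x_1,\dots,x_n}\ \sum_{i=1}^{n} f_i\bigl(x_i,\ \sigma(x)\bigr),
\qquad
\sigma(x) = \frac{1}{n}\sum_{j=1}^{n} \phi_j(x_j),
$$

where $x_i$ is agent $i$'s local state variable, $f_i$ its local cost, and $\sigma(x)$ the aggregated function of all agents' states.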
1 code implementation • 12 Apr 2023 • Jiefeng Li, Siyuan Bian, Chao Xu, Zhicun Chen, Lixin Yang, Cewu Lu
To address these issues, this paper presents a novel hybrid inverse kinematics solution, HybrIK, that integrates the merits of 3D keypoint estimation and body mesh recovery in a unified framework.
Ranked #1 on 3D Human Reconstruction on AGORA
1 code implementation • CVPR 2023 • Xuhai Chen, Jiangning Zhang, Chao Xu, Yabiao Wang, Chengjie Wang, Yong liu
Most of the existing blind image Super-Resolution (SR) methods assume that the blur kernels are space-invariant.
no code implementations • 30 Mar 2023 • Yuxuan Zhang, Chao Xu, Howard H. Yang, Xijun Wang, Tony Q. S. Quek
This paper proposes a client selection (CS) method to tackle the communication bottleneck of federated learning (FL) while concurrently coping with FL's data heterogeneity issue.
no code implementations • 8 Mar 2023 • Jiaxu Liu, Song Chen, Shengze Cai, Chao Xu
The vanilla fractional-order gradient descent may converge, with oscillations, only to a region around the global minimum instead of the exact minimum point, or may even diverge, in the case where the objective function is strongly convex.
no code implementations • 9 Feb 2023 • Shuying Gan, Marie Siew, Chao Xu, Tony Q. S. Quek
Mobile edge computing (MEC) is a promising paradigm to meet the quality of service (QoS) requirements of latency-sensitive IoT applications.
1 code implementation • CVPR 2023 • Ning Ding, Yehui Tang, Kai Han, Chao Xu, Yunhe Wang
Recently, the sizes of deep neural networks and training datasets both increase drastically to pursue better performance in a practical sense.
no code implementations • CVPR 2023 • Yakun Chang, Chu Zhou, Yuchen Hong, Liwen Hu, Chao Xu, Tiejun Huang, Boxin Shi
Capturing high frame rate and high dynamic range (HFR&HDR) color videos in high-speed scenes with conventional frame-based cameras is very challenging.
11 code implementations • 23 Nov 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang
The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.
1 code implementation • CVPR 2023 • Haoran Geng, Helin Xu, Chengyang Zhao, Chao Xu, Li Yi, Siyuan Huang, He Wang
Based on GAPartNet, we investigate three cross-category tasks: part segmentation, part pose estimation, and part-based object manipulation.
8 code implementations • 7 Nov 2022 • Hao-Shu Fang, Jiefeng Li, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu
Accurate whole-body multi-person pose estimation and tracking is an important yet challenging topic in computer vision.
no code implementations • 31 Oct 2022 • Jiaming Liang, Chao Xu, Shengze Cai
By introducing a novel deep neural network based on recurrent Graph Optimal Transport, called GotFlow3D, we present an end-to-end solution to learn the 3D fluid flow motion from double-frame particle sets.
no code implementations • 28 Sep 2022 • Marie Siew, Shikhar Sharma, Zekai Li, Kun Guo, Chao Xu, Tania Lorido-Botran, Tony Q. S. Quek, Carlee Joe-Wong
In edge computing, users' service profiles are migrated due to user mobility.
1 code implementation • 19 Sep 2022 • Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
In this work, we present D&D (Learning Human Dynamics from Dynamic Camera), which leverages the laws of physics to reconstruct 3D human motion from the in-the-wild videos with a moving camera.
1 code implementation • 16 Sep 2022 • Peng Yu, Chao Xu, Albert Bifet, Jesse Read
Decision trees are well-known due to their ease of interpretability.
no code implementations • 27 Aug 2022 • Song Chen, Shengze Cai, Tehuan Chen, Chao Xu, Jian Chu
In this paper, we propose a novel nonlinear observer based on neural networks, called neural observer, for observation tasks of linear time-invariant (LTI) systems and uncertain nonlinear systems.
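For context, the classical Luenberger observer for an LTI system $\dot{x} = Ax + Bu$, $y = Cx$ (the textbook baseline that a learned observer of this kind generalizes) takes the form:

$$
\dot{\hat{x}}(t) = A\hat{x}(t) + Bu(t) + L\bigl(y(t) - C\hat{x}(t)\bigr),
$$

where $\hat{x}$ is the state estimate and the gain $L$ is chosen so that $A - LC$ is Hurwitz, making the estimation error decay to zero.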
no code implementations • 23 May 2022 • Yadian Zhao, Zhenglin Yang, Chao Xu
Therefore, the aim of this study is to develop a dataset named NPU-BOLT for bolt object detection in natural scene images and open it to researchers for public use and further development.
1 code implementation • CVPR 2022 • Ning Ding, Yixing Xu, Yehui Tang, Chao Xu, Yunhe Wang, DaCheng Tao
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.
no code implementations • 11 Apr 2022 • Haojie Liu, Daoxun Xia, Wei Jiang, Chao Xu
In order to mitigate the impact of the large modality discrepancy in heterogeneous images, previous methods attempt to apply generative adversarial networks (GANs) to generate modality-consistent data.
no code implementations • CVPR 2022 • Chao Xu, Jiangning Zhang, Miao Hua, Qian He, Zili Yi, Yong liu
This paper presents a novel Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent harmonious high-resolution face generation in a local-global manner: 1) Local Facial Region-Aware (FRA) branch augments local identity-relevant features by introducing the Transformer to effectively model misaligned cross-scale semantic interaction.
no code implementations • 28 Feb 2022 • Chao Xu, Yixin Chen, He Wang, Song-Chun Zhu, Yixin Zhu, Siyuan Huang
We propose a novel learning framework for PartAfford, which discovers part-level representations by leveraging only the affordance set supervision and geometric primitive regularization, without dense supervision.
no code implementations • 12 Jan 2022 • Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai, Yong liu
In the practical application of restoring low-resolution gray-scale images, we generally need to run three separate processes of image colorization, super-resolution, and down-sampling for the target device.
no code implementations • NeurIPS 2021 • Chu Zhou, Minggui Teng, Yufei Han, Chao Xu, Boxin Shi
Haze, a common kind of bad weather caused by atmospheric scattering, decreases the visibility of scenes and degenerates the performance of computer vision algorithms.
8 code implementations • CVPR 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang
To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.
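A minimal sketch of treating each token as a wave with amplitude and phase and aggregating tokens by complex superposition, as the sentence above describes; the phase-estimation module and the exact aggregation used in the paper may differ.

```python
import torch

def wave_aggregate(amplitude, phase):
    """Represent tokens as complex waves a * exp(i * theta) and superpose them.

    amplitude, phase: (batch, num_tokens, dim) real tensors.
    Tokens with similar phases reinforce each other; opposite phases cancel,
    so the phase modulates how strongly each token contributes.
    """
    wave = torch.polar(amplitude, phase)  # complex-valued tokens
    aggregated = wave.sum(dim=1)          # aggregate over the token dimension
    return aggregated.abs()               # back to a real-valued feature
```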
no code implementations • 29 Sep 2021 • Lin Xinyang, Hanting Chen, Yixing Xu, Chao Xu, Xiaolin Gui, Yiping Deng, Yunhe Wang
We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client only labels a little part of their dataset due to the limitation of resources and time.
no code implementations • 29 Sep 2021 • Qifang Zhao, Yu Jiang, Yuqing Liu, Meng Du, Qinghui Sun, Chao Xu, Huan Xu, Zhongyao Wang
Recommender Systems (RS) and Advertising/Marketing Systems (AS) play key roles in E-commerce companies like Amazon and Alibaba.
10 code implementations • CVPR 2022 • Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang
Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.
4 code implementations • NeurIPS 2021 • Yehui Tang, Kai Han, Chang Xu, An Xiao, Yiping Deng, Chao Xu, Yunhe Wang
Transformer models have achieved great progress on computer vision tasks recently.
1 code implementation • 21 Jun 2021 • Xinyang Lin, Hanting Chen, Yixing Xu, Chao Xu, Xiaolin Gui, Yiping Deng, Yunhe Wang
We study the problem of learning from positive and unlabeled (PU) data in the federated setting, where each client only labels a little part of their dataset due to the limitation of resources and time.
1 code implementation • CVPR 2021 • Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang
Experiments on various datasets demonstrate that the student networks learned by the proposed method can achieve comparable performance with those using the original dataset.
no code implementations • CVPR 2022 • Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao
We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.
Ranked #8 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • NeurIPS 2021 • Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong liu
Inspired by biological evolution, we explain the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derive that both of them have consistent mathematical representation.
no code implementations • 29 May 2021 • Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Chunjing Xu, Tong Zhang
The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values.
no code implementations • 13 Apr 2021 • Chao Xu, Yiping Xie, Xijun Wang, Howard H. Yang, Dusit Niyato, Tony Q. S. Quek
cost), by integrating R-learning, a tabular reinforcement learning (RL) algorithm tailored for maximizing the long-term average reward, and traditional DRL algorithms, initially developed to optimize the discounted long-term cumulative reward rather than the average one.
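For reference, a rough sketch of the tabular R-learning update mentioned above (the average-reward counterpart of Q-learning); the step sizes, the greedy-action test, and the table layout are illustrative.

```python
import numpy as np

def r_learning_step(Q, rho, s, a, r, s_next, alpha=0.1, beta=0.01):
    """One R-learning update: Q holds relative (bias) values, and rho estimates
    the long-term average reward subtracted from every step's reward."""
    delta = r - rho + Q[s_next].max() - Q[s, a]
    greedy = Q[s, a] == Q[s].max()       # was the chosen action greedy?
    Q[s, a] += alpha * delta
    if greedy:                            # update rho only on greedy steps
        rho += beta * delta
    return Q, rho
```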
1 code implementation • 11 Mar 2021 • Qianhao Wang, Yuman Gao, Jialin Ji, Chao Xu, Fei Gao
The visibility of targets determines the performance and even the success rate of various applications, such as active SLAM, exploration, and target tracking.
Trajectory Planning • Robotics
no code implementations • 11 Mar 2021 • Neng Pan, Ruibin Zhang, Tiankai Yang, Chao Xu, Fei Gao
In recent years, several progressive works promote the development of aerial tracking.
Trajectory Planning • Robotics
7 code implementations • CVPR 2021 • Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, DaCheng Tao, Chang Xu
Then, the manifold relationship between instances and the pruned sub-networks will be aligned in the training procedure.
1 code implementation • 10 Mar 2021 • Botao He, Haojia Li, Siyuan Wu, Dong Wang, Zhiwei Zhang, Qianli Dong, Chao Xu, Fei Gao
The bottleneck of solving this problem is the accurate perception of rapid dynamic objects.
Motion Compensation • Robust Object Detection • Robotics
1 code implementation • 9 Mar 2021 • Hongkai Ye, Tianyu Liu, Chao Xu, Fei Gao
For real-time multirotor kinodynamic motion planning, the efficiency of sampling-based methods is usually hindered by difficult-to-sample homotopy classes like narrow passages.
Motion Planning • Robotics
no code implementations • ICCV 2021 • Jin Han, Yixin Yang, Chu Zhou, Chao Xu, Boxin Shi
To reconstruct high-resolution intensity images from event data, we propose EvIntSR-Net, which converts event data into multiple latent intensity frames to achieve super-resolution on intensity images.
6 code implementations • CVPR 2021 • Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao
To maximally excavate the capability of the transformer, we propose to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs.
Ranked #1 on Single Image Deraining on Rain100L (using extra training data)
no code implementations • NeurIPS 2020 • Chu Zhou, Hang Zhao, Jin Han, Chang Xu, Chao Xu, Tiejun Huang, Boxin Shi
A conventional camera often suffers from over- or under-exposure when recording a real-world scene with a very high dynamic range (HDR).
3 code implementations • CVPR 2021 • Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu
We show that HybrIK preserves both the accuracy of 3D pose and the realistic body structure of the parametric human model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than the pure 3D keypoint estimation methods.
Ranked #3 on 3D Human Pose Estimation on EMDB
1 code implementation • 8 Nov 2020 • Lizi Wang, Hongkai Ye, Qianhao Wang, Yuman Gao, Chao Xu, Fei Gao
In autonomous navigation of mobile robots, sensors suffer from massive occlusion in cluttered environments, leaving a significant amount of space unknown during planning.
1 code implementation • 25 Oct 2020 • Jiangning Zhang, Xianfang Zeng, Chao Xu, Jun Chen, Yong liu, Yunliang Jiang
Audio-guided face reenactment aims to generate a photorealistic face whose facial expression matches the input audio.
4 code implementations • NeurIPS 2020 • Yehui Tang, Yunhe Wang, Yixing Xu, DaCheng Tao, Chunjing Xu, Chao Xu, Chang Xu
To increase the reliability of the results, we prefer to have a more rigorous research design by including a scientific control group as an essential part to minimize the effect of all factors except the association between the filter and expected network output.
1 code implementation • NeurIPS 2020 • Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, DaCheng Tao, Chang Xu
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
no code implementations • 3 Sep 2020 • Yindi Yang, Shun Zhang, Feifei Gao, Chao Xu, Jianpeng Ma, Octavia A. Dobre
In massive multiple-input multiple-output (MIMO) systems, the large number of antennas would bring a great challenge for the acquisition of the accurate channel state information, especially in the frequency division duplex mode.
2 code implementations • 20 Aug 2020 • Xin Zhou, Zhepei Wang, Chao Xu, Fei Gao
Gradient-based planners are widely used for quadrotor local planning, in which a Euclidean Signed Distance Field (ESDF) is crucial for evaluating gradient magnitude and direction.
Robotics
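A minimal sketch of how the gradient magnitude and direction mentioned above are typically evaluated from a discretized ESDF (central finite differences on a voxel grid); the grid layout and names are illustrative.

```python
import numpy as np

def esdf_gradient(esdf, idx, voxel_size):
    """Central-difference gradient of a 3D distance-field grid at interior
    voxel `idx`; it points away from the nearest obstacle and is what
    gradient-based planners use to push trajectories out of collision."""
    grad = np.zeros(3)
    for dim in range(3):
        lo, hi = list(idx), list(idx)
        lo[dim] -= 1
        hi[dim] += 1
        grad[dim] = (esdf[tuple(hi)] - esdf[tuple(lo)]) / (2.0 * voxel_size)
    return grad
```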
1 code implementation • ECCV 2020 • Jiangning Zhang, Chao Xu, Liang Liu, Mengmeng Wang, Xia Wu, Yong liu, Yunliang Jiang
The proposed DTVNet consists of two submodules: \emph{Optical Flow Encoder} (OFE) and \emph{Dynamic Video Generator} (DVG).
1 code implementation • 7 Jul 2020 • Jialin Ji, Xin Zhou, Chao Xu, Fei Gao
In this paper, we propose an efficient, receding horizon, local adaptive low-level planner as the middle layer between our original planner and controller.
Robotics
6 code implementations • CVPR 2021 • Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei zhang, Chao Xu, Chunjing Xu, DaCheng Tao, Chang Xu
To achieve an extremely fast NAS while preserving the high accuracy, we propose to identify the vital blocks and make them the priority in the architecture search.
no code implementations • 18 May 2020 • Jiangning Zhang, Liang Liu, Chao Xu, Yong liu
Recent works in the person re-identification task mainly focus on model accuracy while ignoring factors related to efficiency, e.g., model size and latency, which are critical for practical applications.
no code implementations • CVPR 2020 • Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu
A graph convolutional neural network is introduced to predict the performance of architectures based on the learned representations and their relation modeled by the graph.
no code implementations • 7 Mar 2020 • Hanting Chen, Yunhe Wang, Han Shu, Changyuan Wen, Chunjing Xu, Boxin Shi, Chao Xu, Chang Xu
To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and images generated by the student and teacher generators.
2 code implementations • 23 Feb 2020 • Yehui Tang, Yunhe Wang, Yixing Xu, Boxin Shi, Chao Xu, Chunjing Xu, Chang Xu
On one hand, massive trainable parameters significantly enhance the performance of these deep networks.
no code implementations • 17 Feb 2020 • Zhaohui Yang, Yunhe Wang, Chang Xu, Peng Du, Chao Xu, Chunjing Xu, Qi Tian
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
1 code implementation • CVPR 2020 • Tianyu Guo, Chang Xu, Jiajun Huang, Yunhe Wang, Boxin Shi, Chao Xu, DaCheng Tao
In contrast, it is more reasonable to treat the generated data as unlabeled, which could be positive or negative according to their quality.
7 code implementations • CVPR 2020 • Hanting Chen, Yunhe Wang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu
The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values.
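The sentence above motivates replacing multiplication-heavy cross-correlation with an addition-only similarity; a minimal sketch of that idea using the negative L1 distance between a filter and an input patch (normalization, gradients, and learning-rate handling in the full method are omitted here):

```python
import numpy as np

def cross_correlation(patch, filt):
    """Conventional convolution response: an inner product full of multiplications."""
    return float((patch * filt).sum())

def adder_similarity(patch, filt):
    """Addition-only alternative: negative L1 distance (larger means more similar)."""
    return float(-np.abs(patch - filt).sum())
```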
no code implementations • arXiv 2019 • Zhaohui Yang, Miaojing Shi, Chao Xu, Vittorio Ferrari, Yannis Avrithis
Weakly-supervised object detection attempts to limit the amount of supervision by dispensing the need for bounding boxes, but still assumes image-level labels on the entire training set.
Ranked #29 on Weakly Supervised Object Detection on PASCAL VOC 2012 test (using extra training data)
no code implementations • NeurIPS 2019 • Tianyu Guo, Chang Xu, Boxin Shi, Chao Xu, DaCheng Tao
A worst-case formulation can be developed over this distribution set, and then be interpreted as a generation task in an adversarial manner.
1 code implementation • CVPR 2020 • Zhaohui Yang, Yunhe Wang, Xinghao Chen, Boxin Shi, Chao Xu, Chunjing Xu, Qi Tian, Chang Xu
Architectures in the population that share parameters within one SuperNet in the latest generation will be tuned over the training dataset with a few epochs.
no code implementations • 13 Jul 2019 • Yehui Tang, Shan You, Chang Xu, Boxin Shi, Chao Xu
Specifically, we exploit the unlabeled data to mimic the classification characteristics of giant networks, so that the original capacity can be preserved nicely.
no code implementations • 1 Jun 2019 • Robert Busa-Fekete, Krzysztof Dembczynski, Alexander Golovnev, Kalina Jasinska, Mikhail Kuznetsov, Maxim Sviridenko, Chao Xu
First, we show that finding a tree with optimal training cost is NP-complete; nevertheless, there are some tractable special cases with either perfect approximation or an exact solution that can be obtained in linear time in terms of the number of labels $m$.
no code implementations • 3 May 2019 • Xiong Deng, Chao Chen, Deyang Chen, Xiangbin Cai, Xiaozhe Yin, Chao Xu, Fei Sun, Caiwen Li, Yan Li, Han Xu, Mao Ye, Guo Tian, Zhen Fan, Zhipeng Hou, Minghui Qin, Yu Chen, Zhenlin Luo, Xubing Lu, Guofu Zhou, Lang Chen, Ning Wang, Ye Zhu, Xingsen Gao, Jun-Ming Liu
The limitation of commercially available single-crystal substrates and the lack of continuous strain tunability preclude the ability to take full advantage of strain engineering for further exploring novel properties and exhaustively studying fundamental physics in complex oxides.
Materials Science
no code implementations • 8 Apr 2019 • Yong Luo, Tongliang Liu, DaCheng Tao, Chao Xu
Therefore, we propose to combine the MC outputs of different views in a weighted manner, and present the multi-view matrix completion (MVMC) framework for transductive multi-label image classification.
no code implementations • 8 Apr 2019 • Yong Luo, Tongliang Liu, DaCheng Tao, Chao Xu
In particular, DTDML learns a sparse combination of the base metrics to construct the target metric by forcing the target metric to be close to an integration of the source metrics.
no code implementations • 8 Apr 2019 • Yong Luo, DaCheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen
In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g., pedestrian, bicycle, and tree) and is properly characterized by multiple visual features (e.g., color, texture, and shape).
no code implementations • 8 Apr 2019 • Yong Luo, Yonggang Wen, DaCheng Tao, Jie Gui, Chao Xu
The features used in many image analysis-based applications are frequently of very high dimension.
no code implementations • 4 Apr 2019 • Meng Liu, Chang Xu, Yong Luo, Chao Xu, Yonggang Wen, DaCheng Tao
Feature selection is beneficial for improving the performance of general machine learning tasks by extracting an informative subset from the high-dimensional features.
no code implementations • 4 Apr 2019 • Chang Xu, DaCheng Tao, Chao Xu
In this paper, we propose the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data.
3 code implementations • ICCV 2019 • Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, Qi Tian
Learning portable neural networks is essential for computer vision, so that pre-trained heavy deep models can be well applied on edge devices such as mobile phones and micro sensors.
no code implementations • 17 Dec 2018 • Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, DaCheng Tao
Experiments on benchmark datasets and well-trained networks suggest that the proposed algorithm is superior to state-of-the-art teacher-student learning methods in terms of computational and storage complexity.
no code implementations • NeurIPS 2018 • Yunhe Wang, Chang Xu, Chunjing Xu, Chao Xu, DaCheng Tao
A series of secondary filters can be derived from a primary filter.
no code implementations • 30 Jul 2018 • Tianyu Guo, Chang Xu, Shiyi He, Boxin Shi, Chao Xu, DaCheng Tao
In this way, a portable student network with significantly fewer parameters can achieve considerable accuracy comparable to that of the teacher network.
no code implementations • CVPR 2019 • Miaojing Shi, Zhaohui Yang, Chao Xu, Qijun Chen
Modern crowd counting methods employ deep neural networks to estimate crowd counts via crowd density regressions.
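As a concrete illustration of density regression: the ground-truth map puts a unit of mass at each annotated head and blurs it, and the predicted count is just the integral (sum) of the regressed map. A minimal sketch, with a fixed Gaussian kernel (many methods use geometry-adaptive kernels instead):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_from_points(points, shape, sigma=4.0):
    """Ground-truth density map: one unit of mass per annotated head position
    (col, row), Gaussian-blurred so the map still integrates to the true count."""
    density = np.zeros(shape, dtype=np.float32)
    for x, y in points:
        density[int(y), int(x)] += 1.0
    return gaussian_filter(density, sigma)

def crowd_count(density_map):
    """The estimated count is simply the sum of the (predicted) density map."""
    return float(density_map.sum())
```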
no code implementations • CVPR 2018 • Shuqin Xie, Zitian Chen, Chao Xu, Cewu Lu
We propose a training algorithm for this framework to address the different training demands of agent and environment.
1 code implementation • 23 Oct 2017 • Kai Han, Yunhe Wang, Chao Zhang, Chao Li, Chao Xu
High-dimensional data in many areas, such as computer vision and machine learning tasks, brings computational and analytical difficulties.
1 code implementation • ICML 2017 • Yunhe Wang, Chang Xu, Chao Xu, DaCheng Tao
The filter is then re-configured to establish the mapping from original input to the new compact feature map, and the resulting network can preserve intrinsic information of the original network with significantly fewer parameters, which not only decreases the online memory for launching CNN but also accelerates the computation speed.
no code implementations • 25 Jul 2017 • Yunhe Wang, Chang Xu, Jiayan Qiu, Chao Xu, DaCheng Tao
In contrast to directly recognizing subtle weights or filters as redundant in a given CNN, this paper presents an evolutionary method to automatically eliminate redundant convolution filters.
no code implementations • CVPR 2017 • Weilong Peng, Zhiyong Feng, Chao Xu, Yong Su
As any pre-learnt subspace is not complete enough to handle the variety and details of faces and expressions, it covers only a limited span of morphing.
no code implementations • 25 Jan 2017 • Shan You, Chang Xu, Yunhe Wang, Chao Xu, DaCheng Tao
This paper presents privileged multi-label learning (PrML) to explore and exploit the relationship between labels in multi-label learning problems.
no code implementations • NeurIPS 2016 • Yunhe Wang, Chang Xu, Shan You, DaCheng Tao, Chao Xu
Deep convolutional neural networks (CNNs) are successfully used in a number of applications.
no code implementations • 21 Sep 2016 • Chao Xu, Yanjing Wang, Thomas Studer
When we say "I know why he was late", we know not only the fact that he was late, but also an explanation of this fact.
no code implementations • 28 Apr 2016 • Chang Xu, DaCheng Tao, Chao Xu
An underlying assumption in conventional multi-view learning algorithms is that all views can be simultaneously accessed.
no code implementations • 19 Apr 2016 • Shan You, Chang Xu, Yunhe Wang, Chao Xu, DaCheng Tao
The core of SLL is to explore and exploit the relationships between new labels and past labels and then inherit the relationship into hypotheses of labels to boost the performance of new classifiers.
no code implementations • 19 Apr 2016 • Yunhe Wang, Chang Xu, Shan You, DaCheng Tao, Chao Xu
Here we study the extreme visual recovery problem, in which over 90% of pixel values in a given image are missing.
no code implementations • 21 Jan 2016 • Weilong Peng, Zhiyong Feng, Chao Xu
In this paper, B-spline Shape from Motion and Shading (BsSfMS) is proposed to reconstruct a continuous B-spline surface for multi-view face images, according to the assumption that shading and motion information in the images contain the 1st- and 0th-order derivatives of the B-spline face, respectively.
3 code implementations • 9 Feb 2015 • Yong Luo, DaCheng Tao, Yonggang Wen, Kotagiri Ramamohanarao, Chao Xu
As a consequence, the high order correlation information contained in the different views is explored and thus a more reliable common subspace shared by all features can be obtained.
no code implementations • 27 Nov 2014 • Tao Han, Chao Xu, Ryan Loxton, Lei Xie
This paper considers a new bi-objective optimization formulation for robust RGB-D visual odometry.
no code implementations • 26 Oct 2014 • Chang Xu, Tongliang Liu, DaCheng Tao, Chao Xu
We analyze the local Rademacher complexity of empirical risk minimization (ERM)-based multi-label learning algorithms, and in doing so propose a new algorithm for multi-label learning.
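For reference, the local Rademacher complexity analyzed above is the usual Rademacher average restricted to a small-variance ball (standard definition, not specific to this paper):

$$
\mathcal{R}_n(\mathcal{F}; r) = \mathbb{E}\,\sup_{f \in \mathcal{F} :\ P f^{2} \le r}\ \frac{1}{n}\sum_{i=1}^{n} \sigma_i\, f(x_i),
$$

where the $\sigma_i$ are i.i.d. Rademacher variables and $r$ localizes the class to functions of small variance, which is what yields faster rates than the global complexity.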
no code implementations • 20 Apr 2013 • Chang Xu, DaCheng Tao, Chao Xu
Notably, co-training style algorithms train alternately to maximize the mutual agreement on two distinct views of the data; multiple kernel learning algorithms exploit kernels that naturally correspond to different views and combine kernels either linearly or non-linearly to improve learning performance; and subspace learning algorithms aim to obtain a latent subspace shared by multiple views by assuming that the input views are generated from this latent subspace.