3 code implementations • ACL 2022 • Yanzeng Li, Jiangxia Cao, Xin Cong, Zhenyu Zhang, Bowen Yu, Hongsong Zhu, Tingwen Liu
Chinese pre-trained language models usually exploit contextual character information to learn representations, while ignoring linguistic knowledge, e.g., word and sentence information.
5 code implementations • 14 Apr 2025 • Yuqian Fu, Xingyu Qiu, Bin Ren, Yanwei Fu, Radu Timofte, Nicu Sebe, Ming-Hsuan Yang, Luc van Gool, Kaijin Zhang, Qingpeng Nong, Xiugang Dong, Hong Gao, Xiangsheng Zhou, Jiancheng Pan, Yanxing Liu, Xiao He, Jiahao Li, Yuze Sun, Xiaomeng Huang, Zhenyu Zhang, Ran Ma, YuHan Liu, Zijian Zhuang, Shuai Yi, Yixiong Zou, Lingyi Hong, Mingxi Chen, Runze Li, Xingdong Sheng, Wenqiang Zhang, Weisen Chen, Yongxin Yan, Xinguo Chen, Yuanjie Shao, Zhengrong Zuo, Nong Sang, Hao Wu, Haoran Sun, Shuming Hu, Yan Zhang, Zhiguang Shi, Yu Zhang, Chao Chen, Tao Wang, Da Feng, Linhai Zhuo, Ziming Lin, Yali Huang, Jie Me, Yiming Yang, Mi Guo, Mingyuan Jiu, Mingliang Xu, Maomao Xiong, Qunshu Zhang, Xinyu Cao, Yuqing Yang, Dianmo Sheng, Xuanpu Zhao, Zhiyu Li, Xuyang Ding, Wenqian Li
Cross-Domain Few-Shot Object Detection (CD-FSOD) poses significant challenges to existing object detection and few-shot detection models when applied across domains.
Cross-Domain Few-Shot
Cross-Domain Few-Shot Object Detection
1 code implementation • 7 Apr 2025 • Runjin Chen, Zhenyu Zhang, Junyuan Hong, Souvik Kundu, Zhangyang Wang
To address this issue, we investigate the internal reasoning structures of LLMs and categorize them into three primary thought types: execution, reflection, and transition thoughts.
no code implementations • 7 Apr 2025 • Zhenyu Zhang, Qianli Wang, Gang Liu, Feifei Gao, Pingzhi Fan
By designing co-prime numbers of subcarriers and time slots in different subframes, the difference in the responses of the subframes for a target can be used to estimate the distance and velocity of an out-of-range target.
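At its core, this is an ambiguity-resolution argument: two measurements that alias with co-prime periods pin down a value far beyond either period alone. A toy numerical sketch of that idea follows; the delay-bin framing, variable names, and brute-force CRT solver are illustrative assumptions, not the paper's signal model.

# Toy sketch: why co-prime ambiguity periods resolve out-of-range targets.
# Assumption (not from the paper): each subframe i measures the true delay
# bin only modulo its unambiguous range N_i; with co-prime N1, N2 the
# Chinese Remainder Theorem recovers the true bin up to N1 * N2.
from math import gcd

N1, N2 = 7, 11            # co-prime unambiguous ranges of two subframes
assert gcd(N1, N2) == 1

true_bin = 59             # out of range for either subframe alone
r1, r2 = true_bin % N1, true_bin % N2   # aliased measurements

# Brute-force CRT: the solution is unique in [0, N1*N2)
recovered = next(x for x in range(N1 * N2)
                 if x % N1 == r1 and x % N2 == r2)
print(recovered)          # -> 59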
no code implementations • 28 Mar 2025 • Haijie Yang, Zhenyu Zhang, Hao Tang, Jianjun Qian, Jian Yang
However, they often face challenges with temporal consistency, particularly in the talking head domain, where continuous changes in facial expressions intensify the level of difficulty.
1 code implementation • 23 Mar 2025 • Zefeng Zhang, Hengzhu Tang, Jiawei Sheng, Zhenyu Zhang, Yiming Ren, Zhenyang Li, Dawei Yin, Duohe Ma, Tingwen Liu
Multimodal Large Language Models excel in various tasks, yet they often struggle with modality bias: the model relies heavily on a single modality and overlooks critical information in the others, leading to incorrect focus and irrelevant responses.
no code implementations • 12 Mar 2025 • Lei Liu, Yuchao Lu, Ling An, Huajie Liang, ChiChun Zhou, Zhenyu Zhang
As human activities intensify, environmental systems such as aquatic ecosystems and water treatment systems face increasingly complex pressures that impact ecological balance, public health, and sustainable development, making intelligent anomaly monitoring essential.
no code implementations • 10 Mar 2025 • Zhangdi Liu, Ling An, Mengke Song, Zhuohang Yu, Shan Wang, Kezhen Qi, Zhenyu Zhang, ChiChun Zhou
The design of inorganic catalysts and the prediction of their catalytic efficiency are fundamental challenges in chemistry and materials science.
no code implementations • 4 Mar 2025 • Kui Huang, Mengke Song, Shuo Ba, Ling An, Huajie Liang, Huanxi Deng, Yang Liu, Zhenyu Zhang, ChiChun Zhou
On a real-world dataset of 4,169 waste images, only 50 labeled samples were needed to accurately label thousands, improving classification accuracy by 29.85% compared to supervised models.
1 code implementation • 24 Feb 2025 • Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xiang Li, Li Shen, Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu
This paper comprehensively evaluates several recently proposed optimizers for 4-bit training, revealing that low-bit precision amplifies sensitivity to learning rates and often causes unstable gradient norms, leading to divergence at higher learning rates.
no code implementations • 19 Feb 2025 • Yilong Chen, Junyuan Shang, Zhenyu Zhang, Yanxi Xie, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang
Large language models (LLMs) face inherent performance bottlenecks under parameter constraints, particularly in processing critical tokens that demand complex reasoning.
no code implementations • 19 Feb 2025 • Naibin Gu, Zhenyu Zhang, Xiyu Liu, Peng Fu, Zheng Lin, Shuohuan Wang, Yu Sun, Hua Wu, Weiping Wang, Haifeng Wang
Due to the demand for efficient fine-tuning of large language models, Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.
1 code implementation • 11 Feb 2025 • Xialie Zhuang, Zhikai Jia, Jianjin Li, Zhenyu Zhang, Li Shen, Zheng Cao, Shiwei Liu
To address this, we propose Mask-Enhanced Autoregressive Prediction (MEAP), a simple yet effective training paradigm that seamlessly integrates Masked Language Modeling (MLM) into Next-Token Prediction (NTP) to enhance the latter's in-context retrieval capabilities.
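A minimal sketch of a MEAP-style training step as described: randomly mask a fraction of the input tokens while leaving the prediction targets intact, so the objective remains plain next-token prediction. The mask ratio, mask token id, and loss plumbing are assumptions, not the authors' exact implementation.

import torch
import torch.nn.functional as F

MASK_ID = 0          # assumed id of a [MASK]-style token in the vocabulary

def meap_step(model, tokens, mask_ratio=0.15):
    # tokens: (batch, seq_len) integer ids
    inputs, targets = tokens[:, :-1].clone(), tokens[:, 1:]
    keep = torch.rand_like(inputs, dtype=torch.float) >= mask_ratio
    inputs[~keep] = MASK_ID                    # corrupt the *inputs* only
    logits = model(inputs)                     # (batch, seq_len-1, vocab)
    # Targets stay intact, so the loss is still next-token prediction.
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))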
no code implementations • 6 Jan 2025 • Rui Xie, Yinhong Liu, Penghao Zhou, Chen Zhao, Jun Zhou, Kai Zhang, Zhenyu Zhang, Jian Yang, Zhenheng Yang, Ying Tai
Image diffusion models have been adapted for real-world video super-resolution to tackle over-smoothing issues in GAN-based methods.
no code implementations • 2 Jan 2025 • Ao Gao, Luosong Guo, Tao Chen, Zhao Wang, Ying Tai, Jian Yang, Zhenyu Zhang
In this way, the proposed method tackles the limitations of initialization and optimization, leading to efficient and accurate 3DGS modeling.
1 code implementation • 20 Dec 2024 • Xiantao Hu, Ying Tai, Xu Zhao, Chen Zhao, Zhenyu Zhang, Jun Li, Bineng Zhong, Jian Yang
These temporal information tokens are used to guide the localization of the target in the next time state, establish long-range contextual relationships between video frames, and capture the temporal trajectory of the target.
Ranked #4 on Rgb-T Tracking on LasHeR
1 code implementation • 16 Dec 2024 • Xiaokun Sun, Zeyu Cai, Ying Tai, Jian Yang, Zhenyu Zhang
We propose StrandHead, a novel text-to-3D head avatar generation method capable of generating disentangled 3D hair with strand representation.
no code implementations • 12 Dec 2024 • Weiqi Li, Shijie Zhao, Chong Mou, Xuhan Sheng, Zhenyu Zhang, Qian Wang, Junlin Li, Li Zhang, Jian Zhang
As virtual reality gains popularity, the demand for controllable creation of immersive and dynamic omnidirectional videos (ODVs) is increasing.
no code implementations • 11 Dec 2024 • Tianxin Huang, Zhenyu Zhang, Ying Tai, Gim Hee Lee
Through experiments on both single images and video sequences, we demonstrate the effectiveness of our approach in modeling facial textures under challenging illumination affected by occlusions.
1 code implementation • 6 Dec 2024 • Hanqing Zhu, Zhenyu Zhang, Wenyan Cong, Xi Liu, Sem Park, Vikas Chandra, Bo Long, David Z. Pan, Zhangyang Wang, Jinwon Lee
This memory burden necessitates using more or higher-end GPUs or reducing batch sizes, limiting training scalability and throughput.
no code implementations • 2 Dec 2024 • Hao Yang, Zhenyu Zhang, Yanyan Zhao, Bing Qin
In the real world, data quality usually varies across samples; such noise is called data uncertainty.
no code implementations • 23 Nov 2024 • Haijie Yang, Zhenyu Zhang, Hao Tang, Jianjun Qian, Jian Yang
In this paper, we propose ConsistentAvatar, a novel framework for fully consistent and high-fidelity talking avatar generation.
no code implementations • 13 Nov 2024 • Farouq Sammour, Jia Xu, Xi Wang, Mo Hu, Zhenyu Zhang
Construction remains one of the most hazardous sectors.
no code implementations • 12 Nov 2024 • Zhuohang Yu, Ling An, Yansong Li, Yu Wu, Zeyu Dong, Zhangdi Liu, Le Gao, Zhenyu Zhang, ChiChun Zhou
The absence of explicit Feature Relation Patterns (FRPs) presents a significant challenge for deep learning techniques in scientific applications that are not image-, text-, or graph-based.
no code implementations • 17 Oct 2024 • Chuanyu Tang, Yilong Chen, Zhenyu Zhang, Junyuan Shang, Wenyuan Zhang, Yong Huang, Tingwen Liu
Low-Rank Adaptation (LoRA) has driven research toward closing its performance gap with full fine-tuning.
1 code implementation • 2 Oct 2024 • Renkai Wu, Xianjin Wang, Pengchen Liang, Zhenyu Zhang, Qing Chang, Hao Tang
In addition, we organize and propose a dehaze dataset for robotic vision in urological surgery (USRobot-Dehaze dataset).
no code implementations • 2 Oct 2024 • Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu, Sen Su
To ensure that each specialized expert in the MoE model works as expected, we select a small amount of seed data at which each expert excels to pre-optimize the router.
1 code implementation • 23 Aug 2024 • Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Yuhua Li, Ruixuan Li
Few-shot open-set recognition (FSOR) is a challenging task that requires a model to recognize known classes and identify unknown classes with limited labeled data.
1 code implementation • 23 Aug 2024 • Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Zhimeng Huang, Yuhua Li, Ruixuan Li
Humans exhibit a remarkable ability to learn quickly from a limited number of labeled samples, a capability that starkly contrasts with that of current machine learning systems.
1 code implementation • 17 Aug 2024 • Xiaokun Sun, Zhenyu Zhang, Ying Tai, Qian Wang, Hao Tang, Zili Yi, Jian Yang
In this paper, we propose Barbie, a novel framework for generating 3D avatars that can be dressed in diverse and high-quality Barbie-like garments and accessories.
2 code implementations • 7 Aug 2024 • Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, dianhai yu, Hua Wu
Large Language Models (LLMs) have ignited an innovative surge of AI applications, marking a new era of exciting possibilities equipped with extended context windows.
1 code implementation • 29 Jul 2024 • Taoyu Su, Xinghua Zhang, Jiawei Sheng, Zhenyu Zhang, Tingwen Liu
Other studies refine uni-modal information with graph structures, but may introduce unnecessary relations in specific modalities.
no code implementations • 27 Jul 2024 • Tengyao Tu, Wei Zeng, Kun Zhao, Zhenyu Zhang
The results show that adding a classifier based on the random forest algorithm to the model is highly effective, and our model generally outperforms standard deep learning methods.
1 code implementation • 15 Jul 2024 • Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
Modern Large Language Models (LLMs) are composed of matrices with billions of elements, making their storage and processing quite demanding in terms of computational resources and memory usage.
2 code implementations • 11 Jul 2024 • Zhenyu Zhang, Ajay Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
To address these limitations, we introduce Q-Galore, a novel approach that substantially reduces memory usage by combining quantization and low-rank projection, surpassing the benefits of GaLore.
no code implementations • 28 Jun 2024 • Guangba Yu, Gou Tan, Haojia Huang, Zhenyu Zhang, Pengfei Chen, Roberto Natella, Zibin Zheng
Moreover, this survey contributes to the field by providing a framework for fault diagnosis, evaluating the state-of-the-art in FI, and identifying areas for improvement in FI techniques to enhance the resilience of AI systems.
no code implementations • 20 Jun 2024 • Shijie Han, Zhenyu Zhang, Andrei Arsene Simion
Language models like BERT excel at sentence classification tasks due to extensive pre-training on general data, but their robustness to parameter corruption is unexplored.
no code implementations • 16 Jun 2024 • Zhenyu Zhang, Bingguang Hao, Jinpeng Li, Zekai Zhang, Dongyan Zhao
Most large language models (LLMs) are sensitive to prompts: a synonymous expression or a typo may lead to unexpected results.
no code implementations • 3 Jun 2024 • Yilong Chen, Linhao Zhang, Junyuan Shang, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun
Large language models (LLMs) with billions of parameters demonstrate impressive performance.
no code implementations • 24 May 2024 • Hanchen Tai, Qingdong He, Jiangning Zhang, Yijie Qian, Zhenyu Zhang, Xiaobin Hu, Xiangtai Li, Yabiao Wang, Yong liu
This framework is designed to perform understanding tasks for any 3D scene without requiring prior knowledge of the scene.
no code implementations • 29 Apr 2024 • Tianyidan Xie, Rui Ma, Qian Wang, Xiaoqian Ye, Feixuan Liu, Ying Tai, Zhenyu Zhang, Lanjun Wang, Zili Yi
In this framework, each agent is specialized in a distinct aspect, such as foreground understanding, diversity enhancement, object integrity protection, and textual prompt consistency.
no code implementations • 29 Apr 2024 • Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Weiran Xu, Yu Sun, Hua Wu
Large language models (LLMs) with one or more fine-tuning phases have become a necessary step to unlock various capabilities, enabling LLMs to follow natural language instructions or align with human preferences.
no code implementations • 11 Apr 2024 • Zeyu Zhang, Yuanshen Zhao, Jingxian Duan, Yaou Liu, Hairong Zheng, Dong Liang, Zhenyu Zhang, Zhi-Cheng Li
The PGHG consists of a biological knowledge-guided representation learning network and a pathology-genome heterogeneous graph.
1 code implementation • 2 Apr 2024 • Rui Xie, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Jian Yang, Ying Tai
Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs.
1 code implementation • 26 Mar 2024 • Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao
Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, and digital human creation, to name a few.
1 code implementation • 25 Mar 2024 • Bin Chen, Zhenyu Zhang, Weiqi Li, Chen Zhao, Jiwen Yu, Shijie Zhao, Jie Chen, Jian Zhang
To enable such memory-intensive end-to-end fine-tuning, we propose a novel two-level invertible design to transform both (1) the multi-step sampling process and (2) the noise estimation U-Net in each step into invertible networks.
no code implementations • CVPR 2024 • Zhiqiang Yan, Yuankai Lin, Kun Wang, Yupeng Zheng, YuFei Wang, Zhenyu Zhang, Jun Li, Jian Yang
Depth completion is a vital task for autonomous driving, as it involves reconstructing the precise 3D geometry of a scene from sparse and noisy depth measurements.
3 code implementations • 6 Mar 2024 • Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian
Our approach reduces memory usage by up to 65.5% in optimizer states while maintaining both efficiency and performance for pre-training on LLaMA 1B and 7B architectures with the C4 dataset with up to 19.7B tokens, and for fine-tuning RoBERTa on GLUE tasks.
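The optimizer-state saving comes from keeping Adam's moments in a low-rank subspace of the gradient rather than at full weight shape. A single-matrix sketch in the spirit of this approach follows; the rank, projector refresh interval, and the omitted bias correction are simplifying assumptions.

import torch

def galore_adam_step(W, grad, state, rank=4, lr=1e-3,
                     betas=(0.9, 0.999), eps=1e-8, update_proj_every=200):
    # W, grad: plain (m, n) tensors (no autograd here); state: a dict.
    if state.get("step", 0) % update_proj_every == 0:
        U, _, _ = torch.linalg.svd(grad, full_matrices=False)
        state["P"] = U[:, :rank]                     # (m, r) projector
    P = state["P"]
    g = P.T @ grad                                   # (r, n) low-rank gradient
    # Adam moments live at the low-rank shape, not the full (m, n) shape.
    state.setdefault("m", torch.zeros_like(g)).mul_(betas[0]).add_(g, alpha=1 - betas[0])
    state.setdefault("v", torch.zeros_like(g)).mul_(betas[1]).addcmul_(g, g, value=1 - betas[1])
    update = state["m"] / (state["v"].sqrt() + eps)  # bias correction omitted
    W -= lr * (P @ update)                           # project back to full space
    state["step"] = state.get("step", 0) + 1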
1 code implementation • 5 Mar 2024 • Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang
To address this problem, this paper introduces Multi-scale Positional Encoding (Ms-PoE) which is a simple yet effective plug-and-play approach to enhance the capacity of LLMs to handle the relevant information located in the middle of the context, without fine-tuning or introducing any additional overhead.
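The core operation is a head-wise rescaling of position indices, so different attention heads observe the context at different positional "zoom levels." A minimal sketch of generating such multi-scale position ids is below; the ratio range is an assumption, not the paper's tuned values.

import torch

def multiscale_position_ids(seq_len, n_heads, min_ratio=1.2, max_ratio=1.8):
    # One compression ratio per head: small ratios keep fine-grained
    # locality, larger ones "zoom out" over the middle of the context.
    ratios = torch.linspace(min_ratio, max_ratio, n_heads)
    base = torch.arange(seq_len, dtype=torch.float)
    return base.unsqueeze(0) / ratios.unsqueeze(1)   # (n_heads, seq_len)

# These fractional ids would then be fed to RoPE in place of 0..seq_len-1.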
no code implementations • 19 Feb 2024 • Xuelin Qian, Yu Wang, Simian Luo, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue, Bo Zhao, Tiejun Huang, Yunsheng Wu, Yanwei Fu
In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
1 code implementation • 14 Feb 2024 • Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, Beidi Chen
Many computational factors limit broader deployment of large language models.
no code implementations • 25 Jan 2024 • Maciej Besta, Florim Memedi, Zhenyu Zhang, Robert Gerstenberger, Guangyuan Piao, Nils Blach, Piotr Nyczyk, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Lukas Gianinazzi, Ales Kubicek, Hubert Niewiadomski, Aidan O'Mahony, Onur Mutlu, Torsten Hoefler
Among these, prompt engineering coupled with structures has emerged as a promising paradigm, with designs such as Chain-of-Thought, Tree of Thoughts, or Graph of Thoughts, in which the overall LLM reasoning is guided by a structure such as a graph.
1 code implementation • 10 Jan 2024 • Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang
To address these two pain points, we propose QuantumSEA, an in-time sparse exploration for noise-adaptive quantum circuits, aiming to achieve two key objectives: (1) implicit circuit capacity during training, by dynamically exploring the circuit's sparse connectivity and sticking to a fixed small number of quantum gates throughout training, which satisfies the coherence time and enjoys light noise, enabling feasible execution on real quantum devices; (2) noise robustness, by jointly optimizing the topology and parameters of quantum circuits under real device noise models.
2 code implementations • 22 Dec 2023 • Zhen Tan, Tianlong Chen, Zhenyu Zhang, Huan Liu
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
1 code implementation • 30 Nov 2023 • Shiyao Cui, Zhenyu Zhang, Yilong Chen, Wenyuan Zhang, Tianyun Liu, Siqi Wang, Tingwen Liu
The widespread adoption of generative artificial intelligence has heightened concerns about the potential harms posed by AI-generated texts, primarily stemming from factoid, unfair, and toxic content.
no code implementations • 15 Nov 2023 • Zhaocong liu, Fa Zhang, Lin Cheng, Huanxi Deng, Xiaoyan Yang, Zhenyu Zhang, ChiChun Zhou
Addressing this, an unsupervised classification method with three key ideas is introduced: 1) dual-step feature dimensionality reduction using a pre-trained model and manifold learning, 2) a voting mechanism from multiple clustering algorithms, and 3) post-hoc instead of prior manual annotation.
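Steps 1) and 2) can be pictured with stock scikit-learn components: reduce frozen pre-trained embeddings with manifold learning, cluster with several algorithms, and trust only the samples on which the clusterings agree. The sketch below uses t-SNE, k-means, and agglomerative clustering purely as stand-ins; the authors' exact backbone and clusterers are not specified here.

import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans, AgglomerativeClustering

def cluster_with_voting(features, n_classes):
    # features: (n_samples, d) embeddings from a frozen pre-trained model
    z = TSNE(n_components=2).fit_transform(features)      # manifold step
    a = KMeans(n_clusters=n_classes, n_init=10).fit_predict(z)
    b = AgglomerativeClustering(n_clusters=n_classes).fit_predict(z)
    # Align b's arbitrary cluster ids to a's via majority overlap.
    remap = {lb: np.bincount(a[b == lb]).argmax() for lb in set(b)}
    b_aligned = np.array([remap[lb] for lb in b])
    confident = a == b_aligned        # samples where the two votes agree
    return a, confident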
1 code implementation • 2 Nov 2023 • Zhenyu Zhang, Benlu Wang, Weijie Liang, Yizhi Li, Xuechen Guo, Guanhong Wang, Shiyan Li, Gaoang Wang
With the development of multimodality and large language models, the deep learning-based technique for medical image captioning holds the potential to offer valuable diagnostic recommendations.
1 code implementation • 8 Oct 2023 • Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu
Large Language Models (LLMs), renowned for their remarkable performance across diverse domains, present a challenge when it comes to practical deployment due to their colossal model size.
1 code implementation • 2 Oct 2023 • Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
Sparsely activated Mixture-of-Experts (SMoE) has shown promise in scaling up the learning capacity of neural networks; however, it has issues such as (a) High Memory Usage, due to duplication of the network layers into multiple copies as experts, and (b) Redundancy in Experts, as common learning-based routing policies suffer from representational collapse.
1 code implementation • 1 Oct 2023 • Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du
We propose Joint MLP/Attention (JoMA) dynamics, a novel mathematical framework to understand the training procedure of multilayer Transformer architectures.
1 code implementation • 5 Sep 2023 • Yuxiang Guo, Xiaopeng Gao, Zhenyu Zhang, W. K. Chan, Bo Jiang
These findings emphasize the effectiveness of transformer-based pre-trained models in JIT defect prediction tasks, especially in scenarios with limited training data.
no code implementations • 1 Sep 2023 • Zhiqiang Yan, Xiang Li, Le Hui, Zhenyu Zhang, Jun Li, Jian Yang
To tackle these challenges, we explore a repetitive design in our image guided network to gradually and sufficiently recover depth values.
no code implementations • 19 Aug 2023 • Kun Wang, Zhiqiang Yan, Huang Tian, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Neural Radiance Fields (NeRF) have shown promise in generating realistic novel views from sparse scene images.
no code implementations • 29 Jun 2023 • Zhenyu Zhang, Wenhao Chai, Zhongyu Jiang, Tian Ye, Mingli Song, Jenq-Neng Hwang, Gaoang Wang
Estimating 3D human poses from only a 2D human pose sequence has been thoroughly explored in recent years.
2 code implementations • 24 Jun 2023 • Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen
Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens.
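The eviction rule is simple to state: under a fixed cache budget, always keep a recent window plus the tokens that have accumulated the most attention mass, and drop everything else. A toy sketch of computing such a keep-mask follows; the 50/50 budget split is an assumption.

import torch

def h2o_keep_mask(acc_attn, budget, recent_frac=0.5):
    # acc_attn: (seq_len,) attention mass each cached token has received
    seq_len = acc_attn.numel()
    if seq_len <= budget:
        return torch.ones(seq_len, dtype=torch.bool)
    n_recent = int(budget * recent_frac)
    keep = torch.zeros(seq_len, dtype=torch.bool)
    keep[seq_len - n_recent:] = True                 # recent window
    scores = acc_attn.clone()
    scores[seq_len - n_recent:] = float("-inf")      # don't double-count
    hh = torch.topk(scores, budget - n_recent).indices
    keep[hh] = True                                  # heavy hitters
    return keep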
no code implementations • 8 Jun 2023 • Kun Wang, Zhiqiang Yan, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang
Our key contributions are: (1) We parameterize the geometry and appearance of the object using a multi-scale global feature extractor, which avoids frequent point-wise feature retrieval and camera dependency.
1 code implementation • 30 May 2023 • Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, Shiwei Liu
We hereby carry out a first-of-its-kind study unveiling that modern large-kernel ConvNets, a compelling competitor to Vision Transformers, are remarkably more effective teachers for small-kernel ConvNets, due to more similar architectures.
no code implementations • 9 May 2023 • Ming Cheng, Haoyu Ma, Qiufang Ma, Xiaopeng Sun, Weiqi Li, Zhenyu Zhang, Xuhan Sheng, Shijie Zhao, Junlin Li, Li Zhang
Multi-stage strategies are frequently employed in image restoration tasks.
no code implementations • 26 Apr 2023 • Xiaopeng Sun, Weiqi Li, Zhenyu Zhang, Qiufang Ma, Xuhan Sheng, Ming Cheng, Haoyu Ma, Shijie Zhao, Jian Zhang, Junlin Li, Li Zhang
Model A aims to enhance the feature extraction ability of 360° image positional information, while Model B further focuses on the high-frequency information of 360° images.
no code implementations • 26 Mar 2023 • Simian Luo, Xuelin Qian, Yanwei Fu, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue
Auto-Regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
no code implementations • CVPR 2023 • Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc van Gool
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.
1 code implementation • 3 Mar 2023 • Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang
In pursuit of a more general evaluation and unveiling the true potential of sparse algorithms, we introduce the "Sparsity May Cry" Benchmark (SMC-Bench), a collection of 4 carefully curated, diverse tasks with 10 datasets that capture a wide range of domain-specific and sophisticated knowledge.
1 code implementation • 2 Mar 2023 • Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang
Despite their remarkable achievement, gigantic transformers encounter significant drawbacks, including exorbitant computational and memory footprints during training, as well as severe collapse evidenced by a high degree of parameter redundancy.
1 code implementation • 24 Feb 2023 • Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
Given a robust model trained to be resilient to one or multiple types of distribution shifts (e.g., natural image corruptions), how is that "robustness" encoded in the model weights, and how easily can it be disentangled and/or "zero-shot" transferred to some other models?
no code implementations • CVPR 2023 • Zhenyu Zhang, Renwang Chen, Weijian Cao, Ying Tai, Chengjie Wang
To address this problem, this paper presents a novel Neural Proto-face Field (NPF) for unsupervised robust 3D face modeling.
no code implementations • CVPR 2023 • Tianxin Huang, Zhonggan Ding, Jiangning Zhang, Ying Tai, Zhenyu Zhang, Mingang Chen, Chengjie Wang, Yong liu
Specifically, we use the contrastive constraint to help CALoss learn a representation space with shape similarity, while we introduce the adversarial strategy to help CALoss mine differences between reconstructed results and ground truths.
no code implementations • ICCV 2023 • Simian Luo, Xuelin Qian, Yanwei Fu, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue
Auto-Regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
no code implementations • 29 Nov 2022 • Bowen Yu, Zhenyu Zhang, Jingyang Li, Haiyang Yu, Tingwen Liu, Jian Sun, Yongbin Li, Bin Wang
Open Information Extraction (OpenIE) facilitates the open-domain discovery of textual facts.
no code implementations • 20 Nov 2022 • Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
Unsupervised depth completion aims to recover dense depth from the sparse one without using the ground-truth annotation.
no code implementations • 9 Nov 2022 • Kaixiong Zhou, Zhenyu Zhang, Shengyuan Chen, Tianlong Chen, Xiao Huang, Zhangyang Wang, Xia Hu
Quantum neural networks (QNNs), an interdisciplinary field of quantum computing and machine learning, have attracted tremendous research interests due to the specific quantum advantages.
1 code implementation • NeurIPS 2022 • Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
Improving the performance of deep networks in data-limited regimes has warranted much attention.
2 code implementations • CVPR 2023 • Zhida Feng, Zhenyu Zhang, Xintong Yu, Yewei Fang, Lanxin Li, Xuyi Chen, Yuxiang Lu, Jiaxiang Liu, Weichong Yin, Shikun Feng, Yu Sun, Li Chen, Hao Tian, Hua Wu, Haifeng Wang
Recent progress in diffusion models has revolutionized the popular technology of text-to-image generation.
Ranked #12 on Text-to-Image Generation on MS COCO
2 code implementations • 12 Oct 2022 • Qiming Peng, Yinxu Pan, Wenjin Wang, Bin Luo, Zhenyu Zhang, Zhengjie Huang, Teng Hu, Weichong Yin, Yongfeng Chen, Yin Zhang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
Recent years have witnessed the rise and success of pre-training techniques in visually-rich document understanding.
Ranked #2 on Semantic entity labeling on FUNSD
no code implementations • 20 Sep 2022 • Yang Wu, Pai Peng, Zhenyu Zhang, Yanyan Zhao, Bing Qin
At the low level, we propose the progressive tri-modal attention, which models the tri-modal feature interactions by adopting a two-pass strategy, and further leverages such interactions to significantly reduce computation and memory complexity by shortening the input token length.
no code implementations • 14 Jul 2022 • Zhenyu Zhang, Bowen Yu, Haiyang Yu, Tingwen Liu, Cheng Fu, Jingyang Li, Chengguang Tang, Jian Sun, Yongbin Li
In this paper, we propose a Layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents (VRDs), so as to generate accurate responses in dialogue systems.
1 code implementation • 15 Jun 2022 • Tianlong Chen, huan zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang
Certifiable robustness is a highly desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios, but often demands tedious computations to establish.
1 code implementation • 9 Jun 2022 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Yang Zhang, Shiyu Chang, Zhangyang Wang
For example, on downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with the standard, fast adversarial, and adversarial pre-training from ImageNet, at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, Zhangyang Wang
Trojan attacks threaten deep neural networks (DNNs) by poisoning them to behave normally on most samples, yet to produce manipulated results for inputs attached with a particular trigger.
no code implementations • NAACL 2022 • Zhenyu Zhang, Yuming Zhao, Meng Chen, Xiaodong He
Motivated by this, we propose a novel label anchored contrastive learning approach (denoted as LaCon) for language understanding.
1 code implementation • 18 Mar 2022 • Zhiqiang Yan, Xiang Li, Kun Wang, Zhenyu Zhang, Jun Li, Jian Yang
To deal with the PDC task, we train a deep network that takes both depth and image as inputs for the dense panoramic depth recovery.
1 code implementation • CVPR 2022 • Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang
However, a "head-to-toe assessment" regarding the extent of redundancy in ViTs, and how much we could gain by thoroughly mitigating such, has been absent for this field.
1 code implementation • ICLR 2022 • Tianlong Chen, Zhenyu Zhang, Pengjun Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zhangyang Wang
We introduce two alternatives for sparse adversarial training: (i) static sparsity, by leveraging recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from the early training; (ii) dynamic sparsity, by allowing the sparse subnetwork to adaptively adjust its connectivity pattern (while sticking to the same sparsity ratio) throughout training.
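Option (ii) boils down to a periodic prune-and-grow step that swaps connections while holding the sparsity ratio fixed. The sketch below prunes the smallest-magnitude live weights and regrows by gradient magnitude, a common criterion that is assumed here rather than taken from the paper.

import torch

def prune_and_grow(weight, mask, grad, swap_frac=0.1):
    # weight, grad: dense tensors; mask: 0/1 tensor of the same shape.
    live = mask.bool()
    n_swap = max(1, int(swap_frac * live.sum()))
    # Prune: smallest-magnitude currently-active weights.
    w = weight.abs().masked_fill(~live, float("inf"))
    drop = torch.topk(w.flatten(), n_swap, largest=False).indices
    # Grow: largest-gradient currently-inactive weights.
    g = grad.abs().masked_fill(live, float("-inf"))
    grow = torch.topk(g.flatten(), n_swap).indices
    new_mask = mask.clone().flatten()
    new_mask[drop], new_mask[grow] = 0.0, 1.0   # sparsity ratio unchanged
    return new_mask.view_as(mask)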
no code implementations • 26 Jan 2022 • Jian Li, Bin Zhang, Yabiao Wang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Jilin Li, Xiaoming Huang, Yili Xia
Along with current multi-scale based detectors, Feature Aggregation and Enhancement (FAE) modules have shown superior performance gains for cutting-edge object detection.
Ranked #1 on Face Detection on WIDER Face (Medium)
no code implementations • CVPR 2022 • Zhenyu Zhang, Yanhao Ge, Ying Tai, Weijian Cao, Renwang Chen, Kunlin Liu, Hao Tang, Xiaoming Huang, Chengjie Wang, Zhifeng Xie, Dongjin Huang
This paper presents a novel Physically-guided Disentangled Implicit Rendering (PhyDIR) framework for high-fidelity 3D face modeling.
no code implementations • CVPR 2022 • Zhenyu Zhang, Yanhao Ge, Ying Tai, Xiaoming Huang, Chengjie Wang, Hao Tang, Dongjin Huang, Zhifeng Xie
In-the-wild 3D face modelling is a challenging problem as the predicted facial geometry and texture suffer from a lack of reliable clues or priors, when the input images are degraded.
1 code implementation • NeurIPS 2021 • Xuxi Chen, Tianlong Chen, Zhenyu Zhang, Zhangyang Wang
The lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork (i.e., winning ticket) instead of a full model for both training and inference, that can lower both costs without sacrificing the performance.
1 code implementation • 18 Oct 2021 • Zhenyu Zhang, Yewei Gu, Xiaowei Yi, Xianfeng Zhao
With the increasing development of text-to-speech (TTS) and voice conversion (VC) technologies, the detection of synthetic speech has become increasingly challenging.
1 code implementation • EMNLP 2021 • Xinghua Zhang, Bowen Yu, Tingwen Liu, Zhenyu Zhang, Jiawei Sheng, Mengge Xue, Hongbo Xu
Distantly supervised named entity recognition (DS-NER) efficiently reduces labor costs but meanwhile intrinsically suffers from the label noise due to the strong assumption of distant supervision.
2 code implementations • 6 Oct 2021 • Yewei Gu, Zhenyu Zhang, Xiaowei Yi, Xianfeng Zhao
To realize any-to-any (A2A) voice conversion (VC), most methods perform symmetric self-supervised reconstruction tasks (Xi to Xi), which usually results in poor performance due to inadequate feature decoupling, especially for unseen speakers.
no code implementations • 22 Sep 2021 • Zhenyu Zhang, Tao Guo, Meng Chen
DialogueBERT was pre-trained with 70 million dialogues from real-world scenarios, and then fine-tuned on three different downstream dialogue understanding tasks.
2 code implementations • ICCV 2021 • Kun Wang, Zhenyu Zhang, Zhiqiang Yan, Xiang Li, Baobei Xu, Jun Li, Jian Yang
Monocular depth estimation aims at predicting depth from a single image or video.
no code implementations • 29 Jul 2021 • Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang
However, blurry guidance in the image and unclear structure in the depth still impede the performance of the image guided frameworks.
Ranked #2 on Depth Completion on KITTI Depth Completion
1 code implementation • CVPR 2021 • Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang
Non-parametric face modeling aims to reconstruct 3D face only from images without shape assumptions.
1 code implementation • 6 Jun 2021 • Zhenyu Zhang, Xuxi Chen, Tianlong Chen, Zhangyang Wang
We observe that a high-quality winning ticket can be found with training and pruning the dense network on the very compact PrAC set, which can substantially save training iterations for the ticket finding process.
1 code implementation • ICLR 2021 • Xuxi Chen, Zhenyu Zhang, Yongduo Sui, Tianlong Chen
In this work, we for the first time study the existence of such trainable matching subnetworks in deep GANs.
no code implementations • 22 May 2021 • Zhenyu Zhang, Yuanyuan Dong, Keping Long, Xiyuan Wang, Xiaoming Dai
Decentralized baseband processing (DBP) architecture, which partitions the base station antennas into multiple antenna clusters, has been recently proposed to alleviate the excessively high interconnect bandwidth, chip input/output data rates, and detection complexity for massive multi-user multiple-input multiple-output (MU-MIMO) systems.
1 code implementation • 21 Apr 2021 • Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng Li, Thomas Tanay, Fenglong Song, Wentao Chao, Qiang Guo, Yan Liu, Jiang Li, Xiaochao Qu, Dewang Hou, Jiayu Yang, Lyn Jiang, Di You, Zhenyu Zhang, Chong Mou, Iaroslav Koshelev, Pavel Ostyakov, Andrey Somov, Jia Hao, Xueyi Zou, Shijie Zhao, Xiaopeng Sun, Yiting Liao, Yuanzhi Zhang, Qing Wang, Gen Zhan, Mengxi Guo, Junlin Li, Ming Lu, Zhan Ma, Pablo Navarrete Michelini, Hai Wang, Yiyun Chen, Jingyu Guo, Liliang Zhang, Wenming Yang, Sijung Kim, Syehoon Oh, Yucong Wang, Minjie Cai, Wei Hao, Kangdi Shi, Liangyan Li, Jun Chen, Wei Gao, Wang Liu, XiaoYu Zhang, Linjie Zhou, Sixin Lin, Ru Wang
This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results.
1 code implementation • 16 Apr 2021 • Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang
However, the BN layer is costly to calculate and is typically implemented with non-binary parameters, leaving a hurdle for the efficient implementation of BNN training.
Ranked #179 on Image Classification on CIFAR-100
no code implementations • 11 Mar 2021 • Liying Zhang, Leiqiang Li, Chenxiao Zhao, Shunfang Li, Jinfeng Jia, Zhenyu Zhang, Yu Jia, Ping Cui
The atomistic growth mechanisms and nontrivial topology of stanene as presented here are also discussed in connection with recent experimental findings.
Materials Science
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
In view of those, we introduce two pruning options, e.g., top-down and bottom-up, for finding lifelong tickets.
no code implementations • ICLR 2021 • Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang
A recent study (Rice et al., 2020) revealed overfitting to be a dominant phenomenon in adversarially robust training of deep networks, and that appropriate early-stopping of adversarial training (AT) could match the performance gains of most recent algorithmic improvements.
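Operationally, that finding suggests a simple protocol: run adversarial training as usual, but checkpoint on robust validation accuracy and stop once it degrades. A minimal sketch follows, assuming callables for one AT epoch and for PGD validation accuracy; the patience value is illustrative.

def train_with_robust_early_stopping(train_epoch, robust_val_acc,
                                     max_epochs=200, patience=10):
    best, best_epoch, best_state = 0.0, 0, None
    for epoch in range(max_epochs):
        state = train_epoch(epoch)            # one epoch of AT; returns weights
        acc = robust_val_acc(state)           # robust accuracy on a held-out split
        if acc > best:
            best, best_epoch, best_state = acc, epoch, state
        elif epoch - best_epoch >= patience:  # robust overfitting has set in
            break
    return best_state, best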
no code implementations • COLING 2020 • Bowen Yu, Xue Mengge, Zhenyu Zhang, Tingwen Liu, Wang Yubin, Bin Wang
Dependency trees have been shown to be effective in capturing long-range relations between target entities.
no code implementations • COLING 2020 • Zhenyu Zhang, Bowen Yu, Xiaobo Shu, Tingwen Liu, Hengzhu Tang, Wang Yubin, Li Guo
Document-level relation extraction (RE) poses new challenges over its sentence-level counterpart since it requires an adequate comprehension of the whole document and the multi-hop reasoning ability across multiple sentences to reach the final result.
1 code implementation • EMNLP 2020 • Mengge Xue, Bowen Yu, Zhenyu Zhang, Tingwen Liu, Yue Zhang, Bin Wang
More recently, Named Entity Recognition has achieved great advances aided by pre-training approaches such as BERT.