no code implementations • ECCV 2020 • Zhetong Liang, Shi Guo, Hong Gu, Huaqi Zhang, Lei Zhang
On one hand, most of the models are trained on video sequences with synthetic noise.
no code implementations • ECCV 2020 • Lida Li, Kun Wang, Shuai Li, Xiangchu Feng, Lei Zhang
The 2D convolutional (Conv2d) layer is the fundamental element of a deep convolutional neural network (CNN).
no code implementations • ECCV 2020 • Hongwei Yong, Jianqiang Huang, Deyu Meng, Xian-Sheng Hua, Lei Zhang
To gain a deeper understanding of BN, in this work we prove that BN actually introduces a certain level of noise into the sample mean and variance during the training process, and that the noise level depends only on the batch size.
no code implementations • 14 Sep 2023 • Lei Zhang, Zhengkun Tian, Xiang Chen, Jiaming Sun, Hongyu Xiang, Ke Ding, Guanglu Wan
To address this issue, we draw inspiration from the multifaceted capabilities of LLMs and Whisper, and focus on integrating multiple ASR text processing tasks related to speech recognition into the ASR model.
no code implementations • 5 Sep 2023 • Bojia Zi, Xianbiao Qi, Lingzhi Wang, Jianan Wang, Kam-Fai Wong, Lei Zhang
In this paper, we present Delta-LoRA, which is a novel parameter-efficient approach to fine-tune large language models (LLMs).
1 code implementation • 28 Aug 2023 • Tao Yang, Peiran Ren, Xuansong Xie, Lei Zhang
However, the existing methods along this line either fail to keep faithful pixel-wise image structures or resort to extra skip connections to reproduce details, which requires additional training in image space and limits their extension to other related tasks in latent space such as image stylization.
1 code implementation • ICCV 2023 • Jie Yang, Ailing Zeng, Feng Li, Shilong Liu, Ruimao Zhang, Lei Zhang
Click-Pose explores how user feedback can cooperate with a neural keypoint detector to correct the predicted keypoints in an interactive way for a faster and more effective annotation process.
1 code implementation • 16 Aug 2023 • Haiyuan Zhao, Lei Zhang, Jun Xu, Guohao Cai, Zhenhua Dong, Ji-Rong Wen
In video recommendation, watch time is commonly adopted as an indicator of user interest.
no code implementations • 16 Aug 2023 • Xinghua Xue, Cheng Liu, Bo Liu, Haitong Huang, Ying Wang, Tao Luo, Lei Zhang, Huawei Li, Xiaowei Li
When it is applied to fault-tolerant neural networks enhanced with fault-aware retraining and constrained activation functions, the resulting model accuracy generally shows significant improvement in the presence of various faults.
1 code implementation • ICCV 2023 • Yichen Yuan, Yifan Wang, Lijun Wang, Xiaoqi Zhao, Huchuan Lu, Yu Wang, Weibo Su, Lei Zhang
Recent leading zero-shot video object segmentation (ZVOS) works are devoted to integrating appearance and motion information by elaborately designing feature fusion modules and applying them identically in multiple feature stages.
1 code implementation • ICCV 2023 • jianqi ma, Zhetong Liang, Wangmeng Xiang, Xi Yang, Lei Zhang
Scene Text Image Super-resolution (STISR) aims to recover high-resolution (HR) scene text images with visually pleasant and readable text content from the given low-resolution (LR) input.
1 code implementation • ICCV 2023 • Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang
Weakly-supervised image segmentation has recently attracted increasing research attention, aiming to avoid expensive pixel-wise labeling.
no code implementations • ICCV 2023 • Hongyang Li, Hao Zhang, Zhaoyang Zeng, Shilong Liu, Feng Li, Tianhe Ren, Lei Zhang
Existing feature lifting approaches, such as Lift-Splat-based and 2D attention-based, either use estimated depth to get pseudo LiDAR features and then splat them to a 3D space, which is a one-pass operation without feature refinement, or ignore depth and lift features by 2D attention mechanisms, which achieve finer semantics while suffering from a depth ambiguity problem.
1 code implementation • ICCV 2023 • Binglu Wang, Lei Zhang, Zhaozhong Wang, Yongqiang Zhao, Tianfei Zhou
This paper presents CORE, a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception.
no code implementations • 11 Jul 2023 • Bin Du, He Zhang, Xiangle Cheng, Lei Zhang
The network structure reflects the edge-cloud computing topology and is trained to minimize the expectation of the cost function for unconstrained continuous optimization problems.
1 code implementation • 10 Jul 2023 • Feng Li, Hao Zhang, Peize Sun, Xueyan Zou, Shilong Liu, Jianwei Yang, Chunyuan Li, Lei Zhang, Jianfeng Gao
In this paper, we introduce Semantic-SAM, a universal image segmentation model that enables segmenting and recognizing anything at any desired granularity.
1 code implementation • 8 Jul 2023 • Liqi Xue, Tianyi Xu, Yongbao Song, Yan Liu, Lei Zhang, XianTong Zhen, Jun Xu
But the majority of media images on the internet remain in 8-bit standard dynamic range (SDR) format.
1 code implementation • 6 Jul 2023 • Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang
Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language description.
no code implementations • 4 Jul 2023 • Yiyang Liao, Lei Zhang, Ziye Jia, Chao Dong, Yifan Zhang, Qihui Wu, Huiling Hu, Bin Wang
However, due to the limited frequency of the ADS-B technique, UAVs equipped with ADS-B devices cause packet loss for both UAVs and civil aviation.
1 code implementation • 3 Jul 2023 • Jing Lin, Ailing Zeng, Shunlin Lu, Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang
In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset.
no code implementations • 25 Jun 2023 • Lei Zhang, Dong Li, Olha Jurečková, Mark Stamp
We find that the steganographic capacity of the learning models tested is surprisingly high, and that in each case, there is a clear threshold after which model performance rapidly degrades.
no code implementations • 21 Jun 2023 • Yukun Huang, Jianan Wang, Yukai Shi, Xianbiao Qi, Zheng-Jun Zha, Lei Zhang
Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled text-to-3D content creation by optimizing a randomly initialized Neural Radiance Fields (NeRF) with score distillation.
no code implementations • 15 Jun 2023 • Xianbiao Qi, Jianan Wang, Lei Zhang
This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively.
1 code implementation • 12 Jun 2023 • Tianhe Ren, Shilong Liu, Feng Li, Hao Zhang, Ailing Zeng, Jie Yang, Xingyu Liao, Ding Jia, Hongyang Li, He Cao, Jianan Wang, Zhaoyang Zeng, Xianbiao Qi, Yuhui Yuan, Jianwei Yang, Lei Zhang
To address this issue, we develop a unified, highly modular, and lightweight codebase called detrex, which supports a majority of the mainstream DETR-based instance recognition algorithms, covering various fundamental tasks, including object detection, segmentation, and pose estimation.
1 code implementation • 6 Jun 2023 • Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang
We are releasing the RAM at https://recognize-anything.github.io/ to foster the advancements of large models in computer vision.
1 code implementation • 6 Jun 2023 • Peggy Tang, Junbin Gao, Lei Zhang, Zhiyong Wang
Recently, compressive text summarisation offers a balance between the conciseness issue of extractive summarisation and the factual hallucination issue of abstractive summarisation.
1 code implementation • CVPR 2023 • Yuxiang Wei, Zhilong Ji, Xiaohe Wu, Jinfeng Bai, Lei Zhang, WangMeng Zuo
Despite the progress in semantic image synthesis, it remains a challenging problem to generate photo-realistic parts from an input semantic map.
no code implementations • 21 May 2023 • Yukun Huang, Jianan Wang, Ailing Zeng, He Cao, Xianbiao Qi, Yukai Shi, Zheng-Jun Zha, Lei Zhang
We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and parametric human body prior.
1 code implementation • 20 May 2023 • Jie Yang, Bingliang Li, Fengyu Yang, Ailing Zeng, Lei Zhang, Ruimao Zhang
Extensive experiments demonstrate that DiffHOI significantly outperforms the state-of-the-art in regular detection (i.e., 41.50 mAP) and zero-shot detection.
Ranked #2 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)
1 code implementation • journal 2023 • Zhitao Zeng, Pengwen Dai, Xuan Zhang, Lei Zhang, Xiaochun Cao
Human-object relationship detection reveals the fine-grained relationship between humans and objects, helping the comprehensive understanding of videos.
no code implementations • 28 Apr 2023 • Lei Zhang, Yuge Zhang, Kan Ren, Dongsheng Li, Yuqing Yang
In contrast, though human engineers have the incredible ability to understand tasks and reason about solutions, their experience and knowledge are often sparse and difficult to utilize by quantitative approaches.
no code implementations • 26 Apr 2023 • Kai Armstrong, Lei Zhang, Yan Wen, Alexander P. Willmott, Paul Lee, Xujiong Ye
In recent years, the NHS has had increasing difficulty seeing all low-risk patients, including but not limited to suspected osteoarthritis (OA) patients.
2 code implementations • 25 Apr 2023 • Tianhe Ren, Jianwei Yang, Shilong Liu, Ailing Zeng, Feng Li, Hao Zhang, Hongyang Li, Zhaoyang Zeng, Lei Zhang
This work presents Focal-Stable-DINO, a strong and reproducible object detection model which achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev using only 700M parameters without any test time augmentation.
Ranked #4 on Object Detection on COCO test-dev
1 code implementation • CVPR 2023 • Haoyu Wang, Guansong Pang, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang
Few-shot open-set recognition (FSOR) is a challenging task of great practical value.
1 code implementation • 19 Apr 2023 • Xianbiao Qi, Jianan Wang, Yihao Chen, Yukai Shi, Lei Zhang
In contrast to previous practical tricks that address training instability by learning rate warmup, layer normalization, attention formulation, and weight initialization, we show that Lipschitz continuity is a more essential property to ensure training stability.
1 code implementation • CVPR 2023 • Yihao Chen, Xianbiao Qi, Jianan Wang, Lei Zhang
In this way, we can reduce the GPU memory consumption of contrastive loss computation from $\mathcal{O}(B^2)$ to $\mathcal{O}(\frac{B^2}{N})$, where $B$ and $N$ are the batch size and the number of GPUs used for training.
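The scaling quoted in this abstract can be checked with simple arithmetic. The sketch below is not the paper's implementation: it assumes, purely for illustration, that the $B \times B$ similarity matrix is split row-wise so that each of the $N$ GPUs holds a $(B/N) \times B$ block. The function names and the values of `B` and `N` are hypothetical.

```python
def full_similarity_entries(batch_size: int) -> int:
    """Entries in the full B x B similarity matrix held on a single device."""
    return batch_size * batch_size


def sharded_similarity_entries(batch_size: int, num_gpus: int) -> int:
    """Entries per device when rows are split evenly across GPUs (assumed scheme)."""
    if batch_size % num_gpus != 0:
        raise ValueError("batch_size must be divisible by num_gpus")
    rows_per_gpu = batch_size // num_gpus
    return rows_per_gpu * batch_size


# Hypothetical setting: global batch of 32768 samples on 64 GPUs.
B, N = 32768, 64
full = full_similarity_entries(B)
sharded = sharded_similarity_entries(B, N)
print(full, sharded, full // sharded)
```

Under this assumed sharding, the per-device entry count shrinks by exactly the factor $N$, which matches the $B^2 \to B^2/N$ reduction stated in the abstract.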
no code implementations • 16 Apr 2023 • Fuxiang Huang, Lei Zhang
Interactive Image Retrieval (IIR) aims to retrieve images that are generally similar to the reference image but under the requested text modification.
1 code implementation • ICCV 2023 • Shilong Liu, Tianhe Ren, Jiayu Chen, Zhaoyang Zeng, Hao Zhang, Feng Li, Hongyang Li, Jun Huang, Hang Su, Jun Zhu, Lei Zhang
We point out that the unstable matching in DETR is caused by a multi-optimization path problem, which is highlighted by the one-to-one matching design in DETR.
1 code implementation • ICCV 2023 • Xuan Ju, Ailing Zeng, Chenchen Zhao, Jianan Wang, Lei Zhang, Qiang Xu
While such a plug-and-play approach is appealing, the inevitable and uncertain conflicts between the original images produced from the frozen SD branch and the given condition incur significant challenges for the learnable branch, which essentially conducts image feature editing for condition enforcement.
1 code implementation • CVPR 2023 • Mingjun Xu, Lingyun Qin, WeiJie Chen, ShiLiang Pu, Lei Zhang
In this work, we present an idea to remove non-causal factors from common features by multi-view adversarial training on source domains, because we observe that such insignificant non-causal factors may still be significant in other latent spaces (views) due to the multi-mode structure of data.
no code implementations • 31 Mar 2023 • Yuang Ai, Xiaoqiang Zhou, Huaibo Huang, Lei Zhang, Ran He
Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR) by accessing both the source and target data.
1 code implementation • CVPR 2023 • Jing Lin, Ailing Zeng, Haoqian Wang, Lei Zhang, Yu Li
It is challenging to perform this task with a single network due to resolution issues, i.e., the face and hands are usually located in extremely small regions.
Ranked #1 on 3D Human Reconstruction on EHF
1 code implementation • 26 Mar 2023 • Minghan Li, Lei Zhang
As a result, the amount of pixel-wise annotations in existing video instance segmentation (VIS) datasets is small, limiting the generalization capability of trained VIS models.
1 code implementation • 26 Mar 2023 • Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Lei Zhang, Baochang Ma, Xiangang Li
However, current research rarely studies the impact of different amounts of instruction data on model performance, especially in real-world use cases.
1 code implementation • CVPR 2023 • Zhiyuan Ma, Xiangyu Zhu, GuoJun Qi, Zhen Lei, Lei Zhang
In this paper, we propose One-shot Talking face Avatar (OTAvatar), which constructs face avatars by a generalized controllable tri-plane rendering solution so that each personalized avatar can be constructed from only one portrait as the reference.
1 code implementation • CVPR 2023 • Minghan Li, Shuai Li, Wangmeng Xiang, Lei Zhang
The proposed MDQE is the first VIS method with per-clip input that achieves state-of-the-art results on challenging videos and competitive performance on simple videos.
1 code implementation • CVPR 2023 • Du Chen, Jie Liang, Xindong Zhang, Ming Liu, Hui Zeng, Lei Zhang
A human guided GT image dataset with both positive and negative samples is then constructed, and a loss function is proposed to train the Real-ISR models.
1 code implementation • CVPR 2023 • Shuai Li, Minghan Li, Ruihuang Li, Chenhang He, Lei Zhang
The positive and negative weights of these soft anchors are dynamically adjusted during training so that they can contribute more to "representation learning" in the early training stage, and contribute more to "duplicated prediction removal" in the later stage.
1 code implementation • CVPR 2023 • Pengfei Wang, Zhaoxiang Zhang, Zhen Lei, Lei Zhang
In this paper, we present two conditions to ensure that the model could converge to a flat minimum with a small loss, and present an algorithm, named Sharpness-Aware Gradient Matching (SAGM), to meet the two conditions for improving model generalization capability.
1 code implementation • 18 Mar 2023 • Xiaoqi Zhao, Youwei Pang, Lihe Zhang, Huchuan Lu, Lei Zhang
They ignore two key problems when the encoder exchanges information with the decoder: one is the lack of an interference control mechanism between them, and the other is the failure to consider the disparity of the contributions from different encoder levels.
1 code implementation • CVPR 2023 • Chenhang He, Ruihuang Li, Yabin Zhang, Shuai Li, Lei Zhang
Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the sequence and fuses them to detect the objects in the current frame.
1 code implementation • CVPR 2023 • Ruihuang Li, Chenhang He, Yabin Zhang, Shuai Li, Liyi Chen, Lei Zhang
Weakly supervised instance segmentation using only bounding box annotations has recently attracted much research attention.
1 code implementation • CVPR 2023 • Ruihuang Li, Chenhang He, Shuai Li, Yabin Zhang, Lei Zhang
Representative instance segmentation methods mostly segment different object instances with a mask of fixed resolution, e.g., a 28×28 grid.
2 code implementations • ICCV 2023 • Hao Zhang, Feng Li, Xueyan Zou, Shilong Liu, Chunyuan Li, Jianfeng Gao, Jianwei Yang, Lei Zhang
We present OpenSeeD, a simple Open-vocabulary Segmentation and Detection framework that jointly learns from different segmentation and detection datasets.
Ranked #2 on Instance Segmentation on ADE20K val (using extra training data)
1 code implementation • CVPR 2023 • Hao Zhang, Feng Li, Huaizhe xu, Shijia Huang, Shilong Liu, Lionel M. Ni, Lei Zhang
We present a mask-piloted Transformer which improves masked-attention in Mask2Former for image segmentation.
1 code implementation • 13 Mar 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni
Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.
no code implementations • 13 Mar 2023 • Tao Yang, Peiran Ren, Xuansong Xie, Lei Zhang
In supervised image restoration tasks, one key issue is how to obtain the aligned high-quality (HQ) and low-quality (LQ) training image pairs.
2 code implementations • 10 Mar 2023 • Xinyu Huang, Youcai Zhang, Jinyu Ma, Weiwei Tian, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Lei Zhang
This paper presents Tag2Text, a vision language pre-training (VLP) framework, which introduces image tagging into vision-language models to guide the learning of visual-linguistic features.
2 code implementations • 9 Mar 2023 • Shilong Liu, Zhaoyang Zeng, Tianhe Ren, Feng Li, Hao Zhang, Jie Yang, Chunyuan Li, Jianwei Yang, Hang Su, Jun Zhu, Lei Zhang
To effectively fuse language and vision modalities, we conceptually divide a closed-set detector into three phases and propose a tight fusion solution, which includes a feature enhancer, a language-guided query selection, and a cross-modality decoder for cross-modality fusion.
Ranked #1 on Zero-Shot Object Detection on ODinW
1 code implementation • CVPR 2023 • Xuan Ju, Ailing Zeng, Jianan Wang, Qiang Xu, Lei Zhang
Humans have long been recorded in a variety of forms since antiquity.
1 code implementation • ICCV 2023 • Yuxiang Wei, Yabo Zhang, Zhilong Ji, Jinfeng Bai, Lei Zhang, WangMeng Zuo
In addition to the unprecedented ability in imaginary creation, large text-to-image models are expected to take customized concepts in image generation.
no code implementations • 27 Feb 2023 • Shi Guo, Hongwei Yong, Xindong Zhang, jianqi ma, Lei Zhang
In this paper, we propose the spatial-frequency attention network (SFANet) to enhance the network's ability in exploiting long-range dependency.
no code implementations • 25 Feb 2023 • Hao Zhang, Hongyang Li, Ailing Zeng, Feng Li, Shilong Liu, Xingyu Liao, Lei Zhang
To address the second issue, we introduce an auxiliary learning task called Depth-aware Negative Suppression loss.
no code implementations • 21 Feb 2023 • Kenechi G. Omeke, Michael Mollel, Syed T. Shah, Lei Zhang, Qammer H. Abbasi, Muhammad Ali Imran
In this paper, we propose a sustainable scheme to improve the throughput and lifetime of underwater networks, enabling them to potentially operate indefinitely.
no code implementations • 20 Feb 2023 • Hao Lv, Bing Li, Lei Zhang, Cheng Liu, Ying Wang
The RRAM-based neuromorphic computing system has attracted explosive interest for its superior data processing capability and energy efficiency compared with traditional architectures, and is thus widely used in many data-centric applications.
no code implementations • 15 Feb 2023 • Lei Zhang, Mingliang Wang, Xin Zhou, Xingyu Wu, Yiming Cao, Yonghui Xu, Lizhen Cui, Zhiqi Shen
To address the issue, we propose a novel Dual Graph Multitask framework for imbalanced Delivery Time Estimation (DGM-DTE).
1 code implementation • 10 Feb 2023 • Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu
Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs and outperforms the state-of-the-art (SOTA) methods on various benchmark datasets.
3 code implementations • 3 Feb 2023 • Jie Yang, Ailing Zeng, Shilong Liu, Feng Li, Ruimao Zhang, Lei Zhang
This paper presents a novel end-to-end framework with Explicit box Detection for multi-person Pose estimation, called ED-Pose, where it unifies the contextual learning between human-level (global) and keypoint-level (local) information.
Ranked #1 on 2D Human Pose Estimation on Human-Art
no code implementations • 30 Jan 2023 • Yabin Zhang, Bin Deng, Ruihuang Li, Kui Jia, Lei Zhang
By updating the model against the adversarial statistics perturbation during training, we allow the model to explore the worst-case domain and hence improve its generalization performance.
no code implementations • 28 Jan 2023 • Lei Zhang, Kaixin Bai, Zhaopeng Chen, Yunlei Shi, Jianwei Zhang
In physical robotic experiments, our grasping framework grasped single known objects and novel complex-shaped household objects with a success rate of 90.91%.
no code implementations • 28 Jan 2023 • Xin Wei, Lei Zhang, Jianwei Zhang, Junyou Wang, Wenjie Liu, Jiaqi Li, Xian Jiang
In addition, we build a high-quality acne detection dataset named ACNE-DET to verify the effectiveness of DSDH.
no code implementations • 18 Jan 2023 • Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang
Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL).
1 code implementation • ICCV 2023 • Liyi Chen, Chenyang Lei, Ruihuang Li, Shuai Li, Zhaoxiang Zhang, Lei Zhang
Without introducing any external supervision and human priors, the proposed FPR effectively suppresses wrong activations from the background objects.
1 code implementation • CVPR 2023 • Fei Zhou, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang
Prototypical Network is a popular few-shot solver that aims at establishing a feature metric generalizable to novel few-shot classification (FSC) tasks using deep neural networks.
1 code implementation • ICCV 2023 • Song Guo, Lei Zhang, Xiawu Zheng, Yan Wang, Yuchao Li, Fei Chao, Chenglin Wu, Shengchuan Zhang, Rongrong Ji
In this paper, we try to solve this problem by introducing a principled and unified framework based on Information Bottleneck (IB) theory, which further guides us to an automatic pruning approach.
no code implementations • CVPR 2023 • Feng Li, Ailing Zeng, Shilong Liu, Hao Zhang, Hongyang Li, Lei Zhang, Lionel M. Ni
Recent DEtection TRansformer-based (DETR) models have obtained remarkable performance.
no code implementations • ICCV 2023 • Jiashuo Fan, Yaoyuan Liang, Leyao Liu, ShaoLun Huang, Lei Zhang
We evaluate our approach on two datasets and show that our proposed RCA-NOC approach outperforms state-of-the-art methods by a large margin, demonstrating its effectiveness in improving vision-language representation for novel object captioning.
1 code implementation • CVPR 2023 • Shuaizheng Liu, Xindong Zhang, Lingchen Sun, Zhetong Liang, Hui Zeng, Lei Zhang
In this work, we develop, for the first time to the best of our knowledge, an HDR image dataset captured with mobile phone cameras, namely the Mobile-HDR dataset.
no code implementations • ICCV 2023 • Lei Zhang, Zhibo Wang, Xiaowei Dong, Yunhe Feng, Xiaoyi Pang, Zhifei Zhang, Kui Ren
Network pruning aims to compress models while minimizing loss in accuracy.
1 code implementation • CVPR 2023 • Hongwei Yong, Ying Sun, Lei Zhang
Though the full-matrix preconditioned gradient methods theoretically have a lower regret bound, they are impractical for training DNNs because of their high complexity.
no code implementations • 28 Dec 2022 • He Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan YAO, Lei Zhang
We further provide a hypothesis on the implication of disentangling the generative backbone as an encoder-decoder structure and show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND).
2 code implementations • CVPR 2023 • Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.
no code implementations • International Journal of Computer Vision 2022 • Zhenwei He, Lei Zhang, Xinbo Gao, David Zhang
Our proposed MAF has two distinct contributions: (1) The Hierarchical Domain Feature Alignment (HDFA) module is introduced to minimize the image-level domain disparity, where Scale Reduction Module (SRM) reduces the feature map size without information loss and increases the training efficiency.
1 code implementation • 10 Dec 2022 • Ruohao Wang, Xiaohui Liu, Zhilu Zhang, Xiaohe Wu, Chun-Mei Feng, Lei Zhang, WangMeng Zuo
On the other hand, alignment algorithms in existing VSR methods perform poorly for real-world videos, leading to unsatisfactory results.
2 code implementations • 3 Dec 2022 • Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Risheng Yu, Xiansheng Hua, Lei Zhang
In contrast to fully supervised methods using pixel-wise mask labels, box-supervised instance segmentation takes advantage of simple box annotations, which has recently attracted increasing research attention.
1 code implementation • 1 Dec 2022 • Ruibin Yuan, Hanzhi Yin, Yi Wang, Yifan He, Yushi Ye, Lei Zhang, Zhizheng Wu
We apply this technique to the automatic speaker verification (ASV) task as a proof of concept.
1 code implementation • 28 Nov 2022 • Shilong Liu, Yaoyuan Liang, Feng Li, Shijia Huang, Hao Zhang, Hang Su, Jun Zhu, Lei Zhang
As phrase extraction can be regarded as a 1D text segmentation problem, we formulate PEG as a dual detection problem and propose a novel DQ-DETR model, which introduces dual queries to probe different features from image and text for object prediction and phrase mask prediction.
Ranked #6 on Referring Expression Comprehension on RefCOCO
no code implementations • 17 Nov 2022 • Yiyue Hu, Lei Zhang, Nan Mu, Lei Liu
To this end, we propose a parameter-efficient transformer to explore intrinsic inductive bias via position information for medical image segmentation.
no code implementations • 15 Nov 2022 • Shijia Huang, Feng Li, Hao Zhang, Shilong Liu, Lei Zhang, LiWei Wang
Our mutual supervision contains two directions.
1 code implementation • 13 Nov 2022 • Yabin Zhang, Jiehong Lin, Ruihuang Li, Kui Jia, Lei Zhang
Generally, we corrupt the point cloud with affine transformation and masking as input and learn an encoder-decoder model to reconstruct the original point cloud from its corrupted version.
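The corruption step described in this abstract can be illustrated with a small sketch. It covers only the input corruption (an affine transformation followed by random masking), not the encoder-decoder that reconstructs the original cloud. The order of the two operations, the random-drop masking scheme, and all function names and parameter values are assumptions made for illustration rather than details taken from the paper.

```python
import random


def affine_transform(points, matrix, translation):
    """Apply p -> matrix @ p + translation to every 3D point."""
    transformed = []
    for x, y, z in points:
        transformed.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + offset
            for row, offset in zip(matrix, translation)
        ))
    return transformed


def mask_points(points, mask_ratio, rng):
    """Drop a random fraction of points, keeping the survivors in order."""
    num_keep = len(points) - int(len(points) * mask_ratio)
    keep = sorted(rng.sample(range(len(points)), num_keep))
    return [points[i] for i in keep]


def corrupt(points, matrix, translation, mask_ratio, seed=0):
    """Corrupt a point cloud: affine transform first, then random masking."""
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    return mask_points(affine_transform(points, matrix, translation), mask_ratio, rng)


# Hypothetical toy cloud: 10 points on a line at height z = 1.
cloud = [(float(i), 0.0, 1.0) for i in range(10)]
scale2 = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
corrupted = corrupt(cloud, scale2, (0.0, 0.0, 1.0), mask_ratio=0.3)
print(len(cloud), len(corrupted))
```

A reconstruction model would then take `corrupted` as input and `cloud` as the target.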
1 code implementation • 4 Nov 2022 • Yan Wen, Lei Zhang, Xiangli Meng, Xujiong Ye
Besides the complex nature of colonoscopy frames with intrinsic frame formation artefacts such as light reflections and the diversity of polyp types/shapes, the publicly available polyp segmentation training datasets are limited, small and imbalanced.
no code implementations • 31 Oct 2022 • Lei Zhang, Shilin Zhou, Chen Gong, Zhenghua Li, Zhefeng Wang, Baoxing Huai, Min Zhang
Chinese word segmentation (CWS) models have achieved very high performance when the training data is sufficient and in-domain.
no code implementations • 19 Oct 2022 • Xinliang Liu, Bo Xu, Lei Zhang
Neural operators have emerged as a powerful tool for learning the mapping between infinite-dimensional parameter and solution spaces of partial differential equations (PDEs).
1 code implementation • 19 Oct 2022 • Lei Zhang, Xiaoke Wang, Michael Rawson, Radu Balan, Edward H. Herskovits, Elias Melhem, Linda Chang, Ze Wang, Thomas Ernst
Evaluation used simulated T1 and T2-weighted axial, coronal, and sagittal images unseen during training, as well as T1-weighted images with motion artifacts from real scans.
no code implementations • 16 Oct 2022 • Peggy Tang, Kun Hu, Lei Zhang, Jiebo Luo, Zhiyong Wang
Multimodal summarisation with multimodal output is drawing increasing attention due to the rapid growth of multimedia data.
1 code implementation • 15 Oct 2022 • Xiaoming Li, Shiguang Zhang, Shangchen Zhou, Lei Zhang, WangMeng Zuo
Generally, it is a challenging and intractable task to improve the photo-realistic performance of blind restoration and adaptively handle the generic and specific restoration scenarios with a single unified model.
1 code implementation • 9 Oct 2022 • Rang Meng, Xianfeng Li, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Mingli Song, Di Xie, ShiLiang Pu
Under this guidance, a novel Attention Diversification framework is proposed, in which Intra-Model and Inter-Model Attention Diversification Regularization collaborate to reassign appropriate attention to diverse task-related features.
1 code implementation • 3 Oct 2022 • Xiaoming Li, Chaofeng Chen, Xianhui Lin, WangMeng Zuo, Lei Zhang
Notably, LQ face images, which may have the same degradation process as natural images, can be robustly restored with photo-realistic textures by exploiting their strong structural priors.
no code implementations • 13 Sep 2022 • Zhen Yu, Toan Nguyen, Yaniv Gal, Lie Ju, Shekhar S. Chandra, Lei Zhang, Paul Bonnington, Victoria Mar, Zhiyong Wang, ZongYuan Ge
Accordingly, the learned prototypes preserve the semantic class relations in the embedding space and we can predict the label of an image by assigning its feature to the nearest hyperbolic class prototype.
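Nearest-prototype prediction in hyperbolic space can be sketched as follows. The abstract does not say which hyperbolic model is used, so this example assumes the Poincaré ball, whose distance is $d(u, v) = \operatorname{arcosh}\left(1 + \frac{2\lVert u - v\rVert^2}{(1 - \lVert u\rVert^2)(1 - \lVert v\rVert^2)}\right)$. The prototype coordinates, class labels, and function names are hypothetical.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def predict(feature, prototypes):
    """Return the label whose prototype is closest to the feature."""
    return min(prototypes, key=lambda label: poincare_distance(feature, prototypes[label]))


# Hypothetical 2D prototypes for three classes.
prototypes = {
    "class_a": (0.5, 0.0),
    "class_b": (-0.5, 0.0),
    "class_c": (0.0, 0.5),
}
label = predict((0.4, 0.1), prototypes)
print(label)
```

In practice the feature would come from an image encoder whose output is mapped into the ball; that mapping is outside the scope of this sketch.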
no code implementations • 6 Sep 2022 • Lei Zhang, Heung-Yeung Shum
This paper revisits the principle of uniform convergence in statistical learning, discusses how it acts as the foundation behind machine learning, and attempts to gain a better understanding of the essential problem that current deep learning algorithms are solving.
no code implementations • 1 Sep 2022 • Yifan Zhang, Ziye Jia, Chao Dong, Yuntian Liu, Lei Zhang, Qihui Wu
It is noted that the recurrent neural network (RNN) is suitable for UAV trajectory prediction, in which the long short-term memory (LSTM) is specialized in dealing with time-series data.
2 code implementations • ICCV 2023 • Wangmeng Xiang, Chao Li, Yuxuan Zhou, Biao Wang, Lei Zhang
More specifically, we employ a pre-trained large-scale language model as the knowledge engine to automatically generate text descriptions of the body-part movements of actions, and propose a multi-modal training scheme that utilizes the text encoder to generate feature vectors for different body parts and supervise the skeleton encoder for action representation learning.
Ranked #4 on Skeleton Based Action Recognition on N-UCLA
1 code implementation • 27 Jul 2022 • Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Xian-Sheng Hua, Lei Zhang
For 3D video-based tasks such as action recognition, however, directly applying spatiotemporal transformers on video data will bring heavy computation and memory burdens due to the largely increased number of patches and the quadratic complexity of self-attention computation.
Ranked #8 on Action Recognition on Something-Something V1
1 code implementation • 24 Jul 2022 • Lei Zhang, Guanyu Gao, Huaizheng Zhang
Then, the knowledge learnt from edge clients will be aggregated by a centralized parameter server, where the knowledge will be selectively and attentively distilled along the spatial and temporal dimensions with carefully designed mechanisms.
no code implementations • 21 Jul 2022 • Jianwei Zhang, Dong Li, Lituan Wang, Lei Zhang
To address the problem, an improved augmentation search strategy, named Augmented Density Matching, was proposed by randomly sampling policies from a prior distribution for training.
1 code implementation • 21 Jul 2022 • Ming Liu, Yuxiang Wei, Xiaohe Wu, WangMeng Zuo, Lei Zhang
Generative adversarial networks (GANs) have drawn enormous attention due to the simple yet effective training mechanism and superior image generation quality.
2 code implementations • 19 Jul 2022 • Wentong Li, Wenyu Liu, Jianke Zhu, Miaomiao Cui, Xiansheng Hua, Lei Zhang
A simple mask supervised SOLOv2 model is adapted to predict the instance-aware mask map as the level set for each instance.
1 code implementation • 17 Jul 2022 • Lei Zhang, Yuxuan Sun, Wei Wei
Instead of directly exploiting the pseudo labels produced by the teacher detector, we take the first attempt at reducing their deviation from ground truth using dual polishing learning, where two differently structured polishing networks are elaborately developed and trained using synthesized paired pseudo labels and the corresponding ground truth for categories and bounding boxes on the given annotated objects, respectively.
1 code implementation • 15 Jul 2022 • Haoran Yin, Meng Ge, Yanjie Fu, Gaoyan Zhang, Longbiao Wang, Lei Zhang, Lin Qiu, Jianwu Dang
These algorithms are usually achieved by mapping the multi-channel audio input to a single output (i.e., the overall spatial pseudo-spectrum (SPS) of all sources), which is called MISO.
1 code implementation • 14 Jul 2022 • Zhiqiang Lang, Chongxing Song, Lei Zhang, Wei Wei
Binary neural network (BNN) provides a promising solution to deploy parameter-intensive deep single image super-resolution (SISR) models onto real devices with limited storage and computational resources.
no code implementations • 12 Jul 2022 • Ziyang Zong, Jun He, Lei Zhang, Hai Huan
However, for source-free UDA, the source domain data cannot be accessed during adaptation, which makes measuring the domain gap very challenging.
1 code implementation • 8 Jul 2022 • Jianwei Zhang, Lei Zhang, Junyou Wang, Xin Wei, Jiaqi Li, Xian Jiang, Dan Du
Acne detection is crucial for interpretative diagnosis and precise treatment of skin disease.
1 code implementation • 7 Jul 2022 • Yabin Zhang, Jiehong Lin, Chenhang He, Yongwei Chen, Kui Jia, Lei Zhang
In this work, we make the first attempt, to the best of our knowledge, to consider the local geometry information explicitly into the masked auto-encoding, and propose a novel Masked Surfel Prediction (MaskSurf) method.
no code implementations • 5 Jul 2022 • Wenxu Shi, Lei Zhang, WeiJie Chen, ShiLiang Pu
Universal domain adaptive object detection (UniDAOD) is more challenging than domain adaptive object detection (DAOD) since the label space of the source domain may not be the same as that of the target and the scale of objects in the universal scenarios can vary dramatically (i.e., category shift and scale shift).
no code implementations • 5 Jul 2022 • Lei Zhang, Mukesh Ghimire, Wenlong Zhang, Zhe Xu, Yi Ren
This paper investigates two potential solutions to this problem: a hybrid method that leverages both supervised Nash equilibria and the HJI PDE, and a value-hardening method where a sequence of HJIs are solved with a gradually hardening reward.
2 code implementations • 4 Jul 2022 • Wenyu Liu, Wentong Li, Jianke Zhu, Miaomiao Cui, Xuansong Xie, Lei Zhang
With DIAL-Filters, we design both unsupervised and supervised frameworks for nighttime driving-scene segmentation, which can be trained in an end-to-end manner.
1 code implementation • 20 Jun 2022 • Tao Chen, Yazhou Yao, Lei Zhang, Qiong Wang, Guo-Sen Xie, Fumin Shen
Specifically, we propose a saliency guided class-agnostic distance module to pull the intra-category features closer by aligning features to their class prototypes.
1 code implementation • 15 Jun 2022 • Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua
Unlike convolutional inductive biases, which are forced to focus exclusively on hard-coded local regions, our proposed SPs are learned by the model itself and take a variety of spatial relations into account.
Ranked #144 on Image Classification on ImageNet
2 code implementations • 13 Jun 2022 • Meilin Chen, WeiJie Chen, Shicai Yang, Jie Song, Xinchao Wang, Lei Zhang, Yunfeng Yan, Donglian Qi, Yueting Zhuang, Di Xie, ShiLiang Pu
In addition, we conduct anchor adaptation in parallel with localization adaptation, since anchor can be regarded as a learnable parameter.
8 code implementations • CVPR 2023 • Feng Li, Hao Zhang, Huaizhe xu, Shilong Liu, Lei Zhang, Lionel M. Ni, Heung-Yeung Shum
In this paper we present Mask DINO, a unified object detection and segmentation framework.
Ranked #1 on Panoptic Segmentation on COCO test-dev (using extra training data)
4 code implementations • 26 May 2022 • Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu
Recently, there has been a surge of Transformer-based solutions for the long-term time series forecasting (LTSF) task.
Ranked #1 on Time Series Forecasting on ETTh1 (48)
no code implementations • 23 May 2022 • Lei Zhang, Yu Pan, Yi Liu, Qibin Zheng, Zhisong Pan
To improve the defender's defense ability, a game model based on reward randomization reinforcement learning is proposed.
no code implementations • 16 May 2022 • Lei Zhang, Yu Pan, Yi Liu, Qibin Zheng, Zhisong Pan
Following that, we proposed a user's permissions reasoning method based on reinforcement learning.
1 code implementation • 25 Apr 2022 • Zhishe Wang, Yanlin Chen, Wenyu Shao, Hui Li, Lei Zhang
Existing deep learning fusion methods mainly concentrate on convolutional neural networks, and few attempts have been made with transformers.
1 code implementation • Findings (NAACL) 2022 • Peggy Tang, Kun Hu, Rui Yan, Lei Zhang, Junbin Gao, Zhiyong Wang
Optimal sentence extraction is conceptualised as obtaining an optimal summary that minimises the transportation cost to a given document regarding their semantic distributions.
1 code implementation • CVPR 2022 • Binghui Chen, Pengyu Li, Xiang Chen, Biao Wang, Lei Zhang, Xian-Sheng Hua
Semi-supervised object detection (SSOD) aims to facilitate the training and deployment of object detectors with the help of a large amount of unlabeled data.
no code implementations • 13 Apr 2022 • Wenao Ma, Shuang Zheng, Lei Zhang, Huimao Zhang, Qi Dou
Despite the remarkable success on medical image analysis with deep learning, it is still under exploration regarding how to rapidly transfer AI models from one dataset to another for clinical applications.
no code implementations • 12 Apr 2022 • Lei Zhang, Kang Liao, Chunyu Lin, Yao Zhao
Concretely, we propose a Depth-Guided Outpainting Network to model different feature representations of two modalities and learn the structure-aware cross-modal fusion.
1 code implementation • 4 Apr 2022 • Ming Liu, Jianan Pan, Zifei Yan, WangMeng Zuo, Lei Zhang
Meanwhile, diverse testing sets are also provided with different types of reflection and scenes.
2 code implementations • CVPR 2022 • Dengpan Fu, Dongdong Chen, Hao Yang, Jianmin Bao, Lu Yuan, Lei Zhang, Houqiang Li, Fang Wen, Dong Chen
Since these ID labels automatically derived from tracklets inevitably contain noise, we develop a large-scale Pre-training framework utilizing Noisy Labels (PNL), which consists of three learning modules: supervised Re-ID learning, prototype-based contrastive learning, and label-guided contrastive learning.
Ranked #7 on Person Re-Identification on CUHK03
1 code implementation • 27 Mar 2022 • Jie Liang, Hui Zeng, Lei Zhang
Specifically, a tiny regression network is employed to predict the degradation parameters of the input image, while several convolutional experts with the same topology are jointly optimized to specify the network parameters via a non-linear mixture of experts.
1 code implementation • CVPR 2022 • Chenhang He, Ruihuang Li, Shuai Li, Lei Zhang
VoxSeT is built upon a voxel-based set attention (VSA) module, which reduces the self-attention in each voxel by two cross-attentions and models features in a hidden space induced by a group of latent codes.
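As a rough illustration of the idea of replacing one self-attention with two cross-attentions through a small set of latent codes, the sketch below uses plain NumPy. It is not the VoxSeT implementation: the function names, the single-head unprojected attention, and the shapes are assumptions made only for this example.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention without learned projections (assumption).
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ values

def latent_set_attention(x, latents):
    # Step 1: k latent codes attend to the n input features -> (k, d) hidden set.
    hidden = cross_attention(latents, x, x)
    # Step 2: the n input features attend to the k hidden features -> (n, d).
    return cross_attention(x, hidden, hidden)

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))        # 6 hypothetical point features of width 4
latents = rng.normal(size=(2, 4))  # 2 hypothetical latent codes
out = latent_set_attention(x, latents)
```

With n inputs and k latent codes, both steps cost on the order of n·k score evaluations, rather than the n·n needed by full self-attention.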
no code implementations • 18 Mar 2022 • Lida Li, Shuai Li, Kun Wang, Xiangchu Feng, Lei Zhang
2D convolution (Conv2d), which is responsible for extracting features from the input image, is one of the key modules of a convolutional neural network (CNN).
1 code implementation • CVPR 2022 • Shuai Li, Chenhang He, Ruihuang Li, Lei Zhang
Existing LA methods mostly focus on the design of pos weighting function, while the neg weight is directly derived from the pos weight.
1 code implementation • CVPR 2022 • Ruihuang Li, Shuai Li, Chenhang He, Yabin Zhang, Xu Jia, Lei Zhang
One popular solution to this challenging task is self-training, which selects high-scoring predictions on target samples as pseudo labels for training.
Ranked #9 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
1 code implementation • 18 Mar 2022 • Tao Yang, Peiran Ren, Xuansong Xie, Xiansheng Hua, Lei Zhang
Most of the existing deep learning based VFI methods adopt off-the-shelf optical flow algorithms to estimate the bidirectional flows and interpolate the missing frames accordingly.
1 code implementation • CVPR 2022 • Shi Guo, Xi Yang, jianqi ma, Gaofeng Ren, Lei Zhang
Denoising and demosaicking are two essential steps to reconstruct a clean full-color image from the raw data.
2 code implementations • CVPR 2022 • Jie Liang, Hui Zeng, Lei Zhang
In this paper, we demonstrate that it is possible to train a GAN-based SISR model which can stably generate perceptually realistic details while inhibiting visual artifacts.
2 code implementations • CVPR 2022 • jianqi ma, Zhetong Liang, Lei Zhang
The semantics of the text are firstly extracted by a text recognition module as text prior information.
1 code implementation • CVPR 2022 • Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, Lei Zhang
In this work, we, for the first time to our best knowledge, propose to perform Exact Feature Distribution Matching (EFDM) by exactly matching the empirical Cumulative Distribution Functions (eCDFs) of image features, which could be implemented by applying the Exact Histogram Matching (EHM) in the image feature space.
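A minimal sketch of exact histogram matching on two equal-length 1-D feature vectors, assuming the common sort-based formulation: the output keeps the rank order of the content features but takes its values from the style features, so its empirical CDF equals that of the style features. The function name and the toy inputs are hypothetical and not taken from the paper's code.

```python
import numpy as np

def exact_histogram_match(content, style):
    # Both inputs must be 1-D and hold the same number of elements.
    content = np.asarray(content, dtype=float)
    style = np.asarray(style, dtype=float)
    assert content.shape == style.shape and content.ndim == 1
    # Ranks of the content values (stable sort keeps ties deterministic).
    order = np.argsort(content, kind="stable")
    matched = np.empty_like(style)
    # The i-th smallest content position receives the i-th smallest style value.
    matched[order] = np.sort(style)
    return matched

matched = exact_histogram_match([3.0, 1.0, 2.0], [10.0, 30.0, 20.0])
# matched is [30.0, 10.0, 20.0]: same ordering as the content, same values as the style
```

Because the sorted output is identical to the sorted style vector, all statistics of the two distributions agree, not only the mean and variance.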
1 code implementation • 13 Mar 2022 • Xindong Zhang, Hui Zeng, Shi Guo, Lei Zhang
A highly efficient long-range attention block (ELAB) is then built by simply cascading two shift-conv with a GMSA module, which is further accelerated by using a shared attention mechanism.
no code implementations • 12 Mar 2022 • Minghan Li, Lei Zhang
Based on the fact that adjacent frames in a short clip are highly coherent in content, we propose to extend the one-stage FiFo framework to a clip-in clip-out (CiCo) one, which performs VIS clip by clip.
1 code implementation • 10 Mar 2022 • Hongyi Zheng, Hongwei Yong, Lei Zhang
Nonetheless, the existing deep unfolding methods cannot explicitly solve the data term of the unfolding objective function, limiting their capability in blur kernel estimation.
13 code implementations • 7 Mar 2022 • Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum
Compared to other models on the leaderboard, DINO significantly reduces its model size and pre-training data size while achieving better results.
Ranked #1 on Object Detection on COCO 2017 val (box AP metric)
no code implementations • 3 Mar 2022 • Feng Li, Hao Zhang, Yi-Fan Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Pengchuan Zhang, Lei Zhang
This survey is inspired by the remarkable progress in both computer vision and natural language processing, and recent trends shifting from single modality processing to multiple modality comprehension.
13 code implementations • CVPR 2022 • Feng Li, Hao Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, Lei Zhang
Our method is universal and can be easily plugged into any DETR-like methods by adding dozens of lines of code to achieve a remarkable improvement.
no code implementations • 19 Feb 2022 • Shanshan Wang, Lei Zhang, Pichao Wang
In our work, considering the different importance of pair-wise samples for both feature learning and domain alignment, we deduce our BP-Triplet loss for effective UDA from the perspective of Bayesian learning.
no code implementations • 17 Feb 2022 • Xinghua Xue, Haitong Huang, Cheng Liu, Ying Wang, Tao Luo, Lei Zhang
Winograd convolution was originally proposed to reduce computing overhead by replacing multiplications in neural networks (NNs) with additions via linear transformations.
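To make the multiplication-for-addition trade concrete, here is a sketch of the standard 1-D Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with four data multiplications instead of six. The function name is hypothetical, and this illustrates the general technique rather than the method studied in the paper.

```python
import numpy as np

def winograd_f23(d, g):
    # d: 4 input samples, g: 3 filter taps; returns 2 sliding-window outputs.
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter-side terms; in practice these are precomputed once per filter.
    u1 = (g0 + g1 + g2) / 2.0
    u2 = (g0 - g1 + g2) / 2.0
    # Four elementwise multiplications between transformed data and filter.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * g2
    # The output transform uses only additions and subtractions.
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 0.0, -1.0])
fast = winograd_f23(d, g)
direct = np.correlate(d, g, mode="valid")  # reference result with 6 multiplications
```

Here `fast` and `direct` agree up to floating-point error; the saving comes at the cost of extra additions in the input and output transforms.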
6 code implementations • ICLR 2022 • Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang
We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR.
1 code implementation • 28 Jan 2022 • Jie Zhang, Lei Zhang, Gang Li, Chao Wu
Adversarial examples are inputs for machine learning models that have been designed by attackers to cause the model to make mistakes.
no code implementations • 4 Jan 2022 • Qiankun Liu, Dongdong Chen, Qi Chu, Lu Yuan, Bin Liu, Lei Zhang, Nenghai Yu
In addition, such a practice of re-identification still cannot track highly occluded objects when they are missed by the detector.
Ranked #7 on Multi-Object Tracking on MOT16 (using extra training data)
1 code implementation • CVPR 2022 • Jie Zhang, Bo Li, Jianghe Xu, Shuang Wu, Shouhong Ding, Lei Zhang, Chao Wu
The proposed method can efficiently imitate the target model through a small number of queries and achieve high attack success rate.
1 code implementation • CVPR 2022 • Xiawu Zheng, Xiang Fei, Lei Zhang, Chenglin Wu, Fei Chao, Jianzhuang Liu, Wei Zeng, Yonghong Tian, Rongrong Ji
Building upon RMI, we further propose a new search algorithm termed RMI-NAS, supported by a theorem that guarantees the global optimality of the searched architecture.
no code implementations • CVPR 2022 • Lingen Li, Lizhi Wang, Weitao Song, Lei Zhang, Zhiwei Xiong, Hua Huang
In this paper, we propose the quantization-aware deep optics for diffractive snapshot hyperspectral imaging.
no code implementations • 29 Dec 2021 • Lidong Fang, Pei Ge, Lei Zhang, Weinan E, Huan Lei
A long-standing problem in the modeling of non-Newtonian hydrodynamics of polymeric flows is the availability of reliable and interpretable hydrodynamic models that faithfully encode the underlying micro-scale polymer dynamics.
no code implementations • 21 Dec 2021 • Di Yao, Chang Gong, Lei Zhang, Sheng Chen, Jingping Bi
Existing methods first train a model to predict the conversion probability of the advertisement journeys with historical data and calculate the attribution of each touchpoint using counterfactual predictions.
1 code implementation • 15 Dec 2021 • Wenyu Liu, Gaofeng Ren, Runsheng Yu, Shi Guo, Jianke Zhu, Lei Zhang
Though deep learning-based object detection methods have achieved promising results on the conventional datasets, it is still challenging to locate objects from the low-quality images captured in adverse weather conditions.