Zero-Shot Image Super-Resolution with Depth Guided Internal Degradation Learning

no code implementations ECCV 2020 Xi Cheng, Zhen-Yong Fu, Jian Yang

In the past few years, we have witnessed great progress in image super-resolution (SR) thanks to the power of deep learning.

Image Super-Resolution

Double-Shot 3D Shape Measurement with a Dual-Branch Network

no code implementations 19 Jul 2024 Mingyang Lei, Jingfan Fan, Long Shao, Hong Song, Deqiang Xiao, Danni Ai, Tianyu Fu, Ying Gu, Jian Yang

The structured light (SL)-based 3D measurement techniques with deep learning have been widely studied, among which speckle projection profilometry (SPP) and fringe projection profilometry (FPP) are two popular methods.

Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model

no code implementations 3 Jul 2024 Xia Hou, QiFeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song

In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generating knowledge-intensive multi-turn dialogues for instruction tuning.

Language Modelling Large Language Model

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

no code implementations 2 Jul 2024 Kepan Nan, Rui Xie, Penghao Zhou, Tiehan Fan, Zhenheng Yang, Zhijie Chen, Xiang Li, Jian Yang, Ying Tai

Additionally, we propose a novel Multi-modal Video Diffusion Transformer (MVDiT) capable of mining both structure information from visual tokens and semantic information from text tokens.

Text-to-Video Generation Video Generation

Complementary Fusion of Deep Network and Tree Model for ETA Prediction

no code implementations 1 Jul 2024 Yurui Huang, Jie Zhang, HengDa Bao, Yang Yang, Jian Yang

Estimated time of arrival (ETA) is a very important factor in the transportation system.

RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

no code implementations 26 Jun 2024 Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Jian Yang, Junwei Zhu, Xiaobin Hu, Jiangning Zhang, Donghao Luo, Chengjie Wang

In the second component, we design a lightweight facial identity alignment (FIA) module which includes a lip-shape control structure and a face texture reference structure.

Audio-Visual Synchronization Face Generation

LongIns: A Challenging Long-context Instruction-based Exam for LLMs

no code implementations 25 Jun 2024 Shawn Gavin, Tuney Zheng, Jiaheng Liu, Quehry Que, Noah Wang, Jian Yang, Chenchen Zhang, Wenhao Huang, Wenhu Chen, Ge Zhang

To address these issues, we propose the LongIns benchmark dataset, a challenging long-context instruction-based exam for LLMs, which is built based on the existing instruction datasets.

16k 4k

UniCoder: Scaling Code Large Language Model via Universal Code

no code implementations 24 Jun 2024 Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li

When applying LLMs for code generation, recent works mainly focus on directing the models to articulate intermediate natural-language reasoning steps, as in chain-of-thought (CoT) prompting, and then output code with the natural language or other structured intermediate steps.

Code Translation Language Modelling +2

MLPHand: Real Time Multi-View 3D Hand Mesh Reconstruction via MLP Modeling

no code implementations 23 Jun 2024 Jian Yang, Jiakun Li, Guoming Li, Zhen Shen, Huai-Yu Wu, Zhaoxin Fan, Heng Huang

Multi-view hand mesh reconstruction is a critical task for applications in virtual reality and human-computer interaction, but it remains a formidable challenge.

Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones?

no code implementations 18 Jun 2024 Zhe Yang, Yichang Zhang, Tianyu Liu, Jian Yang, Junyang Lin, Chang Zhou, Zhifang Sui

Furthermore, we introduce the concept of a consistency score to quantitatively measure this inconsistency, and we use a relative consistency score to analyze the potential for improvement in consistency.

In-Context Learning

Towards Real-world Scenario: Imbalanced New Intent Discovery

no code implementations 5 Jun 2024 Shun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li

New Intent Discovery (NID) aims at detecting known and previously undefined categories of user intent by utilizing limited labeled and massive unlabeled data.

Intent Discovery Representation Learning

DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images

2 code implementations 5 Jun 2024 Yimian Dai, Minrui Zou, YuXuan Li, Xiang Li, Kang Ni, Jian Yang

Motivated by traditional SAR image denoising, we propose DenoDet, a network aided by explicit frequency domain transform to calibrate convolutional biases and pay more attention to high-frequencies, forming a natural multi-scale subspace representation to detect targets from the perspective of multi-subspace denoising.

2D Object Detection Image Denoising

Graph Neural Networks for Brain Graph Learning: A Survey

no code implementations 1 Jun 2024 Xuexiong Luo, Jia Wu, Jian Yang, Shan Xue, Amin Beheshti, Quan Z. Sheng, David Mcalpine, Paul Sowman, Alexis Giral, Philip S. Yu

Exploring the complex structure of the human brain is crucial for understanding its functionality and diagnosing brain disorders.

Graph Learning

MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space

no code implementations 25 May 2024 Jiangwei Weng, Zhiqiang Yan, Ying Tai, Jianjun Qian, Jian Yang, Jun Li

In this paper, we introduce MambaLLIE, an implicit Retinex-aware low light enhancer featuring a global-then-local state space design.

Long-range modeling Low-Light Image Enhancement

GS-ROR: 3D Gaussian Splatting for Reflective Object Relighting via SDF Priors

no code implementations 22 May 2024 Zuo-Liang Zhu, Beibei Wang, Jian Yang

At the core of our method is the mutual supervision of the depth and normal between deferred Gaussians and SDF, which avoids the expensive volume rendering of SDF.

Inverse Rendering Novel View Synthesis

ECLIPSE: Semantic Entropy-LCS for Cross-Lingual Industrial Log Parsing

no code implementations 22 May 2024 Wei Zhang, Xianfu Cheng, Yi Zhang, Jian Yang, Hongcheng Guo, Zhoujun Li, Xiaolin Yin, Xiangyuan Guan, Xu Shi, Liangfan Zheng, Bo Zhang

These challenges are two-fold: 1) Massive log templates: the performance and efficiency of most existing parsers drop significantly as logs grow in quantity and vary in length; 2) Complex and changeable semantics: traditional template-matching algorithms cannot accurately match the log templates of complicated industrial logs because they cannot utilize cross-language logs with similar semantics.

Language Modelling Large Language Model +2

Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance

no code implementations CVPR 2024 Junkai Fan, Jiangwei Weng, Kun Wang, Yijun Yang, Jianjun Qian, Jun Li, Jian Yang

Firstly, we introduce a non-aligned reference frame matching module, leveraging an adaptive sliding window to match high-quality reference frames from clear videos.

Automated Metaheuristic Algorithm Design with Autoregressive Learning

no code implementations 6 May 2024 Qi Zhao, Tengfei Liu, Bai Yan, Qiqi Duan, Jian Yang, Yuhui Shi

To bridge the gap, this paper proposes an autoregressive learning-based designer for automated design of metaheuristic algorithms.

Woven Fabric Capture with a Reflection-Transmission Photo Pair

1 code implementation 4 May 2024 Yingjie Tang, Zixuan Li, Miloš Hašan, Jian Yang, Beibei Wang

We propose to recover the woven fabric parameters from two captured images: reflection and transmission.

Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion

no code implementations 2 May 2024 Shanshan Zhang, Mingqian Ji, Yang Li, Jian Yang

From the perspective of reducing intra-class variance, we propose to complete features for occluded regions so as to align the features of pedestrians across different occlusion patterns.

Pedestrian Detection

Heterogeneous Subgraph Transformer for Fake News Detection

no code implementations 19 Apr 2024 Yuchen Zhang, Xiaoxiao Ma, Jia Wu, Jian Yang, Hao Fan

To bridge the gap, this work proposes a heterogeneous subgraph transformer (HeteroSGT) to exploit subgraphs in our constructed heterogeneous graph.

Fake News Detection Language Modelling +1

Elevating Spectral GNNs through Enhanced Band-pass Filter Approximation

no code implementations 15 Apr 2024 Guoming Li, Jian Yang, Shangsong Liang, Dongsheng Luo

Spectral Graph Neural Networks (GNNs) have attracted great attention due to their capacity to capture patterns in the frequency domains with essential graph filters.

Graph Learning

RoNID: New Intent Discovery with Generated-Reliable Labels and Cluster-friendly Representations

no code implementations 13 Apr 2024 Shun Zhang, Chaoran Yan, Jian Yang, Changyu Ren, Jiaqi Bai, Tongliang Li, Zhoujun Li

To address the aforementioned challenges, we propose a Robust New Intent Discovery (RoNID) framework optimized by an EM-style method, which focuses on constructing reliable pseudo-labels and obtaining cluster-friendly discriminative representations.

Contrastive Learning Intent Discovery +2

Spectral GNN via Two-dimensional (2-D) Graph Convolution

no code implementations 6 Apr 2024 Guoming Li, Jian Yang, Shangsong Liang, Dongsheng Luo

Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph learning.

Graph Learning

AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation

1 code implementation 2 Apr 2024 Rui Xie, Ying Tai, Chen Zhao, Kai Zhang, Zhenyu Zhang, Jun Zhou, Xiaoqian Ye, Qian Wang, Jian Yang

Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs.

Blind Super-Resolution Super-Resolution

Diff-Reg v1: Diffusion Matching Model for Registration Problem

1 code implementation 29 Mar 2024 Qianliang Wu, Haobo Jiang, Lei Luo, Jun Li, Yaqing Ding, Jin Xie, Jian Yang

Establishing reliable correspondences is essential for registration tasks such as 3D and 2D3D registration.


Deepfake Generation and Detection: A Benchmark and Survey

1 code implementation 26 Mar 2024 Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, DaCheng Tao

Deepfake is a technology dedicated to creating highly realistic facial images and videos under specific conditions, which has significant application potential in fields such as entertainment, movie production, digital human creation, to name a few.

Attribute Face Reenactment +2

New Intent Discovery with Attracting and Dispersing Prototype

no code implementations 25 Mar 2024 Shun Zhang, Jian Yang, Jiaqi Bai, Chaoran Yan, Tongliang Li, Zhao Yan, Zhoujun Li

New Intent Discovery (NID) aims to recognize known and infer new intent categories with the help of limited labeled and large-scale unlabeled data.

Intent Discovery Language Modelling +1

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

no code implementations CVPR 2024 Zhiqiang Yan, Yuankai Lin, Kun Wang, Yupeng Zheng, YuFei Wang, Zhenyu Zhang, Jun Li, Jian Yang

Depth completion is a vital task for autonomous driving, as it involves reconstructing the precise 3D geometry of a scene from sparse and noisy depth measurements.

Autonomous Driving Depth Completion

LSKNet: A Foundation Lightweight Backbone for Remote Sensing

1 code implementation 18 Mar 2024 YuXuan Li, Xiang Li, Yimian Dai, Qibin Hou, Li Liu, Yongxiang Liu, Ming-Ming Cheng, Jian Yang

While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of these studies have overlooked the valuable prior knowledge embedded within remote sensing scenarios.

object-detection Object Detection +1

SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection

1 code implementation 11 Mar 2024 YuXuan Li, Xiang Li, Weijie Li, Qibin Hou, Li Liu, Ming-Ming Cheng, Jian Yang

To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.

Ranked #2 on 2D Object Detection on SARDet-100K (using extra training data)

2k Object +2

Harmonious Group Choreography with Trajectory-Controllable Diffusion

no code implementations 10 Mar 2024 Yuqin Dai, Wanlu Zhu, Ronghui Li, Zeping Ren, Xiangzheng Zhou, Xiu Li, Jun Li, Jian Yang

Specifically, to tackle dancer collisions, we introduce a Dance-Beat Navigator capable of generating trajectories for multiple dancers based on the music, complemented by a Distance-Consistency loss to maintain appropriate spacing among trajectories within a reasonable threshold.

PromptKD: Unsupervised Prompt Distillation for Vision-Language Models

1 code implementation CVPR 2024 Zheng Li, Xiang Li, Xinyi Fu, Xin Zhang, Weiqiang Wang, Shuo Chen, Jian Yang

To our best knowledge, we are the first to (1) perform unsupervised domain-specific prompt-driven knowledge distillation for CLIP, and (2) establish a practical pre-storing mechanism of text features as shared class vectors between teacher and student.

Knowledge Distillation Prompt Engineering +1

Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging

1 code implementation 28 Feb 2024 Wei Zhang, Hongcheng Guo, Anjie Le, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Shi Xu, Runqiang Zang, Liangfan Zheng, Bo Zhang

Log parsing, which entails transforming raw log messages into structured templates, constitutes a critical phase in the automation of log analytics.

Log Parsing

Scene Prior Filtering for Depth Map Super-Resolution

no code implementations 21 Feb 2024 Zhengxue Wang, Zhiqiang Yan, Ming-Hsuan Yang, Jinshan Pan, Jian Yang, Ying Tai, Guangwei Gao

Specifically, we design an All-in-one Prior Propagation that computes the similarity between multi-modal scene priors, i.e., RGB, normal, semantic, and depth, to reduce the texture interference.

Depth Map Super-Resolution

A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence

no code implementations 20 Feb 2024 Penghai Zhao, Xin Zhang, Ming-Ming Cheng, Jian Yang, Xiang Li

To improve efficiency, this paper aims to provide a thorough review of reviews in the PAMI field from diverse perspectives.

Language Modelling Large Language Model

C-ICL: Contrastive In-context Learning for Information Extraction

no code implementations 17 Feb 2024 Ying Mo, Jiahao Liu, Jian Yang, Qifan Wang, Shun Zhang, Jingang Wang, Zhoujun Li

There has been increasing interest in exploring the capabilities of advanced large language models (LLMs) in the field of information extraction (IE), specifically focusing on tasks related to named entity recognition (NER) and relation extraction (RE).

In-Context Learning Miscellaneous +4

Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models

1 code implementation 8 Feb 2024 Senmao Li, Joost Van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

However, these models struggle to effectively suppress the generation of undesired content, which is explicitly requested to be omitted from the generated image in the prompt.

VIPTR: A Vision Permutable Extractor for Fast and Efficient Scene Text Recognition

1 code implementation 18 Jan 2024 Xianfu Cheng, Weixiao Zhou, Xiang Li, Xiaoming Chen, Jian Yang, Tongliang Li, Zhoujun Li

In this work, we propose the VIsion Permutable extractor for fast and efficient scene Text Recognition (VIPTR), which achieves an impressive balance between high performance and rapid inference speeds in the domain of STR.

Decoder Scene Text Recognition

MLAD: A Unified Model for Multi-system Log Anomaly Detection

no code implementations 15 Jan 2024 Runqiang Zang, Hongcheng Guo, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Xu Shi, Liangfan Zheng, Bo Zhang

In spite of the rapid advancements in unsupervised log anomaly detection techniques, the current mainstream models still necessitate specific training for individual system datasets, resulting in costly procedures and limited scalability due to dataset size, thereby leading to performance bottlenecks.

Anomaly Detection Relational Reasoning +1

xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

no code implementations 13 Jan 2024 Linzheng Chai, Jian Yang, Tao Sun, Hongcheng Guo, Jiaheng Liu, Bing Wang, Xiannian Liang, Jiaqi Bai, Tongliang Li, Qiyao Peng, Zhoujun Li

To bridge the gap among different languages, we propose a cross-lingual instruction fine-tuning framework (xCOT) to transfer knowledge from high-resource languages to low-resource languages.

Few-Shot Learning Language Modelling +1

RotationDrag: Point-based Image Editing with Rotated Diffusion Features

1 code implementation 12 Jan 2024 Minxing Luo, Wentao Cheng, Jian Yang

Our method tracks handle points more precisely by utilizing the feature map of the rotated images, thus ensuring precise optimization and high image fidelity.

Dynamic Weighted Adversarial Learning for Semi-Supervised Classification under Intersectional Class Mismatch

2 code implementations ACM Transactions on Multimedia Computing, Communications, and Applications 2024 Mingyu Li, Tao Zhou, Zhuo Huang, Jian Yang, Jie Yang, Chen Gong

The class-mismatch problem has drawn intensive attention in Semi-Supervised Learning (SSL), where the classes of labeled data are assumed to be only a subset of the classes of unlabeled data.

Domain Adaptation

Exploring Multi-Modal Control in Music-Driven Dance Generation

no code implementations 1 Jan 2024 Ronghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang, Jie Guo, Xiu Li

Existing music-driven 3D dance generation methods mainly concentrate on high-quality dance generation, but lack sufficient control during the generation process.

LTA-PCS: Learnable Task-Agnostic Point Cloud Sampling

no code implementations CVPR 2024 Jiaheng Liu, Jianhao Li, Kaisiyuan Wang, Hongcheng Guo, Jian Yang, Junran Peng, Ke Xu, Xianglong Liu, Jinyang Guo

Existing task-agnostic point cloud sampling strategies (e.g., FPS) do not consider the semantic information of point clouds, causing degraded performance on downstream tasks.
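
For readers unfamiliar with the FPS baseline mentioned in this snippet, the following is a minimal sketch of greedy farthest point sampling in plain Python. It illustrates the classic baseline only, not the LTA-PCS method; the function name, the `start` parameter, and the tuple-based point format are assumptions made for this example.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k point indices so each new pick is far from earlier picks.

    points: list of coordinate tuples (hypothetical input format).
    start:  index of the first selected point (an assumption; often random).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] holds the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # The next sample is the point farthest from the current selection.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

Note that the selection depends only on point geometry, which is the limitation the snippet above points out: no semantic information enters the choice.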

MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

1 code implementation 18 Dec 2023 Bing Wang, Changyu Ren, Jian Yang, Xinnian Liang, Jiaqi Bai, Linzheng Chai, Zhao Yan, Qian-Wen Zhang, Di Yin, Xing Sun, Zhoujun Li

Our framework comprises a core decomposer agent for Text-to-SQL generation with few-shot chain-of-thought reasoning, accompanied by two auxiliary agents that utilize external tools or models to acquire smaller sub-databases and refine erroneous SQL queries.

SQL Parsing Text-To-SQL

SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation

1 code implementation 17 Dec 2023 Xiaoqi An, Lin Zhao, Chen Gong, Nannan Wang, Di Wang, Jian Yang

In this paper, we address the following question: "Only sparse human keypoint locations are detected for human pose estimation, is it really necessary to describe the whole image in a dense, high-resolution manner?"

Pose Estimation

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

1 code implementation 15 Dec 2023 Senmao Li, Taihang Hu, Fahad Shahbaz Khan, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang

This finding inspired us to omit the encoder at certain adjacent time-steps and reuse cyclically the encoder features in the previous time-steps for the decoder.

Decoder Knowledge Distillation
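
The encoder-reuse idea in the Faster Diffusion snippet can be pictured with a toy loop like the one below. This is only a sketch under stated assumptions: `encoder`, `decoder`, and `key_timesteps` are hypothetical placeholders supplied by the caller, and how the paper actually chooses key time-steps is not described in this listing.

```python
def denoise_with_encoder_reuse(x, timesteps, key_timesteps, encoder, decoder):
    """Run a denoising loop that recomputes encoder features only at key steps.

    encoder(x, t) -> features and decoder(x, features, t) -> x are
    caller-supplied callables (hypothetical stand-ins for a UNet's two halves).
    Returns the final sample and the number of encoder calls made.
    """
    cached = None
    encoder_calls = 0
    for t in timesteps:
        if cached is None or t in key_timesteps:
            # Key step: run the encoder and refresh the cached features.
            cached = encoder(x, t)
            encoder_calls += 1
        # Every step runs the decoder, reusing cached features at non-key steps.
        x = decoder(x, cached, t)
    return x, encoder_calls
```

With six time-steps and two key steps, the encoder runs twice instead of six times, while the decoder still runs at every step.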

Divide and Conquer: Hybrid Pre-training for Person Search

1 code implementation 13 Dec 2023 Yanling Tian, Di Chen, Yunan Liu, Jian Yang, Shanshan Zhang

To the best of our knowledge, this is the first work that investigates how to support full-task pre-training using sub-task data.

Human Detection Person Search

M2C: Towards Automatic Multimodal Manga Complement

1 code implementation 26 Oct 2023 Hongcheng Guo, Boyang Wang, Jiaqi Bai, Jiaheng Liu, Jian Yang, Zhoujun Li

In other words, the Multimodal Manga Complement (M2C) task has not been investigated, which aims to handle the aforementioned issues by providing a shared semantic space for vision and language understanding.

Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems

no code implementations 16 Oct 2023 Yunli Wang, Zhiqiang Wang, Jian Yang, Shiyang Wen, Dongying Kong, Han Li, Kun Gai

Concretely, we employ multi-task learning to adaptively combine the optimization of relaxed and full targets, which refers to metrics Recall@m@k and OPA respectively.

Learning-To-Rank Multi-Task Learning +1

Continual Learning via Manifold Expansion Replay

no code implementations 12 Oct 2023 Zihao Xu, Xuan Tang, Yufei Shi, Jianfeng Zhang, Jian Yang, Mingsong Chen, Xian Wei

To address this problem, we propose a novel replay strategy called Manifold Expansion Replay (MaER).

Continual Learning Management

Interpretable Traffic Event Analysis with Bayesian Networks

no code implementations 10 Oct 2023 Tong Yuan, Jian Yang, Zeyi Wen

With a concrete case study, our framework can derive a Bayesian Network from a dataset based on the causal relationships between weather and traffic events across the United States.

TP2O: Creative Text Pair-to-Object Generation using Balance Swap-Sampling

no code implementations 3 Oct 2023 Jun Li, Zedong Zhang, Jian Yang

Generating creative combinatorial objects from two seemingly unrelated object texts is a challenging task in text-to-image synthesis, often hindered by a focus on emulating existing data distributions.

Object Text-to-Image Generation

RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

1 code implementation 1 Oct 2023 Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Jian Yang, Man Zhang, Zhaoxiang Zhang, Wanli Ouyang, Ke Xu, Stephen W. Huang, Jie Fu, Junran Peng

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters.


Unleashing Potential of Evidence in Knowledge-Intensive Dialogue Generation

no code implementations 15 Sep 2023 Xianjie Wu, Jian Yang, Tongliang Li, Di Liang, Shiwei Zhang, Yiyang Du, Zhoujun Li

To fully Unleash the potential of evidence, we propose a framework to effectively incorporate Evidence in knowledge-Intensive Dialogue Generation (u-EIDG).

Dialogue Generation

TSSAT: Two-Stage Statistics-Aware Transformation for Artistic Style Transfer

1 code implementation 12 Sep 2023 Haibo Chen, Lei Zhao, Jun Li, Jian Yang

To address this issue, we imitate the drawing process of humans and propose a Two-Stage Statistics-Aware Transformation (TSSAT) module, which first builds the global style foundation by aligning the global statistics of content and style features and then further enriches local style details by swapping the local statistics (instead of local features) in a patch-wise manner, significantly improving the stylization effects.

Style Transfer
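
The first TSSAT stage is described above as aligning the global statistics of content and style features. One common way to do that is mean and standard deviation matching, sketched below on flat lists of numbers. Whether TSSAT uses exactly this formulation is an assumption; the function name and the `eps` parameter are hypothetical, and the patch-wise local statistics swap of the second stage is not shown.

```python
import statistics


def align_global_statistics(content, style, eps=1e-8):
    """Shift and scale content features to match the style mean and std.

    content, style: flat lists of floats standing in for one feature channel
    (a simplification made for this sketch).
    """
    c_mean, c_std = statistics.fmean(content), statistics.pstdev(content)
    s_mean, s_std = statistics.fmean(style), statistics.pstdev(style)
    # Normalise the content features, then re-colour them with style statistics.
    return [(x - c_mean) / (c_std + eps) * s_std + s_mean for x in content]
```

After the transformation the content features keep their relative ordering but carry the style's mean and spread.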

SGNet: Salient Geometric Network for Point Cloud Registration

no code implementations 12 Sep 2023 Qianliang Wu, Yaqing Ding, Lei Luo, Shuo Gu, Chuanwei Zhou, Jin Xie, Jian Yang

These high-order features are then propagated to dense points and utilized by a Sinkhorn matching module to identify key correspondences for successful registration.

Point Cloud Registration
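
The SGNet snippet refers to a Sinkhorn matching module. As a rough illustration of the underlying idea, the sketch below applies plain Sinkhorn normalisation to a small square score matrix, alternately rescaling rows and columns until the result is approximately doubly stochastic. It is not the paper's module: the function name, the iteration count, and the absence of any outlier handling are assumptions made for brevity.

```python
import math


def sinkhorn(scores, n_iters=50):
    """Turn a square score matrix into a soft assignment matrix.

    scores:  list of equal-length rows of floats (higher means better match).
    n_iters: number of row/column normalisation rounds (an assumed default).
    """
    # Exponentiate so that every entry is strictly positive.
    mat = [[math.exp(s) for s in row] for row in scores]
    for _ in range(n_iters):
        # Make every row sum to one.
        mat = [[v / sum(row) for v in row] for row in mat]
        # Make every column sum to one.
        col_sums = [sum(col) for col in zip(*mat)]
        mat = [[v / c for v, c in zip(row, col_sums)] for row in mat]
    return mat
```

Correspondences can then be read off as the largest entry in each row, which is where high-scoring pairs concentrate their mass.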

Punctate White Matter Lesion Segmentation in Preterm Infants Powered by Counterfactually Generative Learning

no code implementations 7 Sep 2023 Zehua Ren, Yongheng Sun, Miaomiao Wang, Yuying Feng, Xianjun Li, Chao Jin, Jian Yang, Chunfeng Lian, Fan Wang

In this paper, we propose to leverage the idea of counterfactual reasoning coupled with the auxiliary task of brain tissue segmentation to learn fine-grained positional and morphological representations of PWMLs for accurate localization and segmentation.

counterfactual Counterfactual Reasoning +2

Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering

no code implementations 1 Sep 2023 Shiqi Yang, Yaxing Wang, Joost Van de Weijer, Luis Herranz, Shangling Jui, Jian Yang

We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity.

Clustering Source-Free Domain Adaptation

RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion

no code implementations 1 Sep 2023 Zhiqiang Yan, Xiang Li, Le Hui, Zhenyu Zhang, Jun Li, Jian Yang

To tackle these challenges, we explore a repetitive design in our image guided network to gradually and sufficiently recover depth values.

Depth Completion Depth Estimation +1

mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning

no code implementations 17 Aug 2023 Ying Mo, Jian Yang, Jiahao Liu, Qifan Wang, Ruoyu Chen, Jingang Wang, Zhoujun Li

A multi-view contrastive learning framework is introduced to encompass semantic contrasts between source, codeswitched, and target sentences, as well as contrasts among token-to-token relations.

Contrastive Learning named-entity-recognition +2

Dual-Stream Diffusion Net for Text-to-Video Generation

no code implementations 16 Aug 2023 Binhui Liu, Xin Liu, Anbo Dai, Zhiyong Zeng, Dan Wang, Zhen Cui, Jian Yang

In particular, the designed two diffusion streams, video content and motion branches, could not only run separately in their private spaces for producing personalized video variations as well as content, but also be well-aligned between the content and motion domains through leveraging our designed cross-transformer interaction module, which would benefit the smoothness of generated videos.

Text-to-Video Generation Video Generation

Discriminative Graph-level Anomaly Detection via Dual-students-teacher Model

1 code implementation 3 Aug 2023 Fu Lin, Xuexiong Luo, Jia Wu, Jian Yang, Shan Xue, Zitong Wang, Haonan Gong

Then, two competing student models trained by normal and abnormal graphs respectively fit graph representations of the teacher model in terms of node-level and graph-level representation perspectives.

Anomaly Detection

Creative Birds: Self-Supervised Single-View 3D Style Transfer

2 code implementations ICCV 2023 Renke Wang, Guimin Que, Shuo Chen, Xiang Li, Jun Li, Jian Yang

Our focus lies primarily on birds, a popular subject in 3D reconstruction, for which no existing single-view 3D transfer methods have been developed. The method we propose seeks to generate a 3D mesh shape and texture of a bird from two single-view images.

3D Reconstruction Style Transfer

FinPT: Financial Risk Prediction with Profile Tuning on Pretrained Foundation Models

1 code implementation 22 Jul 2023 Yuwei Yin, Yazheng Yang, Jian Yang, Qi Liu

To tackle these issues, we propose FinPT and FinBench: the former is a novel approach for financial risk prediction that conducts Profile Tuning on large pretrained foundation models, and the latter is a set of high-quality datasets on financial risks such as default, fraud, and churn.

KnowPrefix-Tuning: A Two-Stage Prefix-Tuning Framework for Knowledge-Grounded Dialogue Generation

1 code implementation 27 Jun 2023 Jiaqi Bai, Zhao Yan, Jian Yang, Xinnian Liang, Hongcheng Guo, Zhoujun Li

We propose Knowledgeable Prefix Tuning (KnowPrefix-Tuning), a two-stage tuning framework, bypassing the retrieval process in a knowledge-grounded conversation system by injecting prior knowledge into the lightweight knowledge prefix.

Dialogue Generation Response Generation +1

Learnable Differencing Center for Nighttime Depth Perception

no code implementations 26 Jun 2023 Zhiqiang Yan, Yupeng Zheng, Chongyi Li, Jun Li, Jian Yang

Depth completion is the task of recovering dense depth maps from sparse ones, usually with the help of color images.

Depth Completion Depth Estimation

Hyperbolic Graph Diffusion Model

1 code implementation 13 Jun 2023 Lingfeng Wen, Xuan Tang, Mingjie Ouyang, Xiangxiang Shen, Jian Yang, Daxin Zhu, Mingsong Chen, Xian Wei

In order to simultaneously utilize the data generation capabilities of diffusion models and the ability of hyperbolic embeddings to extract latent hierarchical distributions, we propose a novel graph generation method called Hyperbolic Graph Diffusion Model (HGDM), which consists of an auto-encoder to encode nodes into successive hyperbolic embeddings, and a DM that operates in the hyperbolic latent space.

Graph Generation

Variable Radiance Field for Real-Life Category-Specifc Reconstruction from Single Image

no code implementations 8 Jun 2023 Kun Wang, Zhiqiang Yan, Zhenyu Zhang, Xiang Li, Jun Li, Jian Yang

Our key contributions are: (1) We parameterize the geometry and appearance of the object using a multi-scale global feature extractor, which avoids frequent point-wise feature retrieval and camera dependency.

Contrastive Learning Object +1

Fine-Grained Visual Prompting

1 code implementation NeurIPS 2023 Lingfeng Yang, Yueze Wang, Xiang Li, Xinlong Wang, Jian Yang

Previous works have suggested that incorporating visual prompts, such as colorful boxes or circles, can improve the ability of models to recognize objects of interest.

Visual Prompting

GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking

no code implementations 29 May 2023 Jiaqi Bai, Hongcheng Guo, Jiaheng Liu, Jian Yang, Xinnian Liang, Zhao Yan, Zhoujun Li

However, the retrieved passages are not ideal for guiding answer generation because of the discrepancy between retrieval and generation, i.e., the candidate passages are all treated equally during the retrieval procedure without considering their potential to generate a proper answer.

Answer Generation Dialogue Generation +6

Self-Supervised 3D Scene Flow Estimation Guided by Superpoints

1 code implementation CVPR 2023 Yaqi Shen, Le Hui, Jin Xie, Jian Yang

In our superpoint generation module, we utilize the bidirectional flow information at the previous iteration to obtain the matching points of points and superpoint centers for soft point-to-superpoint association construction, in which the superpoints are generated for pairwise point clouds.

Scene Flow Estimation

Refined Response Distillation for Class-Incremental Player Detection

no code implementations 1 May 2023 Liang Bai, Hangjie Yuan, Tao Feng, Hong Song, Jian Yang

Furthermore, we present the NBA-IOD and Volleyball-IOD datasets as the benchmark and investigate the IOD tasks of the players systematically.

Knowledge Distillation object-detection +1

Group Equivariant BEV for 3D Object Detection

no code implementations 26 Apr 2023 Hongwei Liu, Jian Yang, Jianfeng Zhang, Dongheng Shao, Jielong Guo, Shaobo Li, Xuan Tang, Xian Wei

Experimental results demonstrate that GeqBevNet can extract more rotational equivariant features in the 3D object detection of the actual road scene and improve the performance of object orientation prediction.

3D Object Detection Object +2

Enhancing Large Language Model with Self-Controlled Memory Framework

1 code implementation 26 Apr 2023 Bing Wang, Xinnian Liang, Jian Yang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li

Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information.

Book summarization Document Summarization +5

Partition-based Stability of Coalitional Games

no code implementations 20 Apr 2023 Jian Yang

For the resulting strong, medium, and weak stability concepts, the first is core-compatible in that the traditional core exactly contains those allocations that are associated through this strong stability concept with the all-consolidated partition consisting of only the grand coalition.


Autoencoders with Intrinsic Dimension Constraints for Learning Low Dimensional Image Representations

no code implementations 16 Apr 2023 Jianzhang Zheng, Hao Shen, Jian Yang, Xuan Tang, Mingsong Chen, Hui Yu, Jielong Guo, Xian Wei

Motivated by the important role of ID, in this paper, we propose a novel deep representation learning approach with autoencoder, which incorporates regularization of the global and local ID constraints into the reconstruction of data representations.

Image Classification Representation Learning

Curricular Object Manipulation in LiDAR-based Object Detection

1 code implementation CVPR 2023 Ziyue Zhu, Qiang Meng, Xiao Wang, Ke Wang, Liujiang Yan, Jian Yang

For the loss design, we propose the COMLoss to dynamically predict object-level difficulties and emphasize objects of different difficulties based on training stages.

3D Object Detection Object +1

Robust Outlier Rejection for 3D Registration with Variational Bayes

1 code implementation CVPR 2023 Haobo Jiang, Zheng Dang, Zhen Wei, Jin Xie, Jian Yang, Mathieu Salzmann

Embedded with the inlier/outlier label, the posterior feature distribution is label-dependent and discriminative.

Bayesian Inference

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

1 code implementation 28 Mar 2023 Senmao Li, Joost Van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images.

Text-based Image Editing

3D-Aware Multi-Class Image-to-Image Translation with NeRFs

1 code implementation CVPR 2023 Senmao Li, Joost Van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang

In the second step, based on the well-trained multi-class 3D-aware GAN architecture, that preserves view-consistency, we construct a 3D-aware I2I translation system.

Image-to-Image Translation Translation

A Survey of Historical Learning: Learning Models with Learning History

1 code implementation 23 Mar 2023 Xiang Li, Ge Wu, Lingfeng Yang, Wenhai Wang, RenJie Song, Jian Yang

The various types of elements, deposited in the training history, are a large amount of wealth for improving learning deep models.

Ensemble Learning

Large Selective Kernel Network for Remote Sensing Object Detection

1 code implementation ICCV 2023 YuXuan Li, Qibin Hou, Zhaohui Zheng, Ming-Ming Cheng, Jian Yang, Xiang Li

To the best of our knowledge, this is the first time that large and selective kernel mechanisms have been explored in the field of remote sensing object detection.

Object object-detection +3

AutoOptLib: Tailoring Metaheuristic Optimizers via Automated Algorithm Design

1 code implementation 12 Mar 2023 Qi Zhao, Bai Yan, Taiwei Hu, Xianglong Chen, Qiqi Duan, Jian Yang, Yuhui Shi

In response, this paper proposes AutoOptLib, the first platform for accessible automated design of metaheuristic optimizers.

Metaheuristic Optimization

Non-aligned supervision for Real Image Dehazing

no code implementations 8 Mar 2023 Junkai Fan, Fei Guo, Jianjun Qian, Xiang Li, Jun Li, Jian Yang

In particular, we explore a non-alignment scenario that a clear reference image, unaligned with the input hazy image, is utilized to supervise the dehazing network.

Image Dehazing

Heterogeneous Social Event Detection via Hyperbolic Graph Representations

1 code implementation 20 Feb 2023 Zitai Qiu, Jia Wu, Jian Yang, Xing Su, Charu C. Aggarwal

This model addresses the heterogeneity of social media, and, with this graph, the information in social media can be used to capture structural information based on the properties of hyperbolic space.

Contrastive Learning Event Detection

Structure Flow-Guided Network for Real Depth Super-Resolution

no code implementations 31 Jan 2023 Jiayi Yuan, Haobo Jiang, Xiang Li, Jianjun Qian, Jun Li, Jian Yang

Specifically, our framework consists of a cross-modality flow-guided upsampling network (CFUNet) and a flow-enhanced pyramid edge attention network (PEANet).

Depth Estimation Depth Prediction +1

Recurrent Structure Attention Guidance for Depth Super-Resolution

no code implementations 31 Jan 2023 Jiayi Yuan, Haobo Jiang, Xiang Li, Jianjun Qian, Jun Li, Jian Yang

Second, instead of the coarse concatenation guidance, we propose a recurrent structure attention block, which iteratively utilizes the latest depth estimation and the image features to jointly select clear patterns and boundaries, aiming at providing refined guidance for accurate depth recovery.

Depth Estimation Super-Resolution

State of the Art and Potentialities of Graph-level Learning

no code implementations 14 Jan 2023 Zhenyu Yang, Ge Zhang, Jia Wu, Jian Yang, Quan Z. Sheng, Shan Xue, Chuan Zhou, Charu Aggarwal, Hao Peng, Wenbin Hu, Edwin Hancock, Pietro Liò

Traditional approaches to learning a set of graphs heavily rely on hand-crafted features, such as substructures.

Graph Learning

Multilingual Entity and Relation Extraction from Unified to Language-specific Training

no code implementations 11 Jan 2023 Zixiang Wang, Jian Yang, Tongliang Li, Jiaheng Liu, Ying Mo, Jiaqi Bai, Longtao He, Zhoujun Li

In this paper, we propose a two-stage multilingual training method and a joint model called Multilingual Entity and Relation Extraction framework (mERE) to mitigate language interference across languages.

Relation Relation Extraction +1

Clothed Human Performance Capture With a Double-Layer Neural Radiance Fields

no code implementations CVPR 2023 Kangkan Wang, Guofeng Zhang, Suxu Cong, Jian Yang

Previous methods capture the performance of full humans with a personalized template or recover the garments from a single frame with static human poses.

Efficient LiDAR Point Cloud Oversegmentation Network

no code implementations ICCV 2023 Le Hui, Linghua Tang, Yuchao Dai, Jin Xie, Jian Yang

Then, to generate homogeneous superpoints from the sparse LiDAR point cloud, we propose a LiDAR point grouping algorithm that simultaneously considers the similarity of point embeddings and the Euclidean distance of points in 3D space.

LIDAR Semantic Segmentation Semantic Segmentation

Revisiting the P3P Problem

1 code implementation CVPR 2023 Yaqing Ding, Jian Yang, Viktor Larsson, Carl Olsson, Kalle Åström

One of the classical multi-view geometry problems is the so called P3P problem, where the absolute pose of a calibrated camera is determined from three 2D-to-3D correspondences.

Few-shot Continual Infomax Learning

no code implementations ICCV 2023 Ziqi Gu, Chunyan Xu, Jian Yang, Zhen Cui

Further, considering that the learned knowledge in the human brain is a generalization of actual information and exists in a certain relational structure, we perform continual structure infomax learning to relieve the catastrophic forgetting problem in the continual learning process.

Continual Learning Few-Shot Learning

Center-Based Decoupled Point-cloud Registration for 6D Object Pose Estimation

no code implementations ICCV 2023 Haobo Jiang, Zheng Dang, Shuo Gu, Jin Xie, Mathieu Salzmann, Jian Yang

Our method decouples the translation from the entire transformation by predicting the object center and estimating the rotation in a center-aware manner.

6D Pose Estimation using RGB Object +2

Efficient Image Super-Resolution with Feature Interaction Weighted Hybrid Network

no code implementations 29 Dec 2022 Wenjie Li, Juncheng Li, Guangwei Gao, Weihong Deng, Jian Yang, Guo-Jun Qi, Chia-Wen Lin

Recently, great progress has been made in single-image super-resolution (SISR) based on deep learning technology.

Image Super-Resolution

Robust Consensus Clustering and its Applications for Advertising Forecasting

no code implementations 27 Dec 2022 Deguang Kong, Miao Lu, Konstantin Shmakov, Jian Yang

Consensus clustering aggregates partitions in order to find a better fit by reconciling clustering results from different sources/executions.


Demystifying Advertising Campaign Bid Recommendation: A Constraint target CPA Goal Optimization

no code implementations 26 Dec 2022 Deguang Kong, Konstantin Shmakov, Jian Yang

In cost-per-click (CPC) or cost-per-impression (CPM) advertising campaigns, advertisers always run the risk of spending the budget without getting enough conversions.

Do not Waste Money on Advertising Spend: Bid Recommendation via Concavity Changes

no code implementations 26 Dec 2022 Deguang Kong, Konstantin Shmakov, Jian Yang

In computational advertising, a challenging problem is how to recommend the bid for advertisers to achieve the best return on investment (ROI) given budget constraint.

Mining User-aware Multi-relations for Fake News Detection in Large Scale Online Social Networks

1 code implementation 21 Dec 2022 Xing Su, Jian Yang, Jia Wu, Yuchen Zhang

In this paper, we construct a dual-layer graph (i.e., the news layer and the user layer) to extract multiple relations of news and users in social networks to derive rich information for detecting fake news.

Fake News Detection

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator

1 code implementation 20 Dec 2022 Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li

Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model.

Decoder Denoising +2

One-Stage Cascade Refinement Networks for Infrared Small Target Detection

1 code implementation 16 Dec 2022 Yimian Dai, Xiang Li, Fei Zhou, Yulei Qian, Yaohong Chen, Jian Yang

Finally, we present a new research benchmark for infrared small target detection, consisting of the SIRST-V2 dataset of real-world, high-resolution single-frame targets, the normalized contrast evaluation metric, and the DeepInfrared toolkit for detection.

Feature Aggregation and Propagation Network for Camouflaged Object Detection

1 code implementation 2 Dec 2022 Tao Zhou, Yi Zhou, Chen Gong, Jian Yang, Yu Zhang

In this paper, we propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.

Object object-detection +1

Curriculum Temperature for Knowledge Distillation

1 code implementation 29 Nov 2022 Zheng Li, Xiang Li, Lingfeng Yang, Borui Zhao, RenJie Song, Lei Luo, Jun Li, Jian Yang

In this paper, we propose a simple curriculum-based technique, termed Curriculum Temperature for Knowledge Distillation (CTKD), which controls the task difficulty level during the student's learning career through a dynamic and learnable temperature.

Image Classification Knowledge Distillation

DesNet: Decomposed Scale-Consistent Network for Unsupervised Depth Completion

no code implementations 20 Nov 2022 Zhiqiang Yan, Kun Wang, Xiang Li, Zhenyu Zhang, Jun Li, Jian Yang

Unsupervised depth completion aims to recover dense depth from the sparse one without using the ground-truth annotation.

Depth Completion Depth Estimation +2

LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation

no code implementations 19 Oct 2022 Hongcheng Guo, Jiaheng Liu, Haoyang Huang, Jian Yang, Zhoujun Li, Dongdong Zhang, Zheng Cui, Furu Wei

To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages.

Multimodal Machine Translation Translation

DAGAD: Data Augmentation for Graph Anomaly Detection

1 code implementation 18 Oct 2022 Fanzhen Liu, Xiaoxiao Ma, Jia Wu, Jian Yang, Shan Xue, Amin Beheshti, Chuan Zhou, Hao Peng, Quan Z. Sheng, Charu C. Aggarwal

To bridge the gaps, this paper devises a novel Data Augmentation-based Graph Anomaly Detection (DAGAD) framework for attributed graphs, equipped with three specially designed modules: 1) an information fusion module employing graph neural network encoders to learn representations, 2) a graph data augmentation module that fertilizes the training set with generated samples, and 3) an imbalance-tailored learning module to discriminate the distributions of the minority (anomalous) and majority (normal) classes.

Data Augmentation Graph Anomaly Detection +1

SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval

4 code implementations 28 Sep 2022 Yang shen, Xuhao Sun, Xiu-Shen Wei, Qing-Yuan Jiang, Jian Yang

In this paper, we propose Suppression-Enhancing Mask based attention and Interactive Channel transformatiON (SEMICON) to learn binary hash codes for dealing with large-scale fine-grained image retrieval tasks.

Image Retrieval Retrieval

Spatio-Temporal Relation Learning for Video Anomaly Detection

no code implementations 27 Sep 2022 Hui Lv, Zhen Cui, Biao Wang, Jian Yang

Anomaly identification is highly dependent on the relationship between the object and the scene, as different/same object actions in same/different scenes may lead to various degrees of normality and anomaly.

Anomaly Detection Knowledge Graph Embedding +5

Grouped Adaptive Loss Weighting for Person Search

no code implementations 23 Sep 2022 Yanling Tian, Di Chen, Yunan Liu, Shanshan Zhang, Jian Yang

A straightforward solution is to manually assign different weights to different tasks, compensating for the diverse convergence rates.

Model Optimization Multi-Task Learning +2

Point Cloud Registration-Driven Robust Feature Matching for 3D Siamese Object Tracking

no code implementations 14 Sep 2022 Haobo Jiang, Kaihao Lan, Le Hui, Guangyu Li, Jin Xie, Jian Yang

The core of Siamese feature matching is how to assign high feature similarity on the corresponding points between the template and search area for precise object localization.

Object Localization Object Tracking +1

LogLG: Weakly Supervised Log Anomaly Detection via Log-Event Graph Construction

no code implementations 23 Aug 2022 Hongcheng Guo, Yuhui Guo, Renjie Chen, Jian Yang, Jiaheng Liu, Zhoujun Li, Tieqiao Zheng, Weichao Hou, Liangfan Zheng, Bo Zhang

Experiments on five benchmarks validate the effectiveness of LogLG for detecting anomalies on unlabeled log data and demonstrate that LogLG, as the state-of-the-art weakly supervised method, achieves significant performance improvements compared to existing methods.

Anomaly Detection graph construction +1

GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation

1 code implementation 29 Jul 2022 Jian Yang, Yuwei Yin, Liqun Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Furu Wei, Zhoujun Li

Transformer structure, stacked by a sequence of encoder and decoder network layers, achieves significant development in neural machine translation.

Decoder Machine Translation +1

3D Siamese Transformer Network for Single Object Tracking on Point Clouds

1 code implementation 25 Jul 2022 Le Hui, Lingpeng Wang, Linghua Tang, Kaihao Lan, Jin Xie, Jian Yang

Siamese network based trackers formulate 3D single object tracking as cross-correlation learning between point features of a template and a search area.

3D Single Object Tracking Object Tracking

RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation

1 code implementation 25 Jul 2022 Mu He, Le Hui, Yikai Bian, Jian Ren, Jin Xie, Jian Yang

In this paper, we propose a resolution adaptive self-supervised monocular depth estimation method (RA-Depth) by learning the scale invariance of the scene depth.

Data Augmentation Decoder +1

HLT-MT: High-resource Language-specific Training for Multilingual Neural Machine Translation

1 code implementation 11 Jul 2022 Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Zhoujun Li, Furu Wei

Nonetheless, multilingual training is plagued by language interference degeneration in shared parameters because of the negative interference among different translation directions, especially on high-resource languages.

Decoder Machine Translation +1

GCN-based Multi-task Representation Learning for Anomaly Detection in Attributed Networks

no code implementations 8 Jul 2022 Venus Haghighi, Behnaz Soltani, Adnan Mahmood, Quan Z. Sheng, Jian Yang

Anomaly detection in attributed networks has received a considerable attention in recent years due to its applications in a wide range of domains such as finance, network security, and medicine.

Anomaly Detection Community Detection +2

Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution

1 code implementation 6 Jul 2022 Wenjie Li, Juncheng Li, Guangwei Gao, Jiantao Zhou, Jian Yang, Guo-Jun Qi

Recently, Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks due to the ability of global feature extraction.

Image Super-Resolution

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

no code implementations 27 Jun 2022 Chuang Zhang, Li Shen, Jian Yang, Chen Gong

To exploit this effect, the model prediction-based methods have been widely adopted, which aim to exploit the outputs of DNNs in the early stage of learning to correct noisy labels.

Learning with noisy labels Memorization

Graph-level Neural Networks: Current Progress and Future Directions

no code implementations 31 May 2022 Ge Zhang, Jia Wu, Jian Yang, Shan Xue, Wenbin Hu, Chuan Zhou, Hao Peng, Quan Z. Sheng, Charu Aggarwal

To frame this survey, we propose a systematic taxonomy covering GLNNs upon deep neural networks, graph neural networks, and graph pooling.

Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality

1 code implementation 20 May 2022 Xiang Li, Wenhai Wang, Lingfeng Yang, Jian Yang

Masked AutoEncoder (MAE) has recently led the trends of visual self-supervision area by an elegant asymmetric encoder-decoder design, which significantly optimizes both the pre-training efficiency and fine-tuning accuracy.

Object Detection

Bi-level Alignment for Cross-Domain Crowd Counting

1 code implementation CVPR 2022 Shenjian Gong, Shanshan Zhang, Jian Yang, Dengxin Dai, Bernt Schiele

The main challenge for this task is to achieve high-quality manual annotations on a large amount of training data.

AutoML Crowd Counting +2

Hyperspectral Image Classification With Contrastive Graph Convolutional Network

no code implementations 11 May 2022 Wentao Yu, Sheng Wan, Guangyu Li, Jian Yang, Chen Gong

To enhance the feature representation ability, in this paper, a GCN model with contrastive learning is proposed to explore the supervision signals contained in both spectral information and spatial relations, which is termed Contrastive Graph Convolutional Network (ConGCN), for HSI classification.

Classification Contrastive Learning +2

Semantics-Guided Moving Object Segmentation with 3D LiDAR

no code implementations 6 May 2022 Shuo Gu, Suling Yao, Jian Yang, Hui Kong

Instead of segmenting the moving objects directly, the network conducts single-scan-based semantic segmentation and multiple-scan-based moving object segmentation in turn.

Object Segmentation +1

Knowledge-aware Document Summarization: A Survey of Knowledge, Embedding Methods and Architectures

no code implementations 24 Apr 2022 Yutong Qu, Wei Emma Zhang, Jian Yang, Lingfei Wu, Jia Wu

Knowledge-aware methods have boosted a range of natural language processing applications over the last decades.

Document Summarization Informativeness