Search Results for author: xiangyang xue

Found 136 papers, 54 papers with code

Star-Transformer

2 code implementations • NAACL 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, Yunfan Shao, xiangyang xue, Zheng Zhang

Although Transformer has achieved great successes on many NLP tasks, its heavy structure with fully-connected attention connections leads to dependencies on large training data.

Ranked #13 on Sentiment Analysis on SST-5 Fine-grained classification

Named Entity Recognition (NER) Natural Language Inference +2

12,992

Paper
Code

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

4 code implementations • EMNLP 2020 • Linyang Li, Ruotian Ma, Qipeng Guo, xiangyang xue, Xipeng Qiu

Adversarial attacks for discrete data (such as texts) have proven significantly more challenging than those for continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods.

Adversarial Attack

2,753

Paper
Code

Pose-Normalized Image Generation for Person Re-identification

2 code implementations • ECCV 2018 • Xuelin Qian, Yanwei Fu, Tao Xiang, Wenxuan Wang, Jie Qiu, Yang Wu, Yu-Gang Jiang, xiangyang xue

Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations.

Ranked #2 on Person Re-Identification on Market-1501->DukeMTMC-reID

Generative Adversarial Network Image Generation +2

1,267

Paper
Code

DSOD: Learning Deeply Supervised Object Detectors from Scratch

4 code implementations • ICCV 2017 • Zhiqiang Shen, Zhuang Liu, Jianguo Li, Yu-Gang Jiang, Yurong Chen, xiangyang xue

State-of-the-art object detectors rely heavily on off-the-shelf networks pre-trained on large-scale classification datasets like ImageNet, which incurs learning bias due to the differences in both the loss functions and the category distributions between classification and detection tasks.

General Classification Object +2

702

Paper
Code

Object Detection from Scratch with Deep Supervision

1 code implementation • 25 Sep 2018 • Zhiqiang Shen, Zhuang Liu, Jianguo Li, Yu-Gang Jiang, Yurong Chen, xiangyang xue

Thus, a better solution to handle these critical problems is to train object detectors from scratch, which motivates our proposed method.

General Classification Object +2

702

Paper
Code

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

6 code implementations • ICCV 2017 • Xingyi Zhou, Qi-Xing Huang, Xiao Sun, xiangyang xue, Yichen Wei

We propose a weakly-supervised transfer learning method that uses mixed 2D and 3D labels in a unified deep neural network with a two-stage cascaded structure.

Ranked #1 on 3D Human Pose Estimation on Geometric Pose Affordance

2D Pose Estimation 3D Multi-Person Pose Estimation (absolute) +4

609

Paper
Code

Arbitrary-Oriented Scene Text Detection via Rotation Proposals

4 code implementations • 3 Mar 2017 • Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, xiangyang xue

This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images.

Computational Efficiency Region Proposal +2

436

Paper
Code

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

1 code implementation • 30 Dec 2021 • Haiyang Yu, Jingye Chen, Bin Li, jianqi ma, Mengnan Guan, Xixi Xu, Xiaocong Wang, Shaobo Qu, xiangyang xue

The experimental results indicate that the performance of baselines on CTR datasets is not as good as that on English datasets due to the characteristics of Chinese texts that are quite different from the Latin alphabet.

Attribute Benchmarking +1

384

Paper
Code

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

2 code implementations • 24 Mar 2022 • Likun Cai, Zhi Zhang, Yi Zhu, Li Zhang, Mu Li, xiangyang xue

Multiple datasets and open challenges for object detection have been introduced in recent years.

Ranked #1 on Object Detection on BigDetection val

Object object-detection +1

380

Paper
Code

Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition

1 code implementation • 22 Jun 2021 • Jingye Chen, Bin Li, xiangyang xue

Inspired by the fact that humans can generalize to know how to write characters unseen before if they have learned stroke orders of some characters, we propose a stroke-based method by decomposing each character into a sequence of strokes, which are the most basic units of Chinese characters.

309

Paper
Code

Scene Text Telescope: Text-Focused Scene Image Super-Resolution

1 code implementation • CVPR 2021 • Jingye Chen, Bin Li, xiangyang xue

Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image.

Ranked #3 on Optical Character Recognition (OCR) on Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

Image Super-Resolution Optical Character Recognition (OCR) +2

309

Paper
Code

Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution

1 code implementation • 13 Dec 2021 • Jingye Chen, Haiyang Yu, jianqi ma, Bin Li, xiangyang xue

However, the recognition of low-resolution scene text images remains a challenge.

Image Super-Resolution Scene Text Recognition

309

Paper
Code

Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning

1 code implementation • ICCV 2023 • Haiyang Yu, Xiaocong Wang, Bin Li, xiangyang xue

However, despite Chinese characters possessing different characteristics from Latin characters, such as complex inner structures and large categories, few methods have been proposed for Chinese Text Recognition (CTR).

Scene Text Recognition

309

Paper
Code

Orientation-Independent Chinese Text Recognition in Scene Images

1 code implementation • 3 Sep 2023 • Haiyang Yu, Xiaocong Wang, Bin Li, xiangyang xue

We conduct experiments on a scene dataset for benchmarking Chinese text recognition, and the results demonstrate that the proposed method can indeed improve performance through disentangling content and orientation information.

Benchmarking Image Reconstruction +1

309

Paper
Code

DeepSFM: Structure From Motion Via Deep Bundle Adjustment

1 code implementation • ECCV 2020 • Xingkui Wei, yinda zhang, Zhuwen Li, Yanwei Fu, xiangyang xue

The explicit constraints on both depth (structure) and pose (motion), when combined with the learning components, bring the merit from both traditional BA and emerging deep learning technology.

Pose Estimation

308

Paper
Code

Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models

2 code implementations • ICLR 2020 • Xisen Jin, Zhongyu Wei, Junyi Du, xiangyang xue, Xiang Ren

Human and metrics evaluation on both LSTM models and BERT Transformer models on multiple datasets show that our algorithms outperform prior hierarchical explanation algorithms.

Semantic Composition

203

Paper
Code

MEAL: Multi-Model Ensemble via Adversarial Learning

1 code implementation • 6 Dec 2018 • Zhiqiang Shen, Zhankui He, xiangyang xue

In this paper, we present a method for compressing large, complex trained ensembles into a single network, where knowledge from a variety of trained deep neural networks (DNNs) is distilled and transferred to a single DNN.

172

Paper
Code

Neural Pose Transfer by Spatially Adaptive Instance Normalization

1 code implementation • CVPR 2020 • Jiashun Wang, Chao Wen, Yanwei Fu, Haitao Lin, Tianyun Zou, xiangyang xue, yinda zhang

Pose transfer has been studied for decades, in which the pose of a source mesh is applied to a target mesh.

Pose Transfer Style Transfer

140

Paper
Code

Model-based Deep Hand Pose Estimation

1 code implementation • 22 Jun 2016 • Xingyi Zhou, Qingfu Wan, Wei zhang, xiangyang xue, Yichen Wei

For the first time, we show that embedding such a non-linear generative process in deep learning is feasible for hand pose estimation.

Hand Pose Estimation valid

111

Paper
Code

Multi-level Semantic Feature Augmentation for One-shot Learning

1 code implementation • 15 Apr 2018 • Zitian Chen, Yanwei Fu, yinda zhang, Yu-Gang Jiang, xiangyang xue, Leonid Sigal

In semantic space, we search for related concepts, which are then projected back into the image feature spaces by the decoder portion of the TriNet.

Novel Concepts One-Shot Learning

Paper
Code

Progressive Coordinate Transforms for Monocular 3D Object Detection

1 code implementation • NeurIPS 2021 • Li Wang, Li Zhang, Yi Zhu, Zhi Zhang, Tong He, Mu Li, xiangyang xue

Recognizing and localizing objects in the 3D space is a crucial ability for an AI agent to perceive its surrounding environment.

Monocular 3D Object Detection Object +2

Paper
Code

SUPS: A Simulated Underground Parking Scenario Dataset for Autonomous Driving

1 code implementation • 25 Feb 2023 • Jiawei Hou, Qi Chen, Yurong Cheng, Guang Chen, xiangyang xue, Taiping Zeng, Jian Pu

However, there is a lack of underground parking scenario datasets with multiple sensors and well-labeled images that support both SLAM tasks and perception tasks, such as semantic segmentation and parking slot detection.

3D Reconstruction Autonomous Driving +4

Paper
Code

LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling

1 code implementation • 18 Aug 2022 • Boyan Jiang, Xinlin Ren, Mingsong Dou, xiangyang xue, Yanwei Fu, yinda zhang

Recent progress in 4D implicit representation focuses on globally controlling the shape and motion with low dimensional latent vectors, which is prone to missing surface details and accumulating tracking error.

3D Shape Modeling 4D reconstruction +1

Paper
Code

Grad-PU: Arbitrary-Scale Point Cloud Upsampling via Gradient Descent with Learned Distance Functions

1 code implementation • CVPR 2023 • Yun He, Danhang Tang, yinda zhang, xiangyang xue, Yanwei Fu

Most existing point cloud upsampling methods have roughly three steps: feature extraction, feature expansion and 3D coordinate prediction.

point cloud upsampling

Paper
Code

Is normalization indispensable for training deep neural network?

1 code implementation • NeurIPS 2020 • Jie Shao, Kai Hu, Changhu Wang, xiangyang xue, Bhiksha Raj

In this paper, we study what would happen when normalization layers are removed from the network, and show how to train deep neural networks without normalization layers and without performance degradation.

General Classification Image Classification +5

Paper
Code

Learning to score the figure skating sports videos

1 code implementation • 8 Feb 2018 • Chengming Xu, Yanwei Fu, Bing Zhang, Zitian Chen, Yu-Gang Jiang, xiangyang xue

This paper targets learning to score figure skating sports videos.

Paper
Code

Temporal Context Aggregation for Video Retrieval with Contrastive Learning

1 code implementation • 4 Aug 2020 • Jie Shao, Xin Wen, Bingchen Zhao, xiangyang xue

The current research focus on Content-Based Video Retrieval requires higher-level video representation describing the long-range semantic dependencies of relevant incidents, events, etc.

Ranked #6 on Video Retrieval on FIVR-200K

Contrastive Learning Representation Learning +2

Paper
Code

Dynamic Graph Message Passing Networks for Visual Recognition

2 code implementations • 20 Sep 2022 • Li Zhang, Mohan Chen, Anurag Arnab, xiangyang xue, Philip H. S. Torr

A fully-connected graph, such as the self-attention operation in Transformers, is beneficial for such modelling; however, its computational overhead is prohibitive.

Image Classification object-detection +3

Paper
Code

CODA: Counting Objects via Scale-aware Adversarial Density Adaption

1 code implementation • 25 Mar 2019 • Li Wang, Yongbo Li, xiangyang xue

Extensive experiments demonstrate that our network produces much better results on unseen datasets compared with existing counting adaption models.

Crowd Counting

Paper
Code

Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt

1 code implementation • CVPR 2020 • Hangyu Lin, Yanwei Fu, Yu-Gang Jiang, xiangyang xue

Unfortunately, the representation learned by SketchRNN is primarily for the generation tasks, rather than the other tasks of recognition and retrieval of sketches.

Retrieval Self-Supervised Learning +1

Paper
Code

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

1 code implementation • CVPR 2021 • Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, xiangyang xue, Jianfeng Feng, Li Zhang

The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.

Ranked #13 on Monocular 3D Object Detection on KITTI Cars Moderate

Monocular 3D Object Detection object-detection

Paper
Code

Learning Dynamic Alignment via Meta-filter for Few-shot Learning

1 code implementation • CVPR 2021 • Chengming Xu, Chen Liu, Li Zhang, Chengjie Wang, Jilin Li, Feiyue Huang, xiangyang xue, Yanwei Fu

Our insight is that these methods would lead to poor adaptation with redundant matching, and leveraging channel-wise adjustment is the key to well adapting the learned knowledge to new classes.

Few-Shot Learning Position

Paper
Code

SGM3D: Stereo Guided Monocular 3D Object Detection

1 code implementation • 3 Dec 2021 • Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, xiangyang xue, Jianfeng Feng

Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image.

Autonomous Driving Depth Estimation +4

Paper
Code

Exploring Efficient Few-shot Adaptation for Vision Transformers

1 code implementation • 6 Jan 2023 • Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, xiangyang xue

Essentially, although ViTs have been shown to enjoy comparable or even better performance on other vision tasks, it is still very nontrivial to efficiently finetune ViTs in real-world FSL scenarios.

Few-Shot Learning

Paper
Code

Joint Parsing and Generation for Abstractive Summarization

2 code implementations • 23 Nov 2019 • Kaiqiang Song, Logan Lebanoff, Qipeng Guo, Xipeng Qiu, xiangyang xue, Chen Li, Dong Yu, Fei Liu

If generating a word can introduce an erroneous relation to the summary, the behavior must be discouraged.

Ranked #27 on Text Summarization on GigaWord

Abstractive Text Summarization Sentence

Paper
Code

The Image Local Autoregressive Transformer

1 code implementation • NeurIPS 2021 • Chenjie Cao, Yuxin Hong, Xiang Li, Chengrong Wang, Chengming Xu, xiangyang xue, Yanwei Fu

To address these limitations, we propose a novel model, the image Local Autoregressive Transformer (iLAT), to better facilitate locally guided image synthesis.

Image Generation

Paper
Code

GAT-COBO: Cost-Sensitive Graph Neural Network for Telecom Fraud Detection

1 code implementation • 29 Mar 2023 • Xinxin Hu, Haotian Chen, Junjie Zhang, Hongchang Chen, Shuxin Liu, Xing Li, Yahui Wang, xiangyang xue

Extensive experiments on two real-world telecom fraud detection datasets demonstrate that our proposed method is effective for the graph imbalance problem, outperforming the state-of-the-art GNNs and GNN-based fraud detectors.

Anomaly Detection Fraud Detection +2

Paper
Code

Co-Attention Aligned Mutual Cross-Attention for Cloth-Changing Person Re-Identification

1 code implementation • Asian Conference on Computer Vision (ACCV) 2023 • Qizao Wang, Xuelin Qian, Yanwei Fu, xiangyang xue

In this paper, we first design a novel Shape Semantics Embedding (SSE) module to encode body shape semantic information, which is one of the essential clues to distinguish pedestrians when their clothes change.

Cloth-Changing Person Re-Identification Person Retrieval +1

Paper
Code

Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification

1 code implementation • 7 Apr 2015 • Zuxuan Wu, Xi Wang, Yu-Gang Jiang, Hao Ye, xiangyang xue

In this paper, we propose a hybrid deep learning framework for video classification, which is able to model static spatial information, short-term motion, as well as long-term temporal clues in the videos.

Classification General Classification +1

Paper
Code

Cost Sensitive GNN-based Imbalanced Learning for Mobile Social Network Fraud Detection

1 code implementation • 28 Mar 2023 • Xinxin Hu, Haotian Chen, Hongchang Chen, Shuxin Liu, Xing Li, Shibo Zhang, Yahui Wang, xiangyang xue

But the imbalance problem in the aforementioned data, which could severely hinder the effectiveness of fraud detectors based on graph neural networks (GNNs), has hardly been addressed in previous work.

Fraud Detection

Paper
Code

Improving Empathetic Dialogue Generation by Dynamically Infusing Commonsense Knowledge

1 code implementation • 24 May 2023 • Hua Cai, Xuli Shen, Qing Xu, Weilin Shen, Xiaomei Wang, Weifeng Ge, Xiaoqing Zheng, xiangyang xue

To this end, we propose a novel approach for empathetic response generation, which incorporates an adaptive module for commonsense knowledge selection to ensure consistency between the generated empathetic responses and the speaker's situation.

Dialogue Generation Empathetic Response Generation +1

Paper
Code

Repositioning the Subject within Image

1 code implementation • 30 Jan 2024 • Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, YiFan Li, xiangyang xue, Yanwei Fu

Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task.

Image Generation Image Manipulation

Paper
Code

A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization

1 code implementation • 21 Dec 2018 • Guoyun Tu, Yanwei Fu, Boyang Li, Jiarui Gao, Yu-Gang Jiang, xiangyang xue

However, the sparsity of emotional expressions in the videos poses an obstacle to visual emotion analysis.

Classification Emotion Recognition +1

Paper
Code

Spatial Mixture Models with Learnable Deep Priors for Perceptual Grouping

1 code implementation • 7 Feb 2019 • Jinyang Yuan, Bin Li, xiangyang xue

Different from existing methods, the proposed method disentangles the attributes of an object into "shape" and "appearance", which are modeled separately by the mixture weights and the mixture components.

Object

Paper
Code

Vocabulary-informed Zero-shot and Open-set Learning

1 code implementation • 3 Jan 2023 • Yanwei Fu, Xiaomei Wang, Hanze Dong, Yu-Gang Jiang, Meng Wang, xiangyang xue, Leonid Sigal

Despite significant progress in object categorization in recent years, a number of important challenges remain; mainly, the ability to learn from limited labeled data and to recognize object classes within a large, potentially open, set of labels.

Object Categorization Open Set Learning +1

Paper
Code

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

1 code implementation • 19 Nov 2023 • Shangchao Su, Bin Li, xiangyang xue

The implementation of FedRA is straightforward and can be seamlessly integrated into any transformer-based model without the need for further modification to the original model.

Federated Learning

Paper
Code

Local Slot Attention for Vision-and-Language Navigation

1 code implementation • 17 Jun 2022 • Yifeng Zhuang, Qiang Sun, Yanwei Fu, Lifeng Chen, xiangyang xue

The attention mechanism in the transformer architecture can better integrate inter- and intra-modal information of vision and language.

Navigate Vision and Language Navigation

Paper
Code

Raven's Progressive Matrices Completion with Latent Gaussian Process Priors

2 code implementations • 22 Mar 2021 • Fan Shi, Bin Li, xiangyang xue

In this paper we aim to solve the latter one by proposing a deep latent variable model, in which multiple Gaussian processes are employed as priors of latent variables to separately learn underlying abstract concepts from RPMs; thus the proposed model is interpretable in terms of concept-specific latent variables.

Answer Selection Gaussian Processes +1

Paper
Code

Compositional Law Parsing with Latent Random Functions

1 code implementation • 15 Sep 2022 • Fan Shi, Bin Li, xiangyang xue

The automatic parsing of these laws indicates the model's ability to understand the scene, which makes law parsing play a central role in many visual tasks.

Position Visual Reasoning

Paper
Code

Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems

1 code implementation • 15 Jul 2023 • Fan Shi, Bin Li, xiangyang xue

Finally, we conduct experiments to illustrate the interpretability of CRAB in concept learning, answer selection, and global rule abstraction.

Answer Generation Answer Selection +1

Paper
Code

One-shot Federated Learning without Server-side Training

1 code implementation • 26 Apr 2022 • Shangchao Su, Bin Li, xiangyang xue

Federated Learning (FL) has recently made significant progress as a new machine learning paradigm for privacy protection.

Federated Learning Image Classification +1

Paper
Code

Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning

1 code implementation • 15 Nov 2022 • Shangchao Su, Mingzhao Yang, Bin Li, xiangyang xue

In this paper, we propose a federated adaptive prompt tuning algorithm, FedAPT, for multi-domain collaborative image classification with powerful foundation models, like CLIP.

Federated Learning Image Classification

Paper
Code

Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation

1 code implementation • 28 Apr 2023 • Shoumeng Qiu, Feng Jiang, Haiqiang Zhang, xiangyang xue, Jian Pu

In this paper, we propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task to boost the performance of those hard classes.

Knowledge Distillation Semantic Segmentation

Paper
Code

Knowledge-Guided Object Discovery with Acquired Deep Impressions

1 code implementation • 19 Mar 2021 • Jinyang Yuan, Bin Li, xiangyang xue

The proposed ADI framework focuses on the acquisition and utilization of knowledge, and is complementary to existing deep generative models proposed for compositional scene representation.

Object Object Discovery +1

Paper
Code

SCSP: Spectral Clustering Filter Pruning with Soft Self-adaption Manners

no code implementations • 14 Jun 2018 • Huiyuan Zhuo, Xuelin Qian, Yanwei Fu, Heng Yang, xiangyang xue

In this paper, we propose a novel filter pruning method for convolutional neural network compression, namely spectral clustering filter pruning with soft self-adaption manners (SCSP).

Clustering Model Compression

Paper
Add Code

Dual Skipping Networks

no code implementations • CVPR 2018 • Changmao Cheng, Yanwei Fu, Yu-Gang Jiang, Wei Liu, Wenlian Lu, Jianfeng Feng, xiangyang xue

Inspired by the recent neuroscience studies on the left-right asymmetry of the human brain in processing low and high spatial frequency information, this paper introduces a dual skipping network which carries out coarse-to-fine object categorization.

General Classification Object +1

Paper
Add Code

Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks

no code implementations • 25 Feb 2015 • Yu-Gang Jiang, Zuxuan Wu, Jun Wang, xiangyang xue, Shih-Fu Chang

In this paper, we study the challenging problem of categorizing videos according to high-level semantics such as the existence of a particular human action or a complex event.

Paper
Add Code

Vocabulary-informed Extreme Value Learning

no code implementations • 28 May 2017 • Yanwei Fu, Hanze Dong, Yu-feng Ma, Zhengjun Zhang, xiangyang xue

To solve this problem, we propose the Extreme Value Learning (EVL) formulation to learn the mapping from visual feature to semantic space.

Open Set Learning

Paper
Add Code

DeepSkeleton: Skeleton Map for 3D Human Pose Regression

no code implementations • 29 Nov 2017 • Qingfu Wan, Wei zhang, xiangyang xue

For the first time, we show that training a regression network from the skeleton map alone is capable of matching the performance of state-of-the-art 3D human pose estimation works.

2D Human Pose Estimation 3D Human Pose Estimation +1

Paper
Add Code

Recent Advances in Zero-shot Recognition

no code implementations • 13 Oct 2017 • Yanwei Fu, Tao Xiang, Yu-Gang Jiang, xiangyang xue, Leonid Sigal, Shaogang Gong

With the recent renaissance of deep convolutional neural networks, encouraging breakthroughs have been achieved on supervised recognition tasks, where each class has sufficient and fully annotated training data.

Open Set Learning Zero-Shot Learning

Paper
Add Code

Multi-scale Deep Learning Architectures for Person Re-identification

no code implementations • ICCV 2017 • Xuelin Qian, Yanwei Fu, Yu-Gang Jiang, Tao Xiang, xiangyang xue

Our model is able to learn deep discriminative feature representations at different scales and automatically determine the most suitable scales for matching.

Person Re-Identification

Paper
Add Code

A Jointly Learned Deep Architecture for Facial Attribute Analysis and Face Detection in the Wild

no code implementations • 27 Jul 2017 • Keke He, Yanwei Fu, xiangyang xue

Facial attribute analysis in the real world scenario is very challenging mainly because of complex face variations.

Attribute Face Detection

Paper
Add Code

Modeling Multimodal Clues in a Hybrid Deep Learning Framework for Video Classification

no code implementations • 14 Jun 2017 • Yu-Gang Jiang, Zuxuan Wu, Jinhui Tang, Zechao Li, xiangyang xue, Shih-Fu Chang

More specifically, we utilize three Convolutional Neural Networks (CNNs) operating on appearance, motion and audio signals to extract their corresponding features.

General Classification Video Classification

Paper
Add Code

Semi-Latent GAN: Learning to generate and modify facial images from attributes

no code implementations • 7 Apr 2017 • Weidong Yin, Yanwei Fu, Leonid Sigal, xiangyang xue

Generating and manipulating human facial images using high-level attribute controls are important and interesting problems.

Attribute Generative Adversarial Network

Paper
Add Code

Weakly Supervised Dense Video Captioning

no code implementations • CVPR 2017 • Zhiqiang Shen, Jianguo Li, Zhou Su, Minjun Li, Yurong Chen, Yu-Gang Jiang, xiangyang xue

This paper focuses on a novel and challenging vision task, dense video captioning, which aims to automatically describe a video clip with multiple informative and diverse caption sentences.

Dense Video Captioning Language Modelling +2

Paper
Add Code

Iterative Object and Part Transfer for Fine-Grained Recognition

no code implementations • 29 Mar 2017 • Zhiqiang Shen, Yu-Gang Jiang, Dequan Wang, xiangyang xue

On both datasets, we achieve better results than many state-of-the-art approaches, including a few using oracle (manually annotated) bounding boxes in the test images.

Object

Paper
Add Code

Evolving Boxes for Fast Vehicle Detection

no code implementations • 1 Feb 2017 • Li Wang, Yao Lu, Hong Wang, Yingbin Zheng, Hao Ye, xiangyang xue

We perform fast vehicle detection from traffic surveillance cameras.

Fast Vehicle Detection

Paper
Add Code

Learning to Point and Count

no code implementations • 8 Dec 2015 • Jie Shao, Dequan Wang, xiangyang xue, Zheng Zhang

This paper proposes the problem of point-and-count as a test case to break the what-and-where deadlock.

General Classification

Paper
Add Code

Fusing Multi-Stream Deep Networks for Video Classification

no code implementations • 21 Sep 2015 • Zuxuan Wu, Yu-Gang Jiang, Xi Wang, Hao Ye, xiangyang xue, Jun Wang

A multi-stream framework is proposed to fully utilize the rich multimodal information in videos.

Classification General Classification +1

Paper
Add Code

Evaluating Two-Stream CNN for Video Classification

no code implementations • 8 Apr 2015 • Hao Ye, Zuxuan Wu, Rui-Wei Zhao, Xi Wang, Yu-Gang Jiang, xiangyang xue

In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification.

Classification General Classification +2

Paper
Add Code

Do More Dropouts in Pool5 Feature Maps for Better Object Detection

no code implementations • 24 Sep 2014 • Zhiqiang Shen, xiangyang xue

In these fields, the outputs of all layers of CNNs are usually considered as a high dimensional feature vector extracted from an input image and the correspondence between finer level feature vectors and concepts that the input image contains is all-important.

General Classification Image Classification +2

Paper
Add Code

Top-Down Tree Structured Text Generation

no code implementations • 14 Aug 2018 • Qipeng Guo, Xipeng Qiu, xiangyang xue, Zheng Zhang

Text generation is a fundamental building block in natural language processing tasks.

Sentence Text Generation

Paper
Add Code

Learning the Compositional Spaces for Generalized Zero-shot Learning

no code implementations • ICLR 2019 • Hanze Dong, Yanwei Fu, Sung Ju Hwang, Leonid Sigal, xiangyang xue

This paper studies the problem of Generalized Zero-shot Learning (G-ZSL), whose goal is to classify instances belonging to both seen and unseen classes at the test time.

Generalized Zero-Shot Learning Open Set Learning

Paper
Add Code

Instance-level Sketch-based Retrieval by Deep Triplet Classification Siamese Network

no code implementations • 28 Nov 2018 • Peng Lu, Hangyu Lin, Yanwei Fu, Shaogang Gong, Yu-Gang Jiang, xiangyang xue

Additionally, to study the tasks of sketch-based hairstyle retrieval, this paper contributes a new instance-level photo-sketch dataset - Hairstyle Photo-Sketch dataset, which is composed of 3600 sketches and photos, and 2400 sketch-photo pairs.

General Classification Retrieval +2

Paper
Add Code

Pruning in Training: Learning and Ranking Sparse Connections in Deep Convolutional Networks

no code implementations • ICLR 2019 • Yanwei Fu, Shun Zhang, Donghao Li, Xinwei Sun, xiangyang xue, Yuan YAO

This paper proposes a Pruning in Training (PiT) framework of learning to reduce the parameter size of networks.

Paper
Add Code

Vocabulary-informed Visual Feature Augmentation for One-shot Learning

no code implementations • ICLR 2018 • jianqi ma, Hangyu Lin, yinda zhang, Yanwei Fu, xiangyang xue

Besides directly augmenting image features, we transform the image features to semantic space using the encoder and perform the data augmentation.

Classification Data Augmentation +2

Paper
Add Code

Weakly Supervised Semantic Segmentation for Social Images

no code implementations • CVPR 2015 • Wei Zhang, Sheng Zeng, Dequan Wang, xiangyang xue

Image semantic segmentation is the task of partitioning an image into several regions based on semantic concepts.

Segmentation Weakly supervised Semantic Segmentation +1

Paper
Add Code

Multiple Granularity Descriptors for Fine-Grained Categorization

no code implementations • ICCV 2015 • Dequan Wang, Zhiqiang Shen, Jie Shao, Wei zhang, xiangyang xue, Zheng Zhang

Fine-grained categorization, which aims to distinguish subordinate-level categories such as bird species or dog breeds, is an extremely challenging task.

Paper
Add Code

Question Guided Modular Routing Networks for Visual Question Answering

no code implementations • 17 Apr 2019 • Yanze Wu, Qiang Sun, Jianqi Ma, Bin Li, Yanwei Fu, Yao Peng, xiangyang xue

In particular, the QGMRN is composed of visual, textual, and routing networks.

Question Answering Visual Question Answering +1

Paper
Add Code

Towards Instance-level Image-to-Image Translation

no code implementations • CVPR 2019 • Zhiqiang Shen, Mingyang Huang, Jianping Shi, xiangyang xue, Thomas Huang

The proposed INIT exhibits three important advantages: (1) the instance-level objective loss can help learn a more accurate reconstruction and incorporate diverse attributes of objects; (2) the styles used for target domain of local/global areas are from corresponding spatial regions in source domain, which intuitively is a more reasonable mapping; (3) the joint training process can benefit both fine and coarse granularity and incorporates instance information to improve the quality of global translation.

Attribute Image-to-Image Translation +3

Paper
Add Code

Correlative Multi-Label Multi-Instance Image Annotation

no code implementations • IEEE International Conference on Computer Vision 2011 • Xiangyang Xue, Wei zhang, Jie Zhang, Bin Wu, Jianping Fan, Yao Lu

The cross-level label coherence encodes the consistency between the labels at the image level and the labels at the region level.

Paper
Add Code

Fast Color Constancy with Patch-wise Bright Pixels

no code implementations • 17 Nov 2019 • Yiyao Shi, Jian Wang, xiangyang xue

In this paper, a learning-free color constancy algorithm called the Patch-wise Bright Pixels (PBP) is proposed.

Color Constancy

Paper
Add Code

Multi-Scale Self-Attention for Text Classification

no code implementations • 2 Dec 2019 • Qipeng Guo, Xipeng Qiu, PengFei Liu, xiangyang xue, Zheng Zhang

In this paper, we introduce the prior knowledge, multi-scale structure, into self-attention modules.

General Classification text-classification +1

Paper
Add Code

Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition

no code implementations • 17 Jan 2020 • Wenxuan Wang, Yanwei Fu, Qiang Sun, Tao Chen, Chenjie Cao, Ziqi Zheng, Guoqiang Xu, Han Qiu, Yu-Gang Jiang, xiangyang xue

Considering that uneven data distribution and a lack of samples are common in real-world scenarios, we further use our F2ED to evaluate several few-shot expression learning tasks, which require recognizing facial expressions given only a few training instances.

Facial Expression Recognition Facial Expression Recognition (FER) +1

Paper
Add Code

3DCFS: Fast and Robust Joint 3D Semantic-Instance Segmentation via Coupled Feature Selection

no code implementations • 1 Mar 2020 • Liang Du, Jingang Tan, xiangyang xue, Lili Chen, Hongkai Wen, Jianfeng Feng, Jiamao Li, Xiaolin Zhang

We propose a novel fast and robust 3D point clouds segmentation framework via coupled feature selection, named 3DCFS, that jointly performs semantic and instance segmentation.

3D Semantic Instance Segmentation feature selection +2

Paper
Add Code

MOTS: Multiple Object Tracking for General Categories Based On Few-Shot Method

no code implementations • 19 May 2020 • Xixi Xu, Chao Lu, Liang Zhu, xiangyang xue, Guanxian Chen, Qi Guo, Yining Lin, Zhijian Zhao

Most modern Multi-Object Tracking (MOT) systems typically apply a ReID-based paradigm to strike a balance between computational efficiency and performance.

Computational Efficiency Multi-Object Tracking +1

Paper
Add Code

Long-Term Cloth-Changing Person Re-identification

no code implementations • 26 May 2020 • Xuelin Qian, Wenxuan Wang, Li Zhang, Fangrui Zhu, Yanwei Fu, Tao Xiang, Yu-Gang Jiang, xiangyang xue

Specifically, we consider that under cloth-changes, soft-biometrics such as body shape would be more reliable.

Cloth-Changing Person Re-Identification

Paper
Add Code

A New Screening Method for COVID-19 based on Ocular Feature Recognition by Machine Learning Tools

no code implementations • 4 Sep 2020 • Yanwei Fu, Feng Li, Wenxuan Wang, Haicheng Tang, Xuelin Qian, Mengwei Gu, xiangyang xue

After more than four months of study, we found that confirmed cases of COVID-19 present consistent ocular pathological symbols; we propose a new screening method that analyzes eye-region images, captured by common CCD and CMOS cameras, and could reliably provide rapid risk screening of COVID-19 with very high accuracy.

BIG-bench Machine Learning Ethics +2

Paper
Add Code

M3Lung-Sys: A Deep Learning System for Multi-Class Lung Pneumonia Screening from CT Imaging

no code implementations • 7 Oct 2020 • Xuelin Qian, Huazhu Fu, Weiya Shi, Tao Chen, Yanwei Fu, Fei Shan, xiangyang xue

To counter the outbreak of COVID-19, the accurate diagnosis of suspected cases plays a crucial role in timely quarantine, medical treatment, and preventing the spread of the pandemic.

Paper
Add Code

Nonlinear Monte Carlo Method for Imbalanced Data Learning

no code implementations • 27 Oct 2020 • Xuli Shen, Qing Xu, xiangyang xue

and the mean value of loss function is used as the empirical risk by Law of Large Numbers (LLN).

imbalanced classification

Paper
Add Code

A Generic Object Re-identification System for Short Videos

no code implementations • 10 Feb 2021 • Tairu Qiu, Guanxian Chen, Zhongang Qi, Bin Li, Ying Shan, xiangyang xue

Short video applications like TikTok and Kwai have been a great hit recently.

Object object-detection +1

Paper
Add Code

Learning Compositional Representation for 4D Captures with Neural ODE

no code implementations • CVPR 2021 • Boyan Jiang, yinda zhang, Xingkui Wei, xiangyang xue, Yanwei Fu

To model the motion, a neural Ordinary Differential Equation (ODE) is trained to update the initial state conditioned on the learned motion code, and a decoder takes the shape code and the updated state code to reconstruct the 3D model at each time stamp.

4D reconstruction

Paper
Add Code

Delving into Data: Effectively Substitute Training for Black-box Attack

no code implementations • CVPR 2021 • Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, xiangyang xue

Previous substitute training approaches focus on stealing the knowledge of the target model based on real training data or synthetic data, without exploring what kind of data can further improve the transferability between the substitute and target models.

Adversarial Attack

Paper
Add Code

Rapid COVID-19 Risk Screening by Eye-region Manifestations

no code implementations • 12 Jun 2021 • Yanwei Fu, Lei Zhao, Haojie Zheng, Qiang Sun, Li Yang, Hong Li, Jiao Xie, xiangyang xue, Feng Li, Yuan Li, Wei Wang, Yantao Pei, Jianmin Wang, Xiuqi Wu, Yanhua Zheng, Hongxia Tian, Mengwei Gu

It is still nontrivial to develop a new fast COVID-19 screening method with easier access and lower cost, due to the technical and cost limitations of current testing methods in medical resource-poor districts.

Ethics

Paper
Add Code

SAR-Net: Shape Alignment and Recovery Network for Category-level 6D Object Pose and Size Estimation

no code implementations • CVPR 2022 • Haitao Lin, Zichang Liu, Chilam Cheang, Yanwei Fu, Guodong Guo, xiangyang xue

The concatenation of the observed point cloud and symmetric one reconstructs a coarse object shape, thus facilitating object center (3D translation) and 3D size estimation.

Object Optical Character Recognition (OCR)

Paper
Add Code

The Report on China-Spain Joint Clinical Testing for Rapid COVID-19 Risk Screening by Eye-region Manifestations

no code implementations • 18 Sep 2021 • Yanwei Fu, Feng Li, Paula boned Fustel, Lei Zhao, Lijie Jia, Haojie Zheng, Qiang Sun, Shisong Rong, Haicheng Tang, xiangyang xue, Li Yang, Hong Li, Jiao Xie, Wenxuan Wang, Yuan Li, Wei Wang, Yantao Pei, Jianmin Wang, Xiuqi Wu, Yanhua Zheng, Hongxia Tian, Mengwei Gu

The image-level performance of the COVID-19 prescreening model in the China-Spain multicenter study achieved an AUC of 0.913 (95% CI, 0.898-0.927), with a sensitivity of 0.695 (95% CI, 0.643-0.748), a specificity of 0.904 (95% CI, 0.891-0.919), an accuracy of 0.875 (0.861-0.889), and an F1 of 0.611 (0.568-0.655).

Binary Classification Specificity

Paper
Add Code

$\mathcal{P}^2$: A Plan-and-Pretrain Approach for Knowledge Graph-to-Text Generation

no code implementations • ACL (WebNLG, INLG) 2020 • Qipeng Guo, Zhijing Jin, Ning Dai, Xipeng Qiu, xiangyang xue, David Wipf, Zheng Zhang

Text verbalization of knowledge graphs is an important problem with wide application to natural language generation (NLG) systems.

Knowledge Graphs Text Generation

Paper
Add Code

DeepEnFM: Deep neural networks with Encoder enhanced Factorization Machine

no code implementations • 25 Sep 2019 • Qiang Sun, Zhinan Cheng, Yanwei Fu, Wenxuan Wang, Yu-Gang Jiang, xiangyang xue

Instead of learning the cross features directly, DeepEnFM adopts the Transformer encoder as a backbone to align the feature embeddings with the clues of other fields.

Click-Through Rate Prediction

Paper
Add Code

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

no code implementations • 7 Dec 2021 • Jinyang Yuan, Bin Li, xiangyang xue

When observing a visual scene that contains multiple objects from multiple viewpoints, humans are able to perceive the scene in a compositional way from each viewpoint, while achieving the so-called "object constancy" across different viewpoints, even though the exact viewpoints are untold.

Paper
Add Code

The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection

no code implementations • ICCV 2021 • Zhikang Zou, Xiaoqing Ye, Liang Du, Xianhui Cheng, Xiao Tan, Li Zhang, Jianfeng Feng, xiangyang xue, Errui Ding

Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, whereas its accuracy is still far from satisfactory.

Autonomous Driving Monocular 3D Object Detection +4

Paper
Add Code

Compositional Scene Representation Learning via Reconstruction: A Survey

no code implementations • 15 Feb 2022 • Jinyang Yuan, Tonglin Chen, Bin Li, xiangyang xue

In this survey, we first outline the current progress on reconstruction-based compositional scene representation learning with deep neural networks, including development history and categorizations of existing methods from the perspectives of the modeling of visual scenes and the inference of scene representations; then provide benchmarks, including an open source toolbox to reproduce the benchmark experiments, of representative methods that consider the most extensively studied problem setting and form the foundation for other methods; and finally discuss the limitations of existing methods and future directions of this research topic.

Representation Learning

Paper
Add Code

H4D: Human 4D Modeling by Learning Neural Compositional Representation

no code implementations • CVPR 2022 • Boyan Jiang, yinda zhang, Xingkui Wei, xiangyang xue, Yanwei Fu

A simple yet effective linear motion model is proposed to provide a rough and regularized motion estimation, followed by per-frame compensation for pose and geometry details with the residual encoded in the auxiliary code.

3D Reconstruction Future prediction +2

Paper
Add Code

QS-Craft: Learning to Quantize, Scrabble and Craft for Conditional Human Motion Animation

no code implementations • 22 Mar 2022 • Yuxin Hong, Xuelin Qian, Simian Luo, xiangyang xue, Yanwei Fu

To this end, this paper proposes a novel model of learning to Quantize, Scrabble, and Craft (QS-Craft) for conditional human motion animation.

Generative Adversarial Network

Paper
Add Code

ImpDet: Exploring Implicit Fields for 3D Object Detection

no code implementations • 31 Mar 2022 • Xuelin Qian, Li Wang, Yi Zhu, Li Zhang, Yanwei Fu, xiangyang xue

Conventional 3D object detection approaches concentrate on bounding-box representation learning with several parameters, i.e., localization, dimension, and orientation.

3D Object Detection Object +2

Paper
Add Code

DST: Dynamic Substitute Training for Data-free Black-box Attack

no code implementations • CVPR 2022 • Wenxuan Wang, Xuelin Qian, Yanwei Fu, xiangyang xue

With the wide applications of deep neural network models in various computer vision tasks, more and more works study the model vulnerability to adversarial examples.

Knowledge Distillation

Paper
Add Code

Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images

no code implementations • 21 Apr 2022 • Chao Wen, yinda zhang, Chenjie Cao, Zhuwen Li, xiangyang xue, Yanwei Fu

We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.

Paper
Add Code

Density-preserving Deep Point Cloud Compression

no code implementations • CVPR 2022 • Yun He, Xinlin Ren, Danhang Tang, yinda zhang, xiangyang xue, Yanwei Fu

To address this, we propose a novel deep point cloud compression method that preserves local density information.

Paper
Add Code

Learning 6-DoF Object Poses to Grasp Category-level Objects by Language Instructions

no code implementations • 9 May 2022 • Chilam Cheang, Haitao Lin, Yanwei Fu, xiangyang xue

This paper studies the task of grasping any object from known categories by following free-form language instructions.

Object Object Localization +1

Paper
Add Code

I Know What You Draw: Learning Grasp Detection Conditioned on a Few Freehand Sketches

no code implementations • 9 May 2022 • Haitao Lin, Chilam Cheang, Yanwei Fu, xiangyang xue

The physical robot experiments confirm the utility of our method in object-cluttered scenes.

Paper
Add Code

Cross-domain Federated Object Detection

no code implementations • 30 Jun 2022 • Shangchao Su, Bin Li, Chengzhi Zhang, Mingzhao Yang, xiangyang xue

Federated learning can enable multi-party collaborative learning without leaking client data.

Autonomous Driving Federated Learning +3

Paper
Add Code

RCLane: Relay Chain Prediction for Lane Detection

no code implementations • 19 Jul 2022 • Shenghua Xu, Xinyue Cai, Bin Zhao, Li Zhang, Hang Xu, Yanwei Fu, xiangyang xue

This is because most existing lane detection methods treat lane detection as either a dense prediction or a detection task; few of them consider the unique topologies (Y-shape, Fork-shape, nearly horizontal lane) of the lane markers, which leads to sub-optimal solutions.

Lane Detection

Paper
Add Code

Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis

no code implementations • 12 Aug 2022 • Zhiyu Jin, Xuli Shen, Bin Li, xiangyang xue

We connect Fourier amplitude and phase with Gram matrices and a content reconstruction loss in style transfer, respectively.

Style Transfer

Paper
Add Code

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

no code implementations • 24 Aug 2022 • Liang Du, Xiaoqing Ye, Xiao Tan, Edward Johns, Bo Chen, Errui Ding, xiangyang xue, Jianfeng Feng

A feasible method is investigated to construct conceptual scenes without external datasets.

3D Object Detection Domain Adaptation +1

Paper
Add Code

Domain Discrepancy Aware Distillation for Model Aggregation in Federated Learning

no code implementations • 4 Oct 2022 • Shangchao Su, Bin Li, xiangyang xue

In this paper, we first analyze the generalization bound of the aggregation model produced from knowledge distillation for the client domains, and then describe two challenges, server-to-client discrepancy and client-to-client discrepancy, brought to the aggregation model by the domain discrepancies.

Federated Learning Knowledge Distillation

Paper
Add Code

Compositional Scene Modeling with Global Object-Centric Representations

no code implementations • 21 Nov 2022 • Tonglin Chen, Bin Li, Zhimeng Shen, xiangyang xue

Inspired by such an ability of humans, this paper proposes a compositional scene modeling method to infer global representations of canonical images of objects without any supervision.

Object Patch Matching +1

Paper
Add Code

Chinese Character Recognition with Radical-Structured Stroke Trees

no code implementations • 24 Nov 2022 • Haiyang Yu, Jingye Chen, Bin Li, xiangyang xue

In this paper, we represent each Chinese character as a stroke tree, which is organized according to its radical structures, to fully exploit the merits of both radical and stroke levels in a decent way.

Paper
Add Code

Rethinking the Multi-view Stereo from the Perspective of Rendering-based Augmentation

no code implementations • 11 Mar 2023 • Chenjie Cao, Xinlin Ren, xiangyang xue, Yanwei Fu

To address these problems, we first apply one of the state-of-the-art learning-based MVS methods, MVSFormer, to overcome intractable scenarios such as the textureless and reflective regions that traditional PatchMatch methods struggle with, but it fails to reconstruct a few large scenes.

Paper
Add Code

Weakly-Supervised Text Instance Segmentation

no code implementations • 20 Mar 2023 • Xinyan Zu, Haiyang Yu, Bin Li, xiangyang xue

Text segmentation is a challenging vision task with many downstream applications.

Contrastive Learning Instance Segmentation +5

Paper
Add Code

Joint fMRI Decoding and Encoding with Latent Embedding Alignment

no code implementations • 26 Mar 2023 • Xuelin Qian, Yikai Wang, Yanwei Fu, Xinwei Sun, xiangyang xue, Jianfeng Feng

Our Latent Embedding Alignment (LEA) model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.

Image Generation

Paper
Add Code

Learning Versatile 3D Shape Generation with Improved AR Models

no code implementations • 26 Mar 2023 • Simian Luo, Xuelin Qian, Yanwei Fu, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue

Auto-Regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.

3D Shape Generation Image Generation +1

Paper
Add Code

Exploring One-shot Semi-supervised Federated Learning with A Pre-trained Diffusion Model

no code implementations • 6 May 2023 • Mingzhao Yang, Shangchao Su, Bin Li, xiangyang xue

Specifically, we first extract prototypes from the labeled data on the server and send them to the clients.

Federated Learning Privacy Preserving

Paper
Add Code

Collaborative Chinese Text Recognition with Personalized Federated Learning

no code implementations • 9 May 2023 • Shangchao Su, Haiyang Yu, Bin Li, xiangyang xue

In Chinese text recognition, to compensate for the insufficient local data and improve the performance of local few-shot character recognition, it is often necessary for one organization to collect a large amount of data from similar organizations.

Personalized Federated Learning Privacy Preserving

Paper
Add Code

OCTScenes: A Versatile Real-World Dataset of Tabletop Scenes for Object-Centric Learning

no code implementations • 16 Jun 2023 • Yinxuan Huang, Tonglin Chen, Zhimeng Shen, Jinghao Huang, Bin Li, xiangyang xue

The results demonstrate the shortcomings of state-of-the-art methods for learning meaningful representations from real-world data, despite their impressive performance on complex synthetic datasets.

Object Representation Learning

Paper
Add Code

Understanding Depth Map Progressively: Adaptive Distance Interval Separation for Monocular 3d Object Detection

no code implementations • 19 Jun 2023 • Xianhui Cheng, Shoumeng Qiu, Zhikang Zou, Jian Pu, xiangyang xue

In this paper, we propose a framework named the Adaptive Distance Interval Separation Network (ADISN) that adopts a novel perspective on understanding depth maps, as a form that lies between LiDAR and images.

Depth Estimation Monocular 3D Object Detection +1

Paper
Add Code

Rethinking Person Re-identification from a Projection-on-Prototypes Perspective

no code implementations • 21 Aug 2023 • Qizao Wang, Xuelin Qian, Bin Li, Yanwei Fu, xiangyang xue

In this paper, we rethink the role of the classifier in person Re-ID, and advocate a new perspective to conceive the classifier as a projection from image features to class prototypes.

Person Re-Identification Person Retrieval +1

Paper
Add Code

Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification

no code implementations • 21 Aug 2023 • Qizao Wang, Xuelin Qian, Bin Li, Ying Fu, Yanwei Fu, xiangyang xue

Images with similar so-called fine-grained attributes (e.g., clothes and viewpoints) are encouraged to cluster together.

Ranked #2 on Person Re-Identification on PRCC (mAP metric)

Attribute Cloth-Changing Person Re-Identification +4

Paper
Add Code

WALL-E: Embodied Robotic WAiter Load Lifting with Large Language Model

no code implementations • 30 Aug 2023 • Tianyu Wang, YiFan Li, Haitao Lin, xiangyang xue, Yanwei Fu

The target instruction is then forwarded to a visual grounding system for object pose and size estimation, following which the robot grasps the object accordingly.

Language Modelling Large Language Model +3

Paper
Add Code

DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions

no code implementations • 9 Sep 2023 • Teng Fu, Xiaocong Wang, Haiyang Yu, Ke Niu, Bin Li, xiangyang xue

Multiple object tracking (MOT) tends to become more challenging when severe occlusions occur.

Denoising Multiple Object Tracking +1

Paper
Add Code

Learning Versatile 3D Shape Generation with Improved Auto-regressive Models

no code implementations • ICCV 2023 • Simian Luo, Xuelin Qian, Yanwei Fu, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue

Auto-Regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.

3D Shape Generation Image Generation +1

Paper
Add Code

OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-modal 3D Data

no code implementations • 20 Oct 2023 • Yijie Zhou, Likun Cai, Xianhui Cheng, Zhongxue Gan, xiangyang xue, Wenchao Ding

In the era of big data and large models, automatic annotating functions for multi-modal data are of great significance for real-world AI-driven applications, such as autonomous driving and embodied AI.

Autonomous Driving

Paper
Add Code

One-Shot Federated Learning with Classifier-Guided Diffusion Models

no code implementations • 15 Nov 2023 • Mingzhao Yang, Shangchao Su, Bin Li, xiangyang xue

Leveraging the extensive knowledge stored in the pre-trained diffusion model, the synthetic datasets can assist us in surpassing the knowledge limitations of the client samples, resulting in aggregation models that even outperform the performance ceiling of centralized training in some cases, which is convincingly demonstrated in the sufficient quantification and visualization experiments conducted on three large-scale multi-domain image datasets.

Federated Learning

Paper
Add Code

Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection

no code implementations • 26 Dec 2023 • Songmin Dai, Yifan Wu, Xiaoqiang Li, xiangyang xue

Recent unsupervised anomaly detection methods often rely on feature extractors pretrained with auxiliary datasets or on well-crafted anomaly-simulated samples.

Unsupervised Anomaly Detection

Paper
Add Code

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

no code implementations • 3 Jan 2024 • Jinyang Yuan, Tonglin Chen, Zhimeng Shen, Bin Li, xiangyang xue

This ability is essential for humans to identify the same object while moving and to learn from vision efficiently.

Object

Paper
Add Code

Towards Generative Abstract Reasoning: Completing Raven's Progressive Matrix via Rule Abstraction and Selection

no code implementations • 18 Jan 2024 • Fan Shi, Bin Li, xiangyang xue

In the odd-one-out task and two held-out configurations, RAISE can leverage acquired latent concepts and atomic rules to find the rule-breaking image in a matrix and handle problems with unseen combinations of rules and attributes.

Answer Generation Attribute +2

Paper
Add Code

Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

no code implementations • 19 Feb 2024 • Xuelin Qian, Yu Wang, Simian Luo, yinda zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, xiangyang xue, Bo Zhao, Tiejun Huang, Yunsheng Wu, Yanwei Fu

In this paper, we extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.

3D Generation 3D Shape Generation +1

Paper
Add Code

FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

no code implementations • 5 Mar 2024 • Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, xiangyang xue, Jian Pu

In autonomous driving, 3D occupancy prediction outputs voxel-wise status and semantic labels for more comprehensive understandings of 3D scenes compared with traditional perception tasks, such as 3D object detection and bird's-eye view (BEV) semantic segmentation.

3D Object Detection Autonomous Driving +2

Paper
Add Code
