Search Results for author: Nanning Zheng

Found 147 papers, 51 papers with code

PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation

1 code implementation 8 Sep 2024 Ning Gao, Sanping Zhou, Le Wang, Nanning Zheng

In this paper, we propose a simple yet effective semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation, whose goal is to generate high-fidelity pseudo labels by learning robust and diverse features in the training process.

Image Segmentation Pseudo Label +3

Improving AlphaFlow for Efficient Protein Ensembles Generation

no code implementations 8 Jul 2024 Shaoning Li, Mingyu Li, Yusong Wang, Xinheng He, Nanning Zheng, Jian Zhang, Pheng-Ann Heng

Investigating conformational landscapes of proteins is a crucial way to understand their biological functions and properties.

A General Theory for Compositional Generalization

no code implementations 20 May 2024 Jingwen Fu, Zhizheng Zhang, Yan Lu, Nanning Zheng

Compositional Generalization (CG) embodies the ability to comprehend novel combinations of familiar concepts, representing a significant cognitive leap in human intellectual advancement.

Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

no code implementations CVPR 2024 Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang, Wenxuan Xie, Cuiling Lan, Yan Lu, Nanning Zheng

Significant progress has been made in scene text detection models since the rise of deep learning, but scene text layout analysis, which aims to group detected text instances as paragraphs, has not kept pace.

Scene Text Detection Text Detection

F$^3$low: Frame-to-Frame Coarse-grained Molecular Dynamics with SE(3) Guided Flow Matching

no code implementations 1 May 2024 Shaoning Li, Yusong Wang, Mingyu Li, Jian Zhang, Bin Shao, Nanning Zheng, Jian Tang

Molecular dynamics (MD) is a crucial technique for simulating biological systems, enabling the exploration of their dynamic nature and fostering an understanding of their functions and properties.

Make Your LLM Fully Utilize the Context

2 code implementations 25 Apr 2024 Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou

While many contemporary large language models (LLMs) can process lengthy input, they still struggle to fully utilize information within the long context, known as the lost-in-the-middle challenge.

4k Information Retrieval +2

Robust Noisy Label Learning via Two-Stream Sample Distillation

no code implementations 16 Apr 2024 Sihan Bai, Sanping Zhou, Zheng Qin, Le Wang, Nanning Zheng

Noisy label learning aims to learn robust networks under the supervision of noisy labels, which plays a critical role in deep learning.

Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

no code implementations 14 Mar 2024 He Zhang, Chang Liu, Zun Wang, Xinran Wei, Siyuan Liu, Nanning Zheng, Bin Shao, Tie-Yan Liu

Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems.

Property Prediction

Exploring Hardware Friendly Bottleneck Architecture in CNN for Embedded Computing Systems

no code implementations 11 Mar 2024 Xing Lei, Longjun Liu, Zhiheng Zhou, Hongbin Sun, Nanning Zheng

We deploy our L-Mobilenet model to ZYNQ embedded platform for fully evaluating the performance of our design.

See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI

no code implementations 11 Mar 2024 Yulong Liu, Yongqiang Ma, Guibo Zhu, Haodong Jing, Nanning Zheng

Our model integrates a high-level perception decoding pipeline and a pixel-wise reconstruction pipeline guided by high-level perceptions, simulating bottom-up and top-down processes in neuroscience.

Brain Decoding General Knowledge +1

Common 7B Language Models Already Possess Strong Math Capabilities

1 code implementation 7 Mar 2024 Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively, when selecting the best response from 256 random generations.

GSM8K Math

Leveraging Anchor-based LiDAR 3D Object Detection via Point Assisted Sample Selection

1 code implementation 4 Mar 2024 Shitao Chen, Haolin Zhang, Nanning Zheng

3D object detection based on LiDAR point cloud and prior anchor boxes is a critical technology for autonomous driving environment perception and understanding.

3D Object Detection Autonomous Driving +2

Dual-Space Optimization: Improved Molecule Sequence Design by Latent Prompt Transformer

no code implementations 27 Feb 2024 Deqian Kong, Yuhao Huang, Jianwen Xie, Edouardo Honig, Ming Xu, Shuanghong Xue, Pei Lin, Sanping Zhou, Sheng Zhong, Nanning Zheng, Ying Nian Wu

Designing molecules with desirable properties, such as drug-likeliness and high binding affinities towards protein targets, is a challenging problem.

An automated framework for brain vessel centerline extraction from CTA images

1 code implementation 13 Jan 2024 Sijie Liu, Ruisheng Su, Jianghang Su, Jingmin Xin, Jiayi Wu, Wim van Zwam, Pieter Jan van Doormaal, Aad van der Lugt, Wiro J. Niessen, Nanning Zheng, Theo van Walsum

In this paper, we consider automatic lumen segmentation generation without additional annotation effort by physicians and more effective use of the generated lumen segmentation for improved centerline extraction performance.

Segmentation

Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking

no code implementations 17 Nov 2023 Yizhe Li, Sanping Zhou, Zheng Qin, Le Wang, Jinjun Wang, Nanning Zheng

In this paper, we propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets, so as to achieve robust data association in the tracking process.

Multi-Object Tracking

FFINet: Future Feedback Interaction Network for Motion Forecasting

no code implementations 8 Nov 2023 Miao Kang, Shengqi Wang, Sanping Zhou, Ke Ye, Jingjing Jiang, Nanning Zheng

In this paper, we propose a novel Future Feedback Interaction Network (FFINet) to aggregate features from the current observations and potential future interactions for trajectory prediction.

Motion Forecasting Position +1

Learning From Mistakes Makes LLM Better Reasoner

1 code implementation 31 Oct 2023 Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen

To further improve their reasoning capabilities, this work explores whether LLMs can LEarn from MistAkes (LEMA), akin to the human learning process.

GSM8K Math +1

Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity

no code implementations 27 Oct 2023 Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen

Recently, Arjevani et al. [1] established a lower bound of iteration complexity for the first-order optimization under an $L$-smooth condition and a bounded noise variance assumption.

LEMMA valid

G2-MonoDepth: A General Framework of Generalized Depth Inference from Monocular RGB+X Data

1 code implementation 24 Oct 2023 Haotian Wang, Meng Yang, Nanning Zheng

This paper investigates a unified task of monocular depth inference, which infers high-quality depth maps from all kinds of input raw data from various robots in unseen scenes.

Data Augmentation Depth Completion +1

Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching

no code implementations 8 Oct 2023 Hao Zhang, Lumin Xu, Shenqi Lai, Wenqi Shao, Nanning Zheng, Ping Luo, Yu Qiao, Kaipeng Zhang

Current image-based keypoint detection methods for animal (including human) bodies and faces are generally divided into full-supervised and few-shot class-agnostic approaches.

Keypoint Detection Open Vocabulary Keypoint Detection

Overcoming the Barrier of Orbital-Free Density Functional Theory for Molecular Systems Using Deep Learning

no code implementations 28 Sep 2023 He Zhang, Siyuan Liu, Jiacheng You, Chang Liu, Shuxin Zheng, Ziheng Lu, Tong Wang, Nanning Zheng, Bin Shao

Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research.

Breaking through the learning plateaus of in-context learning in Transformer

no code implementations 12 Sep 2023 Jingwen Fu, Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng

To study the mechanism behind the learning plateaus, we conceptually separate a component within the model's internal representation that is exclusively affected by the model's weights.

In-Context Learning Representation Learning

Generalization error bounds for iterative learning algorithms with bounded updates

no code implementations 10 Sep 2023 Jingwen Fu, Nanning Zheng

This paper explores the generalization characteristics of iterative learning algorithms with bounded updates for non-convex loss functions, employing information-theoretic techniques.

InteractionNet: Joint Planning and Prediction for Autonomous Driving with Transformers

1 code implementation 7 Sep 2023 Jiawei Fu, Yanqing Shen, Zhiqiang Jian, Shitao Chen, Jingmin Xin, Nanning Zheng

Planning and prediction are two important modules of autonomous driving and have experienced tremendous advancement recently.

Autonomous Driving CARLA longest6

Complementing Onboard Sensors with Satellite Map: A New Perspective for HD Map Construction

1 code implementation 29 Aug 2023 Wenjie Gao, Jiawei Fu, Yanqing Shen, Haodong Jing, Shitao Chen, Nanning Zheng

To enable better integration of satellite maps with existing methods, we propose a hierarchical fusion module, which includes feature-level fusion and BEV-level fusion.

Autonomous Driving Semantic Segmentation

DETR Doesn't Need Multi-Scale or Locality Design

1 code implementation 3 Aug 2023 Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

This paper presents an improved DETR detector that maintains a "plain" nature: using a single-scale feature map and global cross-attention calculations without specific locality constraints, in contrast to previous leading DETR-based detectors that reintroduce architectural inductive biases of multi-scale and locality into the decoder.

Decoder

FS-Depth: Focal-and-Scale Depth Estimation from a Single Image in Unseen Indoor Scene

no code implementations 27 Jul 2023 Chengrui Wei, Meng Yang, Lei He, Nanning Zheng

It has long been an ill-posed problem to predict absolute depth maps from single images in real (unseen) indoor scenes.

3D Reconstruction Data Augmentation +1

MLF-DET: Multi-Level Fusion for Cross-Modal 3D Object Detection

no code implementations 18 Jul 2023 Zewei Lin, Yanqing Shen, Sanping Zhou, Shitao Chen, Nanning Zheng

In this paper, we propose a novel and effective Multi-Level Fusion network, named as MLF-DET, for high-performance cross-modal 3D object DETection, which integrates both the feature-level fusion and decision-level fusion to fully utilize the information in the image.

3D Object Detection Data Augmentation +1

LongNet: Scaling Transformers to 1,000,000,000 Tokens

3 code implementations 5 Jul 2023 Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei

Scaling sequence length has become a critical demand in the era of large language models.

When and Why Momentum Accelerates SGD: An Empirical Study

no code implementations 15 Jun 2023 Jingwen Fu, Bohan Wang, Huishuai Zhang, Zhizheng Zhang, Wei Chen, Nanning Zheng

In the comparison of SGDM and SGD with the same effective learning rate and the same batch size, we observe a consistent pattern: when $\eta_{ef}$ is small, SGDM and SGD experience almost the same empirical training losses; when $\eta_{ef}$ surpasses a certain threshold, SGDM begins to perform better than SGD.

Milestones in Autonomous Driving and Intelligent Vehicles Part II: Perception and Planning

no code implementations 3 Jun 2023 Long Chen, Siyu Teng, Bai Li, Xiaoxiang Na, Yuchen Li, Zixuan Li, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Growing interest in autonomous driving (AD) and intelligent vehicles (IVs) is fueled by their promise for enhanced safety, efficiency, and economic benefits.

Autonomous Driving Ethics

Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization

no code implementations 29 May 2023 Tao Yang, Yuwang Wang, Cuiling Lan, Yan Lu, Nanning Zheng

In this paper, we study several typical disentangled representation learning works in terms of both disentanglement and compositional generalization abilities, and we provide an important insight: vector-based representation (using a vector instead of a scalar to represent a concept) is the key to empower both good disentanglement and strong compositional generalization.

Disentanglement

Skill-Based Few-Shot Selection for In-Context Learning

no code implementations 23 May 2023 Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou

Few-shot selection -- selecting appropriate examples for each test instance separately -- is important for in-context learning.

In-Context Learning Semantic Parsing +1

Milestones in Autonomous Driving and Intelligent Vehicles Part I: Control, Computing System Design, Communication, HD Map, Testing, and Human Behaviors

no code implementations 12 May 2023 Long Chen, Yuchen Li, Chao Huang, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Siyu Teng, Chen Lv, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Our work is divided into 3 independent articles and the first part is a Survey of Surveys (SoS) for total technologies of AD and IVs that involves the history, summarizes the milestones, and provides the perspectives, ethics, and future research directions.

Autonomous Driving Ethics

How Do In-Context Examples Affect Compositional Generalization?

no code implementations 8 May 2023 Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou, Dongmei Zhang

Compositional generalization--understanding unseen combinations of seen primitives--is an essential reasoning capability in human intelligence.

In-Context Learning

Quadric Representations for LiDAR Odometry, Mapping and Localization

no code implementations 27 Apr 2023 Chao Xia, Chenfeng Xu, Patrick Rim, Mingyu Ding, Nanning Zheng, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan

Current LiDAR odometry, mapping and localization methods leverage point-wise representations of 3D scenes and achieve high accuracy in autonomous driving tasks.

Autonomous Driving

MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes

no code implementations 25 Apr 2023 Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng

Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important to ensure the safety and reliability of grasping in object stacked scenes.

Position Relationship Detection

Milestones in Autonomous Driving and Intelligent Vehicles: Survey of Surveys

no code implementations 30 Mar 2023 Long Chen, Yuchen Li, Chao Huang, Bai Li, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Xiaoxiang Na, Zixuan Li, Siyu Teng, Chen Lv, Jinjun Wang, Dongpu Cao, Nanning Zheng, Fei-Yue Wang

Interest in autonomous driving (AD) and intelligent vehicles (IVs) is growing at a rapid pace due to the convenience, safety, and economic benefits.

Autonomous Driving Ethics

MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering

1 code implementation CVPR 2023 Jingjing Jiang, Nanning Zheng

In this paper, we propose MixPHM, a redundancy-aware parameter-efficient tuning method that outperforms full finetuning in low-resource VQA.

Question Answering Visual Question Answering

BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding

1 code implementation 25 Feb 2023 Yulong Liu, Yongqiang Ma, Wei Zhou, Guibo Zhu, Nanning Zheng

Our experiments show that this combination can boost the decoding model's performance on certain tasks like fMRI-text matching and fMRI-to-image generation.

Brain Decoding Image Generation +3

Does Deep Learning Learn to Abstract? A Systematic Probing Framework

1 code implementation 23 Feb 2023 Shengnan An, Zeqi Lin, Bei Chen, Qiang Fu, Nanning Zheng, Jian-Guang Lou

Abstraction is a desirable capability for deep learning models, which means to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context.

DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models

1 code implementation NeurIPS 2023 Tao Yang, Yuwang Wang, Yan Lv, Nanning Zheng

Targeting to understand the underlying explainable factors behind observations and modeling the conditional generation process on these factors, we connect disentangled representation learning to Diffusion Probabilistic Models (DPMs) to take advantage of the remarkable modeling ability of DPMs.

Disentanglement

Dynamic Grained Encoder for Vision Transformers

1 code implementation NeurIPS 2021 Lin Song, Songyang Zhang, Songtao Liu, Zeming Li, Xuming He, Hongbin Sun, Jian Sun, Nanning Zheng

Specifically, we propose a Dynamic Grained Encoder for vision transformers, which can adaptively assign a suitable number of queries to each spatial region.

Image Classification Language Modelling +2

DETR Does Not Need Multi-Scale or Locality Design

1 code implementation ICCV 2023 Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

This paper presents an improved DETR detector that maintains a "plain" nature: using a single-scale feature map and global cross-attention calculations without specific locality constraints, in contrast to previous leading DETR-based detectors that reintroduce architectural inductive biases of multi-scale and locality into the decoder.

Decoder

Inverse Compositional Learning for Weakly-supervised Relation Grounding

no code implementations ICCV 2023 Huan Li, Ping Wei, Zeyu Ma, Nanning Zheng

In this study, we introduce a novel approach called inverse compositional learning (ICL) for weakly-supervised video relation grounding.

Relation Video Understanding

Object-fabrication Targeted Attack for Object Detection

no code implementations 13 Dec 2022 Xuchong Zhang, Changfeng Sun, Haoliang Han, Hang Wang, Hongbin Sun, Nanning Zheng

Evaluation results demonstrate that the proposed object-fabrication targeted attack mode and the corresponding targeted feature space attack method show significant improvements in terms of image-specific attack, universal performance and generalization capability, compared with the previous targeted attack for object detection.

Adversarial Attack Object +2

StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition

no code implementations CVPR 2023 Yanqing Shen, Sanping Zhou, Jingwen Fu, Ruotong Wang, Shitao Chen, Nanning Zheng

In this paper, we propose StructVPR, a novel training architecture for VPR, to enhance structural knowledge in RGB global features and thus improve feature stability in a constantly changing environment.

Image Retrieval Knowledge Distillation +3

Could Giant Pretrained Image Models Extract Universal Representations?

no code implementations 3 Nov 2022 Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao

In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition.

Action Recognition In Videos Instance Segmentation +5

Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios

no code implementations 10 Oct 2022 Xingyu Chen, Jianru Xue, Jianwu Fang, Yuxin Pan, Nanning Zheng

In this paper, we propose a lightweight system, RDS-SLAM, based on ORB-SLAM2, which can accurately estimate poses and build semantic maps at object level for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.

Object object-detection +1

Correlation Information Bottleneck: Towards Adapting Pretrained Multimodal Models for Robust Visual Question Answering

1 code implementation 14 Sep 2022 Jingjing Jiang, Ziyi Liu, Nanning Zheng

In this paper, we aim to improve input robustness from an information bottleneck perspective when adapting pretrained VLMs to the downstream VQA task.

Adversarial Robustness Question Answering +1

DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection

1 code implementation 22 Jul 2022 Jinrong Yang, Lin Song, Songtao Liu, Weixin Mao, Zeming Li, Xiaoping Li, Hongbin Sun, Jian Sun, Nanning Zheng

Many point-based 3D detectors adopt point-feature sampling strategies to drop some points for efficient inference.

3D Object Detection object-detection

Learning to Refactor Action and Co-occurrence Features for Temporal Action Localization

no code implementations CVPR 2022 Kun Xia, Le Wang, Sanping Zhou, Nanning Zheng, Wei Tang

The main challenge of Temporal Action Localization is to retrieve subtle human actions from various co-occurring ingredients, e.g., context and background, in an untrimmed video.

Temporal Action Localization

Visual Concepts Tokenization

2 code implementations 20 May 2022 Tao Yang, Yuwang Wang, Yan Lu, Nanning Zheng

We further propose a Concept Disentangling Loss to facilitate that different concept tokens represent independent visual concepts.

Representation Learning

Test-time Batch Normalization

no code implementations 20 May 2022 Tao Yang, Shenglong Zhou, Yuwang Wang, Yan Lu, Nanning Zheng

Deep neural networks often suffer from the data distribution shift between training and testing, and the batch statistics are observed to reflect the shift.

Domain Generalization

Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models

no code implementations 7 Mar 2022 Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou

This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs.

Language Modelling Natural Language Understanding +1

Trajectory Forecasting from Detection with Uncertainty-Aware Motion Encoding

no code implementations 3 Feb 2022 Pu Zhang, Lei Bai, Jianru Xue, Jianwu Fang, Nanning Zheng, Wanli Ouyang

Trajectories obtained from object detection and tracking are inevitably noisy, which could cause serious forecasting errors to predictors built on ground truth trajectories.

object-detection Object Detection +1

TransVPR: Transformer-based place recognition with multi-level attention aggregation

no code implementations CVPR 2022 Ruotong Wang, Yanqing Shen, Weiliang Zuo, Sanping Zhou, Nanning Zheng

In addition, the output tokens from Transformer layers filtered by the fused attention mask are considered as key-patch descriptors, which are used to perform spatial matching to re-rank the candidates retrieved by the global image features.

Autonomous Driving Visual Place Recognition

Co-evolution Transformer for Protein Contact Prediction

1 code implementation NeurIPS 2021 He Zhang, Fusong Ju, Jianwei Zhu, Liang He, Bin Shao, Nanning Zheng, Tie-Yan Liu

These methods generally derive coevolutionary features by aggregating the learned residue representations from individual sequences with equal weights, which is inconsistent with the premise that residue co-evolutions are a reflection of collective covariation patterns of numerous homologous proteins.

LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering

1 code implementation 29 Nov 2021 Jingjing Jiang, Ziyi Liu, Nanning Zheng

Video Question Answering (VideoQA), aiming to correctly answer the given question based on understanding multi-modal video content, is challenging due to the rich video content.

Diversity Question Answering +3

Improved Drug-target Interaction Prediction with Intermolecular Graph Transformer

no code implementations 14 Oct 2021 Siyuan Liu, Yusong Wang, Tong Wang, Yifan Deng, Liang He, Bin Shao, Jian Yin, Nanning Zheng, Tie-Yan Liu

The identification of active binding drugs for target proteins (termed as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery.

Drug Discovery Molecular Docking +1

Foreground-attention in neural decoding: Guiding Loop-Enc-Dec to reconstruct visual stimulus images from fMRI

no code implementations 29 Sep 2021 Kai Chen, Yongqiang Ma, Mingyang Sheng, Nanning Zheng

Inspired by the mechanism of human visual attention, in this paper, we propose a novel method of reconstructing visual stimulus images, which first decodes the distribution of visual attention from fMRI, and then reconstructs the visual images guided by visual attention.

Decoder Image Reconstruction

LGD: Label-guided Self-distillation for Object Detection

1 code implementation 23 Sep 2021 Peizhen Zhang, Zijian Kang, Tong Yang, Xiangyu Zhang, Nanning Zheng, Jian Sun

Instead, we generate an instructive knowledge based only on student representations and regular labels.

Instance Segmentation Object +4

INVIGORATE: Interactive Visual Grounding and Grasping in Clutter

no code implementations 25 Aug 2021 Hanbo Zhang, Yunfan Lu, Cunjun Yu, David Hsu, Xuguang Lan, Nanning Zheng

This paper presents INVIGORATE, a robot system that interacts with humans through natural language and grasps a specified object in clutter.

Blocking Object +5

Model-based Decision Making with Imagination for Autonomous Parking

1 code implementation 25 Aug 2021 Ziyue Feng, Yu Chen, Shitao Chen, Nanning Zheng

The proposed algorithm consists of three parts: an imaginative model for anticipating results before parking, an improved rapidly-exploring random tree (RRT) for planning a feasible trajectory from a given start point to a parking lot, and a path smoothing module for optimizing the efficiency of parking tasks.

Autonomous Driving Decision Making

Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction

1 code implementation ICCV 2021 Fang Zheng, Le Wang, Sanping Zhou, Wei Tang, Zhenxing Niu, Nanning Zheng, Gang Hua

Specifically, the proposed unlimited neighborhood interaction module generates the fused-features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area.

Graph Attention Trajectory Prediction

X-GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering

1 code implementation 24 Jul 2021 Jingjing Jiang, Ziyi Liu, Yifan Liu, Zhixiong Nan, Nanning Zheng

In this paper, we formulate OOD generalization in VQA as a compositional generalization problem and propose a graph generative modeling-based training scheme (X-GGM) to implicitly model the problem.

Attribute Out-of-Distribution Generalization +2

Adversarial Attack and Defense in Deep Ranking

1 code implementation 7 Jun 2021 Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Nanning Zheng, Gang Hua

In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations.

Adversarial Attack Adversarial Robustness

Video Imprint

no code implementations 7 Jun 2021 Zhanning Gao, Le Wang, Nebojsa Jojic, Zhenxing Niu, Nanning Zheng, Gang Hua

In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint.

Language Modelling Retrieval

Practical Relative Order Attack in Deep Ranking

2 code implementations ICCV 2021 Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Yinghui Xu, Nanning Zheng, Gang Hua

In this paper, we formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates.

Adversarial Attack

A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN

no code implementations 1 Mar 2021 He Zhang, Zhixiong Nan, Tao Yang, Yifan Liu, Nanning Zheng

In autonomous driving, perceiving the driving behaviors of surrounding agents is important for the ego-vehicle to make a reasonable decision.

Autonomous Driving

Towards Building A Group-based Unsupervised Representation Disentanglement Framework

1 code implementation ICLR 2022 Tao Yang, Xuanchi Ren, Yuwang Wang, Wenjun Zeng, Nanning Zheng

We then propose a model, based on existing VAE-based methods, to tackle the unsupervised learning problem of the framework.

Disentanglement

Practical Order Attack in Deep Ranking

no code implementations 1 Jan 2021 Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Xu Yinghui, Nanning Zheng, Gang Hua

The objective of this paper is to formalize and practically implement a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order of a selected set of candidates according to a permutation vector predefined by the attacker, with only limited interference to other unrelated candidates.

Adversarial Attack Image Retrieval

Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation

no code implementations 7 Dec 2020 Lipeng Wan, Xuwei Song, Xuguang Lan, Nanning Zheng

To solve the challenge, general methods for policy-based multi-agent reinforcement learning introduce differentiated value functions or advantage functions for individual agents.

Multi-agent Reinforcement Learning Starcraft

Fine-Grained Dynamic Head for Object Detection

1 code implementation NeurIPS 2020 Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng

To this end, we propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance, which further releases the ability of multi-scale feature representation.

Object object-detection +1

Learning to Infer Unseen Attribute-Object Compositions

no code implementations 27 Oct 2020 Hui Chen, Zhixiong Nan, Jingjing Jiang, Nanning Zheng

The composition recognition of unseen attribute-object is critical to make machines learn to decompose and compose complex concepts like people.

Attribute Object

Conditional Uncorrelation and Efficient Non-approximate Subset Selection in Sparse Regression

no code implementations 8 Sep 2020 Jianji Wang, Qi Liu, Shupei Zhang, Nanning Zheng, Fei-Yue Wang

By the proposed method, the computational complexity is reduced from $O(\frac{1}{6}{k^3}+mk^2+mkd)$ to $O(\frac{1}{6}{k^3}+\frac{1}{2}mk^2)$ for each candidate subset in sparse regression.

regression

A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning

2 code implementations ECCV 2020 Xingyu Chen, Xuguang Lan, Fuchun Sun, Nanning Zheng

Using a gating mechanism that discriminates the unseen samples from the seen samples can decompose the GZSL problem to a conventional Zero-Shot Learning (ZSL) problem and a supervised classification problem.

Generalized Zero-Shot Learning

Compositional Generalization by Learning Analytical Expressions

1 code implementation NeurIPS 2020 Qian Liu, Shengnan An, Jian-Guang Lou, Bei Chen, Zeqi Lin, Yan Gao, Bin Zhou, Nanning Zheng, Dongmei Zhang

Compositional generalization is a basic and essential intellective capability of human beings, which allows us to recombine known parts readily.

Hierarchical Reinforcement Learning

REGNet: REgion-based Grasp Network for End-to-end Grasp Detection in Point Clouds

1 code implementation 28 Feb 2020 Binglei Zhao, Hanbo Zhang, Xuguang Lan, Haoyu Wang, Zhiqiang Tian, Nanning Zheng

Reliable robotic grasping in unstructured environments is a crucial but challenging task.

Robotics

Visual Semantic SLAM with Landmarks for Large-Scale Outdoor Environment

1 code implementation 4 Jan 2020 Zirui Zhao, Yijun Mao, Yan Ding, Pengju Ren, Nanning Zheng

Semantic SLAM is an important field in autonomous driving and intelligent agents, which can enable robots to achieve high-level navigation tasks, obtain simple cognition or reasoning ability, and achieve language-based human-robot interaction.

Autonomous Driving Semantic Segmentation +1

BEYOND SUPERVISED LEARNING: RECOGNIZING UNSEEN ATTRIBUTE-OBJECT PAIRS WITH VISION-LANGUAGE FUSION AND ATTRACTOR NETWORKS

no code implementations ICLR 2020 Hui Chen, Zhixiong Nan, Nanning Zheng

This paper handles a challenging problem, unseen attribute-object pair recognition, which asks a model to simultaneously recognize the attribute type and the object type of a given image while this attribute-object pair is not included in the training set.

Attribute Object

EleAtt-RNN: Adding Attentiveness to Neurons in Recurrent Neural Networks

no code implementations 3 Sep 2019 Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

For an RNN block, an EleAttG is used for adaptively modulating the input by assigning different levels of importance, i.e., attention, to each element/dimension of the input.

Action Recognition Gesture Recognition +1

An encoding framework with brain inner state for natural image identification

no code implementations 22 Aug 2019 Hao Wu, Ziyu Zhu, Jiayi Wang, Nanning Zheng, Badong Chen

The framework comprises two parts: a forward encoding model that deals with visual stimuli and an inner state model that captures the influence of intrinsic connections in the brain.

Brain Decoding

Hindsight Trust Region Policy Optimization

1 code implementation 29 Jul 2019 Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng

We propose Hindsight Trust Region Policy Optimization (HTRPO), a new RL algorithm that extends the highly successful TRPO algorithm with hindsight to tackle the challenge of sparse rewards.

Atari Games Policy Gradient Methods +1

SR-LSTM: State Refinement for LSTM towards Pedestrian Trajectory Prediction

1 code implementation CVPR 2019 Pu Zhang, Wanli Ouyang, Pengfei Zhang, Jianru Xue, Nanning Zheng

In order to address this issue, we propose a data-driven state refinement module for LSTM network (SR-LSTM), which activates the utilization of the current intention of neighbors, and jointly and iteratively refines the current states of all participants in the crowd through a message passing mechanism.

Pedestrian Trajectory Prediction Trajectory Prediction

Consistency-aware Shading Orders Selective Fusion for Intrinsic Image Decomposition

no code implementations 23 Oct 2018 Yuanliu Liu, Ang Li, Zejian Yuan, Badong Chen, Nanning Zheng

We propose a Consistency-aware Selective Fusion (CSF) to integrate the pairwise orders into a globally consistent order.

Intrinsic Image Decomposition

A Real-time Robotic Grasp Approach with Oriented Anchor Box

no code implementations 8 Sep 2018 Hanbo Zhang, Xinwen Zhou, Xuguang Lan, Jin Li, Zhiqiang Tian, Nanning Zheng

The main component of our approach is a grasp detection network with oriented anchor boxes as detection priors.

Robotics

Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification

no code implementations ECCV 2018 Xing Wei, Yue Zhang, Yihong Gong, Jiawei Zhang, Nanning Zheng

The reason is that the bilinear feature matrix is sensitive to the magnitudes and correlations of local CNN feature elements which can be measured by its singular values.

Fine-Grained Image Classification Fine-Grained Visual Recognition +1

Transductive Semi-Supervised Deep Learning using Min-Max Features

no code implementations ECCV 2018 Weiwei Shi, Yihong Gong, Chris Ding, Zhiheng Ma, Xiaoyu Tao, Nanning Zheng

In this paper, we propose a Transductive Semi-Supervised Deep Learning (TSSDL) method that is effective for training Deep Convolutional Neural Network (DCNN) models.

General Classification Image Classification +1

ROI-based Robotic Grasp Detection for Object Overlapping Scenes

no code implementations 30 Aug 2018 Hanbo Zhang, Xuguang Lan, Site Bai, Xinwen Zhou, Zhiqiang Tian, Nanning Zheng

Experimental results demonstrate that ROI-GD performs much better in object overlapping scenes and, at the same time, remains comparable with state-of-the-art grasp detection algorithms on the Cornell Grasp Dataset and the Jacquard Dataset.

Robotics

Adding Attentiveness to the Neurons in Recurrent Neural Networks

no code implementations ECCV 2018 Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

We propose adding a simple yet effective Element-wise-Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer) that empowers the RNN neurons to have the attentiveness capability.

Action Recognition Skeleton Based Action Recognition +1

Discriminative Feature Learning with Foreground Attention for Person Re-Identification

no code implementations 4 Jul 2018 Sanping Zhou, Jinjun Wang, Deyu Meng, Yudong Liang, Yihong Gong, Nanning Zheng

Specifically, a novel foreground attentive subnetwork is designed to drive the network's attention, in which a decoder network is used to reconstruct the binary mask by using a novel local regression loss function, and an encoder network is regularized by the decoder network to focus its attention on the foreground persons.

Decoder Multi-Task Learning +1

Kernelized Subspace Pooling for Deep Local Descriptors

no code implementations CVPR 2018 Xing Wei, Yue Zhang, Yihong Gong, Nanning Zheng

Experimental results on several patch matching benchmarks show that our method significantly outperforms the state of the art.

Patch Matching

View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition

2 code implementations 20 Apr 2018 Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng

In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning-based, data-driven manner.

Action Recognition Skeleton Based Action Recognition +1

Attention-based Temporal Weighted Convolutional Neural Network for Action Recognition

no code implementations 19 Mar 2018 Jinliang Zang, Le Wang, Ziyi Liu, Qilin Zhang, Zhenxing Niu, Gang Hua, Nanning Zheng

Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs).

Action Recognition Temporal Action Localization

Feature Selective Small Object Detection via Knowledge-based Recurrent Attentive Neural Network

no code implementations 13 Mar 2018 Kai Yi, Zhiqiang Jian, Shitao Chen, Nanning Zheng

At present, the performance of deep neural networks in general object detection is comparable to or even surpasses that of human beings.

Autonomous Driving Decision Making +4

Augmented Space Linear Model

no code implementations 1 Feb 2018 Zhengda Qin, Badong Chen, Nanning Zheng, Jose C. Principe

In this paper, we propose a linear model called Augmented Space Linear Model (ASLM), which uses the full joint space of input and desired signal as the projection space and approaches the performance of nonlinear models.

Computational Efficiency

A Novel Brain Decoding Method: a Correlation Network Framework for Revealing Brain Connections

no code implementations 1 Dec 2017 Siyu Yu, Nanning Zheng, Yongqiang Ma, Hao Wu, Badong Chen

Analyzing the correlations of collected data from human brain activities and representing activity patterns are two problems in brain decoding based on functional magnetic resonance imaging (fMRI) signals.

Brain Decoding

Quantized Minimum Error Entropy Criterion

no code implementations 11 Oct 2017 Badong Chen, Lei Xing, Nanning Zheng, Jose C. Príncipe

Compared with traditional learning criteria, such as mean square error (MSE), the minimum error entropy (MEE) criterion is superior in nonlinear and non-Gaussian signal processing and machine learning.

Quantization

Deep Self-Paced Learning for Person Re-Identification

no code implementations 7 Oct 2017 Sanping Zhou, Jinjun Wang, Deyu Meng, Xiaomeng Xin, Yubing Li, Yihong Gong, Nanning Zheng

In this paper, we propose a novel deep self-paced learning (DSPL) algorithm to alleviate this problem, in which we apply a self-paced constraint and symmetric regularization to help the relative distance metric train the deep neural network, so as to learn stable and discriminative features for person Re-ID.

Person Re-Identification

Large Margin Learning in Set to Set Similarity Comparison for Person Re-identification

no code implementations 18 Aug 2017 Sanping Zhou, Jinjun Wang, Rui Shi, Qiqi Hou, Yihong Gong, Nanning Zheng

The class-identity term keeps the intra-class samples within each camera view gathered together, the relative distance term maximizes the distance between the intra-class set and the inter-class set across different camera views, and the regularization term smooths the parameters of the deep convolutional neural network (CNN).

Person Re-Identification Retrieval

Associations among Image Assessments as Cost Functions in Linear Decomposition: MSE, SSIM, and Correlation Coefficient

no code implementations 4 Aug 2017 Jianji Wang, Nanning Zheng, Badong Chen, Jose C. Principe

Moreover, for a target vector, the ratio of the corresponding affine parameters in the MSE-based linear decomposition scheme and the SSIM-based scheme is a constant, which is just the value of PCC between the target vector and its estimated vector.

SSIM

Deep Feature Learning via Structured Graph Laplacian Embedding for Person Re-Identification

no code implementations 25 Jul 2017 De Cheng, Yihong Gong, Zhihui Li, Weiwei Shi, Alexander G. Hauptmann, Nanning Zheng

The proposed method can take full advantage of the structured distance relationships among these training samples through the constructed complete graph.

Person Re-Identification

Point to Set Similarity Based Deep Feature Learning for Person Re-Identification

no code implementations CVPR 2017 Sanping Zhou, Jinjun Wang, Jiayun Wang, Yihong Gong, Nanning Zheng

One of the key issues for deep learning based person Re-ID is the selection of proper similarity comparison criteria, and the performance of features learned using existing criteria based on pairwise similarity is still limited, because mostly only P2P distances are considered.

Person Re-Identification

Detecting Drivable Area for Self-driving Cars: An Unsupervised Approach

no code implementations 1 May 2017 Ziyi Liu, Siyu Yu, Xiao Wang, Nanning Zheng

Experiments show that our unsupervised approach is efficient and robust for detecting drivable area for self-driving cars.

Self-Driving Cars

Single Image Super Resolution - When Model Adaptation Matters

no code implementations 31 Mar 2017 Yudong Liang, Radu Timofte, Jinjun Wang, Yihong Gong, Nanning Zheng

The internal contents of the low-resolution input image are neglected with deep modeling, despite earlier works showing the power of using such internal priors.

Image Super-Resolution

View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data

1 code implementation ICCV 2017 Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng

Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end.

Action Recognition Skeleton Based Action Recognition +1

Single Image Super-resolution via a Lightweight Residual Convolutional Neural Network

no code implementations 23 Mar 2017 Yudong Liang, Ze Yang, Kai Zhang, Yihui He, Jinjun Wang, Nanning Zheng

To tackle the second problem, a lightweight CNN architecture with carefully designed width, depth and skip connections was proposed.

Image Super-Resolution SSIM

Robust Learning with Kernel Mean p-Power Error Loss

no code implementations 21 Dec 2016 Badong Chen, Lei Xing, Xin Wang, Jing Qin, Nanning Zheng

Correntropy is a second order statistical measure in kernel space, which has been successfully applied in robust learning and signal processing.

Kernel Risk-Sensitive Loss: Definition, Properties and Application to Robust Adaptive Filtering

no code implementations 1 Aug 2016 Badong Chen, Lei Xing, Bin Xu, Haiquan Zhao, Nanning Zheng, Jose C. Principe

Nonlinear similarity measures defined in kernel space, such as correntropy, can extract higher-order statistics of data and offer potentially significant performance improvement over their linear counterparts especially in non-Gaussian signal processing and machine learning.

Similarity Learning With Spatial Constraints for Person Re-Identification

no code implementations CVPR 2016 Dapeng Chen, Zejian Yuan, Badong Chen, Nanning Zheng

We therefore learn a novel similarity function, which consists of multiple sub-similarity measurements, each in charge of a subregion.

Person Re-Identification

Person Re-Identification by Multi-Channel Parts-Based CNN With Improved Triplet Loss Function

1 code implementation CVPR 2016 De Cheng, Yihong Gong, Sanping Zhou, Jinjun Wang, Nanning Zheng

Person re-identification across cameras remains a very challenging problem, especially when there are no overlapping fields of view between cameras.

Person Re-Identification Robust Face Recognition

Counting Grid Aggregation for Event Retrieval and Recognition

no code implementations 5 Apr 2016 Zhanning Gao, Gang Hua, Dongqing Zhang, Jianru Xue, Nanning Zheng

Event retrieval and recognition in a large corpus of videos necessitates a holistic fixed-size visual representation at the video clip level that is comprehensive, compact, and yet discriminative.

Retrieval

Correntropy Maximization via ADMM - Application to Robust Hyperspectral Unmixing

no code implementations 4 Feb 2016 Fei Zhu, Abderrahim Halimi, Paul Honeine, Badong Chen, Nanning Zheng

In hyperspectral images, some spectral bands suffer from low signal-to-noise ratio due to noisy acquisition and atmospheric effects, thus requiring robust techniques for the unmixing problem.

Hyperspectral Unmixing

Illumination Robust Color Naming via Label Propagation

no code implementations ICCV 2015 Yuanliu Liu, Zejian Yuan, Badong Chen, Jianru Xue, Nanning Zheng

In this paper we address the problem of inferring the color composition of the intrinsic reflectance of objects, where the shadows and highlights may change the observed color dramatically.

Image Retrieval Retrieval

Contour Guided Hierarchical Model for Shape Matching

no code implementations ICCV 2015 Yuanqi Su, Yuehu Liu, Bonan Cuan, Nanning Zheng

For this purpose, we divide the shape template into overlapping parts and model the matching through a part-based layered structure that uses a latent variable to constrain the parts' deformation.

Similarity Learning on an Explicit Polynomial Kernel Feature Map for Person Re-Identification

no code implementations CVPR 2015 Dapeng Chen, Zejian Yuan, Gang Hua, Nanning Zheng, Jingdong Wang

We follow the learning-to-rank methodology and learn a similarity function to maximize the difference between the similarity scores of matched and unmatched images for the same person.

Learning-To-Rank Patch Matching +1

Saturation-Preserving Specular Reflection Separation

no code implementations CVPR 2015 Yuanliu Liu, Zejian Yuan, Nanning Zheng, Yang Wu

Specular reflection generally decreases the saturation of surface colors, which may then be confused with other colors that have the same hue but lower saturation.

Generalized Correntropy for Robust Adaptive Filtering

no code implementations 12 Apr 2015 Badong Chen, Lei Xing, Haiquan Zhao, Nanning Zheng, José C. Príncipe

In this work, we propose a generalized correntropy that adopts the generalized Gaussian density (GGD) function as the kernel (not necessarily a Mercer kernel), and present some important properties.

Salient Object Detection: A Discriminative Regional Feature Integration Approach

no code implementations CVPR 2013 Huaizu Jiang, Zejian Yuan, Ming-Ming Cheng, Yihong Gong, Nanning Zheng, Jingdong Wang

Our method, which is based on multi-level image segmentation, utilizes the supervised learning approach to map the regional feature vector to a saliency score.

Image Segmentation Object +4

Kernel Least Mean Square with Adaptive Kernel Size

no code implementations 23 Jan 2014 Badong Chen, Junli Liang, Nanning Zheng, Jose C. Principe

Kernel adaptive filters (KAF) are a class of powerful nonlinear filters developed in Reproducing Kernel Hilbert Space (RKHS).

Time Series Time Series Prediction
