Search Results for author: Peng Wang

Found 301 papers, 117 papers with code

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

7 code implementations • 2 Nov 2018 • Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang

Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion.

Ranked #26 on Scene Text Recognition on ICDAR2015

Irregular Text Recognition Scene Text Recognition

38,330

Paper
Code

NAS-FCOS: Fast Neural Architecture Search for Object Detection

3 code implementations • CVPR 2020 • Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang

The success of deep neural networks relies on significant architecture engineering.

Ranked #113 on Object Detection on COCO test-dev

Neural Architecture Search Object +2

27,708

Paper
Code

Qwen Technical Report

2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.

Ranked #3 on Multi-Label Text Classification on CC3M-TagMask

Language Modelling Large Language Model +2

10,726

Paper
Code

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

4 code implementations • 7 Feb 2022 • Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang

In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization.

Ranked #1 on Visual Question Answering on VQA v2 test-std (yes/no metric)

Image Captioning Language Modelling +11

6,005

Paper
Code

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

2 code implementations • 18 May 2023 • Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)

Action Classification AudioCaps +16

6,005

Paper
Code

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

1 code implementation • 24 Aug 2023 • Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, Jingren Zhou

In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images.

Ranked #3 on Visual Question Answering on MM-Vet

Chart Question Answering Image Captioning +6

3,615

Paper
Code

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

1 code implementation • 10 Jan 2022 • Ningyu Zhang, Xin Xu, Liankuan Tao, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Xin Xie, Xiang Chen, Zhoubo Li, Lei Li, Xiaozhuan Liang, Yunzhi Yao, Shumin Deng, Peng Wang, Wen Zhang, Zhenru Zhang, Chuanqi Tan, Qiang Chen, Feiyu Xiong, Fei Huang, Guozhou Zheng, Huajun Chen

We present an open-source and extensible knowledge extraction toolkit DeepKE, supporting complicated low-resource, document-level and multimodal scenarios in the knowledge base population.

Attribute Attribute Extraction +5

2,907

Paper
Code

Verified Low-Level Programming Embedded in F*

4 code implementations • 28 Feb 2017 • Jonathan Protzenko, Jean-Karim Zinzindohoué, Aseem Rastogi, Tahina Ramananandro, Peng Wang, Santiago Zanella-Béguelin, Antoine Delignat-Lavaud, Catalin Hritcu, Karthikeyan Bhargavan, Cédric Fournet, Nikhil Swamy

Low* is a shallow embedding of a small, sequential, well-behaved subset of C in F*, a dependently-typed variant of ML aimed at program verification.

Programming Languages Cryptography and Security

2,560

Paper
Code

Prompt Tuning for Generative Multimodal Pretrained Models

1 code implementation • 4 Aug 2022 • Hao Yang, Junyang Lin, An Yang, Peng Wang, Chang Zhou, Hongxia Yang

Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining.

Ranked #2 on Visual Entailment on SNLI-VE test

Image Captioning Visual Entailment +1

2,320

Paper
Code

Transferring General Multimodal Pretrained Models to Text Recognition

1 code implementation • 19 Dec 2022 • Junyang Lin, Xuancheng Ren, Yichang Zhang, Gao Liu, Peng Wang, An Yang, Chang Zhou

This paper proposes a new method, OFA-OCR, to transfer multimodal pretrained models to text recognition.

Image Captioning Optical Character Recognition (OCR)

2,320

Paper
Code

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

6 code implementations • NeurIPS 2021 • Peng Wang, Lingjie Liu, YuAn Liu, Christian Theobalt, Taku Komura, Wenping Wang

In NeuS, we propose to represent a surface as the zero-level set of a signed distance function (SDF) and develop a new volume rendering method to train a neural SDF representation.

Novel View Synthesis Surface Reconstruction

1,473

Paper
Code

Multi-Label Image Recognition with Graph Convolutional Networks

2 code implementations • CVPR 2019 • Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo

The task of multi-label image recognition is to predict a set of object labels that are present in an image.

Ranked #12 on Multi-Label Classification on PASCAL VOC 2007

Long-tail Learning Multi-Label Classification +2

1,382

Paper
Code

Editing Large Language Models: Problems, Methods, and Opportunities

3 code implementations • 22 May 2023 • Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang

Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.

Model Editing

1,362

Paper
Code

EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models

2 code implementations • 14 Aug 2023 • Peng Wang, Ningyu Zhang, Bozhong Tian, Zekun Xi, Yunzhi Yao, Ziwen Xu, Mengru Wang, Shengyu Mao, Xiaohan Wang, Siyuan Cheng, Kangwei Liu, Yuansheng Ni, Guozhou Zheng, Huajun Chen

Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data.

knowledge editing

1,362

Paper
Code

A Comprehensive Study of Knowledge Editing for Large Language Models

2 code implementations • 2 Jan 2024 • Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches.

Ranked #1 on knowledge editing on zsRE (using extra training data)

knowledge editing

1,362

Paper
Code

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis

1 code implementation • ICLR 2022 • Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt

We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue.

Image Generation

953

Paper
Code

Multi-Granularity Prediction for Scene Text Recognition

2 code implementations • 8 Sep 2022 • Peng Wang, Cheng Da, Cong Yao

In this work, we first draw inspiration from the recent progress in Vision Transformer (ViT) to construct a conceptually simple yet powerful vision STR model, which is built upon ViT and outperforms previous state-of-the-art models for scene text recognition, including both pure vision models and language-augmented methods.

Ranked #1 on Scene Text Recognition on Uber-Text (using extra training data)

Language Modelling Optical Character Recognition (OCR) +1

896

Paper
Code

Levenshtein OCR

2 code implementations • 8 Sep 2022 • Cheng Da, Peng Wang, Cong Yao

A novel scene text recognizer based on Vision-Language Transformer (VLT) is presented.

Imitation Learning Optical Character Recognition (OCR) +1

894

Paper
Code

Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition

1 code implementation • 25 Jul 2023 • Cheng Da, Peng Wang, Cong Yao

Specifically, MGP-STR achieves an average recognition accuracy of $94\%$ on standard benchmarks for scene text recognition.

Language Modelling Optical Character Recognition (OCR) +1

894

Paper
Code

LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition

1 code implementation • ICCV 2023 • Changxu Cheng, Peng Wang, Cheng Da, Qi Zheng, Cong Yao

The diversity in length constitutes a significant characteristic of text.

Scene Text Recognition

894

Paper
Code

F$^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

1 code implementation • 28 Mar 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Based on our analysis, we further propose a novel space-warping method called perspective warping, which allows us to handle arbitrary trajectories in the grid-based NeRF framework.

Novel View Synthesis

893

Paper
Code

MVDream: Multi-view Diffusion for 3D Generation

2 code implementations • 31 Aug 2023 • Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang

We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt.

3D Generation

644

Paper
Code

The ApolloScape Open Dataset for Autonomous Driving and its Application

2 code implementations • 16 Mar 2018 • Xinyu Huang, Peng Wang, Xinjing Cheng, Dingfu Zhou, Qichuan Geng, Ruigang Yang

In this paper, we provide a sensor fusion scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robust self-localization and semantic segmentation for autonomous driving.

Autonomous Driving Instance Segmentation +3

527

Paper
Code

NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

1 code implementation • 27 May 2023 • YuAn Liu, Peng Wang, Cheng Lin, Xiaoxiao Long, Jiepeng Wang, Lingjie Liu, Taku Komura, Wenping Wang

We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.

Neural Rendering Object

499

Paper
Code

Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network

1 code implementation • ECCV 2018 • Xinjing Cheng, Peng Wang, Ruigang Yang

Depth estimation from a single image is a fundamental problem in computer vision.

Depth Estimation Depth Prediction

490

Paper
Code

Learning Depth with Convolutional Spatial Propagation Network

1 code implementation • 4 Oct 2018 • Xinjing Cheng, Peng Wang, Ruigang Yang

In this paper, we propose a simple yet effective convolutional spatial propagation network (CSPN) to learn the affinity matrix for various depth estimation tasks.

Depth Completion Depth Estimation +3

490

Paper
Code

Visual Question Answering: A Survey of Methods and Datasets

1 code implementation • 20 Jul 2016 • Qi Wu, Damien Teney, Peng Wang, Chunhua Shen, Anthony Dick, Anton Van Den Hengel

Visual Question Answering (VQA) is a challenging task that has received increasing attention from both the computer vision and the natural language processing communities.

General Knowledge Visual Question Answering

436

Paper
Code

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

1 code implementation • NeurIPS 2023 • Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa

This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth maps and poses).

Image Generation

426

Paper
Code

Neural Rays for Occlusion-aware Image-based Rendering

1 code implementation • CVPR 2022 • YuAn Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang

On such a 3D point, these generalization methods will include inconsistent image features from invisible views, which interfere with the radiance field construction.

Neural Rendering Novel View Synthesis +1

395

Paper
Code

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

1 code implementation • 12 Jun 2022 • Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang

We introduce SparseNeuS, a novel neural rendering based method for the task of surface reconstruction from multi-view images.

Neural Rendering Surface Reconstruction

313

Paper
Code

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

1 code implementation • CVPR 2020 • Shizhe Chen, Qin Jin, Peng Wang, Qi Wu

From the ASG, we propose a novel ASG2Caption model, which is able to recognise user intentions and semantics in the graph, and therefore generate desired captions according to the graph structure.

Attribute Caption Generation +1

198

Paper
Code

Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video

1 code implementation • 7 May 2020 • Peng Wang, Lingjie Liu, Nenglun Chen, Hung-Kuo Chu, Christian Theobalt, Wenping Wang

We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera.

Motion Estimation Occlusion Handling +1

193

Paper
Code

NAS-FCOS: Efficient Search for Object Detection Architectures

1 code implementation • 24 Oct 2021 • Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang

Neural Architecture Search (NAS) has shown great potential in effectively reducing manual effort in network design by automatically discovering optimal architectures.

Neural Architecture Search Object +2

187

Paper
Code

BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields

1 code implementation • CVPR 2023 • Peng Wang, Lingzhe Zhao, Ruijie Ma, Peidong Liu

Our approach models the physical image formation process of a motion blurred image, and jointly learns the parameters of NeRF and recovers the camera motion trajectories during exposure time.

Novel View Synthesis

180

Paper
Code

Real-time Segmentation and Facial Skin Tones Grading

1 code implementation • 30 Dec 2019 • Ling Luo, Dingyu Xue, Xinglong Feng, Yichun Yu, Peng Wang

Modern approaches to semantic segmentation usually pay too much attention to the accuracy of the model and therefore tend to introduce cumbersome backbones, which bring a heavy computation burden and memory footprint.

178

Paper
Code

PERF: Panoramic Neural Radiance Field from a Single Panorama

1 code implementation • 25 Oct 2023 • Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu

In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.

Novel View Synthesis Text to 3D

158

Paper
Code

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation • 8 Dec 2022 • Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

142

Paper
Code

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

1 code implementation • CVPR 2021 • Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

Last, we proposed an efficient fine-grained search strategy to train HR-NAS, which effectively explores the search space, and finds optimal architectures given various tasks and computation resources.

Image Classification Neural Architecture Search +3

138

Paper
Code

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos

1 code implementation • 8 Oct 2018 • Yang Wang, Zhenheng Yang, Peng Wang, Yi Yang, Chenxu Luo, Wei Xu

Then the whole scene is decomposed into moving foreground and static background by comparing the estimated optical flow and rigid flow derived from the depth and ego-motion.

Motion Estimation Optical Flow Estimation

128

Paper
Code

Self-Taught Convolutional Neural Networks for Short Text Clustering

1 code implementation • 1 Jan 2017 • Jiaming Xu, Peng Wang, Suncong Zheng, Guanhua Tian, Jun Zhao, Bo Xu

Short text clustering is a challenging problem due to the sparseness of its text representation.

Ranked #2 on Short Text Clustering on Stackoverflow

Clustering Dimensionality Reduction +2

115

Paper
Code

Short Text Clustering via Convolutional Neural Networks

3 code implementations • WS 2015 • Jiaming Xu, Peng Wang, Guanhua Tian, Bo Xu, Jun Zhao, Fangyuan Wang, Hong-Wei Hao

Clustering Short Text Clustering +1

115

Paper
Code

DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings

1 code implementation • 10 Dec 2021 • Chaochen Gao, Xing Wu, Peng Wang, Jue Wang, Liangjun Zang, Zhongyuan Wang, Songlin Hu

To tackle that, we propose an effective knowledge distillation framework for contrastive sentence embeddings, termed DistilCSE.

Contrastive Learning Knowledge Distillation +5

Paper
Code

Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses

1 code implementation • 17 Jul 2020 • Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang

In this paper, we propose a novel approach to convert given speech audio to a photo-realistic speaking video of a specific person, where the output video has synchronized, realistic, and expressive rich body dynamics.

Generative Adversarial Network

Paper
Code

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

1 code implementation • CVPR 2020 • Jie Li, Kai Han, Peng Wang, Yu Liu, Xia Yuan

In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wise.

Ranked #4 on 3D Semantic Scene Completion from a single RGB image on NYUv2

3D Semantic Scene Completion from a single RGB image

Paper
Code

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

1 code implementation • 18 Mar 2024 • Lingzhe Zhao, Peng Wang, Peidong Liu

In this paper, we introduce a novel approach, named BAD-Gaussians (Bundle Adjusted Deblur Gaussian Splatting), which leverages explicit Gaussian representation and handles severe motion-blurred images with inaccurate camera poses to achieve high-quality scene reconstruction.

3D Scene Reconstruction Neural Rendering +1

Paper
Code

Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge

1 code implementation • 2 Jun 2020 • Peng Wang, Dongyang Liu, Hui Li, Qi Wu

In this case, we need to use commonsense knowledge to identify the objects in the image.

16k Referring Expression +1

Paper
Code

LEGO: Learning Edge with Geometry all at Once by Watching Videos

1 code implementation • CVPR 2018 • Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

In our framework, the predicted depths, normals and edges are forced to be consistent all the time.

Paper
Code

Deep Learning and Its Applications to Machine Health Monitoring: A Survey

1 code implementation • 16 Dec 2016 • Rui Zhao, Ruqiang Yan, Zhenghua Chen, Kezhi Mao, Peng Wang, Robert X. Gao

Since 2006, deep learning (DL) has become a rapidly growing research direction, redefining state-of-the-art performances in a wide range of areas such as object recognition, image segmentation, speech recognition and machine translation.

Image Segmentation Machine Translation +5

Paper
Code

Discriminative and Robust Online Learning for Siamese Visual Tracking

1 code implementation • 6 Sep 2019 • Jinghao Zhou, Peng Wang, Haoyang Sun

The problem of visual object tracking has traditionally been handled by variant tracking paradigms, either learning a model of the object's appearance exclusively online or matching the object with the target in an offline-trained embedding space.

Visual Object Tracking Visual Tracking

Paper
Code

TouchStone: Evaluating Vision-Language Models by Language Models

1 code implementation • 31 Aug 2023 • Shuai Bai, Shusheng Yang, Jinze Bai, Peng Wang, Xingxuan Zhang, Junyang Lin, Xinggang Wang, Chang Zhou, Jingren Zhou

Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting visual receptors with large language models (LLMs).

Visual Storytelling

Paper
Code

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

1 code implementation • 22 Aug 2023 • Peng Wu, Xuerong Zhou, Guansong Pang, Lingru Zhou, Qingsen Yan, Peng Wang, Yanning Zhang

With the benefit of the dual branch, VadCLIP achieves both coarse-grained and fine-grained video anomaly detection by transferring pre-trained knowledge from CLIP to the WSVAD task.

Anomaly Detection Binary Classification +1

Paper
Code

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

1 code implementation • 9 Dec 2020 • Qi Zhu, Chenyu Gao, Peng Wang, Qi Wu

Texts appearing in daily scenes that can be recognized by OCR (Optical Character Recognition) tools contain significant information, such as street name, product brand and prices.

Image Captioning Optical Character Recognition +3

Paper
Code

DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map

1 code implementation • CVPR 2018 • Peng Wang, Ruigang Yang, Binbin Cao, Wei Xu, Yuanqing Lin

The uniqueness of our design is a sensor fusion scheme which integrates camera videos, motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robustness and efficiency of the system.

Autonomous Driving Pose Estimation +2

Paper
Code

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

1 code implementation • CVPR 2022 • Jingzhou Chen, Peng Wang, Jian Liu, Yuntao Qian

Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from coarse-to-fine levels.

Fine-Grained Image Classification Relation

Paper
Code

Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings

1 code implementation • 14 Jan 2022 • Peng Wang, Xin Xie, Xiaohan Wang, Ningyu Zhang

Previous knowledge graph embedding approaches usually map entities to representations and utilize score functions to predict the target entities, yet they typically struggle to reason about rare or emerging unseen entities.

Ranked #1 on Link Prediction on FB15k-237-ind

Knowledge Graph Embedding Knowledge Graph Embeddings +2

Paper
Code

WikiAsp: A Dataset for Multi-domain Aspect-based Summarization

1 code implementation • 16 Nov 2020 • Hiroaki Hayashi, Prashant Budania, Peng Wang, Chris Ackerson, Raj Neervannan, Graham Neubig

In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization.

Paper
Code

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

1 code implementation • 11 Feb 2024 • Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks.

Graph Question Answering Instruction Following +4

Paper
Code

VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis

1 code implementation • 30 May 2022 • Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille

Gaussian reconstruction kernels were proposed by Westover (1990) and studied by the computer graphics community back in the 90s; they give an alternative representation of object 3D geometry to meshes and point clouds.

Pose Estimation

Paper
Code

Semantic Instance Segmentation via Deep Metric Learning

1 code implementation • 30 Mar 2017 • Alireza Fathi, Zbigniew Wojna, Vivek Rathod, Peng Wang, Hyun Oh Song, Sergio Guadarrama, Kevin P. Murphy

We propose a new method for semantic instance segmentation, by first computing how likely two pixels are to belong to the same object, and then by grouping similar pixels together.

Ranked #3 on Object Proposal Generation on PASCAL VOC 2012, 60 proposals per image

Instance Segmentation Metric Learning +3

Paper
Code

Glocal Energy-based Learning for Few-Shot Open-Set Recognition

1 code implementation • CVPR 2023 • Haoyu Wang, Guansong Pang, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Few-shot open-set recognition (FSOR) is a challenging task of great practical value.

Open Set Learning

Paper
Code

Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer

1 code implementation • CVPR 2022 • Fushun Zhu, Shan Zhao, Peng Wang, Hao Wang, Hua Yan, Shuaicheng Liu

We propose a semi-supervised network for wide-angle portraits correction.

Paper
Code

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

1 code implementation • 17 Aug 2022 • Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang

To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing the cross attention between text prompts features and image patch token embeddings to encode both the downstream task-related information and visual instance information.

General Knowledge Language Modelling +1

Paper
Code

AerialVLN: Vision-and-Language Navigation for UAVs

1 code implementation • ICCV 2023 • Shubo Liu, Hongsheng Zhang, Yuankai Qi, Peng Wang, Yaning Zhang, Qi Wu

Navigating in the sky is more complicated than on the ground because agents need to consider the flying height and more complex spatial relationship reasoning.

Navigate Vision and Language Navigation

Paper
Code

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

1 code implementation • 14 Oct 2018 • Chenxu Luo, Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia, Alan Yuille

Performance on the five tasks of depth estimation, optical flow estimation, odometry, moving object segmentation and scene flow estimation shows that our approach outperforms other SoTA methods.

Ranked #1 on Scene Flow Estimation on KITTI 2015 Scene Flow Training

Depth Estimation Optical Flow Estimation +2

Paper
Code

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

1 code implementation • CVPR 2022 • Xueqing Deng, Peng Wang, Xiaochen Lian, Shawn Newsam

Notably, NightLab contains models at two levels of granularity, i.e., image and regional, and each level is composed of light adaptation and segmentation modules.

Segmentation Self-Driving Cars +1

Paper
Code

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

1 code implementation • CVPR 2022 • Wei Dong, Junsheng Wu, Yi Luo, ZongYuan Ge, Peng Wang

In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing.

Node Classification Representation Learning

Paper
Code

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

1 code implementation • CVPR 2022 • Caiyuan Zheng, Hui Li, Seon-Min Rhee, Seungju Han, Jae-Joon Han, Peng Wang

A robust consistency regularization based semi-supervised framework is proposed for STR, which can effectively solve the instability issue due to domain inconsistency between synthetic and real images.

Scene Text Recognition

Paper
Code

Self-Supervised Node Representation Learning via Node-to-Neighbourhood Alignment

1 code implementation • 9 Feb 2023 • Wei Dong, Dawei Yan, Peng Wang

Considering the excessive memory overheads of contrastive learning, we further propose a negative-free solution, where the main contribution is a Graph Signal Decorrelation (GSD) constraint to avoid representation collapse and over-smoothing.

Contrastive Learning Node Classification +1

Paper
Code

Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction

1 code implementation • 20 Aug 2023 • Zeyu Han, YuHan Wang, Luping Zhou, Peng Wang, Binyu Yan, Jiliu Zhou, Yan Wang, Dinggang Shen

To obtain high-quality positron emission tomography (PET) scans while reducing radiation exposure to the human body, various approaches have been proposed to reconstruct standard-dose PET (SPET) images from low-dose PET (LPET) images.

Paper
Code

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation

1 code implementation • CVPR 2022 • Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu

Pre-training has been adopted in a few recent works for Vision-and-Language Navigation (VLN).

Ranked #4 on Visual Navigation on R2R

Decision Making Language Modelling +3

Paper
Code

Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

1 code implementation • NeurIPS 2023 • Wei Dong, Dawei Yan, Zhijun Lin, Peng Wang

Consequently, effectively adapting large pre-trained models to downstream tasks in an efficient manner has become a prominent research area.

Image Classification Transfer Learning

Paper
Code

FastRE: Towards Fast Relation Extraction with Convolutional Encoder and Improved Cascade Binary Tagging Framework

1 code implementation • 5 May 2022 • Guozheng Li, Xu Chen, Peng Wang, Jiafeng Xie, Qiqing Luo

Recent work for extracting relations from texts has achieved excellent performance.

Language Modelling Relation +2

Paper
Code

A geometry-aware deep network for depth estimation in monocular endoscopy

1 code implementation • 20 Apr 2023 • Yongming Yang, Shuwei Shao, Tao Yang, Peng Wang, Zhuo Yang, Chengdong Wu, Hao Liu

To address this issue, we introduce a gradient loss to penalize ambiguous edge fluctuations around stepped edge structures and a normal loss to explicitly express the sensitivity to frequently small structures, and propose a geometric consistency loss to spread the spatial information across the sample grids to constrain the global geometric anatomy structures.

3D Reconstruction Anatomy +1

Paper
Code

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

1 code implementation • CVPR 2023 • Fei Zhou, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Prototypical Network is a popular few-shot solver that aims at establishing a feature metric generalizable to novel few-shot classification (FSC) tasks using deep neural networks.

cross-domain few-shot learning Knowledge Distillation

Paper
Code

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

1 code implementation • 15 Dec 2020 • Feixiang Lu, Zongdai Liu, Hui Miao, Peng Wang, Liangjun Zhang, Ruigang Yang, Dinesh Manocha, Bin Zhou

For autonomous driving, the dynamics and states of vehicle parts such as doors, the trunk, and the bonnet can provide meaningful semantic information and interaction states, which are essential to ensuring the safety of the self-driving vehicle.

Autonomous Driving Data Augmentation +3

Paper
Code

A Simple and Robust Correlation Filtering Method for Text-based Person Search

1 code implementation • ECCV 2022 • Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu

Text-based person search aims to associate pedestrian images with natural language descriptions.

Ranked #8 on Text based Person Retrieval on ICFG-PEDES

Denoising Person Search +3

Paper
Code

SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting

1 code implementation • 9 May 2018 • Yuhang Song, Chao Yang, Yeji Shen, Peng Wang, Qin Huang, C. -C. Jay Kuo

In this paper, we focus on image inpainting task, aiming at recovering the missing area of an incomplete image given the context information.

Image Inpainting Interactive Segmentation +2

Paper
Code

Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer Learning

2 code implementations • 7 Dec 2020 • Haokui Zhang, Ying Li, Yenan Jiang, Peng Wang, Qiang Shen, Chunhua Shen

In contrast to previous approaches, we do not impose restrictions over the source data sets, in which they do not have to be collected by the same sensors as the target data sets.

Classification General Classification +1

Paper
Code

Flash: Efficient Dynamic Routing for Offchain Networks

2 code implementations • 14 Feb 2019 • Peng Wang, Hong Xu, Xin Jin, Tao Wang

Mice payments are directly sent by looking up a routing table with a few precomputed paths to reduce probing overhead.

Networking and Internet Architecture

Paper
Code

Person Re-identification in Aerial Imagery

1 code implementation • 14 Aug 2019 • Shizhou Zhang, Qi Zhang, Yifei Yang, Xing Wei, Peng Wang, Bingliang Jiao, Yanning Zhang

Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained in an end-to-end fashion efficiently.

object-detection Object Detection +1

Paper
Code

Knowledge Rumination for Pre-trained Language Models

1 code implementation • 15 May 2023 • Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs.

Language Modelling

Paper
Code

AdaXpert: Adapting Neural Architecture for Growing Data

1 code implementation • 1 Jul 2021 • Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.

Paper
Code

Domain Adaptation for Deep Entity Resolution: A Design Space Exploration

1 code implementation • SIGMOD/PODS 2022 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Chengliang Chai, Guoliang Li, Ruixue Fan, Xiaoyong Du

Entity resolution (ER) is a core problem of data integration.

Ranked #2 on Entity Resolution on WDC Watches-small

Domain Adaptation Entity Resolution

Paper
Code

Learning Conditional Attributes for Compositional Zero-Shot Learning

1 code implementation • CVPR 2023 • Qingsheng Wang, Lingqiao Liu, Chenchen Jing, Hao Chen, Guoqiang Liang, Peng Wang, Chunhua Shen

Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts based on learned concepts such as attribute-object combinations.

Ranked #1 on Compositional Zero-Shot Learning on MIT-States

Attribute Compositional Zero-Shot Learning

Paper
Code

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration

1 code implementation • SIGMOD/PODS 2023 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

The widely used practice is to build task-specific or even dataset-specific solutions, which are hard to generalize and disable the opportunities of knowledge sharing that can be learned from different datasets and multiple tasks.

Entity Resolution Zero-Shot Learning

Paper
Code

Structural Recurrent Neural Network for Traffic Speed Prediction

1 code implementation • 18 Feb 2019 • Youngjoo Kim, Peng Wang, Lyudmila Mihaylova

We use a graph of a vehicular road network with recurrent neural networks (RNNs) to infer the interaction between adjacent road segments as well as the temporal dynamics.

Time Series Time Series Analysis +1

Paper
Code

Scalable Learning With a Structural Recurrent Neural Network for Short-Term Traffic Prediction

1 code implementation • 3 Mar 2021 • Youngjoo Kim, Peng Wang, Lyudmila Mihaylova

With the real traffic speed data measured in the city of Santander, we demonstrate the proposed SRNN outperforms the image-based approaches using the capsule network (CapsNet) by 14.1% and the convolutional neural network (CNN) by 5.87%, respectively, in terms of root mean squared error (RMSE).

Paper
Code

Structured Multimodal Attentions for TextVQA

2 code implementations • 1 Jun 2020 • Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton Van Den Hengel, Qi Wu

In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above.

Graph Attention Optical Character Recognition (OCR) +3

Paper
Code

USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields

1 code implementation • 4 Oct 2023 • Moyang Li, Peng Wang, Lingzhe Zhao, Bangyan Liao, Peidong Liu

USB-NeRF is able to correct rolling shutter distortions and recover accurate camera motion trajectory simultaneously under the framework of NeRF, by modeling the physical image formation process of a RS camera.

Image Generation Motion Estimation +2

Paper
Code

A Holistic Representation Guided Attention Network for Scene Text Recognition

1 code implementation • 2 Apr 2019 • Lu Yang, Fan Dang, Peng Wang, Hui Li, Zhen Li, Yanning Zhang

In this work, we propose a simple yet strong approach for scene text recognition.

Irregular Text Recognition Scene Text Recognition

Paper
Code

Ground-to-Aerial Person Search: Benchmark Dataset and Approach

1 code implementation • 24 Aug 2023 • Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang

In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images of 260,559 annotated bounding boxes for 2,644 identities appearing in both of the UAVs and ground surveillance cameras.

Knowledge Distillation Person Search

Paper
Code

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

1 code implementation • 28 Mar 2024 • Wei Dong, Xing Zhang, Bihui Chen, Dawei Yan, Zhijun Lin, Qingsen Yan, Peng Wang, Yang Yang

Parameter-efficient fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks by learning a minimal set of new adaptation parameters while preserving the frozen majority of pre-trained parameters.

Image Classification

Paper
Code

SCL-VI: Self-supervised Context Learning for Visual Inspection of Industrial Defects

1 code implementation • 11 Nov 2023 • Peng Wang, Haiming Yao, Wenyong Yu

Current unsupervised models struggle to strike a balance between detecting texture and object defects, lacking the capacity to discern latent representations and intricate features.

Self-Supervised Learning

Paper
Code

A Capsule Network for Traffic Speed Prediction in Complex Road Networks

1 code implementation • 23 Jul 2018 • Youngjoo Kim, Peng Wang, Yifei Zhu, Lyudmila Mihaylova

Traffic flow data from induction loop sensors are essentially a time series, which is also spatially related to traffic in different road segments.

Time Series Time Series Forecasting

Paper
Code

AutoRemover: Automatic Object Removal for Autonomous Driving Videos

1 code implementation • 28 Nov 2019 • Rong Zhang, Wei Li, Peng Wang, Chenye Guan, Jin Fang, Yuhang Song, Jinhui Yu, Baoquan Chen, Weiwei Xu, Ruigang Yang

To deal with shadows, we build up an autonomous driving shadow dataset and design a deep neural network to detect shadows automatically.

Autonomous Driving Object +1

Paper
Code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation • Findings (NAACL) 2022 • Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.

Model Compression

Paper
Code

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

1 code implementation • 1 Jun 2023 • Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Second, it allows us to better understand deep representation learning by elucidating the linear progressive separation and concentration of representations from shallow to deep layers.

Representation Learning

Paper
Code

Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples

1 code implementation • 11 May 2018 • Xiu-Shen Wei, Peng Wang, Lingqiao Liu, Chunhua Shen, Jianxin Wu

To solve this problem, we propose an end-to-end trainable deep network which is inspired by the state-of-the-art fine-grained recognition model and is tailored for the FSFG task.

Few-Shot Learning Fine-Grained Image Recognition

Paper
Code

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 code implementation • 24 Jul 2023 • Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e.g., language descriptions and synchronous audios.

Anomaly Detection Retrieval +2

Paper
Code

Link Prediction in Social Networks: the State-of-the-Art

2 code implementations • 19 Nov 2014 • Peng Wang, Baowen Xu, Yurong Wu, Xiaoyu Zhou

Finally, some future challenges of the link prediction in social networks are discussed.

Social and Information Networks Physics and Society

Paper
Code

CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning

1 code implementation • 15 Mar 2024 • Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang

Large pre-trained VLMs like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in CL, but they focus on closed-set CL in a single domain dataset.

Class Incremental Learning Incremental Learning +1

Paper
Code

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

1 code implementation • 19 Sep 2022 • Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon.

Multi-class Classification Representation Learning +1

Paper
Code

Watch out Venomous Snake Species: A Solution to SnakeCLEF2023

1 code implementation • 19 Jul 2023 • Feiran Hu, Peng Wang, Yangyang Li, Chenlong Duan, Zijian Zhu, Fei Wang, Faen Zhang, Yong Li, Xiu-Shen Wei

The SnakeCLEF2023 competition aims at the development of advanced algorithms for snake species identification through the analysis of images and accompanying metadata.

Data Augmentation

Paper
Code

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

1 code implementation • 6 Nov 2023 • Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

To the best of our knowledge, this is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.

Feature Compression Multi-class Classification +2

Paper
Code

Learning Graph Convolutional Networks for Multi-Label Recognition and Applications

1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2021 • ZhaoMin Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo

The task of multi-label image recognition is to predict a set of object labels that are present in an image.

Multi-Label Classification Multi-label Image Recognition with Partial Labels

Paper
Code

Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

1 code implementation • 11 Jun 2022 • Peng Wang, Huikang Liu, Anthony Man-Cho So, Laura Balzano

The K-subspaces (KSS) method is a generalization of the K-means method for subspace clustering.

Clustering

Paper
Code

Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval

1 code implementation • 21 Nov 2023 • Xiu-Shen Wei, Yang shen, Xuhao Sun, Peng Wang, Yuxin Peng

Our work focuses on tackling large-scale fine-grained image retrieval as ranking the images depicting the concept of interests (i.e., the same sub-category labels) highest based on the fine-grained details in the query.

Attribute Deep Hashing +2

Paper
Code

Continual Referring Expression Comprehension via Dual Modular Memorization

1 code implementation • 25 Nov 2023 • Heng Tao Shen, Cheng Chen, Peng Wang, Lianli Gao, Meng Wang, Jingkuan Song

In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC, where a model is learning on a stream of incoming tasks.

Memorization Referring Expression +1

Paper
Code

Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network

no code implementations • 5 Jun 2018 • Lei Zhang, Peng Wang, Chunhua Shen, Lingqiao Liu, Wei Wei, Yanning Zhang, Anton Van Den Hengel

In this study, we revisit this problem from an orthogonal view, and propose a novel learning strategy to maximize the pixel-wise fitting capacity of a given lightweight network architecture.

Image Super-Resolution

Paper
Add Code

Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving

no code implementations • 17 Apr 2018 • Jiaolong Xu, Peng Wang, Heng Yang, Antonio M. López

Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection.

Autonomous Driving object-detection +2

Paper
Add Code

View Extrapolation of Human Body from a Single Image

no code implementations • CVPR 2018 • Hao Zhu, Hao Su, Peng Wang, Xun Cao, Ruigang Yang

We study how to synthesize novel views of human body from a single image.

Image Generation

Paper
Add Code

Occlusion Aware Unsupervised Learning of Optical Flow

no code implementations • CVPR 2018 • Yang Wang, Yi Yang, Zhenheng Yang, Liang Zhao, Peng Wang, Wei Xu

Especially on KITTI dataset where abundant unlabeled samples exist, our unsupervised method outperforms its counterpart trained with supervised learning.

Optical Flow Estimation

Paper
Add Code

Visual Question Answering with Memory-Augmented Networks

no code implementations • CVPR 2018 • Chao Ma, Chunhua Shen, Anthony Dick, Qi Wu, Peng Wang, Anton Van Den Hengel, Ian Reid

In this paper, we exploit a memory-augmented neural network to predict accurate answers to visual questions, even when those answers occur rarely in the training set.

Question Answering Visual Question Answering

Paper
Add Code

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

no code implementations • CVPR 2018 • Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam

Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction.

Ranked #85 on Instance Segmentation on COCO test-dev (using extra training data)

Instance Segmentation Object +4

Paper
Add Code

Fine-grained Pattern Matching Over Streaming Time Series

no code implementations • 27 Oct 2017 • Rong Kang, Chen Wang, Peng Wang, Yuting Ding, Jian-Min Wang

Hence, we formulate a new problem, called "fine-grained pattern matching", which allows users to specify varied granularities of matching deviation to different segments of a given pattern, and fuzzy regions for adaptive breakpoints determination between consecutive segments.

Time Series Time Series Analysis

Paper
Add Code

Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning

no code implementations • CVPR 2018 • Qi Wu, Peng Wang, Chunhua Shen, Ian Reid, Anton Van Den Hengel

The Visual Dialogue task requires an agent to engage in a conversation about an image with a human.

Ranked #4 on Visual Dialog on VisDial v0.9 val

Question Answering Visual Dialog +1

Paper
Add Code

Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency

no code implementations • 10 Nov 2017 • Zhenheng Yang, Peng Wang, Wei Xu, Liang Zhao, Ramakant Nevatia

Learning to reconstruct depths in a single image by watching unlabeled videos via deep convolutional network (DCN) is attracting significant attention in recent years.

Depth Estimation

Paper
Add Code

Towards End-to-End Car License Plates Detection and Recognition with Deep Neural Networks

no code implementations • 26 Sep 2017 • Hui Li, Peng Wang, Chunhua Shen

In contrast to existing approaches which take license plate detection and recognition as two separate tasks and settle them step by step, our method jointly solves these two tasks by a single network.

License Plate Detection

Paper
Add Code

Joint Multi-Person Pose Estimation and Semantic Part Segmentation

no code implementations • CVPR 2017 • Fangting Xia, Peng Wang, Xianjie Chen, Alan Yuille

To refine part segments, the refined pose and the original part potential are integrated through a Part FCN, where the skeleton feature from pose serves as additional regularization cues for part segments.

Ranked #5 on Human Part Segmentation on PASCAL-Part

Human Detection Multi-Person Pose Estimation

Paper
Add Code

FVQA: Fact-based Visual Question Answering

no code implementations • 17 Jun 2016 • Peng Wang, Qi Wu, Chunhua Shen, Anton Van Den Hengel, Anthony Dick

We evaluate several baseline models on the FVQA dataset, and describe a novel model which is capable of reasoning about an image on the basis of supporting facts.

Ranked #2 on Visual Question Answering (VQA) on F-VQA

Common Sense Reasoning Question Answering +1

Paper
Add Code

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

no code implementations • ICCV 2017 • Hui Li, Peng Wang, Chunhua Shen

In this work, we jointly address the problem of text detection and recognition in natural scene images based on convolutional recurrent neural networks.

Image Cropping Text Detection +1

Paper
Add Code

A Fast and Compact Saliency Score Regression Network Based on Fully Convolutional Network

no code implementations • 2 Feb 2017 • Xuanyang Xi, Yongkang Luo, Fengfu Li, Peng Wang, Hong Qiao

In this paper, we tackle this problem by proposing a fast and compact saliency score regression network which employs fully convolutional network, a special deep convolutional neural network, to estimate the saliency of objects in images.

regression Saliency Detection

Paper
Add Code

Compositional Model based Fisher Vector Coding for Image Classification

1 code implementation • 16 Jan 2016 • Lingqiao Liu, Peng Wang, Chunhua Shen, Lei Wang, Anton Van Den Hengel, Chao Wang, Heng Tao Shen

To handle this limitation, in this paper we break the convention which assumes that a local feature is drawn from one of few Gaussian distributions.

Classification General Classification +1

Paper
Code

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

no code implementations • 9 Mar 2016 • Qi Wu, Chunhua Shen, Anton Van Den Hengel, Peng Wang, Anthony Dick

Much recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).

Ranked #9 on Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 open ended

General Knowledge Image Captioning +2

Paper
Add Code

The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions

no code implementations • CVPR 2017 • Peng Wang, Qi Wu, Chunhua Shen, Anton Van Den Hengel

To train a method to perform even one of these operations accurately from {image, question, answer} tuples would be challenging, but to aim to achieve them all with a limited set of such training data seems ambitious at best.

BIG-bench Machine Learning Question Answering +1

Paper
Add Code

Exploiting Temporal Information for DCNN-based Fine-Grained Object Classification

no code implementations • 1 Aug 2016 • ZongYuan Ge, Chris McCool, Conrad Sanderson, Peng Wang, Lingqiao Liu, Ian Reid, Peter Corke

Fine-grained classification is a relatively new field that has concentrated on using information from a single image, while ignoring the enormous potential of using video data to improve classification.

Classification General Classification

Paper
Add Code

DOC: Deep OCclusion Estimation From a Single Image

no code implementations • 20 Nov 2015 • Peng Wang, Alan Yuille

In this paper we propose a deep network architecture, called DOC, which acts on a single image, detects object boundaries and estimates the border ownership (i.e. which side of the boundary is foreground and which is background).

Occlusion Estimation

Paper
Add Code

Where to Focus: Query Adaptive Matching for Instance Retrieval Using Convolutional Feature Maps

no code implementations • 22 Jun 2016 • Jiewei Cao, Lingqiao Liu, Peng Wang, Zi Huang, Chunhua Shen, Heng Tao Shen

Instance retrieval requires one to search for images that contain a particular object within a large corpus.

Retrieval

Paper
Add Code

Pushing the Limits of Deep CNNs for Pedestrian Detection

no code implementations • 15 Mar 2016 • Qichang Hu, Peng Wang, Chunhua Shen, Anton Van Den Hengel, Fatih Porikli

In this work, we show that by re-using the convolutional feature maps (CFMs) of a deep convolutional neural network (DCNN) model as image features to train an ensemble of boosted decision models, we are able to achieve the best reported accuracy without using specially designed learning algorithms.

Occlusion Handling Optical Flow Estimation +1

Paper
Add Code

Large-scale Binary Quadratic Optimization Using Semidefinite Relaxation and Applications

no code implementations • 27 Nov 2014 • Peng Wang, Chunhua Shen, Anton Van Den Hengel, Philip H. S. Torr

Two standard relaxation methods are widely used for solving general BQPs--spectral methods and semidefinite programming (SDP), each with their own advantages and disadvantages.

Clustering Image Segmentation +2

Paper
Add Code

Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources

no code implementations • CVPR 2016 • Qi Wu, Peng Wang, Chunhua Shen, Anthony Dick, Anton Van Den Hengel

Priming a recurrent neural network with this combined information, and the submitted question, leads to a very flexible visual question answering approach.

General Knowledge Question Answering +1

Paper
Add Code

Zoom Better to See Clearer: Human and Object Parsing with Hierarchical Auto-Zoom Net

no code implementations • 21 Nov 2015 • Fangting Xia, Peng Wang, Liang-Chieh Chen, Alan L. Yuille

To tackle these difficulties, we propose a "Hierarchical Auto-Zoom Net" (HAZN) for object part parsing which adapts to the local scales of objects and parts.

Ranked #8 on Human Part Segmentation on PASCAL-Part

Paper
Add Code

Hi Detector, What's Wrong with that Object? Identifying Irregular Object From Images by Modelling the Detection Score Distribution

no code implementations • 14 Feb 2016 • Peng Wang, Lingqiao Liu, Chunhua Shen, Anton Van Den Hengel, Heng Tao Shen

To address this problem, we propose a novel approach by inspecting the distribution of the detection scores at multiple image regions based on the detector trained from the "regular object" and "other objects".

Gaussian Processes Object

Paper
Add Code

Order-aware Convolutional Pooling for Video Based Action Recognition

no code implementations • 31 Jan 2016 • Peng Wang, Lingqiao Liu, Chunhua Shen, Heng Tao Shen

Most video based action recognition approaches create the video-level representation by temporally pooling the features extracted at each frame.

Action Recognition Temporal Action Localization

Paper
Add Code

Pose-Guided Human Parsing with Deep Learned Features

no code implementations • 17 Aug 2015 • Fangting Xia, Jun Zhu, Peng Wang, Alan Yuille

Parsing human body into semantic regions is crucial to human-centric analysis.

Human Parsing

Paper
Add Code

Explicit Knowledge-based Reasoning for Visual Question Answering

no code implementations • 9 Nov 2015 • Peng Wang, Qi Wu, Chunhua Shen, Anton Van Den Hengel, Anthony Dick

We describe a method for visual question answering which is capable of reasoning about contents of an image on the basis of information extracted from a large-scale knowledge base.

Question Answering Visual Question Answering

Paper
Add Code

Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference

no code implementations • 20 Apr 2014 • Peng Wang, Chunhua Shen, Anton Van Den Hengel, Philip Torr

We propose a Branch-and-Cut (B&C) method for solving general MAP-MRF inference problems.

Paper
Add Code

Implementation of Training Convolutional Neural Networks

no code implementations • 3 Jun 2015 • Tianyi Liu, Shuangsang Fang, Yuehui Zhao, Peng Wang, Jun Zhang

Deep learning refers to the shining branch of machine learning that is based on learning levels of representations.

BIG-bench Machine Learning Face Recognition

Paper
Add Code

Joint Object and Part Segmentation using Deep Learned Potentials

no code implementations • ICCV 2015 • Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan Yuille

Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision.

Object Segmentation +1

Paper
Add Code

Temporal Pyramid Pooling Based Convolutional Neural Networks for Action Recognition

no code implementations • 4 Mar 2015 • Peng Wang, Yuanzhouhan Cao, Chunhua Shen, Lingqiao Liu, Heng Tao Shen

One challenge is that video contains a varying number of frames which is incompatible to the standard input format of CNNs.

Action Recognition Image Classification +1

Paper
Add Code

Efficient SDP Inference for Fully-connected CRFs Based on Low-rank Decomposition

no code implementations • CVPR 2015 • Peng Wang, Chunhua Shen, Anton Van Den Hengel

Conditional Random Fields (CRF) have been widely used in a variety of computer vision tasks.

Paper
Add Code

A Fast Semidefinite Approach to Solving Binary Quadratic Problems

1 code implementation • CVPR 2013 • Peng Wang, Chunhua Shen, Anton Van Den Hengel

Second, compared with conventional SDP methods, the new SDP formulation leads to a significantly more efficient and scalable dual optimization approach, which has the same degree of complexity as spectral methods.

Clustering Image Segmentation +2

Paper
Code

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

no code implementations • 27 Jun 2018 • Zhenheng Yang, Peng Wang, Yang Wang, Wei Xu, Ram Nevatia

The four types of information, i.e. 2D flow, camera pose, segment mask and depth maps, are integrated into a differentiable holistic 3D motion parser (HMP), where per-pixel 3D motion for rigid background and moving objects are recovered.

Ranked #2 on Scene Flow Estimation on KITTI 2015 Scene Flow Training

Depth And Camera Motion Optical Flow Estimation +1

Paper
Add Code

Towards Effective Deep Embedding for Zero-Shot Learning

no code implementations • 30 Aug 2018 • Lei Zhang, Peng Wang, Lingqiao Liu, Chunhua Shen, Wei Wei, Yanning Zhang, Anton Van Den Hengel

Towards this goal, we present a simple but effective two-branch network to simultaneously map semantic descriptions and visual samples into a joint space, on which visual embeddings are forced to regress to their class-level semantic embeddings and the embeddings crossing classes are required to be distinguishable by a trainable classifier.

Zero-Shot Learning

Paper
Add Code

RGB-D Based Action Recognition with Light-weight 3D Convolutional Networks

no code implementations • 24 Nov 2018 • Haokui Zhang, Ying Li, Peng Wang, Yu Liu, Chunhua Shen

Different from RGB videos, depth data in RGB-D videos provide key complementary information for tristimulus visual data which potentially could achieve accuracy improvement for action recognition.

Action Recognition Temporal Action Localization

Paper
Add Code

ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

no code implementations • CVPR 2019 • Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang

Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress towards its 3D pose and shape based on a deformable 3D car model with or without using semantic keypoints.

3D Car Instance Understanding Autonomous Driving

Paper
Add Code

Visual Question Answering as Reading Comprehension

no code implementations • CVPR 2019 • Hui Li, Peng Wang, Chunhua Shen, Anton Van Den Hengel

In contrast to struggling on multimodal feature fusion, in this paper, we propose to unify all the input information by natural language so as to convert VQA into a machine reading comprehension problem.

Common Sense Reasoning General Knowledge +4

Paper
Add Code

Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks

no code implementations • CVPR 2019 • Peng Wang, Qi Wu, Jiewei Cao, Chunhua Shen, Lianli Gao, Anton Van Den Hengel

Being composed of node attention component and edge attention component, the proposed graph attention mechanism explicitly represents inter-object relationships, and properties with a flexibility and power impossible with competing approaches.

Graph Attention Object +2

Paper
Add Code

CVTE at IJCNLP-2017 Task 1: Character Checking System for Chinese Grammatical Error Diagnosis Task

no code implementations • IJCNLP 2017 • Xi-An Li, Peng Wang, Suixue Wang, Guanyu Jiang, Tianyuan You

Grammatical error diagnosis is an important task in natural language processing.

Position

Paper
Add Code

SURGE: Surface Regularized Geometry Estimation from a Single Image

no code implementations • NeurIPS 2016 • Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan L. Yuille

This paper introduces an approach to regularize 2.5D surface normal and depth predictions at each pixel given a single input image.

Paper
Add Code

RPC: A Large-Scale Retail Product Checkout Dataset

no code implementations • 22 Jan 2019 • Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, Lingqiao Liu

The main challenge of this problem comes from the large scale and the fine-grained nature of the product categories as well as the difficulty for collecting training images that reflect the realistic checkout scenarios due to continuous update of the products.

Paper
Add Code

Supervised Kernel Descriptors for Visual Recognition

no code implementations • CVPR 2013 • Peng Wang, Jingdong Wang, Gang Zeng, Weiwei Xu, Hongbin Zha, Shipeng Li

In visual recognition tasks, the design of low level image feature representation is fundamental.

General Classification Image Classification

Paper
Add Code

Towards Unified Depth and Semantic Prediction From a Single Image

no code implementations • CVPR 2015 • Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, Alan L. Yuille

By allowing for interactions between the depth and semantic information, the joint network provides more accurate depth prediction than a state-of-the-art CNN trained solely for depth prediction [5].

Depth Estimation Depth Prediction +1

Paper
Add Code

What's Wrong With That Object? Identifying Images of Unusual Objects by Modelling the Detection Score Distribution

no code implementations • CVPR 2016 • Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, Anton Van Den Hengel, Heng Tao Shen

The key observation motivating our approach is that "regular object" images, "unusual object" images and "other objects" images exhibit different region-level scores in terms of both the score values and the spatial distributions.

Gaussian Processes Object +2

Paper
Add Code

Multi-Attention Network for One Shot Learning

no code implementations • CVPR 2017 • Peng Wang, Lingqiao Liu, Chunhua Shen, Zi Huang, Anton Van Den Hengel, Heng Tao Shen

One-shot learning is a challenging problem where the aim is to recognize a class identified by a single training image.

One-Shot Learning TAG +1

Paper
Add Code

Semantic Clustering and Convolutional Neural Network for Short Text Categorization

no code implementations • IJCNLP 2015 • Peng Wang, Jiaming Xu, Bo Xu, Cheng-Lin Liu, Heng Zhang, Fangyuan Wang, Hong-Wei Hao

Clustering Learning Word Embeddings +2

Paper
Add Code

Pixel-aware Deep Function-mixture Network for Spectral Super-Resolution

no code implementations • 24 Mar 2019 • Lei Zhang, Zhiqiang Lang, Peng Wang, Wei Wei, Shengcai Liao, Ling Shao, Yanning Zhang

To address this problem, we propose a pixel-aware deep function-mixture network for SSR, which is composed of a new class of modules, termed function-mixture (FM) blocks.

Spectral Super-Resolution Super-Resolution

Paper
Add Code

Vehicle Re-identification in Aerial Imagery: Dataset and Approach

no code implementations • ICCV 2019 • Peng Wang, Bingliang Jiao, Lu Yang, Yifei Yang, Shizhou Zhang, Wei Wei, Yanning Zhang

It is capable of explicitly detecting discriminative parts for each specific vehicle and significantly outperforms the evaluated baselines and state-of-the-art vehicle ReID approaches.

Vehicle Re-Identification

Paper
Add Code

Towards End-to-End Text Spotting in Natural Scenes

no code implementations • 14 Jun 2019 • Peng Wang, Hui Li, Chunhua Shen

Text spotting in natural scene images is of great importance for many image understanding tasks.

Image Cropping Text Detection +1

Paper
Add Code

Evaluating Local Geometric Feature Representations for 3D Rigid Data Matching

no code implementations • 29 Jun 2019 • Jiaqi Yang, Siwen Quan, Peng Wang, Yanning Zhang

The outcomes present interesting findings that may shed new light on this community and provide complementary perspectives to existing evaluations on the topic of local geometric feature description.

Object Recognition Point Cloud Registration +1

Paper
Add Code

A Performance Evaluation of Correspondence Grouping Methods for 3D Rigid Data Matching

no code implementations • 5 Jul 2019 • Jiaqi Yang, Ke Xian, Peng Wang, Yanning Zhang

Seeking consistent point-to-point correspondences between 3D rigid data (point clouds, meshes, or depth maps) is a fundamental problem in 3D computer vision.

3D Object Recognition Point Cloud Registration +1

Paper
Add Code

EPNAS: Efficient Progressive Neural Architecture Search

no code implementations • 7 Jul 2019 • Yanqi Zhou, Peng Wang, Sercan Arik, Haonan Yu, Syed Zawad, Feng Yan, Greg Diamos

In this paper, we propose Efficient Progressive Neural Architecture Search (EPNAS), a neural architecture search (NAS) that efficiently handles large search space through a novel progressive search policy with performance prediction based on REINFORCE (Williams, 1992).

Neural Architecture Search

Paper
Add Code

V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

no code implementations • 29 Jul 2019 • Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton Van Den Hengel

One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced.

Visual Reasoning

Paper
Add Code

A Method to Learn Embedding of a Probabilistic Medical Knowledge Graph: Algorithm Development

no code implementations • 2 Sep 2019 • Linfeng Li, Peng Wang, Yao Wang, Jinpeng Jiang, Buzhou Tang, Jun Yan, Sheng-Hui Wang, Yu-Ting Liu

This paper proposes an algorithm named PrTransH to learn embedding vectors from medical knowledge based on real-world EMR data.

Knowledge Graphs Link Prediction +1

Paper
Add Code

Efficient Automatic Meta Optimization Search for Few-Shot Learning

no code implementations • 6 Sep 2019 • Xinyue Zheng, Peng Wang, Qigang Wang, Zhongchao Shi, Feiyu Xu

NAS automatically generates and evaluates the meta-learner's architecture for few-shot learning problems, while the meta-learner uses a meta-learning algorithm to optimize its parameters based on the distribution of learning tasks.

Few-Shot Learning Neural Architecture Search

Paper
Add Code

Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation

no code implementations • 25 Oct 2019 • Shizhou Zhang, Yifei Yang, Peng Wang, Guoqiang Liang, Xiuwei Zhang, Yanning Zhang

The problem of cross-modality person re-identification has been receiving increasing attention recently, due to its practical significance.

Cross-Modality Person Re-identification Person Re-Identification

Paper
Add Code

Resilient Load Restoration in Microgrids Considering Mobile Energy Storage Fleets: A Deep Reinforcement Learning Approach

no code implementations • 6 Nov 2019 • Shuhan Yao, Jiuxiang Gu, Peng Wang, Tianyang Zhao, Huajun Zhang, Xiaochuan Liu

Mobile energy storage systems (MESSs) provide mobility and flexibility to enhance distribution system resilience.

Scheduling

Paper
Add Code

CSPN++: Learning Context and Resource Aware Convolutional Spatial Propagation Networks for Depth Completion

no code implementations • 13 Nov 2019 • Xinjing Cheng, Peng Wang, Chenye Guan, Ruigang Yang

In this paper, we propose CSPN++, which further improves its effectiveness and efficiency by learning adaptive convolutional kernel sizes and the number of iterations for the propagation, thus the context and computational resources needed at each pixel could be dynamically assigned upon requests.

Ranked #2 on Stereo-LiDAR Fusion on KITTI Depth Completion Validation

Depth Completion Stereo-LiDAR Fusion

Paper
Add Code

To Balance or Not to Balance: A Simple-yet-Effective Approach for Learning with Long-Tailed Distributions

no code implementations • 10 Dec 2019 • Jun-Jie Zhang, Lingqiao Liu, Peng Wang, Chunhua Shen

Such imbalanced distribution causes a great challenge for learning a deep neural network, which can be boiled down into a dilemma: on the one hand, we prefer to increase the exposure of tail class samples to avoid the excessive dominance of head classes in the classifier training.

Auxiliary Learning Self-Supervised Learning

Paper
Add Code

Using Sampled Network Data With The Autologistic Actor Attribute Model

1 code implementation • 30 Jan 2020 • Alex D. Stivala, H. Colin Gallagher, David A. Rolls, Peng Wang, Garry L. Robins

Social science research increasingly benefits from statistical methods for understanding the structured nature of social life, including for social network data.

Social and Information Networks Methodology

Paper
Code

Deep Domain Adaptive Object Detection: a Survey

no code implementations • 17 Feb 2020 • Wanyi Li, Fuyu Li, Yongkang Luo, Peng Wang, Jia Sun

Deep learning (DL) based object detection has achieved great progress.

Domain Adaptation Object +2

Paper
Add Code

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

no code implementations • CVPR 2020 • Zhenfang Chen, Peng Wang, Lin Ma, Kwan-Yee K. Wong, Qi Wu

To bridge the gap, we propose a new dataset for visual reasoning in the context of referring expression comprehension with two main features.

Referring Expression Referring Expression Comprehension +1

Paper
Add Code

TEDL: A Text Encryption Method Based on Deep Learning

1 code implementation • 9 Mar 2020 • Xiang Li, Peng Wang

Firstly, both communication parties establish a word vector table by training a deep learning model according to specified hyperparameters.

Paper
Code

Toward Interpretability of Dual-Encoder Models for Dialogue Response Suggestions

no code implementations • 2 Mar 2020 • Yitong Li, Dianqi Li, Sushant Prakash, Peng Wang

To improve the interpretability in the dual encoder models, we design a novel regularization loss to minimize the mutual information between unimportant words and desired labels, in addition to the original attention method, so that important words are emphasized while unimportant words are de-emphasized.

Word Embeddings

Paper
Add Code

Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

no code implementations • 26 Apr 2020 • Xinyue Zheng, Peng Wang, Qigang Wang, Zhongchao Shi

Prior work on standardized science exams requires support from a large text corpus, such as a targeted science corpus from Wikipedia or SimpleWikipedia.

Language Modelling Large Language Model +3

Paper
Add Code

A Robust Attentional Framework for License Plate Recognition in the Wild

no code implementations • 6 Jun 2020 • Linjiang Zhang, Peng Wang, Hui Li, Zhen Li, Chunhua Shen, Yanning Zhang

On the other hand, the 2D attention-based license plate recognizer with an Xception-based CNN encoder is capable of recognizing license plates with different patterns under various scenarios accurately and robustly.

Image Generation License Plate Recognition

Paper
Add Code

Non-Convex Exact Community Recovery in Stochastic Block Model

1 code implementation • 29 Jun 2020 • Peng Wang, Zirui Zhou, Anthony Man-Cho So

Community detection in graphs that are generated according to stochastic block models (SBMs) has received much attention lately.

Community Detection Stochastic Block Model

Paper
Code

ODE-CNN: Omnidirectional Depth Extension Networks

no code implementations • 3 Jul 2020 • Xinjing Cheng, Peng Wang, Yanqi Zhou, Chenye Guan, Ruigang Yang

Omnidirectional 360° cameras are proliferating rapidly in autonomous robots since they significantly enhance perception ability by widening the field of view (FoV).

Paper
Add Code

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks

no code implementations • ECCV 2020 • Yan Liu, Lingqiao Liu, Peng Wang, Pingping Zhang, Yinjie Lei

Most existing crowd counting systems rely on the availability of the object location annotation which can be expensive to obtain.

Crowd Counting

Paper
Add Code

A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model

no code implementations • ICML 2020 • Peng Wang, Zirui Zhou, Anthony Man-Cho So

In this paper, we focus on the problem of exactly recovering the communities in a binary symmetric SBM, where a graph of $n$ vertices is partitioned into two equal-sized communities and the vertices are connected with probability $p = \alpha\log(n)/n$ within communities and $q = \beta\log(n)/n$ across communities for some $\alpha>\beta>0$.

Stochastic Block Model

Paper
Add Code

Disentangled Neural Architecture Search

no code implementations • 24 Sep 2020 • Xinyue Zheng, Peng Wang, Qigang Wang, Zhongchao Shi

However, existing methods rely heavily on a black-box controller to search architectures, which suffers from the serious problem of lacking interpretability.

Neural Architecture Search

Paper
Add Code

Few-shot Action Recognition with Implicit Temporal Alignment and Pair Similarity Optimization

no code implementations • 13 Oct 2020 • Congqi Cao, Yajuan Li, Qinyi Lv, Peng Wang, Yanning Zhang

Few-shot learning aims to recognize instances from novel classes with few labeled samples, which has great value in research and application.

Few-Shot action recognition Few Shot Action Recognition +3

Paper
Add Code

Localization and delocalization of light in photonic moire lattices

no code implementations • 17 Sep 2020 • Peng Wang, Yuanlin Zheng, Xianfeng Chen, Changming Huang, Yaroslav V. Kartashov, Lluis Torner, Vladimir V. Konotop, Fangwei Ye

Moire lattices consist of two identical periodic structures overlaid with a relative rotation angle.

Optics

Paper
Add Code

Where to Look and How to Describe: Fashion Image Retrieval with an Attentional Heterogeneous Bilinear Network

no code implementations • 26 Oct 2020 • Haibo Su, Peng Wang, Lingqiao Liu, Hui Li, Zhen Li, Yanning Zhang

Fashion products typically feature in compositions of a variety of styles at different clothing parts.

Image Retrieval Retrieval

Paper
Add Code

Quantum Dynamics of Optimization Problems

no code implementations • 6 Dec 2020 • Peng Wang, Gang Xin, Yuwei Jiao

The mathematical relationship between the objective function and the wave function is established, and the quantum interpretation of the optimization problem is realized.

Paper
Add Code

Fully-Automated Liver Tumor Localization and Characterization from Multi-Phase MR Volumes Using Key-Slice ROI Parsing: A Physician-Inspired Approach

no code implementations • 13 Dec 2020 • Bolin Lai, YuHsuan Wu, Xiaoyu Bai, Xiao-Yun Zhou, Peng Wang, Jinzheng Cai, Yuankai Huo, Lingyun Huang, Yong Xia, Jing Xiao, Le Lu, Heping Hu, Adam Harrison

Using radiological scans to identify liver tumors is crucial for proper patient treatment.

Paper
Add Code

Derive Lovelock Gravity from String Theory in Cosmological Background

no code implementations • 24 Dec 2020 • Peng Wang, Houwen Wu, Haitang Yang, Shuxuan Ying

It was proved more than three decades ago that the first-order $\alpha'$ correction of string effective theory could be written as the Gauss-Bonnet term, which is the quadratic term of Lovelock gravity.

High Energy Physics - Theory General Relativity and Quantum Cosmology High Energy Physics - Phenomenology

Paper
Add Code

MeisterMorxrc at SemEval-2020 Task 9: Fine-Tune Bert and Multitask Learning for Sentiment Analysis of Code-Mixed Tweets

no code implementations • SEMEVAL 2020 • Qi Wu, Peng Wang, Chenghao Huang

Natural language processing (NLP) has been applied to various fields including text classification and sentiment analysis.

Sentiment Analysis text-classification +1

Paper
Add Code

A Collaborative Visual SLAM Framework for Service Robots

no code implementations • 5 Feb 2021 • Ming Ouyang, Xuesong Shi, Yujie Wang, Yuxin Tian, Yingzhe Shen, Dawei Wang, Peng Wang, Zhiqiang Cao

We present a collaborative visual simultaneous localization and mapping (SLAM) framework for service robots.

Retrieval Simultaneous Localization and Mapping

Paper
Add Code

M6: A Chinese Multimodal Pretrainer

no code implementations • 1 Mar 2021 • Junyang Lin, Rui Men, An Yang, Chang Zhou, Ming Ding, Yichang Zhang, Peng Wang, Ang Wang, Le Jiang, Xianyan Jia, Jie Zhang, Jianwei Zhang, Xu Zou, Zhikang Li, Xiaodong Deng, Jie Liu, Jinbao Xue, Huiling Zhou, Jianxin Ma, Jin Yu, Yong Li, Wei Lin, Jingren Zhou, Jie Tang, Hongxia Yang

In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9TB images and 292GB texts that cover a wide range of domains.

Image Generation

Paper
Add Code

Instance and Pair-Aware Dynamic Networks for Re-Identification

no code implementations • 9 Mar 2021 • Bingliang Jiao, Xin Tan, Jinghao Zhou, Lu Yang, Yunlong Wang, Peng Wang

The proposed model is composed of three main branches where a self-guided dynamic branch is constructed to strengthen instance-specific features, focusing on every single image.

Paper
Add Code

Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification

no code implementations • 9 Mar 2021 • Lu Yang, Hongbang Liu, Jinghao Zhou, Lingqiao Liu, Lei Zhang, Peng Wang, Yanning Zhang

Learning cross-view consistent feature representation is the key for accurate vehicle Re-identification (ReID), since the visual appearance of vehicles changes significantly under different viewpoints.

Vehicle Re-Identification

Paper
Add Code

Real-Time Visual Object Tracking via Few-Shot Learning

no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Peng Wang, Peixia Li, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).

Few-Shot Learning Object +2

Paper
Add Code

Higher Performance Visual Tracking with Dual-Modal Localization

no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Lei Qiao, Peng Wang, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy.

regression Visual Object Tracking +1

Paper
Add Code

Hetero-Modal Learning and Expansive Consistency Constraints for Semi-Supervised Detection from Multi-Sequence Data

no code implementations • 24 Mar 2021 • Bolin Lai, YuHsuan Wu, Xiao-Yun Zhou, Peng Wang, Le Lu, Lingyun Huang, Mei Han, Jing Xiao, Heping Hu, Adam P. Harrison

Lesion detection serves a critical role in early diagnosis and has been well explored in recent years due to methodological advances and increased data availability.

Lesion Detection

Paper
Add Code

Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

no code implementations • CVPR 2021 • Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang

Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases.

Ranked #10 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Classification Contrastive Learning +4

Paper
Add Code

An Adversarial Human Pose Estimation Network Injected with Graph Structure

no code implementations • 29 Mar 2021 • Lei Tian, Guoqiang Liang, Peng Wang, Chunhua Shen

Because human keypoints can be invisible in images due to illumination, occlusion and overlap, most current human pose estimation methods are likely to produce unreasonable pose predictions.

Generative Adversarial Network Pose Estimation +1

Paper
Add Code

Residual Gaussian Process: A Tractable Nonparametric Bayesian Emulator for Multi-fidelity Simulations

no code implementations • 8 Apr 2021 • Wei W. Xing, Akeel A. Shah, Peng Wang, Shandian Zhe, Qian Fu, Robert M. Kirby

The resulting model is equipped with a closed-form solution for the predictive posterior, making it applicable to advanced, high-dimensional tasks that require uncertainty estimation.

Active Learning

Paper
Add Code

PURE: Passive mUlti-peRson idEntification via Deep Footstep Separation and Recognition

no code implementations • 15 Apr 2021 • Chao Cai, Ruinan Jin, Peng Wang, Liyuan Ye, Hongbo Jiang, Jun Luo

Recently, passive behavioral biometrics (e.g., gesture or footstep) have become promising complements to conventional user identification methods (e.g., face or fingerprint) under special situations, yet existing sensing technologies require lengthy measurement traces and cannot identify multiple users at the same time.

Person Identification

Paper
Add Code

CAT: Cross-Attention Transformer for One-Shot Object Detection

no code implementations • 30 Apr 2021 • Weidong Lin, Yuyan Deng, Yang Gao, Ning Wang, Jinghao Zhou, Lingqiao Liu, Lei Zhang, Peng Wang

Given a query patch from a novel class, one-shot object detection aims to detect all instances of that class in a target image through the semantic similarity comparison.

Object object-detection +3

Paper
Add Code
