Search Results for author: Peng Wang

Found 315 papers, 120 papers with code

A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model

no code implementations • ICML 2020 • Peng Wang, Zirui Zhou, Anthony Man-Cho So

In this paper, we focus on the problem of exactly recovering the communities in a binary symmetric SBM, where a graph of $n$ vertices is partitioned into two equal-sized communities and the vertices are connected with probability $p = \alpha\log(n)/n$ within communities and $q = \beta\log(n)/n$ across communities for some $\alpha>\beta>0$.

Stochastic Block Model

Paper
Add Code

Hyperbolic Hierarchy-Aware Knowledge Graph Embedding for Link Prediction

no code implementations • Findings (EMNLP) 2021 • Zhe Pan, Peng Wang

Existing embedding methods are mostly built on Euclidean space, which makes it difficult for them to handle hierarchical structures.

Knowledge Graph Embedding Knowledge Graphs +1

Paper
Add Code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation • Findings (NAACL) 2022 • Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.

Model Compression

Paper
Code

A Full-duplex Speech Dialogue Scheme Based On Large Language Models

no code implementations • 29 May 2024 • Peng Wang, Songshuo Lu, Yaohua Tang, Sijie Yan, Yuanjun Xiong, Wei Xia

The perception and motor function modules operate simultaneously, allowing the system to speak and listen to the user at the same time.

Paper
Add Code

WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

1 code implementation • 23 May 2024 • Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

In WISE, we design a dual parametric memory scheme, which consists of the main memory for the pretrained knowledge and a side memory for the edited knowledge.

Hallucination Model Editing +2

1,491

Paper
Code

RaFe: Ranking Feedback Improves Query Rewriting for RAG

no code implementations • 23 May 2024 • Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

As Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA.

Retrieval

Paper
Add Code

Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning

no code implementations • 22 May 2024 • Yuanhao Yue, Chengyu Wang, Jun Huang, Peng Wang

The process of instruction tuning aligns pre-trained large language models (LLMs) with open-domain instructions and human-preferred responses.

Code Generation Instruction Following +1

Paper
Add Code

Multi-Objective Optimization-Based Waveform Design for Multi-User and Multi-Target MIMO-ISAC Systems

no code implementations • 22 May 2024 • Peng Wang, Dongsheng Han, Yashuai Cao, Wanli Ni, Dusit Niyato

In this paper, we investigate the waveform design problem in a downlink multi-user and multi-target ISAC system under different C&S performance preferences.

Paper
Add Code

C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning

no code implementations • 21 May 2024 • Ji Ma, Wei Suo, Peng Wang, Yanning Zhang

Vision-Language Instruction Tuning (VLIT) is a critical training phase for Large Vision-Language Models (LVLMs).

Contrastive Learning

Paper
Add Code

Learning Social Graph for Inactive User Recommendation

1 code implementation • 8 May 2024 • Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

In this paper, we propose a novel social recommendation method called LSIR (Learning Social Graph for Inactive User Recommendation) that learns an optimal social graph structure for social recommendation, especially for inactive users.

Graph structure learning Recommendation Systems

Paper
Code

Towards Continual Knowledge Graph Embedding via Incremental Distillation

1 code implementation • 7 May 2024 • Jiajun Liu, Wenjun Ke, Peng Wang, Ziyu Shang, Jinhua Gao, Guozheng Li, Ke Ji, Yanhe Liu

On the one hand, existing methods usually learn new triples in a random order, destroying the inner structure of new KGs.

Knowledge Graph Embedding

Paper
Code

Depth Priors in Removal Neural Radiance Fields

no code implementations • 1 May 2024 • Zhihao Guo, Peng Wang

This paper proposes a new pipeline that leverages SpinNeRF and monocular depth estimation models like ZoeDepth to enhance NeRF's performance in complex object removal with improved efficiency.

3D Reconstruction Monocular Depth Estimation +2

Paper
Add Code

Dual-Modal Prompting for Sketch-Based Image Retrieval

no code implementations • 29 Apr 2024 • Liying Gao, Bingliang Jiao, Peng Wang, Shizhou Zhang, Hanwang Zhang, Yanning Zhang

In this study, we aim to tackle two major challenges of this task simultaneously: i) zero-shot, dealing with unseen categories, and ii) fine-grained, referring to intra-category instance-level retrieval.

Retrieval Sketch-Based Image Retrieval

Paper
Add Code

Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors

no code implementations • 27 Apr 2024 • Guozheng Li, Peng Wang, Jiajun Liu, Yikai Guo, Ke Ji, Ziyu Shang, Zijie Xu

To this end, we introduce Micre (Meta In-Context learning of LLMs for Relation Extraction), a new meta-training framework for zero and few-shot RE where an LLM is tuned to do ICL on a diverse collection of RE datasets (i.e., learning to learn in context for RE).

Few-Shot Learning In-Context Learning +2

Paper
Add Code

Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

no code implementations • 27 Apr 2024 • Guozheng Li, Peng Wang, Wenjun Ke, Yikai Guo, Ke Ji, Ziyu Shang, Jiajun Liu, Zijie Xu

On the one hand, retrieving good demonstrations is a non-trivial process in RE, which easily results in low relevance regarding entities and relations.

In-Context Learning Language Modelling +4

Paper
Add Code

Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation

no code implementations • 26 Apr 2024 • SeungWook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang

Using images as prompts for 3D generation demonstrates particularly strong performance compared to using text prompts alone, for images provide more intuitive guidance for the 3D generation process.

3D Generation

Paper
Add Code

Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion

no code implementations • 23 Apr 2024 • Hongyu Chen, Yiqi Gao, Min Zhou, Peng Wang, Xubin Li, Tiezheng Ge, Bo Zheng

Meanwhile, a network, dubbed as Masked ControlNet, is designed to utilize these object masks for object generation in the misaligned visual control region.

Attribute Object

Paper
Add Code

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

no code implementations • 16 Apr 2024 • SeungWook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang

Leveraging multi-view diffusion models as priors for 3D optimization has alleviated the problem of 3D consistency, e.g., the Janus face problem or the content drift problem, in zero-shot text-to-3D models.

Common Sense Reasoning Text to 3D

Paper
Add Code

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

no code implementations • 15 Apr 2024 • Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie

This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200,000 edits.

Attribute

Paper
Add Code

COCONut: Modernizing COCO Segmentation

no code implementations • 12 Apr 2024 • Xueqing Deng, Qihang Yu, Peng Wang, Xiaohui Shen, Liang-Chieh Chen

By enhancing the annotation quality and expanding the dataset to encompass 383K images with more than 5.18M panoptic masks, we introduce COCONut, the COCO Next Universal segmenTation dataset.

Panoptic Segmentation Segmentation +1

Paper
Add Code

Self-Explainable Affordance Learning with Embodied Caption

no code implementations • 8 Apr 2024 • Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc van Gool

In the field of visual affordance learning, previous methods mainly used abundant images or videos that delineate human behavior patterns to identify action possibility regions for object manipulation, with a variety of applications in robotic tasks.

Paper
Add Code

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

1 code implementation • 28 Mar 2024 • Wei Dong, Xing Zhang, Bihui Chen, Dawei Yan, Zhijun Lin, Qingsen Yan, Peng Wang, Yang Yang

Parameter-efficient fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks by learning a minimal set of new adaptation parameters while preserving the frozen majority of pre-trained parameters.

Image Classification

Paper
Code

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

1 code implementation • 18 Mar 2024 • Lingzhe Zhao, Peng Wang, Peidong Liu

In this paper, we introduce a novel approach, named BAD-Gaussians (Bundle Adjusted Deblur Gaussian Splatting), which leverages explicit Gaussian representation and handles severe motion-blurred images with inaccurate camera poses to achieve high-quality scene reconstruction.

3D Scene Reconstruction Deblurring +2

127

Paper
Code

CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning

1 code implementation • 15 Mar 2024 • Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang

Large pre-trained VLMs like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in CL, but they focus on closed-set CL in a single domain dataset.

Class Incremental Learning Incremental Learning +1

Paper
Code

Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning

no code implementations • 15 Mar 2024 • Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Heng Tao Shen

Aligning these distributions between corresponding regions from different tasks imparts higher flexibility and capacity to capture intra-region structures, accommodating a broader range of tasks.

Depth Estimation Semantic Segmentation +1

Paper
Add Code

SAM-Lightening: A Lightweight Segment Anything Model with Dilated Flash Attention to Achieve 30 times Acceleration

no code implementations • 14 Mar 2024 • Yanfei Song, Bangzheng Pu, Peng Wang, Hongxu Jiang, Dong Dong, Yongxiang Cao, Yiqing Shen

Moreover, it takes only 244MB memory, which is 3.5% of the vanilla SAM.

Transfer Learning Zero-shot Generalization

Paper
Add Code

Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation

no code implementations • 6 Mar 2024 • Lu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang

Semi-supervised learning is a sound measure to relieve the strict demand of abundant annotated datasets, especially for challenging multi-organ segmentation.

Contrastive Learning Organ Segmentation

Paper
Add Code

Vision-Language Navigation with Embodied Intelligence: A Survey

no code implementations • 22 Feb 2024 • Peng Gao, Peng Wang, Feng Gao, Fei Wang, Ruyue Yuan

As a long-term vision in the field of artificial intelligence, the core goal of embodied intelligence is to improve the perception, understanding, and interaction capabilities of agents and the environment.

Vision-Language Navigation

Paper
Add Code

Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

no code implementations • 21 Feb 2024 • Guozheng Li, Wenjun Ke, Peng Wang, Zijie Xu, Ke Ji, Jiajun Liu, Ziyu Shang, Qiqing Luo

The in-context learning (ICL) for relational triple extraction (RTE) has achieved promising performance, but still encounters two key challenges: (1) how to design effective prompts and (2) how to select proper demonstrations.

Blocking In-Context Learning +1

Paper
Add Code

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

1 code implementation • 11 Feb 2024 • Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks.

Graph Question Answering Instruction Following +4

Paper
Code

TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation

no code implementations • 11 Feb 2024 • Peng Wang, Xiang Wei, Fangxu Hu, Wenjuan Han

TransGPT-MM is finetuned on a multi-modal Transportation dataset (MTD) that we manually collected from three areas of the transportation domain: driving tests, traffic signs, and landmarks.

Language Modelling Large Language Model

Paper
Add Code

Image Fusion via Vision-Language Model

no code implementations • 3 Feb 2024 • Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool

Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information in different source images to guide image fusion.

Decoder Language Modelling

Paper
Add Code

A Comprehensive Study of Knowledge Editing for Large Language Models

2 code implementations • 2 Jan 2024 • Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches.

Ranked #1 on knowledge editing on zsRE (using extra training data)

knowledge editing

1,491

Paper
Code

ReCo-Diff: Explore Retinex-Based Condition Strategy in Diffusion Model for Low-Light Image Enhancement

no code implementations • 20 Dec 2023 • Yuhui Wu, Guoqing Wang, Zhiwen Wang, Yang Yang, Tianyu Li, Peng Wang, Chongyi Li, Heng Tao Shen

Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models.

Low-Light Image Enhancement

Paper
Add Code

LLM A*: Human in the Loop Large Language Models Enabled A* Search for Robotics

no code implementations • 4 Dec 2023 • Hengjia Xiao, Peng Wang

This makes the whole path planning process a 'white box', and human feedback guides LLM A* to converge quickly compared to other data-driven methods such as reinforcement learning-based (RL) path planning.

Paper
Add Code

ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation

no code implementations • 2 Dec 2023 • Peng Wang, Yichun Shi

We introduce "ImageDream," an innovative image-prompt, multi-view diffusion model for 3D object generation.

3D Generation Object

Paper
Add Code

Matching Weak Informative Ontologies

1 code implementation • 1 Dec 2023 • Peng Wang

In this paper, these ontologies are called weak informative ontologies (WIOs), and it is challenging for existing methods to match WIOs.

Ontology Matching

Paper
Code

Continual Referring Expression Comprehension via Dual Modular Memorization

1 code implementation • 25 Nov 2023 • Heng Tao Shen, Cheng Chen, Peng Wang, Lianli Gao, Meng Wang, Jingkuan Song

In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC, where a model is learning on a stream of incoming tasks.

Memorization Referring Expression +1

Paper
Code

Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval

1 code implementation • 21 Nov 2023 • Xiu-Shen Wei, Yang Shen, Xuhao Sun, Peng Wang, Yuxin Peng

Our work focuses on tackling large-scale fine-grained image retrieval as ranking the images depicting the concept of interests (i.e., the same sub-category labels) highest based on the fine-grained details in the query.

Attribute Deep Hashing +2

Paper
Code

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

no code implementations • 20 Nov 2023 • Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU.

3D Reconstruction Image to 3D +1

Paper
Add Code

FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models

no code implementations • 16 Nov 2023 • Yimin Jing, Renren Jin, Jiahao Hu, Huishi Qiu, Xiaohua Wang, Peng Wang, Deyi Xiong

In pursuit of this goal, various benchmarks have been constructed to evaluate the instruction-following capacity of these models.

Instruction Following Logical Reasoning

Paper
Add Code

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

no code implementations • 15 Nov 2023 • Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.

3D Generation Denoising +2

Paper
Add Code

Open-Vocabulary Video Anomaly Detection

no code implementations • 13 Nov 2023 • Peng Wu, Xuerong Zhou, Guansong Pang, Yujia Sun, Jing Liu, Peng Wang, Yanning Zhang

Particularly, we devise a semantic knowledge injection module to introduce semantic knowledge from large language models for the detection task, and design a novel anomaly synthesis module to generate pseudo unseen anomaly videos with the help of large vision generation models for the classification task.

Anomaly Detection Video Anomaly Detection

Paper
Add Code

SCL-VI: Self-supervised Context Learning for Visual Inspection of Industrial Defects

1 code implementation • 11 Nov 2023 • Peng Wang, Haiming Yao, Wenyong Yu

Current unsupervised models struggle to strike a balance between detecting texture and object defects, lacking the capacity to discern latent representations and intricate features.

Ranked #58 on Anomaly Detection on MVTec AD

Anomaly Detection Self-Supervised Learning

Paper
Code

Interpretable Graph Anomaly Detection using Gradient Attention Maps

no code implementations • 10 Nov 2023 • Yifei Yang, Peng Wang, Xiaofan He, Dongmian Zou

Detecting unusual patterns in graph data is a crucial task in data mining.

Decision Making Graph Anomaly Detection

Paper
Add Code

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

1 code implementation • 6 Nov 2023 • Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

To the best of our knowledge, this is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.

Feature Compression Multi-class Classification +2

Paper
Code

PERF: Panoramic Neural Radiance Field from a Single Panorama

1 code implementation • 25 Oct 2023 • Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu

In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.

Novel View Synthesis Text to 3D

188

Paper
Code

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

no code implementations • 24 Oct 2023 • Yinjie Lei, Zixuan Wang, Feng Chen, Guoqing Wang, Peng Wang, Yang Yang

Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction.

Autonomous Driving Scene Understanding

Paper
Add Code

Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

1 code implementation • NeurIPS 2023 • Wei Dong, Dawei Yan, Zhijun Lin, Peng Wang

Consequently, effectively adapting large pre-trained models to downstream tasks in an efficient manner has become a prominent research area.

Image Classification Transfer Learning

Paper
Code

Generalized Neural Collapse for a Large Number of Classes

no code implementations • 9 Oct 2023 • Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin Mixon, Chong You, Zhihui Zhu

However, most of the existing empirical and theoretical studies in neural collapse focus on the case that the number of classes is small relative to the dimension of the feature space.

Face Recognition Retrieval

Paper
Add Code

Revisiting Large Language Models as Zero-shot Relation Extractors

no code implementations • 8 Oct 2023 • Guozheng Li, Peng Wang, Wenjun Ke

On the one hand, we analyze the drawbacks of existing RE prompts and attempt to incorporate recent prompt techniques such as chain-of-thought (CoT) to improve zero-shot RE.

Question Answering Relation +1

Paper
Add Code

The Emergence of Reproducibility and Consistency in Diffusion Models

no code implementations • 8 Oct 2023 • Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu

In this work, we investigate an intriguing and prevalent phenomenon of diffusion models which we term as "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs.

Image Generation Memorization

Paper
Add Code

USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields

1 code implementation • 4 Oct 2023 • Moyang Li, Peng Wang, Lingzhe Zhao, Bangyan Liao, Peidong Liu

USB-NeRF is able to correct rolling shutter distortions and recover accurate camera motion trajectory simultaneously under the framework of NeRF, by modeling the physical image formation process of a RS camera.

Image Generation Motion Estimation +2

Paper
Code

Human-centric Behavior Description in Videos: New Benchmark and Model

no code implementations • 4 Oct 2023 • Lingru Zhou, Yiqi Gao, Manqing Zhang, Peng Wu, Peng Wang, Yanning Zhang

To address this challenge, we construct a human-centric video surveillance captioning dataset, which provides detailed descriptions of the dynamic behaviors of 7,820 individuals.

Video Captioning

Paper
Add Code

Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models

no code implementations • 4 Oct 2023 • Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang

Specifically, we decompose the NVS task into two stages: (i) transforming observed regions to a novel view, and (ii) hallucinating unseen regions.

Image to 3D Novel View Synthesis

Paper
Add Code

Selective Feature Adapter for Dense Vision Transformers

no code implementations • 3 Oct 2023 • Xueqing Deng, Qi Fan, Xiaojie Jin, Linjie Yang, Peng Wang

Specifically, SFA consists of external adapters and internal adapters which are sequentially operated over a transformer model.

Depth Estimation

Paper
Add Code

MMPI: a Flexible Radiance Field Representation by Multiple Multi-plane Images Blending

no code implementations • 30 Sep 2023 • Yuze He, Peng Wang, Yubin Hu, Wang Zhao, Ran Yi, Yong-Jin Liu, Wenping Wang

In this paper, we explore the potential of MPI and show that MPI can synthesize high-quality novel views of complex scenes with diverse camera distributions and view directions, which are not limited to simple forward-facing scenes.

Autonomous Driving Novel View Synthesis

Paper
Add Code

Qwen Technical Report

2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.

Ranked #3 on Multi-Label Text Classification on CC3M-TagMask

Language Modelling Large Language Model +2

11,708

Paper
Code

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

no code implementations • 14 Sep 2023 • Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen

In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in recent years, named entity recognition is still challenging but critical for semantic understanding.

Language Modelling named-entity-recognition +3

Paper
Add Code

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

no code implementations • CVPR 2023 • Wei Suo, Mengyang Sun, Weisong Liu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu

VQA Natural Language Explanation (VQA-NLE) task aims to explain the decision-making process of VQA models in natural language.

Decision Making Visual Question Answering (VQA)

Paper
Add Code

MVDream: Multi-view Diffusion for 3D Generation

2 code implementations • 31 Aug 2023 • Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang

We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt.

3D Generation

682

Paper
Code

TouchStone: Evaluating Vision-Language Models by Language Models

1 code implementation • 31 Aug 2023 • Shuai Bai, Shusheng Yang, Jinze Bai, Peng Wang, Xingxuan Zhang, Junyang Lin, Xinggang Wang, Chang Zhou, Jingren Zhou

Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting visual receptor with large language models (LLMs).

Visual Storytelling

Paper
Code

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

1 code implementation • 24 Aug 2023 • Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, Jingren Zhou

In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images.

Ranked #3 on Visual Question Answering on MM-Vet

Chart Question Answering Image Captioning +6

4,011

Paper
Code

Ground-to-Aerial Person Search: Benchmark Dataset and Approach

1 code implementation • 24 Aug 2023 • Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang

In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras.

Knowledge Distillation Person Search

Paper
Code

LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition

1 code implementation • ICCV 2023 • Changxu Cheng, Peng Wang, Cheng Da, Qi Zheng, Cong Yao

The diversity in length constitutes a significant characteristic of text.

Decoder Scene Text Recognition

1,052

Paper
Code

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

1 code implementation • 22 Aug 2023 • Peng Wu, Xuerong Zhou, Guansong Pang, Lingru Zhou, Qingsen Yan, Peng Wang, Yanning Zhang

With the benefit of dual branch, VadCLIP achieves both coarse-grained and fine-grained video anomaly detection by transferring pre-trained knowledge from CLIP to WSVAD task.

Anomaly Detection Binary Classification +1

Paper
Code

Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction

1 code implementation • 20 Aug 2023 • Zeyu Han, YuHan Wang, Luping Zhou, Peng Wang, Binyu Yan, Jiliu Zhou, Yan Wang, Dinggang Shen

To obtain high-quality positron emission tomography (PET) scans while reducing radiation exposure to the human body, various approaches have been proposed to reconstruct standard-dose PET (SPET) images from low-dose PET (LPET) images.

Paper
Code

Polymerized Feature-based Domain Adaptation for Cervical Cancer Dose Map Prediction

no code implementations • 20 Aug 2023 • Jie Zeng, Zeyu Han, Xingchen Peng, Jianghong Xiao, Peng Wang, Yan Wang

Recently, deep learning (DL) has automated and accelerated the clinical radiation therapy (RT) planning significantly by predicting accurate dose maps.

Domain Adaptation

Paper
Add Code

Pre-training with Large Language Model-based Document Expansion for Dense Passage Retrieval

no code implementations • 16 Aug 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Zijia Lin, Songlin Hu

Concretely, we leverage the capabilities of LLMs for document expansion, i.e., query generation, and effectively transfer expanded knowledge to retrievers using pre-training strategies tailored for passage retrieval.

Contrastive Learning Language Modelling +3

Paper
Add Code

EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models

2 code implementations • 14 Aug 2023 • Peng Wang, Ningyu Zhang, Bozhong Tian, Zekun Xi, Yunzhi Yao, Ziwen Xu, Mengru Wang, Shengyu Mao, Xiaohan Wang, Siyuan Cheng, Kangwei Liu, Yuansheng Ni, Guozhou Zheng, Huajun Chen

Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data.

knowledge editing

1,491

Paper
Code

AerialVLN: Vision-and-Language Navigation for UAVs

1 code implementation • ICCV 2023 • Shubo Liu, Hongsheng Zhang, Yuankai Qi, Peng Wang, Yaning Zhang, Qi Wu

Navigating in the sky is more complicated than on the ground because agents need to consider the flying height and more complex spatial relationship reasoning.

Navigate Vision and Language Navigation

Paper
Code

TriDo-Former: A Triple-Domain Transformer for Direct PET Reconstruction from Low-Dose Sinograms

no code implementations • 10 Aug 2023 • Jiaqi Cui, Pinxian Zeng, Xinyi Zeng, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang, Dinggang Shen

Specifically, the TriDo-Former consists of two cascaded networks, i.e., a sinogram enhancement transformer (SE-Former) for denoising the input LPET sinograms and a spatial-spectral reconstruction transformer (SSR-Former) for reconstructing SPET images from the denoised sinograms.

Denoising Image Reconstruction +1

Paper
Add Code

A Survey on Deep Learning-based Spatio-temporal Action Detection

no code implementations • 3 Aug 2023 • Peng Wang, Fanwei Zeng, Yuntao Qian

Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time.

Action Detection Autonomous Driving

Paper
Add Code

Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition

1 code implementation • 25 Jul 2023 • Cheng Da, Peng Wang, Cong Yao

Specifically, MGP-STR achieves an average recognition accuracy of $94\%$ on standard benchmarks for scene text recognition.

Language Modelling Optical Character Recognition (OCR) +1

1,052

Paper
Code

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 code implementation • 24 Jul 2023 • Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e.g., language descriptions and synchronous audios.

Anomaly Detection Retrieval +2

Paper
Code

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

no code implementations • 20 Jul 2023 • Yinghui Xing, Dexuan Kong, Shizhou Zhang, Geng Chen, Lingyan Ran, Peng Wang, Yanning Zhang

Camouflaged object detection (COD), aiming to segment camouflaged objects which exhibit similar patterns with the background, is a challenging task.

Multi-Task Learning object-detection +1

Paper
Add Code

6G Network Business Support System

no code implementations • 19 Jul 2023 • Ye Ouyang, Yaqin Zhang, Peng Wang, Yunxin Liu, Wen Qiao, Jun Zhu, Yang Liu, Feng Zhang, Shuling Wang, Xidong Wang

6G is the next-generation intelligent and integrated digital information infrastructure, characterized by ubiquitous interconnection, native intelligence, multi-dimensional perception, global coverage, green and low-carbon, native network security, etc.

Paper
Add Code

Watch out Venomous Snake Species: A Solution to SnakeCLEF2023

1 code implementation • 19 Jul 2023 • Feiran Hu, Peng Wang, Yangyang Li, Chenlong Duan, Zijian Zhu, Fei Wang, Faen Zhang, Yong Li, Xiu-Shen Wei

The SnakeCLEF2023 competition aims at the development of advanced algorithms for snake species identification through the analysis of images and accompanying metadata.

Data Augmentation

Paper
Code

DiffDP: Radiotherapy Dose Prediction via a Diffusion Model

no code implementations • 19 Jul 2023 • Zhenghao Feng, Lu Wen, Peng Wang, Binyu Yan, Xi Wu, Jiliu Zhou, Yan Wang

To alleviate this limitation, we innovatively introduce a diffusion-based dose prediction (DiffDP) model for predicting the radiotherapy dose distribution of cancer patients.

Anatomy

Paper
Add Code

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

no code implementations • 14 Jul 2023 • Xiaorui Zhu, Yichen Qin, Peng Wang

A critical question remains unsettled; that is, is it possible and how to embed the inference of the model into the simultaneous inference of the coefficients?

Model Selection

Paper
Add Code

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

1 code implementation • NeurIPS 2023 • Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa

This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth maps and poses).

Image Generation

441

Paper
Code

Fast and Automatic 3D Modeling of Antenna Structure Using CNN-LSTM Network for Efficient Data Generation

no code implementations • 27 Jun 2023 • Zhaohui Wei, Zhao Zhou, Peng Wang, Jian Ren, Yingzeng Yin, Gert Frølund Pedersen, Ming Shen

In this study, we proposed a deep learning-assisted and image-based intelligent modeling approach for accelerating the data acquisition of antenna samples with different physical structures.

Paper
Add Code

A Dynamic Feature Interaction Framework for Multi-task Visual Perception

no code implementations • 8 Jun 2023 • Yuling Xi, Hao Chen, Ning Wang, Peng Wang, Yanning Zhang, Chunhua Shen, Yifan Liu

In particular, one feature merge branch is designed for instance-level recognition and the other for dense predictions.

Autonomous Driving Depth Estimation +3

Paper
Add Code

Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning

1 code implementation • 1 Jun 2023 • Shengqin Jiang, Yaoyu Fang, Haokui Zhang, Qingshan Liu, Yuankai Qi, Yang Yang, Peng Wang

Rehearsal-based video incremental learning often employs knowledge distillation to mitigate catastrophic forgetting of previously learned data.

Incremental Learning Knowledge Distillation +1

Paper
Code

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

1 code implementation • 1 Jun 2023 • Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Second, it allows us to better understand deep representation learning by elucidating the linear progressive separation and concentration of representations from shallow to deep layers.

Representation Learning

Paper
Code

Learning Conditional Attributes for Compositional Zero-Shot Learning

1 code implementation • CVPR 2023 • Qingsheng Wang, Lingqiao Liu, Chenchen Jing, Hao Chen, Guoqiang Liang, Peng Wang, Chunhua Shen

Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts based on learned concepts such as attribute-object combinations.

Ranked #1 on Compositional Zero-Shot Learning on MIT-States

Attribute Compositional Zero-Shot Learning

Paper
Code

Continuous and Noninvasive Measurement of Arterial Pulse Pressure and Pressure Waveform using an Image-free Ultrasound System

no code implementations • 29 May 2023 • Lirui Xu, Pang Wu, Pan Xia, Fanglin Geng, Peng Wang, Xianxiang Chen, Zhenfeng Li, Lidong Du, Shuping Liu, Li Li, Hongbo Chang, Zhen Fang

In in vitro cardiovascular phantom experiments, the results demonstrated high accuracy in the measurement of PP (error < 3 mmHg) and blood pressure waveform (root-mean-square error (RMSE) < 2 mmHg, correlation coefficient (r) > 0.99).

Paper
Add Code

NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

1 code implementation • 27 May 2023 • YuAn Liu, Peng Wang, Cheng Lin, Xiaoxiao Long, Jiepeng Wang, Lingjie Liu, Taku Komura, Wenping Wang

We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.

Neural Rendering Object

505

Paper
Code

A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation

no code implementations • CVPR 2023 • Congqi Cao, Yue Lu, Peng Wang, Yanning Zhang

At present, it is the largest semi-supervised VAD dataset with the largest number of scenes and classes of anomalies, the longest duration, and the only one considering the scene-dependent anomaly.

Anomaly Detection Video Anomaly Detection

Paper
Add Code

Editing Large Language Models: Problems, Methods, and Opportunities

2 code implementations • 22 May 2023 • Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang

Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.

Model Editing

1,491

Paper
Code

MALM: Mask Augmentation based Local Matching for Food-Recipe Retrieval

1 code implementation • 18 May 2023 • Bhanu Prakash Voutharoja, Peng Wang, Lei Wang, Vivienne Guan

A de-facto idea to address this task is to learn a shared feature embedding space in which a food image is aligned better to its paired recipe than other recipes.

Image-text matching Retrieval +1

Paper
Code

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

2 code implementations • 18 May 2023 • Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)

Action Classification AudioCaps +16

6,243

Paper
Code

Knowledge Rumination for Pre-trained Language Models

1 code implementation • 15 May 2023 • Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs.

Language Modelling

Paper
Code

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration

1 code implementation • SIGMOD/PODS 2023 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

The widely used practice is to build task-specific or even dataset-specific solutions, which are hard to generalize and disable the opportunities of knowledge sharing that can be learned from different datasets and multiple tasks.

Entity Resolution Zero-Shot Learning

Paper
Code

ViewFormer: View Set Attention for Multi-view 3D Shape Understanding

no code implementations • 29 Apr 2023 • Hongyu Sun, Yongcai Wang, Peng Wang, Xudong Cai, Deying Li

This paper presents ViewFormer, a simple yet effective model for multi-view 3d shape recognition and retrieval.

3D Shape Recognition 3D Shape Retrieval +1

Paper
Add Code

Maximizing Model Generalization for Machine Condition Monitoring with Self-Supervised Learning and Federated Learning

no code implementations • 27 Apr 2023 • Matthew Russell, Peng Wang

Specifically, Self-Supervised Learning (SSL) with Barlow Twins may produce more discriminative features for monitoring health condition than supervised learning by focusing on semantic properties of the data.

Domain Adaptation Federated Learning +2

Paper
Add Code

Glocal Energy-based Learning for Few-Shot Open-Set Recognition

1 code implementation • CVPR 2023 • Haoyu Wang, Guansong Pang, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Few-shot open-set recognition (FSOR) is a challenging task of great practical value.

Open Set Learning

Paper
Code

AirBirds: A Large-scale Challenging Dataset for Bird Strike Prevention in Real-world Airports

no code implementations • 23 Apr 2023 • Hongyu Sun, Yongcai Wang, Xudong Cai, Peng Wang, Zhe Huang, Deying Li, Yu Shao, Shuo Wang

To advance the research and practical solutions for bird strike prevention, in this paper, we present a large-scale challenging dataset AirBirds that consists of 118,312 time-series images, where a total of 409,967 bounding boxes of flying birds are manually, carefully annotated.

Time Series

Paper
Add Code

A geometry-aware deep network for depth estimation in monocular endoscopy

1 code implementation • 20 Apr 2023 • Yongming Yang, Shuwei Shao, Tao Yang, Peng Wang, Zhuo Yang, Chengdong Wu, Hao liu

To address this issue, we introduce a gradient loss to penalize ambiguous edge fluctuations around stepped edge structures and a normal loss to explicitly express the sensitivity to frequently small structures, and propose a geometric consistency loss to spread the spatial information across the sample grids to constrain the global geometric anatomy structures.

3D Reconstruction Anatomy +1

Paper
Code

CoT-MoTE: Exploring ConTextual Masked Auto-Encoder Pre-training with Mixture-of-Textual-Experts for Passage Retrieval

no code implementations • 20 Apr 2023 • Guangyuan Ma, Xing Wu, Peng Wang, Songlin Hu

Siamese or fully separated dual-encoders are often adopted as basic retrieval architecture in the pre-training and fine-tuning stages for encoding queries and passages into their latent embedding spaces.

Passage Retrieval Retrieval

Paper
Add Code

CoT-MAE v2: Contextual Masked Auto-Encoder with Multi-view Modeling for Passage Retrieval

no code implementations • 5 Apr 2023 • Xing Wu, Guangyuan Ma, Peng Wang, Meng Lin, Zijia Lin, Fuzheng Zhang, Songlin Hu

As an effective representation bottleneck pretraining technique, the contextual masked auto-encoder utilizes contextual embedding to assist in the reconstruction of passages.

Passage Retrieval Retrieval +1

Paper
Add Code

F$^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

1 code implementation • 28 Mar 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Based on our analysis, we further propose a novel space-warping method called perspective warping, which allows us to handle arbitrary trajectories in the grid-based NeRF framework.

Novel View Synthesis

902

Paper
Code

HOP+: History-enhanced and Order-aware Pre-training for Vision-and-Language Navigation

no code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2023 • Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu

To address these problems, we present a history-enhanced and order-aware pre-training with the complementing fine-tuning paradigm (HOP+) for VLN.

Decision Making Language Modelling +2

Paper
Add Code

Joint Spectrum and Power Allocation for V2X Communications with Imperfect CSI

no code implementations • 21 Feb 2023 • Peng Wang, Weihua Wu, Jiayi Liu, Guanhua Chai, Li Feng

More specifically, Bernstein approximations are employed to convert the chance constraint into a calculable constraint, and Bisection search method is proposed to obtain the optimal allocation solution with low complexity.

Self-Learning

Paper
Add Code

Self-Supervised Node Representation Learning via Node-to-Neighbourhood Alignment

1 code implementation • 9 Feb 2023 • Wei Dong, Dawei Yan, Peng Wang

Considering the excessive memory overheads of contrastive learning, we further propose a negative-free solution, where the main contribution is a Graph Signal Decorrelation (GSD) constraint to avoid representation collapse and over-smoothing.

Contrastive Learning Node Classification +1

Paper
Code

Delving Deep into Simplicity Bias for Long-Tailed Image Recognition

no code implementations • 7 Feb 2023 • Xiu-Shen Wei, Xuhao Sun, Yang shen, Anqi Xu, Peng Wang, Faen Zhang

Simplicity Bias (SB) is a phenomenon that deep neural networks tend to rely favorably on simpler predictive patterns but ignore some complex features when applied to supervised discriminative tasks.

Ranked #4 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Long-tail Learning Self-Supervised Learning

Paper
Add Code

Industrial computed tomography based intelligent non-destructive testing method for power capacitor

no code implementations • 6 Feb 2023 • Zhenxing Cheng, Peng Wang, Yue Liu, Wei Qin, Zidi Tang

The power capacitor is a widely used piece of reactive power compensation equipment in power transmission and distribution systems; it is prone to internal faults, which affect the safe operation of the power system.

Data Augmentation

Paper
Add Code

Data-driven prognostics based on time-frequency analysis and symbolic recurrent neural network for fuel cells under dynamic load

no code implementations • 3 Feb 2023 • Chu Wang, Manfeng Dou, Zhongliang Li, Rachid Outbib, Dongdong Zhao, Jian Zuo, Yuanlin Wang, Bin Liang, Peng Wang

Data-centric prognostics is beneficial to improve the reliability and safety of proton exchange membrane fuel cell (PEMFC).

Paper
Add Code

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

1 code implementation • 19 Jan 2023 • Xiaojie Jin, BoWen Zhang, Weibo Gong, Kai Xu, Xueqing Deng, Peng Wang, Zhao Zhang, Xiaohui Shen, Jiashi Feng

The first is a Temporal Adaptation Module that is incorporated in the video branch to introduce global and local temporal contexts.

Retrieval Text Retrieval +2

Paper
Code

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

1 code implementation • CVPR 2023 • Fei Zhou, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Prototypical Network is a popular few-shot solver that aims at establishing a feature metric generalizable to novel few-shot classification (FSC) tasks using deep neural networks.

cross-domain few-shot learning Knowledge Distillation

Paper
Code

F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories

no code implementations • CVPR 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Existing fast grid-based NeRF training frameworks, like Instant-NGP, Plenoxels, DVGO, or TensoRF, are mainly designed for bounded scenes and rely on space warping to handle unbounded scenes.

Novel View Synthesis

Paper
Add Code

Transferring General Multimodal Pretrained Models to Text Recognition

1 code implementation • 19 Dec 2022 • Junyang Lin, Xuancheng Ren, Yichang Zhang, Gao Liu, Peng Wang, An Yang, Chang Zhou

This paper proposes a new method, OFA-OCR, to transfer multimodal pretrained models to text recognition.

Image Captioning Optical Character Recognition (OCR)

2,343

Paper
Code

Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance

no code implementations • 16 Dec 2022 • Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

To enhance the discriminative power of features, we propose a batch clustering based loss to encourage a clustering branch to generate distinct normal and abnormal clusters based on a batch of data.

Anomaly Detection Clustering +1

Paper
Add Code

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation • 8 Dec 2022 • Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly-diverse example tasks in OFASys, with which we also develop a first-in-kind, single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

142

Paper
Code

Generalizable Person Re-Identification via Viewpoint Alignment and Fusion

no code implementations • 5 Dec 2022 • Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Ruiqi Wu, Shizhou Zhang, Peng Wang, Yanning Zhang

The key insight of this design is that the cross-attention mechanism in the transformer could be an ideal solution to align the discriminative texture clues from the original image with the canonical view image, which could compensate for the low-quality texture information of the canonical view image.

Domain Generalization Generalizable Person Re-identification +1

Paper
Add Code

NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies

no code implementations • CVPR 2023 • Xiaoxiao Long, Cheng Lin, Lingjie Liu, YuAn Liu, Peng Wang, Christian Theobalt, Taku Komura, Wenping Wang

In this paper, we propose to represent surfaces as the Unsigned Distance Function (UDF) and develop a new volume rendering scheme to learn the neural UDF representation.

Neural Rendering

Paper
Add Code

BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields

1 code implementation • CVPR 2023 • Peng Wang, Lingzhe Zhao, Ruijie Ma, Peidong Liu

Our approach models the physical image formation process of a motion blurred image, and jointly learns the parameters of NeRF and recovers the camera motion trajectories during exposure time.

3D Scene Reconstruction Deblurring +2

181

Paper
Code

Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification

no code implementations • 22 Nov 2022 • Peng Wang, Jingzhou Chen, Yuntao Qian

Hierarchical classification (HC) assigns each object with multiple labels organized into a hierarchical structure.

Image Classification Word Embeddings

Paper
Add Code

Batch-based Model Registration for Fast 3D Sherd Reconstruction

no code implementations • ICCV 2023 • Jiepeng Wang, Congyi Zhang, Peng Wang, Xin Li, Peter J. Cobb, Christian Theobalt, Wenping Wang

In this work, we aim to develop a portable, high-throughput, and accurate reconstruction system for efficient digitization of fragments excavated in archaeological sites.

3D Reconstruction

Paper
Add Code

A Simple and Robust Correlation Filtering Method for Text-based Person Search

1 code implementation • ECCV 2022 • Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu

Text-based person search aims to associate pedestrian images with natural language descriptions.

Ranked #8 on Text based Person Retrieval on ICFG-PEDES

Denoising Person Search +3

Paper
Code

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

1 code implementation • 19 Sep 2022 • Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon.

Multi-class Classification Representation Learning +1

Paper
Code

Levenshtein OCR

2 code implementations • 8 Sep 2022 • Cheng Da, Peng Wang, Cong Yao

A novel scene text recognizer based on Vision-Language Transformer (VLT) is presented.

Imitation Learning Optical Character Recognition (OCR) +1

1,052

Paper
Code

Multi-Granularity Prediction for Scene Text Recognition

2 code implementations • 8 Sep 2022 • Peng Wang, Cheng Da, Cong Yao

In this work, we first draw inspiration from the recent progress in Vision Transformer (ViT) to construct a conceptually simple yet powerful vision STR model, which is built upon ViT and outperforms previous state-of-the-art models for scene text recognition, including both pure vision models and language-augmented methods.

Ranked #2 on Scene Text Recognition on Uber-Text (using extra training data)

Language Modelling Optical Character Recognition (OCR) +1

1,052

Paper
Code

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

1 code implementation • 17 Aug 2022 • Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang

To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing the cross attention between text prompts features and image patch token embeddings to encode both the downstream task-related information and visual instance information.

General Knowledge Language Modelling +1

Paper
Code

Instance Image Retrieval by Learning Purely From Within the Dataset

no code implementations • 12 Aug 2022 • Zhongyan Zhang, Lei Wang, Yang Wang, Luping Zhou, Jianjia Zhang, Peng Wang, Fang Chen

Although achieving promising results, this approach is restricted by two issues: 1) the domain gap between benchmark datasets and the dataset of a given retrieval task; 2) the required auxiliary dataset cannot be readily obtained.

Image Retrieval Retrieval +2

Paper
Add Code

Prompt Tuning for Generative Multimodal Pretrained Models

1 code implementation • 4 Aug 2022 • Hao Yang, Junyang Lin, An Yang, Peng Wang, Chang Zhou, Hongxia Yang

Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining.

Ranked #2 on Visual Entailment on SNLI-VE test

Image Captioning Visual Entailment +1

2,343

Paper
Code

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

no code implementations • 31 Jul 2022 • Zhipeng Zhang, Zhimin Wei, Zhongzhen Huang, Rui Niu, Peng Wang

However, one unsolved issue of these models is that the number of reasoning steps needs to be pre-defined and fixed before inference, ignoring the varying complexity of expressions.

Referring Expression Referring Expression Comprehension +2

Paper
Add Code

Progressively-connected Light Field Network for Efficient View Synthesis

no code implementations • 10 Jul 2022 • Peng Wang, YuAn Liu, Guying Lin, Jiatao Gu, Lingjie Liu, Taku Komura, Wenping Wang

ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.

Novel View Synthesis

Paper
Add Code

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors

no code implementations • 27 Jun 2022 • Jiepeng Wang, Peng Wang, Xiaoxiao Long, Christian Theobalt, Taku Komura, Lingjie Liu, Wenping Wang

The key idea of NeuRIS is to integrate estimated normal of indoor scenes as a prior in a neural rendering framework for reconstructing large texture-less shapes and, importantly, to do this in an adaptive manner to also enable the reconstruction of irregular shapes with fine details.

3D Reconstruction Neural Rendering

Paper
Add Code

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

1 code implementation • 12 Jun 2022 • Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang

We introduce SparseNeuS, a novel neural rendering based method for the task of surface reconstruction from multi-view images.

Neural Rendering Surface Reconstruction

316

Paper
Code

Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

1 code implementation • 11 Jun 2022 • Peng Wang, Huikang Liu, Anthony Man-Cho So, Laura Balzano

The K-subspaces (KSS) method is a generalization of the K-means method for subspace clustering.

Clustering

Paper
Code

Domain Adaptation for Deep Entity Resolution: A Design Space Exploration

1 code implementation • SIGMOD/PODS 2022 • Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Chengliang Chai, Guoliang Li, Ruixue Fan, Xiaoyong Du

Entity resolution (ER) is a core problem of data integration.

Ranked #2 on Entity Resolution on WDC Watches-small

Domain Adaptation Entity Resolution

Paper
Code

Fast-Spanning Ant Colony Optimisation (FaSACO) for Mobile Robot Coverage Path Planning

no code implementations • 31 May 2022 • Christopher Carr, Peng Wang

Bio-inspired algorithms such as Ant Colony Optimisation (ACO) have been exploited to solve the problem because they can utilise heuristic information to mitigate the path planning complexity.

Paper
Add Code

VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis

1 code implementation • 30 May 2022 • Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille

Gaussian reconstruction kernels were proposed by Westover (1990) and studied by the computer graphics community back in the 90s; they give an alternative representation of object 3D geometry to meshes and point clouds.

Pose Estimation

Paper
Code

Balanced control between performance and saturation for constrained nonlinear systems

no code implementations • 10 May 2022 • Peng Wang, Haibin Wang, Shuzhi Sam Ge, Xiaobing Zhang

This paper addresses the balanced control between performance and saturation for a class of constrained nonlinear systems, including the branches: balanced command filtered backstepping (BCFB) and balanced performance control (BPC).

Paper
Add Code

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

no code implementations • 7 May 2022 • Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang

Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires, and supports much more detailed information in text generation.

Text Generation Video Captioning

Paper
Add Code

Dual-Level Decoupled Transformer for Video Captioning

no code implementations • 6 May 2022 • Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang

As for the latter, "couple" means treating the generation of visual semantic and syntax-related words equally.

Descriptive Sentence +1

Paper
Add Code

FastRE: Towards Fast Relation Extraction with Convolutional Encoder and Improved Cascade Binary Tagging Framework

1 code implementation • 5 May 2022 • Guozheng Li, Xu Chen, Peng Wang, Jiafeng Xie, Qiqing Luo

Recent work for extracting relations from texts has achieved excellent performance.

Language Modelling Relation +2

Paper
Code

CapOnImage: Context-driven Dense-Captioning on Image

no code implementations • 27 Apr 2022 • Yiqi Gao, Xinglin Hou, Yuanmeng Zhang, Tiezheng Ge, Yuning Jiang, Peng Wang

Existing image captioning systems are dedicated to generating narrative captions for images, which are spatially detached from the image in presentation.

Dense Captioning Image Captioning

Paper
Add Code

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

1 code implementation • CVPR 2022 • Caiyuan Zheng, Hui Li, Seon-Min Rhee, Seungju Han, Jae-Joon Han, Peng Wang

A robust consistency regularization based semi-supervised framework is proposed for STR, which can effectively solve the instability issue due to domain inconsistency between synthetic and real images.

Scene Text Recognition

Paper
Code

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization

no code implementations • 12 Apr 2022 • Xueqing Deng, Dawei Sun, Shawn Newsam, Peng Wang

Specifically, given a pair of student and teacher networks, DistPro first sets up a rich set of KD connections from the transmitting layers of the teacher to the receiving layers of the student; meanwhile, various transforms are also proposed for comparing feature maps along each pathway for the distillation.

Knowledge Distillation Meta-Learning

Paper
Add Code

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

1 code implementation • CVPR 2022 • Xueqing Deng, Peng Wang, Xiaochen Lian, Shawn Newsam

Notably, NightLab contains models at two levels of granularity, i.e., image and regional, and each level is composed of light adaptation and segmentation modules.

Segmentation Self-Driving Cars +1

Paper
Code

Self-Contrastive Learning based Semi-Supervised Radio Modulation Classification

no code implementations • 29 Mar 2022 • Dongxin Liu, Peng Wang, Tianshi Wang, Tarek Abdelzaher

This paper presents a semi-supervised learning framework that is new in being designed for automatic modulation classification (AMC).

Classification Contrastive Learning

Paper
Add Code

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

1 code implementation • CVPR 2022 • Wei Dong, Junsheng Wu, Yi Luo, ZongYuan Ge, Peng Wang

In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing.

Node Classification Representation Learning

Paper
Code

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation

1 code implementation • CVPR 2022 • Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu

Pre-training has been adopted in a few recent works for Vision-and-Language Navigation (VLN).

Ranked #4 on Visual Navigation on R2R

Decision Making Language Modelling +3

Paper
Code

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

no code implementations • ACL 2022 • Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu

To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.

Descriptive Representation Learning +1

Paper
Add Code

Exact Community Recovery over Signed Graphs

no code implementations • 22 Feb 2022 • Xiaolu Wang, Peng Wang, Anthony Man-Cho So

Signed graphs encode similarity and dissimilarity relationships among different entities with positive and negative edges.

Stochastic Block Model

Paper
Add Code

Relation Regularized Scene Graph Generation

no code implementations • 22 Feb 2022 • Yuyu Guo, Lianli Gao, Jingkuan Song, Peng Wang, Nicu Sebe, Heng Tao Shen, Xuelong Li

Inspired by this observation, in this article, we propose a relation regularized network (R2-Net), which can predict whether there is a relationship between two objects and encode this relation into object feature refinement and better SGG.

Graph Classification Graph Generation +6

Paper
Add Code

Graph-based Extractive Explainer for Recommendations

no code implementations • 20 Feb 2022 • Peng Wang, Renqin Cai, Hongning Wang

Explanations in a recommender system assist users in making informed decisions among a set of recommended items.

Attribute Recommendation Systems +1

Paper
Add Code

Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos

no code implementations • 14 Feb 2022 • Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

For weakly supervised anomaly detection, most existing work is limited by inadequate video representation, owing to its inability to model long-term contextual information.

Graph Learning Supervised Anomaly Detection +1

Paper
Add Code

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

4 code implementations • 7 Feb 2022 • Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang

In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization.

Ranked #1 on Visual Question Answering on VQA v2 test-std (yes/no metric)

Image Captioning Language Modelling +11

6,243

Paper
Code

A Wearable ECG Monitor for Deep Learning Based Real-Time Cardiovascular Disease Detection

no code implementations • 25 Jan 2022 • Peng Wang, Zihuai Lin, Xucun Yan, Zijiao Chen, Ming Ding, Yang Song, Lu Meng

Cardiovascular disease has become one of the most significant threats endangering human life and health.

ECG Classification

Paper
Add Code

Negative-ResNet: Noisy Ambulatory Electrocardiogram Signal Classification Scheme

no code implementations • 25 Jan 2022 • Zijiao Chen, Zihuai Lin, Peng Wang, Ming Ding

With recently successful applications of deep learning in computer vision and general signal processing, deep learning has shown many unique advantages in medical signal processing.

Classification

Paper
Add Code

Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings

1 code implementation • 14 Jan 2022 • Peng Wang, Xin Xie, Xiaohan Wang, Ningyu Zhang

Previous knowledge graph embedding approaches usually map entities to representations and utilize score functions to predict the target entities, yet they typically struggle to reason about rare or emerging unseen entities.

Ranked #1 on Link Prediction on FB15k-237-ind

Knowledge Graph Embedding Knowledge Graph Embeddings +2

Paper
Code

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

no code implementations • 11 Jan 2022 • Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, Sirui Hong, Frank Hutter, Rongrong Ji, Julio C. S. Jacques Junior, Ge Li, Marius Lindauer, Zhipeng Luo, Meysam Madadi, Thomas Nierhoff, Kangning Niu, Chunguang Pan, Danny Stoll, Sebastien Treguer, Jin Wang, Peng Wang, Chenglin Wu, Youcheng Xiong, Arber Zela, Yang Zhang

Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly.

Management Meta-Learning +4

Paper
Add Code

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

1 code implementation • 10 Jan 2022 • Ningyu Zhang, Xin Xu, Liankuan Tao, Haiyang Yu, Hongbin Ye, Shuofei Qiao, Xin Xie, Xiang Chen, Zhoubo Li, Lei LI, Xiaozhuan Liang, Yunzhi Yao, Shumin Deng, Peng Wang, Wen Zhang, Zhenru Zhang, Chuanqi Tan, Qiang Chen, Feiyu Xiong, Fei Huang, Guozhou Zheng, Huajun Chen

We present an open-source and extensible knowledge extraction toolkit DeepKE, supporting complicated low-resource, document-level and multimodal scenarios in the knowledge base population.

Attribute Attribute Extraction +5

3,052

Paper
Code

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

1 code implementation • CVPR 2022 • Jingzhou Chen, Peng Wang, Jian Liu, Yuntao Qian

Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from coarse-to-fine levels.

Fine-Grained Image Classification Relation

Paper
Code

Multi-Domain Joint Training for Person Re-Identification

no code implementations • 6 Jan 2022 • Lu Yang, Lingqiao Liu, Yunlong Wang, Peng Wang, Yanning Zhang

Our discovery is that training with such an adaptive model can better benefit from more training samples.

Person Re-Identification

Paper
Add Code

Robust Security Analysis Based on Random Geometry Theory for Satellite-Terrestrial-Vehicle Network

no code implementations • 28 Dec 2021 • Xudong Li, Ye Fan, Rugui Yao, Peng Wang, Nan Qi, Xiaoya Zuo

Driven by B5G and 6G technologies, multi-network fusion is an indispensable tendency for future communications.

Paper
Add Code

DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings

1 code implementation • 10 Dec 2021 • Chaochen Gao, Xing Wu, Peng Wang, Jue Wang, Liangjun Zang, Zhongyuan Wang, Songlin Hu

To tackle that, we propose an effective knowledge distillation framework for contrastive sentence embeddings, termed DistilCSE.

Contrastive Learning Knowledge Distillation +5

100

Paper
Code

Contrast-reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition

no code implementations • 22 Nov 2021 • Peng Wang, Jun Wen, Chenyang Si, Yuntao Qian, Liang Wang

Finally, in the Information Fuser, we explore varied strategies to combine the Sequence Reconstructor and Contrastive Motion Learner, and propose to capture postures and motions simultaneously via a knowledge-distillation based fusion strategy that transfers the motion learning from the Contrastive Motion Learner to the Sequence Reconstructor.

Action Recognition Contrastive Learning +4

Paper
Add Code

Spatial-Interference Aware Cooperative Resource Allocation for 5G NR Sidelink Communications

no code implementations • 15 Nov 2021 • Silvia Mura, Francesco Linsalata, Marouan Mizmizi, Maurizio Magarini, Majid Nasiri Khormuji, Peng Wang, Alberto Perotti, Umberto Spagnolini

Distributed resource allocation (RA) schemes have been introduced in cellular vehicle-to-everything (C-V2X) standard for vehicle-to-vehicle (V2V) sidelink (SL) communications to share the limited spectrum (sub-6GHz) efficiently.

Paper
Add Code

LoS-Map Construction for Proactive Relay of Opportunity Selection in 6G V2X Systems

no code implementations • 15 Nov 2021 • Francesco Linsalata, Silvia Mura, Marouan Mizmizi, Maurizio Magarini, Peng Wang, Majid Nasiri Khormuji, Alberto Perotti, Umberto Spagnolini

Recent advances in Vehicle-to-Everything (V2X) technology and the upcoming sixth-generation (6G) network will dawn a new era for vehicular services with enhanced communication capabilities.

Autonomous Vehicles

Paper
Add Code

NAS-FCOS: Efficient Search for Object Detection Architectures

1 code implementation • 24 Oct 2021 • Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang

Neural Architecture Search (NAS) has shown great potential in effectively reducing manual effort in network design by automatically discovering optimal architectures.

Neural Architecture Search Object +2

187

Paper
Code

StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis

1 code implementation • ICLR 2022 • Jiatao Gu, Lingjie Liu, Peng Wang, Christian Theobalt

We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue.

Image Generation

956

Paper
Code

A deep learning pipeline for localization, differentiation, and uncertainty estimation of liver lesions using multi-phasic and multi-sequence MRI

no code implementations • 17 Oct 2021 • Peng Wang, YuHsuan Wu, Bolin Lai, Xiao-Yun Zhou, Le Lu, Wendi Liu, Huabang Zhou, Lingyun Huang, Jing Xiao, Adam P. Harrison, Ningyang Jia, Heping Hu

Results: the proposed CAD solution achieves a mean F1 score of 0.62, outperforming the abdominal radiologist (0.47), matching the junior hepatology radiologist (0.61), and underperforming the senior hepatology radiologist (0.68).

Specificity

Paper
Add Code

Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer

1 code implementation • CVPR 2022 • Fushun Zhu, Shan Zhao, Peng Wang, Hao Wang, Hua Yan, Shuaicheng Liu

We propose a semi-supervised network for wide-angle portraits correction.

Paper
Code

Space-and-time-synchronized simultaneous vehicle tracking/formation using cascaded prescribed-time control

no code implementations • 11 Sep 2021 • Peng Wang, Ziyin Chen, Xiaobing Zhang

In this paper, we present a space-and-time-synchronized control method with application to the simultaneous tracking/formation.

Paper
Add Code

Continual Neural Mapping: Learning An Implicit Scene Representation from Sequential Observations

no code implementations • ICCV 2021 • Zike Yan, Yuxin Tian, Xuesong Shi, Ping Guo, Peng Wang, Hongbin Zha

We introduce an experience replay approach to tackle an exemplary task of continual neural mapping: approximating a continuous signed distance function (SDF) from sequential depth images as a scene geometry representation.

Continual Learning

Paper
Add Code

Simultaneous Semantic and Collision Learning for 6-DoF Grasp Pose Estimation

no code implementations • 5 Aug 2021 • Yiming Li, Tao Kong, Ruihang Chu, Yifeng Li, Peng Wang, Lei LI

In a unified framework, we jointly predict the feasible 6-DoF grasp poses, instance semantic segmentation, and collision information.

Multi-Task Learning Pose Estimation +1

Paper
Add Code

Neural Rays for Occlusion-aware Image-based Rendering

1 code implementation • CVPR 2022 • YuAn Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang

On such a 3D point, these generalization methods will include inconsistent image features from invisible views, which interfere with the radiance field construction.

Neural Rendering Novel View Synthesis +1

401

Paper
Code

ClueReader: Heterogeneous Graph Attention Network for Multi-hop Machine Reading Comprehension

no code implementations • 2 Jul 2021 • Peng Gao, Feng Gao, Peng Wang, Jian-Cheng Ni, Fei Wang, Hamido Fujita

Multi-hop machine reading comprehension is a challenging task in natural language processing as it requires more reasoning ability across multiple documents.

Graph Attention Machine Reading Comprehension

Paper
Add Code

AdaXpert: Adapting Neural Architecture for Growing Data

1 code implementation • 1 Jul 2021 • Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data.

Paper
Code

Investigation of Bare-bones Algorithms from Quantum Perspective: A Quantum Dynamical Global Optimizer

no code implementations • 26 Jun 2021 • Peng Wang, Gang Xin, Fang Wang

Correspondingly, the basic search behaviour is derived, which constitutes the basic iterative process of a simple optimization system.

Paper
Add Code

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

6 code implementations • NeurIPS 2021 • Peng Wang, Lingjie Liu, YuAn Liu, Christian Theobalt, Taku Komura, Wenping Wang

In NeuS, we propose to represent a surface as the zero-level set of a signed distance function (SDF) and develop a new volume rendering method to train a neural SDF representation.

Novel View Synthesis Surface Reconstruction

1,505

Paper
Code

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

1 code implementation • CVPR 2021 • Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

Last, we proposed an efficient fine-grained search strategy to train HR-NAS, which effectively explores the search space, and finds optimal architectures given various tasks and computation resources.

Image Classification Neural Architecture Search +3

138

Paper
Code

Fastening the Initial Access in 5G NR Sidelink for 6G V2X Networks

no code implementations • 10 Jun 2021 • Marouan Mizmizi, Francesco Linsalata, Mattia Brambilla, Filippo Morandi, Kai Dong, Maurizio Magarini, Monica Nicoli, Majid Nasiri Khormuji, Peng Wang, Renaud Alexandre Pitaval, Umberto Spagnolini

The ever-increasing demand for intelligent, automated, and connected mobility solutions pushes for the development of an innovative sixth Generation (6G) of cellular networks.

Position Quantization

Paper
Add Code

Generative Adversarial Networks: A Survey Towards Private and Secure Applications

no code implementations • 7 Jun 2021 • Zhipeng Cai, Zuobin Xiong, Honghui Xu, Peng Wang, Wei Li, Yi Pan

Generative Adversarial Networks (GAN) have promoted a variety of applications in computer vision, natural language processing, etc.

Paper
Add Code

Sketch and Refine: Towards Faithful and Informative Table-to-Text Generation

no code implementations • Findings (ACL) 2021 • Peng Wang, Junyang Lin, An Yang, Chang Zhou, Yichang Zhang, Jingren Zhou, Hongxia Yang

Experimental results demonstrate that our method outperforms the previous state-of-the-art methods in both automatic and human evaluation, especially on coverage and faithfulness.

Descriptive Table-to-Text Generation

Paper
Add Code

Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention

no code implementations • 5 May 2021 • Wei Suo, Mengyang Sun, Peng Wang, Qi Wu

Referring Expression Comprehension (REC) has become one of the most important tasks in visual reasoning, since it is an essential step for many vision-and-language tasks such as visual question answering.

Question Answering Referring Expression +3

Paper
Add Code

CAT: Cross-Attention Transformer for One-Shot Object Detection

no code implementations • 30 Apr 2021 • Weidong Lin, Yuyan Deng, Yang Gao, Ning Wang, Jinghao Zhou, Lingqiao Liu, Lei Zhang, Peng Wang

Given a query patch from a novel class, one-shot object detection aims to detect all instances of that class in a target image through the semantic similarity comparison.

Object object-detection +3

Paper
Add Code

Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads

no code implementations • 30 Apr 2021 • Chenyu Gao, Qi Zhu, Peng Wang, Qi Wu

Based on this observation, we design a dynamic chopping module that can automatically remove heads and layers of the VisualBERT at an instance level when dealing with different questions.

Question Answering Visual Question Answering +1

Paper
Add Code

Center Prediction Loss for Re-identification

no code implementations • 30 Apr 2021 • Lu Yang, Yunlong Wang, Lingqiao Liu, Peng Wang, Lu Chi, Zehuan Yuan, Changhu Wang, Yanning Zhang

In this paper, we propose a new loss based on center predictivity, that is, a sample must be positioned in a location of the feature space such that from it we can roughly predict the location of the center of same-class samples.

Paper
Add Code

PURE: Passive mUlti-peRson idEntification via Deep Footstep Separation and Recognition

no code implementations • 15 Apr 2021 • Chao Cai, Ruinan Jin, Peng Wang, Liyuan Ye, Hongbo Jiang, Jun Luo

Recently, passive behavioral biometrics (e.g., gesture or footstep) have become promising complements to conventional user identification methods (e.g., face or fingerprint) under special situations, yet existing sensing technologies require lengthy measurement traces and cannot identify multiple users at the same time.

Person Identification

Paper
Add Code

Residual Gaussian Process: A Tractable Nonparametric Bayesian Emulator for Multi-fidelity Simulations

no code implementations • 8 Apr 2021 • Wei W. Xing, Akeel A. Shah, Peng Wang, Shandian Zhe, Qian Fu, Robert M. Kirby

The resulting model is equipped with a closed-form solution for the predictive posterior, making it applicable to advanced, high-dimensional tasks that require uncertainty estimation.

Active Learning

Paper
Add Code

An Adversarial Human Pose Estimation Network Injected with Graph Structure

no code implementations • 29 Mar 2021 • Lei Tian, Guoqiang Liang, Peng Wang, Chunhua Shen

Because human keypoints in images can be made invisible by illumination, occlusion and overlap, most current human pose estimation methods are likely to produce unreasonable pose predictions.

Generative Adversarial Network Pose Estimation +1

Paper
Add Code

Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

no code implementations • CVPR 2021 • Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang

Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases.

Ranked #10 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Classification Contrastive Learning +4

Paper
Add Code

Hetero-Modal Learning and Expansive Consistency Constraints for Semi-Supervised Detection from Multi-Sequence Data

no code implementations • 24 Mar 2021 • Bolin Lai, YuHsuan Wu, Xiao-Yun Zhou, Peng Wang, Le Lu, Lingyun Huang, Mei Han, Jing Xiao, Heping Hu, Adam P. Harrison

Lesion detection serves a critical role in early diagnosis and has been well explored in recent years due to methodological advances and increased data availability.

Lesion Detection

Paper
Add Code

Real-Time Visual Object Tracking via Few-Shot Learning

no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Peng Wang, Peixia Li, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL).

Few-Shot Learning Object +2

Paper
Add Code

Higher Performance Visual Tracking with Dual-Modal Localization

no code implementations • 18 Mar 2021 • Jinghao Zhou, Bo Li, Lei Qiao, Peng Wang, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy.

regression Visual Object Tracking +1

Paper
Add Code

Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification

no code implementations • 9 Mar 2021 • Lu Yang, Hongbang Liu, Jinghao Zhou, Lingqiao Liu, Lei Zhang, Peng Wang, Yanning Zhang

Learning cross-view consistent feature representation is the key for accurate vehicle Re-identification (ReID), since the visual appearance of vehicles changes significantly under different viewpoints.

Vehicle Re-Identification

Paper
Add Code

Instance and Pair-Aware Dynamic Networks for Re-Identification

no code implementations • 9 Mar 2021 • Bingliang Jiao, Xin Tan, Jinghao Zhou, Lu Yang, Yunlong Wang, Peng Wang

The proposed model is composed of three main branches where a self-guided dynamic branch is constructed to strengthen instance-specific features, focusing on every single image.

Paper
Add Code

Learning Graph Convolutional Networks for Multi-Label Recognition and Applications

1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2021 • ZhaoMin Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo

The task of multi-label image recognition is to predict a set of object labels that are present in an image.

Multi-Label Classification Multi-label Image Recognition with Partial Labels

Paper
Code

Scalable Learning With a Structural Recurrent Neural Network for Short-Term Traffic Prediction

1 code implementation • 3 Mar 2021 • Youngjoo Kim, Peng Wang, Lyudmila Mihaylova

With the real traffic speed data measured in the city of Santander, we demonstrate that the proposed SRNN outperforms the image-based approaches using the capsule network (CapsNet) by 14.1% and the convolutional neural network (CNN) by 5.87%, respectively, in terms of root mean squared error (RMSE).

Paper
Code

M6: A Chinese Multimodal Pretrainer

no code implementations • 1 Mar 2021 • Junyang Lin, Rui Men, An Yang, Chang Zhou, Ming Ding, Yichang Zhang, Peng Wang, Ang Wang, Le Jiang, Xianyan Jia, Jie Zhang, Jianwei Zhang, Xu Zou, Zhikang Li, Xiaodong Deng, Jie Liu, Jinbao Xue, Huiling Zhou, Jianxin Ma, Jin Yu, Yong Li, Wei Lin, Jingren Zhou, Jie Tang, Hongxia Yang

In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9TB images and 292GB texts that cover a wide range of domains.

Image Generation

Paper
Add Code

A Collaborative Visual SLAM Framework for Service Robots

no code implementations • 5 Feb 2021 • Ming Ouyang, Xuesong Shi, Yujie Wang, Yuxin Tian, Yingzhe Shen, Dawei Wang, Peng Wang, Zhiqiang Cao

We present a collaborative visual simultaneous localization and mapping (SLAM) framework for service robots.

Retrieval Simultaneous Localization and Mapping

Paper
Add Code

Derive Lovelock Gravity from String Theory in Cosmological Background

no code implementations • 24 Dec 2020 • Peng Wang, Houwen Wu, Haitang Yang, Shuxuan Ying

It was proved more than three decades ago that the first order $\alpha'$ correction of string effective theory could be written as the Gauss-Bonnet term, which is the quadratic term of Lovelock gravity.

High Energy Physics - Theory General Relativity and Quantum Cosmology High Energy Physics - Phenomenology

Paper
Add Code

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

1 code implementation • 15 Dec 2020 • Feixiang Lu, Zongdai Liu, Hui Miao, Peng Wang, Liangjun Zhang, Ruigang Yang, Dinesh Manocha, Bin Zhou

For autonomous driving, the dynamics and states of vehicle parts such as doors, the trunk, and the bonnet can provide meaningful semantic information and interaction states, which are essential to ensuring the safety of the self-driving vehicle.

Autonomous Driving Data Augmentation +3

Paper
Code

MeisterMorxrc at SemEval-2020 Task 9: Fine-Tune Bert and Multitask Learning for Sentiment Analysis of Code-Mixed Tweets

no code implementations • SEMEVAL 2020 • Qi Wu, Peng Wang, Chenghao Huang

Natural language processing (NLP) has been applied to various fields including text classification and sentiment analysis.

Sentiment Analysis text-classification +1

Paper
Add Code

Fully-Automated Liver Tumor Localization and Characterization from Multi-Phase MR Volumes Using Key-Slice ROI Parsing: A Physician-Inspired Approach

no code implementations • 13 Dec 2020 • Bolin Lai, YuHsuan Wu, Xiaoyu Bai, Xiao-Yun Zhou, Peng Wang, Jinzheng Cai, Yuankai Huo, Lingyun Huang, Yong Xia, Jing Xiao, Le Lu, Heping Hu, Adam Harrison

Using radiological scans to identify liver tumors is crucial for proper patient treatment.

Paper
Add Code
