Search Results for author: Peng Wang

Found 309 papers, 119 papers with code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation Findings (NAACL) 2022 Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.

Model Compression

A Nearly-Linear Time Algorithm for Exact Community Recovery in Stochastic Block Model

no code implementations ICML 2020 Peng Wang, Zirui Zhou, Anthony Man-Cho So

In this paper, we focus on the problem of exactly recovering the communities in a binary symmetric SBM, where a graph of $n$ vertices is partitioned into two equal-sized communities and the vertices are connected with probability $p = \alpha\log(n)/n$ within communities and $q = \beta\log(n)/n$ across communities for some $\alpha>\beta>0$.

Stochastic Block Model

Learning Social Graph for Inactive User Recommendation

1 code implementation 8 May 2024 Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

In this paper, we propose a novel social recommendation method called LSIR (Learning Social Graph for Inactive User Recommendation) that learns an optimal social graph structure for social recommendation, especially for inactive users.

Graph structure learning Recommendation Systems

Towards Continual Knowledge Graph Embedding via Incremental Distillation

1 code implementation 7 May 2024 Jiajun Liu, Wenjun Ke, Peng Wang, Ziyu Shang, Jinhua Gao, Guozheng Li, Ke Ji, Yanhe Liu

On the one hand, existing methods usually learn new triples in a random order, destroying the inner structure of new KGs.

Knowledge Graph Embedding

Depth Priors in Removal Neural Radiance Fields

no code implementations 1 May 2024 Zhihao Guo, Peng Wang

This paper proposes a new pipeline that leverages SpinNeRF and monocular depth estimation models like ZoeDepth to enhance NeRF's performance in complex object removal with improved efficiency.

3D Reconstruction Monocular Depth Estimation +2

Dual-Modal Prompting for Sketch-Based Image Retrieval

no code implementations 29 Apr 2024 Liying Gao, Bingliang Jiao, Peng Wang, Shizhou Zhang, Hanwang Zhang, Yanning Zhang

In this study, we aim to tackle two major challenges of this task simultaneously: i) zero-shot, dealing with unseen categories, and ii) fine-grained, referring to intra-category instance-level retrieval.

Retrieval Sketch-Based Image Retrieval

Meta In-Context Learning Makes Large Language Models Better Zero and Few-Shot Relation Extractors

no code implementations 27 Apr 2024 Guozheng Li, Peng Wang, Jiajun Liu, Yikai Guo, Ke Ji, Ziyu Shang, Zijie Xu

To this end, we introduce Micre (Meta In-Context learning of LLMs for Relation Extraction), a new meta-training framework for zero and few-shot RE where an LLM is tuned to do ICL on a diverse collection of RE datasets (i.e., learning to learn in context for RE).

Few-Shot Learning In-Context Learning +2

Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction

no code implementations 27 Apr 2024 Guozheng Li, Peng Wang, Wenjun Ke, Yikai Guo, Ke Ji, Ziyu Shang, Jiajun Liu, Zijie Xu

On the one hand, retrieving good demonstrations is a non-trivial process in RE, which easily results in low relevance regarding entities and relations.

In-Context Learning Language Modelling +4

Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation

no code implementations 26 Apr 2024 SeungWook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang

Using images as prompts for 3D generation demonstrates particularly strong performance compared to using text prompts alone, for images provide more intuitive guidance for the 3D generation process.

3D Generation

Enhancing Prompt Following with Visual Control Through Training-Free Mask-Guided Diffusion

no code implementations 23 Apr 2024 Hongyu Chen, Yiqi Gao, Min Zhou, Peng Wang, Xubin Li, Tiezheng Ge, Bo Zheng

Meanwhile, a network, dubbed as Masked ControlNet, is designed to utilize these object masks for object generation in the misaligned visual control region.

Attribute Object

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

no code implementations 16 Apr 2024 SeungWook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang

Leveraging multi-view diffusion models as priors for 3D optimization has alleviated the problem of 3D consistency, e.g., the Janus face problem or the content drift problem, in zero-shot text-to-3D models.

Common Sense Reasoning Text to 3D

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

no code implementations 15 Apr 2024 Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie

This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200,000 edits.


COCONut: Modernizing COCO Segmentation

no code implementations 12 Apr 2024 Xueqing Deng, Qihang Yu, Peng Wang, Xiaohui Shen, Liang-Chieh Chen

By enhancing the annotation quality and expanding the dataset to encompass 383K images with more than 5.18M panoptic masks, we introduce COCONut, the COCO Next Universal segmenTation dataset.

Panoptic Segmentation Segmentation +1

Self-Explainable Affordance Learning with Embodied Caption

no code implementations 8 Apr 2024 Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc van Gool

In the field of visual affordance learning, previous methods mainly used abundant images or videos that delineate human behavior patterns to identify action possibility regions for object manipulation, with a variety of applications in robotic tasks.

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

1 code implementation 28 Mar 2024 Wei Dong, Xing Zhang, Bihui Chen, Dawei Yan, Zhijun Lin, Qingsen Yan, Peng Wang, Yang Yang

Parameter-efficient fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks by learning a minimal set of new adaptation parameters while preserving the frozen majority of pre-trained parameters.

Image Classification

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

1 code implementation 18 Mar 2024 Lingzhe Zhao, Peng Wang, Peidong Liu

In this paper, we introduce a novel approach, named BAD-Gaussians (Bundle Adjusted Deblur Gaussian Splatting), which leverages explicit Gaussian representation and handles severe motion-blurred images with inaccurate camera poses to achieve high-quality scene reconstruction.

3D Scene Reconstruction Deblurring +2

CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning

1 code implementation 15 Mar 2024 Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang

Large pre-trained VLMs like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in CL, but they focus on closed-set CL in a single domain dataset.

Class Incremental Learning Incremental Learning +1

Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning

no code implementations 15 Mar 2024 Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Heng Tao Shen

Aligning these distributions between corresponding regions from different tasks imparts higher flexibility and capacity to capture intra-region structures, accommodating a broader range of tasks.

Depth Estimation Semantic Segmentation +1

Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation

no code implementations 6 Mar 2024 Lu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang

Semi-supervised learning is a sound measure to relieve the strict demand of abundant annotated datasets, especially for challenging multi-organ segmentation.

Contrastive Learning Organ Segmentation

Vision-Language Navigation with Embodied Intelligence: A Survey

no code implementations 22 Feb 2024 Peng Gao, Peng Wang, Feng Gao, Fei Wang, Ruyue Yuan

As a long-term vision in the field of artificial intelligence, the core goal of embodied intelligence is to improve the perception, understanding, and interaction capabilities of agents and the environment.

Vision-Language Navigation

Unlocking Instructive In-Context Learning with Tabular Prompting for Relational Triple Extraction

no code implementations 21 Feb 2024 Guozheng Li, Wenjun Ke, Peng Wang, Zijie Xu, Ke Ji, Jiajun Liu, Ziyu Shang, Qiqing Luo

The in-context learning (ICL) for relational triple extraction (RTE) has achieved promising performance, but still encounters two key challenges: (1) how to design effective prompts and (2) how to select proper demonstrations.

Blocking In-Context Learning +1

TransGPT: Multi-modal Generative Pre-trained Transformer for Transportation

no code implementations 11 Feb 2024 Peng Wang, Xiang Wei, Fangxu Hu, Wenjuan Han

TransGPT-MM is finetuned on a multi-modal Transportation dataset (MTD) that we manually collected from three areas of the transportation domain: driving tests, traffic signs, and landmarks.

Language Modelling Large Language Model

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

1 code implementation 11 Feb 2024 Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks.

Graph Question Answering Instruction Following +4

Image Fusion via Vision-Language Model

no code implementations 3 Feb 2024 Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool

Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information in different source images to guide image fusion.

Decoder Language Modelling

LLM A*: Human in the Loop Large Language Models Enabled A* Search for Robotics

no code implementations 4 Dec 2023 Hengjia Xiao, Peng Wang

This makes the whole path planning process a 'white box' and human feedback guides LLM A* to converge quickly compared to other data-driven methods such as reinforcement learning-based (RL) path planning.

ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation

no code implementations 2 Dec 2023 Peng Wang, Yichun Shi

We introduce "ImageDream," an innovative image-prompt, multi-view diffusion model for 3D object generation.

3D Generation Object

Matching Weak Informative Ontologies

1 code implementation 1 Dec 2023 Peng Wang

In this paper, these ontologies are named weak informative ontologies (WIOs), and it is challenging for existing methods to match WIOs.

Ontology Matching

Continual Referring Expression Comprehension via Dual Modular Memorization

1 code implementation 25 Nov 2023 Heng Tao Shen, Cheng Chen, Peng Wang, Lianli Gao, Meng Wang, Jingkuan Song

In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC, where a model is learning on a stream of incoming tasks.

Memorization Referring Expression +1

Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval

1 code implementation 21 Nov 2023 Xiu-Shen Wei, Yang shen, Xuhao Sun, Peng Wang, Yuxin Peng

Our work focuses on tackling large-scale fine-grained image retrieval as ranking the images depicting the concept of interests (i.e., the same sub-category labels) highest based on the fine-grained details in the query.

Attribute Deep Hashing +2

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

no code implementations 20 Nov 2023 Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU.

3D Reconstruction Image to 3D +1

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

no code implementations 15 Nov 2023 Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang

We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.

3D Generation Denoising +2

Open-Vocabulary Video Anomaly Detection

no code implementations 13 Nov 2023 Peng Wu, Xuerong Zhou, Guansong Pang, Yujia Sun, Jing Liu, Peng Wang, Yanning Zhang

Particularly, we devise a semantic knowledge injection module to introduce semantic knowledge from large language models for the detection task, and design a novel anomaly synthesis module to generate pseudo unseen anomaly videos with the help of large vision generation models for the classification task.

Anomaly Detection Video Anomaly Detection

SCL-VI: Self-supervised Context Learning for Visual Inspection of Industrial Defects

1 code implementation 11 Nov 2023 Peng Wang, Haiming Yao, Wenyong Yu

Current unsupervised models struggle to strike a balance between detecting texture and object defects, lacking the capacity to discern latent representations and intricate features.

Self-Supervised Learning

Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

1 code implementation 6 Nov 2023 Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, Qing Qu

To the best of our knowledge, this is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks.

Feature Compression Multi-class Classification +2

PERF: Panoramic Neural Radiance Field from a Single Panorama

1 code implementation 25 Oct 2023 Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu

In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.

Novel View Synthesis Text to 3D

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

no code implementations 24 Oct 2023 Yinjie Lei, Zixuan Wang, Feng Chen, Guoqing Wang, Peng Wang, Yang Yang

Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction.

Autonomous Driving Scene Understanding

Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

1 code implementation NeurIPS 2023 Wei Dong, Dawei Yan, Zhijun Lin, Peng Wang

Consequently, effectively adapting large pre-trained models to downstream tasks in an efficient manner has become a prominent research area.

Image Classification Transfer Learning

Generalized Neural Collapse for a Large Number of Classes

no code implementations 9 Oct 2023 Jiachen Jiang, Jinxin Zhou, Peng Wang, Qing Qu, Dustin Mixon, Chong You, Zhihui Zhu

However, most of the existing empirical and theoretical studies in neural collapse focus on the case that the number of classes is small relative to the dimension of the feature space.

Face Recognition Retrieval

The Emergence of Reproducibility and Consistency in Diffusion Models

no code implementations 8 Oct 2023 Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu

In this work, we investigate an intriguing and prevalent phenomenon of diffusion models which we term as "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs.

Image Generation Memorization

Revisiting Large Language Models as Zero-shot Relation Extractors

no code implementations 8 Oct 2023 Guozheng Li, Peng Wang, Wenjun Ke

On the one hand, we analyze the drawbacks of existing RE prompts and attempt to incorporate recent prompt techniques such as chain-of-thought (CoT) to improve zero-shot RE.

Question Answering Relation +1

Human-centric Behavior Description in Videos: New Benchmark and Model

no code implementations 4 Oct 2023 Lingru Zhou, Yiqi Gao, Manqing Zhang, Peng Wu, Peng Wang, Yanning Zhang

To address this challenge, we construct a human-centric video surveillance captioning dataset, which provides detailed descriptions of the dynamic behaviors of 7,820 individuals.

Video Captioning

Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models

no code implementations 4 Oct 2023 Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang

Specifically, we decompose the NVS task into two stages: (i) transforming observed regions to a novel view, and (ii) hallucinating unseen regions.

Image to 3D Novel View Synthesis

USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields

1 code implementation 4 Oct 2023 Moyang Li, Peng Wang, Lingzhe Zhao, Bangyan Liao, Peidong Liu

USB-NeRF is able to correct rolling shutter distortions and recover accurate camera motion trajectory simultaneously under the framework of NeRF, by modeling the physical image formation process of a RS camera.

Image Generation Motion Estimation +2

Selective Feature Adapter for Dense Vision Transformers

no code implementations 3 Oct 2023 Xueqing Deng, Qi Fan, Xiaojie Jin, Linjie Yang, Peng Wang

Specifically, SFA consists of external adapters and internal adapters which are sequentially operated over a transformer model.

Depth Estimation

MMPI: a Flexible Radiance Field Representation by Multiple Multi-plane Images Blending

no code implementations 30 Sep 2023 Yuze He, Peng Wang, Yubin Hu, Wang Zhao, Ran Yi, Yong-Jin Liu, Wenping Wang

In this paper, we explore the potential of MPI and show that MPI can synthesize high-quality novel views of complex scenes with diverse camera distributions and view directions, which are not only limited to simple forward-facing scenes.

Autonomous Driving Novel View Synthesis

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

no code implementations 14 Sep 2023 Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen

In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in recent years, named entity recognition is still challenging but critical for semantic understanding.

Language Modelling named-entity-recognition +3

MVDream: Multi-view Diffusion for 3D Generation

2 code implementations 31 Aug 2023 Yichun Shi, Peng Wang, Jianglong Ye, Mai Long, Kejie Li, Xiao Yang

We introduce MVDream, a diffusion model that is able to generate consistent multi-view images from a given text prompt.

3D Generation

TouchStone: Evaluating Vision-Language Models by Language Models

1 code implementation 31 Aug 2023 Shuai Bai, Shusheng Yang, Jinze Bai, Peng Wang, Xingxuan Zhang, Junyang Lin, Xinggang Wang, Chang Zhou, Jingren Zhou

Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting visual receptor with large language models (LLMs).

Visual Storytelling

Ground-to-Aerial Person Search: Benchmark Dataset and Approach

1 code implementation 24 Aug 2023 Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang

In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras.

Knowledge Distillation Person Search

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

1 code implementation 22 Aug 2023 Peng Wu, Xuerong Zhou, Guansong Pang, Lingru Zhou, Qingsen Yan, Peng Wang, Yanning Zhang

With the benefit of dual branch, VadCLIP achieves both coarse-grained and fine-grained video anomaly detection by transferring pre-trained knowledge from CLIP to WSVAD task.

Anomaly Detection Binary Classification +1

Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction

1 code implementation 20 Aug 2023 Zeyu Han, YuHan Wang, Luping Zhou, Peng Wang, Binyu Yan, Jiliu Zhou, Yan Wang, Dinggang Shen

To obtain high-quality positron emission tomography (PET) scans while reducing radiation exposure to the human body, various approaches have been proposed to reconstruct standard-dose PET (SPET) images from low-dose PET (LPET) images.

Polymerized Feature-based Domain Adaptation for Cervical Cancer Dose Map Prediction

no code implementations 20 Aug 2023 Jie Zeng, Zeyu Han, Xingchen Peng, Jianghong Xiao, Peng Wang, Yan Wang

Recently, deep learning (DL) has automated and accelerated the clinical radiation therapy (RT) planning significantly by predicting accurate dose maps.

Domain Adaptation

Pre-training with Large Language Model-based Document Expansion for Dense Passage Retrieval

no code implementations 16 Aug 2023 Guangyuan Ma, Xing Wu, Peng Wang, Zijia Lin, Songlin Hu

Concretely, we leverage the capabilities of LLMs for document expansion, i.e., query generation, and effectively transfer expanded knowledge to retrievers using pre-training strategies tailored for passage retrieval.

Contrastive Learning Language Modelling +3

EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models

2 code implementations 14 Aug 2023 Peng Wang, Ningyu Zhang, Bozhong Tian, Zekun Xi, Yunzhi Yao, Ziwen Xu, Mengru Wang, Shengyu Mao, Xiaohan Wang, Siyuan Cheng, Kangwei Liu, Yuansheng Ni, Guozhou Zheng, Huajun Chen

Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data.

knowledge editing

AerialVLN: Vision-and-Language Navigation for UAVs

1 code implementation ICCV 2023 Shubo Liu, Hongsheng Zhang, Yuankai Qi, Peng Wang, Yaning Zhang, Qi Wu

Navigating in the sky is more complicated than on the ground because agents need to consider the flying height and more complex spatial relationship reasoning.

Navigate Vision and Language Navigation

TriDo-Former: A Triple-Domain Transformer for Direct PET Reconstruction from Low-Dose Sinograms

no code implementations 10 Aug 2023 Jiaqi Cui, Pinxian Zeng, Xinyi Zeng, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang, Dinggang Shen

Specifically, the TriDo-Former consists of two cascaded networks, i.e., a sinogram enhancement transformer (SE-Former) for denoising the input LPET sinograms and a spatial-spectral reconstruction transformer (SSR-Former) for reconstructing SPET images from the denoised sinograms.

Denoising Image Reconstruction +1

A Survey on Deep Learning-based Spatio-temporal Action Detection

no code implementations 3 Aug 2023 Peng Wang, Fanwei Zeng, Yuntao Qian

Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time.

Action Detection Autonomous Driving

Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition

1 code implementation 25 Jul 2023 Cheng Da, Peng Wang, Cong Yao

Specifically, MGP-STR achieves an average recognition accuracy of $94\%$ on standard benchmarks for scene text recognition.

Language Modelling Optical Character Recognition (OCR) +1

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 code implementation 24 Jul 2023 Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

In this context, we propose a novel task called Video Anomaly Retrieval (VAR), which aims to pragmatically retrieve relevant anomalous videos by cross-modalities, e.g., language descriptions and synchronous audios.

Anomaly Detection Retrieval +2

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

no code implementations 20 Jul 2023 Yinghui Xing, Dexuan Kong, Shizhou Zhang, Geng Chen, Lingyan Ran, Peng Wang, Yanning Zhang

Camouflaged object detection (COD), aiming to segment camouflaged objects which exhibit similar patterns with the background, is a challenging task.

Multi-Task Learning object-detection +1

Watch out Venomous Snake Species: A Solution to SnakeCLEF2023

1 code implementation 19 Jul 2023 Feiran Hu, Peng Wang, Yangyang Li, Chenlong Duan, Zijian Zhu, Fei Wang, Faen Zhang, Yong Li, Xiu-Shen Wei

The SnakeCLEF2023 competition aims at the development of advanced algorithms for snake species identification through the analysis of images and accompanying metadata.

Data Augmentation

DiffDP: Radiotherapy Dose Prediction via a Diffusion Model

no code implementations 19 Jul 2023 Zhenghao Feng, Lu Wen, Peng Wang, Binyu Yan, Xi Wu, Jiliu Zhou, Yan Wang

To alleviate this limitation, we innovatively introduce a diffusion-based dose prediction (DiffDP) model for predicting the radiotherapy dose distribution of cancer patients.


6G Network Business Support System

no code implementations 19 Jul 2023 Ye Ouyang, Yaqin Zhang, Peng Wang, Yunxin Liu, Wen Qiao, Jun Zhu, Yang Liu, Feng Zhang, Shuling Wang, Xidong Wang

6G is the next-generation intelligent and integrated digital information infrastructure, characterized by ubiquitous interconnection, native intelligence, multi-dimensional perception, global coverage, green and low-carbon, native network security, etc.

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

no code implementations 14 Jul 2023 Xiaorui Zhu, Yichen Qin, Peng Wang

A critical question remains unsettled: is it possible to embed the inference of the model into the simultaneous inference of the coefficients, and if so, how?

Model Selection

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

1 code implementation NeurIPS 2023 Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa

This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth maps and poses).

Image Generation

Fast and Automatic 3D Modeling of Antenna Structure Using CNN-LSTM Network for Efficient Data Generation

no code implementations 27 Jun 2023 Zhaohui Wei, Zhao Zhou, Peng Wang, Jian Ren, Yingzeng Yin, Gert Frølund Pedersen, Ming Shen

In this study, we proposed a deep learning-assisted and image-based intelligent modeling approach for accelerating the data acquisition of antenna samples with different physical structures.

Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning

1 code implementation 1 Jun 2023 Shengqin Jiang, Yaoyu Fang, Haokui Zhang, Qingshan Liu, Yuankai Qi, Yang Yang, Peng Wang

Rehearsal-based video incremental learning often employs knowledge distillation to mitigate catastrophic forgetting of previously learned data.

Incremental Learning Knowledge Distillation +1

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

1 code implementation 1 Jun 2023 Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Second, it allows us to better understand deep representation learning by elucidating the linear progressive separation and concentration of representations from shallow to deep layers.

Representation Learning

Continuous and Noninvasive Measurement of Arterial Pulse Pressure and Pressure Waveform using an Image-free Ultrasound System

no code implementations 29 May 2023 Lirui Xu, Pang Wu, Pan Xia, Fanglin Geng, Peng Wang, Xianxiang Chen, Zhenfeng Li, Lidong Du, Shuping Liu, Li Li, Hongbo Chang, Zhen Fang

In in vitro cardiovascular phantom experiments, the results demonstrated high accuracy in the measurement of PP (error < 3 mmHg) and blood pressure waveform (root-mean-square error (RMSE) < 2 mmHg, correlation coefficient (r) > 0.99).

Learning Conditional Attributes for Compositional Zero-Shot Learning

1 code implementation CVPR 2023 Qingsheng Wang, Lingqiao Liu, Chenchen Jing, Hao Chen, Guoqiang Liang, Peng Wang, Chunhua Shen

Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts based on learned concepts such as attribute-object combinations.

Attribute Compositional Zero-Shot Learning

NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images

1 code implementation 27 May 2023 YuAn Liu, Peng Wang, Cheng Lin, Xiaoxiao Long, Jiepeng Wang, Lingjie Liu, Taku Komura, Wenping Wang

We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.

Neural Rendering Object

A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation

no code implementations CVPR 2023 Congqi Cao, Yue Lu, Peng Wang, Yanning Zhang

At present, it is the largest semi-supervised VAD dataset with the largest number of scenes and classes of anomalies, the longest duration, and the only one considering the scene-dependent anomaly.

Anomaly Detection Video Anomaly Detection

Editing Large Language Models: Problems, Methods, and Opportunities

2 code implementations 22 May 2023 Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang

Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.

Model Editing

MALM: Mask Augmentation based Local Matching for Food-Recipe Retrieval

1 code implementation 18 May 2023 Bhanu Prakash Voutharoja, Peng Wang, Lei Wang, Vivienne Guan

A de-facto idea to address this task is to learn a shared feature embedding space in which a food image is aligned better to its paired recipe than other recipes.

Image-text matching Retrieval +1

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

2 code implementations 18 May 2023 Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)

Action Classification AudioCaps +16

Knowledge Rumination for Pre-trained Language Models

1 code implementation 15 May 2023 Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs.

Language Modelling

Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration

1 code implementation SIGMOD/PODS 2023 Jianhong Tu, Ju Fan, Nan Tang, Peng Wang, Guoliang Li, Xiaoyong Du, Xiaofeng Jia, Song Gao

The widely used practice is to build task-specific or even dataset-specific solutions, which are hard to generalize and disable the opportunities of knowledge sharing that can be learned from different datasets and multiple tasks.

Entity Resolution Zero-Shot Learning

ViewFormer: View Set Attention for Multi-view 3D Shape Understanding

no code implementations 29 Apr 2023 Hongyu Sun, Yongcai Wang, Peng Wang, Xudong Cai, Deying Li

This paper presents ViewFormer, a simple yet effective model for multi-view 3D shape recognition and retrieval.

3D Shape Recognition 3D Shape Retrieval +1

Maximizing Model Generalization for Machine Condition Monitoring with Self-Supervised Learning and Federated Learning

no code implementations 27 Apr 2023 Matthew Russell, Peng Wang

Specifically, Self-Supervised Learning (SSL) with Barlow Twins may produce more discriminative features for monitoring health condition than supervised learning by focusing on semantic properties of the data.

Domain Adaptation Federated Learning +2

AirBirds: A Large-scale Challenging Dataset for Bird Strike Prevention in Real-world Airports

no code implementations 23 Apr 2023 Hongyu Sun, Yongcai Wang, Xudong Cai, Peng Wang, Zhe Huang, Deying Li, Yu Shao, Shuo Wang

To advance research and practical solutions for bird strike prevention, in this paper we present AirBirds, a large-scale challenging dataset that consists of 118,312 time-series images, in which a total of 409,967 bounding boxes of flying birds are manually and carefully annotated.

Time Series

CoT-MoTE: Exploring ConTextual Masked Auto-Encoder Pre-training with Mixture-of-Textual-Experts for Passage Retrieval

no code implementations 20 Apr 2023 Guangyuan Ma, Xing Wu, Peng Wang, Songlin Hu

Siamese or fully separated dual-encoders are often adopted as the basic retrieval architecture in the pre-training and fine-tuning stages for encoding queries and passages into their latent embedding spaces.

Passage Retrieval Retrieval

A geometry-aware deep network for depth estimation in monocular endoscopy

1 code implementation 20 Apr 2023 Yongming Yang, Shuwei Shao, Tao Yang, Peng Wang, Zhuo Yang, Chengdong Wu, Hao Liu

To address this issue, we introduce a gradient loss to penalize ambiguous edge fluctuations around stepped edge structures and a normal loss to explicitly express the sensitivity to frequently small structures, and propose a geometric consistency loss to spread the spatial information across the sample grids to constrain the global geometric anatomy structures.

3D Reconstruction Anatomy +1

CoT-MAE v2: Contextual Masked Auto-Encoder with Multi-view Modeling for Passage Retrieval

no code implementations 5 Apr 2023 Xing Wu, Guangyuan Ma, Peng Wang, Meng Lin, Zijia Lin, Fuzheng Zhang, Songlin Hu

As an effective representation bottleneck pretraining technique, the contextual masked auto-encoder utilizes contextual embedding to assist in the reconstruction of passages.

Passage Retrieval Retrieval +1

F$^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

1 code implementation 28 Mar 2023 Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Based on our analysis, we further propose a novel space-warping method called perspective warping, which allows us to handle arbitrary trajectories in the grid-based NeRF framework.

Novel View Synthesis

Joint Spectrum and Power Allocation for V2X Communications with Imperfect CSI

no code implementations 21 Feb 2023 Peng Wang, Weihua Wu, Jiayi Liu, Guanhua Chai, Li Feng

More specifically, Bernstein approximations are employed to convert the chance constraint into a calculable constraint, and a bisection search method is proposed to obtain the optimal allocation solution with low complexity.


Self-Supervised Node Representation Learning via Node-to-Neighbourhood Alignment

1 code implementation 9 Feb 2023 Wei Dong, Dawei Yan, Peng Wang

Considering the excessive memory overheads of contrastive learning, we further propose a negative-free solution, where the main contribution is a Graph Signal Decorrelation (GSD) constraint to avoid representation collapse and over-smoothing.

Contrastive Learning Node Classification +1

Delving Deep into Simplicity Bias for Long-Tailed Image Recognition

no code implementations 7 Feb 2023 Xiu-Shen Wei, Xuhao Sun, Yang Shen, Anqi Xu, Peng Wang, Faen Zhang

Simplicity Bias (SB) is the phenomenon in which deep neural networks tend to rely on simpler predictive patterns and ignore some complex features when applied to supervised discriminative tasks.

Long-tail Learning Self-Supervised Learning

Industrial computed tomography based intelligent non-destructive testing method for power capacitor

no code implementations 6 Feb 2023 Zhenxing Cheng, Peng Wang, Yue Liu, Wei Qin, Zidi Tang

The power capacitor is a widely used piece of reactive power compensation equipment in power transmission and distribution systems; it is prone to internal faults, which affect the safe operation of the power system.

Data Augmentation

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

1 code implementation 19 Jan 2023 Xiaojie Jin, BoWen Zhang, Weibo Gong, Kai Xu, Xueqing Deng, Peng Wang, Zhao Zhang, Xiaohui Shen, Jiashi Feng

The first is a Temporal Adaptation Module that is incorporated in the video branch to introduce global and local temporal contexts.

Retrieval Text Retrieval +2

F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories

no code implementations CVPR 2023 Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Existing fast grid-based NeRF training frameworks, like Instant-NGP, Plenoxels, DVGO, or TensoRF, are mainly designed for bounded scenes and rely on space warping to handle unbounded scenes.

Novel View Synthesis

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

1 code implementation CVPR 2023 Fei Zhou, Peng Wang, Lei Zhang, Wei Wei, Yanning Zhang

Prototypical Network is a popular few-shot solver that aims at establishing a feature metric generalizable to novel few-shot classification (FSC) tasks using deep neural networks.

cross-domain few-shot learning Knowledge Distillation

Weakly Supervised Video Anomaly Detection Based on Cross-Batch Clustering Guidance

no code implementations 16 Dec 2022 Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

To enhance the discriminative power of features, we propose a batch clustering based loss to encourage a clustering branch to generate distinct normal and abnormal clusters based on a batch of data.

Anomaly Detection Clustering +1

OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models

1 code implementation 8 Dec 2022 Jinze Bai, Rui Men, Hao Yang, Xuancheng Ren, Kai Dang, Yichang Zhang, Xiaohuan Zhou, Peng Wang, Sinan Tan, An Yang, Zeyu Cui, Yu Han, Shuai Bai, Wenbin Ge, Jianxin Ma, Junyang Lin, Jingren Zhou, Chang Zhou

As a starting point, we provide presets of 7 different modalities and 23 highly diverse example tasks in OFASys, with which we also develop a first-of-its-kind single model, OFA+, that can handle text, image, speech, video, and motion data.

Multi-Task Learning

Generalizable Person Re-Identification via Viewpoint Alignment and Fusion

no code implementations 5 Dec 2022 Bingliang Jiao, Lingqiao Liu, Liying Gao, Guosheng Lin, Ruiqi Wu, Shizhou Zhang, Peng Wang, Yanning Zhang

The key insight of this design is that the cross-attention mechanism in the transformer could be an ideal solution to align the discriminative texture clues from the original image with the canonical view image, which could compensate for the low-quality texture information of the canonical view image.

Domain Generalization Generalizable Person Re-identification +1

NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies

no code implementations CVPR 2023 Xiaoxiao Long, Cheng Lin, Lingjie Liu, YuAn Liu, Peng Wang, Christian Theobalt, Taku Komura, Wenping Wang

In this paper, we propose to represent surfaces as the Unsigned Distance Function (UDF) and develop a new volume rendering scheme to learn the neural UDF representation.

Neural Rendering

BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields

1 code implementation CVPR 2023 Peng Wang, Lingzhe Zhao, Ruijie Ma, Peidong Liu

Our approach models the physical image formation process of a motion blurred image, and jointly learns the parameters of NeRF and recovers the camera motion trajectories during exposure time.

3D Scene Reconstruction Deblurring +2

Semantic Guided Level-Category Hybrid Prediction Network for Hierarchical Image Classification

no code implementations 22 Nov 2022 Peng Wang, Jingzhou Chen, Yuntao Qian

Hierarchical classification (HC) assigns each object multiple labels organized into a hierarchical structure.

Image Classification Word Embeddings

Batch-based Model Registration for Fast 3D Sherd Reconstruction

no code implementations ICCV 2023 Jiepeng Wang, Congyi Zhang, Peng Wang, Xin Li, Peter J. Cobb, Christian Theobalt, Wenping Wang

In this work, we aim to develop a portable, high-throughput, and accurate reconstruction system for efficient digitization of fragments excavated in archaeological sites.

3D Reconstruction

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

1 code implementation 19 Sep 2022 Can Yaras, Peng Wang, Zhihui Zhu, Laura Balzano, Qing Qu

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon.

Multi-class Classification Representation Learning +1

Multi-Granularity Prediction for Scene Text Recognition

2 code implementations 8 Sep 2022 Peng Wang, Cheng Da, Cong Yao

In this work, we first draw inspiration from the recent progress in Vision Transformer (ViT) to construct a conceptually simple yet powerful vision STR model, which is built upon ViT and outperforms previous state-of-the-art models for scene text recognition, including both pure vision models and language-augmented methods.

Ranked #2 on Scene Text Recognition on Uber-Text (using extra training data)

Language Modelling Optical Character Recognition (OCR) +1

Levenshtein OCR

2 code implementations 8 Sep 2022 Cheng Da, Peng Wang, Cong Yao

A novel scene text recognizer based on Vision-Language Transformer (VLT) is presented.

Imitation Learning Optical Character Recognition (OCR) +1

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

1 code implementation 17 Aug 2022 Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang

To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing cross-attention between text prompt features and image patch token embeddings to encode both the downstream task-related information and visual instance information.

General Knowledge Language Modelling +1

Instance Image Retrieval by Learning Purely From Within the Dataset

no code implementations 12 Aug 2022 Zhongyan Zhang, Lei Wang, Yang Wang, Luping Zhou, Jianjia Zhang, Peng Wang, Fang Chen

Although achieving promising results, this approach is restricted by two issues: 1) the domain gap between benchmark datasets and the dataset of a given retrieval task; 2) the required auxiliary dataset cannot be readily obtained.

Image Retrieval Retrieval +2

Prompt Tuning for Generative Multimodal Pretrained Models

1 code implementation 4 Aug 2022 Hao Yang, Junyang Lin, An Yang, Peng Wang, Chang Zhou, Hongxia Yang

Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining.

Image Captioning Visual Entailment +1

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

no code implementations 31 Jul 2022 Zhipeng Zhang, Zhimin Wei, Zhongzhen Huang, Rui Niu, Peng Wang

However, one unsolved issue of these models is that the number of reasoning steps needs to be pre-defined and fixed before inference, ignoring the varying complexity of expressions.

Referring Expression Referring Expression Comprehension +2

Progressively-connected Light Field Network for Efficient View Synthesis

no code implementations 10 Jul 2022 Peng Wang, YuAn Liu, Guying Lin, Jiatao Gu, Lingjie Liu, Taku Komura, Wenping Wang

ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses.

Novel View Synthesis

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors

no code implementations 27 Jun 2022 Jiepeng Wang, Peng Wang, Xiaoxiao Long, Christian Theobalt, Taku Komura, Lingjie Liu, Wenping Wang

The key idea of NeuRIS is to integrate estimated normals of indoor scenes as a prior in a neural rendering framework for reconstructing large texture-less shapes and, importantly, to do this in an adaptive manner to also enable the reconstruction of irregular shapes with fine details.

3D Reconstruction Neural Rendering

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

1 code implementation 12 Jun 2022 Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, Wenping Wang

We introduce SparseNeuS, a novel neural rendering based method for the task of surface reconstruction from multi-view images.

Neural Rendering Surface Reconstruction

Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

1 code implementation 11 Jun 2022 Peng Wang, Huikang Liu, Anthony Man-Cho So, Laura Balzano

The K-subspaces (KSS) method is a generalization of the K-means method for subspace clustering.


Fast-Spanning Ant Colony Optimisation (FaSACO) for Mobile Robot Coverage Path Planning

no code implementations 31 May 2022 Christopher Carr, Peng Wang

Bio-inspired algorithms such as Ant Colony Optimisation (ACO) have been exploited to solve the problem because they can utilise heuristic information to mitigate the path planning complexity.

VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis

1 code implementation 30 May 2022 Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille

Gaussian reconstruction kernels were proposed by Westover (1990) and studied by the computer graphics community in the 1990s; they offer an alternative to meshes and point clouds for representing 3D object geometry.

Pose Estimation

Balanced control between performance and saturation for constrained nonlinear systems

no code implementations 10 May 2022 Peng Wang, Haibin Wang, Shuzhi Sam Ge, Xiaobing Zhang

This paper addresses the balanced control between performance and saturation for a class of constrained nonlinear systems, including the branches: balanced command filtered backstepping (BCFB) and balanced performance control (BPC).

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

no code implementations 7 May 2022 Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang

Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires and supports much more detailed information in text generation.

Text Generation Video Captioning

Dual-Level Decoupled Transformer for Video Captioning

no code implementations 6 May 2022 Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang

As for the latter, "couple" means treating the generation of visual semantic and syntax-related words equally.

Descriptive Sentence +1

CapOnImage: Context-driven Dense-Captioning on Image

no code implementations 27 Apr 2022 Yiqi Gao, Xinglin Hou, Yuanmeng Zhang, Tiezheng Ge, Yuning Jiang, Peng Wang

Existing image captioning systems are dedicated to generating narrative captions for images, which are spatially detached from the image in presentation.

Dense Captioning Image Captioning

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

1 code implementation CVPR 2022 Caiyuan Zheng, Hui Li, Seon-Min Rhee, Seungju Han, Jae-Joon Han, Peng Wang

A robust consistency regularization based semi-supervised framework is proposed for STR, which can effectively solve the instability issue due to domain inconsistency between synthetic and real images.

Scene Text Recognition

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

1 code implementation CVPR 2022 Xueqing Deng, Peng Wang, Xiaochen Lian, Shawn Newsam

Notably, NightLab contains models at two levels of granularity, i.e., image and regional, and each level is composed of light adaptation and segmentation modules.

Segmentation Self-Driving Cars +1

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization

no code implementations 12 Apr 2022 Xueqing Deng, Dawei Sun, Shawn Newsam, Peng Wang

Specifically, given a pair of student and teacher networks, DistPro first sets up a rich set of KD connections from the transmitting layers of the teacher to the receiving layers of the student; meanwhile, various transforms are also proposed for comparing feature maps along each pathway for the distillation.

Knowledge Distillation Meta-Learning

Self-Contrastive Learning based Semi-Supervised Radio Modulation Classification

no code implementations 29 Mar 2022 Dongxin Liu, Peng Wang, Tianshi Wang, Tarek Abdelzaher

This paper presents a semi-supervised learning framework that is novel in being designed specifically for automatic modulation classification (AMC).

Classification Contrastive Learning

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

1 code implementation CVPR 2022 Wei Dong, Junsheng Wu, Yi Luo, ZongYuan Ge, Peng Wang

In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing.

Node Classification Representation Learning

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

no code implementations ACL 2022 Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, ShiLiang Pu, Fei Wu

To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner.

Descriptive Representation Learning +1

Exact Community Recovery over Signed Graphs

no code implementations 22 Feb 2022 Xiaolu Wang, Peng Wang, Anthony Man-Cho So

Signed graphs encode similarity and dissimilarity relationships among different entities with positive and negative edges.

Stochastic Block Model

Relation Regularized Scene Graph Generation

no code implementations 22 Feb 2022 Yuyu Guo, Lianli Gao, Jingkuan Song, Peng Wang, Nicu Sebe, Heng Tao Shen, Xuelong Li

Inspired by this observation, in this article, we propose a relation regularized network (R2-Net), which can predict whether there is a relationship between two objects and encode this relation into object feature refinement and better SGG.

Graph Classification Graph Generation +6

Graph-based Extractive Explainer for Recommendations

no code implementations 20 Feb 2022 Peng Wang, Renqin Cai, Hongning Wang

Explanations in a recommender system assist users in making informed decisions among a set of recommended items.

Attribute Recommendation Systems +1

Adaptive Graph Convolutional Networks for Weakly Supervised Anomaly Detection in Videos

no code implementations 14 Feb 2022 Congqi Cao, Xin Zhang, Shizhou Zhang, Peng Wang, Yanning Zhang

For weakly supervised anomaly detection, most existing work is limited by inadequate video representation, owing to its inability to model long-term contextual information.

Graph Learning Supervised Anomaly Detection +1

Negative-ResNet: Noisy Ambulatory Electrocardiogram Signal Classification Scheme

no code implementations 25 Jan 2022 Zijiao Chen, Zihuai Lin, Peng Wang, Ming Ding

With its recent successful applications in computer vision and general signal processing, deep learning has shown many unique advantages in medical signal processing.


Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings

1 code implementation 14 Jan 2022 Peng Wang, Xin Xie, Xiaohan Wang, Ningyu Zhang

Previous knowledge graph embedding approaches usually map entities to representations and utilize score functions to predict the target entities, yet they typically struggle to reason about rare or emerging unseen entities.

Knowledge Graph Embedding Knowledge Graph Embeddings +2

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

1 code implementation CVPR 2022 Jingzhou Chen, Peng Wang, Jian Liu, Yuntao Qian

Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from coarse to fine levels.

Fine-Grained Image Classification Relation

Multi-Domain Joint Training for Person Re-Identification

no code implementations 6 Jan 2022 Lu Yang, Lingqiao Liu, Yunlong Wang, Peng Wang, Yanning Zhang

Our discovery is that training with such an adaptive model can better benefit from more training samples.

Person Re-Identification

Robust Security Analysis Based on Random Geometry Theory for Satellite-Terrestrial-Vehicle Network

no code implementations 28 Dec 2021 Xudong Li, Ye Fan, Rugui Yao, Peng Wang, Nan Qi, Xiaoya Zuo

Driven by B5G and 6G technologies, multi-network fusion is an indispensable trend for future communications.

DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings

1 code implementation 10 Dec 2021 Chaochen Gao, Xing Wu, Peng Wang, Jue Wang, Liangjun Zang, Zhongyuan Wang, Songlin Hu

To tackle that, we propose an effective knowledge distillation framework for contrastive sentence embeddings, termed DistilCSE.

Contrastive Learning Knowledge Distillation +5

Contrast-reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition

no code implementations 22 Nov 2021 Peng Wang, Jun Wen, Chenyang Si, Yuntao Qian, Liang Wang

Finally, in the Information Fuser, we explore varied strategies to combine the Sequence Reconstructor and Contrastive Motion Learner, and propose to capture postures and motions simultaneously via a knowledge-distillation based fusion strategy that transfers the motion learning from the Contrastive Motion Learner to the Sequence Reconstructor.

Action Recognition Contrastive Learning +4

LoS-Map Construction for Proactive Relay of Opportunity Selection in 6G V2X Systems

no code implementations 15 Nov 2021 Francesco Linsalata, Silvia Mura, Marouan Mizmizi, Maurizio Magarini, Peng Wang, Majid Nasiri Khormuji, Alberto Perotti, Umberto Spagnolini

Recent advances in Vehicle-to-Everything (V2X) technology and the upcoming sixth-generation (6G) network will usher in a new era for vehicular services with enhanced communication capabilities.

Autonomous Vehicles