Search Results for author: Wei zhang

Found 388 papers, 112 papers with code

HARD-Net: Hardness-AwaRe Discrimination Network for 3D Early Activity Prediction

no code implementations ECCV 2020 Tianjiao Li, Jun Liu, Wei zhang, Ling-Yu Duan

In this paper, we propose a novel Hardness-AwaRe Discrimination Network (HARD-Net) to specifically investigate the relationships between similar activity pairs that are hard to discriminate.

Activity Prediction Skeleton Based Action Recognition

Towards Generalizeable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations ECNLP (ACL) 2022 Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to E-commerce product search and the learned semantic space for query and product encoding are expected to generalize well to unseen queries or products.

text similarity

Unsupervised Multi-View CNN for Salient View Selection of 3D Objects and Scenes

1 code implementation ECCV 2020 Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu

We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition named by us view-object consistency as it states that a 3D object and its projected 2D views always belong to the same object class.

Hybrid Driven Learning for Channel Estimation in Intelligent Reflecting Surface Aided Millimeter Wave Communications

no code implementations 30 May 2023 Shuntian Zheng, Sheng Wu, Chunxiao Jiang, Wei zhang, Xiaojun Jing

Intelligent reflecting surfaces (IRS) have been proposed in millimeter wave (mmWave) and terahertz (THz) systems to achieve both coverage and capacity enhancement, where the design of hybrid precoders, combiners, and the IRS typically relies on channel state information.


MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition

no code implementations 24 May 2023 Tianlun Zheng, Zhineng Chen, Bingchen Huang, Wei zhang, Yu-Gang Jiang

Traditional Multilingual Text Recognition (MLTR) usually targets a fixed set of languages and thus struggles to handle newly added languages or adapt to ever-changing class distributions.

Incremental Learning

MolXPT: Wrapping Molecules with Text for Generative Pre-training

no code implementations 18 May 2023 Zequn Liu, Wei zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming Zhang, Tie-Yan Liu

Considering that text is the most important record for scientific discovery, in this paper, we propose MolXPT, a unified language model of text and molecules pre-trained on SMILES (a sequence representation of molecules) wrapped by text.

Language Modelling Molecular Property Prediction +2

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model

2 code implementations 28 Apr 2023 Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao

This strategy effectively alleviates the interference between the two tasks of image-text alignment and instruction following and achieves strong multi-modal reasoning with only a small-scale image-text and instruction dataset.

Instruction Following Optical Character Recognition (OCR)

STNet: Spatial and Temporal feature fusion network for change detection in remote sensing images

no code implementations 22 Apr 2023 Xiaowen Ma, Jiawei Yang, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

As an important task in remote sensing image analysis, remote sensing change detection (RSCD) aims to identify changes of interest in a region from spatially co-registered multi-temporal remote sensing images, so as to monitor the local development.

Binary Classification Change Detection

SACANet: scene-aware class attention network for semantic segmentation of remote sensing images

no code implementations 22 Apr 2023 Xiaowen Ma, Rui Che, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei zhang

In this paper, we integrate both scene-aware and class attentions to propose a scene-aware class attention network (SACANet) for semantic segmentation of remote sensing images.

Semantic Segmentation

Road Genome: A Topology Reasoning Benchmark for Scene Understanding in Autonomous Driving

1 code implementation 20 Apr 2023 Huijie Wang, Zhenbo Liu, Yang Li, Tianyu Li, Li Chen, Chonghao Sima, Yuting Wang, Shengyin Jiang, Feng Wen, Hang Xu, Ping Luo, Junchi Yan, Wei zhang, Jun Yao, Yu Qiao, Hongyang Li

By introducing Road Genome (OpenLane-V2), we intend to shift the community's attention and take a step further beyond perception - to the task of topology reasoning for scene structure.

3D Lane Detection Autonomous Driving +1

Network Pruning Spaces

no code implementations 19 Apr 2023 Xuanyu He, Yu-I Yang, Ran Song, Jiachen Pu, Conggang Hu, Feijun Jiang, Wei zhang, Huanghao Ding

Statistically, the structure of a winning subnetwork guarantees an approximately optimal ratio in this regime.

Network Pruning

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment

no code implementations CVPR 2023 Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei zhang, Zhenguo Li, Hang Xu

This paper presents DetCLIPv2, an efficient and scalable training framework that incorporates large-scale image-text pairs to achieve open-vocabulary object detection (OVD).

Language Modelling object-detection +1

RSPT: Reconstruct Surroundings and Predict Trajectories for Generalizable Active Object Tracking

no code implementations 7 Apr 2023 Fangwei Zhong, Xiao Bi, Yudi Zhang, Wei zhang, Yizhou Wang

However, building a generalizable active tracker that works robustly across different scenarios remains a challenge, especially in unstructured environments with cluttered obstacles and diverse layouts.

Autonomous Driving Object Tracking

HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation

1 code implementation CVPR 2023 Linfang Zheng, Chen Wang, Yinghan Sun, Esha Dasgupta, Hua Chen, Ales Leonardis, Wei zhang, Hyung Jin Chang

In this paper, we focus on the problem of category-level object pose estimation, which is challenging due to the large intra-category shape variation.

Pose Estimation Translation

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

no code implementations 27 Mar 2023 Yifu Zhang, Xinggang Wang, Xiaoqing Ye, Wei zhang, Jincheng Lu, Xiao Tan, Errui Ding, Peize Sun, Jingdong Wang

We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes, which alleviates the problems of object missing and fragmented trajectories.

3D Multi-Object Tracking motion prediction

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection

1 code implementation CVPR 2023 Chang Liu, Weiming Zhang, Xiangru Lin, Wei zhang, Xiao Tan, Junyu Han, Xiaomao Li, Errui Ding, Jingdong Wang

It employs a "divide-and-conquer" strategy and separately exploits positives for the classification and localization task, which is more robust to the assignment ambiguity.

Dense Object Detection object-detection +2

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt

no code implementations 24 Mar 2023 Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei zhang, Kwan-Liu Ma

Multi-head self-attentions are then applied to the sequence to learn the attention between patches.

Multi-modal Facial Affective Analysis based on Masked Autoencoder

no code implementations 20 Mar 2023 Wei zhang, Bowen Ma, Feng Qiu, Yu Ding

The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to providing high-quality and large-scale Aff-wild2 for the recognition of commonly used emotion representations, such as Action Units (AU), basic expression categories (EXPR), and Valence-Arousal (VA).

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning

1 code implementation 15 Mar 2023 Jinxiang Lai, Siqian Yang, Wenlong Wu, Tao Wu, Guannan Jiang, Xi Wang, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

Then we derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reducing the background distraction.

Few-Shot Learning

LoG-CAN: local-global Class-aware Network for semantic segmentation of remote sensing images

no code implementations 14 Mar 2023 Xiaowen Ma, Mengting Ma, Chenlu Hu, Zhiyuan Song, Ziyan Zhao, Tian Feng, Wei zhang

We present LoG-CAN, a multi-scale semantic segmentation network with a global class-aware (GCA) module and local class-aware (LCA) modules for remote sensing images.

Semantic Segmentation

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

no code implementations CVPR 2023 Yanxin Long, Youpeng Wen, Jianhua Han, Hang Xu, Pengzhen Ren, Wei zhang, Shen Zhao, Xiaodan Liang

Besides, our CapDet also achieves state-of-the-art performance on dense captioning tasks, e.g., 15.44% mAP on VG V1.2 and 13.98% on the VG-COCO dataset.

Dense Captioning

Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme

no code implementations 27 Feb 2023 Jianhao Huang, Dongxu Li, Chuan Huang, Xiaoqi Qin, Wei zhang

This paper proposes a deep separate source-channel coding (DSSCC) framework for the joint task and data oriented semantic communications (JTD-SC) and utilizes the variational autoencoder approach to solve the rate-distortion problem with semantic distortion.

Bayesian Inference Data Compression

Entity-Level Text-Guided Image Manipulation

no code implementations 22 Feb 2023 Yikai Wang, Jianan Wang, Guansong Lu, Hang Xu, Zhenguo Li, Wei zhang, Yanwei Fu

In the image manipulation phase, SeMani adopts a generative model to synthesize new images conditioned on the entity-irrelevant regions and target text descriptions.

Denoising Image Manipulation

RIS-Position and Orientation Estimation in MIMO-OFDM Systems with Practical Scatterers

no code implementations 9 Feb 2023 Sheng Hong, Minghui Li, Cunhua Pan, Marco Di Renzo, Wei zhang, Lajos Hanzo

A two-step positioning scheme is exploited, where the channel parameters are first acquired, and the position-related parameters are then estimated.

Simulation-to-reality UAV Fault Diagnosis with Deep Learning

no code implementations 9 Feb 2023 Wei zhang, Junjie Tong, Fang Liao, Yunfeng Zhang

Accurate diagnosis of propeller faults is crucial for ensuring the safe and efficient operation of quadrotors.

Domain Adaptation

Language-Driven Anchors for Zero-Shot Adversarial Robustness

no code implementations 30 Jan 2023 Xiao Li, Wei zhang, Yining Liu, Zhanhao Hu, Bo Zhang, Xiaolin Hu

By leveraging the semantic consistency of the text encoders, LAAT can enhance the adversarial robustness of the image model on novel categories without additional examples.

Adversarial Defense Adversarial Robustness +3

Deep-learning-based on-chip rapid spectral imaging with high spatial resolution

no code implementations 16 Jan 2023 Jiawei Yang, Kaiyu Cui, Yidong Huang, Wei zhang, Xue Feng, Fang Liu

Spectral imaging extends the concept of traditional color cameras to capture images across multiple spectral channels and has broad application prospects.

Autonomous Driving Metamerism +1

EPR-Net: Constructing non-equilibrium potential landscape via a variational force projection formulation

no code implementations 5 Jan 2023 Yue Zhao, Wei zhang, Tiejun Li

We present a novel yet simple deep learning approach, dubbed EPR-Net, for constructing the potential landscape of high-dimensional non-equilibrium steady state (NESS) systems.

Dimensionality Reduction

Machine Learning for Large-Scale Optimization in 6G Wireless Networks

no code implementations 3 Jan 2023 Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei zhang

The sixth generation (6G) wireless systems are envisioned to enable the paradigm shift from "connected things" to "connected intelligence", featured by ultra high density, large-scale, dynamic heterogeneity, diversified functional requirements and machine learning capabilities, which leads to a growing need for highly efficient intelligent algorithms.

Distributed Optimization Federated Learning +1

Semi-DETR: Semi-Supervised Object Detection With Detection Transformers

no code implementations CVPR 2023 Jiacheng Zhang, Xiangru Lin, Wei zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

We analyze the DETR-based framework on semi-supervised object detection (SSOD) and observe that (1) the one-to-one assignment strategy generates incorrect matching when the pseudo ground-truth bounding box is inaccurate, leading to training inefficiency; (2) DETR-based detectors lack deterministic correspondence between the input query and its prediction output, which hinders the applicability of the consistency-based regularization widely used in current SSOD methods.

object-detection Object Detection +2

A Deep Learning Method for Real-time Bias Correction of Wind Field Forecasts in the Western North Pacific

no code implementations 29 Dec 2022 Wei zhang, Yueyue Jiang, Junyu Dong, Xiaojiang Song, Renbo Pang, Boyu Guoan, Hui Yu

In this study, we developed the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which uses an improved double-encoder forecaster architecture to model the spatiotemporal sequence of the U and V components of the wind field; we designed a multi-task learning loss function to correct wind speed and wind direction simultaneously using only one model.

Multi-Task Learning

Circular Accessible Depth: A Robust Traversability Representation for UGV Navigation

no code implementations 28 Dec 2022 Shikuan Xie, Ran Song, Yuenan Zhao, Xueqin Huang, Yibin Li, Wei zhang

In this paper, we present the Circular Accessible Depth (CAD), a robust traversability representation for an unmanned ground vehicle (UGV) to learn traversability in various scenarios containing irregular obstacles.

Semantic optical fiber communication system

no code implementations 27 Dec 2022 Zhenming Yu, Hongyu Huang, Liming Cheng, Wei zhang, Yueqiu Mu, Kun Xu

The current optical communication systems minimize bit or symbol errors without considering the semantic meaning behind digital bits, thus transmitting a lot of unnecessary information.

Differentiating Student Feedbacks for Knowledge Tracing

no code implementations 16 Dec 2022 Jiajun Cui, Wei zhang

In computer-aided education and intelligent tutoring systems, knowledge tracing (KT) raises attention due to the development of data-driven learning methods, which aims to predict students' future performance given their past question response sequences to trace their knowledge states.

Knowledge Tracing

Adaptive Low-Precision Training for Embeddings in Click-Through Rate Prediction

no code implementations 12 Dec 2022 Shiwei Li, Huifeng Guo, Lu Hou, Wei zhang, Xing Tang, Ruiming Tang, Rui Zhang, Ruixuan Li

To this end, we formulate a novel quantization training paradigm to compress the embeddings from the training stage, termed low-precision training (LPT).

Click-Through Rate Prediction Quantization

Low-rank Tensor Assisted K-space Generative Model for Parallel Imaging Reconstruction

no code implementations 11 Dec 2022 Wei zhang, Zengwei Xiao, Hui Tao, Minghui Zhang, Xiaoling Xu, Qiegen Liu

Although recent deep learning methods, especially generative models, have shown good performance in fast magnetic resonance imaging, there is still much room for improvement in high-dimensional generation.

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

1 code implementation 7 Dec 2022 Zhongwei Wan, Yichun Yin, Wei zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu

Recently, domain-specific PLMs have been proposed to boost the task performance of specific domains (e.g., biomedical and computer science) by continuing to pre-train general PLMs with domain-specific corpora.

General Knowledge Language Modelling +3

FlowFace: Semantic Flow-guided Shape-aware Face Swapping

no code implementations 6 Dec 2022 Hao Zeng, Wei zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu

Unlike most previous methods that focus on transferring the source inner facial features but neglect facial contours, our FlowFace can transfer both of them to a target face, thus leading to more realistic face swapping.

Face Swapping

Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces

no code implementations AMTA 2022 Prince O Aboagye, Yan Zheng, Michael Yeh, Junpeng Wang, Zhongfang Zhuang, Huiyuan Chen, Liang Wang, Wei zhang, Jeff Phillips

Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem.

Bilingual Lexicon Induction Quantization

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

no code implementations 2 Dec 2022 Zutao Jiang, Guangsong Lu, Xiaodan Liang, Jihua Zhu, Wei zhang, Xiaojun Chang, Hang Xu

Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module.

Contrastive Learning SSIM

Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision

no code implementations 23 Nov 2022 Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei zhang, Chengjie Wang, Yuan Xie

This paper builds a unified framework to perform effective noisy-proposal suppression and to interact between global and local features for robust feature learning.

Multi-Label Image Classification

RIS-Assisted Self-Interference Mitigation for In-Band Full-Duplex Transceivers

no code implementations 22 Nov 2022 Wei zhang, Yi Jiang, Bin Zhou

The wireless in-band full-duplex (IBFD) technology can in theory double the system capacity over the conventional frequency division duplex (FDD) or time-division duplex (TDD) alternatives.


Leveraging the Hints: Adaptive Bidding in Repeated First-Price Auctions

no code implementations 5 Nov 2022 Wei zhang, Yanjun Han, Zhengyuan Zhou, Aaron Flores, Tsachy Weissman

In the past four years, a particularly important development in the digital advertising industry is the shift from second-price auctions to first-price auctions for online display ads.


Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective

1 code implementation 2 Nov 2022 Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei zhang, Yuan Xie, Chengjie Wang

In this paper, for the first time, we investigate the contributions of different distance metrics, and propose an adaptive fusion scheme, bringing significant improvements in few-shot classification.

Few-Shot Learning

Facial Action Unit Detection and Intensity Estimation from Self-supervised Representation

no code implementations 28 Oct 2022 Bowen Ma, Rudong An, Wei zhang, Yu Ding, Zeng Zhao, Rongsheng Zhang, Tangjie Lv, Changjie Fan, Zhipeng Hu

As a fine-grained and local expression behavior measurement, facial action unit (FAU) analysis (e.g., detection and intensity estimation) has been documented for its time-consuming, labor-intensive, and error-prone annotation.

Action Unit Detection Facial Action Unit Detection

Global-to-local Expression-aware Embeddings for Facial Action Unit Detection

no code implementations 27 Oct 2022 Rudong An, Wei zhang, Hao Zeng, Wei Chen, Zhigang Deng, Yu Ding

Then, AU feature maps and their corresponding AU masks are multiplied to generate AU masked features focusing on local facial region.

Action Unit Detection Facial Action Unit Detection

Facial Action Units Detection Aided by Global-Local Expression Embedding

no code implementations 25 Oct 2022 Zhipeng Hu, Wei zhang, Lincheng Li, Yu Ding, Wei Chen, Zhigang Deng, Xin Yu

We find that AUs and facial expressions are highly associated, and existing facial expression datasets often contain a large number of identities.

3D Face Reconstruction

HAM: Hierarchical Attention Model with High Performance for 3D Visual Grounding

1 code implementation 22 Oct 2022 Jiaming Chen, Weixin Luo, Xiaolin Wei, Lin Ma, Wei zhang

To simplify the pipeline, we carefully investigate 3D visual grounding and summarize three fundamental problems about how to develop an end-to-end model with high performance for this task.

Visual Grounding Vocal Bursts Intensity Prediction

Slippage-robust Gaze Tracking for Near-eye Display

no code implementations 20 Oct 2022 Wei zhang, Jiaxi Cao, Xiang Wang, Enqi Tian, Bin Li

In recent years, head-mounted near-eye display devices have become the key hardware foundation for virtual reality and augmented reality.

ISTA-Inspired Network for Image Super-Resolution

no code implementations 14 Oct 2022 Yuqing Liu, Wei zhang, Weifeng Sun, Zhikai Yu, Jianfeng Wei, Shengquan Li

Inspired by the mathematical analysis, the ISTA block is developed to conduct the optimization in an end-to-end manner.

Image Super-Resolution

Repainting and Imitating Learning for Lane Detection

no code implementations 11 Oct 2022 Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei zhang, Xiao Tan, Errui Ding

In this paper, we aim to find an enhanced feature space where the lane features are distinctive while maintaining a similar distribution of lanes in the wild.

Lane Detection

SoccerNet 2022 Challenges Results

7 code implementations 5 Oct 2022 Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Collaboration of Pre-trained Models Makes Better Few-shot Learner

no code implementations 25 Sep 2022 Renrui Zhang, Bohao Li, Wei zhang, Hao Dong, Hongsheng Li, Peng Gao, Yu Qiao

In this paper, we propose CoMo, a Collaboration of pre-trained Models that incorporates diverse prior knowledge from various pre-training paradigms for better few-shot learning.

Few-Shot Learning Representation Learning

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection

no code implementations 20 Sep 2022 Lewei Yao, Jianhua Han, Youpeng Wen, Xiaodan Liang, Dan Xu, Wei zhang, Zhenguo Li, Chunjing Xu, Hang Xu

We further design a concept dictionary (with descriptions) from various online sources and detection datasets to provide prior knowledge for each concept.

object-detection Open World Object Detection

Provably Uncertainty-Guided Universal Domain Adaptation

no code implementations 19 Sep 2022 Yifan Wang, Lin Zhang, Ran Song, Paul L. Rosin, Yibin Li, Wei zhang

It fully utilizes the relationship between a target sample and its neighbors in the source domain to avoid the influence of domain misalignment.

Universal Domain Adaptation Unsupervised Domain Adaptation

SENDER: SEmi-Nonlinear Deep Efficient Reconstructor for Extraction Canonical, Meta, and Sub Functional Connectivity in the Human Brain

no code implementations 12 Sep 2022 Wei zhang, Yu Bao

Deep Linear and Nonlinear learning methods have already been vital machine learning methods for investigating the hierarchical features such as functional connectivity in the human brain via functional Magnetic Resonance signals; however, there are three major shortcomings: 1).

A detail-enhanced sampling strategy in Hadamard single-pixel imaging

no code implementations 9 Sep 2022 Yan Cai, Shijian Li, Wei zhang, Hao Wu, Xu-Ri Yao, Qing Zhao

Hadamard single-pixel imaging (HSI) is an appealing imaging technique due to its features of low hardware complexity and industrial cost.

Image Reconstruction

Dual Representation Learning for One-Step Clustering of Multi-View Data

no code implementations 30 Aug 2022 Wei zhang, Zhaohong Deng, Kup-Sze Choi, Jun Wang, Shitong Wang

Meanwhile, to make the representation learning more specific to the clustering task, a one-step learning framework is proposed to integrate representation learning and clustering partition as a whole.

Representation Learning

Robustness to Unbounded Smoothness of Generalized SignSGD

no code implementations 23 Aug 2022 Michael Crawshaw, Mingrui Liu, Francesco Orabona, Wei zhang, Zhenxun Zhuang

We also compare these algorithms with popular optimizers on a set of deep learning tasks, observing that we can match the performance of Adam while beating the others.

SFF-DA: Sptialtemporal Feature Fusion for Detecting Anxiety Nonintrusively

1 code implementation 12 Aug 2022 Haimiao Mo, Yuchen Li, Shanlin Yang, Wei zhang, Shuai Ding

To address these issues, we propose a framework with spatiotemporal feature fusion for detecting anxiety nonintrusively.

Embedding Compression with Hashing for Efficient Representation Learning in Large-Scale Graph

no code implementations 11 Aug 2022 Chin-Chia Michael Yeh, Mengting Gu, Yan Zheng, Huiyuan Chen, Javid Ebrahimi, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang

Graph neural networks (GNNs) are deep learning models designed specifically for graph data, and they typically rely on node features as the input to the first layer.

Representation Learning

Multi-View Pre-Trained Model for Code Vulnerability Identification

no code implementations 10 Aug 2022 Xuxiang Jiang, Yinhao Xiao, Jun Wang, Wei zhang

Vulnerability identification is crucial for cyber security in the software-related industry.

Contrastive Learning

Exploiting Inter-Sample Affinity for Knowability-Aware Universal Domain Adaptation

no code implementations 19 Jul 2022 Yifan Wang, Lin Zhang, Ran Song, Hongliang Li, Paul L. Rosin, Wei zhang

Specifically, we introduce a knowability-based labeling scheme which can be divided into two steps: 1) Knowability-guided detection of known and unknown samples based on the intrinsic structure of the neighborhoods of samples, where we leverage the first singular vectors of the affinity matrices to obtain the knowability of every target sample.

Universal Domain Adaptation

Weakly Supervised Video Salient Object Detection via Point Supervision

no code implementations 15 Jul 2022 Shuyong Gao, Haozhe Xing, Wei zhang, Yan Wang, Qianyu Guo, Wenqiang Zhang

Several works attempt to use scribble annotations to mitigate this problem, but point supervision as a more labor-saving annotation method (even the most labor-saving method among manual annotation methods for dense prediction), has not been explored.

object-detection Optical Flow Estimation +2

Effective Few-Shot Named Entity Linking by Meta-Learning

1 code implementation 12 Jul 2022 Xiuxing Li, Zhenyu Li, Zhengyan Zhang, Ning Liu, Haitao Yuan, Wei zhang, Zhiyuan Liu, Jianyong Wang

In this paper, we endeavor to solve the problem of few-shot entity linking, which only requires a minimal amount of in-domain labeled data and is more practical in real situations.

Entity Linking Knowledge Base Completion +2

Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain

1 code implementation 8 Jul 2022 Tong Zhang, Peng Gao, Hao Dong, Yin Zhuang, Guanqun Wang, Wei zhang, He Chen

Currently, under supervised learning, a model pretrained by a large-scale nature scene dataset and then fine-tuned on a few specific task labeling data is the paradigm that has dominated the knowledge transfer learning.

Land Cover Classification object-detection +3

Network Amplification With Efficient MACs Allocation

1 code implementation Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2022 Chuanjian Liu, Kai Han, An Xiao, Ying Nie, Wei zhang, Yunhe Wang

In particular, the proposed method is used to enlarge models sourced by GhostNet, we achieve state-of-the-art 80.9% and 84.3% ImageNet top-1 accuracies under the setting of 600M and 4.4B MACs, respectively.

Video2StyleGAN: Encoding Video in Latent Space for Manipulation

no code implementations 27 Jun 2022 Jiyang Yu, Jingen Liu, Jing Huang, Wei zhang, Tao Mei

To this end, we propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.

SADAM: Stochastic Adam, A Stochastic Operator for First-Order Gradient-based Optimizer

no code implementations 20 May 2022 Wei zhang, Yu Bao

In this work, to efficiently help escape the stationary and saddle points, we propose, analyze, and generalize a stochastic strategy performed as an operator for a first-order gradient descent algorithm in order to increase the target accuracy and reduce time consumption.

DEMAND: Deep Matrix Approximately Nonlinear Decomposition to Identify Meta, Canonical, and Sub-Spatial Pattern of functional Magnetic Resonance Imaging in the Human Brain

no code implementations 20 May 2022 Wei zhang, Yu Bao

First, the proposed DEMAND employs a non-fully connected and multilayer-stacked architecture that is easier to optimize than canonical DNNs; furthermore, due to the efficient architecture, training DEMAND can avoid overfitting and enables the recognition of individual/minor features based on a small dataset such as an individual's data; finally, a novel rank estimator technique is introduced to tune all hyperparameters of DEMAND automatically.

Dictionary Learning

DELMAR: Deep Linear Matrix Approximately Reconstruction to Extract Hierarchical Functional Connectivity in the Human Brain

no code implementations 20 May 2022 Wei zhang, Yu Bao

Moreover, the theoretical analyses indicate that DELMAR can converge to the unique fixed point and even enable the accurate approximation of original input as DNNs.

Joint OAM Radar-Communication Systems: Target Recognition and Beam Optimization

no code implementations 11 May 2022 Wen-Xuan Long, Rui Chen, Marco Moretti, Wei zhang, Jiandong Li

In detail, we first propose an OAM-based three-dimensional (3-D) super-resolution position estimation and rotation velocity detection method, which can accurately estimate the 3-D position and rotation velocity of multiple targets.


Intelligent Reflecting Surface Configurations for Smart Radio Using Deep Reinforcement Learning

1 code implementation 11 May 2022 Wei Wang, Wei zhang

Intelligent reflecting surface (IRS) is envisioned to change the paradigm of wireless communications from "adapting to wireless channels" to "changing wireless channels".

reinforcement-learning Reinforcement Learning (RL)

Data-Efficient Backdoor Attacks

1 code implementation22 Apr 2022 Pengfei Xia, Ziqiang Li, Wei zhang, Bin Li

Recent studies have proven that deep neural networks are vulnerable to backdoor attacks.

Reconfigurable Intelligent Surface for Near Field Communications: Beamforming and Sensing

no code implementations21 Apr 2022 Yuhua Jiang, Feifei Gao, Mengnan Jian, Shun Zhang, Wei zhang

However, the conventional continuous aperture RIS is designed to convert the incoming planar waves into the outgoing planar waves, which is not the optimal reflecting scheme when the receiver is not a planar array and is located in the near field of the RIS.

Towards Generalizable Semantic Product Search by Text Similarity Pre-training on Search Click Logs

no code implementations11 Apr 2022 Zheng Liu, Wei zhang, Yan Chen, Weiyi Sun, Tianchuan Du, Benjamin Schroeder

Recently, semantic search has been successfully applied to e-commerce product search and the learned semantic space(s) for query and product encoding are expected to generalize to unseen queries or products.

text similarity

Point2Seq: Detecting 3D Objects as Sequences

1 code implementation CVPR 2022 Yujing Xue, Jiageng Mao, Minzhe Niu, Hang Xu, Michael Bi Mi, Wei zhang, Xiaogang Wang, Xinchao Wang

We further propose a lightweight scene-to-sequence decoder that can auto-regressively generate words conditioned on features from a 3D scene as well as cues from the preceding words.

3D Object Detection object-detection

Beam Training and Alignment for RIS-Assisted Millimeter Wave Systems: State of the Art and Beyond

no code implementations25 Mar 2022 Peilan Wang, Jun Fang, Weizheng Zhang, Zhi Chen, Hongbin Li, Wei zhang

The deployment of RIS, however, complicates the system architecture and poses a significant challenge for beam training (BT)/beam alignment (BA), a process that is required to establish a reliable link between the transmitter and the receiver.

Transformer-based Multimodal Information Fusion for Facial Expression Analysis

no code implementations23 Mar 2022 Wei zhang, Feng Qiu, Suzhen Wang, Hao Zeng, Zhimeng Zhang, Rudong An, Bowen Ma, Yu Ding

Then, we introduce a transformer-based fusion module that integrates the static vision features and the dynamic multimodal features.

Action Unit Detection Arousal Estimation +2

Weakly-Supervised Salient Object Detection Using Point Supervision

1 code implementation22 Mar 2022 Shuyong Gao, Wei zhang, Yan Wang, Qianyu Guo, Chenglong Zhang, Yangji He, Wenqiang Zhang

Then we develop a transformer-based point-supervised saliency detection model to produce the first round of saliency maps.

object-detection Object Detection +2

Compression of Generative Pre-trained Language Models via Quantization

no code implementations ACL 2022 Chaofan Tao, Lu Hou, Wei zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, Ngai Wong

We find that previous quantization methods fail on generative tasks due to the homogeneous word embeddings caused by reduced capacity, and the varied distribution of weights.

Model Compression Quantization +1

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving

no code implementations15 Mar 2022 Kaican Li, Kai Chen, Haoyu Wang, Lanqing Hong, Chaoqiang Ye, Jianhua Han, Yukuai Chen, Wei zhang, Chunjing Xu, Dit-yan Yeung, Xiaodan Liang, Zhenguo Li, Hang Xu

One main reason that impedes the development of truly reliable self-driving systems is the lack of public datasets for evaluating the performance of object detectors on corner cases.

Autonomous Driving object-detection +1

Bilinear Systems Induced by Proper Lie Group Actions

no code implementations14 Mar 2022 Gong Cheng, Wei zhang, Jr-Shin Li

In the study of induced bilinear systems, the classical Lie algebra rank condition (LARC) is known to be impractical since it requires computing the rank everywhere.

Visualizing and Understanding Patch Interactions in Vision Transformer

no code implementations11 Mar 2022 Jie Ma, Yalong Bai, Bineng Zhong, Wei zhang, Ting Yao, Tao Mei

Vision Transformer (ViT) has become a leading tool in various computer vision tasks, owing to its unique self-attention mechanism that learns visual representations explicitly through cross-patch information interactions.

Amplitude-Constrained Constellation and Reflection Pattern Designs for Directional Backscatter Communications Using Programmable Metasurface

no code implementations8 Mar 2022 Wei Wang, Bincheng Zhu, Yongming Huang, Wei zhang

For the constellation design, we adopt the amplitude and phase-shift keying (APSK) constellation and optimize the parameters of APSK such as ring number, ring radius, and inter-ring phase difference.

Freeform Body Motion Generation from Speech

1 code implementation4 Mar 2022 Jing Xu, Wei zhang, Yalong Bai, Qibin Sun, Tao Mei

Motivated by studies in linguistics, we decompose the co-speech motion into two complementary parts: pose modes and rhythmic dynamics.

PKGM: A Pre-trained Knowledge Graph Model for E-commerce Application

no code implementations2 Mar 2022 Wen Zhang, Chi-Man Wong, Ganqinag Ye, Bo Wen, Hongting Zhou, Wei zhang, Huajun Chen

On the one hand, it could provide item knowledge services in a uniform way with service vectors for embedding-based and item-knowledge-related task models without accessing triple data.

Knowledge Graphs Sequential Recommendation

High-order Correlation Preserved Incomplete Multi-view Subspace Clustering

3 code implementations IEEE Transactions on Image Processing 2022 Zhenglai Li, Chang Tang, Xiao Zheng, Xinwang Liu, Wei zhang, En Zhu

Specifically, multiple affinity matrices constructed from the incomplete multi-view data are treated as a third-order low-rank tensor with a tensor factorization regularization which preserves the high-order view correlation and sample correlation.

Incomplete multi-view clustering Multi-view Subspace Clustering +1

Model-Based Neural Network and Its Application to Line Spectral Estimation

no code implementations14 Feb 2022 Yi Jiang, Tianyi Zhang, Wei zhang

Owing to the same layered form as an ANN, an MNN can also be optimized using the back-propagation (BP) algorithm.

A comprehensive benchmark analysis for sand dust image reconstruction

no code implementations7 Feb 2022 Yazhong Si, Fan Yang, Ya Guo, Wei zhang, Yipu Yang

In this paper, we present a comprehensive perceptual study and analysis of real-world sand dust images, then construct a Sand-dust Image Reconstruction Benchmark (SIRB) for training Convolutional Neural Networks (CNNs) and evaluating algorithm performance.

Image Enhancement Image Reconstruction

PowerGear: Early-Stage Power Estimation in FPGA HLS via Heterogeneous Edge-Centric GNNs

1 code implementation25 Jan 2022 Zhe Lin, Zike Yuan, Jieru Zhao, Wei zhang, Hui Wang, Yonghong Tian

Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities.

graph construction Graph Learning

Learning-From-Disagreement: A Model Comparison and Visual Analytics Framework

no code implementations19 Jan 2022 Junpeng Wang, Liang Wang, Yan Zheng, Chin-Chia Michael Yeh, Shubham Jain, Wei zhang

With these metrics, one can easily identify meta-features with the most complementary behaviors in two classifiers, and use them to better ensemble the classifiers.

Binary Classification

Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting

2 code implementations13 Jan 2022 Ling Chen, Donghui Chen, Zongjiang Shang, Binqing Wu, Cen Zheng, Bo Wen, Wei zhang

Given the multi-scale feature representations and scale-specific inter-variable dependencies, a multi-scale temporal graph neural network is introduced to jointly model intra-variable dependencies and inter-variable dependencies.

Graph Learning Multivariate Time Series Forecasting

New volatility evolution model after extreme events

no code implementations10 Jan 2022 Mei-Ling Cai, Zhang-HangJian Chen, Sai-Ping Li, Xiong Xiong, Wei zhang, Ming-Yuan Yang, Fei Ren

Empirical study of the evolutionary behaviors of volatility after endogenous and exogenous events further demonstrates the descriptive power of our new model.

What Hinders Perceptual Quality of PSNR-oriented Methods?

no code implementations4 Jan 2022 Tianshuo Xu, Peng Mi, Xiawu Zheng, Lijiang Li, Fei Chao, Guannan Jiang, Wei zhang, Yiyi Zhou, Rongrong Ji

E.g., in EDSR, our proposed method achieves 3.60$\times$ faster learning speed compared to a GAN-based method with a subtle degradation in visual quality.

Contrastive Learning

Data-Free Knowledge Transfer: A Survey

no code implementations31 Dec 2021 Yuang Liu, Wei zhang, Jun Wang, Jianyong Wang

In this paper, we provide a comprehensive survey on data-free knowledge transfer from the perspectives of knowledge distillation and unsupervised domain adaptation, to help readers have a better understanding of the current research status and ideas.

Knowledge Distillation Model Compression +2

Responsive Listening Head Generation: A Benchmark Dataset and Baseline

no code implementations27 Dec 2021 Mohan Zhou, Yalong Bai, Wei zhang, Ting Yao, Tiejun Zhao, Tao Mei

Automatically synthesizing listening behavior that actively responds to a talking head is critical to applications such as digital humans, virtual agents and social robots.

Talking Head Generation Translation

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

no code implementations10 Dec 2021 Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei zhang, Liujuan Cao

In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies.

Inductive Bias Weakly-Supervised Object Localization

PointCLIP: Point Cloud Understanding by CLIP

2 code implementations CVPR 2022 Renrui Zhang, Ziyu Guo, Wei zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li

On top of that, we design an inter-view adapter to better extract the global feature and adaptively fuse the few-shot knowledge learned from 3D into CLIP pre-trained in 2D.

Ranked #3 on Training-free 3D Part Segmentation on ShapeNet-Part (using extra training data)

Few-Shot Learning Training-free 3D Part Segmentation +3

Multi-Domain Transformer-Based Counterfactual Augmentation for Earnings Call Analysis

no code implementations2 Dec 2021 Zixuan Yuan, Yada Zhu, Wei zhang, Ziming Huang, Guangnan Ye, Hui Xiong

Earnings call (EC), as a periodic teleconference of a publicly-traded company, has been extensively studied as an essential market indicator because of its high analytical value in corporate fundamentals.

Data Augmentation

Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent

no code implementations2 Dec 2021 Wei zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu

We conduct extensive studies over 18 state-of-the-art DL models/tasks and demonstrate that DPSGD often converges in cases where SSGD diverges for large learning rates in the large batch setting.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Optimizing for In-memory Deep Learning with Emerging Memory Technology

no code implementations1 Dec 2021 Zhehui Wang, Tao Luo, Rick Siow Mong Goh, Wei zhang, Weng-Fai Wong

In-memory deep learning has already demonstrated orders of magnitude higher performance density and energy efficiency.

Using Reconfigurable Intelligent Surfaces for UE Positioning in mmWave MIMO Systems

no code implementations1 Dec 2021 Wei zhang, Wee Peng Tay

We develop a RIS-aided positioning framework to locate a UE in environments where the LOS path may or may not be available.

Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling

1 code implementation6 Nov 2021 Renrui Zhang, Rongyao Fang, Wei zhang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li

To further enhance CLIP's few-shot capability, CLIP-Adapter proposes to fine-tune a lightweight residual feature adapter and significantly improves the performance for few-shot classification.

Language Modelling Transfer Learning

ViDA-MAN: Visual Dialog with Digital Humans

no code implementations26 Oct 2021 Tong Shen, Jiawei Zuo, Fan Shi, Jin Zhang, Liqin Jiang, Meng Chen, Zhengchen Zhang, Wei zhang, Xiaodong He, Tao Mei

We demonstrate ViDA-MAN, a digital-human agent for multi-modal interaction, which offers realtime audio-visual responses to instant speech inquiries.

speech-recognition Speech Recognition +2

Directional Self-supervised Learning for Heavy Image Augmentations

no code implementations CVPR 2022 Yalong Bai, Yifan Yang, Wei zhang, Tao Mei

Specifically, we adapt heavy augmentation policies after the views are lightly augmented by standard augmentations, to generate a harder view (HV).

Representation Learning Self-Supervised Learning

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning

1 code implementation25 Oct 2021 Pan Lu, Liang Qiu, Jiaqi Chen, Tony Xia, Yizhou Zhao, Wei zhang, Zhou Yu, Xiaodan Liang, Song-Chun Zhu

Also, we develop a strong IconQA baseline Patch-TRM that applies a pyramid cross-modal Transformer with input diagram embeddings pre-trained on the icon dataset.

Arithmetic Reasoning Mathematical Reasoning +3

Asynchronous Decentralized Distributed Training of Acoustic Models

no code implementations21 Oct 2021 Xiaodong Cui, Wei zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David Kung

Specifically, we study three variants of asynchronous decentralized parallel SGD (ADPSGD), namely, fixed and randomized communication patterns on a ring as well as a delay-by-one scheme.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Sensoring and Application of Multimodal Data for the Detection of Freezing of Gait in Parkinson's Disease

no code implementations9 Oct 2021 Wei zhang, Debin Huang, Hantao Li, Lipeng Wang, Yanzhao Wei, Kang Pan, Lin Ma, Huanhuan Feng, Jing Pan, Yuzhu Guo

The accurate and reliable detection or prediction of freezing of gait (FOG) is important for fall prevention in Parkinson's Disease (PD) and for studying the physiological transitions during the occurrence of FOG.

Electroencephalogram (EEG)

TSK Fuzzy System Towards Few Labeled Incomplete Multi-View Data Classification

no code implementations8 Oct 2021 Wei zhang, Zhaohong Deng, Qiongdan Lou, Te Zhang, Kup-Sze Choi, Shitong Wang

The proposed method has the following distinctive characteristics: 1) it can deal with the incomplete and few labeled multi-view data simultaneously; 2) it integrates the missing view imputation and model learning as a single process, which is more efficient than the traditional two-step strategy; 3) attributed to the interpretable fuzzy inference rules, this method is more interpretable.


Scalable Rule-Based Representation Learning for Interpretable Classification

2 code implementations NeurIPS 2021 Zhuo Wang, Wei zhang, Ning Liu, Jianyong Wang

Rule-based models, e.g., decision trees, are widely used in scenarios demanding high model interpretability for their transparent inner structures and good model expressivity.

Classification Representation Learning

Embedding Compression with Hashing for Efficient Representation Learning in Graph

no code implementations29 Sep 2021 Chin-Chia Michael Yeh, Mengting Gu, Yan Zheng, Huiyuan Chen, Javid Ebrahimi, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei zhang

When applying this type of network to a graph without node features, one can extract simple graph-based node features (e.g., node degree) or learn the input node representations (i.e., embeddings) when training the network.

Representation Learning

MC$^2$-SF: Slow-Fast Learning for Mobile-Cloud Collaborative Recommendation

no code implementations25 Sep 2021 Zeyuan Chen, Jiangchao Yao, Feng Wang, Kunyang Jia, Bo Han, Wei zhang, Hongxia Yang

With the hardware development of mobile devices, it is possible to build recommendation models on the mobile side to utilize fine-grained features and real-time feedback.

Learning Dual Dynamic Representations on Time-Sliced User-Item Interaction Graphs for Sequential Recommendation

1 code implementation24 Sep 2021 Zeyuan Chen, Wei zhang, Junchi Yan, Gang Wang, Jianyong Wang

Sequential Recommendation aims to recommend items that a target user will interact with in the near future based on the historically interacted items.

Representation Learning Sequential Recommendation

Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning

1 code implementation22 Sep 2021 Hanwei Fan, Jiandong Mu, Wei zhang

Subsequently, a rollback algorithm is proposed to recover the high-dimensional design space so that higher pruning accuracy can be obtained.

Bayesian Optimization Model Compression

Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks

no code implementations21 Sep 2021 Chin-Chia Michael Yeh, Zhongfang Zhuang, Junpeng Wang, Yan Zheng, Javid Ebrahimi, Ryan Mercer, Liang Wang, Wei zhang

In this work, we study the problem of multivariate time series prediction for estimating transaction metrics associated with entities in the payment transaction database.

Time Series Prediction

OMPQ: Orthogonal Mixed Precision Quantization

1 code implementation16 Sep 2021 Yuexiao Ma, Taisong Jin, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei zhang, Rongrong Ji

Instead of solving the original integer programming problem, we propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming but also easy to optimize with linear programming.

AutoML Quantization

Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis

no code implementations5 Sep 2021 Tong Sha, Wei zhang, Tong Shen, Zhoujun Li, Tao Mei

Deep person generation has attracted extensive research attention due to its wide applications in virtual agents, video conferencing, online shopping and art/movie production.

Data Augmentation Talking Head Generation

How Does Adversarial Fine-Tuning Benefit BERT?

no code implementations31 Aug 2021 Javid Ebrahimi, Hao Yang, Wei zhang

Adversarial training (AT) is one of the most reliable methods for defending against adversarial attacks in machine learning.

Continual Learning Dependency Parsing +2

4-bit Quantization of LSTM-based Speech Recognition Models

no code implementations27 Aug 2021 Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei zhang, Zoltán Tüske, Kailash Gopalakrishnan

We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models (DBLSTM-HMMs) and Recurrent Neural Network - Transducers (RNN-Ts).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

ARShoe: Real-Time Augmented Reality Shoe Try-on System on Smartphones

no code implementations24 Aug 2021 Shan An, Guangfu Che, Jinghao Guo, Haogang Zhu, Junjie Ye, Fangru Zhou, Zhaoqi Zhu, Dong Wei, Aishan Liu, Wei zhang

To this concern, this work proposes a real-time augmented reality virtual shoe try-on system for smartphones, namely ARShoe.

Pose Estimation Virtual Try-on

G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation

no code implementations ICCV 2021 Lewei Yao, Renjie Pi, Hang Xu, Wei zhang, Zhenguo Li, Tong Zhang

In this paper, we investigate the knowledge distillation (KD) strategy for object detection and propose an effective framework applicable to both homogeneous and heterogeneous student-teacher pairs.

Knowledge Distillation object-detection +1

Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds

2 code implementations ICCV 2021 Chaoda Zheng, Xu Yan, Jiantao Gao, Weibing Zhao, Wei zhang, Zhen Li, Shuguang Cui

Current 3D single object tracking approaches track the target based on a feature comparison between the target template and the search area.

3D Single Object Tracking Object Tracking

Binary Complex Neural Network Acceleration on FPGA

no code implementations10 Aug 2021 Hongwu Peng, Shanglin Zhou, Scott Weitze, Jiaxin Li, Sahidul Islam, Tong Geng, Ang Li, Wei zhang, Minghu Song, Mimi Xie, Hang Liu, Caiwen Ding

Deep complex networks (DCN), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts.

Decision Making

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video

no code implementations9 Aug 2021 Jie Wu, Wei zhang, Guanbin Li, Wenhao Wu, Xiao Tan, YingYing Li, Errui Ding, Liang Lin

In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video.

Anomaly Detection

On Sample Based Explanation Methods for NLP: Faithfulness, Efficiency and Semantic Evaluation

no code implementations ACL 2021 Wei zhang, Ziming Huang, Yada Zhu, Guangnan Ye, Xiaodong Cui, Fan Zhang

In the recent advances of natural language processing, the scale of the state-of-the-art models and datasets is usually extensive, which challenges the application of sample-based explanation methods in many aspects, such as explanation interpretability, efficiency, and faithfulness.

Performance assessment and tuning of PID control using TLBO: the single-loop case and PI/P cascade case

no code implementations31 Jul 2021 Wei zhang, He Dong, Yunlang Xu, Xiaoping Li

Minimum output variance (MOV) is used as a benchmark for CPA of PID, but it is difficult to find due to the associated non-convex optimization problem.

Stochastic Optimization

Greedy Network Enlarging

1 code implementation31 Jul 2021 Chuanjian Liu, Kai Han, An Xiao, Yiping Deng, Wei zhang, Chunjing Xu, Yunhe Wang

Recent studies on deep convolutional neural networks present a simple paradigm of architecture design, i.e., models with more MACs typically achieve better accuracy, such as EfficientNet and RegNet.

Augmentation Pathways Network for Visual Recognition

1 code implementation26 Jul 2021 Yalong Bai, Mohan Zhou, Wei zhang, BoWen Zhou, Tao Mei

Experimental results on ImageNet demonstrate the compatibility and effectiveness on a much wider range of augmentations, while consuming fewer parameters and lower computational costs at inference time.

Data Augmentation

Boosting the Convergence of Reinforcement Learning-based Auto-pruning Using Historical Data

no code implementations16 Jul 2021 Jiandong Mu, Mengdi Wang, Feiwen Zhu, Jun Yang, Wei Lin, Wei zhang

Reinforcement learning (RL)-based auto-pruning has been further proposed to automate the DNN pruning process to avoid expensive hand-crafted work.

Neural Network Compression reinforcement-learning +2

Prior Aided Streaming Network for Multi-task Affective Recognition at the 2nd ABAW2 Competition

no code implementations8 Jul 2021 Wei zhang, Zunhu Guo, Keyu Chen, Lincheng Li, Zhimeng Zhang, Yu Ding

Automatic affective recognition has been an important research topic in the human-computer interaction (HCI) area.

Emotion Recognition

SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving

no code implementations21 Jun 2021 Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu

Experiments show that SODA10M can serve as a promising pre-training dataset for different self-supervised learning methods, which gives superior performance when fine-tuning with different downstream tasks (i.e., detection, semantic/instance segmentation) in the autonomous driving domain.

Autonomous Driving Instance Segmentation +5

One Million Scenes for Autonomous Driving: ONCE Dataset

1 code implementation21 Jun 2021 Jiageng Mao, Minzhe Niu, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei zhang, Zhenguo Li, Jie Yu, Hang Xu, Chunjing Xu

To facilitate future research on exploiting unlabeled data for 3D detection, we additionally provide a benchmark in which we reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.

3D Object Detection Autonomous Driving +1

Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency?

1 code implementation CVPR 2021 Ran Song, Wei zhang, Yitian Zhao, Yonghuai Liu, Paul L. Rosin

While mesh saliency aims to predict regional importance of 3D surfaces in agreement with human visual perception and is well researched in computer vision and graphics, latest work with eye-tracking experiments shows that state-of-the-art mesh saliency methods remain poor at predicting human fixations.

Discrimination-Aware Mechanism for Fine-Grained Representation Learning

no code implementations CVPR 2021 Furong Xu, Meng Wang, Wei zhang, Yuan Cheng, Wei Chu

Therefore, there is a need for a training mechanism that enforces the discriminativeness of all the elements in the feature to capture more of the subtle visual cues.

Representation Learning Retrieval

LPSNet: A Lightweight Solution for Fast Panoptic Segmentation

no code implementations CVPR 2021 Weixiang Hong, Qingpei Guo, Wei zhang, Jingdong Chen, Wei Chu

Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level.

Instance Segmentation Panoptic Segmentation

Tensor-Based Multi-View Block-Diagonal Structure Diffusion for Clustering Incomplete Multi-View Data

1 code implementation IEEE International Conference on Multimedia and Expo 2021 Zhenglai Li, Chang Tang, Xinwang Liu, Xiao Zheng, Wei zhang, En Zhu

In this paper, we propose a novel incomplete multi-view clustering method, in which a tensor nuclear norm regularizer elegantly diffuses the information of multi-view block-diagonal structure across different views.

Incomplete multi-view clustering

On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation

no code implementations9 Jun 2021 Wei zhang, Ziming Huang, Yada Zhu, Guangnan Ye, Xiaodong Cui, Fan Zhang

In the recent advances of natural language processing, the scale of the state-of-the-art models and datasets is usually extensive, which challenges the application of sample-based explanation methods in many aspects, such as explanation interpretability, efficiency, and faithfulness.

Joint Channel Estimation and Mixed-ADCs Allocation for Massive MIMO via Deep Learning

no code implementations8 Jun 2021 Liangyuan Xu, Feifei Gao, Ting Zhou, Shaodan Ma, Wei zhang

Instead of randomly assigning the mixed-ADCs, we then design a novel antenna selection network for mixed-ADCs allocation to further improve the channel estimation accuracy.

Model Aided Deep Learning Based MIMO OFDM Receiver With Nonlinear Power Amplifiers

no code implementations30 May 2021 Liangyuan Xu, Feifei Gao, Wei zhang, Shaodan Ma

Multi-input multi-output orthogonal frequency division multiplexing (MIMO OFDM) is a key technology for mobile communication systems.