Search Results for author: Wen Wang

Found 82 papers, 36 papers with code

Privacy-Preserving Training-as-a-Service for On-Device Intelligence: Concept, Architectural Scheme, and Open Problems

no code implementations • 16 Apr 2024 • Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Bo Gao, Tianliu He, Wen Wang

On-device intelligence (ODI) enables artificial intelligence (AI) applications to run on end devices, providing real-time and customized AI services without relying on remote servers.

Federated Learning Privacy Preserving +1


LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models

1 code implementation • 18 Mar 2024 • Yang Yang, Wen Wang, Liang Peng, Chaotian Song, Yao Chen, Hengjia Li, Xiaolong Yang, Qinglin Lu, Deng Cai, Boxi Wu, Wei Liu

Customization generation techniques have significantly advanced the synthesis of specific concepts across varied contexts.


Aligning Knowledge Graph with Visual Perception for Object-goal Navigation

1 code implementation • 29 Feb 2024 • Nuo Xu, Wen Wang, Rong Yang, Mengjie Qin, Zheyuan Lin, Wei Song, Chunlong Zhang, Jason Gu, Chao Li

Object-goal navigation is a challenging task that requires guiding an agent to specific objects based on first-person visual observations.

Object


Improving Communication Efficiency of Federated Distillation via Accumulating Local Updates

1 code implementation • 7 Dec 2023 • Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Tian Wen, Wen Wang

ALU drastically decreases the frequency of communication in federated distillation, thereby significantly reducing the communication overhead during the training process.

Federated Learning


GenDeF: Learning Generative Deformation Field for Video Generation

no code implementations • 7 Dec 2023 • Wen Wang, Kecheng Zheng, Qiuyu Wang, Hao Chen, Zifan Shi, Ceyuan Yang, Yujun Shen, Chunhua Shen

We offer a new perspective on approaching the task of video generation.

Disentanglement Video Editing +3


AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort

no code implementations • 19 Nov 2023 • Wen Wang, Canyu Zhao, Hao Chen, Zhekai Chen, Kecheng Zheng, Chunhua Shen

We empirically find that sparse control conditions, such as bounding boxes, are suitable for layout planning, while dense control conditions, e.g., sketches and keypoints, are suitable for generating high-quality image content.

Image Generation Story Visualization


CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

1 code implementation • 14 Nov 2023 • Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Shuiguang Deng, Hari Sundaram

To bridge these gaps between existing benchmarks and expectations from practical applications, we introduce CodeScope, an execution-based, multilingual, multi-task, multi-dimensional evaluation benchmark for comprehensively gauging LLM capabilities on coding tasks.

Code Generation


Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR

1 code implementation • 8 Nov 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang

We find that applying the conventional cross-entropy loss on input speech tokens does not consistently improve the ASR performance over the Loss Masking approach.


Towards Plastic and Stable Exemplar-Free Incremental Learning: A Dual-Learner Framework with Cumulative Parameter Averaging

no code implementations • 28 Oct 2023 • Wenju Sun, Qingyong Li, Wen Wang, Yangli-ao Geng

The knowledge from the plastic learner is transferred to the stable learner via cumulative parameter averaging.

Incremental Learning


Object-aware Inversion and Reassembly for Image Editing

no code implementations • 18 Oct 2023 • Zhen Yang, Ganggui Ding, Wen Wang, Hao Chen, Bohan Zhuang, Chunhua Shen

Subsequently, we propose an additional reassembly step to seamlessly integrate the respective editing results and the non-editing region to obtain the final edited image.

Benchmarking Denoising +1


Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling

1 code implementation • 18 Oct 2023 • Hai Yu, Chong Deng, Qinglin Zhang, Jiaqing Liu, Qian Chen, Wen Wang

Our approach improves the $F_1$ of the previous SOTA by 3.42 points (73.74 -> 77.16) and reduces $P_k$ by 1.11 points (15.0 -> 13.89) on WIKI-727K, and achieves an average relative reduction of 4.3% in $P_k$ on WikiSection.

Information Retrieval Segmentation +3


CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation

1 code implementation • 8 Oct 2023 • Weixiang Yan, Yuchen Tian, Yunzhe Li, Qian Chen, Wen Wang

To advance research on code translation and meet diverse requirements of real-world applications, we construct CodeTransOcean, a large-scale comprehensive benchmark that supports the largest variety of programming languages for code translation.

Code Translation Machine Translation +1


LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

1 code implementation • 7 Oct 2023 • JiaMing Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang

In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.

Audio captioning Automatic Speech Recognition +11


Multi-Functional Reconfigurable Intelligent Surface: System Modeling and Performance Optimization

no code implementations • 4 Oct 2023 • Wen Wang, Wanli Ni, Hui Tian, Yonina C. Eldar, Rui Zhang

In this paper, we propose and study a multi-functional reconfigurable intelligent surface (MF-RIS) architecture.


Performance Analysis and Optimization of Reconfigurable Multi-Functional Surface Assisted Wireless Communications

no code implementations • 4 Oct 2023 • Wen Wang, Wanli Ni, Hui Tian, Naofal Al-Dhahir

To realize a self-sustainable communication system, we investigate the use of MF-RIS in improving the sum-rate of multi-user wireless networks.


B2C-AFM: Bi-Directional Co-Temporal and Cross-Spatial Attention Fusion Model for Human Action Recognition

1 code implementation • IEEE Transactions on Image Processing 2023 • Fangtai Guo, Tianlei Jin, Shiqiang Zhu, Xiangming Xi, Wen Wang, Qiwei Meng, Wei Song, Jiakai Zhu

Human action recognition is a driving engine of many human-computer interaction applications.

Ranked #19 on Action Recognition on NTU RGB+D

Action Recognition Skeleton Based Action Recognition +1


Improving BERT with Hybrid Pooling Network and Drop Mask

no code implementations • 14 Jul 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Chong Deng, Ma Yukun, Siqi Zheng

Transformer-based pre-trained language models, such as BERT, achieve great success in various natural language understanding tasks.

Language Modelling Masked Language Modeling +2


Exploiting Correlations Between Contexts and Definitions with Multiple Definition Modeling

no code implementations • 24 May 2023 • Linhan Zhang, Qian Chen, Wen Wang, Yuxin Jiang, Bing Li, Wei Wang, Xin Cao

In this paper, we carefully design a new task called Multiple Definition Modeling (MDM) that pools together all contexts and definitions of target words.


Advancing Precise Outline-Conditioned Text Generation with Task Duality and Explicit Outline Control

no code implementations • 23 May 2023 • Yunzhe Li, Qian Chen, Weixiang Yan, Wen Wang, Qinglin Zhang, Hari Sundaram

Furthermore, we identify an issue of imbalanced utilization of the outline information in the precise outline-conditioned generation, which is ubiquitously observed across fine-tuned models and zero-shot inference models.

Sentence Text Generation


Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings

1 code implementation • 18 May 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang

Prior studies diagnose the anisotropy problem in sentence representations from pre-trained language models, e.g., BERT, without fine-tuning.

Language Modelling Semantic Textual Similarity +4


SegGPT: Segmenting Everything In Context

1 code implementation • 6 Apr 2023 • Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.

Ranked #1 on Few-Shot Semantic Segmentation on PASCAL-5i (5-Shot) (using extra training data)

Few-Shot Semantic Segmentation In-Context Learning +5


Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models

1 code implementation • 30 Mar 2023 • Wen Wang, Yan Jiang, Kangyang Xie, Zide Liu, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen

Our vid2vid-zero leverages off-the-shelf image diffusion models, and doesn't require training on any video.

Image Generation Video Alignment +1


Meeting Action Item Detection with Regularized Context Modeling

no code implementations • 27 Mar 2023 • Jiaqing Liu, Chong Deng, Qinglin Zhang, Qian Chen, Wen Wang

We construct and release the first Chinese meeting corpus with manual action item annotations.

Contrastive Learning


MUG: A General Meeting Understanding and Generation Benchmark

1 code implementation • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao

To prompt SLP advancement, we establish a large-scale general Meeting Understanding and Generation Benchmark (MUG) to benchmark the performance of a wide range of SLP tasks, including topic segmentation, topic-level and session-level extractive summarization and topic title generation, keyphrase extraction, and action item detection.

Extractive Summarization Keyphrase Extraction +1


Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG)

no code implementations • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao

ICASSP2023 General Meeting Understanding and Generation Challenge (MUG) focuses on prompting a wide range of spoken language processing (SLP) research on meeting transcripts, as SLP applications are critical to improve users' efficiency in grasping important information in meetings.

Extractive Summarization Keyphrase Extraction


Adaptive Knowledge Distillation between Text and Speech Pre-trained Models

no code implementations • 7 Mar 2023 • Jinjie Ni, Yukun Ma, Wen Wang, Qian Chen, Dianwen Ng, Han Lei, Trung Hieu Nguyen, Chong Zhang, Bin Ma, Erik Cambria

Learning from massive speech corpora has led to the recent success of many self-supervised speech models.

Knowledge Distillation Spoken Language Understanding


Weighted Sampling for Masked Language Modeling

no code implementations • 28 Feb 2023 • Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Xin Cao, Kongzhang Hao, Yuxin Jiang, Wei Wang

Experiments on the Semantic Textual Similarity benchmark (STS) show that WSBERT significantly improves sentence embeddings over BERT.

Language Modelling Masked Language Modeling +5


Fast Contextual Scene Graph Generation With Unbiased Context Augmentation

no code implementations • CVPR 2023 • Tianlei Jin, Fangtai Guo, Qiwei Meng, Shiqiang Zhu, Xiangming Xi, Wen Wang, Zonghao Mu, Wei Song

Therefore, at the context level, we can produce diverse context descriptions by using a context augmentation method based on the original dataset.

Graph Generation Scene Graph Generation


SegGPT: Towards Segmenting Everything in Context

no code implementations • ICCV 2023 • Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang

We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images.

Few-Shot Semantic Segmentation In-Context Learning +4


Decoupling Learning and Remembering: A Bilevel Memory Framework With Knowledge Projection for Task-Incremental Learning

1 code implementation • CVPR 2023 • Wenju Sun, Qingyong Li, Jing Zhang, Wen Wang, Yangli-ao Geng

BMKP decouples the functions of learning and knowledge remembering via a bilevel-memory design: a working memory responsible for adaptive model learning, to ensure plasticity, and a long-term memory in charge of enduringly storing the knowledge incorporated within the learned model, to guarantee stability.

Incremental Learning


Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation

1 code implementation • 16 Dec 2022 • Qian Yang, Qian Chen, Wen Wang, Baotian Hu, Min Zhang

Moreover, the pipelined approaches of retrieval and generation might result in poor generation performance when retrieval performance is low.

Answer Generation Language Modelling +3


DopplerBAS: Binaural Audio Synthesis Addressing Doppler Effect

no code implementations • 14 Dec 2022 • Jinglin Liu, Zhenhui Ye, Qian Chen, Siqi Zheng, Wen Wang, Qinglin Zhang, Zhou Zhao

Recently, binaural audio synthesis (BAS) has emerged as a promising research field for its applications in augmented and virtual realities.

Audio Synthesis


Images Speak in Images: A Generalist Painter for In-Context Visual Learning

1 code implementation • CVPR 2023 • Xinlong Wang, Wen Wang, Yue Cao, Chunhua Shen, Tiejun Huang

In this work, we present Painter, a generalist model which addresses these obstacles with an "image"-centric solution, that is, to redefine the output of core vision tasks as images, and to specify task prompts as images as well.

Ranked #6 on Personalized Segmentation on PerSeg

In-Context Learning Keypoint Detection +2


EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

6 code implementations • CVPR 2023 • Yuxin Fang, Wen Wang, Binhui Xie, Quan Sun, Ledell Wu, Xinggang Wang, Tiejun Huang, Xinlong Wang, Yue Cao

We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data.

Ranked #1 on Self-Supervised Image Classification (with CLIP) on ImageNet (zero-shot)

Action Classification Action Recognition +9


RL-MD: A Novel Reinforcement Learning Approach for DNA Motif Discovery

no code implementations • 30 Sep 2022 • Wen Wang, Jianzong Wang, Shijing Si, Zhangcheng Huang, Jing Xiao

The extraction of sequence patterns from a collection of functionally linked unlabeled DNA sequences is known as DNA motif discovery, and it is a key task in computational biology.

reinforcement-learning Reinforcement Learning (RL)


Diffsound: Discrete Diffusion Model for Text-to-sound Generation

1 code implementation • 20 Jul 2022 • Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu

In this study, we investigate generating sound conditioned on a text prompt and propose a novel text-to-sound generation framework that consists of a text encoder, a Vector Quantized Variational Autoencoder (VQ-VAE), a decoder, and a vocoder.

Ranked #13 on Audio Generation on AudioCaps

Audio Generation


TGRMPT: A Head-Shoulder Aided Multi-Person Tracker and a New Large-Scale Dataset for Tour-Guide Robot

1 code implementation • 8 Jul 2022 • Wen Wang, Shunda Hu, Shiqiang Zhu, Wei Song, Zheyuan Lin, Tianlei Jin, Zonghao Mu, Yuanhai Zhou

To serve safely and politely, a service robot needs to track the surrounding people robustly, especially in the case of a Tour-Guide Robot (TGR).

Multi-Object Tracking


CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose

1 code implementation • CVPR 2023 • Xu Zhang, Wen Wang, Zhe Chen, Yufei Xu, Jing Zhang, DaCheng Tao

Motivated by the progress of visual-language research, we propose that pre-trained language models (e.g., CLIP) can facilitate animal pose estimation by providing rich prior knowledge for describing animal keypoints in text.

Animal Pose Estimation Contrastive Learning


Safeguarding NOMA Networks via Reconfigurable Dual-Functional Surface under Imperfect CSI

no code implementations • 29 May 2022 • Wen Wang, Wanli Ni, Hui Tian, Zhaohui Yang, Chongwen Huang, Kai-Kit Wong

This paper investigates the use of the reconfigurable dual-functional surface to guarantee the full-space secure transmission in non-orthogonal multiple access (NOMA) networks.


A Simple yet Effective Framework for Active Learning to Rank

no code implementations • 20 May 2022 • Qingzhong Wang, Haifang Li, Haoyi Xiong, Wen Wang, Jiang Bian, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Dejing Dou, Dawei Yin

To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts to understand users' queries, retrieve relevant content from a pool of trillions of webpages, and rank the most relevant webpages at the top of the results.

Active Learning Learning-To-Rank


DePA: Improving Non-autoregressive Machine Translation with Dependency-Aware Decoder

1 code implementation • 30 Mar 2022 • Jiaao Zhan, Qian Chen, Boxing Chen, Wen Wang, Yu Bai, Yang Gao

We propose a novel and general Dependency-Aware Decoder (DePA) to enhance target dependency modeling in the decoder of fully NAT models from two perspectives: decoder self-attention and decoder input.

Machine Translation Translation


Towards Data-Efficient Detection Transformers

2 code implementations • 17 Mar 2022 • Wen Wang, Jing Zhang, Yang Cao, Yongliang Shen, DaCheng Tao

Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.


Exemplar-free Class Incremental Learning via Discriminative and Comparable One-class Classifiers

1 code implementation • 5 Jan 2022 • Wenju Sun, Qingyong Li, Jing Zhang, Danyu Wang, Wen Wang, Yangli-ao Geng

DisCOIL follows the basic principle of POC, but it adopts variational auto-encoders (VAE) instead of other well-established one-class classifiers (e.g., deep SVDD), because a trained VAE can not only identify the probability of an input sample belonging to a class but also generate pseudo samples of the class to assist in learning new tasks.

Class Incremental Learning Incremental Learning +1


Supervised Homogeneity Fusion: a Combinatorial Approach

no code implementations • 4 Jan 2022 • Wen Wang, Shihao Wu, Ziwei Zhu, Ling Zhou, Peter X.-K. Song

Fusing regression coefficients into homogeneous groups can unveil those coefficients that share a common value within each group.


MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction

1 code implementation • Findings (ACL) 2022 • Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Shiliang Zhang, Bing Li, Wei Wang, Xin Cao

In this work, we propose a novel unsupervised embedding-based KPE approach, Masked Document Embedding Rank (MDERank), to address this problem by leveraging a mask strategy and ranking candidates by the similarity between embeddings of the source document and the masked document.

Contrastive Learning Document Embedding +4


PoNet: Pooling Network for Efficient Token Mixing in Long Sequences

1 code implementation • ICLR 2022 • Chao-Hong Tan, Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Zhen-Hua Ling

We propose a novel Pooling Network (PoNet) for token mixing in long sequences with linear complexity.

Transfer Learning


FP-DETR: Detection Transformer Advanced by Fully Pre-training

no code implementations • ICLR 2022 • Wen Wang, Yang Cao, Jing Zhang, DaCheng Tao

To this end, we propose the task adapter which leverages self-attention to model the contextual relation between object query embedding.

Object object-detection +2


Parsing Table Structures in the Wild

2 code implementations • ICCV 2021 • Rujiao Long, Wen Wang, Nan Xue, Feiyu Gao, Zhibo Yang, Yongpan Wang, Gui-Song Xia

In contrast to existing studies that mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are taken or scanned with severe deformation, bending or occlusions.

Object Detection


Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers

1 code implementation • 27 Jul 2021 • Wen Wang, Yang Cao, Jing Zhang, Fengxiang He, Zheng-Jun Zha, Yonggang Wen, DaCheng Tao

In DQFA, a novel domain query is used to aggregate and align global context from the token sequence of both domains.

Domain Adaptation Object +2


Sequence Model with Self-Adaptive Sliding Window for Efficient Spoken Document Segmentation

1 code implementation • 20 Jul 2021 • Qinglin Zhang, Qian Chen, YaLi Li, Jiaqing Liu, Wen Wang

Evaluations are conducted on the English Wiki-727K document segmentation benchmark, a Chinese Wikipedia-based document segmentation dataset we created, and an in-house Chinese spoken document dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3


IntraLoss: Further Margin via Gradient-Enhancing Term for Deep Face Recognition

no code implementations • 7 Jul 2021 • Chengzhi Jiang, Yanzhou Su, Wen Wang, Haiwei Bai, Haijun Liu, Jian Cheng

This method, named IntraLoss, explicitly performs gradient enhancement in the anisotropic region so that the intra-class distribution continues to shrink, resulting in isotropic and more compact intra-class distribution and further margin between identities.

Face Recognition


TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition

1 code implementation • ICCV 2021 • Wenyuan Xue, Baosheng Yu, Wen Wang, DaCheng Tao, Qingyong Li

A table arranging data in rows and columns is a very effective data structure, which has been widely used in business and scientific research.

Cell Detection Graph Reconstruction +1


Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition

1 code implementation • ACL 2021 • Yongliang Shen, Xinyin Ma, Zeqi Tan, Shuai Zhang, Wen Wang, Weiming Lu

Although these methods have the innate ability to handle nested NER, they suffer from high computational cost, ignorance of boundary information, under-utilization of the spans that partially match with entities, and difficulties in long entity recognition.

Ranked #6 on Nested Named Entity Recognition on GENIA

Chinese Named Entity Recognition named-entity-recognition +3


Reranking Machine Translation Hypotheses with Structured and Web-based Language Models

no code implementations • 25 Apr 2021 • Wen Wang, Andreas Stolcke, Jing Zheng

In this paper, we investigate the use of linguistically motivated and computationally efficient structured language models for reranking N-best hypotheses in a statistical machine translation system.

Language Modelling Machine Translation +2


Discriminative Self-training for Punctuation Prediction

no code implementations • 21 Apr 2021 • Qian Chen, Wen Wang, Mengzhe Chen, Qinglin Zhang

Punctuation prediction for automatic speech recognition (ASR) output transcripts plays a crucial role in improving the readability of the ASR transcripts and the performance of downstream natural language processing applications.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1


Pre-training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning

no code implementations • 21 Apr 2021 • Qian Chen, Wen Wang, Qinglin Zhang

In this paper, we propose a novel joint textual-phonetic pre-training approach for learning spoken language representations, aiming at exploring the full potentials of phonetic information to improve SLU robustness to ASR errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6


Graph-Based Tri-Attention Network for Answer Ranking in CQA

no code implementations • 5 Mar 2021 • Wei Zhang, Zeyuan Chen, Chao Dong, Wen Wang, Hongyuan Zha, Jianyong Wang

However, they encounter two main limitations: (1) Correlations between answers in the same question are often overlooked.

Question Answering


What Makes a Star Teacher? A Hierarchical BERT Model for Evaluating Teacher's Performance in Online Education

no code implementations • 3 Dec 2020 • Wen Wang, Honglei Zhuang, Mi Zhou, Hanyu Liu, Beibei Li

Based on these insights, we then propose a hierarchical course BERT model to predict teachers' performance in online education.


SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection

1 code implementation • 11 Mar 2020 • Yue Zhao, Xiyang Hu, Cheng Cheng, Cong Wang, Changlin Wan, Wen Wang, Jianing Yang, Haoping Bai, Zheng Li, Cao Xiao, Yunlong Wang, Zhi Qiao, Jimeng Sun, Leman Akoglu

Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples with numerous high-stake applications including fraud detection and intrusion detection.

Dimensionality Reduction Fraud Detection +2


TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation

no code implementations • 7 Mar 2020 • Wen Wang, Xiaojiang Peng, Yanzhou Su, Yu Qiao, Jian Cheng

Video action anticipation aims to predict future action categories from observed frames.

Action Anticipation


Transfer Learning for Context-Aware Spoken Language Understanding

no code implementations • 3 Mar 2020 • Qian Chen, Zhu Zhuo, Wen Wang, Qiuyun Xu

We explore different transfer learning approaches to reduce dependency on data collection and annotation.

Spoken Language Understanding Task-Oriented Dialogue Systems +2


Sequential Neural Networks for Noetic End-to-End Response Selection

1 code implementation • 3 Mar 2020 • Qian Chen, Wen Wang

The noetic end-to-end response selection challenge as one track in the 7th Dialog System Technology Challenges (DSTC7) aims to push the state of the art of utterance classification for real world goal-oriented dialog systems, for which participants need to select the correct next utterances from a set of candidates for the multi-turn context.

Goal-Oriented Dialog


Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection

no code implementations • 3 Mar 2020 • Qian Chen, Mengzhe Chen, Bo Li, Wen Wang

With the increased applications of automatic speech recognition (ASR) in recent years, it is essential to automatically insert punctuation marks and remove disfluencies in transcripts, to improve the readability of the transcripts as well as the performance of subsequent applications, such as machine translation, dialogue systems, and so forth.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3


Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

1 code implementation • 19 Feb 2020 • Wen Wang, Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, Hongyuan Zha

Specifically, we build a Multi-Relational Item Graph (MRIG) based on all behavior sequences from all sessions, involving target and auxiliary behavior types.

Representation Learning


A Comprehensive Study on Temporal Modeling for Online Action Detection

1 code implementation • 21 Jan 2020 • Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng

Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.

Online Action Detection


A Discriminative Learned CNN Embedding for Remote Sensing Image Scene Classification

no code implementations • 28 Nov 2019 • Wen Wang, Lijun Du, Yinxing Gao, Yanzhou Su, Feng Wang, Jian Cheng

Concretely, for remote sensing image scene classification, we would like to map images from the same scene to feature vectors that are close, and map images from different scenes to feature vectors that are widely separated.

Classification General Classification +2


Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models

no code implementations • 19 Aug 2019 • Zhi-Xiu Ye, Qian Chen, Wen Wang, Zhen-Hua Ling

We also observe that fine-tuned models after the proposed pre-training approach maintain comparable performance on other NLP tasks, such as sentence classification and natural language inference tasks, compared to the original BERT models.

Ranked #26 on Common Sense Reasoning on CommonsenseQA

Common Sense Reasoning Natural Language Inference +3


Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision

no code implementations • 10 Jun 2019 • Hao Lang, Wen Wang

The RBSMA algorithm improves the test set accuracy by 7.8% relative compared to the standard beam search.

Spoken Language Understanding


The General Pair-based Weighting Loss for Deep Metric Learning

no code implementations • 30 May 2019 • Haijun Liu, Jian Cheng, Wen Wang, Yanzhou Su

A large number of loss functions based on pair distances have been presented in the literature for guiding the training of deep metric learning.

Image Retrieval Metric Learning +1


Attention: A Big Surprise for Cross-Domain Person Re-Identification

no code implementations • 30 May 2019 • Haijun Liu, Jian Cheng, Shiguang Wang, Wen Wang

Unlike existing cross-domain Re-ID methods, which leverage the auxiliary information of unlabeled target-domain data, we aim to enhance model generalization and adaptation through discriminative feature learning, and to directly apply a pre-trained model to new domains (datasets) without using any information from the target domains.

Person Re-Identification


BERT for Joint Intent Classification and Slot Filling

14 code implementations • 28 Feb 2019 • Qian Chen, Zhu Zhuo, Wen Wang

Intent classification and slot filling are two essential tasks for natural language understanding.

Ranked #3 on Slot Filling on ATIS

General Classification intent-classification +5


Sequential Attention-based Network for Noetic End-to-End Response Selection

4 code implementations • 9 Jan 2019 • Qian Chen, Wen Wang

The noetic end-to-end response selection challenge as one track in Dialog System Technology Challenges 7 (DSTC7) aims to push the state of the art of utterance classification for real world goal-oriented dialog systems, for which participants need to select the correct next utterances from a set of candidates for the multi-turn context.

Ranked #1 on Conversational Response Selection on Advising Corpus

Conversational Response Selection Goal-Oriented Dialog


Early Stratification of Patients at Risk for Postoperative Complications after Elective Colectomy

no code implementations • 29 Nov 2018 • Wen Wang, Rema Padman, Nirav Shah

Stratifying patients at risk for postoperative complications may facilitate timely and accurate workups and reduce the burden of adverse events on patients and the health system.


Temporal Action Detection by Joint Identification-Verification

no code implementations • 19 Oct 2018 • Wen Wang, Yongjian Wu, Haijun Liu, Shiguang Wang, Jian Cheng

Temporal action detection aims at not only recognizing action category but also detecting start time and end time for each action instance in an untrimmed video.

Action Detection


Articulatory information and Multiview Features for Large Vocabulary Continuous Speech Recognition

no code implementations • 16 Feb 2018 • Vikramjit Mitra, Wen Wang, Chris Bartels, Horacio Franco, Dimitra Vergyri

This paper explores the use of multi-view features and their discriminative transforms in a convolutional deep neural network (CNN) architecture for a continuous large vocabulary speech recognition task.

speech-recognition Speech Recognition


Discriminative Covariance Oriented Representation Learning for Face Recognition With Image Sets

no code implementations • CVPR 2017 • Wen Wang, Ruiping Wang, Shiguang Shan, Xilin Chen

For face recognition with image sets, most existing works focus on building robust set models with hand-crafted features, and it remains a research gap to learn better image representations that closely match the subsequent image set modeling and classification.

Face Recognition General Classification +2


Exploiting Out-of-Domain Data Sources for Dialectal Arabic Statistical Machine Translation

no code implementations • 7 Sep 2015 • Katrin Kirchhoff, Bing Zhao, Wen Wang

Statistical machine translation for dialectal Arabic is characterized by a lack of data since data acquisition involves the transcription and translation of spoken language.

Machine Translation Translation


Discriminant Analysis on Riemannian Manifold of Gaussian Distributions for Face Recognition With Image Sets

no code implementations • CVPR 2015 • Wen Wang, Ruiping Wang, Zhiwu Huang, Shiguang Shan, Xilin Chen

This paper presents a method named Discriminant Analysis on Riemannian manifold of Gaussian distributions (DARG) to solve the problem of face recognition with image sets.

Face Identification Face Recognition +1


Morphological Modeling for Machine Translation of English-Iraqi Arabic Spoken Dialogs

no code implementations • HLT 2015 • Katrin Kirchhoff, Wen Wang, Colleen Richey, Yik-Cheung Tam

Electrical Engineering Machine Translation +1


Deeply Coupled Auto-encoder Networks for Cross-view Classification

no code implementations • 10 Feb 2014 • Wen Wang, Zhen Cui, Hong Chang, Shiguang Shan, Xilin Chen

In this paper, we propose a simple but effective coupled neural network, called Deeply Coupled Autoencoder Networks (DCAN), which seeks to build two deep neural networks, coupled with each other in every corresponding layer.

Classification Denoising +2


Name-aware Machine Translation

no code implementations • ACL 2013 • Hai-Bo Li, Jing Zheng, Heng Ji, Qi Li, Wen Wang

Entity Linking Machine Translation +4


A Cross-language Study on Automatic Speech Disfluency Detection

no code implementations • NAACL 2013 • Wen Wang, Andreas Stolcke, Jiahong Yuan, Mark Liberman

Language Modelling Speech Recognition

