Search Results for author: Wei Chu

Found 51 papers, 19 papers with code

Variational Connectionist Temporal Classification

no code implementations ECCV 2020 Linlin Chao, Jingdong Chen, Wei Chu

However, CTC tends to output spiky distributions since it prefers to output the blank symbol most of the time.

Classification General Classification +2

SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks

1 code implementation 31 Jan 2024 Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu

By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications.


Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

no code implementations 9 Jan 2024 Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important.

Representation Learning Scene Recognition

InfMLLM: A Unified Framework for Visual-Language Tasks

2 code implementations 12 Nov 2023 Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.

Image Captioning Instruction Following +3

LogicMP: A Neuro-symbolic Approach for Encoding First-order Logic Constraints

no code implementations 27 Sep 2023 Weidi Xu, Jingwei Wang, Lele Xie, Jianshan He, Hongting Zhou, Taifeng Wang, Xiaopei Wan, Jingdong Chen, Chao Qu, Wei Chu

Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem since it involves modeling intricate correlations to satisfy the constraints.

Variational Inference

Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

1 code implementation 20 Sep 2023 Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi

We thereby present a new Triplet Partial Margin Contrastive Learning (TPM-CL) module to construct partial order triplet samples by automatically generating fine-grained hard negatives for matched text-video pairs.

Contrastive Learning Retrieval +3

Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

no code implementations 20 Sep 2023 Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.

Retrieval Video Retrieval

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input

no code implementations 25 Jun 2023 Qingpei Guo, Kaisheng Yao, Wei Chu

They can achieve exceptional performance on specific tasks, but face a particularly challenging problem of modality mismatch because of the diversity of input modalities and their fixed structures.

Question Answering Referring Expression +5

Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval

1 code implementation CVPR 2023 Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu

For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or weeks for a large amount of data.

Image Retrieval Retrieval

A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

no code implementations 15 Apr 2023 Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan

During inference, an error-based alignment sampling method is investigated in depth to reduce the alignment mismatch in the training and testing processes.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

DC-Former: Diverse and Compact Transformer for Person Re-Identification

1 code implementation 28 Feb 2023 Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu

In this paper, we propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting embedding space into multiple diverse and compact subspaces.

Person Re-Identification

DRGCN: Dynamic Evolving Initial Residual for Deep Graph Convolutional Networks

1 code implementation 10 Feb 2023 Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu

Our experimental results show that our model effectively relieves the problem of over-smoothing in deep GCNs and outperforms the state-of-the-art (SOTA) methods on various benchmark datasets.

Causal Effect Estimation: Recent Advances, Challenges, and Opportunities

no code implementations 2 Feb 2023 Zhixuan Chu, Jianmin Huang, Ruopeng Li, Wei Chu, Sheng Li

Causal inference has numerous real-world applications in many domains, such as health care, marketing, political science, and online advertising.

Causal Inference Marketing +1

ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation

1 code implementation 3 Apr 2022 Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen, Chao Yu, Ruopeng Li, Wei Chu

In this paper, we theoretically demonstrate that ESMM suffers from the following two problems: (1) Inherent Estimation Bias (IEB), where the estimated CVR of ESMM is inherently higher than the ground truth; (2) Potential Independence Priority (PIP) for CTCVR estimation, where there is a risk that the ESMM overlooks the causality from click to conversion.

counterfactual Recommendation Systems +1

Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

2 code implementations 13 Mar 2022 Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu

The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models.

Scene Text Recognition

Training Object Detectors From Scratch: An Empirical Study in the Era of Vision Transformer

no code implementations CVPR 2022 Weixiang Hong, Jiangwei Lao, Wang Ren, Jian Wang, Jingdong Chen, Wei Chu

Instead of proposing a specific vision transformer based detector, in this work, our goal is to reveal the insights of training vision transformer based detectors from scratch.

object-detection Object Detection +1

CBNet: A Composite Backbone Network Architecture for Object Detection

5 code implementations 1 Jul 2021 TingTing Liang, Xiaojie Chu, Yudong Liu, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling

With multi-scale testing, we push the current best single model result to a new record of 60.1% box AP and 52.3% mask AP without using extra training data.

Ranked #6 on Object Detection on COCO-O (using extra training data)

Instance Segmentation Object +2

LPSNet: A Lightweight Solution for Fast Panoptic Segmentation

no code implementations CVPR 2021 Weixiang Hong, Qingpei Guo, Wei Zhang, Jingdong Chen, Wei Chu

Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level.

Instance Segmentation Panoptic Segmentation +1

Discrimination-Aware Mechanism for Fine-Grained Representation Learning

no code implementations CVPR 2021 Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu

Therefore, there is a need for a training mechanism that enforces the discriminativeness of all the elements in the feature to capture more of the subtle visual cues.

Representation Learning Retrieval

An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition

no code implementations 18 Jun 2021 Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan

For the analyses, we plot attention weight distributions in the decoders to visualize the relationships between token-level acoustic embeddings.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes

1 code implementation 23 May 2021 Hao Huang, Yongtao Wang, Zhaoyu Chen, Yuze Zhang, Yuheng Li, Zhi Tang, Wei Chu, Jingdong Chen, Weisi Lin, Kai-Kuang Ma

Then, we design a two-level perturbation fusion strategy to alleviate the conflict between the adversarial watermarks generated by different facial images and models.

Adversarial Attack Face Swapping +1

PairRE: Knowledge Graph Embeddings via Paired Relation Vectors

1 code implementation ACL 2021 Linlin Chao, Jianshan He, Taifeng Wang, Wei Chu

Distance-based knowledge graph embedding methods show promising results on the link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N; the other is the ability to encode various relation patterns, such as symmetry/antisymmetry.

Knowledge Graph Embedding Knowledge Graph Embeddings +2

CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition

no code implementations 28 Oct 2020 Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao

The information is used to extract an acoustic representation for each token in parallel, referred to as the token-level acoustic embedding, which substitutes for the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder.

speech-recognition Speech Recognition

Question Directed Graph Attention Network for Numerical Reasoning over Text

no code implementations EMNLP 2020 Kunlong Chen, Weidi Xu, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang, Le Song, Taifeng Wang, Yuan Qi, Wei Chu

Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation.

Graph Attention Machine Reading Comprehension +2

Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy

no code implementations ACL 2020 Xiexiong Lin, Weiyu Jian, Jianshan He, Taifeng Wang, Wei Chu

Experiments demonstrate that our model with fewer parameters yields significant improvements over competitive baselines on two datasets Wizard-of-Wikipedia (average Bleu +87%; abs.

Riemannian Proximal Policy Optimization

no code implementations 19 May 2020 Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi

In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems.

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

1 code implementation ACL 2020 Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu, Yuan Qi

This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN).

Variational Policy Propagation for Multi-agent Reinforcement Learning

no code implementations 19 Apr 2020 Chao Qu, Hui Li, Chang Liu, Junwu Xiong, James Zhang, Wei Chu, Weiqiang Wang, Yuan Qi, Le Song

We propose a collaborative multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a joint policy through the interactions over agents.

Multi-agent Reinforcement Learning reinforcement-learning +2

Symmetric Regularization based BERT for Pair-wise Semantic Reasoning

1 code implementation 8 Sep 2019 Weidi Xu, Xingyi Cheng, Kunlong Chen, Wei Wang, Bin Bi, Ming Yan, Chen Wu, Luo Si, Wei Chu, Taifeng Wang

To remedy this, we propose to augment the NSP task to a 3-class categorization task, which includes a category for previous sentence prediction (PSP).

Machine Reading Comprehension Natural Language Inference +2

BERT-Based Multi-Head Selection for Joint Entity-Relation Extraction

1 code implementation 16 Aug 2019 Weipeng Huang, Xingyi Cheng, Taifeng Wang, Wei Chu

Combining these three contributions, we enhance the information extracting ability of the multi-head selection model and achieve F1-score 0.876 on testset-1 with a single model.

Relation Relation Extraction

Singing voice conversion with non-parallel data

no code implementations 11 Mar 2019 Xin Chen, Wei Chu, Jinxi Guo, Ning Xu

F0 and aperiodic are obtained through the original singing voice, and used with acoustic features to reconstruct the target singing voice through a vocoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

BandNet: A Neural Network-based, Multi-Instrument Beatles-Style MIDI Music Composition Machine

no code implementations 18 Dec 2018 Yichao Zhou, Wei Chu, Sam Young, Xin Chen

In the learning stage, a sequence of stylistically uniform, multiple-channel music samples was modeled by an RNN.


A Novel Integrated Framework for Learning both Text Detection and Recognition

no code implementations 21 Nov 2018 Wanchen Sui, Qing Zhang, Jun Yang, Wei Chu

In this paper, we propose a novel integrated framework for learning both text detection and recognition.

Text Detection

Variational Semi-supervised Aspect-term Sentiment Analysis via Transformer

no code implementations CONLL 2019 Xingyi Cheng, Weidi Xu, Taifeng Wang, Wei Chu

By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier.

Aspect-Based Sentiment Analysis (ABSA) Natural Language Understanding +1

Latent Dirichlet Allocation for Internet Price War

no code implementations 23 Aug 2018 Chenchen Li, Xiang Yan, Xiaotie Deng, Yuan Qi, Wei Chu, Le Song, Junlong Qiao, Jianshan He, Junwu Xiong

Then we develop a variant of Latent Dirichlet Allocation (LDA) to infer latent variables under the current market environment, which represents the preferences of customers and strategies of competitors.

Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units

no code implementations 13 Jul 2018 Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin

In this paper, we present an end-to-end automatic speech recognition system, which successfully employs subword units in a hybrid CTC-Attention based system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Modelling Domain Relationships for Transfer Learning on Retrieval-based Question Answering Systems in E-commerce

1 code implementation 23 Nov 2017 Jianfei Yu, Minghui Qiu, Jing Jiang, Jun Huang, Shuangyong Song, Wei Chu, Haiqing Chen

In this paper, we study transfer learning for the PI and NLI problems, aiming to propose a general framework, which can effectively and efficiently adapt the shared knowledge learned from a resource-rich source domain to a resource-poor target domain.

Chatbot Natural Language Inference +5

AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine

no code implementations ACL 2017 Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu

We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models.

Chatbot Information Retrieval +1

Speaker Cluster-Based Speaker Adaptive Training for Deep Neural Network Acoustic Modeling

no code implementations 20 Apr 2016 Wei Chu, Ruxin Chen

The previously trained DNN of the matched speaker cluster is used for decoding utterances of the test speaker.

speech-recognition Speech Recognition +1

Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms

4 code implementations 31 Mar 2010 Lihong Li, Wei Chu, John Langford, Xuanhui Wang

Offline evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature.

News Recommendation Recommendation Systems

A Contextual-Bandit Approach to Personalized News Article Recommendation

11 code implementations 28 Feb 2010 Lihong Li, Wei Chu, John Langford, Robert E. Schapire

In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks.

Collaborative Filtering Learning Theory

Gaussian Process Models for Link Analysis and Transfer Learning

no code implementations NeurIPS 2007 Kai Yu, Wei Chu

In this paper we develop a Gaussian process (GP) framework to model a collection of reciprocal random variables defined on the edges of a network.

Link Prediction Transfer Learning
