Search Results for author: Zhijian Ou

Found 48 papers, 24 papers with code

Energy-Based Models with Applications to Speech and Language Processing

no code implementations • 16 Mar 2024 • Zhijian Ou

Therefore, the purpose of this monograph is to present a systematic introduction to energy-based models, including both algorithmic progress and applications in speech and language processing.

Language Modelling Natural Language Understanding +3

Prompt Pool based Class-Incremental Continual Learning for Dialog State Tracking

1 code implementation • 17 Nov 2023 • Hong Liu, Yucheng Cai, Yuan Zhou, Zhijian Ou, Yi Huang, Junlan Feng

Inspired by the recently emerging prompt tuning method that performs well on dialog systems, we propose to use the prompt pool method, where we maintain a pool of key-value paired prompts and select prompts from the pool according to the distance between the dialog history and the prompt keys.

Continual Learning dialog state tracking

UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt

no code implementations • 20 Sep 2023 • Yucheng Cai, Wentao Ma, Yuchuan Wu, Shuzheng Si, Yuan Shao, Zhijian Ou, Yongbin Li

Using the high-quality prompts generated, we scale the corpus of the pre-trained conversation model to 122 datasets from 15 dialog-related tasks, resulting in Universal Pre-trained Conversation Model (UniPCM), a powerful foundation model for various conversational tasks and different dialog systems.

Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition

1 code implementation • 22 May 2023 • Hong Liu, Zhaobiao Lv, Zhijian Ou, Wenbo Zhao, Qing Xiao

Energy-based language models (ELMs) parameterize an unnormalized distribution for natural sentences and are radically different from popular autoregressive language models (ALMs).

Sentence speech-recognition +1

Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision

1 code implementation • 22 May 2023 • Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

Most existing task-oriented dialog (TOD) systems track dialog states in terms of slots and values and use them to query a database to get relevant knowledge to generate responses.

Question Answering Retrieval

Persistently Trained, Diffusion-assisted Energy-based Models

1 code implementation • 21 Apr 2023 • Xinwei Zhang, Zhiqiang Tan, Zhijian Ou

Maximum likelihood (ML) learning for energy-based models (EBMs) is challenging, partly due to non-convergence of Markov chain Monte Carlo. Several variations of ML learning have been proposed, but existing methods all fail to achieve both post-training image generation and proper density estimation.

Density Estimation Image Generation +1

A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

1 code implementation • 17 Oct 2022 • Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng

Second, an important ingredient in a US is that the user goal can be effectively incorporated and tracked; but how to flexibly integrate goal state tracking and develop an end-to-end trainable US for multi-domains has remained a challenge.

Reinforcement Learning (RL)

Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture

no code implementations • 13 Oct 2022 • Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

Recently, there has been progress in supervised fine-tuning of pretrained GPT-2 to build end-to-end task-oriented dialog (TOD) systems.

Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset

1 code implementation • 27 Sep 2022 • Hong Liu, Hao Peng, Zhijian Ou, Juanzi Li, Yi Huang, Junlan Feng

Recently, there has emerged a class of task-oriented dialogue (TOD) datasets collected through Wizard-of-Oz simulated games.

Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models

1 code implementation • SIGDIAL (ACL) 2022 • Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng

In this paper, we propose to apply JSA to semi-supervised learning of the latent state TOD models, which is referred to as JSA-TOD.

A Challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems

1 code implementation • 6 Jul 2022 • Zhijian Ou, Junlan Feng, Juanzi Li, Yakun Li, Hong Liu, Hao Peng, Yi Huang, Jiangjiang Zhao

A challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems, co-located with the EMNLP 2022 SereTOD Workshop.

Building Markovian Generative Architectures over Pretrained LM Backbones for Efficient Task-Oriented Dialog Systems

2 code implementations • 13 Apr 2022 • Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng

Recently, Transformer based pretrained language models (PLMs), such as GPT2 and T5, have been leveraged to build generative task-oriented dialog (TOD) systems.

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

1 code implementation • 31 Mar 2022 • Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan

The simulation module is jointly trained with the ASR model using a self-supervised loss; the ASR model is optimized with the usual ASR loss, e.g., CTC-CRF as used in our experiments.

Chunking speech-recognition +1

Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study

no code implementations • 31 Mar 2022 • Keyu An, Ji Xiao, Zhijian Ou

In this paper, we systematically compare the performance of three schemes to exploit external single-channel data for multi-channel end-to-end ASR, namely back-end pre-training, data scheduling, and data simulation, under different settings such as the sizes of the single-channel data and the choices of the front-end.

Scheduling speech-recognition +1

An Empirical Study of Language Model Integration for Transducer based Speech Recognition

no code implementations • 31 Mar 2022 • Huahuan Zheng, Keyu An, Zhijian Ou, Chen Huang, Ke Ding, Guanglu Wan

Based on the DR method, we propose a low-order density ratio method (LODR) by replacing the estimation with a low-order weak language model.

Language Modelling speech-recognition +1

Callee: Recovering Call Graphs for Binaries with Transfer and Contrastive Learning

1 code implementation • 2 Nov 2021 • Wenyu Zhu, Zhiyao Feng, Zihan Zhang, Jianjun Chen, Zhijian Ou, Min Yang, Chao Zhang

Recovering binary programs' call graphs is crucial for inter-procedural analysis tasks and applications based on them. One of the core challenges is recognizing targets of indirect calls (i.e., indirect callees).

Contrastive Learning Question Answering +1

Variational Latent-State GPT for Semi-Supervised Task-Oriented Dialog Systems

2 code implementations • 9 Sep 2021 • Hong Liu, Yucheng Cai, Zhenru Lin, Zhijian Ou, Yi Huang, Junlan Feng

In this paper, we propose the Variational Latent-State GPT model (VLS-GPT), which is the first to combine the strengths of the two approaches.

Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings

1 code implementation • 11 Jul 2021 • Chengrui Zhu, Keyu An, Huahuan Zheng, Zhijian Ou

The use of phonological features (PFs) potentially allows language-specific phones to remain linked in training, which is highly desirable for information sharing for multilingual and crosslingual speech recognition methods for low-resourced languages.

speech-recognition Speech Recognition

Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers

1 code implementation • 7 Jul 2021 • Huahuan Zheng, Wenjie Peng, Zhijian Ou, Jinsong Zhang

Automatic speech recognition systems have been largely improved in the past few decades and current systems are mainly hybrid-based and end-to-end-based.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Exploiting Single-Channel Speech For Multi-channel End-to-end Speech Recognition

no code implementations • 6 Jul 2021 • Keyu An, Zhijian Ou

Recently, the end-to-end training approach for neural beamformer-supported multi-channel ASR has shown its effectiveness in multi-channel speech recognition.

Data Augmentation Scheduling +2

Deformable TDNN with adaptive receptive fields for speech recognition

no code implementations • 30 Apr 2021 • Keyu An, Yi Zhang, Zhijian Ou

Time Delay Neural Networks (TDNNs) are widely used in both DNN-HMM based hybrid speech recognition systems and recent end-to-end systems.

speech-recognition Speech Recognition

The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines

no code implementations • 13 Nov 2020 • Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao

Automatic speech recognition (ASR) has been significantly advanced with the use of deep learning and big data.

Sound Audio and Speech Processing

Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients

1 code implementation • 11 Nov 2020 • Huahuan Zheng, Keyu An, Zhijian Ou

Using ST gradients to support sub-graph sampling is a core element to achieve efficient NAS beyond DARTS and SNAS.

Ranked #1 on Speech Recognition on WSJ dev93

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

An empirical study of domain-agnostic semi-supervised learning via energy-based models: joint-training and pre-training

no code implementations • 25 Oct 2020 • Yunfu Song, Huahuan Zheng, Zhijian Ou

In contrast, generative SSL methods involve unsupervised learning based on generative models by either joint-training or pre-training, and are more appealing from the perspective of being domain-agnostic, since they do not inherently require data augmentations.

Image Classification

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

1 code implementation • EMNLP 2020 • Yichi Zhang, Zhijian Ou, Huixin Wang, Junlan Feng

In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning.

Ranked #2 on End-To-End Dialogue Modelling on MULTIWOZ 2.1

End-To-End Dialogue Modelling

Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models

1 code implementation • 28 May 2020 • Zhijian Ou, Yunfu Song

Despite progress in introducing auxiliary amortized inference models, learning discrete latent variable models is still challenging.

Structured Prediction

CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency

1 code implementation • 27 May 2020 • Keyu An, Hongyu Xiang, Zhijian Ou

In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit).

Ranked #1 on Speech Recognition on Hub5'00 FISHER-SWBD

speech-recognition Speech Recognition

Paraphrase Augmented Task-Oriented Dialog Generation

1 code implementation • ACL 2020 • Silin Gao, Yichi Zhang, Zhijian Ou, Zhou Yu

Neural generative models have achieved promising performance on dialog generation tasks if given a huge data set.

Data Augmentation Response Generation

Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models

no code implementations • 14 Feb 2020 • Silin Gao, Zhijian Ou, Wei Yang, Huifang Xu

There has been a long recognition that discrete features (n-gram features) and neural network based features have complementary strengths for language models (LMs).

speech-recognition Speech Recognition

Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context

6 code implementations • 24 Nov 2019 • Yichi Zhang, Zhijian Ou, Zhou Yu

Conversations have an intrinsic one-to-many property, which means that multiple responses can be appropriate for the same dialog context.

Ranked #6 on End-To-End Dialogue Modelling on MULTIWOZ 2.0

Data Augmentation End-To-End Dialogue Modelling +1

CAT: CRF-based ASR Toolkit

2 code implementations • 20 Nov 2019 • Keyu An, Hongyu Xiang, Zhijian Ou

In this paper, we present a new open source toolkit for automatic speech recognition (ASR), named CAT (CRF-based ASR Toolkit).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

CRF-based Single-stage Acoustic Modeling with CTC Topology

1 code implementation • 16 Apr 2019 • Hongyu Xiang, Zhijian Ou

CTC-CRF is conceptually simple, which basically implements a CRF layer on top of features generated by the bottom neural network with the special state topology.

Ranked #2 on Speech Recognition on WSJ eval93

Benchmarking Speech Recognition

Neural CRF transducers for sequence labeling

no code implementations • 4 Nov 2018 • Kai Hu, Zhijian Ou, Min Hu, Junlan Feng

Conditional random fields (CRFs) have been shown to be one of the most successful approaches to sequence labeling.

Chunking NER +2

Elastic CRFs for Open-ontology Slot Filling

no code implementations • 4 Nov 2018 • Yinpei Dai, Yichi Zhang, Zhijian Ou, Yanmeng Wang, Junlan Feng

Second, the one-hot encoding of slot labels ignores the semantic meanings and relations for slots, which are implicit in their natural language descriptions.

slot-filling Slot Filling

Learning Neural Random Fields with Inclusive Auxiliary Generators

no code implementations • 27 Sep 2018 • Yunfu Song, Zhijian Ou

Neural random fields (NRFs), which are defined by using neural networks to implement potential functions in undirected models, provide an interesting family of model spaces for machine learning.

Image Generation

A Review of Learning with Deep Generative Models from Perspective of Graphical Modeling

no code implementations • 5 Aug 2018 • Zhijian Ou

This document aims to provide a review on learning with deep generative models (DGMs), which is a highly active area in machine learning and, more generally, artificial intelligence.

Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units

no code implementations • 13 Jul 2018 • Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin

In this paper, we present an end-to-end automatic speech recognition system, which successfully employs subword units in a hybrid CTC-Attention based system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation

1 code implementation • 3 Jul 2018 • Bin Wang, Zhijian Ou

First, a dynamic noise distribution is introduced and trained simultaneously to converge to the data distribution.

Language Modelling Sentence

Generative Modeling by Inclusive Neural Random Fields with Applications in Image Generation and Anomaly Detection

1 code implementation • 1 Jun 2018 • Yunfu Song, Zhijian Ou

With these contributions and results, this paper significantly advances the learning and applications of NRFs to a new level, both theoretically and empirically, which have never been obtained before.

Anomaly Detection Image Generation

Learning Sparse Structured Ensembles with SG-MCMC and Network Pruning

no code implementations • ICLR 2018 • Yichi Zhang, Zhijian Ou

An ensemble of neural networks is known to be more robust and accurate than an individual network, however usually with linearly-increased cost in both training and testing.

Language Modelling Network Pruning

Tracking of enriched dialog states for flexible conversational information access

no code implementations • 9 Nov 2017 • Yinpei Dai, Zhijian Ou, Dawei Ren, Pengfei Yu

The above observations motivate us to enrich the current representation of dialog states and collect a brand new dialog dataset about movies, based upon which we build a new DST, called enriched DST (EDST), for flexibly accessing movie information.

dialog state tracking slot-filling +1

Learning neural trans-dimensional random field language models with noise-contrastive estimation

no code implementations • 30 Oct 2017 • Bin Wang, Zhijian Ou

However, the training efficiency of neural TRF LMs is not satisfactory, which limits the scalability of TRF LMs to large training corpora.

speech-recognition Speech Recognition

Language modeling with Neural trans-dimensional random fields

no code implementations • 23 Jul 2017 • Bin Wang, Zhijian Ou

The idea is to use nonlinear potentials with continuous features, implemented by neural networks (NNs), in the TRF framework.

Language Modelling speech-recognition +1

Joint Bayesian Gaussian discriminant analysis for speaker verification

no code implementations • 13 Dec 2016 • Yiyan Wang, Haotian Xu, Zhijian Ou

State-of-the-art i-vector based speaker verification relies on variants of Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis.

Face Verification Speaker Verification

Model Interpolation with Trans-dimensional Random Field Language Models for Speech Recognition

no code implementations • 30 Mar 2016 • Bin Wang, Zhijian Ou, Yong He, Akinori Kawamura

The dominant language models (LMs) such as n-gram and neural network (NN) models represent sentence probabilities in terms of conditionals.

Sentence speech-recognition +1

Joint Stochastic Approximation learning of Helmholtz Machines

no code implementations • 20 Mar 2016 • Haotian Xu, Zhijian Ou

Despite progress, model learning and posterior inference remain a common challenge in using deep generative models, especially when handling discrete hidden variables.

Trans-dimensional Random Fields for Language Modeling

no code implementations • IJCNLP 2015 • Bin Wang, Zhijian Ou, Zhiqiang Tan

Information Retrieval Language Modelling +2

Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection

no code implementations • 20 Mar 2015 • Jinye Zhang, Zhijian Ou

Existing MAP inference algorithms for determinantal point processes (DPPs) need to calculate determinants or conduct eigenvalue decomposition generally at the scale of the full kernel, which presents a great challenge for real-world applications.

Change Point Detection Point Processes
