Search Results for author: Ao Zhang

Found 23 papers, 12 papers with code

DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset

no code implementations • EMNLP 2020 • Lijie Wang, Ao Zhang, Kun Wu, Ke Sun, Zhenghua Li, Hua Wu, Min Zhang, Haifeng Wang

This paper describes in detail the construction process and data statistics of DuSQL.

SQL Parsing Text-To-SQL


Physical formula enhanced multi-task learning for pharmacokinetics prediction

no code implementations • 16 Apr 2024 • Ruifeng Li, Dongzhan Zhou, Ancheng Shen, Ao Zhang, Mao Su, Mingqian Li, Hongyang Chen, Gang Chen, Yin Zhang, Shufei Zhang, Yuqiang Li, Wanli Ouyang

Overall, our work illustrates the benefits and potential of using PEMAL in AIDD and other scenarios with data scarcity and noise.

Drug Discovery Multi-Task Learning


ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge

no code implementations • 7 Jan 2024 • He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, BinBin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li

To promote speech processing and recognition research in driving scenarios, we build on the success of the Intelligent Cockpit Speech Recognition Challenge (ICSRC) held at ISCSLP 2022 and launch the ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge.

Automatic Speech Recognition (ASR) +3


Knowledge Enhanced Conditional Imputation for Healthcare Time-series

1 code implementation • 27 Dec 2023 • Linglong Qian, Zina Ibrahim, Hugh Logan Ellis, Ao Zhang, Yuezhou Zhang, Tao Wang, Richard Dobson

This study presents a novel approach to addressing the challenge of missing data in multivariate time series, with a particular focus on the complexities of healthcare data.

Imputation Time Series


U2-KWS: Unified Two-pass Open-vocabulary Keyword Spotting with Keyword Bias

no code implementations • 15 Dec 2023 • Ao Zhang, Pan Zhou, Kaixun Huang, Yong Zou, Ming Liu, Lei Xie

Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has attracted increasingly more interest.

Decoder Keyword Spotting


NExT-Chat: An LMM for Chat, Detection and Segmentation

1 code implementation • 8 Nov 2023 • Ao Zhang, Yuan YAO, Wei Ji, Zhiyuan Liu, Tat-Seng Chua

The development of large language models (LLMs) has greatly advanced the field of multimodal understanding, leading to the emergence of large multimodal models (LMMs).

Referring Expression Referring Expression Segmentation +1


Spike-Triggered Contextual Biasing for End-to-End Mandarin Speech Recognition

no code implementations • 7 Oct 2023 • Kaixun Huang, Ao Zhang, BinBin Zhang, Tianyi Xu, Xingchen Song, Lei Xie

However, unlike shallow fusion methods that directly bias the posterior of the ASR model, deep biasing methods implicitly integrate contextual information, making it challenging to control the degree of bias.

Automatic Speech Recognition (ASR) +1


Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

no code implementations • 1 Jun 2023 • Tianyi Xu, Zhanheng Yang, Kaixun Huang, Pengcheng Guo, Ao Zhang, Biao Li, Changru Chen, Chao Li, Lei Xie

By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words.

Speech Recognition


Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

no code implementations • 21 May 2023 • Kaixun Huang, Ao Zhang, Zhanheng Yang, Pengcheng Guo, Bingshen Mu, Tianyi Xu, Lei Xie

In this study, we introduce a contextual phrase prediction network for an attention-based deep bias method.

Speech Recognition


VPGTrans: Transfer Visual Prompt Generator across LLMs

1 code implementation • NeurIPS 2023 • Ao Zhang, Hao Fei, Yuan YAO, Wei Ji, Li Li, Zhiyuan Liu, Tat-Seng Chua

While developing a new multimodal LLM (MLLM) by pre-training on tremendous image-text pairs from scratch can be exceedingly resource-consuming, connecting an existing LLM with a comparatively lightweight visual prompt generator (VPG) becomes a feasible paradigm.

Transfer Learning


The NPU-ASLP System for Audio-Visual Speech Recognition in MISP 2022 Challenge

no code implementations • 11 Mar 2023 • Pengcheng Guo, He Wang, Bingshen Mu, Ao Zhang, Peikun Chen

This paper describes our NPU-ASLP system for the Audio-Visual Diarization and Recognition (AVDR) task in the Multi-modal Information based Speech Processing (MISP) 2022 Challenge.

Audio-Visual Speech Recognition Speech Recognition +1


Visually Grounded Commonsense Knowledge Acquisition

1 code implementation • 22 Nov 2022 • Yuan YAO, Tianyu Yu, Ao Zhang, Mengdi Li, Ruobing Xie, Cornelius Weber, Zhiyuan Liu, Hai-Tao Zheng, Stefan Wermter, Tat-Seng Chua, Maosong Sun

In this work, we present CLEVER, which formulates CKE as a distantly supervised multi-instance learning problem, where models learn to summarize commonsense relations from a bag of images about an entity pair without any human annotation on image instances.

Language Modelling


Prompt Tuning for Discriminative Pre-trained Language Models

1 code implementation • Findings (ACL) 2022 • Yuan YAO, Bowen Dong, Ao Zhang, Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Leyu Lin, Maosong Sun, Jianyong Wang

Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.

Language Modelling Question Answering +2


PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models

1 code implementation • 23 May 2022 • Yuan YAO, Qianyu Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun

We show that PEVL enables state-of-the-art performance of detector-free VLP models on position-sensitive tasks such as referring expression comprehension and phrase grounding, and also improves the performance on position-insensitive tasks with grounded inputs.

Ranked #1 on Visual Commonsense Reasoning on VCR (Q-AR) test

Language Modelling Object +7


Fine-Grained Scene Graph Generation with Data Transfer

2 code implementations • 22 Mar 2022 • Ao Zhang, Yuan YAO, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua

Scene graph generation (SGG) is designed to extract (subject, predicate, object) triplets in images.

Ranked #1 on Predicate Classification on Visual Genome

Graph Generation Predicate Classification +3


CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models

1 code implementation • 24 Sep 2021 • Yuan YAO, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun

Pre-Trained Vision-Language Models (VL-PTMs) have shown promising capabilities in grounding natural language in image data, facilitating a broad variety of cross-modal tasks.

Visual Grounding


Pre-Trained Models: Past, Present and Future

no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu

Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).

Computational Efficiency Self-Supervised Learning +1


RADDet: Range-Azimuth-Doppler based Radar Object Detection for Dynamic Road Users

1 code implementation • 2 May 2021 • Ao Zhang, Farzan Erlik Nowruzi, Robert Laganiere

In this paper, we collect a novel radar dataset that contains radar data in the form of Range-Azimuth-Doppler tensors along with the bounding boxes on the tensor for dynamic road users, category labels, and 2D bounding boxes on the Cartesian Bird-Eye-View range map.

Object Detection +1


Visual Distant Supervision for Scene Graph Generation

1 code implementation • ICCV 2021 • Yuan YAO, Ao Zhang, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun

In this work, we propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.

Graph Generation Predicate Classification +2


Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing

1 code implementation • EMNLP 2021 • Kun Wu, Lijie Wang, Zhenghua Li, Ao Zhang, Xinyan Xiao, Hua Wu, Min Zhang, Haifeng Wang

For better distribution matching, we require that at least 80% of SQL patterns in the training data are covered by generated queries.

Data Augmentation Question Generation +3
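
The 80% coverage requirement quoted in the excerpt above can be illustrated with a small check. This is only a sketch under stated assumptions, not the authors' procedure: the excerpt does not define a "SQL pattern", so the hypothetical helper `sql_pattern` below simply masks every identifier and literal and keeps keywords and punctuation, and the hypothetical `pattern_coverage` counts distinct training patterns rather than weighting them by frequency.

```python
import re

# Hypothetical, deliberately small keyword list; real SQL has many more keywords.
SQL_KEYWORDS = {
    "select", "from", "where", "group", "by", "order", "having", "limit",
    "and", "or", "count", "max", "min", "avg", "sum", "desc", "asc",
    "join", "on", "as", "distinct",
}


def sql_pattern(query: str) -> str:
    """Reduce a query to a coarse pattern (assumed definition, not from the paper)."""
    # Mask single-quoted string literals so they are treated like any other value.
    masked = re.sub(r"'[^']*'", "0", query)
    # Split into identifiers/keywords, numbers, and single punctuation characters.
    tokens = re.findall(r"[A-Za-z_][\w.]*|\d+(?:\.\d+)?|[^\w\s]", masked)
    parts = []
    for tok in tokens:
        if tok.lower() in SQL_KEYWORDS:
            parts.append(tok.upper())   # keep keywords, normalized to upper case
        elif tok[0].isalnum() or tok[0] == "_":
            parts.append("_")           # mask table names, column names, and values
        else:
            parts.append(tok)           # keep operators and brackets
    return " ".join(parts)


def pattern_coverage(train_queries, generated_queries) -> float:
    """Fraction of distinct training patterns that also occur among generated queries."""
    train = {sql_pattern(q) for q in train_queries}
    generated = {sql_pattern(q) for q in generated_queries}
    if not train:
        return 0.0
    return len(train & generated) / len(train)


# Toy data, invented for illustration only.
train = [
    "SELECT name FROM city WHERE population > 1000",
    "SELECT COUNT(*) FROM city",
    "SELECT name FROM country ORDER BY area DESC LIMIT 3",
]
generated = [
    "SELECT age FROM person WHERE height > 180",
    "SELECT COUNT(*) FROM person",
]

coverage = pattern_coverage(train, generated)
print(f"coverage = {coverage:.2f}, meets 80% threshold: {coverage >= 0.8}")
```

With a stricter or looser pattern definition the measured coverage changes, so the threshold is only meaningful relative to whatever definition is used.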


DefenseVGAE: Defending against Adversarial Attacks on Graph Data via a Variational Graph Autoencoder

1 code implementation • 16 Jun 2020 • Ao Zhang, Jinwen Ma

Graph neural networks (GNNs) achieve remarkable performance for tasks on graph data.


tau-FPL: Tolerance-Constrained Learning in Linear Time

no code implementations • 15 Jan 2018 • Ao Zhang, Nan Li, Jian Pu, Jun Wang, Junchi Yan, Hongyuan Zha

Learning a classifier with control on the false-positive rate plays a critical role in many machine learning applications.


NEU Systems in SIGHAN Bakeoff 2012

no code implementations • WS 2012 • Ji Ma, LongFei Bai, Zhuo Liu, Ao Zhang, Jingbo Zhu

