Search Results for author: Chao Zhang

Found 329 papers, 124 papers with code

AcTune: Uncertainty-Based Active Self-Training for Active Fine-Tuning of Pretrained Language Models

1 code implementation NAACL 2022 Yue Yu, Lingkai Kong, Jieyu Zhang, Rongzhi Zhang, Chao Zhang

We develop AcTune, a new framework that improves the label efficiency of active PLM fine-tuning by unleashing the power of unlabeled data via self-training.

Active Learning text-classification +1

Transferring SLU Models in Novel Domains

no code implementations ICLR 2019 Yaohua Tang, Kaixiang Mo, Qian Xu, Chao Zhang, Qiang Yang

When building models for novel natural language domains, a major challenge is the lack of data in the new domains, no matter whether the data is annotated or not.

Intent Recognition Meta-Learning +4

Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

1 code implementation ACL 2022 Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang

Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set is tedious and difficult.

Weakly-supervised Learning

Empowering Image Recovery: A Multi-Attention Approach

no code implementations 6 Apr 2024 Juan Wen, Yawei Li, Chao Zhang, Weiyan Hou, Radu Timofte, Luc van Gool

Integration of attention mechanisms across feature and positional dimensions further enhances the recovery of fine details.

Image Restoration

Semantic Map-based Generation of Navigation Instructions

1 code implementation 28 Mar 2024 Chengzu Li, Chao Zhang, Simone Teufel, Rama Sanand Doddipatla, Svetlana Stoyanchev

In this paper, we propose a new approach to navigation instruction generation by framing the problem as an image captioning task using semantic maps as visual input.

Image Captioning

Leveraging Large Language Model to Generate a Novel Metaheuristic Algorithm with CRISPE Framework

1 code implementation 25 Mar 2024 Rui Zhong, Yuefeng Xu, Chao Zhang, Jun Yu

In this paper, we borrow the large language model (LLM) ChatGPT-3.5 to automatically and quickly design a new metaheuristic algorithm (MA) with only a small amount of input.

Language Modelling Large Language Model +1

M$^3$AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset

no code implementations 21 Mar 2024 Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang

Although multiple academic video datasets have been constructed and released, few of them support both multimodal content recognition and understanding tasks, which is partially due to the lack of high-quality human annotations.

speech-recognition Speech Recognition +1

ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models

1 code implementation 17 Mar 2024 Yuzhao Heng, Chunyuan Deng, Yitong Li, Yue Yu, Yinghao Li, Rongzhi Zhang, Chao Zhang

Although Large Language Models (LLMs) exhibit remarkable adaptability across domains, these models often fall short in structured knowledge extraction tasks such as named entity recognition (NER).

Attribute named-entity-recognition +2

Efficient Multiplayer Battle Game Optimizer for Adversarial Robust Neural Architecture Search

1 code implementation 15 Mar 2024 Rui Zhong, Yuefeng Xu, Chao Zhang, Jun Yu

This paper introduces a novel metaheuristic algorithm, known as the efficient multiplayer battle game optimizer (EMBGO), specifically designed for addressing complex numerical optimization tasks.

Neural Architecture Search

DiaLoc: An Iterative Approach to Embodied Dialog Localization

no code implementations 11 Mar 2024 Chao Zhang, Mohan Li, Ignas Budvytis, Stephan Liwicki

However, most existing works in embodied dialog research focus on navigation and leave the localization task understudied.

NoteLLM: A Retrievable Large Language Model for Note Recommendation

no code implementations 4 Mar 2024 Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Di Wu, Enhong Chen

Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, both of which compress key note information into limited content.

Contrastive Learning Language Modelling +1

APISR: Anime Production Inspired Real-World Anime Super-Resolution

1 code implementation 3 Mar 2024 Boyang Wang, Fengyu Yang, Xihang Yu, Chao Zhang, Hanbin Zhao

In addition, we identify two anime-specific challenges of distorted and faint hand-drawn lines and unwanted color artifacts.


Accelerating materials discovery for polymer solar cells: Data-driven insights enabled by natural language processing

1 code implementation 29 Feb 2024 Pranav Shetty, Aishat Adeboye, Sonakshi Gupta, Chao Zhang, Rampi Ramprasad

We present a natural language processing pipeline that was used to extract polymer solar cell property data from the literature and simulate various active learning strategies.

Active Learning

Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints

no code implementations 28 Feb 2024 Lingkai Kong, Yuanqi Du, Wenhao Mu, Kirill Neklyudov, Valentin De Bortol, Haorui Wang, Dongxia Wu, Aaron Ferber, Yi-An Ma, Carla P. Gomes, Chao Zhang

To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model.

CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision

1 code implementation 26 Feb 2024 Hao Wang, Zeyu Gao, Chao Zhang, Zihan Sha, Mingyang Sun, Yuchen Zhou, Wenyu Zhu, Wenju Sun, Han Qiu, Xi Xiao

At its core, our approach boosts transfer learning capabilities by effectively aligning binary code with its semantic explanations (in natural language), resulting in a model able to generate better embeddings for binary code.

Representation Learning Transfer Learning

ARL2: Aligning Retrievers for Black-box Large Language Models via Self-guided Adaptive Relevance Labeling

no code implementations 21 Feb 2024 Lingxi Zhang, Yue Yu, Kuan Wang, Chao Zhang

Retrieval-augmented generation enhances large language models (LLMs) by incorporating relevant information from external knowledge sources.

Retrieval Transfer Learning +1

A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction

1 code implementation 20 Feb 2024 Yinghao Li, Rampi Ramprasad, Chao Zhang

It breaks the generation into a two-step pipeline: initially, LLMs generate answers in natural language as intermediate responses.

Language Modelling named-entity-recognition +4

BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models

1 code implementation 13 Feb 2024 Haotian Sun, Yuchen Zhuang, Wei Wei, Chao Zhang, Bo Dai

BBox-Adapter distinguishes target and source domain data by treating target data as positive and source data as negative.

Neural Sinkhorn Gradient Flow

no code implementations 25 Jan 2024 Huminhao Zhu, Fangyikang Wang, Chao Zhang, Hanbin Zhao, Hui Qian

We utilize the velocity field matching training scheme in NSGF, which only requires samples from the source and target distribution to compute an empirical velocity field approximation.

LightSleepNet: Design of a Personalized Portable Sleep Staging System Based on Single-Channel EEG

no code implementations 24 Jan 2024 Yiqiao Liao, Chao Zhang, Milin Zhang, Zhihua Wang, Xiang Xie

This paper proposes LightSleepNet, a lightweight, 1-D Convolutional Neural Network (CNN) based personalized architecture for real-time sleep staging, which can be implemented on various mobile platforms with limited hardware resources.

EEG Sleep Staging +1

TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance

no code implementations 24 Jan 2024 Haorui Wang, Rongzhi Zhang, Yinghao Li, Lingkai Kong, Yuchen Zhuang, Xiusi Chen, Chao Zhang

The teacher LLM generates problem-solving instructions and corrective principles based on the student LLM's errors.

Language Modelling

FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis

no code implementations 19 Jan 2024 Chao Zhang, Yuren Mao, Yijiang Fan, Yu Mi, Yunjun Gao, Lu Chen, Dongfang Lou, Jinshu Lin

Text-to-SQL, which provides a zero-code interface for operating relational databases, has gained much attention in financial analysis because financial professionals may not be well-skilled in SQL programming.

Language Modelling Large Language Model +1

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

1 code implementation 19 Jan 2024 Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng

To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Misconfidence-based Demonstration Selection for LLM In-Context Learning

no code implementations 12 Jan 2024 Shangqing Xu, Chao Zhang

In each step, it analyzes a pool of candidate examples and identifies the ones most likely to challenge the LLM's current understanding, measured by a new metric called misconfidence.

In-Context Learning

A Closed-loop Brain-Machine Interface SoC Featuring a 0.2$μ$J/class Multiplexer Based Neural Network

no code implementations 7 Jan 2024 Chao Zhang, Yongxiang Guo, Dawid Sheng, Zhixiong Ma, Chao Sun, Yuwei Zhang, Wenxin Zhao, Fenyan Zhang, Tongfei Wang, Xing Sheng, Milin Zhang

This work presents the first fabricated electrophysiology-optogenetic closed-loop bidirectional brain-machine interface (CL-BBMI) system-on-chip (SoC) with electrical neural signal recording, on-chip sleep staging and optogenetic stimulation.

Sleep Staging

Multi-Channel Multi-Domain based Knowledge Distillation Algorithm for Sleep Staging with Single-Channel EEG

no code implementations 7 Jan 2024 Chao Zhang, Yiqiao Liao, Siqi Han, Milin Zhang, Zhihua Wang, Xiang Xie

The proposed algorithm achieves a state-of-the-art single-channel sleep staging accuracy of 86.5%, with only 0.6% deterioration from the state-of-the-art multi-channel model.

EEG Knowledge Distillation +1

3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding

1 code implementation 6 Jan 2024 Zeju Li, Chao Zhang, Xiaoyan Wang, Ruilong Ren, Yifan Xu, Ruifei Ma, Xiangde Liu

The remarkable potential of multi-modal large language models (MLLMs) in comprehending both vision and language information has been widely acknowledged.

Scene Understanding Visual Question Answering (VQA)

Towards Modeling Uncertainties of Self-explaining Neural Networks via Conformal Prediction

no code implementations 3 Jan 2024 Wei Qian, Chenxu Zhao, Yangyi Li, Fenglong Ma, Chao Zhang, Mengdi Huai

To tackle the aforementioned challenges, in this paper, we design a novel uncertainty modeling framework for self-explaining networks, which not only demonstrates strong distribution-free uncertainty modeling performance for the generated explanations in the interpretation layer but also excels in producing efficient and effective prediction sets for the final predictions based on the informative high-level basis explanations.

Conformal Prediction Uncertainty Quantification

Multiplayer Battle Game-Inspired Optimizer for Complex Optimization Problems

no code implementations 31 Dec 2023 Yuefeng Xu, Rui Zhong, Chao Zhang, Jun Yu

Various popular multiplayer battle royale games share a lot of common elements.

Large Language Models for Generative Information Extraction: A Survey

1 code implementation 29 Dec 2023 Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Enhong Chen

Information extraction (IE) aims to extract structural knowledge (such as entities, relations, and events) from plain natural language texts.

GAD-PVI: A General Accelerated Dynamic-Weight Particle-Based Variational Inference Framework

no code implementations 27 Dec 2023 Fangyikang Wang, Huminhao Zhu, Chao Zhang, Hanbin Zhao, Hui Qian

Particle-based Variational Inference (ParVI) methods approximate the target distribution by iteratively evolving finite weighted particle systems.

Position Variational Inference

A Joint Multi-Gradient Algorithm for Demosaicing Bayer Images

no code implementations International Conference on Communication, Image and Signal Processing (CCISP) 2023 Di Wu, Zhihui Xin, Chao Zhang

Experiments show that the proposed algorithm recovers image edges and texture-complex regions better, with higher PSNR and SSIM values and better subjective visual quality, than traditional gradient algorithms such as BI, Cok, Hibbard, Laroche, and Hamilton. The algorithm involves only add-subtract and shift operations, which makes it suitable for implementation on hardware platforms.

Demosaicking SSIM

Multilevel Saliency-Guided Self-Supervised Learning for Image Anomaly Detection

no code implementations 30 Nov 2023 Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

To fully exploit saliency guidance, on each map, we select a pixel pair from the cluster with the highest centroid saliency to form a patch pair.

Anomaly Detection Self-Supervised Learning

LanGWM: Language Grounded World Model

no code implementations 29 Nov 2023 Rudra P. K. Poudel, Harit Pandya, Chao Zhang, Roberto Cipolla

Furthermore, our proposed technique of explicit language-grounded visual representation learning has the potential to improve models for human-robot interaction because our extracted visual features are language grounded.

Model-based Reinforcement Learning Out-of-Distribution Generalization +2

How Far Have We Gone in Vulnerability Detection Using Large Language Models

1 code implementation 21 Nov 2023 Zeyu Gao, Hao Wang, Yuchen Zhou, Wenyu Zhu, Chao Zhang

Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection.

Vulnerability Detection

PolyIE: A Dataset of Information Extraction from Polymer Material Scientific Literature

1 code implementation 13 Nov 2023 Jerry Junyang Cheung, Yuchen Zhuang, Yinghao Li, Pranav Shetty, Wantian Zhao, Sanjeev Grampurohit, Rampi Ramprasad, Chao Zhang

Scientific information extraction (SciIE), which aims to automatically extract information from scientific literature, is becoming more important than ever.

Relation Extraction

Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

no code implementations 13 Nov 2023 Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky

To fully unleash the power of explanations, we propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.

In-Context Learning Language Modelling +2

Speech-based Slot Filling using Large Language Models

no code implementations 13 Nov 2023 Guangzhi Sun, Shutong Feng, Dongcheng Jiang, Chao Zhang, Milica Gašić, Philip C. Woodland

Recently, advancements in large language models (LLMs) have shown an unprecedented ability across various language tasks.

In-Context Learning slot-filling +1

Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study

1 code implementation 13 Nov 2023 Yinghao Li, Haorui Wang, Chao Zhang

Large Language Models (LLMs) have shown remarkable proficiency in language understanding and have been successfully applied to a variety of real-world tasks through task-specific fine-tuning or prompt engineering.

Logical Reasoning Prompt Engineering

Image-Pointcloud Fusion based Anomaly Detection using PD-REAL Dataset

no code implementations 7 Nov 2023 Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

We present PD-REAL, a novel large-scale dataset for unsupervised anomaly detection (AD) in the 3D domain.

Unsupervised Anomaly Detection

Improving MIMO channel estimation via receive power feedback

no code implementations 1 Nov 2023 Chao Zhang, Hang Zou, Samson Lasaulce, Lucas Saludjian

Estimating the channel state is known to be an important problem in wireless networks.

Orientation-Aware Leg Movement Learning for Action-Driven Human Motion Prediction

no code implementations 23 Oct 2023 Chunzhi Gu, Chao Zhang, Shigeru Kuriyama

Specifically, we follow a two-stage forecasting strategy by first employing the motion diffusion model to generate the target motion with a specified future action, and then producing the in-betweening to smoothly connect the observation and prediction to eventually address motion prediction.

Human motion prediction motion prediction

ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search

no code implementations 20 Oct 2023 Yuchen Zhuang, Xiang Chen, Tong Yu, Saayan Mitra, Victor Bursztyn, Ryan A. Rossi, Somdeb Sarkhel, Chao Zhang

It formulates the entire action space as a decision tree, where each node represents a possible API function call involved in a solution plan.

Decision Making valid

SALMONN: Towards Generic Hearing Abilities for Large Language Models

1 code implementation 20 Oct 2023 Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music.

Audio captioning Automatic Speech Recognition +10

When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting

1 code implementation 17 Oct 2023 Harshavardhan Kamarthi, Lingkai Kong, Alexander Rodríguez, Chao Zhang, B. Aditya Prakash

We close both these gaps and propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.

Time Series Time Series Forecasting

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

2 code implementations 9 Oct 2023 Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Audio-visual large language models (LLMs) have drawn significant attention, yet the fine-grained combination of both input streams is rather under-explored, which is challenging but necessary for LLMs to understand general video inputs.

Question Answering Video Question Answering

Conditional Diffusion Model for Target Speaker Extraction

no code implementations 7 Oct 2023 Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C Woodland

For the reverse-time process, a parametrised score function is conditioned on a target speaker embedding to extract the target speaker from the mixture of sources.

Target Speaker Extraction

Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection

no code implementations 6 Oct 2023 Ziyun Cui, Wen Wu, Wei-Qiang Zhang, Ji Wu, Chao Zhang

Apart from the knowledge from speech-generic representations, this paper also proposes to simultaneously transfer the knowledge from a speech depression detection task based on the high comorbidity rates of depression and AD.

Alzheimer's Disease Detection Depression Detection +1

Joint Projection Learning and Tensor Decomposition Based Incomplete Multi-view Clustering

1 code implementation 6 Oct 2023 Wei Lv, Chao Zhang, Huaxiong Li, Xiuyi Jia, Chunlin Chen

We further consider the graph noise of projected data caused by missing samples and use a tensor-decomposition based graph filter for robust clustering. JPLTD decomposes the original tensor into an intrinsic tensor and a sparse tensor.

Clustering Incomplete multi-view clustering +1

Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Classification

no code implementations 4 Oct 2023 Guoxin Wang, Xuyang Cao, Shan An, Fengmei Fan, Chao Zhang, Jinsong Wang, Feng Yu, Zhiren Wang

In this work, we proposed a multi-dimension-embedding-aware modality fusion transformer (MFFormer) for schizophrenia and bipolar disorder classification using rs-fMRI and T1 weighted structural MRI (T1w sMRI).

Time Series

Adapting LLM Agents Through Communication

no code implementations 1 Oct 2023 Kuan Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen

To help these agents adapt to new tasks without extensive human supervision, we propose the Learning through Communication (LTC) paradigm, a novel training approach enabling LLM agents to improve continuously through interactions with their environments and other agents.

Decision Making GSM8K

It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation

1 code implementation 30 Sep 2023 Wen Wu, Wenlin Chen, Chao Zhang, Philip C. Woodland

Human annotator simulation (HAS) serves as a cost-effective substitute for human evaluation such as data annotation and system assessment.

Density Estimation Meta-Learning

Subspace-Guided Feature Reconstruction for Unsupervised Anomaly Localization

no code implementations 25 Sep 2023 Katsuya Hotta, Chao Zhang, Yoshihiro Hagihara, Takuya Akashi

In this paper, we propose a novel subspace-guided feature reconstruction framework to pursue adaptive feature approximation for anomaly localization.

Connecting Speech Encoder and Large Language Model for ASR

no code implementations 25 Sep 2023 Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Affect Recognition in Conversations Using Large Language Models

no code implementations 22 Sep 2023 Shutong Feng, Guangzhi Sun, Nurul Lubis, Chao Zhang, Milica Gašić

This study delves into the capacity of large language models (LLMs) to recognise human affect in conversations, with a focus on both open-domain chit-chat dialogues and task-oriented dialogues.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Enhancing Quantised End-to-End ASR Models via Personalisation

1 code implementation 17 Sep 2023 Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng

Recent end-to-end automatic speech recognition (ASR) models have become increasingly larger, making them particularly challenging to be deployed on resource-constrained devices.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

A Multi-In and Multi-Out Dendritic Neuron Model and its Optimization

no code implementations 14 Sep 2023 Yu Ding, Jun Yu, Chunzhi Gu, Shangce Gao, Chao Zhang

Recently, a novel mathematical ANN model, known as the dendritic neuron model (DNM), has been proposed to address nonlinear problems by more accurately reflecting the structure of real neurons.

Multi-class Classification

RAIN: Your Language Models Can Align Themselves without Finetuning

1 code implementation 13 Sep 2023 Yuhui Li, Fangyun Wei, Jinjing Zhao, Chao Zhang, Hongyang Zhang

We discover that by integrating self-evaluation and rewind mechanisms, unaligned LLMs can directly produce responses consistent with human preferences via self-boosting.

Adversarial Attack

Can Whisper perform speech-based in-context learning?

no code implementations 13 Sep 2023 Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

Language-level adaptation experiments using Chinese dialects showed that when applying SICL to isolated word ASR, consistent and considerable relative WER reductions, 32.3% on average, can be achieved using Whisper models of any size on two dialects.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

AGMDT: Virtual Staining of Renal Histology Images with Adjacency-Guided Multi-Domain Transfer

no code implementations 12 Sep 2023 Tao Ma, Chao Zhang, Min Lu, Lin Luo

Renal pathology, as the gold standard of kidney disease diagnosis, requires doctors to analyze a series of tissue slices stained by H&E staining and special staining like Masson, PASM, and PAS, respectively.

Graph Matching Style Transfer

Cross-Utterance Conditioned VAE for Speech Generation

no code implementations 8 Sep 2023 Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun

Experimental results on the LibriTTS datasets demonstrate that our proposed models significantly enhance speech synthesis and editing, producing more natural and expressive speech.

Speech Synthesis

PolyGET: Accelerating Polymer Simulations by Accurate and Generalizable Forcefield with Equivariant Transformer

no code implementations 1 Sep 2023 Rui Feng, Huan Tran, Aubrey Toland, Binghong Chen, Qi Zhu, Rampi Ramprasad, Chao Zhang

Machine learning (ML) forcefields have been developed to achieve both the accuracy of ab initio methods and the efficiency of empirical force fields.

Situated Natural Language Explanations

no code implementations 27 Aug 2023 Zining Zhu, Haoming Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin

Situated NLE provides a perspective and facilitates further research on the generation and evaluation of explanations.

Prompt Engineering

kTrans: Knowledge-Aware Transformer for Binary Code Embedding

1 code implementation 24 Aug 2023 Wenyu Zhu, Hao Wang, Yuchen Zhou, JiaMing Wang, Zihan Sha, Zeyu Gao, Chao Zhang

By feeding explicit knowledge as additional inputs to the Transformer, and fusing implicit knowledge with a novel pre-training task, kTrans provides a new perspective to incorporating domain knowledge into a Transformer framework.

Outlier Detection

Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations

1 code implementation 14 Aug 2023 Wen Wu, Chao Zhang, Philip C. Woodland

Two metrics are proposed to evaluate AER performance with automatic segmentation based on time-weighted emotion and speaker classification errors.

Action Detection Activity Detection +4

One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training

1 code implementation ICCV 2023 Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi Lai, Chao Zhang, Shu-Tao Xia

We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release.

DF2: Distribution-Free Decision-Focused Learning

no code implementations 11 Aug 2023 Lingkai Kong, Wenhao Mu, Jiaming Cui, Yuchen Zhuang, B. Aditya Prakash, Bo Dai, Chao Zhang

However, existing end-to-end DFL methods are hindered by three significant bottlenecks: model mismatch error, sample average approximation error, and gradient approximation error.

Revisiting DETR Pre-training for Object Detection

no code implementations 2 Aug 2023 Yan Ma, Weicong Liang, Bohan Chen, Yiduo Hao, BoJian Hou, Xiangyu Yue, Chao Zhang, Yuhui Yuan

Motivated by the remarkable achievements of DETR-based approaches on COCO object detection and segmentation benchmarks, recent endeavors have been directed towards elevating their performance through self-supervised pre-training of Transformers while preserving a frozen backbone.

Object object-detection +1

Graph Neural Networks for Forecasting Multivariate Realized Volatility with Spillover Effects

no code implementations 1 Aug 2023 Chao Zhang, Xingyue Pu, Mihai Cucuringu, Xiaowen Dong

We present a novel methodology for modeling and forecasting multivariate realized volatilities using customized graph neural networks to incorporate spillover effects across stocks.

Understanding Deep Neural Networks via Linear Separability of Hidden Layers

no code implementations 26 Jul 2023 Chao Zhang, Xinyu Chen, Wensheng Li, Lixue Liu, Wei Wu, DaCheng Tao

In this paper, we measure the linear separability of hidden layer outputs to study the characteristics of deep neural networks.

Autoregressive Diffusion Model for Graph Generation

1 code implementation 17 Jul 2023 Lingkai Kong, Jiaming Cui, Haotian Sun, Yuchen Zhuang, B. Aditya Prakash, Chao Zhang

However, existing diffusion-based graph generative models are mostly one-shot generative models that apply Gaussian diffusion in the dequantized adjacency matrix space.

Denoising Graph Generation

C3: Zero-shot Text-to-SQL with ChatGPT

1 code implementation 14 Jul 2023 XueMei Dong, Chao Zhang, Yuhang Ge, Yuren Mao, Yunjun Gao, Lu Chen, Jinshu Lin, Dongfang Lou

This paper proposes a ChatGPT-based zero-shot Text-to-SQL method, dubbed C3, which achieves 82.3% in terms of execution accuracy on the holdout test set of Spider and becomes the state-of-the-art zero-shot Text-to-SQL method on the Spider Challenge.


Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data

no code implementations 4 Jul 2023 Guangzhi Sun, Chao Zhang, Ivan Vulić, Paweł Budzianowski, Philip C. Woodland

In this work, we propose a Knowledge-Aware Audio-Grounded generative slot-filling framework, termed KA2G, that focuses on few-shot and zero-shot slot filling for ToD with speech input.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +6

Towards Optimal Randomized Strategies in Adversarial Example Game

no code implementations 29 Jun 2023 Jiahao Xie, Chao Zhang, Weijie Liu, Wensong Bai, Hui Qian

The vulnerability of deep neural network models to adversarial example attacks is a practical challenge in many artificial intelligence applications.

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

1 code implementation 26 Jun 2023 Chao Zhang, Shiwei Wu, Sirui Zhao, Tong Xu, Enhong Chen

In this paper, we present a solution for enhancing video alignment to improve multi-step inference.

Video Alignment

G-STO: Sequential Main Shopping Intention Detection via Graph-Regularized Stochastic Transformer

no code implementations 25 Jun 2023 Yuchen Zhuang, Xin Shen, Yan Zhao, Chaosheng Dong, Ming Wang, Jin Li, Chao Zhang

The detection of the underlying shopping intentions of users based on their historical interactions is a crucial aspect for e-commerce platforms, such as Amazon, to enhance the convenience and efficiency of their customers' shopping experiences.

Sequential Recommendation

ToolQA: A Dataset for LLM Question Answering with External Tools

1 code implementation NeurIPS 2023 Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang

To address this issue, we introduce a new dataset called ToolQA, which is designed to faithfully evaluate LLMs' ability to use external tools for question answering.

Hallucination Question Answering

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation

1 code implementation 15 Jun 2023 Ziyang Ma, Zhisheng Zheng, Guanrou Yang, Yu Wang, Chao Zhang, Xie Chen

Our models outperform other SSL models significantly on the LibriSpeech benchmark without the need for iterative re-clustering and re-training.

Automatic Speech Recognition Clustering +4

MUBen: Benchmarking the Uncertainty of Molecular Representation Models

2 code implementations 14 Jun 2023 Yinghao Li, Lingkai Kong, Yuanqi Du, Yue Yu, Yuchen Zhuang, Wenhao Mu, Chao Zhang

While some studies have included UQ to improve molecular pre-trained models, the process of selecting suitable backbone and UQ methods for reliable molecular uncertainty estimation remains underexplored.

Benchmarking Drug Discovery +4

PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm

no code implementations 11 Jun 2023 Wensong Bai, Chao Zhang, Yichao Fu, Lingwei Peng, Hui Qian, Bin Dai

In this paper, we propose the first fully push-forward-based Distributional Reinforcement Learning algorithm, called Push-forward-based Actor-Critic EncourageR (PACER).

Continuous Control Distributional Reinforcement Learning +1

Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression

1 code implementation 11 Jun 2023 Wen Wu, Chao Zhang, Philip C. Woodland

In automatic emotion recognition (AER), labels assigned by different human annotators to the same utterance are often inconsistent due to the inherent complexity of emotion and the subjectivity of perception.

Attribute Emotion Recognition +1

FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow

no code implementations 8 Jun 2023 Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Yijin Li, Hongwei Qin, Jifeng Dai, Xiaogang Wang, Hongsheng Li

This paper introduces a novel transformer-based network architecture, FlowFormer, along with the Masked Cost Volume AutoEncoding (MCVA) for pretraining it to tackle the problem of optical flow estimation.

Optical Flow Estimation

Local Boosting for Weakly-Supervised Learning

no code implementations 5 Jun 2023 Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang

In this work, we show that the standard implementation of the convex combination of base learners can hardly work due to the presence of noisy labels.

Weakly-supervised Learning

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

1 code implementation 2 Jun 2023 Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Graph Reasoning for Question Answering with Triplet Retrieval

no code implementations 30 May 2023 Shiyang Li, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang, Bing Yin

State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into a KG encoder, e.g., graph neural networks (GNNs), to model their local structures and integrated into language models for question answering.

Knowledge Graphs Question Answering +1

Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator

1 code implementation 30 May 2023 Guangzhi Sun, Chao Zhang, Phil Woodland

The incorporation of biasing words obtained through contextual knowledge is of paramount importance in automatic speech recognition (ASR) applications.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

DyGen: Learning from Noisy Labels via Dynamics-Enhanced Generative Modeling

1 code implementation 30 May 2023 Yuchen Zhuang, Yue Yu, Lingkai Kong, Xiang Chen, Chao Zhang

Most existing methods for learning from noisy labels use static input features for denoising, but these methods are limited by the information they can provide on true label distributions and can result in biased or incorrect predictions.


AdaPlanner: Adaptive Planning from Feedback with Language Models

1 code implementation NeurIPS 2023 Haotian Sun, Yuchen Zhuang, Lingkai Kong, Bo Dai, Chao Zhang

We propose a closed-loop approach, AdaPlanner, which allows the LLM agent to refine its self-generated plan adaptively in response to environmental feedback.

Decision Making Hallucination

Extracting Shopping Interest-Related Product Types from the Web

no code implementations 23 May 2023 Yinghao Li, Colin Lockard, Prashant Shiralkar, Chao Zhang

To establish such connections, we propose to extract PTs from the Web pages containing hand-crafted PT recommendations for SIs.

Node Classification

Self-supervised representations in speech-based depression detection

no code implementations 20 May 2023 Wen Wu, Chao Zhang, Philip C. Woodland

This paper proposes handling training data sparsity in speech-based automatic depression detection (SDD) using foundation models pre-trained with self-supervised learning (SSL).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

CCGen: Explainable Complementary Concept Generation in E-Commerce

no code implementations 19 May 2023 Jie Huang, Yifan Gao, Zheng Li, Jingfeng Yang, Yangqiu Song, Chao Zhang, Zining Zhu, Haoming Jiang, Kevin Chen-Chuan Chang, Bing Yin

We propose and study Complementary Concept Generation (CCGen): given a concept of interest, e.g., "Digital Cameras", generating a list of complementary concepts, e.g., 1) Camera Lenses 2) Batteries 3) Camera Cases 4) Memory Cards 5) Battery Chargers.

A Kriging-Random Forest Hybrid Model for Real-time Ground Property Prediction during Earth Pressure Balance Shield Tunneling

no code implementations 9 May 2023 Ziheng Geng, Chao Zhang, Yuhao Ren, Minxiang Zhu, Renpeng Chen, Hongzhan Cheng

The real-time information refers to the real-time operating parameters of the EPB shield, which are input into random forest to provide a real-time prediction of ground properties.

Property Prediction

Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation

no code implementations 8 May 2023 Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Jialu Liu, Michael Bendersky, Marc Najork, Chao Zhang

In this work, we argue that such a learning objective is sub-optimal because there exists a discrepancy between the teacher's output distribution and the ground truth label distribution.

Knowledge Distillation

An Asynchronous Decentralized Algorithm for Wasserstein Barycenter Problem

no code implementations 23 Apr 2023 Chao Zhang, Hui Qian, Jiahao Xie

Wasserstein Barycenter Problem (WBP) has recently received much attention in the field of artificial intelligence.

Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

no code implementations 23 Apr 2023 Zebang Shen, Hui Qian, Tongzhou Mu, Chao Zhang

Nowadays, algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications.

Cold-Start based Multi-Scenario Ranking Model for Click-Through Rate Prediction

no code implementations 16 Apr 2023 Peilin Chen, Hong Wen, Jing Zhang, Fuyu Lv, Zhao Li, Qijie Shen, Wanjie Tao, Ying Zhou, Chao Zhang

Online travel platforms (OTPs), e.g., Ctrip.com or Fliggy.com, can effectively provide travel-related products or services to users.

Click-Through Rate Prediction

Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning

no code implementations 27 Mar 2023 Ruijie Wang, Zheng Li, Jingfeng Yang, Tianyu Cao, Chao Zhang, Bing Yin, Tarek Abdelzaher

This paper investigates the cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transferring knowledge from TKGs in high-resource ones.

Knowledge Distillation Knowledge Graphs +1

Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition

no code implementations 20 Mar 2023 Xiaoyu Yang, Qiujia Li, Chao Zhang, Philip C. Woodland

The performance of the student model can be further enhanced when multiple teachers are used jointly, achieving word error rate reductions (WERRs) of 17.5% and 10.6%.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Hulk: Graph Neural Networks for Optimizing Regionally Distributed Computing Systems

no code implementations 27 Feb 2023 Zhengqing Yuan, Huiwen Xue, Chao Zhang, Yongming Liu

Large deep learning models have shown great potential for delivering exceptional results in various applications.

Distributed Computing

UML: A Universal Monolingual Output Layer for Multilingual ASR

no code implementations 22 Feb 2023 Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang

Consequently, the UML enables switching the interpretation of each output node depending on the language of the input speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt

no code implementations CVPR 2023 Hao Li, Dingwen Zhang, Nian Liu, Lechao Cheng, Yalun Dai, Chao Zhang, Xinggang Wang, Junwei Han

Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models by giving Saliency Prompt for queries/kernels.

Instance Segmentation Semantic Segmentation +1

Multi-Objective Optimization Approach Using Deep Reinforcement Learning for Energy Efficiency in Heterogeneous Computing System

no code implementations 1 Feb 2023 Zheqi Yu, Chao Zhang, Pedro Machado, Adnan Zahid, Tim. Fernandez-Hart, Muhammad A. Imran, Qammer H. Abbasi

The growing demand for optimal and low-power energy consumption paradigms for Internet of Things (IoT) devices has garnered significant attention due to their cost-effectiveness, simplicity, and intelligibility.


Neighborhood-Regularized Self-Training for Learning with Few Labels

1 code implementation 10 Jan 2023 Ran Xu, Yue Yu, Hejie Cui, Xuan Kan, Yanqiao Zhu, Joyce Ho, Chao Zhang, Carl Yang

Our further analysis demonstrates that our proposed data selection strategy reduces the noise of pseudo labels by 36.8% and saves 57.3% of the time when compared with the best baseline.

AL-iGAN: An Active Learning Framework for Tunnel Geological Reconstruction Based on TBM Operational Data

no code implementations 2 Dec 2022 Hao Wang, Lixue Liu, Xueguan Song, Chao Zhang, Dacheng Tao

In tunnel boring machine (TBM) underground projects, an accurate description of the rock-soil types distributed in the tunnel can decrease the construction risk (e.g., surface settlement and landslide) and improve the efficiency of construction.

Active Learning Generative Adversarial Network

Direct-Effect Risk Minimization for Domain Generalization

1 code implementation 26 Nov 2022 Yuhui Li, Zejia Wu, Chao Zhang, Hongyang Zhang

In this work, we introduce the concepts of direct and indirect effects from causal inference to the domain generalization problem.

Causal Inference Domain Generalization +1

End-to-End Stochastic Optimization with Energy-Based Model

1 code implementation 25 Nov 2022 Lingkai Kong, Jiaming Cui, Yuchen Zhuang, Rui Feng, B. Aditya Prakash, Chao Zhang

Decision-focused learning (DFL) was recently proposed for stochastic optimization problems that involve unknown parameters.

Scheduling Stochastic Optimization

Single-channel EEG completion using Cascade Transformer

no code implementations 16 Nov 2022 Chao Zhang, Siqi Han, Milin Zhang

The electroencephalogram (EEG) signal can easily become incomplete due to packet loss, electrodes falling off, etc.


Goal-Oriented Communications for the IoT and Application to Data Compression

no code implementations 10 Nov 2022 Chao Zhang, Hang Zou, Samson Lasaulce, Walid Saad, Marios Kountouris, Mehdi Bennis

Internet of Things (IoT) devices will play an important role in emerging applications, since their sensing, actuation, processing, and wireless communication capabilities stimulate data collection, transmission and decision processes of smart applications.

Data Compression

Distribution-based Emotion Recognition in Conversation

1 code implementation 9 Nov 2022 Wen Wu, Chao Zhang, Philip C. Woodland

Automatic emotion recognition in conversation (ERC) is crucial for emotion-aware conversational artificial intelligence.

Emotion Recognition in Conversation

Learning Task-Aware Effective Brain Connectivity for fMRI Analysis with Graph Neural Networks

1 code implementation 1 Nov 2022 Yue Yu, Xuan Kan, Hejie Cui, Ran Xu, Yujia Zheng, Xiangchen Song, Yanqiao Zhu, Kun Zhang, Razieh Nabi, Ying Guo, Chao Zhang, Carl Yang

To better adapt GNNs for fMRI analysis, we propose TBDS, an end-to-end framework based on Task-aware Brain connectivity DAG (short for Directed Acyclic Graph) Structure generation for fMRI analysis.

Time Series Time Series Analysis

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

no code implementations 1 Nov 2022 Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He

In this work, we propose a method to jointly train the ASR and EP tasks in a single end-to-end (E2E) multitask model, improving EP quality by optionally leveraging information from the ASR audio encoder.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Teacher-Student Network for 3D Point Cloud Anomaly Detection with Few Normal Samples

no code implementations 31 Oct 2022 Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

Moreover, our method only requires very few normal samples to train the student network due to the teacher-student distillation mechanism.

3D Anomaly Detection Transfer Learning

End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator

1 code implementation 29 Oct 2022 Guangzhi Sun, Chao Zhang, Philip C. Woodland

Specifically, a tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, is studied, which leverages a slot shortlist with corresponding entities to extract biasing lists.

intent-classification Intent Classification +6

RoChBert: Towards Robust BERT Fine-tuning for Chinese

1 code implementation 28 Oct 2022 Zihan Zhang, Jinfeng Li, Ning Shi, Bo Yuan, Xiangyu Liu, Rong Zhang, Hui Xue, Donghong Sun, Chao Zhang

Despite their superb performance on a wide range of tasks, pre-trained language models (e.g., BERT) have been proven vulnerable to adversarial texts.

Data Augmentation Language Modelling

COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning

1 code implementation 27 Oct 2022 Yue Yu, Chenyan Xiong, Si Sun, Chao Zhang, Arnold Overwijk

We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the generalization ability of dense retrieval by combating the distribution shifts between source training tasks and target scenarios.

Language Modelling Retrieval +2

UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification

no code implementations 26 Oct 2022 Yanbo Xu, Alind Khare, Glenn Matlin, Monish Ramadoss, Rishikesan Kamaleswaran, Chao Zhang, Alexey Tumanov

It achieves accuracy within 0.1% of the highest-performing multi-class baseline, while saving close to 20X on the spatio-temporal cost of inference and enabling earlier (3.5 hrs) disease onset prediction.

Image Classification

Pronunciation Generation for Foreign Language Words in Intra-Sentential Code-Switching Speech Recognition

no code implementations 26 Oct 2022 Wei Wang, Chao Zhang, Xiaopei Wu

In this paper, we use limited code-switching data as driving material and explore a shortcut for quickly developing intra-sentential code-switching recognition on a commissioned native-language acoustic model. We propose a data-driven method to build the seed lexicon, which is used to train a grapheme-to-phoneme model that predicts mapped pronunciations for foreign-language words in code-switching sentences.

Sentence speech-recognition +1

Multi-Objective Personalized Product Retrieval in Taobao Search

no code implementations 9 Oct 2022 Yukun Zheng, Jiang Bian, Guanghao Meng, Chao Zhang, Honggang Wang, Zhixuan Zhang, Sen Li, Tao Zhuang, Qingwen Liu, Xiaoyi Zeng

These problems prompt us to further strengthen the capabilities of our EBR model in both relevance estimation and personalized retrieval.

Collaborative Filtering Retrieval

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning

4 code implementations 3 Oct 2022 Weicong Liang, Yuhui Yuan, Henghui Ding, Xiao Luo, WeiHong Lin, Ding Jia, Zheng Zhang, Chao Zhang, Han Hu

Vision transformers have recently achieved competitive results across various vision tasks but still suffer from heavy computation costs when processing a large number of tokens.

Clustering Depth Estimation +6

Goal-Oriented Quantization: Analysis, Design, and Application to Resource Allocation

no code implementations 30 Sep 2022 Hang Zou, Chao Zhang, Samson Lasaulce, Lucas Saludjian, Vincent Poor

The task is modeled by the minimization problem of a general goal function $f(x;g)$ for which the decision $x$ has to be taken from a quantized version of the parameters $g$.


Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

1 code implementation 15 Sep 2022 Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang

Large Language Models have demonstrated remarkable few-shot performance, but the performance can be sensitive to the selection of few-shot instances.

Language Modelling Text Classification

Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites

no code implementations 15 Sep 2022 Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao

The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network.

Graph Attention

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

no code implementations 13 Sep 2022 Chao Zhang, Bo Li, Tara Sainath, Trevor Strohman, Sepand Mavandadi, Shuo-Yiin Chang, Parisa Haghani

Language identification is critical for many downstream tasks in automatic speech recognition (ASR), and is beneficial to integrate into multilingual end-to-end ASR as an additional task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

SaleNet: A low-power end-to-end CNN accelerator for sustained attention level evaluation using EEG

no code implementations 3 Sep 2022 Chao Zhang, Zijian Tang, Taoming Guo, Jiaxin Lei, Jiaxin Xiao, Anhe Wang, Shuo Bai, Milin Zhang

This paper proposes SaleNet - an end-to-end convolutional neural network (CNN) for sustained attention level evaluation using prefrontal electroencephalogram (EEG).

Clustering EEG +2

Turn-Taking Prediction for Natural Conversational Speech

no code implementations 29 Aug 2022 Shuo-Yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Trevor Strohman, Qiao Liang, Yanzhang He

This makes speech recognition on conversational speech, including speech with multiple queries, a challenging task.

speech-recognition Speech Recognition

SciAnnotate: A Tool for Integrating Weak Labeling Sources for Sequence Labeling

1 code implementation 7 Aug 2022 Mengyang Liu, Haozheng Luo, Leonard Thong, Yinghao Li, Chao Zhang, Le Song

Compared to frequently used text annotation tools, our annotation tool allows for the development of weak labels in addition to providing a manual annotation experience.

Denoising named-entity-recognition +3

DETRs with Hybrid Matching

8 code implementations CVPR 2023 Ding Jia, Yuhui Yuan, Haodi He, Xiaopei Wu, Haojun Yu, WeiHong Lin, Lei Sun, Chao Zhang, Han Hu

One-to-one set matching is a key design for DETR to establish its end-to-end capability, so that object detection does not require a hand-crafted NMS (non-maximum suppression) to remove duplicate detections.

Object Detection Pose Estimation +2

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

no code implementations 8 Jul 2022 Xianrui Zheng, Chao Zhang, Philip C. Woodland

Self-supervised-learning-based pre-trained models for speech data, such as Wav2Vec 2.0 (W2V2), have become the backbone of many speech tasks.

Action Detection Activity Detection +3

Learning Disentangled Representations for Controllable Human Motion Prediction

no code implementations 4 Jul 2022 Chunzhi Gu, Jun Yu, Chao Zhang

Specifically, the inductive bias imposed by the extra CVAE path encourages two latent variables in two paths to respectively govern separate representations for each partial-body motion.

Human motion prediction Inductive Bias +1

Adaptive Multi-view Rule Discovery for Weakly-Supervised Compatible Products Prediction

no code implementations 28 Jun 2022 Rongzhi Zhang, Rebecca West, Xiquan Cui, Chao Zhang

We develop AMRule, a multi-view rule discovery framework that can (1) adaptively and iteratively discover novel rules that can complement the current weakly-supervised model to improve compatibility prediction; (2) discover interpretable rules from both structured attribute tables and unstructured product descriptions.

Attribute Language Modelling +1

Self-Supervised Consistent Quantization for Fully Unsupervised Image Retrieval

no code implementations 20 Jun 2022 Guile Wu, Chao Zhang, Stephan Liwicki

In global consistent quantization, we employ contrastive learning for both embedding and quantized representations and fuse these representations for consistent contrastive regularization between instances.

Contrastive Learning Image Retrieval +2

When Rigidity Hurts: Soft Consistency Regularization for Probabilistic Hierarchical Time Series Forecasting

1 code implementation 16 Jun 2022 Harshavardhan Kamarthi, Lingkai Kong, Alexander Rodríguez, Chao Zhang, B. Aditya Prakash

We close both of these gaps and propose PROFHiT, a fully probabilistic hierarchical forecasting model that jointly models the forecast distribution of the entire hierarchy.

Time Series Time Series Forecasting

Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition

1 code implementation 27 May 2022 Yinghao Li, Le Song, Chao Zhang

Weakly supervised named entity recognition methods train label models to aggregate the token annotations of multiple noisy labeling functions (LFs) without seeing any manually annotated labels.

Named Entity Recognition Named Entity Recognition (NER) +1

Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator

no code implementations 18 May 2022 Guangzhi Sun, Chao Zhang, Philip C. Woodland

MBWE and BLMD further improved the effectiveness of TCPGen and achieved more significant WER reductions on the biasing words.

Dialogue State Tracking Language Modelling +3

Revisiting PINNs: Generative Adversarial Physics-informed Neural Networks and Point-weighting Method

1 code implementation 18 May 2022 Wensheng Li, Chao Zhang, Chuncheng Wang, Hanting Guan, Dacheng Tao

Physics-informed neural networks (PINNs) provide a deep learning framework for numerically solving partial differential equations (PDEs), and have been widely used in a variety of PDE problems.

FlowFormer: A Transformer Architecture for Optical Flow

1 code implementation 30 Mar 2022 Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li

We introduce optical Flow transFormer, dubbed as FlowFormer, a transformer-based neural network architecture for learning optical flow.

Optical Flow Estimation

Learning a Structured Latent Space for Unsupervised Point Cloud Completion

no code implementations CVPR 2022 Yingjie Cai, Kwan-Yee Lin, Chao Zhang, Qiang Wang, Xiaogang Wang, Hongsheng Li

Specifically, we map a series of related partial point clouds into multiple complete shape and occlusion code pairs and fuse the codes to obtain their representations in the unified latent space.

Point Cloud Completion

PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning

1 code implementation 18 Mar 2022 Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang

Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks, but manually designing a comprehensive, high-quality labeling rule set is tedious and difficult.

Weakly-supervised Learning

Abandoning the Bayer-Filter to See in the Dark

1 code implementation CVPR 2022 Xingbo Dong, Wanyan Xu, Zhihui Miao, Lan Ma, Chao Zhang, Jiewen Yang, Zhe Jin, Andrew Beng Jin Teoh, Jiajun Shen

Next, a fully convolutional network is proposed to achieve the low-light image enhancement by fusing colored raw data with synthesized monochrome raw data.

Low-Light Image Enhancement

Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors

no code implementations 8 Mar 2022 Wen Wu, Chao Zhang, Xixin Wu, Philip C. Woodland

In this paper, a novel Bayesian training loss based on per-utterance Dirichlet prior distributions is proposed for verbal emotion recognition, which models the uncertainty in one-hot labels created when human annotators assign the same utterance to different emotion classes.

Attribute Emotion Classification +1

Shift-Robust Node Classification via Graph Adversarial Clustering

no code implementations 7 Mar 2022 Qi Zhu, Chao Zhang, Chanyoung Park, Carl Yang, Jiawei Han

Then a shift-robust classifier is optimized on the training graph and on adversarial samples on the target graph, which are generated by the cluster GNN.

Classification Clustering +2

Tail-GAN: Learning to Simulate Tail Risk Scenarios

no code implementations 3 Mar 2022 Rama Cont, Mihai Cucuringu, Renyuan Xu, Chao Zhang

The estimation of loss distributions for dynamic portfolios requires the simulation of scenarios representing realistic joint dynamics of their components, with particular importance devoted to the simulation of tail risk scenarios.

Generative Adversarial Network

A Survey on Programmatic Weak Supervision

1 code implementation 11 Feb 2022 Jieyu Zhang, Cheng-Yu Hsieh, Yue Yu, Chao Zhang, Alexander Ratner

Labeling training data has become one of the major roadblocks to using machine learning.

Volatility forecasting with machine learning and intraday commonality

no code implementations 8 Feb 2022 Chao Zhang, Yihuang Zhang, Mihai Cucuringu, Zhongmin Qian

We apply machine learning models to forecast intraday realized volatility (RV), by exploiting commonality in intraday volatility via pooling stock data together, and by incorporating a proxy for the market volatility.

BIG-bench Machine Learning

SIGMA: A Structural Inconsistency Reducing Graph Matching Algorithm

no code implementations 6 Feb 2022 Weijie Liu, Chao Zhang, Nenggan Zheng, Hui Qian

In this paper, we propose a novel criterion to measure the graph matching accuracy, structural inconsistency (SI), which is defined based on the network topological structure.

Graph Matching

Improving the fusion of acoustic and text representations in RNN-T

no code implementations 25 Jan 2022 Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-Yiin Chang

The recurrent neural network transducer (RNN-T) has recently become the mainstream end-to-end approach for streaming automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1